DOA-Informed Self-Supervised Learning Method for Sound Source Enhancement

Citation:

Gao S, Wu X, Qu T. DOA-Informed Self-Supervised Learning Method for Sound Source Enhancement, in the AES 156th Convention. Madrid, Spain; 2024:10683.

Date Presented:

15-17 June 2024

Abstract:

Multi-channel sound source enhancement methods have made great progress in recent years, especially when combined with learning-based algorithms. However, the performance of these techniques is limited by the completeness of the training dataset and may degrade in mismatched environments. In this paper, we propose a Reconstruction Model based Self-supervised Learning (RMSL) method for sound source enhancement. A reconstruction module integrates the estimated target signal and noise components to regenerate the multi-channel mixed signals, and it is connected with the separation model to form a closed loop. The separation model can then be optimized by continuously iterating the separation-reconstruction process. We use the separation error, the reconstruction error, and the signal-noise independence error as loss functions in the self-supervised learning process. The method is applied to a state-of-the-art sound source separation model (ADL-MVDR) and evaluated under different scenarios. Experimental results demonstrate that the proposed method improves the performance of the ADL-MVDR algorithm for different numbers of sound sources, yielding a Si-SNR gain of about 0.5 dB to 1 dB while maintaining good clarity and intelligibility in practical applications.
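
To make the quantities named in the abstract concrete, the following is a minimal NumPy sketch, not the authors' implementation. The `si_snr` function follows the standard scale-invariant SNR definition used to report the gains. The `reconstruction_error` and `independence_error` functions are hypothetical stand-ins: they assume a plain additive reconstruction and a squared normalized cross-correlation, whereas the paper uses a learned reconstruction module whose exact loss definitions are not given in the abstract.

```python
import numpy as np


def si_snr(estimate, target, eps=1e-8):
    """Scale-invariant SNR in dB between two 1-D signals (standard definition)."""
    estimate = estimate - estimate.mean()
    target = target - target.mean()
    # Project the estimate onto the target to obtain the scaled reference.
    scale = np.dot(estimate, target) / (np.dot(target, target) + eps)
    s_target = scale * target
    e_noise = estimate - s_target
    ratio = (np.dot(s_target, s_target) + eps) / (np.dot(e_noise, e_noise) + eps)
    return 10.0 * np.log10(ratio)


def reconstruction_error(mixture, est_target, est_noise):
    """Mean squared error between the mixture and the recombined estimates.

    Assumption: reconstruction is a simple sum of target and noise estimates;
    the paper's reconstruction module is learned and may differ.
    """
    return float(np.mean((mixture - (est_target + est_noise)) ** 2))


def independence_error(est_target, est_noise, eps=1e-8):
    """Squared normalized cross-correlation of two 1-D signals.

    Assumption: a hypothetical proxy for the signal-noise independence error.
    """
    t = est_target - est_target.mean()
    n = est_noise - est_noise.mean()
    rho = np.dot(t, n) / (np.linalg.norm(t) * np.linalg.norm(n) + eps)
    return float(rho ** 2)
```

Under these assumptions, a self-supervised objective could combine the terms as a weighted sum; the weights, like the loss forms themselves, would have to be taken from the full paper.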