Leveraging Sound Source Trajectories for Universal Sound Separation

Citation:

Wu D, Wu X, Qu T. Leveraging Sound Source Trajectories for Universal Sound Separation. IEEE/ACM Transactions on Audio, Speech, and Language Processing. 2025;33:2337-2348.

Abstract:

Existing methods that use spatial information for sound source separation either require prior knowledge of each source's direction of arrival (DOA) or rely on estimated but imprecise localization results. This impairs separation performance, especially when the sound sources are moving. In fact, sound source localization and separation are interconnected problems: localization facilitates separation, while separation contributes to refined localization. This paper proposes a method that exploits this mutual facilitation mechanism for moving sources. The proposed method comprises three stages. The first stage is initial tracking, which tracks each sound source from the audio mixture based on source signal envelope estimation. These tracking results may lack sufficient accuracy. The second stage is mutual facilitation: sound separation is conducted using the preliminary tracking results, and sound source tracking is then performed on the separated signals, which refines the tracking precision. The refined trajectories in turn improve separation performance. This mutual facilitation process can be iterated multiple times. In the third stage, a neural beamformer estimates precise single-channel separation results from the refined tracking trajectories and the multi-channel separation outputs. Simulation experiments conducted under reverberant conditions with moving sound sources demonstrate that the proposed method achieves more accurate separation based on the refined tracking results.
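
The three-stage control flow described in the abstract can be sketched as a short Python function. This is only an illustration of the ordering of the stages, not the authors' implementation: the function name `run_pipeline`, its parameters, and the four callables (`track_initial`, `separate`, `track_separated`, `beamform`) are hypothetical placeholders standing in for the models the paper trains, and the exact inputs each stage receives are assumptions.

```python
from typing import Any, Callable, Tuple


def run_pipeline(
    mixture: Any,
    track_initial: Callable[[Any], Any],
    separate: Callable[[Any, Any], Any],
    track_separated: Callable[[Any], Any],
    beamform: Callable[[Any, Any], Any],
    num_iterations: int = 1,
) -> Tuple[Any, Any]:
    """Hypothetical sketch of the three-stage flow; all callables are placeholders."""
    # Stage 1: initial tracking from the mixture (possibly inaccurate).
    trajectories = track_initial(mixture)

    # Stage 2: mutual facilitation.
    # Separation is first guided by the preliminary trajectories.
    separated = separate(mixture, trajectories)
    for _ in range(num_iterations):
        # Re-track each source on the separated signals to refine trajectories.
        trajectories = track_separated(separated)
        # Separate again, now guided by the refined trajectories.
        separated = separate(mixture, trajectories)

    # Stage 3: a beamformer combines the multi-channel separation outputs
    # with the refined trajectories to produce the final estimate.
    output = beamform(separated, trajectories)
    return output, trajectories
```

Passing the stages in as callables keeps the sketch independent of any particular network architecture; `num_iterations` reflects the abstract's statement that the mutual facilitation step can be repeated.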

Qu Tianshu

School of Intelligence, Peking University (National Key Laboratory of General Artificial Intelligence; School of Intelligence Science and Technology), Associate Professor, Doctoral Supervisor
