A Time-domain End-to-End Method for Sound Source Localization Using Multi-Task Learning

Citation:

Huang Y, Wu X, Qu T. A Time-domain End-to-End Method for Sound Source Localization Using Multi-Task Learning, in 2019 IEEE 2nd International Conference on Information Communication and Signal Processing (ICICSP). Weihai, China; 2019:52-56.

Abstract:

In recent years, much research has focused on sound source localization based on neural networks, an appealing but difficult problem. This paper proposes a novel time-domain end-to-end method for sound source localization, in which the model is trained using two strategies with both cross-entropy loss and mean square error loss. Following the idea of multi-task learning, a CNN serves as the shared hidden layers that extract features, and a DNN serves as the output layers for each task. Compared with SRP-PHAT, MUSIC, and a DNN-based method, the proposed method performs better.
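
To make the combination of the two losses concrete, the following is a minimal sketch of one common way to form a multi-task objective: a weighted sum of a cross-entropy term from a classification output and a mean square error term from a regression output. The abstract does not describe the two training strategies or how the losses are weighted, so the weighted sum, the function names, and the `mse_weight` parameter are all assumptions made for illustration, not the authors' implementation.

```python
import math


def softmax(logits):
    """Convert raw scores to probabilities (shifted by the max for stability)."""
    peak = max(logits)
    exps = [math.exp(z - peak) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def cross_entropy(logits, target_class):
    """Cross-entropy loss for one example with an integer class label."""
    return -math.log(softmax(logits)[target_class])


def mean_square_error(predicted, target):
    """Mean square error between two equal-length sequences."""
    return sum((p - t) ** 2 for p, t in zip(predicted, target)) / len(predicted)


def multi_task_loss(class_logits, target_class,
                    regression_out, regression_target, mse_weight=1.0):
    """Hypothetical joint objective: CE + mse_weight * MSE (an assumption)."""
    ce = cross_entropy(class_logits, target_class)
    mse = mean_square_error(regression_out, regression_target)
    return ce + mse_weight * mse
```

In a setup like the one the abstract outlines, `class_logits` and `regression_out` would come from two task-specific output heads that share the same feature extractor, so the gradient of this single scalar updates the shared layers through both tasks.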