Research Output

2019
Zhang M, Qiao Y, Wu X, Qu T. Distance-dependent Modeling of Head-related Transfer Functions, in International Conference on Acoustics, Speech and Signal Processing (ICASSP). Brighton, United Kingdom; 2019:276-280.
Abstract: In this paper, a method for modeling distance-dependent head-related transfer functions (HRTFs) is presented. The HRTFs are first decomposed by spatial principal component analysis. Using deep neural networks, we model the spatial principal component weights at different distances, which allows HRTFs to be predicted at arbitrary distances. Objective and subjective experiments are conducted to compare the proposed distance model with the distance variation function model. The results show that the proposed model produces less spectral distortion than the distance variation function model, and that the virtual sound it generates performs better in distance localization.
Ge Z, Wu X, Qu T. Improvements to the matching projection decoding method for Ambisonic system with irregular loudspeaker layouts, in International Conference on Acoustics, Speech and Signal Processing (ICASSP). Brighton, United Kingdom; 2019:121-125.
Abstract: The Ambisonic technique has been widely used for soundfield recording and reproduction recently. However, the basic Ambisonic decoding method breaks down when the playback loudspeakers are distributed unevenly. Various methods have been proposed to solve this problem. This paper introduces several improvements to a recently proposed Ambisonic decoding method, the matching projection method, for uneven loudspeaker layouts. The first improvement is energy preservation; the second is the introduction of the “in-phase” weight; and the third is the introduction of partial projection coefficients. To evaluate the improved method, we compared it with the original one and with the all-round Ambisonic decoding method, using a 2-dimensional unevenly arranged loudspeaker array. The results show that our method greatly improves on the original method where the loudspeakers are arranged very sparsely or very densely.
Zhang S, Wu X, Qu T. Sparse Autoencoder Based Multiple Audio Objects Coding Method, in 146th AES Convention. Dublin, Ireland; 2019:10172.
Abstract: The traditional multiple audio objects codec extracts the parameters of each object in the frequency domain and produces serious confusion because of the high degree of subband coincidence among objects. This paper uses the sparse domain instead of the frequency domain and reconstructs each audio object from the down-mixed signal using a binary mask, based on the sparsity of each audio object. To overcome the high degree of subband coincidence among different audio objects, a sparse autoencoder neural network is established. On this basis, a multiple audio objects codec system is built. To evaluate the proposed system, objective and subjective evaluations are carried out, and the results show that the proposed system performs better than SAOC.
Huang Q, Liu T, Wu X, Qu T. A Generative Adversarial Net-based Bandwidth Extension Method for Audio Compression. J. Audio Eng. Soc. 2019;67(12):986-993.
Abstract: To reduce the burden of storing and transmitting audio signals, they are often compressed with a lossy single-channel codec. Because the high-frequency components are effectively truncated when using a low-bitrate encoder, listeners may experience the sound as being uncomfortable, muffled, or dull. To compensate for the perceived degradation, bandwidth extension technology can be used to regenerate the missing high frequencies from the low-frequency components during the decoding process. In this paper the authors propose a bandwidth extension method based on Generative Adversarial Networks (GAN), which is used to estimate the relationship between the MDCT spectrum in the high-frequency part and the low-frequency part. It is evaluated by a discriminant network in the GAN to get a more natural result. A complete audio coding system was built by using AAC Low Complexity as the single-channel core encoder with the proposed bandwidth extension method. To evaluate the quality of audio decoded by the new system, a subjective evaluation experiment was carried out using HE-AAC as the baseline system with the MUSHRA experimental method.
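Qu T, Wu X, Huang Y. 2019. An End-to-End Sound Source Localization Method and System Based on Multi-Task Learning. China patent CN ZL201910043338.8.
Qu T, Wu X, Peng C. 2019. A Beamforming-Based Multi-Speaker Speech Separation Method and System. China patent CN ZL201910001150.7.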
2018
Huang Q, Wu X, Qu T. A Parametric Spatial Audio Coding Method Based on Convolutional Neural Networks, in 145th AES Convention. New York, USA; 2018:10126.
Ge Z, Qiao Y, Wang S, Wu X, Qu T. Subjective Evaluation of Virtual Room Auralization System Based on the Ambisonics Matching Projection Decoding Method, in 145th AES Convention. New York, USA; 2018:10124.
Huang Q, Wu X, Qu T. Bandwidth Extension Method Based on Generative Adversarial Nets for Audio Compression, in 144th AES Convention. Milan, Italy; 2018:9954.
Gao S, Wu X, Qu T. High order Ambisonics encoding method using differential microphone array, in 144th AES Convention. Milan, Italy; 2018:10020.
Qu T, Huang Z, Qiao Y, Wu X. Matching Projection Decoding Method for Ambisonics System, in International Conference on Acoustics, Speech and Signal Processing (ICASSP 2018). Calgary, Alberta, Canada; 2018:561-565.
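Qu T, Wu X, Zhang M. 2018. A Personalized Head-Related Transfer Function Modeling Method Based on Deep Neural Networks. China patent CN ZL201810182617.8.
Abstract: The invention discloses a personalized head-related transfer function (HRTF) modeling method based on deep neural networks. The method decomposes HRTF data by spatial principal component analysis and models the resulting spatial principal components, spatial principal component coefficients, and mean spatial function with separate neural networks. The spatial principal components and the mean spatial function depend only on spatial direction, whereas the spatial principal component coefficients are functions of frequency and of the subject's individual characteristic parameters. Deep neural networks model the spatial principal components, the mean spatial function, and the interaural time difference separately, with spatial direction information such as azimuth and elevation fed into the network input layer. A neural network also models the spatial principal component coefficients from anthropometric parameters. With these models, a subject's personalized HRTF in any spatial direction can be obtained from a small number of anthropometric parameters.
Qu T, Wu X, Ge Z. 2018. A Loudspeaker-Array-Based Method and System for Auralization of Virtual Auditory Environments. China patent CN ZL201810066540.8.
Qu T, Wu X, Gao S. 2018. A Frequency-Division Localization Method for Multiple-Sound-Source Environments. China patent CN ZL201810004440.2.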
2017
Qu T, Wu X, Huang Y. 2017. A Neural-Network-Based Sound Source Localization Method. China patent CN ZL201711428934.5.
Gao S, Wu X, Qu T. The microphone array arrangement method for high order Ambisonics recordings. 7th International Conference on Intelligence Science and Big Data Engineering. 2017;10559:3-10.
Qu T, Wu X, Huang Q. 2017. A Generative Adversarial Network Training Method for Bandwidth Extension, and Audio Encoding and Decoding Methods. China patent CN ZL201710992311.4.
Gao Y, Wang Q, Ding Y, Wang C, Li H, Wu X, Qu T, Li L. Selective Attention Enhances Beta-Band Cortical Oscillation to Speech under “Cocktail-Party” Listening Conditions. Frontiers in Human Neuroscience. 2017;11:Article 34.
Qu T, Wu X, Song T. 2017. A Sound Source Localization Method Based on Acoustic Transfer Functions. China patent CN ZL201710198420.9.
Qu T, Wu X, Huang Z. 2017. An Ambisonics Matching Projection Decoding Method for Irregular Loudspeaker Layouts. China patent CN ZL201710283323.X.
