Research Output by Type: Journal Articles

2023
Gao S, Wu X, Qu T. A Physical Model-Based Self-Supervised Learning Method for Signal Enhancement Under Reverberant Environment. IEEE/ACM Transactions on Audio, Speech, and Language Processing. 2023;31:2100-2110.
Abstract:
In a reverberant environment, interferences such as reflections and background noise can degrade the perception of the sound source signal. Although the DNN-based methods have made a tremendous breakthrough in addressing this issue, the performance of these models is highly dependent on the completeness of the training dataset, which will limit its generalization under unknown environments. In this article, we propose a physical model-based self-supervised learning (PMSSL) method to realize the DNN model optimization under unknown scenarios. This method incorporates a room reverberation physical model into the sound source enhancement model optimization process, realizing the self-learning of the DNN model under physical constraints. In this process, the time-frequency characteristics of the input signal and the spatial feature of the reverberation environment are utilized for parameter optimization, improving the adaptability of the DNN model under unknown scenarios. Experimental results based on simulated and measured data prove that the proposed method can obtain much more accurate source signal enhancement results compared with the pre-trained models, verifying its effectiveness and adaptability in new environments.
2022
Wang C, Wang Z, Xie B, Shi X, Yang P, Liu L, Qu T, Qin Q, Xing Y, Zhu W, et al. Binaural processing deficit and cognitive impairment in Alzheimer's disease. Alzheimer's & Dementia: The Journal of the Alzheimer's Association. 2022;18(6):1085-1099.
Gao S, Lin J, Wu X, Qu T. Sparse DNN Model for Frequency Expanding of Higher Order Ambisonics Encoding Process. IEEE/ACM Transactions on Audio, Speech, and Language Processing [Internet]. 2022;30:1124-1135.
2021
Ge Z, Li L, Qu T. Partially Matching Projection Decoding Method Evaluation Under Different Playback Conditions. IEEE/ACM Transactions on Audio, Speech, and Language Processing. 2021;29:1411-1423.
Fan L, Kong L, Li L, Qu T. Sensitivity to a break in interaural correlation in frequency-gliding noises. Front. Psychol. - Perception Science. 2021.
2020
Zhang M, Ge Z, Liu T, Wu X, Qu T. Modeling of Individual HRTFs Based on Spatial Principal Component Analysis. IEEE/ACM Transactions on Audio, Speech, and Language Processing. 2020;28:785-797.
2019
Huang Q, Liu T, Wu X, Qu T. A Generative Adversarial Net-based Bandwidth Extension Method for Audio Compression. J. Audio Eng. Soc. 2019;67(12):986-993.
Abstract:
To reduce the burden of storing and transmitting audio signals, they are often compressed with a lossy single-channel code. Because the high-frequency components are effectively truncated when using a low bitrate encoder, listeners may experience the sound as being uncomfortable, muffled, or dull. To compensate for the perceived degradation, bandwidth extension technology can be used to regenerate the missing high frequencies from the low-frequency components during the decoding process. In this paper the authors propose a bandwidth extension method based on Generative Adversarial Networks (GAN), which is used to estimate the relationship between the MDCT spectrum in the high-frequency part and the low-frequency part. It is evaluated by a discriminant network in the GAN to get a more natural result. A complete audio coding system was built by using AAC Low Complex as the single-channel core encoder with the proposed bandwidth extension method. To evaluate the audio quality decoded by the new system, a subjective evaluation experiment was carried out using the HE-AAC as the baseline system with the MUSHRA experimental method.
2017
Gao Y, Wang Q, Ding Y, Wang C, Li H, Wu X, Qu T, Li L. Selective Attention Enhances Beta-Band Cortical Oscillation to Speech under “Cocktail-Party” Listening Conditions. Frontiers in Human Neuroscience. 2017;11:Article 34.
2015
Kong LZ, Xie ZL, Lu LX, Qu TS, Wu XH, Yan J, Li L. Similar impacts of the interaural delay and interaural correlation on binaural gap detection. PLOS ONE. 2015;10(6):e0126342.
2014
Lei M, Luo L, Qu TS, Jia HX, Li L. Perceived location specificity in perceptual separation-induced but not fear conditioning-induced enhancement of prepulse inhibition in rats. Behavioural Brain Research. 2014;269:87-94.
Gao YY, Cao SY, Qu TS, Wu XH, Li HF, Zhang JS, Li L. Voice-associated static face image releases speech from informational masking. PsyCh Journal. 2014;3:113-120.
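Wu XH, Lü ZY, Gao Y, Qu TS. Measurement and analysis of near-field structured head-related transfer functions [in Chinese]. Journal of Data Acquisition and Processing. 2014;29(2):180-185.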
2013
Qu TS, Cao SY, Chen X, Huang Y, Li L, Wu XH, Schneider BA. Aging Effects on Detection of Spectral Changes Induced by a Break in Sound Correlation. Ear and Hearing. 2013;34(3):280-287.
2012
He WX, Gao Y, Qu TS. Introduction to AVS Audio Lossless Coding/Decoding Standard. Multimedia communications technical committee. 2012;7(2):21-24.
2011
Huang Y, Li JY, Zou XF, Qu TS, Wu XH, Mao LH, Wu YH, Li L. Perceptual Fusion Tendency of Speech Sounds. Journal of Cognitive Neuroscience. 2011;23(4):1003-1014.
2010
Qu TS, Cao SW, Wu XH. Relationship between Distance and Binaural Cues on Sound Source Localization. Acta Scientiarum Naturalium Universitatis Pekinensis. 2010;46(6):901-906.
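Qu TS, He WX, Gao Y, Zhang B, Wu XH. A lossless audio coding and decoding method based on the lifting wavelet transform [in Chinese]. Audio Engineering. 2010;34(12):65-68.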
Yang XH, Shu HY, Qu TS, Zhang T, Dou WB. An audio coding and decoding framework from lossy to lossless [in Chinese]. Audio Engineering. 2010;34(12):60-64.
2009
Qu TS, Xiao Z, Gong M, Huang Y, Li XD, Wu XH. Distance-dependent Head-related Transfer Functions Measured with High Spatial Resolution Using a Spark Gap. IEEE Trans. on Audio, Speech, and Language Processing [Internet]. 2009;17(6):1124-1132. Dataset: PKU-IOA HRTF database.
