Wang Q, Wang Y, Wang Y, Ying X. Dissecting the Failure of Invariant Learning on Graphs, in Advances in Neural Information Processing Systems 38: Annual Conference on Neural Information Processing Systems 2024, NeurIPS 2024, Vancouver, BC, Canada, December 10 - 15, 2024.; 2024. 访问链接
Wang Q, Wang Y, Wang Y, Ying X. Dissecting the Failure of Invariant Learning on Graphs, in Advances in Neural Information Processing Systems 38: Annual Conference on Neural Information Processing Systems 2024, NeurIPS 2024, Vancouver, BC, Canada, December 10 - 15, 2024.; 2024. 访问链接
The multiple-channel[1] sound source enhancement methods have made a great progress in recent years, especially when combined with the learning-based algorithms. However, the performance of these techniques is limited by the completeness of the training dataset, which may degrade in mismatched environments. In this paper, we propose a reconstruction Model based Self-supervised Learning (RMSL) method for sound source enhancement. A reconstruction module is used to integrate the estimated target signal and noise components to regenerate the multi-channel mixed signals, and it is connected with a separating model to form a closed loop.In this case, the optimization of the separation model can be achieved by continuously iterating the separation-reconstruction process. We use the separation error, the reconstruction error, and the signal-noise independence error as lossfunctions in the self-supervised learning process. This method is applied to the state-of-the-art sound source separation model (ADL-MVDR) and evaluated under different scenarios. Experimental results demonstrate that the proposed method can improve the performance of ADL-MVDR algorithm under different number of sound sources, bringing about 0.5 dB to 1 dB Si-SNR gain, while maintaining good clarity and intelligibility in practical application.
Photometric stereo is a well-established technique to estimate the surface normal of an object. However the requirement of capturing multiple high dynamic range images under different illumination conditions limits the speed and real-time applications. This paper introduces EventPS a novel approach to real-time photometric stereo using an event camera. Capitalizing on the exceptional temporal resolution dynamic range and low bandwidth characteristics of event cameras EventPS estimates surface normal only from the radiance changes significantly enhancing data efficiency. EventPS seamlessly integrates with both optimization-based and deep-learning-based photometric stereo techniques to offer a robust solution for non-Lambertian surfaces. Extensive experiments validate the effectiveness and efficiency of EventPS compared to frame-based counterparts. Our algorithm runs at over 30 fps in real-world scenarios unleashing the potential of EventPS in time-sensitive and high-speed downstream applications.
Deep neural networks can be employed for estimating the direction of arrival (DOA) of individual sound sources from audio signals. Existing methods mostly focus on estimating the DOA of each source on individual frames, without utilizing the motion information of the sources. This paper proposes a method for estimating trajectories of sources, leveraging the differential of trajectories across different time scales. Additionally, a neural network is employed for enhancing the trajectories wrongly estimated especially for sound sources with low-energy. Experimental evaluations conducted on simulated dataset validate that the proposed method achieves more precise localization and tracking performance and encounters less interference when the sound source energy is low.