Wang Q, Wang Y, Wang Y, Ying X. Dissecting the Failure of Invariant Learning on Graphs, in Advances in Neural Information Processing Systems 38: Annual Conference on Neural Information Processing Systems 2024, NeurIPS 2024, Vancouver, BC, Canada, December 10 - 15, 2024.; 2024.
Multi-channel sound source enhancement methods have made great progress in recent years, especially when combined with learning-based algorithms. However, the performance of these techniques is limited by the completeness of the training dataset and may degrade in mismatched environments. In this paper, we propose a reconstruction Model based Self-supervised Learning (RMSL) method for sound source enhancement. A reconstruction module integrates the estimated target signal and noise components to regenerate the multi-channel mixed signals, and it is connected with a separation model to form a closed loop. The separation model can then be optimized by continuously iterating the separation-reconstruction process. We use the separation error, the reconstruction error, and the signal-noise independence error as loss functions in the self-supervised learning process. The method is applied to the state-of-the-art sound source separation model (ADL-MVDR) and evaluated under different scenarios. Experimental results demonstrate that the proposed method improves the performance of the ADL-MVDR algorithm for different numbers of sound sources, bringing about 0.5 dB to 1 dB of Si-SNR gain while maintaining good clarity and intelligibility in practical application.
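For readers unfamiliar with the Si-SNR figure quoted above, the following is a minimal sketch of the commonly used scale-invariant signal-to-noise ratio. The abstract does not say which exact definition the authors use, so the zero-mean projection form below, the function name `si_snr`, and the `eps` stabilizer are assumptions made for illustration only.

```python
import numpy as np


def si_snr(estimate, target, eps=1e-8):
    """Scale-invariant SNR in dB between two 1-D signals (common definition).

    `eps` is a small stabilizer assumed here to avoid division by zero.
    """
    estimate = np.asarray(estimate, dtype=float)
    target = np.asarray(target, dtype=float)

    # Remove the mean so a constant offset does not affect the score.
    estimate = estimate - estimate.mean()
    target = target - target.mean()

    # Project the estimate onto the target; the projection is the "signal" part.
    scale = np.dot(estimate, target) / (np.dot(target, target) + eps)
    s_target = scale * target

    # Whatever is left after the projection counts as noise.
    e_noise = estimate - s_target

    ratio = (np.dot(s_target, s_target) + eps) / (np.dot(e_noise, e_noise) + eps)
    return 10.0 * np.log10(ratio)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    clean = rng.standard_normal(16000)
    noisy = clean + 0.1 * rng.standard_normal(16000)
    print(f"Si-SNR of the noisy signal: {si_snr(noisy, clean):.2f} dB")
```

Because the estimate is projected onto the target before the ratio is taken, rescaling the estimate leaves the score unchanged, which is why gains such as the 0.5 dB to 1 dB reported above can be compared across systems with different output levels.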
Photometric stereo is a well-established technique to estimate the surface normal of an object. However, the requirement of capturing multiple high dynamic range images under different illumination conditions limits the speed and real-time applications. This paper introduces EventPS, a novel approach to real-time photometric stereo using an event camera. Capitalizing on the exceptional temporal resolution, dynamic range, and low bandwidth characteristics of event cameras, EventPS estimates surface normal only from the radiance changes, significantly enhancing data efficiency. EventPS seamlessly integrates with both optimization-based and deep-learning-based photometric stereo techniques to offer a robust solution for non-Lambertian surfaces. Extensive experiments validate the effectiveness and efficiency of EventPS compared to frame-based counterparts. Our algorithm runs at over 30 fps in real-world scenarios, unleashing the potential of EventPS in time-sensitive and high-speed downstream applications.
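As background for the frame-based counterparts mentioned above, here is a minimal sketch of classical Lambertian photometric stereo solved by least squares. It is not the EventPS algorithm, which works from event-camera radiance changes and is not reproduced here. The function name, the array shapes, and the assumption of known, distant light directions with no shadows are all illustrative choices.

```python
import numpy as np


def lambertian_photometric_stereo(intensities, light_dirs):
    """Classical frame-based photometric stereo under a Lambertian model.

    intensities: (K, P) array, K images (one per light) of P pixels.
    light_dirs:  (K, 3) array of unit light directions, K >= 3, not coplanar.
    Returns (normals, albedo) with shapes (P, 3) and (P,).
    Assumes no shadows or specular highlights (an idealization).
    """
    I = np.asarray(intensities, dtype=float)
    L = np.asarray(light_dirs, dtype=float)

    # Lambertian model: I = L @ (albedo * normal). Solve for the scaled normal
    # of every pixel at once in the least-squares sense.
    G, *_ = np.linalg.lstsq(L, I, rcond=None)  # shape (3, P)

    # The length of the scaled normal is the albedo; its direction is the normal.
    albedo = np.linalg.norm(G, axis=0)
    normals = (G / np.maximum(albedo, 1e-12)).T
    return normals, albedo
```

The need for at least three images under different lights per normal estimate is the capture cost that the abstract identifies as the obstacle to real-time operation.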
Shi R, Duan L, Huang T, Jiang T. Evidential Uncertainty-Guided Mitochondria Segmentation for 3D EM Images, in Thirty-Eighth AAAI Conference on Artificial Intelligence, AAAI 2024, Thirty-Sixth Conference on Innovative Applications of Artificial Intelligence, IAAI 2024, Fourteenth Symposium on Educational Advances in Artificial Intelligence, EAAI 2024, February 20-. AAAI Press; 2024:4847–4855.
Deep neural networks can be employed to estimate the direction of arrival (DOA) of individual sound sources from audio signals. Existing methods mostly focus on estimating the DOA of each source on individual frames, without utilizing the motion information of the sources. This paper proposes a method for estimating the trajectories of sources, leveraging the differential of trajectories across different time scales. Additionally, a neural network is employed to enhance wrongly estimated trajectories, especially for low-energy sound sources. Experimental evaluations conducted on a simulated dataset validate that the proposed method achieves more precise localization and tracking performance and encounters less interference when the sound source energy is low.
This paper considers the connectedness of the two ports-of-call of Amastris and Heraclea Pontica in the eparcheia of Pontus during the Roman principate. Stanford's ORBIS platform offers a heuristic model of connectedness. We find the two ports-of-call to be the most popular segments along the south for maritime traffic coming from eastern Pontus and the Bosporus. Where the two differ most is in their connections with the interior. Heraclea Pontica connected Ancyra to the Pontic coast, while Amastris had no such connection. ORBIS is understandably non-granular in the sense that it "restrict[s] coverage to the more important elements of the Roman communication system," but if this is the case, it means that Heraclea Pontica and Amastris were connected in other ways as well, and the Amastrian mountainous interior, which could be described as the "previously unconjoined, or at least the previously less well-connected" segment of Anatolia (Horden 2020: 204), could have also been connected with the wider ancient Mediterranean world. Low visibility of settlements beyond the one known urbanized area in modern Amasra makes discussions of broader connectedness difficult, but recent field survey results at least suggest that the number and vibrancy of settlements likely increased in the Roman period (Bes 2015: 288-289; Çam et al. 2019; Çam 2021). The question then is whether recent studies contribute to a new assessment of Amastrian connectedness, and how it compares with existing impressions of both Amastris and its peer poleis, with Heraclea Pontica serving as the primary example. Building upon Alexandru Avram's assumption that the aggregate of attestations of persons who have spent time in a city other than their homeland can serve as a proxy for gauging their mobility (Avram 2013: 7-8), this paper uses the Prosopographia Ponti Euxini externa to test whether Amastrian connectedness reached currently unknown areas, particularly the interior.
A comparison between Amastrian data (n=136) and Heracleote specimens (n=1101) may seem disproportionate, but this paper focuses on persons from the first to the third centuries CE and privileges locations over volumes so as to visualize connectedness in the Roman world. The same concept is applied to persons of locales beyond the two subjects in question, namely foreigners who left records in Heracleote (n=5) and Amastrian territory (n=11), and the results are visualized together. In addition, though coins are a poor proxy, as they may be transmitted in a variety of ways that do not reflect direct connections between Amastris and the cities that issued them, this paper considers coins from the Amasra Museum as published by Stanley Ireland and Soner Atesogullari (1996) to complement Amastris' relatively poor prosopographical record and increase the potential to capture connections. The overall impression gleaned from this exercise is that Amastris could have played a role comparable to (though potentially less pronounced than) that of Heraclea Pontica as a hub-like node that connected interior land routes with maritime traffic, particularly for Hadrianopolis and Pompeiopolis (Corsten 2007; Ruscu 2017), but also potentially for centers such as Caesarea in Cappadocia.
Bibliography:
Avram, A. 2013. Prosopographia Ponti Euxini externa. Leuven.
Bes, P. 2015. "The Cide-Şenpazar Region in the Roman Period," in Kinetic Landscapes. The Cide Archaeological Project: Surveying the Turkish Western Black Sea Region, Bleda Düring and Claudia Glatz, eds., Warsaw/Berlin, pp. 260-293.
Çam, F. et al. 2019. "New Archaeological Expeditions in the Ancient City of Amastris," in Settlements and Necropoleis of the Black Sea and its Hinterland in Antiquity, Select Papers from the Third International Conference 'The Black Sea in Antiquity and Tekkeköy: An Ancient Settlement on the Southern Black Sea Coast', 27-29 October 2017, Tekkeköy, Samsun, Gocha Tsetskhladze and Sümer Atasoy, eds., Oxford, pp. 190-207.
Çam, F. 2022. "Ancient Settlements in Bartin Province: 2017-2019 Research Results," in Bartın İli ve İlçeleri Yüzey Araştırması (Biya) İlk Tespitler ve Belgeler - Paphlagonia'dan Parthenios'a - I, Fatima Çam, ed., Istanbul, pp. 13-112.
Corsten, T. 2007. "Prosopographische und Onomastische Notizen III," Gephyra 4, pp. 133-144.
Horden, P. 2020. "Knitting Together the Unconjoined," Zeitschrift für Ethnologie 145.2, pp. 197-218.
Ireland, S. and Soner Atesogullari. 1996. "The Ancient Coins in the Amasra Museum," in Studies in Ancient Coinage from Turkey, Richard Ashton, ed., London, pp. 115-137.
Ruscu, L. 2017. "Über Sex. Vibius Gallus aus Amastris," Journal of Historical Researches 28, pp. 52-68.
Traditional feedback Active Noise Control (ANC) algorithms are built upon linear filters, which leads to reduced performance when dealing with real-world noise. Deep learning-based feedback ANC algorithms have been proposed to overcome this problem. However, methods relying on pre-trained neural networks exhibit performance degradation when encountering noise from scenes unseen in the training dataset. This paper proposes a hybrid deep-online learning-based spatial ANC system which combines online learning with pre-trained deep neural networks. The proposed method maintains performance on noise from the trained scenes while improving the cancellation of noise from new scenes. Additionally, by incorporating wave domain decomposition, this paper achieves noise cancellation over a controlled spatial region. Simulation experiments validate the effectiveness of combining online learning and deep learning in handling previously unseen noise. Furthermore, the efficiency of wave domain decomposition in spatial noise cancellation is also verified.
Creating an immersive scene relies on detailed spatial sound. Traditional methods, which use probe points for impulse responses, need a large amount of storage. Meanwhile, geometry-based simulations struggle with complex sound effects. Neural-based methods are now improving accuracy and sharply reducing storage needs. In our study, we propose a hybrid time and time-frequency domain strategy to model the time series of Ambisonic acoustic fields. The network excels in generating high-fidelity time-domain impulse responses at arbitrary source-receiver positions by learning a continuous representation of the acoustic field. Our experimental results demonstrate that the proposed model outperforms baseline methods in various aspects of sound representation and rendering for different source-receiver positions.
Guo R, Niu D, Qu L, Qi Y, Shi J, Yue W, Xing B, Chen T, Ying X. Instance-Level Panoramic Audio-Visual Saliency Detection and Ranking, in Proceedings of the 32nd ACM International Conference on Multimedia, MM 2024, Melbourne, VIC, Australia, 28 October 2024 - 1 November 2024. ACM; 2024:9426–9434.
Zhong H, Hong Y, Weng S, Liang J, Shi B. Language-Guided Image Reflection Separation, in Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).; 2024:24913–24922.
This paper studies the problem of language-guided reflection separation, which aims at addressing the ill-posed reflection separation problem by introducing language descriptions to provide layer content. We propose a unified framework to solve this problem, which leverages the cross-attention mechanism with contrastive learning strategies to construct the correspondence between language descriptions and image layers. A gated network design and a randomized training strategy are employed to tackle the recognizable layer ambiguity. The effectiveness of the proposed method is validated by the significant performance advantage over existing reflection separation methods on both quantitative and qualitative comparisons.
Event cameras, with their high temporal resolution, dynamic range, and low power consumption, are particularly good at time-sensitive applications like deblurring and frame interpolation. However, their performance is hindered by latency variability, especially under low-light conditions and with fast-moving objects. This paper addresses the challenge of latency in event cameras, that is, the temporal discrepancy between the actual occurrence of changes and the corresponding timestamp assigned by the sensor. Focusing on event-guided deblurring and frame interpolation tasks, we propose a latency correction method based on a parameterized latency model. To enable data-driven learning, we develop an event-based temporal fidelity to describe the sharpness of latent images reconstructed from events and the corresponding blurry images, and reformulate the event-based double integral model to be differentiable with respect to latency. The proposed method is validated using synthetic and real-world datasets, demonstrating the benefits of latency correction for deblurring and interpolation across different lighting conditions.