Guo R, Niu D, Qu L, Qi Y, Shi J, Yue W, Xing B, Chen T, Ying X. Instance-Level Panoramic Audio-Visual Saliency Detection and Ranking, in Proceedings of the 32nd ACM International Conference on Multimedia, MM 2024, Melbourne, VIC, Australia, 28 October 2024 - 1 November 2024. ACM; 2024:9426–9434.
Zhong H, Hong Y, Weng S, Liang J, Shi B. Language-Guided Image Reflection Separation, in Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR); 2024:24913–24922.
This paper studies the problem of language-guided reflection separation, which aims at addressing the ill-posed reflection separation problem by introducing language descriptions to provide layer content. We propose a unified framework to solve this problem that leverages the cross-attention mechanism with contrastive learning strategies to construct the correspondence between language descriptions and image layers. A gated network design and a randomized training strategy are employed to tackle the recognizable layer ambiguity. The effectiveness of the proposed method is validated by its significant performance advantage over existing reflection separation methods in both quantitative and qualitative comparisons.
Event cameras, with their high temporal resolution, dynamic range, and low power consumption, are particularly good at time-sensitive applications like deblurring and frame interpolation. However, their performance is hindered by latency variability, especially under low-light conditions and with fast-moving objects. This paper addresses the challenge of latency in event cameras: the temporal discrepancy between the actual occurrence of changes and the corresponding timestamp assigned by the sensor. Focusing on event-guided deblurring and frame interpolation tasks, we propose a latency correction method based on a parameterized latency model. To enable data-driven learning, we develop an event-based temporal fidelity to describe the sharpness of latent images reconstructed from events and the corresponding blurry images, and reformulate the event-based double integral model to be differentiable with respect to latency. The proposed method is validated using synthetic and real-world datasets, demonstrating the benefits of latency correction for deblurring and interpolation across different lighting conditions.
In this paper, we introduce L-DiffER, a language-based diffusion model designed for the ill-posed single image reflection removal task. Although they have shown impressive performance in image generation, existing language-based diffusion models struggle with precise control and faithfulness in image restoration. To overcome these limitations, we propose an iterative condition refinement strategy to resolve the problem of inaccurate control conditions. A multi-condition constraint mechanism is employed to ensure the recovery faithfulness of image color and structure while retaining the generation capability to handle low-transmitted reflections. We demonstrate the superiority of the proposed method through extensive experiments, showcasing both quantitative and qualitative improvements over existing methods.
This paper discusses the so-called Bakers’ Strike Edict from Ephesus (Ephesos 231 = IK 12.215 p. 27) in light of recent studies on the Roman imperial toolkit for building empire-wide communities. Clifford Ando and Myles Lavan argued that Roman emperors in the first two centuries CE were consciously blurring distinctions between Roman and non-Roman populations, so that there could be a shared sense of an empire-wide community among people in the provinces. This paper argues that, in addition to Lavan’s observations, gubernatorial edicts also show concerns and measures that sought to communicate a sense of the communal at the local level. While the focus of discussion is on the edict responding to a bakers’ strike at Ephesus, several other examples from a corpus of gubernatorial edicts are also used in connection with this example. This paper hopes to contribute to Ando’s and Lavan’s arguments by pointing to a lower register of community building visible in gubernatorial edicts. The governors’ concerns for and efforts toward creating communal cohesion, and their need to balance parallel and at times competing “common goods”, not only add another nuance to the grander community building project at the imperial level, but also demonstrate further complications in how praesidial governors, and in particular proconsuls, could and would react to difficult issues at the local level.
Guo R, Qu L, Niu D, Qi Y, Yue W, Shi J, Xing B, Ying X. Open-Vocabulary Audio-Visual Semantic Segmentation, in Proceedings of the 32nd ACM International Conference on Multimedia, MM 2024, Melbourne, VIC, Australia, 28 October 2024 - 1 November 2024. ACM; 2024:7533–7541.