Yang Y, Liang J, Yu B, Chen Y, Ren JS, Shi B.
Latency Correction for Event-guided Deblurring and Frame Interpolation, in
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR); 2024:24977–24986.
Abstract: Event cameras, with their high temporal resolution, high dynamic range, and low power consumption, are particularly well suited to time-sensitive applications like deblurring and frame interpolation. However, their performance is hindered by latency variability, especially under low-light conditions and with fast-moving objects. This paper addresses the challenge of latency in event cameras, i.e., the temporal discrepancy between the actual occurrence of a brightness change and the corresponding timestamp assigned by the sensor. Focusing on event-guided deblurring and frame interpolation tasks, we propose a latency correction method based on a parameterized latency model. To enable data-driven learning, we develop an event-based temporal fidelity measure to describe the sharpness of latent images reconstructed from events and the corresponding blurry images, and reformulate the event-based double integral model to be differentiable with respect to latency. The proposed method is validated on synthetic and real-world datasets, demonstrating the benefits of latency correction for deblurring and interpolation across different lighting conditions.
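For context, a minimal sketch of how a latency parameter can enter the event-based double integral (EDI) relation between a blurry frame B and a latent sharp image L(f); the notation here is illustrative rather than the paper's exact formulation (c is the event contrast threshold, p(s) the signed event stream, and τ(s; θ) an assumed parameterized latency):

```latex
% Standard EDI relation: the blurry frame is the latent image modulated by
% exponentiated event integrals over the exposure window of length T.
B = \frac{L(f)}{T} \int_{f-T/2}^{f+T/2} \exp\!\big(c\,E(t)\big)\,dt,
\qquad
E(t) = \int_{f}^{t} p(s)\,ds

% Latency-aware variant (illustrative): event timestamps are shifted by a
% parameterized latency \tau(s;\theta) before integration, so B becomes
% differentiable in \theta and the latency can be learned from data.
E_\theta(t) = \int_{f}^{t} p\big(s - \tau(s;\theta)\big)\,ds
```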
Yu B, Liang J, Wang Z, Fan B, Subpa-asa A, Shi B, Sato I.
Active Hyperspectral Imaging Using an Event Camera, in
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR); 2024.
Abstract: Hyperspectral imaging plays a critical role in numerous scientific and industrial fields. Conventional hyperspectral imaging systems often struggle with the trade-off between spectral and temporal resolution, particularly in dynamic environments. In our work, we present an event-based active hyperspectral imaging system designed for real-time performance in dynamic scenes. By integrating a diffraction grating and a rotating mirror with an event-based camera, the proposed system captures high-fidelity spectral information at microsecond temporal resolution, leveraging the event camera's unique capability to detect instantaneous changes in brightness rather than absolute intensity. Compared with conventional frame-based systems, the proposed system reduces bandwidth and computational load; compared with mosaic-based systems, it retains the original sensor's spatial resolution. It records only meaningful changes in brightness, achieving high temporal and spectral resolution with minimal latency, and is practical for real-time applications in complex dynamic conditions.
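To illustrate the principle, a simplified sketch of how such a system could map event timestamps to wavelengths: with the mirror sweeping the dispersed spectrum across the sensor at a known rate, an event's timestamp indexes the spectral band currently illuminating that pixel. All constants and the linear time-to-wavelength mapping below are assumptions for illustration, not the authors' calibration:

```python
import numpy as np

LAMBDA_MIN, LAMBDA_MAX = 400e-9, 700e-9  # swept spectral range [m], assumed
SWEEP_PERIOD = 1.0 / 50.0                # one spectral sweep per mirror period [s]

def timestamp_to_wavelength(t):
    """Map an event timestamp to the wavelength sweeping the scene,
    assuming a linear sweep within each mirror period (a simplification)."""
    phase = (t % SWEEP_PERIOD) / SWEEP_PERIOD          # position in sweep, [0, 1)
    return LAMBDA_MIN + phase * (LAMBDA_MAX - LAMBDA_MIN)

def accumulate_spectrum(events, n_bands=60, h=260, w=346):
    """Build a per-pixel spectral cube from (x, y, t, polarity) events."""
    cube = np.zeros((n_bands, h, w))
    edges = np.linspace(LAMBDA_MIN, LAMBDA_MAX, n_bands + 1)
    for x, y, t, p in events:
        band = np.searchsorted(edges, timestamp_to_wavelength(t)) - 1
        cube[np.clip(band, 0, n_bands - 1), y, x] += p  # signed accumulation
    return cube
```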
Yu B, Ren J, Han J, Wang F, Liang J, Shi B.
EventPS: Real-Time Photometric Stereo Using an Event Camera, in
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR); 2024:9602–9611.
AbstractPhotometric stereo is a well-established technique to estimate the surface normal of an object. However the requirement of capturing multiple high dynamic range images under different illumination conditions limits the speed and real-time applications. This paper introduces EventPS a novel approach to real-time photometric stereo using an event camera. Capitalizing on the exceptional temporal resolution dynamic range and low bandwidth characteristics of event cameras EventPS estimates surface normal only from the radiance changes significantly enhancing data efficiency. EventPS seamlessly integrates with both optimization-based and deep-learning-based photometric stereo techniques to offer a robust solution for non-Lambertian surfaces. Extensive experiments validate the effectiveness and efficiency of EventPS compared to frame-based counterparts. Our algorithm runs at over 30 fps in real-world scenarios unleashing the potential of EventPS in time-sensitive and high-speed downstream applications.
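A minimal sketch of the underlying idea in the simplified Lambertian case (not the authors' full pipeline): an event fired while the active light switches from direction l_i to l_j encodes the log intensity ratio, the albedo cancels in that ratio, and each switch yields a linear null-space constraint (l_j − r·l_i)·n = 0 with r = I_j/I_i, solvable by SVD:

```python
import numpy as np

def normal_from_event_ratios(light_pairs, log_ratios):
    """light_pairs: list of (l_i, l_j) unit light directions (3-vectors);
    log_ratios: measured log(I_j / I_i) per switch. Returns a unit normal."""
    A = np.stack([lj - np.exp(dr) * li
                  for (li, lj), dr in zip(light_pairs, log_ratios)])
    # The normal spans the (approximate) null space of A: take the right
    # singular vector associated with the smallest singular value.
    _, _, vt = np.linalg.svd(A)
    n = vt[-1]
    return n if n[2] > 0 else -n   # resolve sign so n faces the camera

# Toy usage: a known normal, four lights switched in sequence, exact ratios.
n_true = np.array([0.2, -0.1, 1.0]); n_true /= np.linalg.norm(n_true)
lights = [np.array(v, float) / np.linalg.norm(v) for v in
          ([0, 0, 1], [0.5, 0, 1], [0, 0.5, 1], [-0.4, 0.3, 1])]
pairs = list(zip(lights[:-1], lights[1:]))
ratios = [np.log((lj @ n_true) / (li @ n_true)) for li, lj in pairs]
print(normal_from_event_ratios(pairs, ratios))  # ~ n_true
```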
Zhong H, Hong Y, Weng S, Liang J, Shi B.
Language-Guided Image Reflection Separation, in
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR); 2024:24913–24922.
Abstract: This paper studies the problem of language-guided reflection separation, which aims to address the ill-posed reflection separation problem by introducing language descriptions to specify layer content. We propose a unified framework that leverages the cross-attention mechanism with contrastive learning strategies to construct the correspondence between language descriptions and image layers. A gated network design and a randomized training strategy are employed to tackle the recognizable layer ambiguity. The effectiveness of the proposed method is validated by its significant performance advantage over existing reflection separation methods in both quantitative and qualitative comparisons.
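As a rough structural sketch of the two named ingredients, gated cross-attention between image tokens and a language embedding might look like the following; the module layout, dimensions, and gating form are assumptions inferred from the abstract, not the paper's exact design:

```python
import torch
import torch.nn as nn

class LanguageGatedCrossAttention(nn.Module):
    """Image features attend to text tokens; a learned gate modulates how
    strongly the description steers each spatial location's features."""
    def __init__(self, dim=256, heads=8):
        super().__init__()
        self.attn = nn.MultiheadAttention(dim, heads, batch_first=True)
        self.gate = nn.Sequential(nn.Linear(2 * dim, dim), nn.Sigmoid())

    def forward(self, img_tokens, text_tokens):
        # img_tokens: (B, N, D) flattened image features as queries;
        # text_tokens: (B, M, D) encoded description as keys/values.
        attended, _ = self.attn(img_tokens, text_tokens, text_tokens)
        g = self.gate(torch.cat([img_tokens, attended], dim=-1))
        return img_tokens + g * attended   # gated residual injection

x = torch.randn(2, 64 * 64, 256)   # toy image tokens
t = torch.randn(2, 16, 256)        # toy text tokens
print(LanguageGatedCrossAttention()(x, t).shape)  # torch.Size([2, 4096, 256])
```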
Hong Y, Zhong H, Weng S, Liang J, Shi B.
L-DiffER: Single Image Reflection Removal with Language-based Diffusion Model, in
Proceedings of the European Conference on Computer Vision (ECCV); 2024.
Abstract: In this paper, we introduce L-DiffER, a language-based diffusion model designed for the ill-posed single image reflection removal task. Although they have shown impressive performance in image generation, existing language-based diffusion models struggle with precise control and faithfulness in image restoration. To overcome these limitations, we propose an iterative condition refinement strategy to resolve the problem of inaccurate control conditions. A multi-condition constraint mechanism is employed to ensure faithful recovery of image color and structure while retaining the generation capability to handle low-transmitted reflections. We demonstrate the superiority of the proposed method through extensive experiments, showcasing both quantitative and qualitative improvements over existing methods.
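Schematically, an iterative condition refinement loop can be read as feeding the model's own restoration back as an updated control condition; the structure below is an assumption inferred from the abstract (the `model.sample` interface is hypothetical), not the released implementation:

```python
def iterative_condition_refinement(model, blended_image, text_prompt, n_rounds=3):
    """Refine the image condition over several full sampling passes so the
    control signal sharpens together with the restoration estimate."""
    condition = blended_image              # start from the raw mixture image
    restored = None
    for _ in range(n_rounds):
        # One conditional sampling pass of the (hypothetical) diffusion model.
        restored = model.sample(image_cond=condition, text_cond=text_prompt)
        condition = restored               # refined condition for the next round
    return restored
```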
Hong Y, Chang Y, Liang J, Ma L, Huang T, Shi B.
Light Flickering Guided Reflection Removal. International Journal of Computer Vision (IJCV). 2024.
Abstract: When photographing through a piece of glass, reflections usually degrade the quality of captured images or videos. In this paper, by exploiting periodically varying light flickering, we investigate the problem of removing strong reflections from contaminated image sequences or videos with a unified capturing setup. We propose a learning-based method that utilizes short-term and long-term observations of mixture videos to exploit one-side contextual clues in fluctuating components and brightness-consistent clues in consistent components, achieving layer separation and flickering removal, respectively. A dataset containing synthetic and real mixture videos with light flickering is built for network training and testing. The effectiveness of the proposed method is demonstrated by comprehensive evaluations on synthetic and real data, an application to video flickering removal, and an exploratory experiment on high-speed scenes.
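A toy decomposition illustrating the intuition (a simplification of the idea, not the trained network): with illumination flickering at a known period, the temporal mean over whole periods is brightness-consistent, while the per-period intensity range isolates the fluctuating component carrying one-side clues about the flicker-lit layer:

```python
import numpy as np

def split_components(frames, period):
    """frames: (T, H, W) grayscale video; `period` is the flicker period in
    frames (assumed known). Returns (consistent, fluctuating) maps."""
    t = (frames.shape[0] // period) * period           # trim to whole periods
    per_period = frames[:t].reshape(-1, period, *frames.shape[1:])
    consistent = per_period.mean(axis=(0, 1))          # steady component
    fluctuating = (per_period.max(axis=1) - per_period.min(axis=1)).mean(axis=0)
    return consistent, fluctuating
```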
Lou H, Liang J, Teng M, Fan B, Xu Y, Shi B.
Zero-Shot Event-Intensity Asymmetric Stereo via Visual Prompting from Image Domain, in
Advances in Neural Information Processing Systems. Vol 37; 2024:13274–13301.