Citation:
Guo R, Ying X, Chen Y, Niu D, Li G, Qu L, Qi Y, Zhou J, Xing B, Yue W, et al. Audio-Visual Instance Segmentation. In: IEEE/CVF Conference on Computer Vision and Pattern Recognition, CVPR 2025, Nashville, TN, USA, June 11-15, 2025. Computer Vision Foundation / IEEE; 2025:13550–13560.
