*Li, Jia; Duan L-Y; Chen X; Huang T; Tian Y.
Finding the Secret of Image Saliency in the Frequency Domain. IEEE Transactions on Pattern Analysis and Machine Intelligence. 2015;37(12):2428-2440.
Abstract: There are two sides to every story of visual saliency modeling in the frequency domain. On the one hand, image saliency can be effectively estimated by applying simple operations to the frequency spectrum. On the other hand, it remains unclear which part of the frequency spectrum contributes most to popping out targets and suppressing distractors. Toward this end, this paper explores the secret of image saliency in the frequency domain. From the results obtained in several qualitative and quantitative experiments, we find that the secret of visual saliency may hide mainly in the phases of intermediate frequencies. To explain this finding, we reinterpret the discrete Fourier transform from the perspective of template-based contrast computation and thus develop several principles for designing a saliency detector in the frequency domain. Following these principles, we propose a novel approach to designing the saliency detector with the assistance of prior knowledge obtained through both unsupervised and supervised learning. Experimental results on a public image benchmark show that the learned saliency detector outperforms 18 state-of-the-art approaches in predicting human fixations.
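The core observation above, that saliency hides mainly in the phases of intermediate frequencies, can be illustrated with a minimal phase-spectrum sketch. This is not the learned detector proposed in the paper; the band limits in `keep` are hypothetical parameters chosen for illustration.

```python
import numpy as np

def phase_saliency(image, keep=(0.02, 0.45)):
    """Toy phase-spectrum saliency map (illustrative only).

    Keeps only the phase of the 2D DFT (unit magnitude), band-limited
    to intermediate frequencies, then inverts and squares. `keep` is a
    hypothetical (low, high) band expressed as fractions of the
    Nyquist radius; it is not a parameter from the paper.
    """
    f = np.fft.fft2(image.astype(float))
    phase_only = np.exp(1j * np.angle(f))            # discard magnitude

    # Band-pass mask over normalized radial frequency (0..1, 1 = Nyquist).
    h, w = image.shape
    fy = np.fft.fftfreq(h)[:, None]
    fx = np.fft.fftfreq(w)[None, :]
    r = np.sqrt(fx**2 + fy**2) / 0.5
    mask = (r >= keep[0]) & (r <= keep[1])

    recon = np.fft.ifft2(phase_only * mask)
    sal = np.abs(recon) ** 2
    return sal / sal.max() if sal.max() > 0 else sal
```

Dropping the magnitude spectrum while keeping mid-band phases is the simplest way to see the effect the paper analyzes: locations whose phases align across the retained band reconstruct with high energy and pop out.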
Li, Bing; *Duan L-Y; Lin C-W; Huang T; Gao W.
Depth-Preserving Warping for Stereo Image Retargeting. IEEE Transactions on Image Processing. 2015;24(9):2811-2826.
Abstract: The popularity of stereo images and various display devices creates a need for stereo image retargeting techniques. Existing warping-based retargeting methods can effectively preserve the shape of salient objects in a retargeted stereo image pair. Nevertheless, these methods often incur depth distortion, since they attempt to preserve depth by maintaining the disparity of a set of sparse correspondences rather than by directly controlling the warping. In this paper, by considering how to directly control the warping functions, we propose a warping-based stereo image retargeting approach that simultaneously preserves the shape of salient objects and the depth of 3D scenes. We first characterize depth distortion in terms of the warping functions to investigate their impact on depth. Based on this depth distortion model, we then exploit binocular visual characteristics of stereo images to derive region-based depth-preserving constraints that directly control the warping functions so as to faithfully preserve the depth of 3D scenes. Third, with these region-based constraints, we present a novel warping-based stereo image retargeting framework. Since the depth-preserving constraints are derived independently of shape preservation, we relax them to strike a tradeoff between shape preservation and depth preservation. Finally, we propose a quad-based implementation of the framework. The results demonstrate the efficacy of our method in both depth and shape preservation for stereo image retargeting.
*Duan, Ling-Yu; Lin J; Wang Z; Huang T; Gao W.
Weighted Component Hashing of Binary Aggregated Descriptors for Fast Visual Search. IEEE Transactions on Multimedia. 2015;17(6):828-842.
Abstract: Toward low-bit-rate mobile visual search, recent works have proposed to aggregate local features and compress the aggregated descriptor (such as the Fisher vector or the vector of locally aggregated descriptors) for low-latency query delivery as well as moderate search complexity. Even though the Hamming distance can be computed very quickly, the computational cost of an exhaustive linear search over binary descriptors grows linearly with either the length of a binary descriptor or the number of database images. In this paper, we propose a novel weighted component hashing (WeCoHash) algorithm for long binary aggregated descriptors that significantly improves search efficiency over a large-scale image database. The proposed WeCoHash addresses two essential issues in hashing algorithms: "what to hash" and "how to search." "What to hash" is tackled by a hybrid approach that utilizes both image-specific component (i.e., visual word) redundancy and bit dependency within each component of a binary aggregated descriptor to produce discriminative hash values for bucketing. "How to search" is tackled by adaptive relevance weighting based on the statistics of hash values. Extensive comparisons have shown that WeCoHash is at least 20 times faster than linear search and 10 times faster than locality-sensitive hashing (LSH) while maintaining comparable search accuracy. In particular, the WeCoHash solution has been adopted by the emerging MPEG compact descriptors for visual search (CDVS) standard to significantly speed up exhaustive search over binary aggregated descriptors.
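The speedup mechanism described above, hashing components of a binary descriptor into buckets so that only one bucket is scanned with Hamming distances, can be sketched minimally as follows. This toy index is a simplification for illustration, not the WeCoHash algorithm: the bucket key here is just the first `key_bytes` bytes of the descriptor, a hypothetical choice standing in for the paper's discriminative component hashes and relevance weighting.

```python
import numpy as np
from collections import defaultdict

def hamming(a, b):
    """Hamming distance between two packed-bit uint8 arrays (popcount of XOR)."""
    return int(np.unpackbits(a ^ b).sum())

class ToyComponentHash:
    """Illustrative bucketed index over binary descriptors.

    Descriptors sharing a bucket key are candidates; a query computes
    Hamming distances only inside its bucket instead of over the whole
    database, which is the source of the speedup over linear search.
    """
    def __init__(self, key_bytes=2):
        self.key_bytes = key_bytes          # hypothetical key size
        self.buckets = defaultdict(list)    # key -> list of db indices
        self.db = []

    def add(self, desc):
        key = bytes(desc[:self.key_bytes])
        self.buckets[key].append(len(self.db))
        self.db.append(desc)

    def query(self, desc):
        """Return the index of the nearest bucket-mate, or None."""
        key = bytes(desc[:self.key_bytes])
        cands = self.buckets.get(key, [])
        if not cands:
            return None
        return min(cands, key=lambda i: hamming(self.db[i], desc))
```

The tradeoff is the usual one for bucketed search: a true nearest neighbor whose key bytes differ lands in another bucket and is missed, which is why practical schemes derive keys from redundant, weighted components rather than raw byte prefixes.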
Chen, Jie; *Duan L-Y; Gao F; Cai J; Kot AC; Huang T.
A Low Complexity Interest Point Detector. IEEE Signal Processing Letters. 2015;22(2):172-176.
Abstract: Interest point detection is a fundamental approach to feature extraction in computer vision tasks. To achieve scale invariance, interest point detectors usually operate on a scale-space representation of an image. In this letter, we propose a novel block-wise scale-space representation that significantly reduces the computational complexity of an interest point detector. Laplacian-of-Gaussian (LoG) filtering is applied to implement the block-wise scale-space representation. Extensive comparison experiments have shown that the block-wise scale-space representation enables an efficient and effective implementation of an interest point detector, in terms of both memory and time complexity, while delivering promising performance in visual search.
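The LoG filtering that underlies the scale-space representation above can be sketched with a minimal full-image implementation. This is the standard baseline computation, not the paper's block-wise scheme; the kernel radius of three sigmas is a common convention, assumed here for illustration.

```python
import numpy as np

def log_kernel(sigma, radius=None):
    """Sampled, scale-normalized Laplacian-of-Gaussian kernel."""
    if radius is None:
        radius = int(3 * sigma)             # conventional 3-sigma support
    y, x = np.mgrid[-radius:radius + 1, -radius:radius + 1]
    r2 = x**2 + y**2
    g = np.exp(-r2 / (2 * sigma**2))
    log = (r2 - 2 * sigma**2) / sigma**4 * g
    # Zero-mean so flat regions give zero response; sigma^2 normalizes scale.
    return sigma**2 * (log - log.mean())

def log_response(image, sigma):
    """Full-image LoG filtering via FFT convolution; blob centers appear
    as extrema when sigma matches the blob scale."""
    k = log_kernel(sigma)
    H, W = image.shape
    kh, kw = k.shape
    s = (H + kh - 1, W + kw - 1)            # linear-convolution size
    out = np.fft.irfft2(np.fft.rfft2(image, s=s) * np.fft.rfft2(k, s=s), s=s)
    return out[kh // 2:kh // 2 + H, kw // 2:kw // 2 + W]   # 'same' crop
```

A block-wise variant would apply this filtering independently per image block, trading a small amount of boundary accuracy for the memory and time reductions the letter reports.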
Huang YT; MQ; T.
TASC: A Transformation-Aware Soft Cascading Approach for Multimodal Video Copy Detection. ACM Transactions on Information Systems. 2015;33(2):Article 7, 34 pages.
Peng, Peixi (PhD student); *Tian, Yonghong; Wang, Yaowei; Li, Jia; Huang, Tiejun.
Robust Multiple Cameras Pedestrian Detection with Multi-view Bayesian Network. Pattern Recognition. 2015;48(5):1760-1772.