摘要:
Content-based copy detection (CBCD) is drawing increasing attention as an alternative technology to watermarking for video identification and copyright protection. In this article, we present a comprehensive method to detect copies that are subjected to complicated transformations. A multimodal feature representation scheme is designed to exploit the complementarity of audio features, global and local visual features so that optimal overall robustness to a wide range of complicated modifications can be achieved. Meanwhile, a temporal pyramid matching algorithm is proposed to assemble frame-level similarity search results into sequence-level matching results through similarity evaluation over multiple temporal granularities. Additionally, inverted indexing and locality sensitive hashing (LSH) are also adopted to speed up similarity search. Experimental results over benchmarking datasets of TRECVID 2010 and 2009 demonstrate that the proposed method outperforms other methods for most transformations in terms of copy detection accuracy. The evaluation results also suggest that our method can achieve competitive copy localization preciseness.
Website