Abstract:
Visual saliency plays an important role in various video applications such as video retargeting and intelligent video advertising. However, existing visual saliency estimation approaches often construct a unified model for all scenes, leading to poor performance on scenes with diverse content. To solve this problem, we propose a multi-task rank learning approach that infers multiple saliency models, each tailored to a different cluster of scenes. In our approach, visual saliency estimation is formulated as a pair-wise rank learning problem, in which visual features are effectively integrated to distinguish salient targets from distractors. A multi-task learning algorithm is then presented to infer multiple visual saliency models simultaneously; by appropriately sharing information across models, the generalization ability of each model is greatly improved. Extensive experiments on a public eye-fixation dataset show that our multi-task rank learning approach remarkably outperforms 12 state-of-the-art methods in visual saliency estimation.
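The abstract does not spell out the underlying formulation, so the following is only a minimal illustrative sketch, not the authors' actual model: it learns a single linear scoring vector from (salient, distractor) feature pairs with a pair-wise hinge loss, which is the core idea of pair-wise rank learning. The multi-task variant described above would additionally couple several such per-cluster weight vectors through shared structure. All names and parameters here are hypothetical.

```python
import numpy as np

def train_pairwise_ranker(salient, distractor, lr=0.01, epochs=200, margin=1.0):
    """Learn a linear scoring vector w so that w @ x_salient exceeds
    w @ x_distractor by at least `margin` for each (salient, distractor)
    feature pair, via subgradient descent on the pair-wise hinge loss.
    (Hypothetical sketch; not the paper's published algorithm.)"""
    w = np.zeros(salient.shape[1])
    diff = salient - distractor                    # pair-wise feature differences
    for _ in range(epochs):
        violated = diff @ w < margin               # pairs that violate the margin
        if violated.any():
            w += lr * diff[violated].mean(axis=0)  # push violated pairs apart
    return w

# Toy usage: salient points differ from distractors along the first feature.
rng = np.random.default_rng(0)
salient = rng.normal(loc=[1.0, 0.0], scale=0.5, size=(100, 2))
distractor = rng.normal(loc=[0.0, 0.0], scale=0.5, size=(100, 2))
w = train_pairwise_ranker(salient, distractor)
print("learned weights:", w)  # first weight should dominate
```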