Abstract:
Human action recognition is widely recognized as a challenging task due to the difficulty of effectively characterizing human action in a complex scene. Recent studies have shown that dense-trajectory-based methods can achieve state-of-the-art recognition results on some challenging datasets. However, in these methods, each dense trajectory is often represented as a vector of coordinates, consequently losing the structural relationship between different trajectories. To address this problem, this paper proposes a novel Deep Trajectory Descriptor (DTD) for action recognition. First, we extract dense trajectories from multiple consecutive frames and then project them onto a canvas. This yields a "trajectory texture" image which can effectively characterize the relative motion in these frames. Based on these trajectory texture images, a deep neural network (DNN) is utilized to learn a more compact and powerful representation of dense trajectories. In the action recognition system, the DTD descriptor, together with other non-trajectory features such as HOG, HOF, and MBH, provides an effective way to characterize human action from various aspects. Experimental results show that our system statistically outperforms several state-of-the-art approaches, achieving an average accuracy of 95.6% on KTH and an accuracy of 92.14% on UCF50.
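To make the "trajectory texture" idea concrete, the following is a minimal sketch of how dense trajectories might be projected onto a canvas to form such an image. The rendering rule here (intensity increasing along each trajectory so the image encodes motion direction) is an illustrative assumption; the function name `trajectory_texture` and the input format (each trajectory as a list of per-frame `(x, y)` points) are hypothetical and not taken from the paper.

```python
import numpy as np

def trajectory_texture(trajectories, height, width):
    """Render dense trajectories onto a single-channel canvas.

    Each trajectory is a sequence of (x, y) points across consecutive
    frames. Later points are drawn brighter, so the canvas encodes the
    direction of motion as an intensity gradient (an assumed rendering
    rule for illustration; the paper's exact scheme may differ).
    """
    canvas = np.zeros((height, width), dtype=np.float32)
    for traj in trajectories:
        n = len(traj)
        for t, (x, y) in enumerate(traj):
            if 0 <= x < width and 0 <= y < height:
                # Intensity grows from 1/n to 1.0 along the trajectory;
                # keep the brightest value where trajectories overlap.
                canvas[int(y), int(x)] = max(canvas[int(y), int(x)],
                                             (t + 1) / n)
    return canvas

# Two short trajectories: one moving right, one moving down.
trajs = [[(1, 1), (2, 1), (3, 1)], [(5, 2), (5, 3), (5, 4)]]
tex = trajectory_texture(trajs, height=8, width=8)
```

The resulting single-channel image can then be fed to a convolutional network, which is consistent with the abstract's use of a DNN to learn a compact representation from trajectory texture images.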