Journal: International Journal of Signal Processing, Image Processing and Pattern Recognition
Print ISSN: 2005-4254
Publication year: 2016
Volume: 9
Issue: 2
Pages: 95-106
DOI: 10.14257/ijsip.2016.9.2.09
Publisher: SERSC
Abstract: In this paper, we address the problem of human action recognition by combining covariance matrices, used as local spatio-temporal (ST) descriptors, with local ST features extracted densely from action videos. Unlike traditional methods that utilize gradient-based and optical flow-based features separately, we use the covariance matrix to fuse the two types of feature. Covariance matrices are Symmetric Positive Definite (SPD) matrices, which form a special type of Riemannian manifold. To measure the distance between SPD matrices while avoiding the computation of the geodesic distance between them, the covariance features are transformed into log-Euclidean covariance matrices (LECM) by the matrix logarithm operation. After the LECM are encoded with the Locality-constrained Linear Coding method, a spatial pyramid is used to partition the video frames in order to provide position information to the ST-LECM features, and the average-pooling-on-absolute-value function is applied over each sub-frame. Finally, a non-linear support vector machine is used as the classifier. Experiments on public human action datasets show that the proposed method improves recognition accuracy substantially in comparison with several state-of-the-art methods.
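The log-Euclidean step described in the abstract can be illustrated with a minimal sketch. It assumes NumPy, a small regularization term `eps`, and the common sqrt(2) weighting of off-diagonal entries when flattening a symmetric matrix; none of these details are stated in the abstract, and the function names are hypothetical rather than taken from the paper.

```python
import numpy as np


def log_euclidean_covariance(features, eps=1e-6):
    """Map local feature vectors to a log-Euclidean covariance matrix.

    features: (n_samples, d) array, e.g. gradient and optical-flow values
    sampled inside one spatio-temporal cuboid (an assumed layout).
    eps: assumed regularizer that keeps the covariance strictly positive definite.
    """
    features = np.asarray(features, dtype=float)
    d = features.shape[1]
    cov = np.atleast_2d(np.cov(features, rowvar=False)) + eps * np.eye(d)
    # For an SPD matrix, the matrix logarithm is the eigendecomposition
    # with the logarithm applied to the eigenvalues.
    eigvals, eigvecs = np.linalg.eigh(cov)
    eigvals = np.maximum(eigvals, eps)  # guard against round-off
    return (eigvecs * np.log(eigvals)) @ eigvecs.T


def vectorize_symmetric(mat):
    """Flatten the upper triangle of a symmetric matrix into a vector.

    Off-diagonal entries are scaled by sqrt(2) so that the Euclidean norm
    of the vector equals the Frobenius norm of the matrix.
    """
    rows, cols = np.triu_indices(mat.shape[0])
    weights = np.where(rows == cols, 1.0, np.sqrt(2.0))
    return weights * mat[rows, cols]


def log_euclidean_distance(features_a, features_b):
    """Euclidean distance between two vectorized log-covariance matrices."""
    vec_a = vectorize_symmetric(log_euclidean_covariance(features_a))
    vec_b = vectorize_symmetric(log_euclidean_covariance(features_b))
    return float(np.linalg.norm(vec_a - vec_b))
```

After this mapping, the vectors live in an ordinary Euclidean space, so distances between descriptors reduce to vector norms and standard coding schemes can be applied without computing geodesics on the SPD manifold.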