Publisher: Academy & Industry Research Collaboration Center (AIRCC)
Abstract: The recognition of human actions based on three-dimensional depth data has become a very active research field in computer vision. In this paper, we study fusion at the feature and decision levels for depth data captured by a Kinect camera to improve action recognition. More precisely, from each depth video sequence, we compute Depth Motion Maps (DMM) from three projection views: front, side and top. Shape and texture features are then extracted from the obtained DMMs. These features are based essentially on Histogram of Oriented Gradients (HOG) and Local Binary Patterns (LBP) descriptors. We propose to use two fusion levels. The first is a feature fusion level, based on the concatenation of the HOG and LBP descriptors. The second is a score fusion level, based on the naive-Bayes combination approach, which aggregates the scores of three classifiers: a collaborative representation classifier, a sparse representation classifier and a kernel-based extreme learning machine classifier. Experiments conducted on two public datasets, Kinect v2 and UTD-MHAD, show that our approach achieves high recognition accuracy and outperforms several existing methods.
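The score fusion level can be illustrated with a minimal sketch. The abstract does not give the exact fusion formula, so the code below assumes one common reading of a naive-Bayes combination: each classifier outputs a vector of class posteriors, the classifiers are treated as conditionally independent given the class, and the fused posterior is the prior-corrected product of the individual posteriors. The function name `naive_bayes_fuse`, the uniform default prior, and the example scores are hypothetical and are not taken from the paper.

```python
import math


def naive_bayes_fuse(score_vectors, priors=None, eps=1e-12):
    """Fuse per-classifier class posteriors, assuming conditional independence.

    score_vectors: list of L lists, each holding P(class k | classifier i input).
    priors: optional class priors P(k); uniform if omitted (an assumption).
    Returns the normalized fused posterior as a list of floats.
    """
    num_classes = len(score_vectors[0])
    if any(len(s) != num_classes for s in score_vectors):
        raise ValueError("all score vectors must have the same length")
    if priors is None:
        priors = [1.0 / num_classes] * num_classes
    num_classifiers = len(score_vectors)

    log_post = []
    for k in range(num_classes):
        # log P(k | x_1..x_L) = sum_i log P(k | x_i) - (L - 1) log P(k) + const
        value = sum(math.log(max(s[k], eps)) for s in score_vectors)
        value -= (num_classifiers - 1) * math.log(max(priors[k], eps))
        log_post.append(value)

    # Normalize in the log domain for numerical stability.
    peak = max(log_post)
    unnormalized = [math.exp(v - peak) for v in log_post]
    total = sum(unnormalized)
    return [u / total for u in unnormalized]


# Hypothetical posteriors over three action classes from the three classifiers
# named in the abstract (collaborative, sparse, kernel extreme learning machine).
crc_scores = [0.6, 0.3, 0.1]
src_scores = [0.5, 0.4, 0.1]
kelm_scores = [0.2, 0.7, 0.1]

fused = naive_bayes_fuse([crc_scores, src_scores, kelm_scores])
predicted = max(range(len(fused)), key=fused.__getitem__)
```

In this made-up example two of the three classifiers prefer class 0, yet the fused decision is class 1, because the third classifier's strong preference for class 1 outweighs the two weaker votes once the posteriors are multiplied. Raw classifier outputs that are not probabilities (for example, reconstruction residuals) would first need to be mapped to posteriors, a step this sketch leaves out.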