Abstract: Repetitive practice is one of the most important factors in improving motor-skill performance. This paper focuses on the analysis and classification of forearm gestures in the context of violin learning. We recorded five experts and three students performing eight traditional classical violin bow-strokes from the standard technical repertoire: Martelé, Staccato, Détaché, Ricochet, Legato, Trémolo, Collé and Col-legno. To record inertial motion information, we used the Myo sensor, which reports a multidimensional time-series signal. We synchronised the inertial motion recordings with audio data to extract the spatiotemporal dynamics of the gestures. We implemented and compared several state-of-the-art deep neural network architectures: the convolutional neural network (CNN) model reached a recognition rate of 97.147%, the 3DMultiHeaded_CNN model 98.553%, and the CNN_LSTM model 99.234%. The primary purpose of the study is to create a computer assistant that enhances self-practice hours by providing real-time feedback about specific musical material. The collected data (quaternions describing the bowing arm of a violinist) contain sufficient information to distinguish the bowing techniques studied, and deep learning methods are capable of learning the movement patterns that distinguish these techniques. All learning algorithms investigated (CNN, 3DMultiHeaded_CNN and CNN_LSTM) produced high classification accuracies, which supports the feasibility of training classifiers for the problem studied. The resulting classifiers could therefore serve as the basis for such an assistant, with feedback targeted at specific bowing techniques.
Keywords: gesture recognition; bow-strokes; music interaction; CNN; LSTM; music education; ConvLSTM; CNN_LSTM