Journal: Romanian Conference on Human-Computer Interaction
Print ISSN: 2344-1690
Publication year: 2018
Volume: RoCHI 2018
Pages: 54-62
Language: English
Publisher: Matrix ROM
Abstract: This paper illustrates the main methods that can be employed to build a speech and speaker recognition system for the Romanian language. To this end, we start by presenting the classical approach of extracting Mel-Frequency Cepstral Coefficient (MFCC) features from a dataset of speech signals (recordings of words and phrases in Romanian). Recognition is performed either with Dynamic Time Warping (DTW) or by training a Convolutional Neural Network (CNN). A comparison between these models is presented and discussed. Once such a system is developed, we implement an application that listens for and executes a set of predefined commands. In our setup, the system performs two main tasks: it recognizes the user by voice and executes the task corresponding to the spoken command. The source code is available for download.
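
The DTW-based recognition mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes each utterance is already a sequence of per-frame feature vectors (MFCC frames in practice, toy 1-D values here), uses Euclidean distance between frames, and classifies by nearest template. The names `dtw_distance`, `classify`, and the labels `"da"` and `"nu"` are hypothetical.

```python
import math


def dtw_distance(a, b):
    """Classic DTW cost between two sequences of equal-dimension feature vectors."""
    n, m = len(a), len(b)
    # cost[i][j] = minimal accumulated cost of aligning a[:i] with b[:j]
    cost = [[math.inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = math.dist(a[i - 1], b[j - 1])  # Euclidean frame distance (assumption)
            cost[i][j] = d + min(
                cost[i - 1][j],      # a advances, b frame is reused
                cost[i][j - 1],      # b advances, a frame is reused
                cost[i - 1][j - 1],  # both advance
            )
    return cost[n][m]


def classify(query, templates):
    """Return the label whose closest template has the smallest DTW cost.

    templates: dict mapping a label to a list of feature sequences (hypothetical layout).
    """
    return min(
        templates,
        key=lambda label: min(dtw_distance(query, t) for t in templates[label]),
    )


# Toy data standing in for MFCC frames; real features would be multi-dimensional.
templates = {
    "da": [[[0.0], [1.0], [2.0]]],
    "nu": [[[5.0], [5.0]]],
}
query = [[0.0], [1.0], [1.0], [2.0]]  # same shape as "da", spoken more slowly
print(classify(query, templates))
```

Because DTW allows a frame to be reused, the slower query aligns with the "da" template at zero cost even though the two sequences have different lengths, which is the property that makes it suitable for comparing utterances spoken at different speeds.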