Journal: International Journal of Advanced Computer Science and Applications (IJACSA)
Print ISSN: 2158-107X
Electronic ISSN: 2156-5570
Year of publication: 2018
Volume: 9
Issue: 3
DOI: 10.14569/IJACSA.2018.090326
Publisher: Science and Information Society (SAI)
Abstract: Automatic speech recognition allows a machine to understand and process information provided orally by a human user. It relies on matching techniques to compare a sound wave with a set of reference samples, usually composed of words but also of phonemes. The field draws on knowledge from several disciplines: anatomy, phonetics, signal processing, linguistics, computer science, artificial intelligence, and statistics. The latest acoustic modeling methods use deep neural networks for speech recognition. In particular, recurrent neural networks (RNNs) have several characteristics that make them a model of choice for automatic speech processing: they can retain past and future contextual information and take it into account in their decisions. This paper specifically studies the behavior of Long Short-Term Memory (LSTM) neural networks on a specific task of automatic speech processing: speech detection. The LSTM model was compared with two other neural models: the Multi-Layer Perceptron (MLP) and Elman's Recurrent Neural Network (RNN). Tests on five speech detection tasks show the effectiveness of the LSTM model.
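The abstract describes frame-level speech detection with an LSTM compared against an MLP and an Elman RNN, but does not give the architecture or features used by the authors. The code below is therefore only a minimal sketch, assuming frame-wise acoustic features (e.g., 13 MFCC coefficients per frame) and a binary speech/non-speech label per frame; the choice of PyTorch, the class name LSTMSpeechDetector, and all layer sizes are illustrative assumptions, not the paper's implementation.

    import torch
    import torch.nn as nn

    class LSTMSpeechDetector(nn.Module):
        # Hypothetical frame-level speech/non-speech classifier (not the authors' model).
        def __init__(self, n_features=13, hidden_size=64, num_layers=1):
            super().__init__()
            # LSTM over a sequence of acoustic feature frames (e.g., MFCCs)
            self.lstm = nn.LSTM(n_features, hidden_size,
                                num_layers=num_layers, batch_first=True)
            # One output per frame: logit for "this frame contains speech"
            self.classifier = nn.Linear(hidden_size, 1)

        def forward(self, frames):
            # frames: (batch, time, n_features)
            outputs, _ = self.lstm(frames)            # (batch, time, hidden_size)
            logits = self.classifier(outputs)         # (batch, time, 1)
            return torch.sigmoid(logits).squeeze(-1)  # per-frame speech probability

    # Usage on a dummy batch: 2 utterances, 100 frames, 13 coefficients each
    model = LSTMSpeechDetector()
    dummy = torch.randn(2, 100, 13)
    probs = model(dummy)  # shape (2, 100), values in [0, 1]

An MLP baseline of the kind mentioned in the abstract would classify each frame independently from its feature vector, whereas the LSTM (and an Elman RNN) can exploit the surrounding temporal context, which is the property the paper evaluates on its speech detection tasks.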