Journal: International Journal of Advanced Networking and Applications
Electronic ISSN: 0975-0290
Publication year: 2021
Volume: 12
Issue: 6
Pages: 4800-4808
DOI: 10.35444/IJANA.2021.12611
Language: English
Publisher: Eswar Publications
Abstract: To advance the state of the art in human-machine interaction, research on speech recognition systems has increasingly examined speech-to-text conversion, but implementations exist for only a few languages. Although Bengali has received little attention, we present an automatic speech recognition (ASR) system built solely for this language, since around 16% of the world’s population speak Bengali. Implementing Bengali ASR is demanding because the language uses diacritic characters. We apply a series of preprocessing and feature selection methods together with a convolutional neural network model for automatic speech recognition. We then compare this method with a recurrent neural network based on an LSTM and a large Google Inc. dataset. The comparison indicates that the recurrent neural network outperforms the convolutional neural network: the former benefits from combining connectionist temporal classification (CTC) with a language model (LM). A quantitative analysis of the output shows that the word error rate and validation loss are affected by the dropout value, and that both are also affected by the use of clean versus augmented data.
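
Note: The abstract reports results in terms of word error rate (WER) without defining it. As a minimal sketch, assuming the standard definition (word-level edit distance between reference and hypothesis, divided by the number of reference words) and whitespace tokenization, WER can be computed as below. The function name `word_error_rate` and the sample sentences are illustrative and are not taken from the paper, which may tokenize or normalize Bengali text differently.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level Levenshtein distance divided by the reference length.

    Assumption: words are separated by whitespace. Nothing here is
    specific to the paper's evaluation pipeline.
    """
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words seen
    # so far (excluding the current one) and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1        # reference word missing from hypothesis
            insertion = curr[j - 1] + 1   # extra word in hypothesis
            curr[j] = min(substitution, deletion, insertion)
        prev = curr
    return prev[-1] / len(ref)


if __name__ == "__main__":
    # Illustrative Bengali sentence pair: one word of three is dropped.
    print(word_error_rate("আমি ভাত খাই", "আমি খাই"))
```

Because insertions count as errors, WER under this definition can exceed 1.0 when the hypothesis is much longer than the reference.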