Journal title: International Journal of Computer Science and Network Security
Print ISSN: 1738-7906
Publication year: 2010
Volume: 10
Issue: 9
Pages: 96-100
Publisher: International Journal of Computer Science and Network Security
Abstract: This paper describes the preparation of a medium-sized Bangla speech corpus and compares the performance of different acoustic features for Bangla word recognition. Most Bangla automatic speech recognition (ASR) systems use only a small number of speakers, but this work involves 40 speakers selected from a wide area of Bangladesh, where Bangla is spoken as a native language. In the experiments, mel-frequency cepstral coefficients (MFCCs) and local features (LFs) are input to the MLN to improve the hidden Markov model (HMM) based classifiers used to measure word recognition performance. The experiments show that the 39-dimensional MFCC-based method provides a higher word correct rate (WCR) than the other methods investigated. Moreover, the MFCC39-based method achieves a higher WCR with fewer mixture components in the HMM.
Keywords: mel-frequency cepstral coefficients; local features; hidden Markov model; automatic speech recognition; acoustic features