
Article Information

  • Title: Performance Analysis of Different Acoustic Features based on LSTM for Bangla Speech Recognition
  • Author: Nahyan Al Mahmud (Ahsanullah University of Science & Technology)
  • Journal: The International Journal of Multimedia & Its Applications (IJMA)
  • Print ISSN: 0975-5934
  • Electronic ISSN: 0975-5578
  • Publication year: 2020
  • Volume: 12
  • Issue: 1/2/3/4
  • Pages: 1-9
  • DOI: 10.5121/ijma.2020.12402
  • Publisher: Academy & Industry Research Collaboration Center (AIRCC)
  • Abstract: In this work, a new Bangla speech corpus with proper transcriptions has been developed, and various acoustic feature extraction methods have been investigated using a Long Short-Term Memory (LSTM) neural network to find their effective integration into a state-of-the-art Bangla speech recognition system. The acoustic features are usually a sequence of representative vectors extracted from speech signals, and the classes are either words or sub-word units such as phonemes. The most commonly used feature extraction method, linear predictive coding (LPC), has been used first in this work. Two other popular methods, Mel frequency cepstral coefficients (MFCC) and perceptual linear prediction (PLP), have then been applied. These two methods are based on models of the human auditory system. A detailed review of the implementation of these methods is given first. The implementation steps are then elaborated for the development of an automatic speech recognition (ASR) system for Bangla speech.
  • Keywords: Mel frequency cepstral coefficients; linear predictive coding; perceptual linear prediction; sentence correct rates; LSTM