Article information

  • Title: AUTOMATIC SPEECH RECOGNITION SYSTEM FOR KAZAKH LANGUAGE USING CONNECTIONIST TEMPORAL CLASSIFIER
  • Authors: YEDILKHAN AMIRGALIYEV ; DARKHAN KUANYSHBAY ; DIDAR YEDILKHAN
  • Journal: Journal of Theoretical and Applied Information Technology
  • Print ISSN: 1992-8645
  • Electronic ISSN: 1817-3195
  • Publication year: 2020
  • Volume: 98
  • Issue: 4
  • Pages: 703-713
  • Publisher: Journal of Theoretical and Applied
  • Abstract: This report evaluates the performance of the well-known and recently popular Connectionist Temporal Classifier (CTC) neural network for speech recognition. The CTC model uses LSTM layers with 256 cells and is trained with a momentum optimizer (learning rate 0.005, momentum 0.9). The dataset comprises 35 native speakers with 360 utterances. To enlarge the dataset and improve overall performance, augmentation techniques were applied using Adobe Audition software, which added 20 more speakers to the original dataset. The results were evaluated with the label error rate (LER), which measures the discrepancy between the predicted and actual texts. The experiment yielded a training LER of 0.000 and a validation LER of 0.5.
  • Keywords: Recurrent Neural Network; Language Model; Acoustic Model; CTC; Data Augmentation; Time Warping
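
The abstract names LER as the evaluation metric but does not give its formula. The sketch below assumes a common definition: the character-level edit distance between each predicted and reference transcript, divided by the reference length and averaged over utterances. The function names `edit_distance` and `label_error_rate` are illustrative and are not taken from the paper, and the authors' exact computation may differ.

```python
def edit_distance(ref, hyp):
    """Levenshtein distance between two label sequences (strings or lists)."""
    # prev[j] holds the distance between the processed prefix of ref and hyp[:j].
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, 1):
        curr = [i]
        for j, h in enumerate(hyp, 1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def label_error_rate(refs, hyps):
    """Mean length-normalized edit distance over paired transcripts.

    Assumed definition (not confirmed by the abstract):
    LER = mean over utterances of edit_distance(ref, hyp) / len(ref).
    """
    if not refs or len(refs) != len(hyps):
        raise ValueError("refs and hyps must be non-empty and of equal length")
    total = 0.0
    for ref, hyp in zip(refs, hyps):
        if len(ref) == 0:
            raise ValueError("reference transcripts must be non-empty")
        total += edit_distance(ref, hyp) / len(ref)
    return total / len(refs)
```

Under this definition, a training LER of 0.000 means every predicted transcript matches its reference exactly, while a validation LER of 0.5 means that on average half as many character edits as there are reference characters are needed to correct a prediction.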