Article information

Title: Tandem MLNs based Phonetic Feature Extraction for Phoneme Recognition
Authors: Mohammed Rokibul Alam Kotwal; Foyzul Hassan; Ghulam Muhammad; et al.
Journal: International Journal of Computer Information Systems and Industrial Management Applications
Print ISSN: 2150-7988
Electronic ISSN: 2150-7988
Publication year: 2011
Volume: 3
Pages: 88-95
Publisher: Machine Intelligence Research Labs (MIR Labs)
Abstract: This paper presents a method for automatic phoneme recognition for the Japanese language using tandem MLNs. An accurate phoneme recognizer, or phonetic typewriter, plays an important role in current hidden Markov model (HMM)-based automatic speech recognition (ASR) systems: it extracts out-of-vocabulary (OOV) words, resolving the OOV problem that occurs when a new word does not exist in the word lexicon. The proposed method comprises three stages: (i) a first multilayer neural network (MLN) converts acoustic features, mel frequency cepstral coefficients (MFCCs), into distinctive phonetic features (DPFs); (ii) a second MLN takes the DPFs and the acoustic features as input and outputs a 45-dimensional DPF vector with less context effect; and (iii) the 45-dimensional feature vector generated by the second MLN is fed into an HMM-based classifier to obtain more accurate phoneme strings from the input speech. Experiments on the Japanese Newspaper Article Sentences (JNAS) corpus in a clean acoustic environment show that the proposed method provides a higher phoneme correct rate and considerably improves phoneme accuracy over a method based on a single MLN. Moreover, it requires fewer mixture components in the HMMs and, consequently, less computation time for the HMMs.
Keywords: multilayer neural network; hidden Markov model; automatic speech recognition; mel frequency cepstral coefficients; distinctive phonetic features; out-of-vocabulary
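
The data flow of the first two stages described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the weights are random stand-ins for trained parameters, the MFCC dimension (39), the hidden layer size (64), and the first-stage output size (45) are assumptions not given in the abstract, and every function name is hypothetical. Only the 45-dimensional output of the second MLN comes from the source. The third stage, the HMM-based classifier, is omitted.

```python
import math
import random

MFCC_DIM = 39    # assumption: the abstract does not state the MFCC dimension
DPF_DIM = 45     # from the abstract: the second MLN outputs a 45-dimensional DPF vector
HIDDEN_DIM = 64  # assumption: the hidden layer size is not stated


def make_layer(n_in, n_out, rng):
    """Return (weights, biases); random values stand in for trained parameters."""
    weights = [[rng.uniform(-0.1, 0.1) for _ in range(n_in)] for _ in range(n_out)]
    biases = [0.0] * n_out
    return weights, biases


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def dense(layer, inputs):
    """Apply one fully connected layer followed by a sigmoid."""
    weights, biases = layer
    return [sigmoid(sum(w * x for w, x in zip(row, inputs)) + b)
            for row, b in zip(weights, biases)]


def make_mln(n_in, n_hidden, n_out, rng):
    """Build a network with one hidden layer (hypothetical topology)."""
    return [make_layer(n_in, n_hidden, rng), make_layer(n_hidden, n_out, rng)]


def run_mln(mln, inputs):
    out = inputs
    for layer in mln:
        out = dense(layer, out)
    return out


def tandem_features(mfcc_frame, mln1, mln2):
    """Stage (i): MFCCs -> DPFs. Stage (ii): DPFs + MFCCs -> refined DPFs."""
    dpf_stage1 = run_mln(mln1, mfcc_frame)
    stage2_input = dpf_stage1 + mfcc_frame  # list concatenation of both feature sets
    return run_mln(mln2, stage2_input)


rng = random.Random(0)
mln1 = make_mln(MFCC_DIM, HIDDEN_DIM, DPF_DIM, rng)
mln2 = make_mln(DPF_DIM + MFCC_DIM, HIDDEN_DIM, DPF_DIM, rng)
frame = [rng.gauss(0.0, 1.0) for _ in range(MFCC_DIM)]  # one synthetic MFCC frame
features = tandem_features(frame, mln1, mln2)
print(len(features))  # 45
```

In a full system, the per-frame `features` vectors would form the observation sequence passed to the HMM-based classifier of stage (iii).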