Journal: International Journal of Advanced Research In Computer Science and Software Engineering
Print ISSN: 2277-6451
Online ISSN: 2277-128X
Publication year: 2013
Volume: 3
Issue: 6
Publisher: S.S. Mishra
Abstract: Speech synthesis, speech recognition, and speech transformation are essential techniques for human-machine communication. Feature extraction procedures give a compressed representation of speech signals. The harmonic plus noise model (HNM), used for analysis and synthesis, provides high-quality speech with a small number of parameters. Dynamic time warping is used to align two given multidimensional sequences, and the quality of the alignment is assessed from the corresponding distances between the sequences. The objective of this research is to investigate the effect of line spectral frequencies (LSF), HNM, and dynamic time warping on phrase-, word-, and phoneme-based alignments. Twenty-five phrases were recorded. The recorded speech was segmented manually and aligned at the sentence, word, and phoneme levels. The Mahalanobis distance (MD) was calculated between the aligned frames. The study shows better alignment in the HNM parametric domain. It was also observed that speech alignment is effective at the phrase level.
Keywords: LSF; HNM; Mahalanobis distance; speech recognition; speech transformation; dynamic time warping
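
The alignment procedure summarized in the abstract (dynamic time warping over multidimensional feature frames, with the Mahalanobis distance measured between aligned frames) might be sketched as follows. This is a minimal illustration, not the authors' implementation. The abstract does not give the step pattern, the covariance estimate, or whether the Mahalanobis distance also serves as the local DTW cost, so the symmetric three-way step, the pooled covariance, and the function names `pooled_inverse_covariance`, `mahalanobis`, and `dtw_align` are all assumptions made for illustration.

```python
import numpy as np


def pooled_inverse_covariance(a, b):
    """Assumed choice: inverse covariance estimated from the frames of both sequences."""
    cov = np.atleast_2d(np.cov(np.vstack([a, b]), rowvar=False))
    # The pseudo-inverse tolerates a singular covariance (e.g. few frames).
    return np.linalg.pinv(cov)


def mahalanobis(x, y, inv_cov):
    """Mahalanobis distance between two feature vectors."""
    d = x - y
    return float(np.sqrt(d @ inv_cov @ d))


def dtw_align(a, b, inv_cov):
    """Align sequences a (n x d) and b (m x d) with classic DTW.

    Returns the accumulated cost and the warping path as a list of
    (frame index in a, frame index in b) pairs.
    """
    n, m = len(a), len(b)
    cost = np.full((n + 1, m + 1), np.inf)
    cost[0, 0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            local = mahalanobis(a[i - 1], b[j - 1], inv_cov)
            cost[i, j] = local + min(cost[i - 1, j - 1],  # match
                                     cost[i - 1, j],      # advance in a only
                                     cost[i, j - 1])      # advance in b only
    # Backtrack from the end to recover which frames were paired.
    path = []
    i, j = n, m
    while i > 0 and j > 0:
        path.append((i - 1, j - 1))
        candidates = [(cost[i - 1, j - 1], i - 1, j - 1),
                      (cost[i - 1, j], i - 1, j),
                      (cost[i, j - 1], i, j - 1)]
        _, i, j = min(candidates, key=lambda c: c[0])
    path.reverse()
    return float(cost[n, m]), path


# Toy stand-ins for per-frame feature vectors (e.g. LSF or HNM parameters).
a = np.array([[0.0, 0.0], [1.0, 1.0], [2.0, 2.0]])
b = np.array([[0.0, 0.0], [1.0, 1.0], [1.0, 1.0], [2.0, 2.0]])
total, path = dtw_align(a, b, np.eye(2))
```

With the identity matrix as `inv_cov`, the local cost reduces to the Euclidean distance; passing `pooled_inverse_covariance(a, b)` instead scales each feature dimension by its observed spread, which is the usual motivation for using the Mahalanobis distance on parameters with different ranges.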