Journal: International Journal of Computer Science and Network Security
Print ISSN: 1738-7906
Year: 2017
Volume: 17
Issue: 2
Pages: 161-169
Publisher: International Journal of Computer Science and Network Security
Abstract: Language identification and its related areas are attracting growing research attention. People from different backgrounds speak different languages, which creates a barrier to communication between individuals; emerging speech technology can help overcome this barrier. This paper presents an automatic language identification system that distinguishes spoken utterances in Urdu and Sindhi, which are respectively the national language and one of the provincial languages of Pakistan. The proposed approach is based on audio feature extraction, vector quantization for phoneme codebook generation, and a multi-class support vector machine for classifying and identifying the respective languages. The experimental results are encouraging and indicate that the proposed approach is effective at identifying spoken utterances in these two languages of Pakistan in a real environment.
Keywords: Language Identification; MFCCs (Mel Frequency Cepstral Coefficients); Vector Quantization (VQ); Support Vector Machine (SVM); Languages of Pakistan
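
Illustrative note: the vector quantization step named in the abstract can be sketched roughly as follows. This is a minimal sketch under stated assumptions, not the authors' implementation. It assumes that each utterance has already been converted into a list of per-frame feature vectors (for example MFCCs, which are not computed here), that the codebook is trained with plain k-means, and that each utterance is then summarized as a normalized histogram of codeword usage that a classifier such as an SVM could consume. The function names `train_codebook` and `codeword_histogram` are hypothetical.

```python
import random


def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def nearest(codebook, frame):
    """Index of the codeword closest to the given frame."""
    return min(range(len(codebook)), key=lambda i: sq_dist(codebook[i], frame))


def train_codebook(frames, k, iters=20, seed=0):
    """Train a k-entry VQ codebook with plain k-means (assumed training rule)."""
    rng = random.Random(seed)
    # Initialize codewords from k distinct training frames.
    codebook = [list(f) for f in rng.sample(frames, k)]
    for _ in range(iters):
        clusters = [[] for _ in range(k)]
        for f in frames:
            clusters[nearest(codebook, f)].append(f)
        for i, members in enumerate(clusters):
            if members:  # an empty cluster keeps its previous codeword
                dim = len(members[0])
                codebook[i] = [
                    sum(m[d] for m in members) / len(members) for d in range(dim)
                ]
    return codebook


def codeword_histogram(codebook, frames):
    """Normalized count of how often each codeword is the nearest one."""
    counts = [0] * len(codebook)
    for f in frames:
        counts[nearest(codebook, f)] += 1
    return [c / len(frames) for c in counts]


# Toy 2-D vectors standing in for real per-frame features.
toy_frames = [[0, 0], [0, 1], [1, 0], [10, 10], [10, 11], [11, 10]]
toy_codebook = train_codebook(toy_frames, k=2)
toy_features = codeword_histogram(toy_codebook, toy_frames)
```

In such a pipeline, the fixed-length histogram (one per utterance) would serve as the input vector to the classifier; the feature extraction and SVM stages are deliberately left out of this sketch.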