摘要:Summary In presence of environmental noise, speakers tend to emphasize their vocal effort to improve the audibility of voice. This involuntary adjustment is known as Lombard effect (LE). Due to LE the signal to noise ratio of speech increases, but at the same time the loudness, pitch and duration of phonemes changes. Hence, accuracy of automatic speech recognition systems degrades. In this paper, the effect of unsupervised equalization of Lombard effect is investigated for Hindi vowel classification task using Hindi database designed at TIFR Mumbai, India. Proposed Quantile-based Dynamic Cepstral Normalization MFCC (QCN-MFCC) along with baseline MFCC features have been used for vowel classification. Hidden Markov Model (HMM) is used as classifier. It is observed that QCN-MFCC features have given a maximum improvement of 5.97% and 5% over MFCC features for context-dependent and context-independent cases respectively. It is also observed that QCN-MFCC features have given improvement of 13% and 11.5% over MFCC features for context-dependent and context-independent classification of mid vowels.
关键词:Speech recognition; Lombard effect; QCN; Hidden Markov Model (HMM); Mel Frequency Cepstral Coefficients (MFCC);