Publisher: Vilnius University, University of Latvia, Latvia University of Agriculture, Institute of Mathematics and Informatics of University of Latvia
Abstract: Feature systems based on auditory models include filterbank analysis and nonlinear compression of the speech signal. Mel Frequency Cepstral Coefficients (MFCC) are the state-of-the-art feature system employing this auditory model. In this paper we propose to modify MFCC analysis by applying a power nonlinearity instead of the logarithmic one and by changing the size of the filterbank. The power nonlinearity increased the recognition rate of degraded speech by 2.4 %. In combination with a reduced filterbank size (down to 20 filters), the power nonlinearity enhanced the robustness of speech recognition: the gain in recognition rate varied from 0.6 % to 3 % in comparison with common MFCC features, depending on the noise level.
Keywords: speech recognition; Mel frequency cepstral coefficients; band-pass filters; nonlinearity; coefficient; power nonlinearity
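The modification described in the abstract (a power nonlinearity in place of the logarithm, applied to the outputs of a mel filterbank) can be sketched for a single frame as follows. This is a minimal illustration, not the authors' implementation. The abstract gives only the filterbank size of 20. The exponent of 1/3, the Hamming window, the mel-scale formula, the 13 cepstral coefficients and all function names are assumptions chosen for the example.

```python
import numpy as np


def hz_to_mel(hz):
    # A common mel-scale convention; the paper's exact formula is not given here.
    return 2595.0 * np.log10(1.0 + np.asarray(hz, dtype=float) / 700.0)


def mel_to_hz(mel):
    return 700.0 * (10.0 ** (np.asarray(mel, dtype=float) / 2595.0) - 1.0)


def mel_filterbank(n_filters, n_fft, sample_rate):
    """Triangular filters evenly spaced on the mel scale.

    Returns an array of shape (n_filters, n_fft // 2 + 1).
    """
    mel_edges = np.linspace(hz_to_mel(0.0), hz_to_mel(sample_rate / 2.0),
                            n_filters + 2)
    hz_edges = mel_to_hz(mel_edges)
    freqs = np.fft.rfftfreq(n_fft, d=1.0 / sample_rate)
    bank = np.zeros((n_filters, freqs.size))
    for i in range(n_filters):
        lo, mid, hi = hz_edges[i], hz_edges[i + 1], hz_edges[i + 2]
        rising = (freqs - lo) / (mid - lo)
        falling = (hi - freqs) / (hi - mid)
        # The triangle is the lower of the two slopes, floored at zero.
        bank[i] = np.clip(np.minimum(rising, falling), 0.0, None)
    return bank


def dct_ii(x, n_out):
    """Unnormalised DCT-II of a 1-D vector, keeping the first n_out terms."""
    n = x.shape[-1]
    k = np.arange(n_out)[:, None]
    m = np.arange(n)[None, :]
    basis = np.cos(np.pi * k * (m + 0.5) / n)
    return x @ basis.T


def cepstral_features(frame, sample_rate, n_filters=20, n_ceps=13,
                      compression="power", exponent=1.0 / 3.0):
    """Cepstral coefficients of one frame with a selectable nonlinearity.

    compression="log" gives conventional MFCC-style compression;
    compression="power" raises filterbank energies to `exponent`
    (the value 1/3 is an assumed, illustrative choice).
    """
    windowed = frame * np.hamming(frame.size)
    power_spectrum = np.abs(np.fft.rfft(windowed)) ** 2
    energies = mel_filterbank(n_filters, frame.size, sample_rate) @ power_spectrum
    energies = np.maximum(energies, 1e-10)  # avoid log(0)
    if compression == "log":
        compressed = np.log(energies)
    elif compression == "power":
        compressed = energies ** exponent
    else:
        raise ValueError("compression must be 'log' or 'power'")
    return dct_ii(compressed, n_ceps)
```

Calling `cepstral_features(frame, 16000, compression="log")` and `cepstral_features(frame, 16000, compression="power")` on the same 400-sample frame yields two 13-element vectors that differ only in the compression step, which makes it straightforward to compare the two front ends in a recognition experiment.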