期刊名称:International Journal of Signal Processing, Image Processing and Pattern Recognition
印刷版ISSN:2005-4254
出版年度:2014
卷号:7
期号:2
页码:355-364
DOI:10.14257/ijsip.2014.7.2.33
出版社:SERSC
摘要:A practical speech enhancement system consists of two major components, the estimation of noise power spectrum, and the estimation of speech.In single channel speech enhancement systems, most algorithms require an estimation of average noise spectrum since a secondary channel is not available. This requires a reliable speech/silence detector. Thus the speech/silence detection can be a determining factor for the performance of the whole speech enhancement system. The speech/silence detection finds out the frames of the noisy speech that contain only noise. If the speech/silence detection is not accurate then speech echoes and residual noise tend to be present in the enhanced speech. The performance of noise estimation algorithm is usually a tradeoff between speech distortion and noise reduction. In existing methods, noise is estimated only during speech pauses and these pauses are identified using Voice Activity Detector (VAD). This paper describes novel noise estimation method to estimate noise in non-stationary environments. This approach uses an algorithm that classifies noisy speech signal into pure speech, quasi speech and non-speech frames based on adaptive thresholds without using of VAD.Speech presence is determined by computing the ratio of the noisy speech power spectrum to its local minimum, which is computed by averaging past values of the noisy speech power spectra with a look-ahead factor. To evaluate proposed method performance, segmental SNR as evaluation criteria and compared with weighted average noise estimation method. The simulation results of the proposed algorithm shows better performance than conventional methods.