Basic article information

  • Title: EVALUATION OF FEATURES FOR VOICE ACTIVITY DETECTION USING DEEP NEURAL NETWORK
  • Authors: SUCI DWIJAYANTI ; MASATO MIYOSHI
  • Journal: Journal of Theoretical and Applied Information Technology
  • Print ISSN: 1992-8645
  • Electronic ISSN: 1817-3195
  • Publication year: 2018
  • Volume: 96
  • Issue: 4
  • Page: 1114
  • Publisher: Journal of Theoretical and Applied
  • Abstract: Voice activity detection (VAD) is implemented in the preprocessing stage of various speech applications to identify speech and non-speech periods. Recently, deep neural networks (DNNs) have been utilized for VAD given their superior performance over other methods. When used to identify speech and non-speech periods, DNNs depend on their input features to discriminate speech from noise, and a variety of features have therefore been used as input for DNN-based VAD. However, the contribution and effectiveness of such features have not been thoroughly evaluated. In this paper, we address these aspects by comparing five features that are widely used in speech processing, namely log power spectra, filter bank, mel-frequency cepstral coefficients, relative spectral perceptual linear predictive analysis, and amplitude modulation spectrogram, and evaluating their performance in a DNN-based VAD. Experiments on the TIMIT speech corpus show that the amplitude modulation spectrogram performs best, maintaining high accuracy even on speech data with a low signal-to-noise ratio. The next best feature is log power spectra, which can be considered a raw feature because it requires less computation and processing than the other features. This suggests that raw features may be suitable inputs for DNN-based VAD. Moreover, limiting the number and processing of features for DNNs may improve system performance, real-time applicability, and portability of VAD by reducing the computational cost and the required memory and storage.
  • Keywords: DNN; Speech Period; Speech Features; Voice Activity Detection; Amplitude Modulation Spectrogram; Log Power Spectra
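
The abstract describes log power spectra as a "raw" feature that needs little processing. As a rough illustration of what that can mean, the sketch below frames a signal, applies a Hann window, and takes the log of the squared DFT magnitude. It is not the authors' implementation: the function name `log_power_spectra`, the 25 ms frame (400 samples), 10 ms hop (160 samples), 512-point DFT, and the `eps` floor are all assumptions chosen for a 16 kHz signal, and the direct DFT uses only the standard library for portability (an FFT routine would normally replace it).

```python
import cmath
import math

def log_power_spectra(signal, frame_len=400, hop=160, n_fft=512, eps=1e-10):
    """Return one list of n_fft // 2 + 1 log-power values per frame.

    All default parameter values are illustrative assumptions,
    not settings taken from the paper.
    """
    # Hann window to reduce spectral leakage at the frame edges.
    window = [0.5 - 0.5 * math.cos(2 * math.pi * n / (frame_len - 1))
              for n in range(frame_len)]
    # Precomputed complex exponentials for a direct (non-fast) DFT.
    twiddle = [cmath.exp(-2j * math.pi * m / n_fft) for m in range(n_fft)]
    features = []
    # Only complete frames are used; a too-short signal yields no frames.
    for start in range(0, len(signal) - frame_len + 1, hop):
        frame = [signal[start + n] * window[n] for n in range(frame_len)]
        row = []
        # Keep the non-negative frequency bins only (real input signal).
        for k in range(n_fft // 2 + 1):
            x_k = sum(frame[n] * twiddle[(k * n) % n_fft]
                      for n in range(frame_len))
            # eps keeps the logarithm finite for silent bins.
            row.append(math.log(abs(x_k) ** 2 + eps))
        features.append(row)
    return features

# Usage: 0.1 s of a synthetic 1 kHz tone sampled at 16 kHz.
sample_rate = 16000
tone = [math.sin(2 * math.pi * 1000 * n / sample_rate) for n in range(1600)]
features = log_power_spectra(tone)
peak_bin = max(range(len(features[0])), key=lambda k: features[0][k])
print(len(features), len(features[0]), peak_bin)
```

With these assumed settings each bin spans 16000 / 512 = 31.25 Hz, so the 1 kHz tone should peak at bin 32. Each row would be one input vector for a frame-level classifier; features such as filter bank energies or mel-frequency cepstral coefficients apply further transforms on top of this power spectrum, which is the extra processing the abstract refers to.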