Article information

Title: A Speech/Music Discriminator based on Frequency energy, Spectrogram and Autocorrelation
Authors: Sumit Kumar Banchhor; Om Prakash Sahu; Prabhakar et al.
Journal: International Journal of Soft Computing & Engineering
Electronic ISSN: 2231-2307
Year of publication: 2012
Volume: 2
Issue: 1
Pages: 480-483
Publisher: International Journal of Soft Computing & Engineering
Abstract: Over the last few years, major efforts have been made to develop methods for extracting information from audio-visual media so that it can be stored in and retrieved from databases automatically, based on its content. In this work we deal with the characterization of an audio signal, which may be part of a larger audio-visual system or may be autonomous, as for example in the case of an audio recording stored digitally on disk. Our goal was first to develop a system for segmenting the audio signal and then to classify each segment into one of two main categories: speech or music. Segmentation is based on the distribution of the mean signal amplitude, whereas classification uses an additional frequency-related characteristic. The basic characteristics are computed over 2-second intervals, so segment boundaries are located to within 2 seconds. The results show the difference between the human voice and musical instruments.
Keywords: speech/music; classification; audio; segmentation; zero crossing rate; short time energy; spectrum; flux.
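
The abstract mentions features computed over 2-second intervals, and the keywords name short time energy and zero crossing rate. The sketch below illustrates how such per-interval features might be computed. It is not the authors' implementation: the function name `frame_features`, the non-overlapping windows, the normalizations, and the synthetic 440 Hz test tone are all assumptions made for illustration.

```python
import math


def frame_features(samples, sample_rate, window_sec=2.0):
    """Compute per-window features over non-overlapping windows.

    Returns a list of (mean_abs_amplitude, short_time_energy, zero_crossing_rate)
    tuples, one per full window. The exact definitions are assumptions; the
    abstract does not specify them.
    """
    window = int(sample_rate * window_sec)
    features = []
    # Step through the signal one full window at a time; a trailing partial
    # window is ignored.
    for start in range(0, len(samples) - window + 1, window):
        frame = samples[start:start + window]
        # Mean absolute amplitude of the window.
        mean_abs = sum(abs(x) for x in frame) / window
        # Short-time energy, normalized by window length.
        energy = sum(x * x for x in frame) / window
        # Zero crossing rate: fraction of adjacent sample pairs whose signs differ.
        crossings = sum(
            1 for a, b in zip(frame, frame[1:]) if (a >= 0) != (b >= 0)
        )
        zcr = crossings / (window - 1)
        features.append((mean_abs, energy, zcr))
    return features


# Hypothetical input: 4 seconds of a 440 Hz sine tone sampled at 8 kHz.
sample_rate = 8000
tone = [
    math.sin(2 * math.pi * 440 * n / sample_rate)
    for n in range(4 * sample_rate)
]
feats = frame_features(tone, sample_rate)
```

With 2-second windows, a 4-second signal yields two feature tuples. A classifier along the lines the abstract describes would then compare such values across windows, for instance by thresholding, though the thresholds and decision rule are not given in this record.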