Article Information

Title: Search the Audio, Browse the Video—A Generic Paradigm for Video Collections
Authors: Arnon Amir; Savitha Srinivasan; Alon Efrat; et al.
Journal: EURASIP Journal on Advances in Signal Processing
Print ISSN: 1687-6172
Electronic ISSN: 1687-6180
Publication year: 2003
Volume: 2003
Issue: 3
Pages: 209-222
DOI: 10.1155/S111086570321012X
Publisher: Hindawi Publishing Corporation
Abstract:
The amount of digital video being shot, captured, and stored is growing at a rate faster than ever before. The large amount of stored video is not penetrable without efficient video indexing, retrieval, and browsing technology. Most prior work in the field can be roughly categorized into two classes. One class is based on image processing techniques, often called content-based image and video retrieval, in which video frames are indexed and searched for visual content. The other class is based on spoken document retrieval, which relies on automatic speech recognition and text queries. Both approaches have major limitations. In the first approach, semantic queries pose a great challenge, while the second, speech-based approach, does not support efficient video browsing. This paper describes a system where speech is used for efficient searching and visual data for efficient browsing, a combination that takes advantage of both approaches. A fully automatic indexing and retrieval system has been developed and tested. Automated speech recognition and phonetic speech indexing support text-to-speech queries. New browsable views are generated from the original video. A special synchronized browser allows instantaneous, context-preserving switching from one view to another. The system was successfully used to produce searchable-browsable video proceedings for three local conferences.