Publisher: The Institute of Image Information and Television Engineers
Abstract: We have been studying a real-time speech-to-caption system that uses speech recognition technology with a repeat-speaking method. In this system, a repeat speaker listens to a lecturer's voice and speaks the lecturer's utterances back into a speech recognition computer. Our system achieved a caption accuracy of about 97% for Japanese-to-Japanese conversion and a voice-to-caption conversion time of about 4 seconds for English-to-English conversion at several international conferences, although achieving this performance was costly. In human communication, speech understanding depends not only on verbal information but also on non-verbal information such as the speaker's gestures and face and mouth movements. We therefore sought a suitable way to display captions together with images of the speaker's face movements, after briefly buffering both in a computer, to achieve higher comprehension. In this paper, we investigated how the display sequence and display timing of captions containing speech recognition errors, relative to images of the speaker's face movements, affect comprehension. The results showed that displaying the caption before the speaker's face image improved comprehension of the captions. Displaying both simultaneously improved comprehension by only a few percent over the question sentence alone, and displaying the speaker's face image before the caption produced almost no change. In addition, displaying the caption 1 second before the speaker's face image yielded the largest improvement of all conditions for hearing-impaired participants.