Journal: International Journal of Multimedia and Ubiquitous Engineering
Print ISSN: 1975-0080
Publication year: 2015
Volume: 10
Issue: 5
Pages: 355-362
DOI: 10.14257/ijmue.2015.10.5.33
Publisher: SERSC
Abstract: Given the importance of textual information in content retrieval, the textual representation of educational videos on social media platforms such as YouTube should capture the semantics of the content it describes. Coherent textual representations of this kind are important for objective video content retrieval, repurposing, reuse, and sense-making. In this study, Automatic Speech Recognition (ASR) transcripts of the video tracks were used to supplement the insufficient representation of video content provided by the title alone. A Gibbs sampling implementation of Latent Dirichlet Allocation (LDA) topic modeling was used to evaluate the suitability of various textual representations for YouTube educational videos and to extract the candidate topic that extends the original YouTube keywords well. The results show that, in topic space, the YouTube ASR script performs better as a representative textual source for the dominant topic than the combined textual representations. The automatic keyword extensions obtained using our method add value to applications that use tags for content discovery or retrieval.
Keywords: content discovery; textual representation; Gibbs sampling; video ASR; scripts; topic modeling