Journal: International Journal of Computer Science and Information Technologies
Electronic ISSN: 0975-9646
Year of publication: 2014
Volume: 5
Issue: 4
Pages: 5221-5224
Publisher: TechScience Publications
Abstract: Nowadays, search engines retrieve images without analyzing their content, simply by matching user queries against the image's file name and format, user-annotated tags, captions, and, more generally, the text surrounding the image. Moreover, the retrieved image carries no textual data alongside the picture itself. We introduce the task of automatic caption generation for news images. The task fuses insights from computer vision and natural language processing and holds promise for various multimedia applications, such as image retrieval, the development of tools supporting news media management, and assistance for people with disabilities. It is possible to learn a caption generation model from weakly labeled data without costly manual involvement. Instead of manually creating annotations, image captions are treated as labels for the image. Although the caption words are admittedly noisy compared to traditional human-created keywords, we show that they can be used to learn the correspondences between the visual and textual modalities, and can also serve as a gold standard for the caption generation task. We present extractive and abstractive caption generation models. A key aspect of our approach is to allow both the visual and textual modalities to influence the generation task.