Basic Article Information

Title: Generating Captions for Images Using Multimodal Neural Networks
Authors: Dhoomil Sheta; Parth Parekh; Nishant Shah; et al.
Journal: International Journal of Innovative Research in Computer and Communication Engineering
Print ISSN: 2320-9798
Electronic ISSN: 2320-9801
Publication year: 2017
Volume: 5
Issue: 7
Page: 13727
DOI: 10.15680/IJIRCCE.2017.0507034
Publisher: S&S Publications
Abstract: Automatically generating captions for images has long been an area of research in Artificial Intelligence. We propose a model that trains on image representations to generate captions for images using Recurrent Neural Networks. Image representations are extracted using the highly efficient Deep Residual Network (ResNet-50) [0]. We also present an extension to the traditional LSTM that improves caption generation performance. We use the Flickr8k dataset and validate performance using widely accepted metrics such as BLEU, CIDEr, and METEOR.