Journal: IAENG International Journal of Computer Science
Print ISSN: 1819-656X
Electronic ISSN: 1819-9224
Year: 2021
Volume: 48
Issue: 3
Language: English
Publisher: IAENG - International Association of Engineers
Abstract: In deep-learning image annotation models, the number of neurons in the output layer is proportional to the size of the annotation vocabulary, so the model structure changes whenever the vocabulary changes, which reduces annotation accuracy. To solve this problem, this study proposes a new annotation model that combines an improved Wasserstein generative adversarial network (GAN) with word2vec. First, word2vec maps each word in the annotation vocabulary to a word vector of fixed dimension. Second, a neural network model (GAN-IW) is built on the generative adversarial network, so that the number of neurons in its output layer equals the dimension of the word vector and no longer depends on the vocabulary size. Finally, the model was tested on the Corel 5K and IAPRTC-12 image annotation datasets. On Corel 5K, the accuracy, recall rate, and F1 value increased by 16%, 6%, and 9%, respectively, compared with the convolutional neural network regression method. On IAPRTC-12, they increased by 8%, 6%, and 4%, respectively, compared with the two-pass K-nearest neighbor model. The experimental results show that the GAN-IW model removes the dependence of the number of output neurons on the vocabulary size and that the number of labels assigned to each image is adaptive, which brings the model's annotations closer to actual image annotation.
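The idea of a fixed-size output that is decoded back into a variable number of tags can be illustrated with a short sketch. The abstract does not describe how GAN-IW turns its output vector into labels, so everything below is an assumption made for illustration: the cosine-similarity decoding rule, the threshold value, the toy 3-dimensional embeddings standing in for word2vec vectors, and the names `cosine`, `decode_tags`, and `TOY_EMBEDDINGS`.

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)

def decode_tags(predicted, embeddings, threshold=0.8):
    """Return every tag whose embedding is close enough to the prediction.

    Assumed decoding rule (not taken from the paper): keep all words whose
    cosine similarity to the predicted vector is at least `threshold`, so
    the number of returned tags varies from image to image.
    """
    scored = [(cosine(predicted, vec), word) for word, vec in embeddings.items()]
    kept = sorted((s, w) for s, w in scored if s >= threshold)
    kept.reverse()  # most similar tag first
    return [word for _, word in kept]

# Hypothetical 3-dimensional vectors standing in for real word2vec embeddings.
TOY_EMBEDDINGS = {
    "sky": [1.0, 0.0, 0.0],
    "sea": [0.0, 1.0, 0.0],
    "tiger": [0.0, 0.0, 1.0],
}

# Stand-ins for the generator output: always 3 numbers, whatever the vocabulary size.
one_tag = decode_tags([0.9, 0.1, 0.0], TOY_EMBEDDINGS, threshold=0.8)
two_tags = decode_tags([0.7, 0.7, 0.0], TOY_EMBEDDINGS, threshold=0.7)

# Growing the vocabulary only adds a dictionary entry; the output size stays 3.
bigger = dict(TOY_EMBEDDINGS, beach=[0.7, 0.7, 0.0])
with_new_word = decode_tags([0.7, 0.7, 0.0], bigger, threshold=0.9)
```

In this sketch the prediction always has as many components as the embedding dimension, and adding "beach" to the vocabulary changes only the lookup table, not the shape of the output. A real system would use trained word2vec vectors and whatever decoding rule the paper actually specifies.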