Journal: International Journal of Software Engineering and Its Applications
Print ISSN: 1738-9984
Year: 2016
Volume: 10
Issue: 11
Pages: 153-160
DOI: 10.14257/ijseia.2016.10.11.13
Publisher: SERSC
Abstract: Most information in Web 2.0 is created by users and classified by user-assigned tags. Tag-related services and research have focused on tasks such as automatic tagging and tag-cloud composition; however, classifying media resources and information according to tags and providing the results to users is still not up to the mark. In this paper, image resources and their tag information scattered across the web are collected, and a tag-pair weight matrix is created according to the relations and semantic similarities between tags. To overcome the problems of existing systems, a tag-pair weight matrix-based tag clustering (TBTC) algorithm is proposed to find highly related tags. The threshold used for clustering in this algorithm was studied, and an optimal threshold with high cluster cohesion was determined. Finally, as an experiment, 500 images were retrieved from the Flickr website with the keyword 'tomato', and highly related tags were derived using the proposed algorithm. The results were examined and compared with those of existing studies. The proposed approach was found to achieve higher accuracy and precision than earlier methods.
Keywords: Tag; highly related tag; threshold; cluster; information retrieval
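
Illustrative sketch: the abstract describes building a tag-pair weight matrix and clustering tags against a threshold, but it does not give the weight formula or the clustering rule. The Python below is only a minimal sketch of that general idea under stated assumptions: weights are taken to be the Jaccard co-occurrence of two tags across images, and clusters are taken to be connected components over pairs whose weight meets the threshold. The function names `tag_pair_weights` and `cluster_tags` and the sample data are hypothetical and are not taken from the paper.

```python
from collections import defaultdict
from itertools import combinations


def tag_pair_weights(tagged_images):
    """Return {(tag_a, tag_b): weight} for every co-occurring tag pair.

    Assumption: weight = Jaccard index of the two tags' image sets,
    i.e. images with both tags / images with either tag.
    """
    tag_count = defaultdict(int)
    pair_count = defaultdict(int)
    for tags in tagged_images:
        unique = sorted(set(tags))  # sorted so each pair has one canonical key
        for tag in unique:
            tag_count[tag] += 1
        for a, b in combinations(unique, 2):
            pair_count[(a, b)] += 1
    return {
        (a, b): n / (tag_count[a] + tag_count[b] - n)
        for (a, b), n in pair_count.items()
    }


def cluster_tags(weights, threshold):
    """Group tags linked by pairs with weight >= threshold (union-find).

    Assumption: a cluster is a connected component of the thresholded graph.
    Tags that never co-occur with another tag do not appear in the result.
    """
    parent = {}

    def find(x):
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    for (a, b), w in weights.items():
        root_a, root_b = find(a), find(b)
        if w >= threshold and root_a != root_b:
            parent[root_a] = root_b

    clusters = defaultdict(set)
    for tag in list(parent):
        clusters[find(tag)].add(tag)
    # Largest clusters first, ties broken alphabetically.
    return sorted((sorted(c) for c in clusters.values()),
                  key=lambda c: (-len(c), c))


if __name__ == "__main__":
    # Hypothetical tag lists, standing in for tags collected from image search results.
    images = [
        ["tomato", "red", "vegetable"],
        ["tomato", "red", "garden"],
        ["tomato", "vegetable", "salad"],
        ["garden", "flower"],
    ]
    weights = tag_pair_weights(images)
    for threshold in (0.6, 0.5):
        print(threshold, cluster_tags(weights, threshold))
```

Lowering the threshold merges more tags into each cluster, which is why the choice of threshold affects cluster cohesion; how the paper selects its optimal value is not stated in the abstract.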