Article Information

Title: Mining the Web for Association Similarity between Concepts
Authors: CHEN, Li; SONG, Zi-lin; ZHANG, Ying et al.
Journal: Journal of Software
Print ISSN: 1796-217X
Publication year: 2012
Volume: 7
Issue: 5
Pages: 1006-1013
DOI: 10.4304/jsw.7.5.1006-1013
Language: English
Publisher: Academy Publisher
Abstract: Measures of similarity between two terms or concepts are widely used in Natural Language Processing, the Semantic Web, and related fields. There are two main kinds of methods for measuring similarity. One is based on a manually built taxonomy or ontology; the other, usually referred to as the statistical approach, is based on a corpus. However, ontology-based methods suffer from limited coverage, and corpus-based methods suffer from sparse data. To overcome these problems, the World Wide Web is used as a huge data source for calculating similarity between concepts. Concept similarity is measured by applying association rule mining to the snippets returned by Web search engines. The most influential algorithm for association rule mining is Apriori. To improve the efficiency of the Apriori algorithm and adapt it to measuring concept similarity, three main improvements are made to it. The experimental results show that the improved algorithm increases the precision of concept similarity measurement.
Keywords: concept similarity; snippets; association similarity; improved Apriori algorithm