Journal: International Journal of Computer Science and Network Security
Print ISSN: 1738-7906
Year: 2007
Volume: 7
Issue: 8
Pages: 20-28
Publisher: International Journal of Computer Science and Network Security
Abstract: Technology has given organizations that implement information systems for competitive advantage a faster and more efficient way of accessing information through the web and databases. The simplest way of filtering information is to extract keywords and use them to measure a document's relevance. Nonetheless, finding the right document is often a problem. Synonymy, i.e., two words with the same meaning (for example, taxi and cab), is a major problem in information searching. This work applies soft computing techniques, encompassing both fuzzy set theory and probability theory, to information retrieval. We propose an algorithm for computing asymmetric word similarities (AWS) to overcome the synonymy problem. The similarities are computed using mass assignment based on fuzzy sets of words. A key feature of our algorithm is that it is incremental, i.e., words (and documents) can be added or removed without extensive re-computation. AWS produced similarity measures consistently 10% higher than the tf.idf algorithm and grouped documents successfully.
Keywords: fuzzy set, soft computing, asymmetric word similarity, information retrieval