Journal: International Journal of Advanced Computer Research
Print ISSN: 2249-7277
Electronic ISSN: 2277-7970
Publication year: 2015
Volume: 5
Issue: 21
Pages: 334-346
Publisher: Association of Computer Communication Education for National Triumph (ACCENT)
Abstract: Current vector space models, such as TF/IDF, do not take the relations between terms into account; they only combine the term frequency in a document with the inverse document frequency across the whole database to determine the importance score (weight) of a term with respect to the document. Here we discover lexical relations among terms in the document to improve the TF/IDF vector space model. The weight that TF/IDF generates for each term is improved using the lexical relations among related terms in the document. We evaluate the proposed method using documents selected from Wikipedia. The results show that the proposed method is significantly effective.
Keywords: Vector space model; TF/IDF; Semantics; Information retrieval; Natural language processing.
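
For readers unfamiliar with the baseline that the abstract refers to, the following is a minimal sketch of plain TF/IDF weighting. It is not the authors' proposed method, and it does not include the lexical-relation adjustment, which this record does not specify. The function name `tf_idf`, the length-normalized term frequency, and the natural-log form of the inverse document frequency are assumptions chosen for illustration; several other variants are in common use.

```python
import math
from collections import Counter


def tf_idf(docs):
    """Return one {term: weight} dict per document.

    docs is a list of token lists. Assumed variant (not taken from the paper):
    tf = count / document length, idf = ln(N / document frequency).
    """
    n_docs = len(docs)

    # Document frequency: number of documents in which each term occurs.
    df = Counter()
    for tokens in docs:
        df.update(set(tokens))

    weights = []
    for tokens in docs:
        counts = Counter(tokens)
        total = len(tokens)
        weights.append({
            term: (count / total) * math.log(n_docs / df[term])
            for term, count in counts.items()
        })
    return weights


# Toy corpus (hypothetical data, for illustration only).
docs = [
    ["vector", "space", "model"],
    ["vector", "term", "weight"],
    ["lexical", "relation", "term"],
]
weights = tf_idf(docs)
```

In this toy corpus, "space" occurs in one document and "vector" in two, so "space" receives the higher weight in the first document. Each term is scored independently of every other term, which is the limitation the abstract describes.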