Journal: International Journal of Computer Science and Information Technologies
Electronic ISSN: 0975-9646
Publication year: 2014
Volume: 5
Issue: 6
Pages: 8138-8143
Publisher: TechScience Publications
Abstract: The quantity and complexity of information available on the internet are increasing rapidly, so new methods are needed for retrieving documents and ranking them by their relevance to the user query. Information retrieval helps in finding meaningful and relevant documents on the semantic web. The concept of semantic similarity is used in a variety of fields, such as artificial intelligence, natural language processing, cognitive science, and biomedical informatics. Despite many developments, maintaining an up-to-date ordering of documents for a user's request remains a major challenge. Our approach uses natural language processing techniques to preprocess documents, including conversion to a standard representation, stop-word removal, and stemming. Keywords are then extracted, and a correlation value between the query keywords and the document keywords is calculated from WordSimilarity-353 Test Collection values and used to rank the documents. The novel approach suggested in this paper depends not only on the syntactic structure of the document but also on its semantic structure. It includes both lexical and conceptual matching. The combination of conceptual, linguistic, and ontology-based matching can significantly improve the performance of the retrieval system. In our experiments, this semantic-similarity-based ranking methodology gave much better results than traditional methods.
Keywords: natural language processing; information retrieval; semantic similarity; ontology
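
As an illustration only, the pipeline outlined in the abstract (preprocessing, keyword extraction, then ranking by a word-similarity score between query and document keywords) could be sketched as follows. Everything specific here is an assumption, since the abstract does not give these details: the stop-word list, the crude plural stripping used in place of real stemming, the frequency-based keyword extraction, the best-match averaging in `score`, and all function names. The similarity scores are placeholders on a 0-10 scale, not values taken from the WordSimilarity-353 Test Collection.

```python
import re
from collections import Counter

# Hypothetical, deliberately tiny stop-word list.
STOP_WORDS = {"a", "an", "and", "the", "of", "in", "on", "is", "are", "to", "for", "with"}

# Placeholder scores on a 0-10 scale; NOT actual WordSimilarity-353 values.
SIMILARITY = {
    frozenset(("car", "automobile")): 9.0,
    frozenset(("car", "train")): 6.0,
    frozenset(("car", "fruit")): 1.0,
}


def strip_plural(word):
    """Crude stand-in for stemming: drop a trailing 's' from longer words."""
    return word[:-1] if len(word) > 3 and word.endswith("s") else word


def extract_keywords(text, top_n=5):
    """Lowercase, tokenize, normalize, drop stop words, keep the most frequent terms."""
    tokens = [strip_plural(t) for t in re.findall(r"[a-z]+", text.lower())]
    counts = Counter(t for t in tokens if t not in STOP_WORDS)
    return [word for word, _ in counts.most_common(top_n)]


def word_similarity(a, b):
    """Look up a pair score; identical words get the maximum, unknown pairs get 0."""
    if a == b:
        return 10.0
    return SIMILARITY.get(frozenset((a, b)), 0.0)


def score(query_keywords, doc_keywords):
    """Average, over query keywords, of the best match among document keywords."""
    if not query_keywords or not doc_keywords:
        return 0.0
    best = [max(word_similarity(q, d) for d in doc_keywords) for q in query_keywords]
    return sum(best) / len(best)


def rank(query, documents):
    """Return document names ordered from most to least similar to the query."""
    query_keywords = extract_keywords(query)
    scored = [(score(query_keywords, extract_keywords(text)), name)
              for name, text in documents.items()]
    return [name for _, name in sorted(scored, key=lambda pair: (-pair[0], pair[1]))]


documents = {
    "d1": "The automobile is parked in the garage",
    "d2": "Trains run on the tracks",
    "d3": "Fresh fruit for sale",
}
print(rank("cars", documents))
```

With these placeholder scores, the query "cars" ranks `d1` first even though the word "car" never appears in it, which is the kind of conceptual (rather than purely lexical) match the abstract refers to.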