Journal: International Journal of Innovative Research in Computer and Communication Engineering
Print ISSN: 2320-9798
Electronic ISSN: 2320-9801
Publication year: 2016
Volume: 4
Issue: 2
Page: 2507
DOI: 10.15680/IJIRCCE.2016.0402223
Publisher: S&S Publications
Abstract: Crawling is one of the important techniques for building knowledge stockpiles. Focused crawling aims specifically at finding pages that are pertinent to a predefined set of subjects. The purpose of a semantic focused crawler is to automatically find, format, and order administration data using semantic web advances. Heterogeneity, universality, and equivocalness are the three major problems that administration clients face when searching for mining administration data on the Internet. In this paper, we present the structure of a new self-adaptive semantic focused crawler with a machine learning approach, whose aim is to find, arrange, and index mining administration data on the Internet precisely and efficiently, with a high performance rate, by taking these three issues into account. This structure combines the technologies of semantic focused crawling and machine learning in order to sustain the performance of the crawler regardless of the variety in the web environment. It also uses the concepts of WordNet and semantic similarity.
Keywords: Machine Learning; Semantic Similarity; Service Advertisements; Prediction; Semantic Focused Crawler