Journal: International Journal of Innovative Research in Computer and Communication Engineering
Print ISSN: 2320-9798
Electronic ISSN: 2320-9801
Publication year: 2015
Volume: 3
Issue: 5
DOI: 10.15680/ijircce.2015.0305134
Publisher: S&S Publications
Abstract: The web consists of the surface web and the hidden web. The surface web, also known as the publicly indexable web, can be accessed by search engines that follow the hyperlinks present on pages and apply simple keyword matching schemes. The hidden web refers to content that is hidden behind HTML forms. It contains a large collection of data that is unreachable by link-based search engines. A study conducted at the University of California, Berkeley estimated that the deep web consists of around 91,000 terabytes of data, whereas the surface web is only about 167 terabytes. Crawlers for both the hidden web and the surface web return huge result sets for a user query, but users commonly look only at the top ten or twenty results that can be seen without scrolling. Because users rarely look at results beyond the first response page, the results need to be ranked. Ranking web data remains a major challenge, and various scholars have proposed better and more efficient ranking techniques. This paper explores various ranking methods for the hidden web as well as the surface web.
Keywords: Surface Web; Hidden Web; Deep Web; Ranking Techniques