期刊名称:International Journal of Computer Science and Security (IJCSS)
电子版ISSN:1985-1553
出版年度:2008
卷号:2
期号:4
页码:18-27
出版社:Computer Science Journals
摘要:The World Wide Web is growing exponentially and the dynamic, unstructured nature of the web makes it difficult to locate useful resources. Web Search engines such as Google and Alta Vista provide huge amount of information many of which might not be relevant to the users query. In this paper, we build a vertical search engine which takes a seed URL and classifies the URLs crawled as Medical or Finance domains. The filter component of the vertical search engine classifies the web pages downloaded by the crawler into appropriate domains. The web pages crawled is checked for relevance based on the domain chosen and indexed. External users query the database with keywords to search; The Domain classifiers classify the URLs into relevant domain and are presented in descending order according to the rank number. This paper focuses on two issues ââ,¬âO page relevance to a particular domain and page contents for the search keywords to improve the quality of URLs to be listed thereby avoiding irrelevant or low-quality ones .