Journal: International Journal of Computer Science Issues
Print ISSN: 1694-0784
Electronic ISSN: 1694-0814
Publication year: 2012
Volume: 9
Issue: 4
Publisher: IJCSI Press
Abstract: The World Wide Web (WWW) is a repository of a large number of web pages that can be accessed over the Internet by multiple users at the same time, and it is therefore ubiquitous in nature. The search engine is the key application for finding web pages in this huge repository. It ranks pages by link analysis, without considering the facts those pages provide. A new algorithm called Probability of Correctness of Facts (PCF)-Engine is proposed to assess the accuracy of the facts provided by web pages. It uses a probability-based similarity function (SIM) that performs string matching between the true facts and the facts on web pages to find their probability of correctness. Existing semantic search engines may return results relevant to the user query, but those results may not be 100% accurate. Our algorithm computes the trustworthiness of websites and uses it to rank the web pages. Simulation results show that our approach is efficient when compared with the existing Voting and Truthfinder [1] algorithms with respect to the trustworthiness of the websites.
Keywords: Data Quality; Page Rank; Search Engine; Trustworthiness; Web Content Mining; Web Mining
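
Illustrative sketch (not from the paper): the abstract describes scoring each website by string matching its facts against known true facts and then ranking sites by the resulting trustworthiness. The record does not give the definition of SIM or the aggregation rule, so the Python below is only a rough illustration under stated assumptions. The function names `sim`, `trustworthiness`, and `rank_sites` are hypothetical, `difflib.SequenceMatcher` stands in for the paper's SIM function, and averaging the best match per fact is an assumed aggregation.

```python
from difflib import SequenceMatcher


def sim(fact: str, true_fact: str) -> float:
    """Stand-in similarity in [0, 1].

    Assumption: character-level matching via difflib; this is NOT the
    probability-based SIM function defined in the paper.
    """
    return SequenceMatcher(None, fact.lower(), true_fact.lower()).ratio()


def trustworthiness(site_facts: list[str], true_facts: list[str]) -> float:
    """Score one site: mean, over its facts, of the best match to any true fact.

    Assumption: the averaging rule is illustrative only.
    """
    if not site_facts or not true_facts:
        return 0.0
    best = [max(sim(f, t) for t in true_facts) for f in site_facts]
    return sum(best) / len(best)


def rank_sites(
    sites: dict[str, list[str]], true_facts: list[str]
) -> list[tuple[str, float]]:
    """Return (site, score) pairs sorted from most to least trustworthy."""
    scores = {name: trustworthiness(facts, true_facts) for name, facts in sites.items()}
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)


if __name__ == "__main__":
    # Hypothetical toy data, not taken from the paper's simulations.
    true_facts = ["Paris is the capital of France"]
    sites = {
        "site-a": ["Paris is the capital of France"],
        "site-b": ["Lyon is the capital of Spain"],
    }
    for name, score in rank_sites(sites, true_facts):
        print(name, round(score, 3))
```

A site whose facts match the true facts exactly scores 1.0 in this sketch, and any divergence lowers the score, which is the ordering behavior the abstract attributes to ranking by trustworthiness.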