Journal: International Journal of Computers and Communications
Print ISSN: 2074-1294
Year: 2012
Volume: 6
Issue: 1
Pages: 60-67
Publisher: University Press
Abstract: Predicting the next page a web user will visit with increasing accuracy has many important applications, such as caching and prefetching web pages to speed up navigation, or building recommendation systems that help users find what they are looking for on a site more quickly. We have created a Java program, using the NetBeans IDE, that calculates the probability of visiting each page using the PageRank algorithm and link counting. As examples we used the NASA log file available online at http://ita.ee.lbl.gov/html/contrib/NASA-HTTP.html and a log file from a commercial web site, http://www.nice-layouts.com. We first applied the program to the entire data set of sessions and obtained the probabilities of visiting the pages. We then applied the program only to the subset of sessions that contain the current page. For the data obtained from the NASA website log files, prediction improved, with the percentage increasing from 19.75% to 32.5%. For the data obtained from the commercial site log files, the improvement in prediction was smaller, from 74.66% to 77.77%. In the conclusions section we explain the difference between the improvements obtained in these two cases.
Keywords: Clickstream; Link counts; Page Rank; Prediction; Web logs.
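
The abstract compares visit probabilities computed over all sessions with probabilities computed only over the sessions that contain the current page. The following is a minimal sketch of that idea using link counting alone, not the authors' program: the class name LinkCountSketch, its methods, and the toy sessions are hypothetical, and the estimate used here (the share of link traversals that end on a page) is an assumption about what "counting links" means. The PageRank part is not reproduced.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class LinkCountSketch {

    // Assumed estimate: the share of all observed link traversals
    // (consecutive page pairs within a session) that end on each page.
    static Map<String, Double> visitProbabilities(List<List<String>> sessions) {
        Map<String, Integer> inCounts = new HashMap<>();
        int total = 0;
        for (List<String> session : sessions) {
            for (int i = 1; i < session.size(); i++) {
                inCounts.merge(session.get(i), 1, Integer::sum);
                total++;
            }
        }
        Map<String, Double> probs = new HashMap<>();
        for (Map.Entry<String, Integer> e : inCounts.entrySet()) {
            probs.put(e.getKey(), e.getValue() / (double) total);
        }
        return probs;
    }

    // Keep only the sessions in which the given page occurs.
    static List<List<String>> sessionsContaining(List<List<String>> sessions, String page) {
        List<List<String>> subset = new ArrayList<>();
        for (List<String> session : sessions) {
            if (session.contains(page)) {
                subset.add(session);
            }
        }
        return subset;
    }

    // Predict the most probable page other than the current one, using only
    // the sessions that contain the current page. Returns null if there is
    // no candidate. Ties are broken arbitrarily.
    static String predictNext(List<List<String>> sessions, String currentPage) {
        Map<String, Double> probs =
                visitProbabilities(sessionsContaining(sessions, currentPage));
        String best = null;
        double bestProb = -1.0;
        for (Map.Entry<String, Double> e : probs.entrySet()) {
            if (e.getKey().equals(currentPage)) {
                continue;
            }
            if (e.getValue() > bestProb) {
                best = e.getKey();
                bestProb = e.getValue();
            }
        }
        return best;
    }

    public static void main(String[] args) {
        // Toy sessions, invented for illustration only.
        List<List<String>> sessions = List.of(
                List.of("home", "news", "faq"),
                List.of("home", "news"),
                List.of("shop", "cart", "pay"),
                List.of("shop", "cart"));
        System.out.println(visitProbabilities(sessions));
        System.out.println(predictNext(sessions, "home"));
    }
}
```

In the toy data, "news" and "cart" are equally probable over the whole set, so the global probabilities alone cannot separate them for a user currently on "home". Restricting to sessions that contain "home" removes "cart" from the candidates, which illustrates in miniature why the subset might predict better.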