Journal: International Journal of Computer Science and Network Security
Print ISSN: 1738-7906
Year of publication: 2010
Volume: 10
Issue: 2
Pages: 102-104
Publisher: International Journal of Computer Science and Network Security
Abstract: The Internet has become one of the largest information providers, with web sites playing a major role, and web mining has gained importance as a result. Web mining can be classified into three main areas: Web Usage Mining, Web Content Mining, and Web Structure Mining. Web Usage Mining applies data mining techniques to discover valuable information from the navigation behavior of World Wide Web users. It generally comprises three tasks: preprocessing, pattern analysis, and knowledge discovery. Preprocessing cleans the server log file by removing entries such as errors or failures and repeated requests for the same URL from the same host. The main task of pattern analysis is to filter out uninteresting information and to visualize and interpret the interesting patterns for users. The information collected from the log file supports knowledge discovery. The resulting knowledge can be used to make decisions about factors such as Class 1 and Class 2 users and Eminent, Average, and Delicate web pages, based on the hit counts of the pages in the website. The topology of the website is then reconstructed from the hit counts so that web users get a quicker response. This paper addresses challenges in the three phases of Web Usage Mining, along with Web Structure Mining.
Keywords: Web mining; web site; hit count; log file; HTTP; URL; topology
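
The preprocessing and hit-count steps described in the abstract can be illustrated with a minimal Python sketch. It is not the paper's implementation. It assumes that the server log is in NCSA Common Log Format, that "error or failure" means an HTTP status of 400 or above, and that pages are ranked by two hit-count thresholds; the abstract specifies none of these. The names `preprocess`, `hit_counts`, `classify_page`, and `SAMPLE_LOG`, and the threshold values, are hypothetical.

```python
import re
from collections import Counter

# Assumption: NCSA Common Log Format; the abstract does not name a log format.
LOG_LINE = re.compile(
    r'^(?P<host>\S+) \S+ \S+ \[[^\]]*\] '
    r'"(?P<method>\S+) (?P<url>\S+)[^"]*" (?P<status>\d{3}) \S+'
)


def preprocess(lines):
    """Return (host, url) pairs after cleaning the raw log lines.

    Drops unparseable lines, entries with status >= 400 (assumed meaning of
    "error or failure"), and repeated requests for the same URL from the
    same host.
    """
    seen = set()
    cleaned = []
    for line in lines:
        match = LOG_LINE.match(line)
        if match is None:
            continue  # malformed entry
        if int(match.group("status")) >= 400:
            continue  # error or failure entry
        key = (match.group("host"), match.group("url"))
        if key in seen:
            continue  # repeated request for the same URL from the same host
        seen.add(key)
        cleaned.append(key)
    return cleaned


def hit_counts(cleaned):
    """Count how many distinct hosts requested each URL."""
    return Counter(url for _host, url in cleaned)


def classify_page(hits, high=3, low=2):
    """Label a page by hit count. The thresholds are hypothetical."""
    if hits >= high:
        return "Eminent"
    if hits >= low:
        return "Average"
    return "Delicate"


# Hypothetical sample log used only for illustration.
SAMPLE_LOG = [
    '10.0.0.1 - - [01/Feb/2010:10:00:00 +0000] "GET /index.html HTTP/1.0" 200 1043',
    '10.0.0.1 - - [01/Feb/2010:10:00:05 +0000] "GET /index.html HTTP/1.0" 200 1043',
    '10.0.0.2 - - [01/Feb/2010:10:01:00 +0000] "GET /index.html HTTP/1.0" 200 1043',
    '10.0.0.3 - - [01/Feb/2010:10:02:00 +0000] "GET /index.html HTTP/1.0" 200 1043',
    '10.0.0.1 - - [01/Feb/2010:10:03:00 +0000] "GET /about.html HTTP/1.0" 200 512',
    '10.0.0.2 - - [01/Feb/2010:10:04:00 +0000] "GET /about.html HTTP/1.0" 200 512',
    '10.0.0.3 - - [01/Feb/2010:10:05:00 +0000] "GET /missing.html HTTP/1.0" 404 0',
    '10.0.0.3 - - [01/Feb/2010:10:06:00 +0000] "GET /contact.html HTTP/1.0" 200 256',
    'not a log line',
]

if __name__ == "__main__":
    counts = hit_counts(preprocess(SAMPLE_LOG))
    for url, hits in counts.most_common():
        print(url, hits, classify_page(hits))
```

Sorting pages by hit count, as `most_common()` does here, is one plausible starting point for the topology reconstruction the abstract mentions, since frequently requested pages could be placed closer to the site's entry point. The abstract does not say how the reconstruction is actually performed.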