Journal: International Journal of Computer Science and Communication Networks
Electronic ISSN: 2249-5789
Publication year: 2012
Volume: 2
Issue: 4
Pages: 526-530
Publisher: Technopark Publications
Abstract: Millions of visitors interact with web sites around the world every day. The huge amount of data generated can be very valuable to a company seeking to understand customer behavior. This paper presents a complete preprocessing methodology comprising data cleaning and an enhanced preprocessing technique. User identification, the task of identifying individual web users, is a key issue in the preprocessing phase. Traditional user identification relies on the site structure and a set of heuristic rules. In most cases the relationships between pages are derived from the site topology, which reduces the efficiency of identification. To solve this problem, we propose DUI (Distinct User Identification), a technique based on IP address, user agent, and referred pages within a desired session time. It can be used in counter-terrorism, fraud detection, and the detection of unusual access to secure data; by detecting frequent access behavior, it can also improve the overall design and performance of future access. Experiments show that the advanced data preprocessing technique can enhance the quality of data preprocessing results.
Keywords: server log; Web usage mining; preprocessing; user identification
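
The abstract names the signals DUI relies on (IP address, user agent, referred pages, session time) but does not spell out the algorithm. The following is only a minimal sketch of the general idea, not the authors' method. It assumes that a distinct (IP, agent) pair denotes a distinct user and that a gap longer than a timeout starts a new session. The `LogEntry` record, the `identify_users` function, and the 30-minute default are hypothetical names and values chosen for illustration. The referrer field is carried in the record but not used in the logic, since the abstract does not say how DUI applies it.

```python
from collections import namedtuple
from datetime import datetime, timedelta

# Hypothetical record for one cleaned server-log line (not from the paper).
LogEntry = namedtuple("LogEntry", "ip agent referrer url time")


def identify_users(entries, session_timeout=timedelta(minutes=30)):
    """Label each request with a (user_id, session_id, url) triple.

    Assumption: a distinct (ip, agent) pair is treated as a distinct user,
    and a gap longer than session_timeout opens a new session for that user.
    """
    user_ids = {}    # (ip, agent) -> user id
    last_seen = {}   # user id -> (time of last request, session number)
    labelled = []
    for e in sorted(entries, key=lambda entry: entry.time):
        uid = user_ids.setdefault((e.ip, e.agent), len(user_ids) + 1)
        prev = last_seen.get(uid)
        if prev is None:
            session = 1                      # first request from this user
        elif e.time - prev[0] > session_timeout:
            session = prev[1] + 1            # idle too long: new session
        else:
            session = prev[1]                # same session continues
        last_seen[uid] = (e.time, session)
        labelled.append((uid, session, e.url))
    return labelled


if __name__ == "__main__":
    t0 = datetime(2012, 1, 1, 10, 0)
    sample = [
        LogEntry("10.0.0.1", "Firefox", "-", "/a", t0),
        LogEntry("10.0.0.1", "Chrome", "-", "/a", t0 + timedelta(minutes=5)),
        LogEntry("10.0.0.1", "Firefox", "/a", "/b", t0 + timedelta(minutes=10)),
    ]
    for row in identify_users(sample):
        print(row)
```

In this sketch two requests from the same IP address with different agents are counted as two users, which is the usual motivation for adding the agent field when several people share one address.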