Journal: International Journal of Innovative Research in Computer and Communication Engineering
Print ISSN: 2320-9798
Online ISSN: 2320-9801
Publication year: 2017
Volume: 5
Issue: 4
Pages: 6805
DOI: 10.15680/IJIRCCE.2017.0504032
Publisher: S&S Publications
Abstract: Web usage mining is the application of data mining techniques to discover usage patterns from Web data, in order to understand and better serve the needs of web-based applications. Web usage mining consists of three phases: preprocessing, pattern discovery, and pattern analysis. Data preprocessing converts the raw data into the data abstraction needed to apply data mining algorithms. The preprocessed web log file is then suitable for the discovery and analysis of useful information, referred to as web mining. To fulfill this requirement, the navigations are recorded in the web log file, along with the IP address of the website, the session of usage, and the visited web links. This paper presents several data preparation techniques that can be used to improve the performance of data preprocessing in order to identify unique users and user sessions. These techniques and algorithms have been proved valid and efficient by experiments.
Keywords: Web Usage Mining; Data Preprocessing; Longest Common Subsequence; LCS Problem with Fixed Gap
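
The abstract names user identification and session identification as the goals of preprocessing but does not describe the procedure. The sketch below illustrates one common heuristic, not the paper's algorithm: a user is approximated by the (IP address, user agent) pair, and a new session starts after 30 minutes of inactivity. The record layout, the `identify_sessions` and `sample_log` names, and the timeout value are all assumptions made for illustration.

```python
from collections import defaultdict
from datetime import datetime, timedelta

# Assumption: a 30-minute inactivity timeout, a common heuristic
# that is not taken from the paper.
SESSION_TIMEOUT = timedelta(minutes=30)


def identify_sessions(entries, timeout=SESSION_TIMEOUT):
    """Group (ip, user_agent, timestamp, url) records into per-user sessions.

    A "user" is approximated by the (ip, user_agent) pair. A new session
    starts whenever the gap between two consecutive requests of the same
    user exceeds `timeout`.
    """
    # Step 1: user identification, bucket the hits by (ip, user_agent).
    hits_by_user = defaultdict(list)
    for ip, agent, timestamp, url in entries:
        hits_by_user[(ip, agent)].append((timestamp, url))

    # Step 2: session identification, split each user's hits on long gaps.
    sessions = {}
    for user, hits in hits_by_user.items():
        hits.sort(key=lambda hit: hit[0])
        user_sessions = [[hits[0][1]]]
        for (prev_ts, _), (ts, url) in zip(hits, hits[1:]):
            if ts - prev_ts > timeout:
                user_sessions.append([])
            user_sessions[-1].append(url)
        sessions[user] = user_sessions
    return sessions


def sample_log():
    """Return a tiny, made-up log used only to exercise the sketch."""
    start = datetime(2017, 4, 1, 10, 0)
    return [
        ("10.0.0.1", "Firefox", start, "/index.html"),
        ("10.0.0.1", "Firefox", start + timedelta(minutes=5), "/products.html"),
        ("10.0.0.1", "Firefox", start + timedelta(minutes=50), "/index.html"),
        ("10.0.0.1", "Chrome", start + timedelta(minutes=6), "/about.html"),
    ]
```

Calling `identify_sessions(sample_log())` treats the Firefox and Chrome clients as two users even though they share an IP address, and splits the Firefox traffic into two sessions because of the 45-minute gap before the last request. Real logs would first need cleaning (for example, dropping image and robot requests), which this sketch leaves out.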