Journal: International Journal of Advanced Research In Computer Science and Software Engineering
Print ISSN: 2277-6451
Online ISSN: 2277-128X
Publication year: 2012
Volume: 2
Issue: 3
Publisher: S.S. Mishra
Abstract: The information on the Web is growing dramatically, and it is well known that over 80% of the time required to carry out any real-world data mining project is usually spent on data pre-processing. Data pre-processing lays the groundwork for data mining: before useful information or knowledge can be discovered, the target data set must be properly prepared. Unfortunately, most data mining researchers neglect it because of its perceived difficulty. This paper describes an efficient approach to data pre-processing for mining Web-based user data in order to speed up the data preparation process. The approach not only provides flexibility in data pre-processing but also reduces the complexity and difficulty of preparing user data for mining. In most cases, however, Web log data cannot be mined directly because of messy and redundant content, among other reasons. This paper therefore analyzes data pre-processing of Web logs in order to meet the needs of data mining.