Journal title: International Journal of Advanced Research in Computer Science and Software Engineering
Print ISSN: 2277-6451
Electronic ISSN: 2277-128X
Publication year: 2013
Volume: 3
Issue: 4
Publisher: S.S. Mishra
Abstract: With the rapid growth of the Internet, e-commerce web sites now routinely work with log datasets of up to a few terabytes in size. Removing noisy data promptly and at low cost, and extracting useful information from what remains, is a problem that has to be faced. The mining process involves several steps, from pre-processing the raw data to establishing the final models. To address the problem of extracting and maintaining a very large number of user profiles from large-scale data, we first describe the different scalable implementations of the proposed framework. We then examine the challenges these implementations faced. Finally, we show how Hadoop can be used as an efficient solution to the problem.
Keywords: User Profile; Web Logs; Web Data Mining; Hadoop Framework; MapReduce