Journal: International Journal of Computer Trends and Technology
Electronic ISSN: 2231-2803
Year of publication: 2015
Volume: 28
Issue: 4
Pages: 200-202
DOI: 10.14445/22312803/IJCTT-V28P139
Publisher: Seventh Sense Research Group
Abstract: As businesses grow, they need knowledge of customer behavior and trends to make crucial policy decisions that depend on varied and complex parameters, for the benefit of both the business and its end users. User metadata analysis plays a vital role in meeting this need. User sessions are derived from the logs maintained by application or web servers. Access logs hold records containing the access time, IP address, URL, response, and similar fields, from which useful results can be derived. Session identification is a common strategy for developing metrics for web analytics and behavioral analyses of user-facing systems, and it is further used for pattern identification and analysis. A powerful way to handle large volumes of data is HDFS, the Hadoop Distributed File System, which distributes data across several networked machines that together form a cluster. MapReduce lets such queries run on all nodes through mappers and collects the individual results into a whole in the reducer. This research suggests implementing each sessionization [1] process with Hadoop MapReduce to improve processing performance. The user session identification process can be improved by combining the right available techniques to obtain more effective and accurate results, and using a distributed file processing system such as Hadoop can speed up overall processing to a great extent.
Keywords: Web Mining; Data Preprocessing; Pattern Analysis; Hadoop; Distributed File System.
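
Illustrative sketch (not from the paper): the abstract describes keying access-log records in a mapper and assembling sessions in a reducer. The plain-Python example below imitates that flow locally without the Hadoop API. The input format (`ip epoch_seconds url` per line), the 30-minute inactivity timeout, and the function names `map_record`, `reduce_sessions`, and `sessionize` are all assumptions made for illustration; the paper's actual techniques may differ.

```python
from collections import defaultdict

SESSION_TIMEOUT = 30 * 60  # seconds; assumed threshold, not taken from the paper


def map_record(line):
    """Mapper: parse one 'ip epoch_seconds url' line (assumed format) and key it by IP."""
    ip, ts, url = line.split()
    return ip, (int(ts), url)


def reduce_sessions(ip, hits, timeout=SESSION_TIMEOUT):
    """Reducer: sort one IP's hits by time and start a new session after each long gap."""
    sessions = []
    last_ts = None
    for ts, url in sorted(hits):
        if last_ts is None or ts - last_ts > timeout:
            sessions.append([])  # gap exceeded (or first hit): open a new session
        sessions[-1].append(url)
        last_ts = ts
    return ip, sessions


def sessionize(lines):
    """Local stand-in for the shuffle phase: group mapper output by key, then reduce."""
    grouped = defaultdict(list)
    for line in lines:
        ip, hit = map_record(line)
        grouped[ip].append(hit)
    return dict(reduce_sessions(ip, hits) for ip, hits in grouped.items())
```

In a real Hadoop job the grouping done inside `sessionize` would be performed by the framework's shuffle step, and each reducer call would receive all hits for one key.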