
Article Information

  • Title: Hybrid Data Preprocessing: User Sessions Identification through Hadoop
  • Authors: Vikram Singh Chauhan; B.L Pal
  • Journal: International Journal of Computer Trends and Technology
  • Electronic ISSN: 2231-2803
  • Year of publication: 2015
  • Volume: 28
  • Issue: 4
  • Pages: 200-202
  • DOI: 10.14445/22312803/IJCTT-V28P139
  • Publisher: Seventh Sense Research Group
  • Abstract: Businesses increasingly need knowledge of customer behavior and trends to make crucial policy decisions that depend on varied and complex parameters, for the benefit and growth of both the business and its end users. User metadata analysis plays a vital role in this. User sessions are obtained from the logs maintained by application or web servers. Access logs hold records containing the access time, IP address, URL, response, and so on, from which useful results can be derived. Session identification is a common strategy for developing metrics for web analytics and behavioral analyses of user-facing systems, and it is further used for pattern identification and analysis. A powerful way to handle huge amounts of data is HDFS, the Hadoop Distributed File System, which distributes data among several machines connected in a network called a cluster. MapReduce allows queries to be written that run on all nodes through mappers, with the individual results collected and combined in a reducer. This research suggests implementing each sessionization [1] process using Hadoop MapReduce to improve processing performance. User session identification can be improved by combining the right available techniques to get more effective and accurate results, and by using a distributed file processing system such as Hadoop, overall processing can be sped up to a great extent.
  • Keywords: Web Mining; Data Preprocessing; Pattern Analysis; Hadoop; Distributed File System.
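
The map-then-reduce approach to session identification described in the abstract can be illustrated with a minimal local sketch. This is not the authors' implementation: the log line format (`<ip> <ISO-8601 time> <url> <status>`), the 30-minute inactivity timeout, and the function names `map_record`, `reduce_sessions`, and `sessionize` are all assumptions made for illustration, and the grouping step merely stands in for the shuffle that Hadoop would perform between mappers and reducers.

```python
from collections import defaultdict
from datetime import datetime, timedelta

# Assumed inactivity threshold; the abstract does not state one.
SESSION_TIMEOUT = timedelta(minutes=30)


def map_record(line):
    """Map step: turn one log line into a (key, value) pair keyed by IP.

    Assumes a simplified, hypothetical format:
    '<ip> <ISO-8601 time> <url> <status>'.
    """
    ip, ts, url, _status = line.split()
    return ip, (datetime.fromisoformat(ts), url)


def reduce_sessions(hits, timeout=SESSION_TIMEOUT):
    """Reduce step: split one IP's hits into sessions by inactivity gap."""
    sessions = []
    for ts, url in sorted(hits):
        # Continue the current session if the gap since its last hit
        # is within the timeout; otherwise start a new session.
        if sessions and ts - sessions[-1][-1][0] <= timeout:
            sessions[-1].append((ts, url))
        else:
            sessions.append([(ts, url)])
    return sessions


def sessionize(lines):
    """Local stand-in for the shuffle: group mapper output by key, then reduce."""
    grouped = defaultdict(list)
    for line in lines:
        ip, hit = map_record(line)
        grouped[ip].append(hit)
    return {ip: reduce_sessions(hits) for ip, hits in grouped.items()}
```

Keying the mapper output by IP address is itself a simplification: it treats each address as one user, which real preprocessing pipelines often refine with additional fields such as the user agent.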