Title: Frequent Itemset Generation using Enhanced Apriori Algorithm and Multiple Projection Rule Pruning Algorithm for Sessions Identified by Incremental Learning Approach for Dynamic Database Trace Logs
Journal: International Journal of Computer Science & Technology
Print ISSN: 2229-4333
Online ISSN: 0976-8491
Year of publication: 2015
Volume: 6
Issue: 4
Pages: 238-244
Language: English
Publisher: Ayushmaan Technologies
Abstract: A web session is a sequence of requests sent to the database system by a user or an application to complete a certain task. Session identification is a main task in preprocessing the database trace logs that are used for discovering useful patterns. Discovered patterns improve the performance of database systems by predicting queries, rewriting the current query, or performing effective cache replacement. To identify sessions, a novel method uses an online incremental learning approach, which provides stronger outcomes for identifying session boundaries than ordinary timeout methods. The method is named DS-OILSD because it selects the threshold automatically based on the standard deviation and then applies it to Dynamic web log Session identification based on the Online Incremental Learning (DS-OIL) approach. However, frequent patterns are not mined efficiently from the session records in the database trace log files under online incremental learning. An Enhanced Apriori Algorithm (EAA) is therefore proposed to perform frequent itemset generation. The goal of EAA is to optimize the initial iterations of Apriori, which are the most time-consuming ones when datasets are characterized by short or medium-length frequent patterns. Its two main improvements are an innovative method for storing candidate itemsets and counting their support, and pruning techniques that significantly reduce the size of the dataset as execution progresses. The Multiple Projection Rule Pruning Algorithm (MPRPA) is used to create rules from the mined frequent itemsets. This algorithm arranges the rules in ascending order of support and confidence, with thresholds obtained from an automatic support method, on the assumption that rules with high support and confidence values carry the most useful information.
The experimental results show that the EAA method for frequent itemset generation, applied to sessions identified from an OLTP database application for characteristic analysis of web logs, is effective and provides better outcomes than the Apriori and modified Apriori algorithms.
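
The abstract does not spell out how EAA stores candidates or prunes the dataset, so the following is only a minimal sketch of textbook level-wise Apriori with one simple dataset-reduction step, not the authors' algorithm. The function name `apriori_with_pruning`, the example session contents (`q1`, `q2`, `q3`), and the use of an absolute `min_support` count are assumptions made for illustration.

```python
from itertools import combinations


def apriori_with_pruning(transactions, min_support):
    """Return {frozenset(itemset): support_count} for all frequent itemsets.

    Textbook Apriori plus a simple dataset-reduction step (an illustrative
    assumption, not the paper's EAA): after each pass, items that belong to
    no frequent k-itemset are removed from every transaction, and
    transactions too short to contain a (k+1)-itemset are dropped.
    """
    data = [frozenset(t) for t in transactions]
    frequent = {}

    # Pass 1: count the support of every single item.
    counts = {}
    for t in data:
        for item in t:
            key = frozenset([item])
            counts[key] = counts.get(key, 0) + 1
    level = {s: c for s, c in counts.items() if c >= min_support}

    k = 1
    while level:
        frequent.update(level)

        # Dataset reduction: keep only items that still occur in some
        # frequent k-itemset, then drop transactions with at most k items.
        alive = frozenset().union(*level)
        data = [t & alive for t in data]
        data = [t for t in data if len(t) > k]

        # Candidate generation: join frequent k-itemsets and keep a
        # (k+1)-candidate only if all of its k-subsets are frequent.
        candidates = set()
        for a, b in combinations(list(level), 2):
            union = a | b
            if len(union) == k + 1 and all(
                frozenset(sub) in level for sub in combinations(union, k)
            ):
                candidates.add(union)

        # Support counting on the reduced dataset.
        counts = {c: 0 for c in candidates}
        for t in data:
            for c in candidates:
                if c <= t:
                    counts[c] += 1
        level = {s: c for s, c in counts.items() if c >= min_support}
        k += 1

    return frequent


# Hypothetical sessions, each a set of query identifiers.
sessions = [
    {"q1", "q2", "q3"},
    {"q1", "q2"},
    {"q1", "q3"},
    {"q2", "q3"},
    {"q1", "q2", "q3"},
]
result = apriori_with_pruning(sessions, min_support=3)
```

With these sessions and `min_support=3`, every single query and every pair is frequent, while the triple `{q1, q2, q3}` appears in only two sessions and is rejected. The reduction step is safe because any item of a frequent (k+1)-itemset must also appear in a frequent k-itemset, so removing the other items never changes a candidate's support count.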