Journal: Journal of Theoretical and Applied Information Technology
Print ISSN: 1992-8645
Electronic ISSN: 1817-3195
Publication year: 2017
Volume: 95
Issue: 21
Page: 5665
Publisher: Journal of Theoretical and Applied
Abstract: Web sequential pattern mining identifies frequent subsequences as patterns in large databases. This paper presents a novel framework, Hectic Composition Mining based Approximate Pattern Tree (HCM-APT), which handles different access patterns through a linking operation. The framework extends the tree-based indexing model by dynamically adjusting links during the mining process using Composite Pattern Mining. A distinct feature of HCM-APT is that it clusters within a very limited and precisely predictable space, so it runs fast in a memory-based setting. As a result, HCM-APT scales to very large databases through database segregation, which substantially reduces memory consumption. For dense databases, competent Approximate Pattern Trees are constructed dynamically, significantly reducing the execution time for obtaining rich properties. Finally, the framework applies a scalable mining model to the approximate patterns generated by the tree, using a Variable Regression function to improve scalability when mining large databases. Experimental results on the Amazon Commerce reviews dataset show that HCM-APT outperforms other well-established methods in identifying hectic composition patterns. The experiments evaluate execution time for obtaining rich properties, memory consumption, and scalability in mining sequential patterns.
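To make the opening definition concrete, the sketch below counts frequent subsequences in a handful of web sessions. It is a naive brute-force baseline and not the HCM-APT framework: the function name `frequent_subsequences`, the `min_support` and `max_len` parameters, and the sample sessions are all assumptions introduced for illustration.

```python
from collections import Counter
from itertools import combinations


def frequent_subsequences(sessions, min_support, max_len=3):
    """Return {subsequence: support} for every ordered (not necessarily
    contiguous) subsequence of up to max_len items that occurs in at
    least min_support sessions.

    Illustrative brute-force baseline only; it does not build a pattern tree.
    """
    support = Counter()
    for session in sessions:
        seen = set()  # count each subsequence at most once per session
        for length in range(1, max_len + 1):
            # combinations() yields index tuples in increasing order,
            # so the original page order is preserved.
            for idx in combinations(range(len(session)), length):
                seen.add(tuple(session[i] for i in idx))
        support.update(seen)
    return {p: c for p, c in support.items() if c >= min_support}


# Hypothetical click-stream sessions (assumed data, not from the paper).
sessions = [
    ["home", "search", "product", "cart"],
    ["home", "product", "cart"],
    ["search", "product", "review"],
]
patterns = frequent_subsequences(sessions, min_support=2)
```

Enumerating every subsequence grows exponentially with session length, which is the kind of cost that tree-based indexing approaches such as the one described in the abstract aim to avoid.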