Journal: International Journal of Electronics Communication and Computer Engineering
Print ISSN: 2249-071X
Electronic ISSN: 2278-4209
Publication year: 2013
Volume: 4
Issue: 4
Pages: 1300-1305
Publisher: IJECCE
Abstract: Mining frequent patterns from large databases plays an essential role in many data mining tasks and has broad applications. Most previously proposed methods adopt Apriori-like candidate-generation-and-test approaches. However, those methods may encounter serious challenges when mining datasets with prolific patterns and/or long patterns. In this work, we develop a class of novel and efficient pattern-growth methods for mining various frequent patterns from large databases. Pattern-growth methods adopt a divide-and-conquer approach to decompose both the mining tasks and the databases. They then use a pattern-fragment growth method to avoid the costly candidate-generation-and-test processing completely. Moreover, effective data structures are proposed to compress crucial information about frequent patterns and to avoid expensive, repeated database scans. A comprehensive performance study shows that the pattern-growth methods FP-growth and H-mine are efficient and scalable. They are faster than some recently reported frequent pattern mining methods. Interestingly, pattern-growth methods are not only efficient but also effective: with them, many interesting patterns can also be mined efficiently, such as patterns with some tough non-anti-monotonic constraints and sequential patterns. These techniques have strong implications for many other data mining tasks.
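
The divide-and-conquer idea in the abstract can be illustrated with a minimal Python sketch. It is not the paper's FP-growth or H-mine: it omits the compressed FP-tree and H-struct data structures and works on plain projected lists instead. The function name `pattern_growth`, its parameters, and the assumption that items are mutually comparable (for example, strings) are all hypothetical choices made for illustration.

```python
from collections import Counter


def pattern_growth(transactions, min_support, prefix=()):
    """Yield (pattern, support) pairs by growing `prefix` one item at a time.

    Simplified projected-database recursion (an illustrative assumption,
    not the FP-tree or H-struct used by FP-growth and H-mine).
    """
    # Count in how many transactions of the current database each item occurs.
    counts = Counter(item for t in transactions for item in set(t))
    # Only locally frequent items can extend the current prefix.
    frequent = sorted(i for i, c in counts.items() if c >= min_support)
    for item in frequent:
        pattern = prefix + (item,)
        yield pattern, counts[item]
        # Divide: project onto transactions containing `item`, keeping only
        # frequent items that sort after it, so no pattern is produced twice.
        projected = [
            [i for i in t if i > item and counts[i] >= min_support]
            for t in transactions
            if item in t
        ]
        projected = [t for t in projected if t]
        # Conquer: mine the smaller projected database recursively.
        yield from pattern_growth(projected, min_support, pattern)
```

Calling `dict(pattern_growth(db, 3))` on a list of transactions `db` collects every itemset that appears in at least three transactions. No candidate itemsets are generated and tested against the full database; each recursive call scans only the transactions that already contain the current prefix.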