Journal: International Journal of Computer Science & Technology
Print ISSN: 2229-4333
Online ISSN: 0976-8491
Year: 2016
Volume: 7
Issue: 4
Pages: 211-216
Language: English
Publisher: Ayushmaan Technologies
Abstract: Sequential pattern mining is an important model in data mining. Its mining algorithms discover all item sets in the data that satisfy the user-specified minimum support (minsup) and minimum confidence (mincon) constraints. Minsup controls the minimum number of data cases that a rule must cover. Mincon controls the analytical strength of the rule. Because a single minsup is used for the whole database, the model implicitly assumes that all items in the data are of the same nature and have similar frequencies. In many applications, however, some items appear frequently in the data while others appear rarely. If minsup is set too high, rules that involve rare items will not be found. To find rules that involve both frequent and rare items, minsup has to be set very low. This may cause a combinatorial explosion, because the frequent items will be associated with one another in all possible ways. This dilemma is called the rare item problem. This paper proposes a technique to solve this problem. The technique allows the user to specify multiple minimum supports (MMS) to reflect the nature of the items and their varied frequencies in the database. In data mining, different rules may need to satisfy different minimum supports depending on which items are in the database. Experimental results show that the technique is very effective.
Keywords: Mining Sequential Patterns; Multiple Minimum Support (MMS); Large Sequence Databases
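
The rare item problem described in the abstract can be illustrated with a small sketch. It is not the paper's algorithm: it uses plain item sets rather than sequences, a brute-force search, and a toy dataset invented for illustration. The rule that a pattern's threshold is the lowest per-item minimum support among its items is an assumption borrowed from a common formulation of multiple minimum supports; the abstract does not state which rule the paper uses. The names `support`, `frequent_itemsets`, and `per_item_minsup` are hypothetical.

```python
from itertools import combinations

def support(itemset, transactions):
    """Fraction of transactions that contain every item in itemset."""
    hits = sum(1 for t in transactions if itemset <= t)
    return hits / len(transactions)

def frequent_itemsets(transactions, threshold_for, max_size=2):
    """Brute-force search: keep each candidate whose support meets its threshold.

    threshold_for maps a candidate itemset to the minimum support it must reach.
    """
    items = sorted(set().union(*transactions))
    found = {}
    for size in range(1, max_size + 1):
        for combo in combinations(items, size):
            candidate = frozenset(combo)
            s = support(candidate, transactions)
            if s >= threshold_for(candidate):
                found[candidate] = s
    return found

# Toy data (invented): bread, milk, and eggs are frequent;
# food_processor and cooking_pan are rare but always bought together.
transactions = [
    {"bread", "milk"}, {"bread", "milk"}, {"bread", "milk"}, {"bread", "milk"},
    {"bread", "eggs"},
    {"eggs"}, {"eggs"}, {"eggs"}, {"eggs"},
    {"food_processor", "cooking_pan"},
]

rare_pair = frozenset({"food_processor", "cooking_pan"})
weak_pair = frozenset({"bread", "eggs"})  # two frequent items that rarely co-occur

# Single high minsup: the rare pair is missed.
high = frequent_itemsets(transactions, lambda c: 0.4)

# Single low minsup: the rare pair is found, but so is the weak frequent pair.
low = frequent_itemsets(transactions, lambda c: 0.1)

# Multiple minimum supports (assumed rule: the lowest per-item value applies).
per_item_minsup = {
    "bread": 0.4, "milk": 0.4, "eggs": 0.4,
    "food_processor": 0.1, "cooking_pan": 0.1,
}
mms = frequent_itemsets(
    transactions, lambda c: min(per_item_minsup[i] for i in c)
)

print(rare_pair in high, weak_pair in high)  # False False
print(rare_pair in low, weak_pair in low)    # True True
print(rare_pair in mms, weak_pair in mms)    # True False
```

With per-item thresholds, the rare pair is kept because both of its items carry a low minimum support, while the weak pairing of two frequent items is still held to the higher threshold.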