Basic Article Information

Title: Efficient Method for Mining Patterns from Highly Similar and Dense Database based on Prefix-Frequent-Items
Authors: Han, Meng; Wang, Zhihai; Yuan, Jidong et al.
Journal: Journal of Software
Print ISSN: 1796-217X
Publication year: 2014
Volume: 9
Issue: 8
Pages: 2080-2086
DOI: 10.4304/jsw.9.8.2080-2086
Language: English
Publisher: Academy Publisher
Abstract: In recent years, a great deal of effort has gone into sequential pattern mining, but some challenges remain unresolved, such as large search spaces and ineffectiveness in handling highly similar, dense and long sequences. This paper focuses on designing effective search-space pruning methods to accelerate the mining process. We present a novel structure, the Prefix-Frequent-Items Graph (PFI-Graph), which represents the prefix frequent items of other items in sequential patterns. We then propose an efficient algorithm based on the PFI-Graph, PFI-PrefixSpan (Prefix-Frequent-Items PrefixSpan). It avoids redundant data scanning and can thus effectively speed up the discovery of new patterns. Extensive experimental results on synthetic and real sequence datasets show that the proposed structure is substantially more efficient than PrefixSpan with physical projection and pseudo-projection, especially for dense and highly similar sequence databases.
Keywords: sequential pattern mining; dense database; highly similar sequence; long sequence; prefix frequent items