期刊名称:International Journal of Soft Computing & Engineering
电子版ISSN:2231-2307
出版年度:2012
卷号:2
期号:1
页码:185-193
出版社:International Journal of Soft Computing & Engineering
摘要:The concept of sequence Data Mining was first introduced by Rakesh Agrawal and Ramakrishnan Srikant in the year 1995. The problem was first introduced in the context of market analysis. It aimed to retrieve frequent patterns in the sequences of products purchased by customers through time ordered transactions. Later on its application was extended to complex applications like telecommunication, network detection, DNA research, etc. Several algorithms were proposed. The very first was Apriori algorithm, which was put forward by the founders themselves. Later more scalable algorithms for complex applications were developed. E.g. GSP, Spade, PrefixSpan etc. The area underwent considerable advancements since its introduction in a short span. In this paper, a systematic survey of the sequential pattern mining algorithms is performed. This paper investigates these algorithms by classifying study of sequential pattern-mining algorithms into two broad categories. First, on the basis of algorithms which are designed to increase efficiency of mining and second, on the basis of various extensions of sequential pattern mining designed for certain application. At the end, comparative analysis is done on the basis of important key features supported by various algorithms and current research challenges are discussed in this field of data mining.