Abstract: Uncertainty is inherent in data streams and presents new challenges for data stream mining. Because data streams arrive continuously and in large volumes, modeling sequences of uncertain time series data streams requires significantly more space. It is therefore important to construct a compressed representation for storing uncertain time series data. Based on granules, sequential sketches are created to store hash-compressed granules, and, based on sliding windows, a sketch update strategy is given that retains the most recent granules. Because the sequential sketches may become saturated as the data stream grows, this paper presents an optimization strategy that deletes the absolute sparse patterns. Based on the sequential sketches, a sequential pattern mining algorithm is proposed for mining uncertain data streams. The experimental results illustrate the effectiveness of the pattern mining algorithm.
Keywords: Sketch; Sequential Pattern; Granulation; Uncertain Time Series; Data Stream