Publisher: Academy & Industry Research Collaboration Center (AIRCC)
Abstract: Classification algorithms for mining data streams have been extensively studied in recent years. However, many of these algorithms are designed for supervised learning, which requires labeled instances. Labeling data is costly and time-consuming. For this reason, alternative learning paradigms have been proposed to reduce the cost of the labeling process without significant loss of model performance. Active learning is one of these paradigms; its main objective is to build classification models that request the lowest possible number of labeled examples while achieving adequate levels of accuracy. This work presents the FASE-AL algorithm, which induces classification models from unlabeled instances using active learning. FASE-AL is based on the Fast Adaptive Stacking of Ensembles (FASE) algorithm, an ensemble algorithm that detects concept drift in the input data stream and adapts the model accordingly. FASE-AL was compared with four active learning strategies from the literature. Real and synthetic datasets were used in the experiments. The algorithm achieves promising results in terms of the percentage of correctly classified instances.
Keywords: ensemble; active learning; data stream; concept drift.