Article Information

  • Title: A New Approach for Detecting Concept Drift and Measuring its Intensity in Large Datasets
  • Authors: Hisham Ogbah; Abdallah Alashqur
  • Journal: International Journal of Computer Science and Network Security
  • Print ISSN: 1738-7906
  • Year of publication: 2016
  • Volume: 16
  • Issue: 12
  • Pages: 109-116
  • Publisher: International Journal of Computer Science and Network Security
  • Abstract: The importance of data mining in general, and classification in particular, has increased in recent years due to the overwhelming amount of digital data produced worldwide every day. In classification, data tuples are mapped to a limited number of classes. The classifier learns (or derives) a classification model from a pre-classified dataset. The learned model can be represented in different forms, such as a decision tree, a set of rules, or a support vector machine, to name a few. After the classifier completes the learning phase, it can predict the class of newly added data based on the learned model. Quite often, concept drift occurs due to changes in the environment, style, or trend, or for many other reasons. Data that mapped to, say, class_a before the drift now maps to class_b. But based on the knowledge embodied in the model, the system will still wrongly predict class_a for the same data. This difference between what the model predicts and the actual classification is a sign that concept drift has occurred and the classification model has become obsolete. In this case, a new model needs to be generated. In this paper we introduce a new, efficient algorithm for detecting the occurrence of concept drift, together with a way of measuring the intensity of the drift. Measuring the intensity of the drift is important because it affects how we may choose to deal with the drift going forward.
  • Keywords: Classification; Concept Drift; Drift detection; Big Data
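
The abstract's core observation, that a gap between what a stale model predicts and the actual class signals drift, can be illustrated with a minimal sketch. This is not the algorithm proposed in the paper, which the abstract does not specify. It is a generic error-rate comparison, and the names `disagreement_rate`, `detect_drift`, `baseline_error`, and `threshold` are all hypothetical, as is the choice to express intensity as the rise in error above a baseline.

```python
from typing import Callable, Hashable, Sequence, Tuple


def disagreement_rate(
    predict: Callable[[object], Hashable],
    tuples: Sequence[object],
    actual_classes: Sequence[Hashable],
) -> float:
    """Fraction of tuples whose predicted class differs from the actual class."""
    if len(tuples) != len(actual_classes):
        raise ValueError("tuples and actual_classes must have the same length")
    if len(tuples) == 0:
        raise ValueError("at least one labelled tuple is required")
    wrong = sum(1 for t, c in zip(tuples, actual_classes) if predict(t) != c)
    return wrong / len(tuples)


def detect_drift(
    predict: Callable[[object], Hashable],
    tuples: Sequence[object],
    actual_classes: Sequence[Hashable],
    baseline_error: float,
    threshold: float = 0.1,
) -> Tuple[bool, float]:
    """Return (drift_detected, intensity).

    Assumption for illustration only: intensity is the increase of the
    disagreement rate over the error rate seen at training time, and drift
    is flagged when that increase exceeds a fixed threshold.
    """
    rate = disagreement_rate(predict, tuples, actual_classes)
    intensity = max(0.0, rate - baseline_error)
    return intensity > threshold, intensity


# A model learned before the drift: non-negative values map to class_a.
def old_model(x: float) -> str:
    return "class_a" if x >= 0 else "class_b"


# Newly labelled data: two tuples that used to be class_a are now class_b.
new_tuples = [1.0, 2.0, 3.0, -1.0]
new_classes = ["class_b", "class_b", "class_a", "class_b"]

drifted, intensity = detect_drift(old_model, new_tuples, new_classes, baseline_error=0.0)
```

In this sketch a larger intensity would suggest rebuilding the model from scratch, while a small one might only call for an incremental update. That policy is likewise an assumption, not something stated in the record.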