
Article information

  • Title: Genetic-based Summarization for Local Outlier Detection in Data Stream
  • Authors: Mohamed Sakr; Walid Atwa; Arabi Keshk
  • Journal: International Journal of Intelligent Systems and Applications
  • Print ISSN: 2074-904X
  • Online ISSN: 2074-9058
  • Publication year: 2021
  • Volume: 13
  • Issue: 1
  • Pages: 60-70
  • DOI: 10.5815/ijisa.2021.01.05
  • Publisher: MECS Publisher
  • Abstract: Outlier detection is an important task in data mining. Detecting outliers over streaming data has become important in many applications, such as network analysis, fraud detection, and environmental monitoring. One well-known outlier detection algorithm is the Local Outlier Factor (LOF). However, the original LOF has drawbacks that make it unsuitable for data streams: (1) it needs a lot of processing power (CPU) and a large amount of memory to detect outliers; (2) it deals with static data, which means that after any change in the data, LOF recalculates the outliers from scratch over the whole dataset. These drawbacks pose major challenges for existing outlier detection algorithms in terms of accuracy when they are implemented in a streaming environment. In this paper, we propose a new algorithm called GSILOF that focuses on detecting outliers in data streams using a genetic algorithm. GSILOF solves the problem of large memory requirements because it has a fixed memory bound. GSILOF has two phases. First, the summarization phase summarizes the data that has already arrived. Second, the detection phase detects outliers in the newly arriving data. The summarization phase uses a genetic algorithm to find a subset of points that can represent the whole original set. Our experiments were carried out on real datasets and confirm the effectiveness of the proposed approach and the high quality of its approximate solutions on a set of real-world streaming data.
  • Keywords: Outlier detection; data streams; local outlier factor; genetics
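
The two phases described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' GSILOF implementation: the abstract does not give the fitness function, genetic operators, or parameters, so the coverage-based fitness, the crossover and mutation steps, and all names below (`ga_summarize`, `coverage_cost`, `lof`, `m`, `k`) are assumptions made for illustration only. The LOF part follows the standard definition (k-distance, reachability distance, local reachability density), scored against the fixed-size summary rather than the full history, and it assumes the data contains no duplicate points.

```python
import math
import random

def knn(p, data, k):
    """k nearest neighbours of p in data as (distance, point) pairs, skipping p itself."""
    return sorted((math.dist(p, q), q) for q in data if q is not p)[:k]

def k_distance(p, data, k):
    """Distance from p to its k-th nearest neighbour."""
    return knn(p, data, k)[-1][0]

def lrd(p, data, k):
    """Local reachability density: inverse of the mean reachability distance."""
    reach = [max(k_distance(o, data, k), d) for d, o in knn(p, data, k)]
    return len(reach) / sum(reach)

def lof(p, data, k):
    """Standard LOF score of p relative to data (about 1 for inliers, larger for outliers)."""
    neigh = knn(p, data, k)
    return sum(lrd(o, data, k) for _, o in neigh) / (len(neigh) * lrd(p, data, k))

def coverage_cost(subset, data):
    """Hypothetical fitness: total distance from every point to its nearest kept point."""
    return sum(min(math.dist(p, data[i]) for i in subset) for p in data)

def ga_summarize(data, m, generations=30, pop_size=20, seed=0):
    """Summarization phase (assumed design): evolve index subsets of size m."""
    rng = random.Random(seed)
    n = len(data)
    pop = [rng.sample(range(n), m) for _ in range(pop_size)]
    for _ in range(generations):
        pop.sort(key=lambda s: coverage_cost(s, data))
        parents = pop[: pop_size // 2]  # keep the better half unchanged
        children = []
        while len(parents) + len(children) < pop_size:
            a, b = rng.sample(parents, 2)
            # Crossover: draw m distinct indices from the union of both parents.
            child = rng.sample(sorted(set(a) | set(b)), m)
            # Mutation: occasionally swap one kept index for an unused one.
            if rng.random() < 0.3:
                unused = [i for i in range(n) if i not in child]
                if unused:
                    child[rng.randrange(m)] = rng.choice(unused)
            children.append(child)
        pop = parents + children
    best = min(pop, key=lambda s: coverage_cost(s, data))
    return [data[i] for i in best]

if __name__ == "__main__":
    rng = random.Random(1)
    # Past data: one Gaussian cluster standing in for a window of the stream.
    window = [(rng.gauss(0, 1), rng.gauss(0, 1)) for _ in range(60)]
    summary = ga_summarize(window, m=20)  # memory is bounded by m points
    # Detection phase: score newly arriving points against the summary only.
    print("inlier LOF :", lof((0.1, -0.2), summary, k=5))
    print("outlier LOF:", lof((8.0, 8.0), summary, k=5))
```

Scoring only against the `m` summary points is what keeps memory fixed in this sketch; how closely the scores track LOF on the full data depends on how well the chosen subset represents the original set, which is the quantity the assumed fitness function tries to optimize.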