
Article Information

  • Title: Effective Data Cleaning with Continuous Evaluation
  • Author: Ihab F. Ilyas
  • Journal: Bulletin of the Technical Committee on Data Engineering
  • Year: 2016
  • Volume: 39
  • Issue: 2
  • Page: 38
  • Publisher: IEEE Computer Society
  • Abstract: Enterprises have been acquiring large amounts of data from a variety of sources to build their own “Data Lakes”, with the goal of enriching their data asset and enabling richer and more informed analytics. The pace of the acquisition and the variety of the data sources make it impossible to clean this data as it arrives. This new reality has made data cleaning a continuous process and a part of day-to-day data processing activities. The large body of data cleaning algorithms and techniques is strong evidence of how complex the problem is, yet it has had little success in being adopted in real-world data cleaning applications. In this article we examine how the community has been evaluating the effectiveness of data cleaning algorithms, and if current data cleaning proposals are solving the right problems to enable the development of deployable and effective solutions.