
Article Information

  • Title: De-Duplication Scheduling Strategy in Real-Time Data Warehouse
  • Authors: Jie Song; Hui Liu; JinBo Wu
  • Journal: The Open Cybernetics & Systemics Journal
  • Electronic ISSN: 1874-110X
  • Publication year: 2015
  • Volume: 9
  • Issue: 1
  • Pages: 37-43
  • DOI: 10.2174/1874110X01509010037
  • Publisher: Bentham Science Publishers Ltd
  • Abstract:

    The data quality of a data warehouse is crucial to decision-makers, and data duplication is considered one of the critical factors affecting it. Data de-duplication is therefore an essential process in data warehousing. For a real-time data warehouse in particular, it is necessary to ensure not only data quality in real time but also the performance of front-end queries and analysis, so the scheduling strategy for de-duplication deserves careful study. In this paper, we first investigate three kinds of data de-duplication scheduling strategies: the De-duplication Prior scheduling Strategy (DPS), the Real-time scheduling Strategy (RS), and the ETL Prior scheduling Strategy (EPS). We then propose a new Time-Triggered scheduling Strategy (TTS), which belongs to the EPS category, and finally evaluate its performance through experiments. This work contributes to efficient data cleaning and to the application of real-time data warehouses.
