
Article information

  • Title: Using Change Point Detection for Monitoring the Quality of Aggregate Data
  • Authors: Ian Painter; Julie Eaton; Bill Lober
  • Journal: Online Journal of Public Health Informatics
  • Electronic ISSN: 1947-2579
  • Publication year: 2013
  • Volume: 5
  • Issue: 1
  • Language: English
  • Publisher: University of Illinois at Chicago
  • Abstract: Introduction: Data consisting of counts or indicators aggregated from multiple sources pose particular problems for data quality monitoring when the users of the aggregate data are blind to the individual sources. This arises when agencies wish to share data but, for privacy or contractual reasons, are only able to share data at an aggregate level. If the aggregators of the data are unable to guarantee the quality of either the sources of the data or the aggregation process, then the quality of the aggregate data may be compromised. This situation arose in the Distribute surveillance system (1). Distribute was a national emergency department syndromic surveillance project for influenza-like illness (ILI), developed by the International Society for Disease Surveillance, that integrated data from existing state and local public health department surveillance systems and operated from 2006 until mid-2012. Distribute was designed to work solely with aggregated data: sites provided data aggregated from sources within their jurisdiction, and detailed information on the un-aggregated ‘raw’ data was unavailable. Previous work (2) on Distribute data quality identified several issues caused in part by the nature of the system: transient problems due to inconsistent uploads, problems associated with transient or long-term changes in the source makeup of the reporting sites, and a lack of data timeliness due to individual site data accruing over time rather than in batch. Data timeliness was addressed using prediction intervals to assess the reliability of the partially accrued data (3). The types of data quality issues present in the Distribute data are likely to appear to some extent in any aggregate data surveillance system where direct control over the quality of the source data is not possible. In this work we present methods for detecting both transient and long-term changes in the source data makeup.
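
As a rough illustration of the kind of long-term change the abstract describes (for example, a contributing source dropping out of a site's aggregate), the following is a minimal sketch of single change point detection by least-squares mean shift. It is not the authors' method, which this record does not describe; the function name `find_mean_shift`, the `min_segment` parameter, and the daily counts are all hypothetical and chosen only for illustration.

```python
from statistics import fmean


def find_mean_shift(counts, min_segment=3):
    """Return (index, gain) for the single best mean-shift split.

    index is the position of the first observation of the second segment,
    or None if no split reduces the squared error. gain is the reduction
    in the sum of squared errors achieved by that split.
    Hypothetical helper for illustration only.
    """
    n = len(counts)
    if n < 2 * min_segment:
        raise ValueError("series too short for the requested min_segment")

    def sse(values):
        # Sum of squared deviations from the segment mean.
        m = fmean(values)
        return sum((v - m) ** 2 for v in values)

    total = sse(counts)
    best_index, best_gain = None, 0.0
    # Try every split that leaves at least min_segment points on each side.
    for k in range(min_segment, n - min_segment + 1):
        gain = total - (sse(counts[:k]) + sse(counts[k:]))
        if gain > best_gain:
            best_index, best_gain = k, gain
    return best_index, best_gain


# Made-up daily aggregate counts: the level drops after the tenth day,
# as might happen if one source stopped reporting.
daily_counts = [100, 98, 103, 101, 99, 102, 100, 97, 101, 100,
                80, 79, 82, 81, 78, 80, 83, 79, 81, 80]
index, gain = find_mean_shift(daily_counts)
print(index, round(gain, 1))
```

On this made-up series the split lands at index 10, the first day at the lower level. A sketch like this finds at most one shift and does not judge whether the shift is statistically meaningful; transient problems such as a single inconsistent upload would need a different treatment.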