Article information

  • Title: A Data Science Approach to Predictive Analytic Research and Knowledge Translation
  • Authors: Stacey Fisher; Robert Talarico; Yulric Sequeira
  • Journal: International Journal of Population Data Science
  • Electronic ISSN: 2399-4908
  • Year of publication: 2018
  • Volume: 3
  • Issue: 4
  • Pages: 1-1
  • DOI: 10.23889/ijpds.v3i4.797
  • Publisher: Swansea University
  • Abstract:
    Introduction: Current approaches to the development and application of predictive studies are inefficient and difficult to reproduce. Thousands of predictive health algorithms have been developed; however, fewer than 2% have been assessed outside their original setting, and even fewer have been applied and evaluated in practice.
    Objectives and Approach: The objective was to develop a standardized workflow for algorithm development, dissemination and implementation. Existing predictive analytics workflows and open standards were adapted and expanded for health research and health care settings. The approach was designed to work within multidisciplinary teams and to improve research transparency, reproducibility, quality, efficiency and application. Key components include standardized algorithm description files, documentation and code libraries. All libraries and programming packages, which were created for and with open-source software, can be used for a wide range of statistical and machine learning models. Publicly available repositories contain the algorithms, validation data, R code and other supporting infrastructure.
    Results: Algorithm development involves pre-specification and documentation of model variables, followed by creation of data preprocessing code to generate model variables from the study dataset. Preprocessing uses algorithm specification documentation and a function library, building upon and integrating with existing algorithms when possible to prevent code duplication. Models are output as a Predictive Model Markup Language (PMML) file, a portable industry standard for describing and scoring predictive models. A separate scoring "engine" is used to implement PMML-described algorithms in a range of settings, including algorithm validation at other research institutions. Algorithm applications currently include the Project Big Life (www.projectbiglife.ca) online calculators; population, health services and public health planning uses; and an algorithm visualization tool. An API permits use of the calculator engine by other organizations.
    Conclusion/Implications: Barriers to the implementation of predictive analytics in real-world settings, such as within electronic medical records or decision aid applications, can be mitigated with well-described algorithms that are easy to replicate and implement, especially as access to big health data increases and algorithms become increasingly complex.
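
The separation between a PMML model file and a scoring engine can be illustrated with a minimal sketch. This is not the authors' engine or any Project Big Life code: the toy model, its variable names (`age`, `bmi`), its coefficients, and the helper functions `load_linear_model` and `score` are all hypothetical, and the sketch handles only a plain linear `RegressionModel` with `NumericPredictor` terms (it ignores the optional `exponent` attribute and every other PMML feature). It uses only the Python standard library.

```python
import xml.etree.ElementTree as ET

# Hypothetical toy model for illustration only; the variables and
# coefficients are invented and do not come from the paper's algorithms.
PMML_DOC = """
<PMML version="4.4" xmlns="http://www.dmg.org/PMML-4_4">
  <Header/>
  <DataDictionary>
    <DataField name="age" optype="continuous" dataType="double"/>
    <DataField name="bmi" optype="continuous" dataType="double"/>
    <DataField name="risk" optype="continuous" dataType="double"/>
  </DataDictionary>
  <RegressionModel functionName="regression">
    <MiningSchema>
      <MiningField name="age"/>
      <MiningField name="bmi"/>
      <MiningField name="risk" usageType="target"/>
    </MiningSchema>
    <RegressionTable intercept="1.5">
      <NumericPredictor name="age" coefficient="0.25"/>
      <NumericPredictor name="bmi" coefficient="0.5"/>
    </RegressionTable>
  </RegressionModel>
</PMML>
"""

# Namespace of the PMML 4.4 schema, needed for element lookups.
NS = {"pmml": "http://www.dmg.org/PMML-4_4"}


def load_linear_model(pmml_text):
    """Read the intercept and coefficients of a linear RegressionModel."""
    root = ET.fromstring(pmml_text)
    table = root.find("pmml:RegressionModel/pmml:RegressionTable", NS)
    if table is None:
        raise ValueError("no RegressionModel/RegressionTable found")
    intercept = float(table.get("intercept"))
    coefficients = {
        predictor.get("name"): float(predictor.get("coefficient"))
        for predictor in table.findall("pmml:NumericPredictor", NS)
    }
    return intercept, coefficients


def score(intercept, coefficients, record):
    """Apply the linear predictor to one record (a dict of variable values)."""
    return intercept + sum(
        coefficient * record[name] for name, coefficient in coefficients.items()
    )


if __name__ == "__main__":
    intercept, coefficients = load_linear_model(PMML_DOC)
    print(score(intercept, coefficients, {"age": 40, "bmi": 20}))
```

Because the model lives entirely in the PMML text, the same scoring functions could in principle be pointed at a different model file without code changes, which is the portability property the abstract attributes to the PMML-plus-engine design.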