
Article Information

  • Title: Code List Library: A Solution to Improve Research Repeatability, Transparency, and Efficiency by Curating Lists of Clinical Codes
  • Authors: Daniel Thayer; David Bown; Thomas Leake
  • Journal: International Journal of Population Data Science
  • Electronic ISSN: 2399-4908
  • Publication year: 2018
  • Volume: 3
  • Issue: 4
  • Pages: 1-1
  • DOI: 10.23889/ijpds.v3i4.891
  • Publisher: Swansea University
  • Abstract: Introduction: Sets of clinical codes that define conditions and events of interest are a key knowledge product in health data research. Documenting such lists is essential for transparency and repeatability, and there is great potential benefit in sharing and reusing them. We designed and implemented software to address these goals. Objectives and Approach: Our goals were threefold: (1) provide a graphical user interface (GUI) that makes code lists easier to create for less technical users; (2) allow clear documentation of code lists, preserving the history of their creation and capturing metadata about their meaning, provenance, and use; and (3) facilitate programmatic access, so that the software is not just documentation but can be integrated into data preparation and analysis. To these ends, we developed a web application using Python and PostgreSQL that allows code lists to be created, edited, and accessed via a GUI, together with a REST API for integration into SQL, R, and other environments. Results: The software allows users to view and create lists through a familiar web paradigm. Lists can be built by identifying codes in a variety of ways, including keyword searches, regular expressions, and more complex rules. A change history is stored. Information such as a description, whether the list was clinically reviewed, and relevant publications is captured. The REST API allows access and use in a variety of settings. We have implemented a DB2 SQL interface to enable code lists to be used within database queries, and other interfaces, such as an R package, are planned for the future. The software will be used within the SAIL Databank initially, with a public version for sharing across institutions planned. The code will be open source to enable further development. Conclusion/Implications: We expect this tool to facilitate faster, higher-quality, more reproducible research in Wales and beyond. We hope it will be not just a standalone effort, but one small piece in a set of better tools and methods that will enable our field to truly realize the benefit of large linked datasets.
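
The abstract mentions building lists through keyword searches and regular expressions. The following is a minimal Python sketch of that idea only, not the authors' implementation. The function names (`search_by_keyword`, `search_by_regex`, `build_code_list`) and the in-memory `CODES` dictionary are hypothetical, and the four ICD-10 entries are an illustrative sample chosen for this example; the abstract does not say which coding systems the tool supports.

```python
import re

# Illustrative sample only (assumption): a few ICD-10 codes and descriptions.
# The real tool stores its code dictionaries in PostgreSQL.
CODES = {
    "E10": "Type 1 diabetes mellitus",
    "E11": "Type 2 diabetes mellitus",
    "I10": "Essential (primary) hypertension",
    "J45": "Asthma",
}


def search_by_keyword(codes, keyword):
    """Return codes whose description contains the keyword, ignoring case."""
    needle = keyword.lower()
    return {code for code, desc in codes.items() if needle in desc.lower()}


def search_by_regex(codes, pattern):
    """Return codes whose code string matches the pattern from its start."""
    compiled = re.compile(pattern)
    return {code for code in codes if compiled.match(code)}


def build_code_list(codes, keyword=None, pattern=None):
    """Combine the keyword and regex matches into one sorted code list."""
    selected = set()
    if keyword is not None:
        selected |= search_by_keyword(codes, keyword)
    if pattern is not None:
        selected |= search_by_regex(codes, pattern)
    return sorted(selected)


if __name__ == "__main__":
    print(build_code_list(CODES, keyword="diabetes"))
    print(build_code_list(CODES, keyword="asthma", pattern=r"I\d\d"))
```

In this sketch the two search methods are combined as a union, so a list can grow from several independent rules; a real tool would also need exclusion rules and a record of which rule selected each code.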