
Article information

  • Title: A large-scale study on research code quality and execution
  • Authors: Ana Trisovic; Matthew K. Lau; Thomas Pasquier
  • Journal: Scientific Data
  • Electronic ISSN: 2052-4463
  • Publication year: 2022
  • Volume: 9
  • Issue: 1
  • Pages: 1-16
  • DOI: 10.1038/s41597-022-01143-6
  • Language: English
  • Publisher: Nature Publishing Group
  • Abstract: This article presents a study on the quality and execution of research code from publicly-available replication datasets at the Harvard Dataverse repository. Research code is typically created by a group of scientists and published together with academic papers to facilitate research transparency and reproducibility. For this study, we define ten questions to address aspects impacting research reproducibility and reuse. First, we retrieve and analyze more than 2000 replication datasets with over 9000 unique R files published from 2010 to 2020. Second, we execute the code in a clean runtime environment to assess its ease of reuse. Common coding errors were identified, and some of them were solved with automatic code cleaning to aid code execution. We find that 74% of R files failed to complete without error in the initial execution, while 56% failed when code cleaning was applied, showing that many errors can be prevented with good coding practices. We also analyze the replication datasets from journals' collections and discuss the impact of the journal policy strictness on the code re-execution rate. Finally, based on our results, we propose a set of recommendations for code dissemination aimed at researchers, journals, and repositories.