Article information

  • Title: Building a PubMed knowledge graph
  • Authors: Jian Xu; Sunkyu Kim; Min Song
  • Journal: Scientific Data
  • Electronic ISSN: 2052-4463
  • Year of publication: 2020
  • Volume: 7
  • Issue: 1
  • Pages: 1-15
  • DOI: 10.1038/s41597-020-0543-2
  • Language: English
  • Publisher: Nature Publishing Group
  • Abstract: PubMed® is an essential resource for the medical domain, but useful concepts are either difficult to extract or are ambiguous, which has significantly hindered knowledge discovery. To address this issue, we constructed a PubMed knowledge graph (PKG) by extracting bio-entities from 29 million PubMed abstracts, disambiguating author names, integrating funding data through the National Institutes of Health (NIH) ExPORTER, collecting affiliation history and educational background of authors from ORCID®, and identifying fine-grained affiliation data from MapAffil. Through the integration of these credible multi-source data, we could create connections among the bio-entities, authors, articles, affiliations, and funding. Data validation revealed that the BioBERT deep learning method of bio-entity extraction significantly outperformed the state-of-the-art models based on the F1 score (by 0.51%), with the author name disambiguation (AND) achieving an F1 score of 98.09%. PKG can trigger broader innovations, not only enabling us to measure scholarly impact, knowledge usage, and knowledge transfer, but also assisting us in profiling authors and organizations based on their connections with bio-entities.