Article Information

  • Title: AUTOMATICALLY CLASSIFYING PUBMED ABSTRACTS AS BENCH OR BEDSIDE
  • Authors: Daniel McDonald; Michelle Ashton
  • Journal: Issues in Information Systems
  • Print ISSN: 1529-7314
  • Publication year: 2017
  • Volume: 18
  • Issue: 1
  • Pages: 22-30
  • Publisher: International Association for Computer Information Systems
  • Abstract: The PubMed database contains over 20 million research abstracts, ranging from lab experiments and gene arrays to patient-facing research. We introduce a classification task that groups PubMed abstracts into categories of basic science and clinical research. We present a conditional probability algorithm and a decision tree algorithm, and compare the two algorithms on three different feature sets. The first feature set consists of semantic tags that appear as verbs in the abstract. The second feature set consists of tags that are nouns and appear as subjects or objects within a sentence. The third feature set combines the first two. Algorithms are evaluated using precision, recall, and f-measure. The decision tree algorithm with features made up of both verb tags and tags from subjects and objects outperformed all other combinations, achieving a precision of 97 percent and a recall of 96.8 percent. The lack of fallback rules in the conditional probability algorithm hurt its performance. The decision tree algorithm was more robust to test abstracts of different lengths and to unseen feature values.
  • Keywords: Information retrieval; document classification; text mining
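
The abstract reports precision and recall but does not state the f-measure itself. As a minimal sketch, assuming the balanced F1 variant (the harmonic mean of precision and recall, beta = 1), the three metrics relate as shown below. The function names and the `beta` parameter are illustrative and are not taken from the paper.

```python
def precision(true_pos: int, false_pos: int) -> float:
    """Fraction of abstracts predicted as a class that truly belong to it."""
    predicted = true_pos + false_pos
    return true_pos / predicted if predicted else 0.0


def recall(true_pos: int, false_neg: int) -> float:
    """Fraction of abstracts in a class that the classifier found."""
    actual = true_pos + false_neg
    return true_pos / actual if actual else 0.0


def f_measure(p: float, r: float, beta: float = 1.0) -> float:
    """Weighted harmonic mean of precision p and recall r.

    beta = 1 gives the balanced F1 score; that the paper uses
    beta = 1 is an assumption, not something the abstract states.
    """
    denom = beta ** 2 * p + r
    return (1 + beta ** 2) * p * r / denom if denom else 0.0


if __name__ == "__main__":
    # Precision and recall as reported in the abstract for the best configuration.
    print(round(f_measure(0.97, 0.968), 3))  # approximately 0.969
```

Under that assumption, the reported precision of 97 percent and recall of 96.8 percent correspond to a balanced f-measure of roughly 96.9 percent.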