Article Information

  • Title: Tensor-Based Semantically-Aware Topic Clustering of Biomedical Documents
  • Author: Georgios Drakopoulos
  • Journal: Computation
  • Electronic ISSN: 2079-3197
  • Publication year: 2017
  • Volume: 5
  • Issue: 3
  • Pages: 34
  • DOI: 10.3390/computation5030034
  • Language: English
  • Publisher: MDPI Publishing
  • Abstract: Biomedicine is a pillar of the collective, scientific effort of human self-discovery, as well as a major source of humanistic data codified primarily in biomedical documents. Despite their rigid structure, maintaining and updating a considerably-sized collection of such documents is a task of overwhelming complexity mandating efficient information retrieval for the purpose of the integration of clustering schemes. The latter should work natively with inherently multidimensional data and higher order interdependencies. Additionally, past experience indicates that clustering should be semantically enhanced. Tensor algebra is the key to extending the current term-document model to more dimensions. In this article, an alternative keyword-term-document strategy, based on scientometric observations that keywords typically possess more expressive power than ordinary text terms, whose algorithmic cornerstones are third order tensors and MeSH ontological functions, is proposed. This strategy has been compared against a baseline using two different biomedical datasets, the TREC (Text REtrieval Conference) genomics benchmark and a large custom set of cognitive science articles from PubMed.
  • Keywords: humanistic data; higher order data; medical information retrieval; topic clustering; PubMed; MeSH Ontology; tensor algebra; Tucker factorization