
Article information

  • Title: Linguistic measures of chemical diversity and the “keywords” of molecular collections
  • Authors: Michał Woźniak ; Agnieszka Wołos ; Urszula Modrzyk
  • Journal: Scientific Reports
  • Electronic ISSN: 2045-2322
  • Publication year: 2018
  • Volume: 8
  • Issue: 1
  • Pages: 7598
  • DOI: 10.1038/s41598-018-25440-6
  • Language: English
  • Publisher: Springer Nature
  • Abstract: Computerized linguistic analyses have proven of immense value in comparing and searching through large text collections ("corpora"), including those deposited on the Internet - indeed, it would nowadays be hard to imagine browsing the Web without, for instance, search algorithms extracting most appropriate keywords from documents. This paper describes how such corpus-linguistic concepts can be extended to chemistry based on characteristic "chemical words" that span more than traditional functional groups and, instead, look at common structural fragments molecules share. Using these words, it is possible to quantify the diversity of chemical collections/databases in new ways and to define molecular "keywords" by which such collections are best characterized and annotated.