
Article Information

  • Title: Text Mining as a Strategy in Profiling the Use of Influenza Virus Genome in Scientific Publications
  • Authors: Fernanda C. R. Correa; Aline A. Vanin; Silvio C. Cazella
  • Journal: International Journal of Computer and Information Technology
  • Print ISSN: 2279-0764
  • Year of publication: 2016
  • Volume: 5
  • Issue: 5
  • Pages: 438-442
  • Publisher: International Journal of Computer and Information Technology
  • Abstract: The aim of this study was to profile the use and usage patterns of the influenza virus genome in scientific publications in online databases using Natural Language Processing and Text Mining techniques. A systematic search was performed to select papers in the PubMed electronic database using the keywords ‘influenza’, ‘genome’, and ‘database’. The 45 articles with free full text available were processed with the software tools AntFileConverter and AntConc. Text Mining was performed with the software Weka. Association rules were expected between genome and influenza. It was also predicted that influenza genome and terms directly related to the application of genome databases would be associated. However, the results revealed an association between influenza virus protein and mutation sequence/database. The discovery of associations different from those expected revealed the need to expand the research in order to increase the size of the corpus and to improve attribute selection for mining in the Weka software.
  • Keywords: Data Mining; Natural Language Processing; Influenza A virus; Genome; Viral; Databases; Nucleic Acid