Article Information

  • Title: A contemporary feature selection and classification framework for imbalanced biomedical datasets
  • Authors: Thulasi Bikku; Sambasiva Rao Nandam; Ananda Rao Akepogu
  • Journal: Egyptian Informatics Journal
  • Print ISSN: 1110-8665
  • Year: 2018
  • Volume: 19
  • Issue: 3
  • Pages: 191-198
  • DOI: 10.1016/j.eij.2018.03.003
  • Publisher: Elsevier
  • Abstract:

    Because the PubMed and Medline repositories hold a large number of biomedical documents, it is difficult to analyze, predict, and interpret document information using traditional document clustering and classification models. These traditional models fail to analyze document sets based on the user's keywords and MeSH terms. Because of the large number of feature sets, conventional models such as SVM, neural networks, and multinomial naïve Bayes have been used for feature classification, with additional text filtering measures typically serving as the feature selection process. Also, as the size of the documents increases, it becomes difficult to find outliers using the documents' features and MeSH terms. Biomedical document clustering and classification is one of the essential machine learning models for the knowledge extraction process of real-time user recommended systems. In this paper, we develop a novel biomedical document feature clustering and classification model as a user recommended system for large document sets using the Hadoop framework. In this model, a novel gene feature clustering with ensemble document classification is implemented on biomedical repositories (PubMed and Medline) using the MapReduce framework. Experimental results show that the proposed model achieves a high computational cluster quality rate and true positive classification rate compared to traditional document clustering and classification models.

  • Keywords: Biomedical data; Document clustering; Document classification; Bioinformatics; User recommended system