Article Information

  • Title: Predicting the Underlying Structure for Phylogenetic Trees Using Neural Networks and Logistic Regression
  • Authors: Hassan W. Kayondo; Samuel Mwalili
  • Journal: Open Journal of Statistics
  • Print ISSN: 2161-718X
  • Online ISSN: 2161-7198
  • Year: 2020
  • Volume: 10
  • Issue: 2
  • Pages: 239-251
  • DOI: 10.4236/ojs.2020.102017
  • Publisher: Scientific Research Publishing
  • Abstract: Understanding the underlying structure of phylogenetic trees is important because it informs the choice of methods for phylogenetic inference: methods suited to a structured population differ from those needed when the population is not structured. In this paper, we compare two supervised machine learning techniques, artificial neural networks (ANN) and logistic regression, for predicting the underlying structure of phylogenetic trees. We tuned the parameters of each model to identify the optimal configuration, then performed 10-fold cross-validation on the optimal logistic regression and ANN models. We also applied clustering, an unsupervised technique, to determine the number of clusters that could be identified in simulated phylogenetic trees drawn from both structured and non-structured populations. Both clustering and classification used tree statistics such as the Colless, Sackin and cophenetic indices, among others. In 10-fold cross-validation, logistic regression and ANN performed comparably, each with an average accuracy above 0.75. Most of the clustering indices gave 2 or 3 as the optimal number of clusters.
  • Keywords: Artificial Neural Networks; Logistic Regression; Phylogenetic Tree; Tree Statistics; Classification; Clustering
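
The abstract names the Colless and Sackin indices as tree statistics used as features. As a rough illustration of what these two balance indices measure, the sketch below computes them for a rooted binary tree. The nested-tuple tree representation and the function name `tree_indices` are assumptions made for this example; the record does not describe the authors' implementation. The Sackin index is taken here as the sum of leaf depths, and the Colless index as the sum, over internal nodes, of the absolute difference in leaf counts between the two child subtrees.

```python
def tree_indices(tree):
    """Return (n_leaves, sackin, colless) for a rooted binary tree.

    Hypothetical representation: a leaf is any non-tuple value (e.g. a
    string label); an internal node is a 2-tuple (left, right).
    """
    if not isinstance(tree, tuple):
        # A single leaf: one tip, depth 0, no imbalance.
        return 1, 0, 0
    left, right = tree
    n_left, sackin_left, colless_left = tree_indices(left)
    n_right, sackin_right, colless_right = tree_indices(right)
    n = n_left + n_right
    # Every leaf below this node sits one edge deeper than in its subtree.
    sackin = sackin_left + sackin_right + n
    # This node contributes the difference in leaf counts of its children.
    colless = colless_left + colless_right + abs(n_left - n_right)
    return n, sackin, colless


# A fully balanced and a fully unbalanced (caterpillar) tree on four tips.
balanced = (("a", "b"), ("c", "d"))
caterpillar = ((("a", "b"), "c"), "d")

print(tree_indices(balanced))     # (4, 8, 0)
print(tree_indices(caterpillar))  # (4, 9, 3)
```

Both indices are larger for the caterpillar tree than for the balanced one, which is the kind of shape difference a classifier can use as an input feature.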