期刊名称:Proceedings of the National Academy of Sciences
印刷版ISSN:0027-8424
电子版ISSN:1091-6490
出版年度:2004
卷号:101
期号:35
页码:12957-12962
DOI:10.1073/pnas.0402177101
语种:English
出版社:The National Academy of Sciences of the United States of America
摘要:The unambiguous footprint of positive Darwinian selection in protein-coding DNA sequences is revealed by an excess of nonsynonymous substitutions over synonymous substitutions compared with the neutral expectation. Methods for analyzing the patterns of nonsynonymous and synonymous substitutions usually rely on stochastic models in which the selection regime may vary across the sequence but remains constant across lineages for any amino acid position. Despite some work that has relaxed the constraint that selection patterns remain constant over time, no model provides a strong statistical framework to deal with switches between selection processes at individual sites during the course of evolution. This paper describes an approach that allows the site-specific selection process to vary along lineages of a phylogenetic tree. The parameters of the switching model of codon substitution are estimated by using maximum likelihood. The analysis of eight HIV-1 env homologous sequence data sets shows that this model provides a significantly better fit to the data than one that does not take into account switches between selection patterns in the phylogeny at individual sites. We also provide strong evidence that the strength and the frequency of occurrence of selection might not be estimated accurately when the site-specific variation of selection regimes is ignored.
关键词:positive selection ; codon-based model of nucleotide substitutions ; phylogeny ; maximum likelihood