Journal: Proceedings of the National Academy of Sciences
Print ISSN: 0027-8424
Electronic ISSN: 1091-6490
Year: 2005
Volume: 102
Issue: 22
Pages: 7900-7905
DOI: 10.1073/pnas.0502790102
Language: English
Publisher: The National Academy of Sciences of the United States of America
Abstract: Sequence comparison across multiple organisms aids in the detection of regions under selection. However, resource limitations require a prioritization of genomes to be sequenced. This prioritization should be grounded in two considerations: the lineal scope encompassing the biological phenomena of interest, and the optimal species within that scope for detecting functional elements. We introduce a statistical framework for optimal species subset selection, based on maximizing power to detect conserved sites. Analysis of a phylogenetic star topology shows theoretically that the optimal species subset is not in general the most evolutionarily diverged subset. We then demonstrate this finding empirically in a study of vertebrate species. Our results suggest that marsupials are prime sequencing candidates.
Keywords: hypothesis testing; likelihood ratio; sequence analysis