Article information

  • Title: A structural census of the current population of protein sequences
  • Authors: Mark Gerstein; Michael Levitt
  • Journal: Proceedings of the National Academy of Sciences
  • Print ISSN: 0027-8424
  • Electronic ISSN: 1091-6490
  • Year of publication: 1997
  • Volume: 94
  • Issue: 22
  • Pages: 11911-11916
  • DOI: 10.1073/pnas.94.22.11911
  • Language: English
  • Publisher: The National Academy of Sciences of the United States of America
  • Abstract: We examine the occurrence of the ≈300 known protein folds in different groups of organisms. To do this, we characterize a large fraction of the currently known protein sequences (≈140,000) in structural terms, by matching them to known structures via sequence comparison (or by secondary-structure class prediction for those without structural homologues). Overall, we find that an appreciable fraction of the known folds are present in each of the major groups of organisms (e.g., bacteria and eukaryotes share 156 of 275 folds), and most of the common folds are associated with many families of nonhomologous sequences (i.e., >10 sequence families for each common fold). However, different groups of organisms have characteristically distinct distributions of folds. So, for instance, some of the most common folds in vertebrates, such as globins or zinc fingers, are rare or absent in bacteria. Many of these differences in fold usage are biologically reasonable, such as the folds of metabolic enzymes being common in bacteria and those associated with extracellular transport and communication being common in animals. They also have important implications for database-based methods for fold recognition, suggesting that an unknown sequence from a plant is more likely to have a certain fold (e.g., a TIM barrel) than an unknown sequence from an animal.
  • Keywords: sequence analysis; genome comparison; fold family; databank statistics; protein evolution