Article information

  • Title: Data-driven supervised learning of a viral protease specificity landscape from deep sequencing and molecular simulations
  • Authors: Manasi A. Pethe ; Aliza B. Rubenstein
  • Journal: Proceedings of the National Academy of Sciences
  • Print ISSN: 0027-8424
  • Electronic ISSN: 1091-6490
  • Publication year: 2019
  • Volume: 116
  • Issue: 1
  • Pages: 168-176
  • DOI: 10.1073/pnas.1805256116
  • Language: English
  • Publisher: The National Academy of Sciences of the United States of America
  • Abstract: Biophysical interactions between proteins and peptides are key determinants of molecular recognition specificity landscapes. However, an understanding of how molecular structure and residue-level energetics at protein−peptide interfaces shape these landscapes remains elusive. We combine information from yeast-based library screening, next-generation sequencing, and structure-based modeling in a supervised machine learning approach to report the comprehensive sequence−energetics−function mapping of the specificity landscape of the hepatitis C virus (HCV) NS3/4A protease, whose function—site-specific cleavages of the viral polyprotein—is a key determinant of viral fitness. We screened a library of substrates in which five residue positions were randomized and measured cleavability of ∼30,000 substrates (∼1% of the library) using yeast display and fluorescence-activated cell sorting followed by deep sequencing. Structure-based models of a subset of experimentally derived sequences were used in a supervised learning procedure to train a support vector machine to predict the cleavability of 3.2 million substrate variants by the HCV protease. The resulting landscape allows identification of previously unidentified HCV protease substrates, and graph-theoretic analyses reveal extensive clustering of cleavable and uncleavable motifs in sequence space. Specificity landscapes of known drug-resistant variants are similarly clustered. The described approach should enable the elucidation and redesign of specificity landscapes of a wide variety of proteases, including human-origin enzymes. Our results also suggest a possible role for residue-level energetics in shaping plateau-like functional landscapes predicted from viral quasispecies theory.
  • Keywords: protease ; sequence−function mapping ; substrate specificity ; machine learning ; molecular modeling
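
The numbers in the abstract can be checked with a little arithmetic: five randomized positions over the 20 canonical amino acids give 20^5 = 3.2 million variants, of which about 30,000 measured substrates is roughly 1%. The sketch below reproduces that count and illustrates one plausible reading of "clustering in sequence space": sequences as graph nodes, joined when they differ at a single position. The edge rule, the function names, and the toy sequences are assumptions made for illustration; the record above does not say how the authors built their graph.

```python
from itertools import combinations

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 canonical residues
RANDOMIZED_POSITIONS = 5              # from the abstract


def library_size(positions: int = RANDOMIZED_POSITIONS) -> int:
    """Number of distinct substrates when `positions` sites are fully randomized."""
    return len(AMINO_ACIDS) ** positions


def hamming(a: str, b: str) -> int:
    """Count positions at which two equal-length sequences differ."""
    if len(a) != len(b):
        raise ValueError("sequences must have equal length")
    return sum(x != y for x, y in zip(a, b))


def clusters(seqs):
    """Connected components of the graph whose edges link sequences that
    differ at exactly one position (an assumed, illustrative edge rule)."""
    seqs = list(dict.fromkeys(seqs))  # drop duplicates, keep order
    parent = {s: s for s in seqs}

    def find(s):
        # Union-find lookup with path halving.
        while parent[s] != s:
            parent[s] = parent[parent[s]]
            s = parent[s]
        return s

    # Pairwise comparison is quadratic: fine for a toy set, not for 3.2 million.
    for a, b in combinations(seqs, 2):
        if hamming(a, b) == 1:
            parent[find(a)] = find(b)

    groups = {}
    for s in seqs:
        groups.setdefault(find(s), set()).add(s)
    # Largest cluster first; ties broken alphabetically by smallest member.
    return sorted(groups.values(), key=lambda g: (-len(g), min(g)))


if __name__ == "__main__":
    total = library_size()
    print(total)                      # 3200000
    print(f"{30_000 / total:.2%}")    # 0.94%
    # Hypothetical five-residue motifs, not taken from the paper.
    print(clusters(["DEMEE", "DEMEA", "DEMCA", "WWWWW"]))
```

On the toy input, the first three motifs form one cluster because each is a single substitution away from a neighbor, while the fourth stands alone.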