
Basic article information

  • Title: Codon optimization with deep learning to enhance protein expression
  • Authors: Hongguang Fu; Yanbing Liang; Xiuqin Zhong
  • Journal: Scientific Reports
  • Electronic ISSN: 2045-2322
  • Publication year: 2020
  • Volume: 10
  • Issue: 1
  • Pages: 1-9
  • DOI: 10.1038/s41598-020-74091-z
  • Publisher: Springer Nature
  • Abstract: Heterologous expression is the main approach for recombinant protein production in genetic synthesis, for which codon optimization is necessary. The existing optimization methods are based on biological indexes. In this paper, we propose a novel codon optimization method based on deep learning. First, we introduce the concept of codon boxes, via which DNA sequences can be recoded into codon box sequences while ignoring the order of bases. Then, the problem of codon optimization can be converted to sequence annotation of corresponding amino acids with codon boxes. The codon optimization models for Escherichia coli were trained with a Bidirectional Long Short-Term Memory Conditional Random Field (BiLSTM-CRF). Theoretically, deep learning is a good method to obtain the distribution characteristics of DNA. In addition to the comparison of the codon adaptation index, protein expression experiments for a Plasmodium falciparum candidate vaccine and polymerase acidic protein were implemented for comparison with the original sequences and the optimized sequences from Genewiz and ThermoFisher. The results show that our method for enhancing protein expression is efficient and competitive.
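
The codon box recoding mentioned in the abstract can be illustrated with a minimal sketch. The abstract only states that codon boxes ignore the order of bases; representing a box as the alphabetically sorted three bases of a codon is an assumption made here for illustration, and the function names `codon_box` and `to_codon_boxes` are hypothetical, not taken from the paper.

```python
from typing import List


def codon_box(codon: str) -> str:
    """Return an order-independent label for a codon.

    Assumption: a "codon box" is modeled as the codon's three bases
    sorted alphabetically, so ATG, GTA and TGA share one label.
    """
    codon = codon.upper()
    if len(codon) != 3 or any(base not in "ACGT" for base in codon):
        raise ValueError(f"not a valid DNA codon: {codon!r}")
    return "".join(sorted(codon))


def to_codon_boxes(dna: str) -> List[str]:
    """Split a coding sequence into codons and recode each as a box label."""
    if len(dna) % 3 != 0:
        raise ValueError("sequence length must be a multiple of 3")
    return [codon_box(dna[i:i + 3]) for i in range(0, len(dna), 3)]


if __name__ == "__main__":
    # Recode a short, made-up coding sequence codon by codon.
    print(to_codon_boxes("ATGGCTTAA"))
```

Under this reading, a sequence-labeling model would assign one such box label to each amino acid position; how the paper maps a predicted box back to a concrete codon is not specified in the abstract and is not modeled here.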