Basic Article Information

  • Title: FragBag, an accurate representation of protein structure, retrieves structural neighbors from the entire PDB quickly and accurately
  • Authors: Inbal Budowski-Tal ; Yuval Nov ; Rachel Kolodny
  • Journal: Proceedings of the National Academy of Sciences
  • Print ISSN: 0027-8424
  • Electronic ISSN: 1091-6490
  • Publication year: 2010
  • Volume: 107
  • Issue: 8
  • Pages: 3481-3486
  • DOI:10.1073/pnas.0914097107
  • Language: English
  • Publisher: The National Academy of Sciences of the United States of America
  • Abstract: Fast identification of protein structures that are similar to a specified query structure in the entire Protein Data Bank (PDB) is fundamental in structure and function prediction. We present FragBag, an ultrafast and accurate method for comparing protein structures. We describe a protein structure by the collection of its overlapping short contiguous backbone segments, and discretize this set using a library of fragments. Then, we succinctly represent the protein as a "bag-of-fragments" (a vector that counts the number of occurrences of each fragment) and measure the similarity between two structures by the similarity between their vectors. Our representation has two additional benefits: (i) it can be used to construct an inverted index, for implementing a fast structural search engine of the entire PDB, and (ii) one can specify a structure as a collection of substructures, without combining them into a single structure; this is valuable for structure prediction, when there are reliable predictions only of parts of the protein. We use receiver operating characteristic curve analysis to quantify the success of FragBag in identifying neighbor candidate sets in a dataset of over 2,900 structures. The gold standard is the set of neighbors found by six state-of-the-art structural aligners. Our best FragBag library finds more accurate candidate sets than the three other filter methods: the SGM, PRIDE, and a method by Zotenko et al. More interestingly, FragBag performs on a par with the computationally expensive, yet highly trusted structural aligners STRUCTAL and CE.
  • Keywords: evaluation of structure search ; fast structural search of Protein Data Bank ; filter and refine ; protein backbone fragments ; protein structure search
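
The representation described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation. The abstract says only that overlapping backbone segments are discretized with a fragment library, counted into a vector, compared by vector similarity, and indexed with an inverted index. Everything else here is an assumption made for illustration: the function names, the toy coordinates, the use of internal pairwise distances as a stand-in fragment dissimilarity, and the choice of cosine similarity between count vectors.

```python
import math
from collections import defaultdict


def internal_distances(fragment):
    """All pairwise distances in a fixed order (unchanged by rotation/translation)."""
    n = len(fragment)
    return [math.dist(fragment[i], fragment[j])
            for i in range(n) for j in range(i + 1, n)]


def fragment_distance(a, b):
    """Stand-in dissimilarity between two equal-length fragments (an assumption)."""
    da, db = internal_distances(a), internal_distances(b)
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(da, db)) / len(da))


def bag_of_fragments(backbone, library):
    """Count, for each library fragment, how many backbone windows it best matches."""
    k = len(library[0])  # all library fragments are assumed to share one length
    counts = [0] * len(library)
    for start in range(len(backbone) - k + 1):  # overlapping contiguous segments
        window = backbone[start:start + k]
        nearest = min(range(len(library)),
                      key=lambda i: fragment_distance(window, library[i]))
        counts[nearest] += 1
    return counts


def cosine_similarity(u, v):
    """One possible vector similarity; other measures could be substituted."""
    dot = sum(x * y for x, y in zip(u, v))
    norm_u = math.sqrt(sum(x * x for x in u))
    norm_v = math.sqrt(sum(y * y for y in v))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def build_inverted_index(bags):
    """Map each fragment id to the set of structure ids that contain it."""
    index = defaultdict(set)
    for structure_id, bag in bags.items():
        for fragment_id, count in enumerate(bag):
            if count:
                index[fragment_id].add(structure_id)
    return index


def candidate_structures(query_bag, index):
    """Structures sharing at least one fragment with the query."""
    found = set()
    for fragment_id, count in enumerate(query_bag):
        if count:
            found |= index.get(fragment_id, set())
    return found


# Toy library of two 3-point fragments: one straight, one bent (hypothetical data).
library = [
    [(0, 0, 0), (1, 0, 0), (2, 0, 0)],
    [(0, 0, 0), (1, 0, 0), (1, 1, 0)],
]
straight = [(0, 0, 0), (1, 0, 0), (2, 0, 0), (3, 0, 0), (4, 0, 0)]
mixed = [(0, 0, 0), (1, 0, 0), (2, 0, 0), (2, 1, 0)]

bags = {"straight": bag_of_fragments(straight, library),
        "mixed": bag_of_fragments(mixed, library)}
index = build_inverted_index(bags)
print(bags, cosine_similarity(bags["straight"], bags["mixed"]))
```

Because a bag is only a count vector, the bags of separately specified substructures could be added element-wise to form a single query, which is one way to read benefit (ii) of the abstract. In a filter-and-refine setting, the candidates returned by the index would then be ranked by vector similarity and passed on to a slower structural aligner.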