Journal: Computational and Structural Biotechnology Journal
Print ISSN: 2001-0370
Publication year: 2021
Volume: 19
Pages: 2269-2278
DOI: 10.1016/j.csbj.2021.04.028
Publisher: Computational and Structural Biotechnology Journal
Abstract: We develop a Regression-based Ranking by Pairwise Cluster Comparisons (RRPCC) method to rank clusters of similar protein complex conformations generated by an underlying docking program. The method leverages robust regression to predict the relative quality difference between any pair of clusters and combines these pairwise assessments to form a ranked list of clusters, from higher to lower quality. We apply RRPCC to clusters produced by the automated docking server ClusPro and, depending on the training/validation strategy, we show improvement by 24–100% in ranking acceptable or better quality clusters first, and by 15–100% in ranking medium or better quality clusters first. We compare the RRPCC–ClusPro combination to a number of alternatives, and show that very different machine learning approaches to scoring docked structures yield similar success rates. Finally, we discuss the current limitations on sampling and scoring, looking ahead to further improvements. Interestingly, some features important for improved scoring are internal energy terms that occur only due to the local energy minimization applied in the refinement stage following rigid body docking.
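
The pairwise-to-ranking step described in the abstract can be illustrated with a minimal sketch. The abstract does not say how RRPCC combines its pairwise assessments, so the aggregation rule below (summing predicted differences per cluster) is an assumption chosen for simplicity and not necessarily the authors' scheme. The names `rank_clusters`, `predict_diff`, and `toy_quality` are hypothetical, and the toy predictor merely stands in for a trained robust regression model.

```python
from itertools import combinations


def rank_clusters(cluster_ids, predict_diff):
    """Rank clusters from pairwise predicted quality differences.

    predict_diff(a, b) is a hypothetical callable that returns the predicted
    quality of cluster a minus that of cluster b (positive means a is better).
    Aggregation by summed differences is an assumption, not the paper's rule.
    """
    scores = {c: 0.0 for c in cluster_ids}
    for a, b in combinations(cluster_ids, 2):
        d = predict_diff(a, b)
        scores[a] += d  # a gains when it is predicted to be better than b
        scores[b] -= d  # b loses by the same amount
    # Highest aggregated score first; sorted() is stable, so ties keep input order.
    return sorted(cluster_ids, key=lambda c: -scores[c])


# Toy stand-in for a trained regressor: made-up quality values per cluster.
toy_quality = {"c1": 0.2, "c2": 0.9, "c3": 0.5}


def toy_predict_diff(a, b):
    return toy_quality[a] - toy_quality[b]


ranking = rank_clusters(list(toy_quality), toy_predict_diff)
print(ranking)
```

In practice `predict_diff` would wrap a regression model evaluated on features of the two clusters. Summing differences is only one of several plausible ways to turn pairwise predictions into a total order.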