Abstract: Selecting good models from a pool of candidate structures is an important and challenging step in protein structure prediction. Although various scoring functions have been developed, their performance in protein structure prediction remains unsatisfactory. In this study, we developed a novel two-stage optimization method that combines a set of basic scoring functions to improve selection performance. In the first stage, protein-dependent optimization, the method combines seven scoring functions and optimizes their weights on the model pool of each protein. In the second stage, the method integrates the scores using the optimized protein-dependent weights and then uses a Support Vector Machine (SVM) to learn correlations between these scores and structural features in order to predict the quality of protein structures. Tests on two benchmarks built with different model generation methods showed that the sum of the basic scoring functions with optimized weights achieved better model selection performance than any individual scoring function or an equal-weight combination of these scoring functions. A leave-one-out test showed that the second stage further improves on the weighted-sum score.
Keywords: protein model selection; score combination; scoring functions
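
The first-stage idea described in the abstract, a weighted sum of seven scoring functions with weights tuned on one protein's model pool, can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration and is not taken from the paper: the objective (Pearson correlation with a reference quality value), the random search over weights, the z-score normalization, the convention that higher scores are better, and the synthetic toy data. The second-stage SVM is not sketched.

```python
import random
import statistics


def zscores(values):
    """Standardize one scoring function's raw values across a model pool."""
    mean = statistics.fmean(values)
    sd = statistics.pstdev(values)
    return [0.0 if sd == 0 else (v - mean) / sd for v in values]


def weighted_sum(score_table, weights):
    """Combine scores; score_table[k][i] is scoring function k on model i."""
    n_models = len(score_table[0])
    return [sum(w * col[i] for w, col in zip(weights, score_table))
            for i in range(n_models)]


def pearson(xs, ys):
    """Pearson correlation coefficient; returns 0.0 for constant input."""
    mx, my = statistics.fmean(xs), statistics.fmean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return 0.0 if vx == 0 or vy == 0 else cov / (vx * vy) ** 0.5


def optimize_weights(score_table, reference, n_trials=2000, seed=0):
    """Random search over non-negative weights that sum to 1.

    Assumed objective (not from the paper): maximize the correlation
    between the weighted sum and a reference quality value. The search
    starts from equal weights, so it never returns a worse correlation.
    """
    rng = random.Random(seed)
    k = len(score_table)
    best_w = [1.0 / k] * k
    best_r = pearson(weighted_sum(score_table, best_w), reference)
    for _ in range(n_trials):
        raw = [rng.random() for _ in range(k)]
        total = sum(raw)
        w = [r / total for r in raw]
        r = pearson(weighted_sum(score_table, w), reference)
        if r > best_r:
            best_w, best_r = w, r
    return best_w, best_r


# Synthetic toy pool: 50 models, 7 scoring functions of increasing noise.
rng = random.Random(1)
quality = [rng.random() for _ in range(50)]  # hypothetical quality values
noise_levels = [0.1, 0.3, 0.5, 0.7, 0.9, 1.1, 1.3]
score_table = [zscores([q + rng.gauss(0, s) for q in quality])
               for s in noise_levels]

equal_w = [1.0 / 7] * 7
equal_r = pearson(weighted_sum(score_table, equal_w), quality)
best_w, best_r = optimize_weights(score_table, quality)

# Select the model with the highest combined score.
combined = weighted_sum(score_table, best_w)
best_model = max(range(len(combined)), key=combined.__getitem__)
```

In this toy setting the optimized weights can only match or beat the equal-weight baseline on the pool they were tuned on, because the search is initialized at equal weights. A real evaluation would need a quality signal that is available without the native structure and a held-out test, such as the leave-one-out test mentioned in the abstract.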