Abstract: Prediction of protein solubility is gaining importance with the growing use of proteins as therapeutics and the ongoing requirement for high-level expression. We have investigated protein surface features that correlate with insolubility. Non-polar surface patches associate with insolubility to some degree, but this association is far weaker than that with positively charged patches. Negatively charged patches do not separate the soluble and insoluble subsets. The separation of soluble and insoluble subsets by positive charge clustering (area under the ROC curve of 0.85) has a striking parallel with the separation that delineates nucleic acid-binding proteins, although most proteins in the insoluble dataset are not known to bind nucleic acid. Additionally, these basic patches are enriched in arginine relative to lysine. The results are discussed in the context of expression systems and downstream processing, contributing to a view of protein solubility in which the molecular interactions of charged groups are far from equivalent.
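For readers unfamiliar with the ROC statistic quoted above, the following is a minimal sketch of how an area under the curve can be computed from a per-protein patch score. It uses the rank interpretation of the AUC: the probability that a randomly chosen insoluble protein scores higher than a randomly chosen soluble one. The function name `roc_auc` and the score values are hypothetical illustrations; they are not the study's method or data.

```python
def roc_auc(insoluble_scores, soluble_scores):
    """Area under the ROC curve via pairwise comparison.

    Returns the fraction of (insoluble, soluble) pairs in which the
    insoluble protein has the higher score; ties count as half.
    """
    wins = 0.0
    for pos in insoluble_scores:
        for neg in soluble_scores:
            if pos > neg:
                wins += 1.0
            elif pos == neg:
                wins += 0.5
    return wins / (len(insoluble_scores) * len(soluble_scores))


# Hypothetical toy scores (e.g. size of the largest positive patch).
# These numbers are invented for illustration only.
insoluble = [3.0, 2.5, 1.0]
soluble = [0.5, 1.5, 2.0, 1.0]

auc = roc_auc(insoluble, soluble)
print(f"AUC = {auc:.3f}")
```

An AUC of 0.5 would mean the score does not separate the two subsets at all, and 1.0 would mean perfect separation.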