Article information

Title: A confidence predictor for logD using conformal regression and a support-vector machine
Authors: Maris Lapins; Staffan Arvidsson; Samuel Lampa et al.
Journal: Journal of Cheminformatics
Print ISSN: 1758-2946
Electronic ISSN: 1758-2946
Year of publication: 2018
Volume: 10
Issue: 1
Pages: 17
DOI: 10.1186/s13321-018-0271-1
Language: English
Publisher: BioMed Central
Abstract: Lipophilicity is a major determinant of ADMET properties and overall suitability of drug candidates. We have developed large-scale models to predict water–octanol distribution coefficient (logD) for chemical compounds, aiding drug discovery projects. Using ACD/logD data for 1.6 million compounds from the ChEMBL database, models are created and evaluated by a support-vector machine with a linear kernel using conformal prediction methodology, outputting prediction intervals at a specified confidence level. The resulting model shows a predictive ability of Q² = 0.973 and with the best performing nonconformity measure having median prediction interval of ± 0.39 log units at 80% confidence and ± 0.60 log units at 90% confidence. The model is available as an online service via an OpenAPI interface, a web page with a molecular editor, and we also publish predictive values at 90% confidence level for 91 M PubChem structures in RDF format for download and as an URI resolver service.
Keywords: Conformal prediction; Machine learning; QSAR; Support-vector machine; LogD; RDF
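
Note: The abstract describes a regression model that outputs prediction intervals at a specified confidence level. As a rough illustration of how such an interval can be derived, the sketch below applies split (inductive) conformal regression to a set of calibration residuals. It assumes the simplest nonconformity measure, the absolute prediction error, which is not necessarily the best performing measure reported in the paper. The function names and the example numbers are hypothetical and are not taken from the authors' software or data.

```python
import math


def interval_half_width(calibration_residuals, confidence):
    """Half-width of a split-conformal interval (assumed measure: |error|)."""
    # Nonconformity scores: absolute errors on a held-out calibration set.
    scores = sorted(abs(r) for r in calibration_residuals)
    n = len(scores)
    # Rank of the calibration score that gives the requested coverage.
    rank = math.ceil((n + 1) * confidence)
    if rank > n:
        # Too few calibration points to support this confidence level.
        return math.inf
    return scores[rank - 1]


def predict_interval(point_prediction, calibration_residuals, confidence):
    """Symmetric interval around a point prediction from any regressor."""
    half = interval_half_width(calibration_residuals, confidence)
    return point_prediction - half, point_prediction + half


# Hypothetical calibration residuals (observed minus predicted logD).
residuals = [0.1, -0.2, 0.3, -0.4, 0.5, 0.6, -0.7, 0.8, 0.9]
low, high = predict_interval(2.0, residuals, confidence=0.5)
```

Asking for higher confidence selects a larger calibration score and so widens the interval, which matches the pattern in the abstract, where the median interval grows from ± 0.39 at 80% confidence to ± 0.60 at 90%.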