Article information

Title: Correlation-centred variable selection of a gene expression signature to predict breast cancer metastasis
Authors: Shiori Hikichi; Masahiro Sugimoto; Masaru Tomita et al.
Journal: Scientific Reports
Electronic ISSN: 2045-2322
Publication year: 2020
Volume: 10
Issue: 1
Pages: 1-8
DOI: 10.1038/s41598-020-64870-z
Publisher: Springer Nature
Abstract: Predictions of distant cancer metastasis based on gene signatures are studied intensively to realise precise diagnosis and treatment. Gene selection, i.e. feature selection, is a cornerstone for both establishing accurate predictions and understanding underlying pathologies. Here, we developed a simple but robust feature selection method that uses a correlation-centred approach to select minimal gene sets with both high predictive and generalisation abilities. A multiple logistic regression model was used to predict 5-year metastasis in patients with breast cancer. Gene expression data obtained from tumour samples of lymph node-negative breast cancer patients were randomly split into training and validation data. Using the training data, our method selected 12 genes, and this signature showed a higher area under the receiver operating characteristic curve (0.730) than the previously reported 76-gene signature (0.579). The signature with the predictive model was validated in an independent dataset, where its higher generalisation ability was also observed. Gene ontology analyses revealed that our method consistently selected genes whose functions matched those frequently represented among the 76 genes. Taken together, our method identifies smaller gene sets with high predictive ability, and it should be versatile and applicable to predicting other factors, such as the outcomes of medical treatments and the prognoses of other cancer types.
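The abstract does not spell out the selection algorithm, so the following is only a minimal sketch of a generic correlation-based filter, not the authors' method. It ranks genes by the absolute Pearson correlation with the outcome, skips any gene that is strongly correlated with one already selected, and computes the area under the ROC curve for a score. The function names, the `redundancy_cutoff` value, and the toy data are all hypothetical, and the logistic regression step is omitted for brevity.

```python
import math


def pearson(x, y):
    """Pearson correlation of two equal-length sequences (0.0 if either is constant)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0
    return sxy / math.sqrt(sxx * syy)


def select_genes(columns, labels, max_genes=12, redundancy_cutoff=0.7):
    """Greedy filter (illustrative assumption, not the published procedure).

    columns: dict mapping gene name -> list of expression values per sample.
    labels:  list of 0/1 outcomes (1 = metastasis) per sample.
    """
    # Rank genes by how strongly they correlate with the outcome.
    ranked = sorted(
        columns, key=lambda g: abs(pearson(columns[g], labels)), reverse=True
    )
    selected = []
    for gene in ranked:
        # Keep a gene only if it is not redundant with any gene already kept.
        if all(
            abs(pearson(columns[gene], columns[kept])) < redundancy_cutoff
            for kept in selected
        ):
            selected.append(gene)
        if len(selected) == max_genes:
            break
    return selected


def auc(scores, labels):
    """Area under the ROC curve via pairwise comparison (ties count as half)."""
    pos = [s for s, l in zip(scores, labels) if l == 1]
    neg = [s for s, l in zip(scores, labels) if l == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Toy data, invented purely for illustration.
labels = [0, 0, 0, 1, 1, 1]
columns = {
    "geneA": [1, 2, 3, 7, 8, 9],
    "geneA_copy": [1, 2, 3, 7, 8, 10],  # nearly duplicates geneA
    "geneB": [4, 6, 3, 5, 4, 7],
    "geneC": [5, 5, 4, 5, 4, 5],
}
chosen = select_genes(columns, labels, max_genes=2)
print(chosen)
print(auc(columns["geneA"], labels))
```

In this toy run the near-duplicate gene is skipped in favour of a less redundant one, which is the general idea behind keeping a signature small. In practice the selected genes would be passed to a multiple logistic regression model, and the AUC would be evaluated on held-out validation data rather than on the training samples.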