摘要:Radiomics characterizes tumor phenotypes by extracting large numbers of quantitative features from radiological images. Radiomic features have been shown to provide prognostic value in predicting clinical outcomes in several studies. However, several challenges including feature redundancy, unbalanced data, and small sample sizes have led to relatively low predictive accuracy. In this study, we explore different strategies for overcoming these challenges and improving predictive performance of radiomics-based prognosis for non-small cell lung cancer (NSCLC). CT images of 112 patients (mean age 75 years) with NSCLC who underwent stereotactic body radiotherapy were used to predict recurrence, death, and recurrence-free survival using a comprehensive radiomics analysis. Different feature selection and predictive modeling techniques were used to determine the optimal configuration of prognosis analysis. To address feature redundancy, comprehensive analysis indicated that Random Forest models and Principal Component Analysis were optimum predictive modeling and feature selection methods, respectively, for achieving high prognosis performance. To address unbalanced data, Synthetic Minority Over-sampling technique was found to significantly increase predictive accuracy. A full analysis of variance showed that data endpoints, feature selection techniques, and classifiers were significant factors in affecting predictive accuracy, suggesting that these factors must be investigated when building radiomics-based predictive models for cancer prognosis.