标题:Novel Monte Carlo approach quantifies data assemblage utility and reveals power of integrating molecular and clinical information for cancer prognosis
摘要:Current clinical practice in cancer stratifies patients based on tumour histology to determine prognosis. Molecular profiling has been hailed as the path towards personalised care, but molecular data are still typically analysed independently of known clinical information. Conventional clinical and histopathological data, if used, are added only to improve a molecular prediction, placing a high burden upon molecular data to be informative in isolation. Here, we develop a novel Monte Carlo analysis to evaluate the usefulness of data assemblages. We applied our analysis to varying assemblages of clinical data and molecular data in an ovarian cancer dataset, evaluating their ability to discriminate one-year progression-free survival (PFS) and three-year overall survival (OS). We found that Cox proportional hazard regression models based on both data types together provided greater discriminative ability than either alone. In particular, we show that proteomics data assemblages that alone were uninformative (p = 0.245 for PFS, p = 0.526 for OS) became informative when combined with clinical information (p = 0.022 for PFS, p = 0.048 for OS). Thus, concurrent analysis of clinical and molecular data enables exploitation of prognosis-relevant information that may not be accessible from independent analysis of these data types.