摘要:The identification of the mutational processes operating in tumour cells has implications for cancer diagnosis and therapy. These processes leave mutational patterns on the cancer genomes, which are referred to as mutational signatures. Recently, 81 mutational signatures have been inferred using computational algorithms on sequencing data of 23,879 samples. However, these published signatures may not always offer a comprehensive view on the biological processes underlying tumour types that are not included or underrepresented in the reference studies. To circumvent this problem, we designed CaMuS (Cancer Mutational Signatures) to construct de novo signatures while simultaneously fitting publicly available mutational signatures. Furthermore, we propose to estimate signature similarity by comparing probability distributions using the Hellinger distance. We applied CaMuS to infer signatures of mutational processes in poorly studied cancer types. We used whole genome sequencing data of 56 neuroblastoma, thus providing evidence for the versatility of CaMuS. Using simulated data, we compared the performance of CaMuS to sigfit, a recently developed algorithm with comparable inference functionalities. CaMuS and sigfit reconstructed the simulated datasets with similar accuracy; however two main features may argue for CaMuS over sigfit: (i) superior computational performance and (ii) a reliable parameter selection method to avoid spurious signatures.
其他摘要:Abstract The identification of the mutational processes operating in tumour cells has implications for cancer diagnosis and therapy. These processes leave mutational patterns on the cancer genomes, which are referred to as mutational signatures. Recently, 81 mutational signatures have been inferred using computational algorithms on sequencing data of 23,879 samples. However, these published signatures may not always offer a comprehensive view on the biological processes underlying tumour types that are not included or underrepresented in the reference studies. To circumvent this problem, we designed CaMuS (Cancer Mutational Signatures) to construct de novo signatures while simultaneously fitting publicly available mutational signatures. Furthermore, we propose to estimate signature similarity by comparing probability distributions using the Hellinger distance. We applied CaMuS to infer signatures of mutational processes in poorly studied cancer types. We used whole genome sequencing data of 56 neuroblastoma, thus providing evidence for the versatility of CaMuS . Using simulated data, we compared the performance of CaMuS to sigfit , a recently developed algorithm with comparable inference functionalities. CaMuS and sigfit reconstructed the simulated datasets with similar accuracy; however two main features may argue for CaMuS over sigfit : (i) superior computational performance and (ii) a reliable parameter selection method to avoid spurious signatures.