
Article information

  • Title: A comparative study of multiple instance learning methods for cancer detection using T-cell receptor sequences
  • Authors: Danyi Xiong ; Ze Zhang ; Tao Wang
  • Journal: Computational and Structural Biotechnology Journal
  • Print ISSN: 2001-0370
  • Year of publication: 2021
  • Volume: 19
  • Pages: 3255-3268
  • DOI: 10.1016/j.csbj.2021.05.038
  • Publisher: Computational and Structural Biotechnology Journal
  • Abstract: As a branch of machine learning, multiple instance learning (MIL) learns from a collection of labeled bags, each containing a set of instances. The learning process is weakly supervised due to ambiguous instance labels. Since its emergence, MIL has been applied to solve various problems including content-based image retrieval, object tracking/detection, and computer-aided diagnosis. In biomedical research, the use of MIL has been focused on medical image analysis and molecule activity prediction. We review and apply 16 methods to investigate the applicability of MIL to a novel biomedical application, cancer detection using T-cell receptor (TCR) sequences. This important application can be a viable approach for large-scale cancer screening, as TCRs can be easily profiled from a subject’s peripheral blood. We consider two feasible data-generating mechanisms, and for the purpose of performance evaluation, we simulate data under each mechanism, where we vary potentially important factors to mimic realistic situations. We also apply the methods to sequencing data of ten cancer types from The Cancer Genome Atlas, as an early proof of concept for distinguishing tumor patients from healthy individuals via TCR sequencing of peripheral blood. We find that given an appropriate MIL method is used, satisfactory performance with Area Under the Receiver Operating Characteristic Curve above 80% can be achieved for five in the ten cancers. Based on our numerical results, we make suggestions about selection of a proper method and avoidance of any method with poor performance. We further point out directions of future research as well as identify a pressing need of new MIL methodologies for improved performance (for some cancer types) and more explainable outcomes.
  • Keywords: Binary classification ; Primary instance ; T-cell receptor ; Witness rate ; Weakly supervised learning
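
The bag and instance terminology in the abstract and keywords can be made concrete with a small sketch. It assumes the standard MIL assumption commonly used in the literature (a bag is positive if at least one of its instances is positive) and the usual definition of witness rate as the fraction of positive (primary) instances in a bag. The record above does not say whether the paper's two data-generating mechanisms follow this assumption, and the function names and toy bags below are hypothetical illustrations, not data or code from the paper.

```python
from typing import Dict, List


def bag_label(instance_labels: List[int]) -> int:
    """Standard MIL assumption (assumed here, not taken from the paper):
    a bag is positive (1) if any of its instances is positive."""
    return int(any(instance_labels))


def witness_rate(instance_labels: List[int]) -> float:
    """Fraction of instances in a bag that are positive (primary) instances."""
    if not instance_labels:
        raise ValueError("a bag must contain at least one instance")
    return sum(instance_labels) / len(instance_labels)


# Hypothetical toy bags: each bag stands for one subject, each entry for one
# TCR sequence, with 1 marking a cancer-associated (primary) instance.
# In real weakly supervised data these instance labels are NOT observed;
# only the bag-level label is available to the learner.
bags: Dict[str, List[int]] = {
    "subject_a": [0, 0, 1, 0],
    "subject_b": [0, 0, 0, 0],
    "subject_c": [1, 1, 0, 0, 0],
}

bag_labels = {name: bag_label(labels) for name, labels in bags.items()}
witness_rates = {name: witness_rate(labels) for name, labels in bags.items()}

for name in bags:
    print(name, bag_labels[name], witness_rates[name])
```

A low witness rate, as in `subject_a`, is the hard case: one informative instance has to be picked out among many uninformative ones, while the learner sees only the bag label.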