
Article information

  • Title: HCMB: A stable and efficient algorithm for processing the normalization of highly sparse Hi-C contact data
  • Authors: Honglong Wu ; Xuebin Wang ; Mengtian Chu
  • Journal: Computational and Structural Biotechnology Journal
  • Print ISSN: 2001-0370
  • Publication year: 2021
  • Volume: 19
  • Pages: 2637-2645
  • DOI: 10.1016/j.csbj.2021.04.064
  • Publisher: Computational and Structural Biotechnology Journal
  • Abstract: The high-throughput genome-wide chromosome conformation capture (Hi-C) method has recently become an important tool to study chromosomal interactions where one can extract meaningful biological information including P(s) curve, topologically associated domains, A/B compartments, and other biologically relevant signals. Normalization is a critical pre-processing step of downstream analyses for the elimination of systematic and technical biases from chromatin contact matrices due to different mappability, GC content, and restriction fragment lengths. Especially, the problem of high sparsity puts forward a huge challenge on the correction, indicating the urgent need for a stable and efficient method for Hi-C data normalization. Recently, some matrix balancing methods have been developed to normalize Hi-C data, such as the Knight-Ruiz (KR) algorithm, but it failed to normalize contact matrices with high sparsity. Here, we presented an algorithm, Hi-C Matrix Balancing (HCMB), based on an iterative solution of equations, combining with linear search and projection strategy to normalize the Hi-C original interaction data. Both the simulated and experimental data demonstrated that HCMB is robust and efficient in normalizing Hi-C data and preserving the biologically relevant Hi-C features even facing very high sparsity. HCMB is implemented in Python and is freely accessible to non-commercial users at GitHub: https://github.com/HUST-DataMan/HCMB .
  • Keywords: Hi-C ; Normalization ; Matrix balancing ; Doubly stochastic matrix ; Sparsity
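
For readers unfamiliar with the "matrix balancing" and "doubly stochastic matrix" keywords, the sketch below illustrates the general idea on a toy contact matrix. It is not the HCMB algorithm and not the Knight-Ruiz algorithm; it is a simple damped Sinkhorn-style fixed-point iteration, chosen only because it is short. The function name `balance_symmetric`, its parameters, and the toy matrix are assumptions made for illustration. The input is assumed to be symmetric and non-negative, and empty rows (common in sparse Hi-C data) are simply excluded here.

```python
import math


def balance_symmetric(matrix, tol=1e-10, max_iter=10000):
    """Find x so that diag(x) * matrix * diag(x) has unit row sums.

    Illustrative only: a damped Sinkhorn-style iteration, not HCMB or KR.
    Assumes `matrix` is a symmetric, non-negative list of lists.
    Rows with no contacts get a scaling factor of 0 and stay empty.
    """
    n = len(matrix)
    active = [sum(row) > 0 for row in matrix]
    x = [1.0 if is_active else 0.0 for is_active in active]

    for _ in range(max_iter):
        # ax[i] is the i-th entry of the product matrix @ x
        ax = [sum(matrix[i][j] * x[j] for j in range(n)) for i in range(n)]
        # Row sum of the scaled matrix is x[i] * ax[i]; the target is 1.
        residual = max(
            (abs(x[i] * ax[i] - 1.0) for i in range(n) if active[i]),
            default=0.0,
        )
        if residual < tol:
            break
        # Damped update: its fixed point satisfies x[i] * ax[i] == 1.
        x = [math.sqrt(x[i] / ax[i]) if active[i] else 0.0 for i in range(n)]

    balanced = [[x[i] * matrix[i][j] * x[j] for j in range(n)] for i in range(n)]
    return balanced, x


# Toy symmetric contact matrix; the last bin has no contacts at all.
contacts = [
    [4, 2, 0, 0],
    [2, 1, 3, 0],
    [0, 3, 5, 0],
    [0, 0, 0, 0],
]
balanced, scaling = balance_symmetric(contacts)
row_sums = [sum(row) for row in balanced]
```

After balancing, every non-empty row (and, by symmetry, every column) of `balanced` sums to approximately 1, which is the doubly stochastic property the keywords refer to. Simple iterations like this one can be slow or fail to converge on very sparse matrices, which is the difficulty the abstract says HCMB is designed to address.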