
Article Information

  • Title: Better Greedy Sequence Clustering with Fast Banded Alignment
  • Authors: Brian Brubach; Jay Ghurye; Mihai Pop
  • Journal: LIPIcs : Leibniz International Proceedings in Informatics
  • Electronic ISSN: 1868-8969
  • Year of publication: 2017
  • Volume: 88
  • Pages: 3:1-3:13
  • DOI: 10.4230/LIPIcs.WABI.2017.3
  • Publisher: Schloss Dagstuhl -- Leibniz-Zentrum fuer Informatik
  • Abstract: Comparing a string to a large set of sequences is a key subroutine in greedy heuristics for clustering genomic data. Clustering 16S rRNA gene sequences into operational taxonomic units (OTUs) is a common method used in studying microbial communities. We present a new approach to greedy clustering using a trie-like data structure and Four Russians speedup. We evaluate the running time of our method in terms of the number of comparisons it makes during clustering and show in experimental results that the number of comparisons grows linearly with the size of the dataset as opposed to the quadratic running time of other methods. We compare the clusters output by our method to the popular greedy clustering tool UCLUST. We show that the clusters we generate can be both tighter and larger.
  • Keywords: Sequence Clustering; Metagenomics; String Algorithms
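
To make the abstract's terms concrete, the sketch below shows a plain greedy clustering loop built on a banded edit-distance check. It is an illustrative baseline only, not the authors' method: it uses neither the trie-like data structure nor the Four Russians speedup, and in the worst case it compares each sequence against every existing cluster center, which is the quadratic behaviour the abstract contrasts against. The function names `banded_edit_distance` and `greedy_cluster`, and the use of an absolute edit-distance threshold `k`, are assumptions made for this example.

```python
from typing import List, Optional


def banded_edit_distance(a: str, b: str, k: int) -> Optional[int]:
    """Return the edit distance between a and b if it is at most k, else None.

    Only cells with |i - j| <= k are evaluated, so the inner loop touches
    at most 2k + 1 cells per row.
    """
    n, m = len(a), len(b)
    if abs(n - m) > k:
        return None  # the length difference alone exceeds the threshold
    inf = k + 1  # any value above k is treated as "over the threshold"
    prev = [j if j <= k else inf for j in range(m + 1)]
    for i in range(1, n + 1):
        cur = [inf] * (m + 1)
        if i <= k:
            cur[0] = i
        lo, hi = max(1, i - k), min(m, i + k)
        for j in range(lo, hi + 1):
            cost = 0 if a[i - 1] == b[j - 1] else 1
            cur[j] = min(prev[j - 1] + cost,  # match or substitution
                         prev[j] + 1,         # deletion from a
                         cur[j - 1] + 1,      # insertion into a
                         inf)                 # cap values above k
        prev = cur
    return prev[m] if prev[m] <= k else None


def greedy_cluster(seqs: List[str], k: int) -> List[List[int]]:
    """Assign each sequence to the first center within distance k.

    A sequence that matches no existing center becomes a new center.
    Returns clusters as lists of indices into seqs; the first index of
    each list is that cluster's center.
    """
    centers: List[int] = []
    clusters: List[List[int]] = []
    for idx, seq in enumerate(seqs):
        for c, center_idx in enumerate(centers):
            if banded_edit_distance(seq, seqs[center_idx], k) is not None:
                clusters[c].append(idx)
                break
        else:
            # no center was close enough: start a new cluster
            centers.append(idx)
            clusters.append([idx])
    return clusters


if __name__ == "__main__":
    reads = ["ACGTACGT", "ACGTACGA", "TTTTCCCC", "TTTTCCCG", "ACGAACGT"]
    print(greedy_cluster(reads, k=1))
```

The result depends on input order, since the first sequence seen in a neighbourhood becomes its center. The number of calls to `banded_edit_distance` is the "number of comparisons" that the abstract uses as its cost measure.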