
Article Information

  • Title: An Efficient Compression Algorithm for Forthcoming New Species
  • Authors: Subhankar Roy; Sudip Mondal; Sunirmal Khatua
  • Journal: International Journal of Hybrid Information Technology
  • Print ISSN: 1738-9968
  • Publication year: 2015
  • Volume: 8
  • Issue: 11
  • Pages: 323-332
  • DOI: 10.14257/ijhit.2015.8.11.28
  • Publisher: SERSC
  • Abstract: Genomic repositories hold a steadily growing number of individual and reference sequences, which share long identical and near-identical strings of nucleotides. This paper proposes a lossless DNA data compression technique called Optimized Base Repeat Length DNA Compression (OBRLDNAComp), based on the redundancy of DNA sequences. Compression is necessary to ease storage, reduce retrieval time, and find similarity within and between sequences. OBRLDNAComp searches for long identical and near-identical strings of nucleotides that are overlooked by other DNA-specific compression algorithms. The technique is presented as an optimal way of exploiting the longest possible exact repeats to improve the compression ratio. It scans a sequence from left to right to gather repeat statistics, then applies a substitution technique to compress those repeats. The algorithm is straightforward and does not need any external reference file; it scans only the individual file for compression and decompression. The achieved compression ratio of 1.673 bpb (bits per base) outperforms many non-reference-based compression methods.
  • Keywords: Redundancy; Reference genome; Longest Exact Repeats; Non-repeat; LZ77; Compression Ratio
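
The abstract describes a left-to-right scan that finds exact repeats and substitutes them, without any external reference file. The sketch below illustrates that general idea only; it is not the OBRLDNAComp algorithm, whose details are not given in this record. It is a simple greedy, LZ77-style substitution, and the names `compress`, `decompress`, and `MIN_REPEAT`, as well as the threshold value, are assumptions made for illustration.

```python
from typing import List, Tuple, Union

# A token is either a literal base or a back-reference (offset_back, length).
Token = Union[str, Tuple[int, int]]

MIN_REPEAT = 4  # hypothetical threshold: shorter repeats stay as literals


def compress(seq: str, min_repeat: int = MIN_REPEAT) -> List[Token]:
    """Scan left to right, replacing the longest earlier exact repeat with a pointer."""
    tokens: List[Token] = []
    i = 0
    while i < len(seq):
        best_len, best_start = 0, -1
        # Brute-force search over every earlier start position (quadratic time,
        # acceptable for a sketch but not for real genomes).
        for j in range(i):
            k = 0
            while i + k < len(seq) and seq[j + k] == seq[i + k]:
                k += 1
            if k > best_len:
                best_len, best_start = k, j
        if best_len >= min_repeat:
            tokens.append((i - best_start, best_len))
            i += best_len
        else:
            tokens.append(seq[i])
            i += 1
    return tokens


def decompress(tokens: List[Token]) -> str:
    """Rebuild the sequence; copying base by base handles overlapping repeats."""
    out: List[str] = []
    for token in tokens:
        if isinstance(token, str):
            out.append(token)
        else:
            offset, length = token
            for _ in range(length):
                out.append(out[-offset])
    return "".join(out)


sequence = "ACGTACGTACGTTTGA"
encoded = compress(sequence)
restored = decompress(encoded)
```

Because decompression needs only the token stream itself, the round trip is lossless and self-contained, which matches the reference-free property the abstract claims. A practical encoder would also pack literals and pointers into bits, which this sketch leaves out.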