
Article information

  • Title: Genomic Repeat Detection Using the Knuth-Morris-Pratt Algorithm on R High-Performance-Computing Package
  • Authors: Lala Septem Riza; Achmad Banyu Rachmat; Munir
  • Journal: International Journal of Advances in Soft Computing and Its Applications
  • Print ISSN: 2074-8523
  • Publication year: 2019
  • Volume: 11
  • Issue: 1
  • Pages: 94-111
  • Publisher: International Center for Scientific Research and Studies
  • Abstract: Genomic repeat detection, which finds repeating base pairs in deoxyribonucleic acid (DNA) sequences, can be used to detect genetic disease by analyzing whether the number of repetitions exceeds normal limits. Because this task has a very high computation cost, this research builds a parallel-computing model and its implementation to solve it. This is achieved by modifying and implementing the Knuth-Morris-Pratt (KMP) algorithm on the R high-performance-computing package 'pbdMPI'. The model consists of the following steps: preprocessing and splitting the DNA sequence, running KMP in parallel with 'pbdMPI', combining all indices, and calculating genomic repeats. To validate the model and implementation, 114 experiments involving human DNA sequences are conducted in standalone and parallel-computing scenarios. The results show that the proposed system reduces the computation cost, running more than 100 times faster than standalone computing. Comparisons of the computation cost in terms of the number of batches and the number of cores are presented, along with comparisons against existing research. In summary, the proposed model provides a significant improvement in computational cost.
  • Keywords: DNA; human genome; genomic repeats; string matching; Knuth-Morris-Pratt; high-performance computing.
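
The four steps named in the abstract (split the sequence, run KMP on each piece, combine the indices, calculate repeats) can be illustrated with a minimal Python sketch. This is not the authors' modified KMP or their 'pbdMPI' code, which this record does not reproduce. All function names are hypothetical. The sketch also makes two assumptions of its own: each chunk is extended by `len(pattern) - 1` characters so matches that straddle a chunk boundary are not lost, and a "repeat" is counted as the longest run of back-to-back copies of the pattern. The chunk loop runs sequentially here; in a parallel setting each chunk would go to a separate worker.

```python
def build_failure_table(pattern):
    """For each prefix, length of its longest proper prefix that is also a suffix."""
    table = [0] * len(pattern)
    k = 0
    for i in range(1, len(pattern)):
        while k > 0 and pattern[i] != pattern[k]:
            k = table[k - 1]
        if pattern[i] == pattern[k]:
            k += 1
        table[i] = k
    return table


def kmp_search(text, pattern):
    """Return start indices of all matches, including overlapping ones."""
    if not pattern:
        return []
    table = build_failure_table(pattern)
    hits, k = [], 0
    for i, ch in enumerate(text):
        while k > 0 and ch != pattern[k]:
            k = table[k - 1]
        if ch == pattern[k]:
            k += 1
        if k == len(pattern):
            hits.append(i - k + 1)
            k = table[k - 1]  # keep scanning for further matches
    return hits


def split_with_overlap(sequence, n_chunks, overlap):
    """Yield (offset, chunk); each chunk reaches `overlap` chars into the next one."""
    size = max(1, -(-len(sequence) // n_chunks))  # ceiling division
    for start in range(0, len(sequence), size):
        yield start, sequence[start:start + size + overlap]


def find_all(sequence, pattern, n_chunks=4):
    """Search every chunk independently, then merge hits into global indices."""
    hits = set()
    # Assumption: this loop stands in for the parallel step (one chunk per worker).
    for offset, chunk in split_with_overlap(sequence, n_chunks, len(pattern) - 1):
        hits.update(offset + i for i in kmp_search(chunk, pattern))
    return sorted(hits)


def longest_tandem_run(indices, unit_len):
    """Largest number of back-to-back copies of the pattern among the hits."""
    present = set(indices)
    best = 0
    for i in present:
        if i - unit_len in present:
            continue  # not the first copy of its run
        run = 1
        while i + run * unit_len in present:
            run += 1
        best = max(best, run)
    return best
```

For the toy sequence `"TTCAGCAGCAGTTCAGAA"` and the pattern `"CAG"`, `find_all` returns `[2, 5, 8, 13]` regardless of the chunk count, and `longest_tandem_run([2, 5, 8, 13], 3)` returns `3`, since the copies at positions 2, 5, and 8 are adjacent while the one at 13 stands alone.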