Article information

  • Title: Genomic repeats detection using Boyer-Moore algorithm on Apache Spark Streaming
  • Authors: Lala Septem Riza; Farhan Dhiyaa Pratama; Erna Piantari
  • Journal: TELKOMNIKA (Telecommunication Computing Electronics and Control)
  • Print ISSN: 2302-9293
  • Publication year: 2020
  • Volume: 18
  • Issue: 2
  • Pages: 783-791
  • DOI: 10.12928/telkomnika.v18i2.14883
  • Publisher: Universitas Ahmad Dahlan
  • Abstract: Detecting genomic repeats, i.e., searching a deoxyribonucleic acid (DNA) sequence for repeated base-pair patterns, is a string-processing task that requires a long processing time. This research builds a big-data computational model that searches for patterns in strings by modifying and implementing the Boyer-Moore algorithm on Apache Spark Streaming, applied to human DNA sequences from the Ensembl site. We also run experiments on cloud computing with computer clusters of varying specifications, using datasets of human DNA sequences. The results show that the proposed computational model on Apache Spark Streaming is faster than standalone computing and multicore parallel computing. The main contribution of this research, a computational model that reduces computational cost, has therefore been achieved.
  • Keywords: Apache Spark Streaming; DNA; genomic repeats; human genome; string matching
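
For readers unfamiliar with the string-matching step named in the abstract, the following is a minimal single-machine sketch of Boyer-Moore search restricted to the bad-character rule. It is not the authors' modified algorithm and does not involve Apache Spark Streaming; the function names `bad_character_table` and `boyer_moore_search` and the sample sequence are assumptions made for illustration only.

```python
def bad_character_table(pattern):
    """Map each character to the index of its last occurrence in the pattern."""
    return {ch: i for i, ch in enumerate(pattern)}


def boyer_moore_search(text, pattern):
    """Return the start indices of all occurrences of pattern in text.

    Simplified variant: only the bad-character rule is used
    (the good-suffix rule of full Boyer-Moore is omitted).
    """
    m, n = len(pattern), len(text)
    if m == 0 or m > n:
        return []
    last = bad_character_table(pattern)
    matches = []
    s = 0  # current alignment of the pattern against the text
    while s <= n - m:
        j = m - 1
        # Compare the pattern with the text from right to left.
        while j >= 0 and pattern[j] == text[s + j]:
            j -= 1
        if j < 0:
            matches.append(s)
            s += 1  # conservative shift so overlapping occurrences are kept
        else:
            # Align the mismatched text character with its last occurrence
            # in the pattern; always move forward by at least one position.
            s += max(1, j - last.get(text[s + j], -1))
    return matches


if __name__ == "__main__":
    # Hypothetical DNA fragment, used only to exercise the function.
    print(boyer_moore_search("ACGTACGTTACG", "ACG"))
```

A motif that occurs at more than one position in a sequence is a candidate repeat. In a distributed setting the sequence would have to be split into chunks that overlap by at least the pattern length minus one, so that matches spanning a chunk boundary are not lost; how the paper handles this is not stated in the abstract.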