首页    期刊浏览 2024年11月24日 星期日
登录注册

文章基本信息

  • 标题:APPROXIMATE MULTIPLE STRING MATCHING ALGORITHM
  • 本地全文:下载
  • 作者:J UJWALA REKHA
  • 期刊名称:Journal of Theoretical and Applied Information Technology
  • 印刷版ISSN:1992-8645
  • 电子版ISSN:1817-3195
  • 出版年度:2020
  • 卷号:98
  • 期号:11
  • 页码:1948-1956
  • 出版社:Journal of Theoretical and Applied
  • 摘要:Approximate multiple string matching comprises of finding all appearances of a set of patterns within a larger text with at most k-mismatches or k-differences. It is a process that consumes large amounts of computational resources and has applications in many domains of computer science such as correction of spelling, spam filtering and computational biology to name a few. However, these applications require a fast algorithm for string matching which depends on the construction of an accurate shift table. When a mismatch occurs while searching a pattern, a shift table specifies the maximum number of characters that can be skipped in text while ensuring that a possible match is not overlooked. This study presents a simple and efficient algorithm that is a variant of Tarhio-Ukkonen algorithm generalized to approximate multiple string matching. The algorithm searches for patterns of different lengths with at most k-mismatches by skipping some of the characters, while not missing the occurrences of any pattern. While the space and time complexity for construction of shift table is and , the time complexity of search algorithm is , where is the number of patterns, is the number of characters in the alphabet, is the length of the text string and is the total length of all patterns. To evaluate the efficacy of the algorithm, experiments are conducted on random text and random patterns of different lengths and different values of k.
  • 关键词:Approximate String Matching;Multiple String Matching;Tarhio-Ukkonen Algorithm
国家哲学社会科学文献中心版权所有