Article information

  • Title: Flexible Log File Parsing Using Hidden Markov Models
  • Authors: Nadine Kuhnert; Andreas Maier
  • Journal: Computer Science & Information Technology
  • Electronic ISSN: 2231-5403
  • Year of publication: 2019
  • Volume: 9
  • Issue: 12
  • Pages: 1-12
  • DOI: 10.5121/csit.2019.91201
  • Publisher: Academy & Industry Research Collaboration Center (AIRCC)
  • Abstract: We aim to model the processing of unknown files. As the content of log files often evolves over time, we established a dynamic statistical model that learns and adapts processing and parsing rules. First, similar to Vaarandi [10], we limit the amount of unstructured text by focusing only on the frequent patterns that lead to the desired output table. Second, we transform these frequent patterns, together with the output that constitutes the parsed table, into a Hidden Markov Model (HMM). We use this HMM as a specific yet flexible representation of a pattern for log file processing. When changes in the raw log file distort learned patterns, we aim for the model to adapt automatically so that output quality remains high. After training our model on one system type, we apply the model and the resulting parsing rule to a different system with slightly different log file patterns and achieve an accuracy of over 99%.
  • Keywords: Hidden Markov Models; Parameter Extraction; Parsing; Text Mining; Information Retrieval
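
The abstract describes using an HMM as a flexible representation of a log line pattern. The sketch below is only an illustration of that general idea, not the authors' model: it labels the tokens of a log line with output-table fields using Viterbi decoding. The field names (TIMESTAMP, LEVEL, MESSAGE), the token classes, and all probabilities are hypothetical values chosen by hand for the example; the paper learns its model from data, and its actual states and features are not given in this record.

```python
import math
import re

# Hypothetical hidden states: the table column each token belongs to.
STATES = ["TIMESTAMP", "LEVEL", "MESSAGE"]

def token_class(tok):
    """Map a raw token to a coarse observation symbol (assumed feature set)."""
    if re.fullmatch(r"\d{4}-\d{2}-\d{2}|\d{2}:\d{2}:\d{2}", tok):
        return "time"
    if tok in {"DEBUG", "INFO", "WARN", "ERROR"}:
        return "level"
    return "word"

# Hand-picked illustrative probabilities (not taken from the paper).
START = {"TIMESTAMP": 0.9, "LEVEL": 0.05, "MESSAGE": 0.05}
TRANS = {
    "TIMESTAMP": {"TIMESTAMP": 0.5, "LEVEL": 0.4, "MESSAGE": 0.1},
    "LEVEL": {"TIMESTAMP": 0.05, "LEVEL": 0.05, "MESSAGE": 0.9},
    "MESSAGE": {"TIMESTAMP": 0.05, "LEVEL": 0.05, "MESSAGE": 0.9},
}
EMIT = {
    "TIMESTAMP": {"time": 0.9, "level": 0.05, "word": 0.05},
    "LEVEL": {"time": 0.05, "level": 0.9, "word": 0.05},
    "MESSAGE": {"time": 0.1, "level": 0.1, "word": 0.8},
}

def viterbi(tokens):
    """Return the most likely field label for each token (log-space Viterbi)."""
    obs = [token_class(t) for t in tokens]
    if not obs:
        return []
    # Best log-score of any path ending in each state after the first token.
    score = {s: math.log(START[s]) + math.log(EMIT[s][obs[0]]) for s in STATES}
    back = []  # back[i][s] = best predecessor of state s at position i + 1
    for o in obs[1:]:
        new_score, pointers = {}, {}
        for s in STATES:
            prev = max(STATES, key=lambda p: score[p] + math.log(TRANS[p][s]))
            pointers[s] = prev
            new_score[s] = (score[prev] + math.log(TRANS[prev][s])
                            + math.log(EMIT[s][o]))
        score = new_score
        back.append(pointers)
    # Trace the best path backwards from the best final state.
    path = [max(STATES, key=lambda s: score[s])]
    for pointers in reversed(back):
        path.append(pointers[path[-1]])
    path.reverse()
    return path

if __name__ == "__main__":
    for line in ("2019-12-01 10:15:00 ERROR disk full",
                 "10:15:00 WARN fan speed low"):
        tokens = line.split()
        print(list(zip(tokens, viterbi(tokens))))
```

The two sample lines differ in layout (the second has no date token), yet the same parameters label both, which is the kind of tolerance to small format changes that motivates a probabilistic pattern over a fixed parsing rule.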