
Article Information

  • Title: InvarNet-X: A Black-Box Invariant-Based Approach to Diagnosing Big Data Systems
  • Authors: Pengfei Chen; Yong Qi; Di Hou
  • Journal: IEEE Transactions on Emerging Topics in Computing
  • Print ISSN: 2168-6750
  • Publication year: 2017
  • Volume: 5
  • Issue: 4
  • Pages: 450-465
  • DOI: 10.1109/TETC.2015.2497143
  • Publisher: IEEE Publishing
  • Abstract: As big data systems spread rapidly, performance problems in them have become a common concern. As the first line of defense against these problems, performance diagnosis plays an essential role in big data systems. It is notoriously difficult to conduct performance diagnosis in large distributed systems. Previous work either pinpoints the root causes by instrumenting the applications or runtime systems in a white-box way, which leads to considerable overhead, or merely provides hints about the hidden root causes in a black-box way. Very few works focus on pinpointing the real root causes in a black-box way. To address this problem, this paper proposes a black-box invariant-based diagnosing approach and implements a proof-of-concept system named InvarNet-X. In this paper, performance diagnosis is formalized as a pattern recognition problem, meaning that each performance problem is identified by a specific pattern. The rationale of InvarNet-X is that the unobservable root causes of performance problems always expose themselves through violations of the associations among directly observable performance metrics. Such observable associations, called likely invariants, are calculated with the maximal information criterion, and each performance problem is signified by a sparse distributed representation. A problem signature database is constructed in advance by training on multiple real performance problems. Once a performance anomaly is detected, the diagnosing procedure is triggered. The root cause is pinpointed by retrieving similar signatures from the signature database. Experimental evaluations in a controlled big data system show that InvarNet-X achieves high accuracy in diagnosing some real performance problems reported in software bug repositories, outperforming several state-of-the-art approaches. Moreover, because it is lightweight, InvarNet-X is easy to apply to large-scale big data systems in real time.
  • Keywords: Big data; Hadoop; invariant; maximal information criterion; performance diagnosis
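
The pipeline outlined in the abstract (learn likely invariants among metrics, encode their violations as a signature, retrieve the most similar known signature) can be illustrated with a minimal sketch. This is not the authors' implementation. It assumes Pearson correlation as a simple stand-in for the association measure named in the abstract, a plain binary violation vector in place of the paper's sparse distributed representation, and Jaccard similarity for retrieval. All function names, thresholds, and metric names are hypothetical.

```python
from itertools import combinations
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation (ASSUMPTION: stand-in for the paper's association measure)."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    if vx == 0 or vy == 0:
        return 0.0  # a constant series carries no association
    return cov / sqrt(vx * vy)


def learn_invariants(normal, threshold=0.9):
    """Keep metric pairs that are strongly associated during normal operation."""
    invariants = {}
    for a, b in combinations(sorted(normal), 2):
        score = abs(pearson(normal[a], normal[b]))
        if score >= threshold:
            invariants[(a, b)] = score
    return invariants


def signature(invariants, window, tolerance=0.3):
    """Binary vector with 1 where an invariant's association dropped by more than tolerance."""
    return tuple(
        1 if base - abs(pearson(window[a], window[b])) > tolerance else 0
        for (a, b), base in sorted(invariants.items())
    )


def jaccard(s, t):
    """Jaccard similarity between two binary signatures of equal length."""
    both = sum(1 for x, y in zip(s, t) if x and y)
    either = sum(1 for x, y in zip(s, t) if x or y)
    return both / either if either else 1.0


def diagnose(sig, database):
    """Return the known problem whose stored signature is most similar to sig."""
    return max(database, key=lambda name: jaccard(sig, database[name]))


# Hypothetical metrics sampled during normal operation.
normal = {
    "cpu": [1, 2, 3, 4, 5, 6],
    "disk": [2, 4, 6, 8, 10, 12],
    "net": [3, 5, 7, 9, 11, 13],
}
invariants = learn_invariants(normal)

# Hypothetical signature database built from previously observed problems.
database = {"disk contention": (1, 0, 1), "network congestion": (0, 1, 1)}

# A new anomalous window in which the disk metric no longer tracks the others.
anomaly = dict(normal, disk=[3, 6, 1, 5, 2, 4])
diagnosis = diagnose(signature(invariants, anomaly), database)
```

Signatures are ordered by sorted metric-pair name, so a database built from one set of invariants only stays comparable with signatures computed from that same set.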