Abstract: As Web services are deployed more widely, the dependability and availability of Web service composition have become active research topics. Because unexpected faults in a Web service composition can occur at different levels at runtime, log analysis, a typical data-driven approach to fault diagnosis, is more applicable and scalable across various architectures. Service logs are increasingly represented in XML or JSON, formats that offer good flexibility and interoperability, which makes fault classification of semi-structured logs a challenging problem in this area. However, most existing approaches focus on analyzing log content and ignore structural information, which leads to poor performance. To improve the accuracy of fault classification, this paper exploits the structural similarity of log files and proposes a similarity-based Bayesian learning approach for semi-structured logs. Our solution estimates the degrees of similarity among structural elements from heterogeneous log data, constructs a combined Bayesian network (CBN), uses a similarity-based learning algorithm to compute the probabilities in the CBN, and classifies test log data into the most probable fault categories based on the generated CBN. Experimental results show that our approach outperforms other learning approaches on structural log datasets.
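To make the pipeline summarized above more concrete, the following is a minimal sketch of similarity-weighted Bayesian classification over the structure of JSON logs. It is not the paper's combined Bayesian network: it swaps in a plain naive Bayes model, and the similarity measure (Jaccard overlap of path segments), the function names, and the toy fault labels are all assumptions made for illustration only.

```python
import json
import math
from collections import defaultdict


def element_paths(node, prefix=""):
    """Collect the structural paths (keys only, values ignored) of a parsed JSON log."""
    paths = set()
    if isinstance(node, dict):
        for key, child in node.items():
            path = f"{prefix}/{key}"
            paths.add(path)
            paths |= element_paths(child, path)
    elif isinstance(node, list):
        for child in node:
            paths |= element_paths(child, prefix)
    return paths


def path_similarity(a, b):
    """Assumed similarity measure: Jaccard overlap of the two paths' segments."""
    sa, sb = set(a.strip("/").split("/")), set(b.strip("/").split("/"))
    return len(sa & sb) / len(sa | sb)


def train(labeled_logs):
    """Estimate P(path present | fault class) from similarity-weighted soft counts."""
    by_class = defaultdict(list)
    for label, raw in labeled_logs:
        by_class[label].append(element_paths(json.loads(raw)))
    vocabulary = sorted(set().union(*(p for logs in by_class.values() for p in logs)))
    total = sum(len(logs) for logs in by_class.values())
    model = {}
    for label, logs in by_class.items():
        probs = {}
        for v in vocabulary:
            # Each log contributes its best structural match to v, in [0, 1].
            soft = sum(
                max((path_similarity(v, p) for p in paths), default=0.0)
                for paths in logs
            )
            # Laplace smoothing keeps every probability strictly inside (0, 1).
            probs[v] = (soft + 1.0) / (len(logs) + 2.0)
        model[label] = (math.log(len(logs) / total), probs)
    return model


def classify(model, raw):
    """Return the fault class with the highest log-posterior for one raw JSON log."""
    present = element_paths(json.loads(raw))
    scores = {}
    for label, (log_prior, probs) in model.items():
        score = log_prior
        for v, p in probs.items():
            score += math.log(p if v in present else 1.0 - p)
        scores[label] = score
    return max(scores, key=scores.get)


# Hypothetical training logs with made-up fault labels.
training = [
    ("timeout", '{"request": {"id": 1, "timeout": {"ms": 3000}}}'),
    ("timeout", '{"request": {"id": 2, "timeout": {"limit": 5}}}'),
    ("auth", '{"request": {"id": 3, "auth": {"user": "a"}}}'),
    ("auth", '{"request": {"id": 4, "auth": {"token": "x"}}}'),
]
model = train(training)
predicted = classify(model, '{"request": {"id": 9, "timeout": {"ms": 100}}}')
```

In this sketch a structural element that is merely similar to an observed one still contributes a fractional count, so classes whose logs share related structure with the test log receive higher likelihoods than an exact-match count would give them.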