Abstract: Improvements in sensing, connectivity and computing technologies mean that industrial processes now generate vast amounts of data from a variety of disparate sources. Data may take a number of different forms, from time-domain signals sampled at different rates using various types of sensors through to more disparate sources such as alarm and event logs. New process and condition monitoring techniques need to be developed to tackle the new challenges of big and heterogeneous data. Although a few benchmark studies are publicly available, e.g. the Tennessee Eastman process plant (Ricker, 1995) and a multiphase flow benchmark case for statistical process monitoring (Ruiz-Cárcel et al., 2015), they provide only standard process data. This work presents a benchmark case on an industrial-scale multiphase flow facility. Various operational conditions were tested under normal operating modes as well as with seeded faults. Heterogeneous data was collected from various sources, including process data, alarm data, and high-frequency ultrasonic and pressure data. Two different fault detection algorithms are applied to the data: a multivariate PCA-enhanced Canonical Variate Analysis (CVA) and a probabilistic Bayesian method. This benchmark case study, with data from disparate sources, can be used for algorithm development and validation for fault detection, fault identification, fault classification, fault severity detection, monitoring of fault evolution, and prognostics.