Abstract: Air pollution originating from anthropogenic emissions is an important factor for environmental policy in regulating the sustainable development of enterprises and the environment. However, missing or mislabeled discharge data make it impossible to apply such policy in practice. To address this challenge, we first observe that a factory's energy consumption and its air pollutant emissions are linearly related. Based on this observation, we propose a support vector regression based single-location recovery model that recovers air pollutant emissions from a factory's energy consumption data. To further improve the precision of air pollutant emission estimation, we propose a Gaussian process regression based multi-location recovery model that estimates and recovers missing or mislabeled air pollutant emissions from the surrounding available air quality readings collected by the government's air quality monitoring stations. Moreover, we optimally combine the two approaches to achieve accurate air pollutant emission estimation. To the best of our knowledge, this is the first paper to monitor air pollutant emissions by taking both a factory's energy consumption and the government's air quality readings into account. The model is evaluated on real-world data (10,406,880 entries, including weather, PM2.5, date, etc.) from parts of Shandong Province, China. The dataset covers 33 factories of 5 types, and we use the co-located air quality monitoring stations as ground truth. The results show that the proposed single-location recovery, multi-location recovery, and combined methods achieve mean absolute errors of 8.45, 9.69, and 7.25, respectively. The method predicts with consistent accuracy across the 5 factory types, shows promising potential for application to broader locations and application areas, and outperforms existing spatial interpolation based methods by 43.8%.
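The abstract does not state how the two estimators are combined, so the following is only an illustrative sketch of one common choice: a convex blend whose weight is fitted by least squares on held-out data. The function names (`optimal_weight`, `combine`, `mean_absolute_error`) and the idea of fitting the weight on a validation set are assumptions for illustration, not the authors' procedure.

```python
def optimal_weight(est_a, est_b, truth):
    """Weight w in [0, 1] minimizing the squared error of w*est_a + (1-w)*est_b.

    est_a: hypothetical single-location (energy-based) estimates
    est_b: hypothetical multi-location (air-quality-based) estimates
    truth: reference emission readings for the same time steps
    """
    # Setting the derivative of sum((t - b) - w*(a - b))**2 to zero gives
    # w = sum((t - b)*(a - b)) / sum((a - b)**2).
    num = sum((t - b) * (a - b) for a, b, t in zip(est_a, est_b, truth))
    den = sum((a - b) ** 2 for a, b in zip(est_a, est_b))
    if den == 0:
        return 0.5  # the two estimators agree everywhere; any weight works
    # The objective is a convex quadratic in w, so clipping the
    # unconstrained optimum yields the optimum on [0, 1].
    return min(1.0, max(0.0, num / den))


def combine(est_a, est_b, w):
    """Blend the two estimate series with weight w on est_a."""
    return [w * a + (1 - w) * b for a, b in zip(est_a, est_b)]


def mean_absolute_error(pred, truth):
    """Mean absolute error, the metric reported in the abstract."""
    return sum(abs(p - t) for p, t in zip(pred, truth)) / len(truth)
```

Under this assumption, the weight would be fitted on a validation period and then reused to blend the two recoveries elsewhere. Note that the sketch minimizes squared error, whereas the abstract reports mean absolute error, so a weight chosen this way is not guaranteed to be MAE-optimal.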