Journal: GI_FORUM - Journal for Geographic Information Science
Electronic ISSN: 2308-1708
Publication year: 2018
Volume: 1
Pages: 65-81
DOI: 10.1553/giscience2018_01_s65
Publisher: ÖAW Verlag, Wien
Abstract: Today, we have access to a vast amount of weather, air quality, noise, and radioactivity data collected by individuals around the globe. This volunteered geographic information is often of uncertain and heterogeneous quality, in particular when compared to official in-situ measurements. This limits its application, because rigorous, work-intensive data cleaning has to be performed, which reduces the amount of data and cannot be done in real time. In this paper, we propose and evaluate a method for dynamically learning the quality of individual sensors by optimizing a weighted Gaussian process regression using an evolutionary algorithm. The evaluation was carried out in south-west Germany in August 2016 for temperature data from the Wunderground network and the Deutscher Wetterdienst (DWD), in total 1,561 stations. Using a 10-fold cross-validation scheme based on the DWD ground truth, we show significant improvements for the predicted sensor readings: we obtained a 12.5% improvement in the mean absolute error.
Keywords: crowdsourcing air temperature; data quality assessment; evolutionary learning; Gaussian process regression; volunteered geographic information
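
Illustrative sketch: The abstract describes learning per-sensor quality by optimizing a weighted Gaussian process regression with an evolutionary algorithm. The Python sketch below shows one way this idea could look; it is not the authors' implementation. Everything in it is an assumption made for illustration: the function names (`rbf_kernel`, `gp_predict`, `evolve_noise`), the squared-exponential kernel, the encoding of sensor quality as a per-sensor noise variance on the kernel diagonal, the simple (1+1) evolution strategy, and the use of a fixed set of reference stations in place of the paper's 10-fold cross-validation.

```python
import numpy as np

def rbf_kernel(a, b, length_scale=1.0):
    # Squared-exponential kernel between two sets of 2-D coordinates.
    d2 = ((a[:, None, :] - b[None, :, :]) ** 2).sum(axis=-1)
    return np.exp(-0.5 * d2 / length_scale**2)

def gp_predict(x_train, y_train, noise_var, x_test, length_scale=1.0):
    # GP posterior mean; each sensor gets its own noise variance, so
    # sensors with a large variance are effectively down-weighted.
    k = rbf_kernel(x_train, x_train, length_scale) + np.diag(noise_var)
    k_star = rbf_kernel(x_test, x_train, length_scale)
    mean = y_train.mean()
    return mean + k_star @ np.linalg.solve(k, y_train - mean)

def mae(log_noise, x_crowd, y_crowd, x_ref, y_ref):
    # Mean absolute error at the trusted reference stations.
    # Clipping keeps the kernel matrix well conditioned (assumed bounds).
    noise_var = np.exp(np.clip(log_noise, -4.0, 4.0))
    pred = gp_predict(x_crowd, y_crowd, noise_var, x_ref)
    return np.abs(pred - y_ref).mean()

def evolve_noise(x_crowd, y_crowd, x_ref, y_ref,
                 generations=200, sigma=0.3, seed=0):
    # (1+1) evolution strategy: mutate the log noise variances and keep
    # the candidate only if it does not increase the reference error.
    rng = np.random.default_rng(seed)
    best = np.zeros(len(x_crowd))  # start with unit noise variance everywhere
    best_err = mae(best, x_crowd, y_crowd, x_ref, y_ref)
    initial_err = best_err
    for _ in range(generations):
        cand = best + sigma * rng.normal(size=best.shape)
        err = mae(cand, x_crowd, y_crowd, x_ref, y_ref)
        if err <= best_err:
            best, best_err = cand, err
    return np.exp(np.clip(best, -4.0, 4.0)), initial_err, best_err
```

In this sketch, `x_crowd` and `y_crowd` would hold the coordinates and readings of the crowdsourced sensors, while `x_ref` and `y_ref` would hold those of the trusted stations. The returned noise variances act as a learned quality score: a larger value means the regression trusts that sensor less. Because the error is measured on the same reference stations used for the search, a held-out split would be needed before reading the result as a generalization estimate.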