摘要:Owing to urbanization, the output of construction waste is increasing yearly. Garbage treatment plays a vital role in urban development and construction. The accuracy and integrity of data are important for the implementation of construction waste treatment. Abnormal detection and incomplete filling occur when traditional cleaning algorithms are used. To improve the cleaning of construction waste data, a data cleaning algorithm based on multi-type construction waste was presented in this study. First, a multi-algorithm constraint model was designed to achieve accurate matching between the cleaning content and cleaning model. Thereafter, a natural language data cleaning model was proposed, and the spatial location data were separated from the general data through the content separation mechanism to effectively frame the area to be cleaned. Finally, a time series data cleaning model was constructed. By integrating “check” and “fill”, large-span and large-capacity time series data cleaning was realized. This algorithm was applied to the data collected by the pilot cities, which had precision and recall rates of 93.87% and 97.90% respectively, compared with the traditional algorithm, ultimately exhibiting a certain progressiveness. The algorithm proposed herein can be applied to urban environmental governance. Furthermore, this algorithm can markedly improve the control ability and work efficiency of construction waste treatment, and reduce the restriction of construction waste on the sustainable development of urban environments.
关键词:multi type data; construction waste; data cleaning; multi algorithm constraint model