Abstract: Many studies have shown that search engine data are an effective explanatory variable for analysis and forecasting, including the prediction of tourism volumes. However, both search data and tourism volume series are contaminated by noise. Without noise processing, the predictive power of search engine data may be weak or even absent. The Hilbert-Huang Transform (HHT) is a noise-processing method that can handle non-linear and non-stationary data. This study proposes CLSI-HHT, a model that combines denoising with forecasting based on search engine data. The search queries are first combined into a composite index; HHT is then used to extract the noise from the index and from the tourism volume series. The denoised series are then used to forecast tourism volumes. The results show that the CLSI-HHT model significantly outperforms the baselines, whereas the index model without denoising performs about the same as the time series model. In addition, wavelet transform and filtering are compared with HHT for denoising; the results indicate that HHT achieves a higher signal-to-noise ratio (SNR) and more accurate forecasts. The study concludes that noise processing is necessary for tourism forecasting with search engine data and that HHT can be an effective denoising method.
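
The SNR comparison between denoising methods can be illustrated with a minimal sketch. It assumes one common convention, which the abstract does not confirm: the denoised series is treated as the signal, the residual (original minus denoised) as the noise, and the ratio is reported in decibels. The function name `snr_db` and the sample values are hypothetical.

```python
import math


def snr_db(original, denoised):
    """Signal-to-noise ratio in dB (assumed convention, not taken from the study).

    signal = denoised series
    noise  = original series minus denoised series
    """
    if len(original) != len(denoised):
        raise ValueError("series must have the same length")
    signal_power = sum(d * d for d in denoised)
    noise_power = sum((o - d) ** 2 for o, d in zip(original, denoised))
    if noise_power == 0:
        # Nothing was removed, so the ratio is unbounded.
        return math.inf
    return 10.0 * math.log10(signal_power / noise_power)


# Hypothetical illustration: a noisy series and one candidate denoised version.
noisy = [1.1, 1.9, 3.2, 3.8]
smooth = [1.0, 2.0, 3.0, 4.0]
print(round(snr_db(noisy, smooth), 2))
```

Under this convention, a method that removes a smaller residual relative to the retained signal gets a higher SNR. A higher SNR alone does not show better denoising, since leaving the series untouched gives an unbounded value, which is why the abstract pairs SNR with forecast accuracy.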