
Article information

  • Title: "Google Flu Trends" and emergency department triage data predicted the 2009 pandemic H1N1 waves in Manitoba.
  • Authors: Malik, Mohammad Tufail; Gumel, Abba; Thompson, Laura H.
  • Journal: Canadian Journal of Public Health
  • Print ISSN: 0008-4263
  • Publication year: 2011
  • Issue: July
  • Language: English
  • Publisher: Canadian Public Health Association
  • Abstract: The 2009 H1N1 influenza pandemic provides a unique opportunity to evaluate the utility of syndromic indicators in predicting and monitoring influenza outbreaks. We examined the performance of influenza syndromic indicators, based on Google Flu Trends (GFT) and ED data, with respect to the early detection and monitoring of the H1N1 pandemic waves in Manitoba. Syndromic indicator data were compared to reference data defined as the weekly count of laboratory-confirmed H1N1 cases in Manitoba during the 2009 pandemic.
  • Keywords: Database searching; Databases; Epidemics; Hospital emergency services; Hospitals; Influenza; Internet/Web search services; Online searching; Public health; Swine influenza

"Google Flu Trends" and emergency department triage data predicted the 2009 pandemic H1N1 waves in Manitoba.


Malik, Mohammad Tufail; Gumel, Abba; Thompson, Laura H., et al.


Traditional methods of public health surveillance based on clinical or laboratory case reports are often expensive to implement and maintain; not sensitive enough to detect the early stages of an outbreak; and not suitable to detect outbreaks of novel pathogens. (1) Syndromic surveillance is emerging as a practical alternative approach to monitor influenza disease activity that does not rely on collecting data on diagnosed cases. Syndromic surveillance involves the use of "health-related data that precede diagnosis and signal sufficient probability of a case or an outbreak to warrant further public health response". (2) Clinical syndromes, such as influenza-like illness (ILI), and other proxies, such as the number of emergency department (ED) visits for ILIs and the volume of influenza-related Internet search engine queries, (3-7) are used to monitor disease activity in order to detect new outbreaks and predict the trajectory and impact of ongoing outbreaks.

The 2009 H1N1 influenza pandemic provides a unique opportunity to evaluate the utility of these indicators in predicting and monitoring influenza outbreaks. We examined the performance of influenza syndromic indicators, based on Google Flu Trends (GFT) and ED data, with respect to the early detection and monitoring of the H1N1 pandemic waves in Manitoba. Syndromic indicator data were compared to reference data defined as the weekly count of laboratory-confirmed H1N1 cases in Manitoba during the 2009 pandemic.

METHODS

Emergency department data

Information on ED visits to Winnipeg hospitals was obtained from the database of the Emergency Department Information System (EDIS) for the period from December 2008 to June 2010. EDIS is a real-time ED monitoring system implemented across Winnipeg hospitals that captures information on every ED visit, including patient demographics and 'chief complaints.' We obtained aggregated daily data on the number of ED visits attributed to ILI and the total number of visits (for any reason) to all EDs included in EDIS. A visit was attributed to ILI if the patient's chief complaint was listed as weakness, shortness of breath, cough, headache, fever, sore throat, upper respiratory tract infection, or respiratory arrest. This definition likely overestimates the actual number of ILI visits, as none of these complaints is specific to the ILI syndrome. However, the definition was applied consistently throughout the study period, so time trends may still reflect changes over time in ED use due to ILI. Using ED data, we defined two syndromic indicators: 1) weekly count of all ED visits triaged as ILI (ED ILI volume), and 2) percentage of all ED visits that were triaged as ILI (ED ILI percent).
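The two ED indicators can be derived from aggregated daily counts roughly as follows. This is a minimal Python sketch, not the authors' code (the study does not describe its aggregation procedure): the function name `weekly_indicators`, the tuple layout, and the use of ISO weeks as the weekly bin are assumptions made here for illustration, and the counts in the usage lines are invented.

```python
from collections import defaultdict
from datetime import date, timedelta


def weekly_indicators(daily):
    """Aggregate daily ED counts into the two weekly indicators.

    daily: iterable of (day, ili_visits, total_visits) tuples, where day is
    a datetime.date. Returns a dict keyed by (ISO year, ISO week) whose
    values are (ED ILI volume, ED ILI percent).
    """
    ili = defaultdict(int)
    total = defaultdict(int)
    for day, ili_visits, total_visits in daily:
        iso = day.isocalendar()
        week = (iso[0], iso[1])  # assumption: ISO weeks define the weekly bin
        ili[week] += ili_visits
        total[week] += total_visits
    result = {}
    for week in sorted(ili):
        # Percent of all ED visits in the week that were triaged as ILI.
        percent = 100.0 * ili[week] / total[week] if total[week] else float("nan")
        result[week] = (ili[week], percent)
    return result


# Invented example: one full week (Monday 1 June 2009 onward) plus one extra day.
start = date(2009, 6, 1)
daily = [(start + timedelta(days=i), 20, 100) for i in range(7)]
daily.append((start + timedelta(days=7), 30, 120))
for week, (volume, percent) in weekly_indicators(daily).items():
    print(week, volume, percent)
```

Summing the daily numerators and denominators before dividing weights each day by its visit count, which matches the definition of ED ILI percent as a share of all visits in the week rather than an average of daily percentages.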

Google Flu Trends data

GFT data for Manitoba for the duration of the two waves of the H1N1 pandemic in Manitoba, between April 2009 and January 2010, were downloaded from the GFT website. (8) GFT uses a previously validated algorithm (9) and Google's aggregated search query data to provide jurisdiction-specific estimates of influenza disease activity in near real-time. (8) In Canada, these estimates were calibrated using publicly available data on the number of ILI cases per 100,000 physician visits as provided by the FluWatch sentinel surveillance system, (10) which uses a network of primary care practitioners across Canada to monitor physician visits for ILI. Hence, GFT flu activity estimates are presented as the number of ILI cases per 100,000 physician visits. (8) A team of Google researchers recently reported that GFT data predicted peaks in influenza activity in the United States sooner than traditional flu surveillance systems. (9)

[FIGURE 1 OMITTED]

Virologic data

Weekly numbers of laboratory-confirmed H1N1 influenza cases occurring in Manitoba during the period of January 2009 to January 2010 were obtained from the Flu Surveillance Website of Manitoba Health. (11) In Manitoba, a laboratory-confirmed case of pandemic H1N1 influenza was defined as an individual who tested positive for H1N1 influenza A virus by real-time reverse-transcriptase PCR or viral culture. (11)

Statistical analysis

To assess the strength of the correlation between each of the three indicators (predictor variables) and virologic data (response variable), we fitted the following linear model separately for each indicator:

$$\gamma_t = \beta_0 + \beta_1 x_{t-\tau} \qquad \text{(Model 1)}$$

where $\gamma_t$ is the number of laboratory-confirmed H1N1 cases occurring during week $t$, and $x_{t-\tau}$ is the weekly value for the predictor variable (GFT, ED ILI volume or ED ILI percent) in week $t-\tau$. Correlations with weekly virologic data were calculated for different lag periods ($\tau = 0, 1, 2, 3, 4$ weeks, where $\tau = 0$ indicates no lag). We used Matlab (12) to estimate the linear regression coefficients corresponding to the least-squares solution of the system of equations describing the model. The coefficient of determination, $R^2$ ($0 \le R^2 \le 1$), was used as a measure of the goodness of fit of our models to the observed data, (13) with a larger value of $R^2$ (closer to 1) reflecting a better linear model fit. Because there is only a single explanatory variable in Model 1, $R^2$ is equivalent to the square of the Pearson correlation coefficient measuring the strength of association between the response and the explanatory variables.
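Model 1 and the scan over lag periods can be illustrated with a short Python sketch. The authors used Matlab; the function names `fit_lagged` and `best_lag` and the toy series below are assumptions for illustration only. The sketch uses the closed-form least-squares solution for a single predictor and computes the coefficient of determination as the squared Pearson correlation, which the text notes is equivalent when there is one explanatory variable.

```python
def fit_lagged(y, x, lag):
    """Least-squares fit of y[t] = b0 + b1 * x[t - lag].

    y, x: equal-length sequences of weekly values; lag: whole weeks (>= 0).
    Returns (b0, b1, r2), where r2 is the squared Pearson correlation.
    """
    if len(x) != len(y):
        raise ValueError("x and y must cover the same weeks")
    if lag < 0 or len(y) - lag < 3:
        raise ValueError("lag leaves too few paired weeks")
    xs = x[:len(x) - lag]  # predictor observed in week t - lag
    ys = y[lag:]           # response observed in week t
    n = len(ys)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((a - mean_x) ** 2 for a in xs)
    syy = sum((b - mean_y) ** 2 for b in ys)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(xs, ys))
    if sxx == 0 or syy == 0:
        raise ValueError("a constant series cannot be fitted")
    b1 = sxy / sxx
    b0 = mean_y - b1 * mean_x
    r2 = (sxy * sxy) / (sxx * syy)
    return b0, b1, r2


def best_lag(y, x, lags=range(5)):
    """Return the lag (in weeks) whose fit has the largest r2."""
    return max(lags, key=lambda lag: fit_lagged(y, x, lag)[2])


# Toy series (invented): the response follows the predictor two weeks later.
x = [1, 3, 7, 4, 2, 1, 1, 1]
y = [5, 5, 5, 9, 17, 11, 7, 5]
print(best_lag(y, x), fit_lagged(y, x, 2))
```

Shifting the predictor rather than the response keeps the interpretation used in the paper: a good fit at lag 2 means the indicator in a given week tracks confirmed cases two weeks later.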

[FIGURE 2 OMITTED]

In anticipation of differences in the patterns of ED visits and Internet searches for health information between the two pandemic waves, we fitted Model 1 separately to Wave 1 (April to October, 2009) and Wave 2 data (after October 2009 to January 2010). We also assessed whether GFT data could predict the volume and proportion of ED ILI visits by fitting Model 1 to the data with GFT data as the predictor variable and either of ED ILI volume or ED ILI percent as the response variable.

RESULTS

Figure 1 shows the time series for the weekly counts of laboratory-confirmed H1N1 cases (the epidemic curve) in Manitoba plotted against the three syndromic indicators: GFT data, ED ILI volume, and ED ILI percent. Like many jurisdictions in the northern hemisphere, Manitoba experienced two pandemic waves in 2009. The first wave began in early May 2009, and the second and much larger one in early October 2009. The presence of two waves is evident in the time-series curves for the GFT and ED indicators (Figure 1). All three indicators peaked earlier than the epidemic curve of laboratory-confirmed cases, especially during the second wave where the peak of the epidemic curve lagged behind the peak of the other curves by about 1-2 weeks.

These observations were confirmed by the results of the linear regression analysis (Table 1) based on Model 1. For the GFT data (left panel), $R^2$ (and therefore the correlation coefficient) was highest when the GFT data were shifted ahead by two weeks ($R^2$ = 0.686), indicating that the best-fitting model is the one with about a 2-week lag period. Similarly, the best-fitting models for both ED indicators were observed for a time lag of 1-2 weeks (Table 1), with the ED ILI volume indicator slightly outperforming the ED ILI percent indicator.

Table 1 also shows that all three indicators performed better as predictors of the virologic time trends during the second wave than during the first wave, although the strongest correlations were still present in models with a 1- to 2-week lag. For example, the model based on the GFT indicator with a 2-week time lag had an $R^2$ of 0.733 for Wave 2 data and an $R^2$ of 0.558 for Wave 1 data. For the model based on the ED ILI percent data, the best-fitting model was with a 2-week lag in Wave 1 ($R^2$ = 0.469) and with a 1-week lag in Wave 2 ($R^2$ = 0.605). The better linear fit of the GFT indicator is shown in Figure 2.
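To make the fitted coefficients concrete, the sketch below takes the GFT rows for both waves combined from Table 1, picks the lag with the largest coefficient of determination, and applies Model 1 to a single GFT reading. The coefficients are those reported in Table 1; the GFT reading of 10,000 ILI cases per 100,000 physician visits and the names `GFT_BOTH_WAVES`, `best_lag_from_table` and `predict_cases` are hypothetical and used only for illustration.

```python
# Table 1, GFT, both waves: lag in weeks -> (beta_0, beta_1, R^2)
GFT_BOTH_WAVES = {
    0: (-40.43, 0.019, 0.419),
    1: (-68.35, 0.024, 0.658),
    2: (-71.60, 0.025, 0.686),
    3: (-32.49, 0.018, 0.358),
    4: (15.28, 0.009, 0.099),
}


def best_lag_from_table(table):
    """Return the lag whose reported R^2 is largest."""
    return max(table, key=lambda lag: table[lag][2])


def predict_cases(table, lag, indicator_value):
    """Apply Model 1: cases in week t from the indicator in week t - lag."""
    beta_0, beta_1, _ = table[lag]
    return beta_0 + beta_1 * indicator_value


lag = best_lag_from_table(GFT_BOTH_WAVES)
# Hypothetical GFT reading observed `lag` weeks before the target week.
estimate = predict_cases(GFT_BOTH_WAVES, lag, 10_000)
print(lag, round(estimate, 1))
```

With these inputs the 2-week model is selected and the estimate is about 178 confirmed cases. Because the intercept is negative, the fitted line returns negative counts for GFT readings below roughly 2,900, so the model is only meaningful when indicator levels are well above baseline.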

Figure 1 also shows a strong congruence between the time series of the GFT and both the ED ILI volume and the ED ILI percent indicators. The results of the corresponding linear regression analysis are shown in Table 2. The best-fitting model was the one for GFT (predictor variable) and ED ILI volume (dependent variable) with no time lag ($R^2$ = 0.86).

DISCUSSION

We found that syndromic indicators based on GFT and ED data were strongly correlated with each other and with virologic data during both waves of the 2009 H1N1 pandemic in Manitoba. The epidemic curve based on laboratory-confirmed cases generally lagged behind the time series of these syndromic indicators by 1-2 weeks.

These findings confirm previous reports demonstrating the utility of ED data in the detection of influenza outbreaks in the general population. (14-16) Our results are also consistent with the findings of three recently published studies that evaluated the performance of GFT data in predicting levels of influenza activity during the 2009 pandemic. (5,6,17) However, in all these studies, syndromic indicators were validated against national sentinel ILI surveillance data rather than actual counts of laboratory-confirmed cases.

Our findings are also consistent with studies performed during pre-pandemic influenza seasons which showed that ILI-related Internet search queries were strongly correlated with conventional influenza surveillance indicators. For instance, one study found that Yahoo ILI-related search queries were strongly correlated with the number of culture-positive influenza cases and with mortality from pneumonia and influenza during the 2004-08 flu seasons in the US. (18) Similar results were reported for analyses based on Google search queries, (9) Google Trends, (7) Twitter messages (19) and other social media Web sites. (20)

Compared with conventional methods of influenza surveillance, GFT has several advantages. (21) First, GFT information is free, easily accessible, and is provided using an intuitive simple-to-use interface. Second, the information is updated daily permitting near real-time monitoring of influenza activity which could facilitate early detection of community outbreaks. This is a significant advantage over conventional influenza surveillance systems, where information dissemination is hampered by unavoidable delays in the reporting and collation of data. As virologic data tend to correlate with increased utilization of health care resources (e.g., ED visits, hospitalizations), GFT information might be a useful tool in predicting and planning for increased demands for health care. Third, GFT does not require voluntary reporting by laboratories or health care professionals. GFT information is likely to remain available even in the event of a severe pandemic that overwhelms health care resources. Last, GFT information is currently available for more than 20 countries around the world, permitting easy tracking and comparison of flu activity worldwide.

As with other syndromic indicators, concerns have been raised about the lack of specificity of GFT data; for example, influenza-related news stories may result in spikes in Internet searches. (21) The resulting false alarms could be avoided by simultaneously using multiple syndromic indicators (e.g., ED data, calls to health lines) to assess levels of influenza activity. On the other hand, GFT data may also be of low sensitivity, especially in the detection of small localized outbreaks. In addition, the sensitivity of GFT data may depend on local levels of Internet utilization. For example, Valdivia et al. found weaker correlations between GFT data and sentinel physician surveillance data in countries with less reliance on the Internet as a source of health information. (17)

Our study had several limitations. As only Manitoba data were included, findings may not be applicable to other provinces. The reference standard (number of laboratory-confirmed cases) likely underestimated the incidence of influenza in the population, because the number of detected cases largely reflects the proportion of symptomatic patients who were tested for the infection, and is influenced by accessibility of medical care, physicians' practices, and laboratory testing guidelines. (22) Midway through the second wave, laboratory testing of mild ILI cases was suspended in Manitoba. A significant drop in the number of laboratory-confirmed cases during the latter half of the second wave is obvious in the epidemic curve, and may have affected the strength of the measured association. The EDIS is not available for regions outside Winnipeg and for some smaller hospitals in Winnipeg, and this may have also weakened the strength of association between EDIS indicators and virologic data.

In conclusion, during an influenza season characterized by high levels of disease activity, GFT and ED indicators provided a good indication of weekly counts of laboratory-confirmed influenza cases in Manitoba 1-2 weeks in advance. Syndromic surveillance using GFT and ED represents a timely and cost-effective addition to conventional influenza surveillance, capable of predicting disease incidence and related increases in health care utilization.

Financial support: This work was partially supported by CIHR Pandemic Outbreak Team Leader Grant (PTL-97126).

Disclaimer: The interpretation and conclusions contained herein do not necessarily represent those of the Winnipeg Regional Health Authority.

Conflict of Interest: None to declare.

Received: November 14, 2010

Accepted: March 17, 2011

REFERENCES

(1.) Elliot A. Syndromic surveillance: The next phase of public health monitoring during the H1N1 influenza pandemic. Euro Surveill 2009;14:44.

(2.) Centers for Disease Control and Prevention. Syndromic Surveillance: An Applied Approach to Outbreak Detection. 2008. Available at: http://www.cdc.gov/ncphi/disss/nndss/syndromic.htm (Accessed October 1, 2010).

(3.) Carneiro HA, Mylonakis E. Google Trends: A web-based tool for real-time surveillance of disease outbreaks. Clin Infect Dis 2009;49(10):1557-64.

(4.) Seifter A, Schwarzwalder A, Geis K, Aucott J. The utility of "Google Trends" for epidemiological research: Lyme disease as an example. Geospatial Health 2010;4(2):135-37.

(5.) Wilson N, Mason K, Tobias M, Peacey M, Huang Q, Baker M. Interpreting Google Flu Trends data for pandemic H1N1 influenza: The New Zealand experience. Euro Surveill 2009;14(44).

(6.) Kelly H, Grant K. Interim analysis of pandemic influenza (H1N1) 2009 in Australia: Surveillance trends, age of infection and effectiveness of seasonal vaccination. Euro Surveill 2009;14(31):2.

(7.) Pelat C, Turbelin C, Bar-Hen A, Flahault A, Valleron A-J. More diseases tracked by using Google trends. (Letter to the editor). Emerg Infect Dis 2009;15(8):1327(2).

(8.) Google Flu Trends-Canada. Available at: www.google.org/ (Accessed May 9, 2010).

(9.) Ginsberg J, Mohebbi MH, Patel RS, Brammer L, Smolinski MS, Brilliant L. Detecting influenza epidemics using search engine query data. Nature [10.1038/nature07634]. 2009;457(7232):1012-14.

(10.) Public Health Agency of Canada. Flu Watch for September 6, 2009 to September 12, 2009 (Week 36). 2009. Available at: http://www.phacaspc.gc.ca/fluwatch/09-10/w36_09/index-eng.php (Accessed September 23, 2009).

(11.) Manitoba Health. General Information on Lab-Confirmed Cases of Pandemic H1N1 Influenza. 2010. Available at: http://www.gov.mb.ca/health/publichealth/sri/stats1.html (Accessed July 10, 2010).

(12.) MATLAB version 7.10.0 (R2010a). Natick, MA: The MathWorks Inc., 2010.

(13.) Canavos G. Applied Probability and Statistical Methods. New York, NY: Little, Brown and Company, 1984.

(14.) Shimoni Z, Niven M, Kama N, Dusseldorp N, Froom P. Increased complaints of fever in the emergency room can identify influenza epidemics. Eur J Intern Med 2008;19(7):494-98.

(15.) Irvin CB, Nouhan PP, Rice K. Syndromic analysis of computerized emergency department patients' chief complaints: An opportunity for bioterrorism and influenza surveillance. Ann Emerg Med 2003;41(4):447-52.

(16.) Heffernan R, Mostashari F, Das D, Karpati A, Kulldorff M, Weiss D. Syndromic surveillance in public health practice, New York City. Emerg Infect Dis 2004;10(5):858-64.

(17.) Valdivia A, Lopez-Alcalde J, Vicente M, Pichiule M, Ruiz M, Ordobas M, et al. Monitoring influenza activity in Europe with Google Flu Trends: Comparison with the findings of sentinel physician networks-results for 2009-10. Euro Surveill 2010;15(29).

(18.) Polgreen PM, Chen Y, Pennock DM, Nelson FD. Using internet searches for influenza surveillance. Clin Infect Dis 2008;47(11):1443-48.

(19.) Culotta A. Detecting influenza outbreaks by analyzing Twitter messages. CoRR [serial on the Internet]. 2010. Available at: http://arxiv.org/abs/1007.4748 (Accessed November 15, 2010).

(20.) Corley CD, Cook DJ, Mikler AR, Singh KP. Text and structural data mining of influenza mentions in web and social media. Int J Environ Res Public Health 2010;7(2):596.

(21.) Wilson K, Brownstein JS. Early detection of disease outbreaks using the Internet. CMAJ 2009;180(8):829-31.

(22.) Mahmud SM, Becker M, Keynan Y, Elliott L, Thompson LH, Fowke K, et al. Estimated cumulative incidence of pandemic (H1N1) influenza among pregnant women during the first wave of the 2009 pandemic. CMAJ 2010;182(14):1522-24.

Correspondence: Dr. Salaheddin Mahmud, Department of Community Health Sciences, University of Manitoba, S111--750 Bannatyne Avenue, Winnipeg, MB R3E 0W3, E-mail: salah.mahmud@gmail.com

Mohammad Tufail Malik, PhD, (1) Abba Gumel, PhD, (1) Laura H. Thompson, MSc, (2) Trevor Strome, MSc, (3) Salaheddin M. Mahmud, MD, PhD (2,3)

Author Affiliations

[1.] Department of Mathematics, University of Manitoba, Winnipeg, MB

[2.] Department of Community Health Sciences, University of Manitoba, Winnipeg, MB

[3.] Winnipeg Regional Health Authority, Winnipeg, MB

Table 1. Results From Linear Regression Analysis, Based on Model 1,
for the Weekly Counts of Laboratory-confirmed H1N1 Cases
(Dependent Variable) in Manitoba With Each of Google Flu Trends Data,
ED ILI Volume and ED ILI Percent, by Wave

Lag ($\tau$),                      GFT
weeks            $\beta_0$        $\beta_1$    $R^2$

Both Waves
0                -40.43           0.019        0.419
1                -68.35           0.024        0.658
2                -71.60           0.025        0.686
3                -32.49           0.018        0.358
4                15.28            0.009        0.099
Wave 1
0                -18.08           0.014        0.212
1                -47.74           0.022        0.522
2                -50.23           0.023        0.558
3                -32.38           0.019        0.358
4                -16.57           0.014        0.204
Wave 2
0                -67.35           0.022        0.394
1               -138.79           0.029        0.687
2               -155.42           0.031        0.733
3                -45.57           0.019        0.265
4                100.03           0.003        0.007

Lag ($\tau$),                 ED ILI Volume
weeks            $\beta_0$        $\beta_1$    $R^2$

Both Waves
0               -210.99            0.27        0.311
1               -304.62            0.36        0.547
2               -313.98            0.37        0.563
3               -257.40            0.31        0.411
4                -72.59            0.14        0.078
Wave 1
0                -52.33           0.093        0.067
1               -147.93           0.195        0.295
2               -188.62           0.239        0.443
3               -191.29           0.244        0.445
4               -168.03           0.220        0.360
Wave 2
0               -282.63           0.343        0.350
1               -446.97           0.477        0.627
2               -465.90           0.487        0.595
3               -328.12           0.371        0.332
4                166.77           -0.03        0.003

Lag ($\tau$),                 ED ILI Percent
weeks            $\beta_0$        $\beta_1$    $R^2$

Both Waves
0               -294.35           19.02        0.338
1               -383.09           23.66        0.523
2               -376.52           23.29        0.503
3               -304.65           19.57        0.357
4               -104.50            9.12        0.077
Wave 1
0                -95.33            7.49        0.099
1               -209.75           14.10        0.354
2               -246.84           16.27        0.469
3               -231.77           15.48        0.416
4               -202.47           13.81        0.330
Wave 2
0               -467.77           27.15        0.389
1               -656.76           35.49        0.605
2               -635.57           34.31        0.521
3               -421.33           24.64        0.267
4                203.26           -3.49        0.006

Table 2. Results From Regression Analysis for ED ILI Volume
and ED ILI Percent (Dependent Variables) and
Google Flu Trends Data (Predictor Variable), by Wave

Lag ($\tau$),                 ED ILI Volume
weeks            $\beta_0$        $\beta_1$     $R^2$

Both Waves
0                712.05           0.057          0.86
1                753.21           0.049         0.651
2                817.71           0.038         0.384
3                894.87           0.024         0.159
Wave 1
0                649.32           0.077         0.832
1                709.36           0.063         0.540
2                790.64           0.041         0.233
Wave 2
0                693.5            0.055         0.876
1                761.33           0.047         0.609
2                897.82           0.032         0.268

Lag ($\tau$),                 ED ILI Percent
weeks            $\beta_0$        $\beta_1$     $R^2$

Both Waves
0                14.32            0.0008        0.852
1                14.75            0.0008        0.704
2                15.49            0.0006        0.479
3                 16.5            0.0005        0.249
Wave 1
0                 12.9            0.0012        0.872
1                13.68            0.001         0.615
2                14.81            0.0007        0.305
Wave 2
0                15.45            0.0007        0.825
1                16.04            0.0006        0.634
2                17.48            0.0005        0.341