Article information

Title: Nowcasting using the Chicago Fed National Activity Index.
Authors: Brave, Scott A. ; Butters, R. Andrew
Journal: Economic Perspectives
Print ISSN: 1048-115X
Publication year: 2014
Issue: March
Language: English
Publisher: Federal Reserve Bank of Chicago
Abstract: The Chicago Fed National Activity Index (CFNAI) is a monthly index of U.S. economic activity constructed from 85 data series (or indicators) classified into four groups: production and income; employment, unemployment, and hours; personal consumption and housing; and sales, orders, and inventories. (1) The index is estimated as the first principal component of the 85 data series, (2) and is essentially a weighted average of the indicators, with their individual weights representing the relative degree to which each indicator explains the overall variation among them. The CFNAI is also normalized to reflect deviations around a long-term historical rate of economic growth. As such, a zero value of the index indicates that growth in economic activity is proceeding along its long-term historical path; a negative value indicates below-average growth, while a positive value indicates above-average growth.
Keywords: United States economic conditions

Introduction and summary

The Chicago Fed National Activity Index (CFNAI) is a monthly index of U.S. economic activity constructed from 85 data series (or indicators) classified into four groups: production and income; employment, unemployment, and hours; personal consumption and housing; and sales, orders, and inventories. (1) The index is estimated as the first principal component of the 85 data series, (2) and is essentially a weighted average of the indicators, with their individual weights representing the relative degree to which each indicator explains the overall variation among them. The CFNAI is also normalized to reflect deviations around a long-term historical rate of economic growth. As such, a zero value of the index indicates that growth in economic activity is proceeding along its long-term historical path; a negative value indicates below-average growth, while a positive value indicates above-average growth.

The CFNAI, which premiered in March 2001, was originally designed as a leading indicator for inflation (Stock and Watson, 1999; and Fisher, 2000). However, much of its current value derives from its ability to capture U.S. business cycles (that is, the periodic fluctuations in economic activity around its long-term historical trend) and nowcast (3) U.S. real gross domestic product (GDP) growth (Evans, Liu, and Pham-Kanter, 2002; and Brave and Butters, 2010). The index has been shown to align with the historical timing of U.S. recessions according to the National Bureau of Economic Research (NBER), with close to 95 percent accuracy (Berge and Jorda, 2011). Moreover, the CFNAI has the ability to signal in real time the onset and end of a recession--for instance, the index did this for the 2001 and 2007-09 recessions within one to three months of the NBER dates, with an average lead time of one year prior to the official NBER announcements (Brave and Butters, 2010). The CFNAI's success has been more mixed in terms of predicting real GDP growth, although for the 2004-09 period its performance was on par with the median current quarter forecast from the Federal Reserve Bank of Philadelphia's Survey of Professional Forecasters (Brave and Butters, 2010).

In this article, we consider an alternative version of the CFNAI that is chiefly constructed using the methodology developed in Brauning and Koopman (2014). Their method of collapsed dynamic factor (CDF) analysis (4) offers several advantages over the CFNAI's traditional methodology--principal components analysis (PCA)--when it comes to estimating the index: first, through its incorporation of the dynamic properties of the time series for the index (which PCA cannot exploit) and, second, by further disentangling common drivers of the variation in the underlying data series from idiosyncratic ones. Common drivers of the CFNAI indicators are the types of macroeconomic shocks generally associated with the business cycle, while idiosyncratic drivers include shocks typically isolated to various specific sectors of the U.S. economy, as captured in the four broad categories of the CFNAI indicators. Moreover, the methodology of Brauning and Koopman (2014) makes it possible to directly link the CFNAI to broad economic indicators constructed at a lower frequency, such as quarterly real GDP growth.

[FIGURE 1 OMITTED]

Figure 1 plots the history of the traditional monthly CFNAI and the alternative CFNAI, which is largely based on applying the methodology of Brauning and Koopman (2014), from March 1967 through February 2014. The shaded periods in the figure represent U.S. recessions as identified by the NBER. The alternative CFNAI shown here produces a superior in-sample fit and out-of-sample projections of current quarter real GDP growth while correlating more closely with NBER recessions than the traditional CFNAI. These improvements depend on both the way in which the correlation structure of the 85 underlying data series (at a certain point in time and across time) is taken into account in the estimation procedure and the particular way in which real GDP growth and its dynamic properties are incorporated. We establish this fact by drawing comparisons between the static factor model for the CFNAI and several dynamic factor models that include quarterly real GDP growth but differ from the Brauning and Koopman (2014) methodology in how they include it.

In the process of updating the CFNAI with the Brauning and Koopman (2014) methodology, we also learn something about the nature of the recovery from the most recent recession. Our application of CDF analysis provides us with some additional context for the uneven pattern of growth during the recovery: First, we show that there has been a moderate decline in the trend rate of real GDP growth since December 2007; and second, we note that the share of variation among the underlying data series for the CFNAI (particularly those in the personal consumption and housing category and employment, unemployment, and hours category) due to idiosyncratic drivers increased during the 2007-09 recession and the subsequent recovery. The second finding leads our estimate of the alternative CFNAI during much of the recent recovery to be higher than that of the traditional CFNAI given the weakness of these data series, although the first finding somewhat offsets the impact on U.S. real GDP growth of this upward revision in our assessment of economic activity.

Box 1 Principal components analysis and factor analysis

Here, we explain the mathematics behind using PCA to construct the
traditional CFNAI. Let $x_t$ denote the $N \times 1$ column vector
of $N$ data series at time $t$. The first step is to form the
$N \times T$ matrix of data vectors $X_t$, where each row of this
matrix contains $T$ observations normalized to have a mean of zero
and a standard deviation of one. [1] The eigenvector-eigenvalue
decomposition of the variance-covariance matrix $X_t X_t'/N$ then
produces a set of time-invariant weights referenced by the
$1 \times 85$ row vector $w$ resulting from a transformation of the
eigenvector associated with the largest eigenvalue of this matrix.
These weights are then used to construct a weighted average of the
$x_t$ such that the resulting index is given by $I_t = w x_t$.

The underlying assumption about $X_t$ necessary to produce this
variance decomposition is that it admits a factor model
representation. This means that it can be additively decomposed
into the product of two vectors--an $N \times 1$ column vector of
time-invariant factor loadings $\Gamma$ and a $1 \times T$
time-varying latent factor $F_t$--and a normally distributed mean
zero random variable $\epsilon_t$ with variance-covariance matrix
$\sigma^2 I$:

$$X_t = \Gamma F_t + \epsilon_t.$$

The values of $F_t$ and $\Gamma$ are then jointly estimated by
maximizing $\mathrm{Tr}[\Gamma' X_t X_t' \Gamma]$ subject to a
normalization constraint. [2] This linear optimization problem is
solved by setting the estimator $\hat{\Gamma}'/N$ equal to $w$. [3]
The estimated factor in our example, given by
$\hat{F}_t = \hat{\Gamma}' X_t / N$, corresponds to the CFNAI. As
such, it is the principal component common to all $N = 85$
indicators that explains the largest amount of variation among
them.

[1] Underlying the normalization of the data is the concept of
stationarity, or in this case the first and second moment
restrictions that the mean and variance of each indicator do not
vary over time. Each data series first receives a transformation to
make it stationary prior to its normalization. A list of
transformations can be found at
www.chicagofed.org/digital_assets/publications/cfnai/background/cfnaiindicatorsjist.pdf.

[2] The normalization constraints most commonly used with one
factor are $\Gamma'\Gamma/N = 1$ and $F_t' F_t/T = 1$.

[3] See Stock and Watson (2002) for more details on the connection
between PCA and factor analysis. Also, note that identification
here is achieved only up to the scale provided by the normalization
constraint on the factor loadings. To be able to interpret the
CFNAI, we use the normalization $F_t' F_t/T = 1$.

In the next section, we detail the traditional and several alternative methods of constructing the CFNAI. Then, we further explain the implications of our proposed update to the CFNAI using the Brauning and Koopman (2014) framework. Next, we describe what is driving the differences in timing of U.S. recessions across the Chicago Fed National Activity Index's three-month moving average (CFNAI-MA3) and the alternative CFNAI's three-month moving average (alternative CFNAI-MA3). Afterward, we show how the alternative CFNAI-MA3 can be used to nowcast annualized quarterly real GDP growth more accurately than the traditional CFNAI-MA3 and several other dynamic-factor-based indexes. Finally, we present our conclusions and comment on what they may imply for U.S. economic activity over the near term.

Traditional and alternative methods of constructing the CFNAI

The traditional method of constructing the CFNAI--principal components analysis--proceeds by means of an eigenvector-eigenvalue decomposition of the variance-covariance matrix of the 85 underlying data series. This static factor model description of the data, detailed in box 1, produces a principal component for each eigenvalue of the variance-covariance matrix. The eigenvector associated with the largest eigenvalue of the matrix constitutes the weights applied to the data series that are used to construct the first principal component, or what we call the CFNAI. Stock and Watson (2002) show that this method of constructing the CFNAI is capable of producing a consistent estimate of the underlying static factor model of the data as the number of data series and the number of time periods become large.
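The construction just described can be sketched in a few lines of NumPy. This is a minimal illustration on simulated data, not the production CFNAI code: the function name first_principal_component, the simulated panel, and the sign convention (weights summing to a positive number) are assumptions made for the example. The index is rescaled to unit variance, in the spirit of the normalization discussed in box 1.

```python
import numpy as np

def first_principal_component(X):
    """Return (weights, index) for an N x T panel X (rows are data series)."""
    X = np.asarray(X, dtype=float)
    # Normalize each series to mean zero and standard deviation one.
    Z = (X - X.mean(axis=1, keepdims=True)) / X.std(axis=1, keepdims=True)
    n_series, n_periods = Z.shape
    # Eigenvector-eigenvalue decomposition of the N x N covariance matrix.
    eigvals, eigvecs = np.linalg.eigh(Z @ Z.T / n_periods)
    v = eigvecs[:, -1]              # eigh returns eigenvalues in ascending order
    if v.sum() < 0:                 # eigenvector sign is arbitrary (assumed convention)
        v = -v
    raw = v @ Z
    # Rescale so the index has unit variance; the weights scale with it.
    weights = v / raw.std()
    return weights, weights @ Z

# Simulated panel: one common factor plus idiosyncratic noise (illustrative only).
rng = np.random.default_rng(0)
n_series, n_periods = 20, 300
true_factor = rng.standard_normal(n_periods)
loadings = rng.uniform(0.5, 1.0, size=n_series)
X = np.outer(loadings, true_factor) + 0.5 * rng.standard_normal((n_series, n_periods))

weights, index = first_principal_component(X)
print(round(np.corrcoef(index, true_factor)[0, 1], 3))
```

Because the weights are fixed over time, the index in any month is simply a weighted average of that month's standardized indicators.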

The CFNAI is estimated monthly and released near the middle of each month with a history from March 1967 through the month preceding that of the release date. The lag of approximately one month between the last month of the index and the release date is necessary because of limitations on data availability. In addition to this one-month production lag, many of the data series themselves are only available at a further one- to two-month lag. The limited availability of data at the time of estimation results in a variance-covariance matrix of less than full rank, thereby making principal components analysis infeasible. To circumvent this issue, we forecast each incomplete data series separately up to the month in which the index is produced according to individual autoregressive processes with five lags before the index is constructed.
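A rough sketch of this panel-completion step follows, assuming each incomplete series is extended with an autoregression fitted by ordinary least squares with an intercept. The function name ar_forecast, the use of OLS, and the simulated series are assumptions for illustration; the source states only that individual autoregressive processes with five lags are used.

```python
import numpy as np

def ar_forecast(series, n_lags=5, n_ahead=1):
    """Fit an AR(n_lags) with intercept by OLS and forecast n_ahead steps."""
    y = np.asarray(series, dtype=float)
    # Each regressor row holds the n_lags most recent values, newest first.
    rows = [y[t - n_lags:t][::-1] for t in range(n_lags, len(y))]
    design = np.column_stack([np.ones(len(rows)), np.array(rows)])
    coef, *_ = np.linalg.lstsq(design, y[n_lags:], rcond=None)
    history = list(y)
    forecasts = []
    for _ in range(n_ahead):
        lags = np.array(history[-n_lags:][::-1])
        nxt = coef[0] + coef[1:] @ lags
        forecasts.append(nxt)
        history.append(nxt)          # iterate forward on our own forecasts
    return np.array(forecasts)

# Simulated indicator that is two months behind the index date (illustrative only).
rng = np.random.default_rng(1)
y = np.zeros(2000)
for t in range(1, len(y)):
    y[t] = 0.8 * y[t - 1] + rng.standard_normal()

forecast = ar_forecast(y, n_lags=5, n_ahead=2)
print(forecast)
```

Once every series has been extended to the index date in this way, the panel is balanced and the eigen-decomposition can be applied.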

This method of completing the panel of indicators in order to construct a first principal component is very flexible but not unique or even necessary. Stock and Watson (2002) also demonstrate how to produce an index estimate when data are missing with the same desirable statistical properties as PCA. Their methodology relies explicitly on the incomplete data methods of the expectation-maximization (EM) algorithm made popular by Watson and Engle (1983); and although it does not take into account the serial correlation properties of each data series like the current CFNAI procedure for inferring missing data or the dynamic properties of the index itself, it does account for the data's underlying factor representation to both estimate the index and impute missing values. (5)

The Stock and Watson (2002) EM algorithm uses the information from the complete, or "balanced," panel of indicators to make the best possible prediction of the incomplete, or "unbalanced," panel of indicators. When applied to the construction of the CFNAI, it begins by performing PCA on the subset of data series that are available in all time periods. Missing values are then predicted based upon linear regressions of each of the 85 data series on the first principal component. Finally, PCA is repeated on the balanced panel of data, which combines the observed and predicted data. This process continues until the difference in the sum of the squared prediction errors between iterations reaches a desired level of convergence.
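The iteration described above might look roughly like the following. This is a sketch under simplifying assumptions rather than the Stock and Watson (2002) implementation: the data are taken to be already standardized (so the regressions omit an intercept), a single factor is extracted, and the names em_pca_index and _pc1, the tolerance, and the simulated panel are all hypothetical.

```python
import numpy as np

def _pc1(Z):
    """First principal component of an N x T panel, scaled to unit variance."""
    eigvals, eigvecs = np.linalg.eigh(Z @ Z.T / Z.shape[1])
    f = eigvecs[:, -1] @ Z
    f = f / f.std()
    return f if eigvecs[:, -1].sum() >= 0 else -f   # assumed sign convention

def em_pca_index(X, max_iter=200, tol=1e-8):
    """X: N x T standardized panel with np.nan marking missing values."""
    X = np.asarray(X, dtype=float)
    missing = np.isnan(X)
    obs = ~missing
    balanced = ~missing.any(axis=1)        # series observed in every period
    factor = _pc1(X[balanced])             # start from PCA on the balanced subset
    prev_sse = np.inf
    for _ in range(max_iter):
        # Regress each series on the factor using only its observed values.
        loadings = np.array([
            (X[i, obs[i]] @ factor[obs[i]]) / (factor[obs[i]] @ factor[obs[i]])
            for i in range(X.shape[0])
        ])
        fitted = np.outer(loadings, factor)
        filled = np.where(missing, fitted, X)   # predicted values fill the gaps
        sse = np.sum((X[obs] - fitted[obs]) ** 2)
        if abs(prev_sse - sse) < tol:           # squared prediction errors stable
            break
        prev_sse = sse
        factor = _pc1(filled)                   # repeat PCA on the completed panel
    return factor, filled

# Simulated ragged-edge panel (illustrative only).
rng = np.random.default_rng(2)
n_series, n_periods = 15, 200
true_factor = rng.standard_normal(n_periods)
loadings = rng.uniform(0.5, 1.0, size=n_series)
full = np.outer(loadings, true_factor) + 0.5 * rng.standard_normal((n_series, n_periods))
ragged = full.copy()
ragged[:5, -2:] = np.nan                   # five series lag by two months

factor, filled = em_pca_index(ragged)
print(round(np.corrcoef(factor, true_factor)[0, 1], 3))
```

Observed values are never altered; only the missing cells are replaced by their factor-based predictions at each pass.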

Since the inception of the CFNAI in March 2001, several alternative methods of constructing economic activity indexes that build on PCA have been proposed. Each of these is also an example of factor analytic methods; the differences across the methodologies mainly depend on how variation due to common drivers versus idiosyncratic ones is decomposed across data series and time periods. Here, we briefly describe a few of these alternative methodologies, contrasting them with the traditional methodology used for the CFNAI and each other before we explain the collapsed dynamic factor model of Brauning and Koopman (2014). Boxes 2 and 3 present many of the technical details of these methods that are omitted in the discussion that follows.

Giannone, Reichlin, and Small (2008) extended the static representation of the factor model of Stock and Watson (2002) into a dynamic factor model by incorporating both information from the cross section of data series (at each point in time) and information on data series across time into the process of estimating the index and imputing missing values. Doz, Giannone, and Reichlin (2012) then subsequently provided an alternative EM algorithm with which to estimate the dynamic factor model. In the first step, PCA is performed up to the point in time for which all data series are available. The first principal component from this static factor model is then used to obtain the initial parameter values for the dynamic factor model shown in box 2. The estimation of the CFNAI in this case proceeds by means of the Kalman filter and smoother equations applied to this model. The resulting index is then used to reestimate parameter values, and the process is repeated until convergence of the model's log-likelihood is achieved as shown in box 2.

In the methodology of Doz, Giannone, and Reichlin (2012), data series that are unavailable each month are ignored for inferring the value of the index, but are forecasted using information on the dynamic properties of the index via the Kalman filter. Furthermore, unlike PCA, the idiosyncratic error structure of the data can be relaxed to accommodate unequal variances across unobserved idiosyncratic drivers of the data series (that is, heteroskedasticity). These modifications of the underlying factor model for the CFNAI are not costless, however, as they come at the price of estimating a much larger number of parameters. Besides the obvious increase in complexity and in the time and computing power necessary to estimate and construct the index using dynamic factor methods versus static factor methods, other potential drawbacks from this richer class of factor models include the additional uncertainty introduced when using the index to make out-of-sample projections of inflation and economic growth as in Fisher (2000) and Brave and Butters (2010).

Collapsed dynamic factor analysis as presented in Jungbacker, Koopman, and van der Wel (2011) and explained in box 3 minimizes these costs by transforming the static portion of the dynamic factor model in such a way as to significantly reduce the number of estimated parameters needed to run the Kalman filter and compute projections of auxiliary variables of interest, such as real GDP growth. We follow their methodology in order to incorporate serial correlation within data series (idiosyncratic autocorrelation) in addition to heteroskedasticity into the static factor model. All of the alternative methods for constructing the CFNAI that we have presented thus far, however, remain sensitive to the use of PCA as a starting point for estimation. (6) If the PCA estimate of the index is biased even after accounting for its dynamics, none of the dynamic factor models considered here is guaranteed to produce an unbiased estimate of the index.

Brauning and Koopman (2014) provide an alternative transformation of the static factor model that does not assume that PCA produces an unbiased estimate of the index. In their framework, each data series' factor loadings are fixed at their PCA values. Then, by treating the PCA estimate of the index as a noisy indicator of the "true" measure, their method reoptimizes the index such that it explains the largest percentage of the variation in the PCA estimate of the index that is consistent with both its own estimated dynamics and that of an auxiliary "target" variable. The target variable is described as a more comprehensive but perhaps less frequently available indicator of the information set over which the other data series span. The target variable is also unique in that it alone follows its own estimated dynamics and loads directly on both current and past values of the index, whereas in our application the dynamics of the index do not depend on the target variable. In our application and theirs, real GDP growth is used as the target variable.

BOX 2
Dynamic factor analysis

PCA and traditional factor analysis are static estimation
methods in that they do not incorporate information
from both the cross section of data series and the
information from across time. Dynamic factor analysis
instead makes use of variation in both forms. To do
so, it relies on signal extraction methods, such as the
Kalman filter, applied to a system of equations relating
the latent factor, or the CFNAI as in the example
in box 1 (p. 21), to both the cross section of data series
at each point in time (a "measurement" or "observation"
equation) and the dynamic factors that drive its
fluctuations over time (a "state" equation).

Mathematically, this involves specifying the
following state-space representation:

$$X_t = \Gamma F_t + \epsilon_t,$$

$$F_t = A F_{t-1} + v_t,$$

where $F_t$ is the $1 \times T$ latent factor capturing a
time-varying common source of variation in the
$85 \times T$ matrix of indicators $X_t$; $\Gamma$ is the
$85 \times 1$ loadings onto the factor; and $A$ is the
transition matrix describing the evolution of the latent
factor over time. We write the $A$ parameter of the model
assuming a first-order autoregressive process (AR(1)) for
$F_t$, which can be generalized to an arbitrary number of
lags, $p$. [1] The static factor model representation of
the CFNAI described in box 1 thus forms the measurement
equation of the state-space representation of the dynamic
factor model. Adding dynamics of some finite order
to $F_t$ yields its state equation.

Both $\epsilon_t$ and $v_t$ are assumed to be independently
normally distributed mean zero random variables. We
follow the dynamic factor model of Doz, Giannone,
and Reichlin (2012) and assume that
$\mathrm{Var}(\epsilon_t) = H$ (an $85 \times 85$ diagonal
matrix) and $\mathrm{Var}(v_t) = 1$. [2] The signal
extraction methods of the Kalman filter and smoother
are capable of estimating such a model given the coefficient
matrices of the measurement and state equations,
that is, $\Gamma$ and $A$, and the idiosyncratic error
variances along the diagonal of $H$. All of these parameters
can be consistently estimated from linear regressions
involving $X_t$ and the smoothed or PCA estimate of $F_t$ as
demonstrated in Giannone, Reichlin, and Small (2008).

With the model in state-space form and initial estimates
of the system matrices, the expectation-maximization
(EM) algorithm outlined by Shumway and Stoffer
(1982) can be used to estimate the latent factor $F_t$. At
each iteration of the algorithm, one pass of the data
through the Kalman filter and smoother is made followed
by reestimating the system matrices. [3] The log-likelihood
that results is nondecreasing, and convergence is governed
by its stability. [4] This iterative estimation process
combines the efficiency of likelihood-based estimation
of the latent factor with the consistency of ordinary
least squares (OLS) parameter estimates.

To see the relationship between the static and
dynamic factor models, consider the case where the
transition matrix of the state equation, $A$, is the zero
matrix. That is, nullify the impact of dynamics for the
latent factor. Notice that if we specify that the
variance-covariance matrix of the measurement equation's
error term is proportional to the identity matrix (and based
on the description of PCA discussed in box 1), we end
up with an estimate of the latent factor that is proportional
to the first principal component. For this reason,
our traditional methodology for the CFNAI can be considered
a special case of the dynamic factor model with
a zero transition matrix and a homoskedastic idiosyncratic
error structure (that is, the assumption of equal
variances across unobserved idiosyncratic drivers of
the underlying data series).

[1] We choose $p$ depending on the model being estimated, but all
models use either three or four lags.

[2] The latter restriction acts to set the scale of the dynamic
factor model just as the normalization on the scale of the factor
loadings used in PCA does for the static factor model.

[3] In addition, a small alteration in the least-squares step is
required to account for the fact that the unobserved components of
the model must first be estimated. See Durbin and Koopman (2012)
for further details.

[4] Our stability criterion, where $k$ references the iteration, is
as follows:
$$\left| \frac{\log L(k) - \log L(k-1)}{\left(\log L(k) + \log L(k-1)\right)/2} \right| < 10^{-6}.$$
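To make the state-space representation in box 2 concrete, here is a minimal sketch of a single Kalman filter pass for the one-factor model with an AR(1) state, a diagonal idiosyncratic variance matrix, and a unit state innovation variance. It takes the parameters as given and does not implement the EM loop or the smoother. The function name kalman_filter_one_factor, the parameter values, and the simulated data are assumptions for illustration. Because the state is a scalar and the measurement errors are uncorrelated, the update can be written in precision (information) form, which also makes it easy to skip series that are not yet available.

```python
import numpy as np

def kalman_filter_one_factor(X, gamma, h_diag, a):
    """Filtered mean and variance of F_t for
    X_t = gamma * F_t + eps_t,  F_t = a * F_{t-1} + v_t,  Var(v_t) = 1."""
    n_series, n_periods = X.shape
    f_filt = np.zeros(n_periods)
    p_filt = np.zeros(n_periods)
    # Start from the stationary distribution of the AR(1) state.
    f_prev, p_prev = 0.0, 1.0 / (1.0 - a ** 2)
    for t in range(n_periods):
        # Prediction step.
        f_pred = a * f_prev
        p_pred = a ** 2 * p_prev + 1.0
        # Update step using only the series observed at time t.
        obs = ~np.isnan(X[:, t])
        g, h, x = gamma[obs], h_diag[obs], X[obs, t]
        precision = 1.0 / p_pred + np.sum(g ** 2 / h)
        p_upd = 1.0 / precision
        f_upd = p_upd * (f_pred / p_pred + np.sum(g * x / h))
        f_filt[t], p_filt[t] = f_upd, p_upd
        f_prev, p_prev = f_upd, p_upd
    return f_filt, p_filt

# Simulated data from the same model (illustrative parameter values).
rng = np.random.default_rng(3)
n_series, n_periods, a = 10, 200, 0.7
gamma = rng.uniform(0.5, 1.0, size=n_series)
h_diag = np.full(n_series, 0.5)
F = np.zeros(n_periods)
for t in range(1, n_periods):
    F[t] = a * F[t - 1] + rng.standard_normal()
noise = np.sqrt(h_diag)[:, None] * rng.standard_normal((n_series, n_periods))
X = np.outer(gamma, F) + noise
X[5:, -1] = np.nan        # half of the series not yet released for the last month

f_filt, p_filt = kalman_filter_one_factor(X, gamma, h_diag, a)
print(round(np.corrcoef(f_filt, F)[0, 1], 3), p_filt[-2:])
```

In the final month, where only half of the series are observed, the filtered variance rises, which is how the filter expresses the extra uncertainty from missing data.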

Our use of the Brauning and Koopman (2014) methodology is motivated by several persistent criticisms of perceived bias in the CFNAI. One source of bias in the CFNAI could stem from not including enough variables to span the space of U.S. economic activity--for instance, by omitting international trade or government spending indicators (which inform real GDP growth) as the CFNAI currently does. Another source of bias in the CFNAI could be due to a preponderance of data confined to one or more sectors of the economy--for instance, the potential overweighting of manufacturing data series that dominate the production and income category of indicators and the CFNAI. Yet another source of bias in the CFNAI could result from the omission of any additional common components in the CFNAI data series in the estimation of the dynamic factor model. Here, we consider only the likelihood of the first two potential sources of bias, but note that our results remain sensitive to the possibility of the last one. See box 3 for more details on how the Brauning and Koopman (2014) methodology helps to correct for these potential sources of bias in the CFNAI.

Scott A. Brave is a senior business economist in the Economic Research Department at the Federal Reserve Bank of Chicago, and R. Andrew Butters is a graduate student at the Kellogg School of Management, Northwestern University. The authors thank Alejandro Justiniano, Dick Porter, and an anonymous referee for helpful comments and suggestions.

Economic Perspectives is published by the Economic Research Department of the Federal Reserve Bank of Chicago. The views expressed are the authors' and do not necessarily reflect the views of the Federal Reserve Bank of Chicago or the Federal Reserve System.

Charles L. Evans, President; Daniel G. Sullivan, Executive Vice President and Director of Research; Spencer Krane, Senior Vice President and Economic Advisor; David Marshall, Senior Vice President, financial markets group; Daniel Aaronson, Vice President, microeconomic policy research; Jonas D. M. Fisher, Vice President, macroeconomic policy research; Richard Heckinger, Vice President, markets team; Anna L. Paulson, Vice President, finance team; William A. Testa, Vice President, regional programs; Richard D. Porter, Vice President and Economics Editor; Helen Koshy and Han Y. Choi, Editors; Rita Molloy and Julia Baker, Production Editors; Sheila A. Mangier, Editorial Assistant.

Economic Perspectives articles may be reproduced in whole or in part, provided the articles are not reproduced or distributed for commercial gain and provided the source is appropriately credited. Prior written permission must be obtained for any other reproduction, distribution, republication, or creation of derivative works of Economic Perspectives articles. To request permission, please contact Helen Koshy, senior editor, at 312-322-5830 or email Helen.Koshy@chi.frb.org.

ISSN 0164-0682

BOX 3 Collapsed dynamic factor analysis

Collapsed dynamic factor analysis reflects its name. It begins by
applying a transformation to the measurement equation of the
dynamic factor model's state-space representation in order to
collapse its size to match the typically smaller size of the state
equation. In the context of the Jungbacker, Koopman, and van der
Wel (2011) model applied to the CFNAI, this amounts to
premultiplying the $85 \times T$ matrix of indicators $X_t$ by the
transformation
$A_L = (\Gamma'\Omega^{-1}\Gamma)^{-1}\Gamma'\Omega^{-1}$,
where $\Omega$ is the variance-covariance matrix of $\epsilon_t$:

$$X^L_t = F_t + u_t.$$

The transformed measurement equation, shown here, then relates a
scalar, $X^L_t = A_L X_t$, with a unit factor loading to the latent
factor, $F_t$, and a mean zero normally distributed random scalar
$u_t$ with variance $H = (\Gamma'\Omega^{-1}\Gamma)^{-1}$. The
state equation is unaltered from the example in box 2 (p. 23).

Notice that the transformation here when applied to $X_t$ takes
the familiar form of the generalized least squares (GLS) solution
for the latent factor $F_t$ with $\Omega$ as the weight matrix.
Brauning and Koopman (2014) suggest the use of an alternative
transformation. In their example, $A_L = \hat{\Gamma}'/N$, where
$\hat{\Gamma}$ is the PCA estimate of the factor loadings of the
static factor model, $X^L_t = \hat{\Gamma}' X_t/N$ is the
traditional CFNAI as shown in box 1 (p. 21), and
$u_t = \hat{\Gamma}'\epsilon_t/N$. Furthermore, $H$ is not assumed
to be a predetermined function of the dynamic factor model's factor
loadings and the variance-covariance matrix of its idiosyncratic
errors. It is instead estimated as an additional parameter. The
estimation of $H$ is made possible by the inclusion of an
additional measurement equation containing a "target" variable,
which is real GDP growth in their example and ours.

The random scalar $u_t$ in this context has the interpretation
of a "measurement error" between the PCA estimate of the CFNAI and
its dynamic factor counterpart. Notice that it is also a weighted
average of the idiosyncratic disturbances of the static factor
model, with the weights corresponding to the PCA factor loadings.
The implicit assumption maintained by Brauning and Koopman (2014)
to derive their transformation is that [MATHEMATICAL EXPRESSION NOT
REPRODUCIBLE IN ASCII]. Deviations from this assumption will
produce some approximation error as well in $u_t$.

We modify the Brauning and Koopman (2014) methodology for our
purposes by applying the transformation to an alternative
representation of the indicators,
$\tilde{X}_t = X_t - \rho X_{t-1}$. This modification allows us to
draw finer comparisons with the collapsed dynamic factor model of
Jungbacker, Koopman, and van der Wel (2011), which also allows for
heteroskedasticity and serial correlation in the idiosyncratic
errors $\epsilon_t$ but assumes PCA produces an unbiased
estimate of the latent factor. To make this modification operative,
we first estimate the collapsed dynamic factor model of Jungbacker,
Koopman, and van der Wel (2011) to obtain estimates of the $\rho$
vector and construct $\tilde{X}_t$. [1] We then apply PCA to the
covariance matrix of $\tilde{X}_t$ to obtain
$A_L = \hat{\Gamma}'/N$ and proceed as described earlier in this
box. [2]

Our application of the Brauning and Koopman (2014) methodology also
requires an additional measurement equation relating quarterly real
GDP growth, $Y_t$, to its own lagged value, $Y_{t-3}$;
current and past values of the three-month moving average of the
CFNAI, $F^3_t$; and a time-varying intercept, $T^3_t$. Real GDP
growth in this framework acts to "clean" the PCA estimate of the
three-month moving average of the monthly index by apportioning it
in each quarter into a fragment that is correlated with quarterly
real GDP growth,
$\gamma_0 F^3_t + \sum_{k=1}^{3} \gamma_k F^3_{t-k}$, and a
fragment that is not, $T^3_t + \delta Y_{t-3} + v_t$, based on the
regression coefficients, $\gamma_k$ and $\delta$. The mean zero
normally distributed random variables $u_t$ and $v_t$ are assumed
to be independent. Box 4 provides more details on this particular
nowcasting specification:

$$Y_t = T^3_t + \delta Y_{t-3} + \gamma_0 F^3_t + \sum_{k=1}^{3} \gamma_k F^3_{t-k} + v_t.$$

This errors-in-variables framework is estimated by Brauning and
Koopman (2014) by full maximum likelihood techniques. Here, to
maintain consistency with the way the other dynamic and collapsed
dynamic factor models are estimated, we instead use a variant of
the EM algorithm described in box 2 to estimate the transformed
state-space representation. In order to use the Jungbacker,
Koopman, and van der Wel (2011) estimate of the smoothed latent
factor in the first step, this process requires a restricted
least-squares regression of the PCA estimate of the factor on the
smoothed latent factor and an additional linear regression for the
target variable equation. Additional details on the estimation
process can be found in box 4.

[1] The maintained assumption in this exercise in order for our
estimate of $\rho$ to be unbiased is that
$\mathrm{Cov}(X_{t-1}, \xi_t) = 0$, where $\xi_t$ is a composite
error term comprising $\epsilon_t$ and the contemporaneous
measurement error in the estimated factor.

[2] Because the indicators have already been demeaned and
standardized, they are measured in common units. Thus, obtaining
principal components from the covariance matrix instead of the
correlation matrix of $\tilde{X}_t$ allows us to incorporate
unequal variances across the indicators.
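The two collapsing transformations discussed in box 3 can be compared numerically. The sketch below is illustrative only: the function names gls_collapse and pca_collapse, the simulated loadings, and the diagonal variance matrix are assumptions, and the loadings are scaled so that Gamma'Gamma/N = 1 as in box 1. It checks that both transformations give the collapsed series a unit loading on the factor, and that they coincide when the idiosyncratic errors are homoskedastic.

```python
import numpy as np

def gls_collapse(X, gamma, omega):
    """GLS-type transformation A_L = (G' O^-1 G)^-1 G' O^-1 and its variance H."""
    omega_inv = np.linalg.inv(omega)
    h = 1.0 / (gamma @ omega_inv @ gamma)     # scalar variance of u_t
    a_l = h * (gamma @ omega_inv)             # 1 x N transformation
    return a_l @ X, a_l, h

def pca_collapse(X, gamma_hat):
    """PCA-based transformation A_L = gamma_hat' / N."""
    a_l = gamma_hat / gamma_hat.size
    return a_l @ X, a_l

# Illustrative inputs (not estimated from actual CFNAI data).
rng = np.random.default_rng(4)
n_series, n_periods = 8, 50
gamma = rng.uniform(0.5, 1.5, size=n_series)
gamma *= np.sqrt(n_series / (gamma @ gamma))  # impose Gamma'Gamma / N = 1
omega = np.diag(rng.uniform(0.2, 1.0, size=n_series))
X = rng.standard_normal((n_series, n_periods))

x_gls, a_gls, h = gls_collapse(X, gamma, omega)
x_pca, a_pca = pca_collapse(X, gamma)
_, a_homo, _ = gls_collapse(X, gamma, 0.3 * np.eye(n_series))

print(a_gls @ gamma, a_pca @ gamma)           # both loadings on F_t equal one
print(np.allclose(a_homo, a_pca))             # identical under homoskedasticity
```

With unequal idiosyncratic variances the two sets of weights differ, which is one way to see why the choice of transformation matters for the resulting index.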

Implications of the update for the CFNAI

The estimation of the dynamic factor models for the CFNAI requires only slight modifications to existing methods as shown in box 3. (7) To be able to compare indexes based on alternative methodologies, we include real GDP growth as an additional indicator for each of the dynamic factor alternatives to the CFNAI's traditional methodology discussed previously. This way we can highlight the joint role played by including real GDP growth along with the dynamic factor elements discussed in the previous section. Moreover, to capture the role played by allowing for dynamics in the estimation process instead of relaxing various PCA restrictions, we use a variant of the Giannone, Reichlin, and Small (2008) methodology. In this case, the factor model for the CFNAI is estimated using the Doz, Giannone, and Reichlin (2012) EM algorithm, preserving the PCA restrictions on the idiosyncratic error structure of the data but allowing for a dynamic process of the index to be estimated.

Another benefit of the alternative estimation frameworks presented in the previous section is that, following Brauning and Koopman (2014), it becomes feasible to decompose real GDP growth into its trend and cyclical components. (8) Based on our past work (Brave and Butters, 2010, 2013), this ability to decompose real GDP growth has turned out to be vital to capturing changes in average real GDP growth over long periods. Given this finding, we developed a specification that allows for a time-varying intercept in the equation for quarterly real GDP growth to capture changes over time in its trend rate of growth (see box 4). To capture cyclical movements, we follow Brave and Butters (2010) in using one lag of quarterly real GDP growth in addition to current and past values of the three-month moving average of the monthly index.

Figure 2 plots in separate panels the difference between the CFNAI and each of the four dynamic-factor-based indexes. Simply adding dynamic elements to the static factor model, as well as quarterly real GDP growth in the construction of the index, produces small differences from the traditional CFNAI. This can be seen in the difference between the CFNAI and the first dynamic-factor-based index (labeled DF in panel A of figure 2). Further relaxing the PCA restrictions on the idiosyncratic error structure of the data has a more pronounced effect; this is apparent in the difference between the CFNAI and the dynamic-factor-based index with heteroskedastic errors (labeled DF-HC in panel B of figure 2) and between the CFNAI and the dynamic-factor-based index with heteroskedastic and serially correlated errors (labeled DF-HAC in panel C). However, the difference from the traditional CFNAI is most prominent for the dynamic-factor-based index based on the methodology of Brauning and Koopman (2014) (labeled CDF-HAC in panel D of figure 2)--which we refer to as the alternative CFNAI in figure 1 (p. 20). These results are detailed further in table 1 (p. 28), which displays the cumulative effect on the explained variance of the 85 underlying data series for the traditional CFNAI from altering the various assumptions underlying its static factor model. Each successive addition to the static factor model for the CFNAI--from dynamics and real GDP growth (first row, second column) to heteroskedastic errors (first row, third column) and serially correlated idiosyncratic errors (first row, fourth column)--reduces the explained variance of the 85 underlying data series by the index, but none more so than the Brauning and Koopman (2014) methodology (first row, fifth column), which corrects for bias arising from the use of PCA.

It is important to note here that the reductions in explained variance do not reflect a failure of the dynamic factor model to account for variation among these data series at a certain point in time or within them across time. Instead, such reductions reflect the fact that more of the variation in these series is estimated to arise from idiosyncratic drivers (including potential bias due to the use of PCA) rather than common ones. The alternative CFNAI (first row, fifth column) explains only 20 percent of the total variance of the 85 data series--a reduction of almost one-third of the total variance explained by the traditional CFNAI (first row, first column) and a reduction of almost one-fourth of the total variance explained by its closest counterpart, the DF-HAC index (first row, fourth column).
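
To make the notion of explained variance concrete, the following Python sketch computes the share of the total variance in a panel of indicators that is accounted for by a single index. It is illustrative only: the function names, the use of NumPy, and the least-squares loading of each series on the index are assumptions made here, not the procedure behind table 1.

```python
import numpy as np

def first_principal_component(X):
    """First principal component of the columns of X (hypothetical helper)."""
    Xc = np.asarray(X, dtype=float)
    Xc = Xc - Xc.mean(axis=0)
    # First left singular vector scaled by its singular value
    U, s, _ = np.linalg.svd(Xc, full_matrices=False)
    return U[:, 0] * s[0]

def explained_variance_share(X, index):
    """Share of the total variance of the columns of X explained by one index.

    X     : (T, N) array of indicators (assumed already standardized)
    index : (T,) array holding the estimated index
    """
    Xc = np.asarray(X, dtype=float)
    Xc = Xc - Xc.mean(axis=0)
    f = np.asarray(index, dtype=float)
    f = f - f.mean()
    # Least-squares loading of each indicator on the index
    loadings = Xc.T @ f / (f @ f)
    fitted = np.outer(f, loadings)
    return float(fitted.var(axis=0).sum() / Xc.var(axis=0).sum())
```

When the index is the first principal component of the demeaned panel, this share reduces to the largest eigenvalue of the covariance matrix divided by the sum of all eigenvalues. For any other index, the share can only be lower, which is consistent with the reductions discussed above.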

BOX 4

The model for nowcasting real GDP growth

Our dynamic factor model for the CFNAI is given by the system of
equations in box 2 (p. 23), and is repeated here for convenience:

[X.sub.t] = [GAMMA][F.sub.t] + [[epsilon].sub.t], [F.sub.t] =
A[F.sub.t-1] + [v.sub.t].

To obtain the collapsed dynamic factor models discussed in the
text, we substitute the measurement equations described in box 3
(p. 24) for the first equation here.

The variant of this system based on Giannone, Reichlin, and Small
(2008) parameterizes the variance-covariance matrix of
[[epsilon].sub.t], or H, as [[sigma].sup.2]I, in accordance with
the description of PCA in box 1 (p. 21). The variant based on Doz,
Giannone, and Reichlin (2012) instead assumes a heteroskedastic H
with diagonal elements equal to [[sigma].sup.2.sub.i]. In addition
to allowing for heteroskedasticity, the variant based on
Jungbacker, Koopman, and van der Wel (2011) allows for
idiosyncratic serial correlation up to the first order, where we
choose the degree of serial correlation for each of the 85 data
series prior to estimating according to the Bayesian information
criterion. The CDF variant referenced in the text estimates H as a
scalar parameter according to Brauning and Koopman (2014).

We append to this model a nowcasting equation relating annualized
quarterly real GDP growth, [Y.sub.t], in each time period to its
own lagged value, [Y.sub.t-3]; current and past values of the
three-month moving average of the latent factor, [F.sup.3.sub.t];
and a time-varying intercept, [T.sup.3.sub.t]. We only observe
[Y.sub.t] in the third month of each quarter, so that this equation
strictly relates each quarterly realization of real GDP growth to
only the corresponding end-of-quarter value of [T.sup.3.sub.t]:

[Y.sub.t] = [T.sup.3.sub.t] + [delta][Y.sub.t-3] + [[gamma].sub.0]
[F.sup.3.sub.t] + [3.summation over
(k=1)][[gamma].sub.k][F.sup.3.sub.t-k] + [v.sub.t].

To be able to estimate the model, we must first specify a dynamic
process for the latent time-varying intercept, [T.sup.3.sub.t], by
adding a second state equation to the model. We assume that it is
the quarterly average of a monthly process [T.sub.t] that follows a
random walk with drift parameter [alpha]:

[T.sub.t] = [alpha] + [T.sub.t-1] + [[eta].sub.t].

As such, [T.sup.3.sub.t] represents the time-varying mean of
quarterly real GDP growth conditional on the previous quarter's
value of real GDP growth [Y.sub.t-3] and current and past values of
[F.sup.3.sub.t], and can be interpreted as trend real GDP growth.
Furthermore, we assume that the nowcasting error [v.sub.t] and
[[eta].sub.t] are mean-zero, normally distributed random variables
with variances V and W, respectively, that are uncorrelated with
each other, with [[epsilon].sub.t], and with the innovation
[v.sub.t] in the state equation for [F.sub.t].
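
The nowcasting equation and the random-walk trend above can be
sketched in Python as follows. This is a minimal illustration under
stated assumptions, not the estimation code used for the article:
the function names, the NumPy implementation, and the treatment of
the monthly series as plain arrays are all choices made here. The
sketch evaluates only the conditional mean of [Y.sub.t] for given
parameter values and given paths of the trend and the factor; it
does not estimate anything.

```python
import numpy as np

def three_month_average(x):
    """Trailing three-month moving average; the first two entries are NaN."""
    x = np.asarray(x, dtype=float)
    out = np.full(x.shape, np.nan)
    out[2:] = (x[2:] + x[1:-1] + x[:-2]) / 3.0
    return out

def simulate_trend(t0, alpha, eta):
    """Monthly random walk with drift: T_t = alpha + T_{t-1} + eta_t.

    Returns T_1, ..., T_n for a starting value t0 and shocks eta_1..eta_n.
    """
    return t0 + np.cumsum(alpha + np.asarray(eta, dtype=float))

def nowcast_mean(t, trend_m, factor_m, y_prev_quarter, delta, gammas):
    """Conditional mean of Y_t at monthly position t (a quarter-end month).

    gammas holds (gamma_0, gamma_1, gamma_2, gamma_3), the weights on the
    current value and three monthly lags of the three-month average factor.
    """
    t3 = three_month_average(trend_m)
    f3 = three_month_average(factor_m)
    cyclical = sum(g * f3[t - k] for k, g in enumerate(gammas))
    return float(t3[t] + delta * y_prev_quarter + cyclical)
```

With the factor at zero in every month, the nowcast collapses to the
trend plus [delta] times the previous quarter's growth rate, which
matches the interpretation of [T.sup.3.sub.t] as trend growth
conditional on the other regressors.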

This particular specification of the nowcasting equation expands on
Brave and Butters (2010), in which we used the CFNAI to nowcast
real GDP growth, and is largely taken from the follow-up discussion
in Brave and Butters (2013). It is based on a decomposition of
trend and cyclical components for real GDP growth as in Brauning
and Koopman (2014), where the cyclical dynamics of real GDP growth
are assumed to be captured by lagged real GDP growth and current
and past values of the three-month moving average of the latent
factor. However, it also represents a departure from the
specification considered by Brauning and Koopman (2014), which uses
a different method of aggregation to relate real GDP growth to the
monthly latent factor, includes additional lags of real GDP growth,
and does not include a time-varying intercept.

Our model is estimated using a variant of the EM algorithms
described in boxes 2 and 3. The use of the Kalman filter requires
that we specify initial values for the mean and variance of
[F.sub.t] and [T.sub.t]. Here, we use the exact initialization
procedure described in Harvey (1989) for [F.sub.t], as well as a
diffuse initialization for [T.sub.t] by assuming that its initial
mean value is the estimated constant in the presample regression of
annualized quarterly real GDP growth on a constant in the 20
quarters prior to our sample beginning in March 1967 and setting
its initial variance to the variance of this estimate. From the
in-sample regression of annualized quarterly real GDP growth on a
constant, one lag of itself, and current and previous values of the
CFNAI-MA3, we then obtain our initial parameter estimates of
[delta], [gamma], and V. Initializing [alpha] at zero, we then
obtain our initial estimate of W according to the median unbiased
estimation procedure described in Stock and Watson (1998) applied
to a local-level unobserved components model for quarterly real GDP
growth. At subsequent iterations, [alpha] and W are then
reestimated by restricted linear regression using our estimate of
[T.sub.t].

Overall, correcting for bias is of greater importance than any other modification in explaining the differences between the traditional CFNAI and the alternative CFNAI according to the results in table 1. However, the other modifications to the underlying static factor model for the CFNAI reflected in the table are also worth highlighting. For instance, the various dynamic-factor-based indexes exhibit very different shares of explained variance by the index across the four broad categories of indicators. Allowing for heteroskedastic errors shifts explained variance toward the production and income category of indicators and away from the other three categories (see second through fifth rows, differences between second and third columns). Additionally allowing for idiosyncratic autocorrelation has a similar effect but also boosts the share of explained variance due to the sales, orders, and inventories category (see second through fifth rows, differences between third and fourth columns). The employment, unemployment, and hours category and personal consumption and housing category are particularly affected by the modifications to the idiosyncratic error structure of the static factor model for the CFNAI. (9) For these reasons (and as explained in box 3, p. 24), we deviate slightly from the Brauning and Koopman (2014) model by continuing to account for both heteroskedastic and serially correlated errors in the CDF-HAC index.

[FIGURE 2 OMITTED]

Additionally, allowing (and correcting) for bias from using PCA in the estimation of the CFNAI, as shown in the fifth column of table 1, serves to reapportion the explained variance shares slightly more equally among the remaining three categories at the expense of the production and income category of indicators. In fact, much of the bias in the CFNAI that we estimate can be traced back to the contribution of the production and income category of indicators. Hence, the concern over potential overweighting of manufacturing data sources that dominate this category of indicators appears to be valid. The end result is an index (that is, the CDF-HAC index) that puts slightly more weight on the sales, orders, and inventories and production and income categories than the traditional CFNAI does (despite the correction for bias arising from the latter category) and less weight on the personal consumption and housing and employment, unemployment, and hours categories. Furthermore, we should point out that although the difference in the personal consumption and housing category's share of the fraction of data variance explained by the CFNAI and the alternative CFNAI (the CDF-HAC index) may at first seem small, its economic significance is anything but small given the outsized contribution of this category to the weakness in economic activity during the recent recession and subsequent recovery. In fact, we find that a sizable portion of the upward revision seen in the alternative CFNAI during the recovery can be traced back to this result, as we discuss in the next section.

Capturing business cycles

One of the CFNAI's key successes has been its use as an indicator of U.S. business cycles. Traditionally, the three-month moving average of the index--the CFNAI-MA3--has been used for this purpose on account of the volatile nature of the monthly CFNAI. We follow this precedent here, but note that one clear benefit of the Brauning and Koopman (2014) methodology is that it mitigates to some degree the concern about the volatility of the monthly index. Using the nonparametric method developed in Berge and Jorda (2011), we can quantify the accuracy of both the CFNAI-MA3 and the three-month moving average of the alternative CFNAI in capturing U.S. expansions and recessions as defined by the NBER. (10) The receiver operating characteristic (ROC) analysis framework that Berge and Jorda describe produces a simple summary statistic in this regard (the area under the receiver operating characteristic curve, or AUROC). We briefly explain how we use this method next, while technical details for our ROC analysis can be found in box 5.

Our use of ROC analysis can be explained graphically by a histogram, as shown in figure 3. This figure plots the relative frequency of every observed value of the alternative CFNAI-MA3 separately for values that occur during NBER recessions and expansions. One can see from figure 3 that the alternative CFNAI-MA3 is in fact quite accurate at separating recessions from expansions, as the empirical distributions seldom overlap. The AUROC statistic measures the degree of separation of the two distributions, such that the more accurate an index is at distinguishing expansions from recessions, the higher its AUROC value will be. As noted in box 5, it is even possible to compare two AUROC values to assess whether or not their differences are statistically significant. The CFNAI-MA3 has 94 percent accuracy in describing NBER expansions and recessions, so surpassing its level of accuracy in this respect is a tall task for any of the dynamic-factor-based indexes to achieve; however, one--the three-month moving average of the alternative CFNAI (CDF-HAC index)--does in fact surpass the CFNAI-MA3's accuracy at the 95 percent confidence level, with an AUROC of 98 percent. None of the other three-month moving averages of the dynamic-factor-based indexes we considered was able to produce a statistically significant improvement in AUROC compared with the CFNAI-MA3, as shown in the first column of table 2. Yet this improvement held for the alternative CFNAI regardless of whether or not we smoothed through some of the monthly volatility by applying a three-month moving average transformation prior to calculating the AUROC statistic. Thus, the ability to capture U.S. business cycle properties that the NBER deems most important appears to be a unique feature of the Brauning and Koopman (2014) collapsed dynamic factor methodology.

BOX 5

Receiver operating characteristics analysis

ROC analysis applied to the CFNAI and its dynamic-factor-based
alternatives requires that we categorize each observation of an
index as falling within a recession or expansion. Following the
dating conventions for U.S. business cycles of the NBER, we then
need to construct these conditional probabilities:

TP(c) = P[[I.sub.t] [greater than or equal to] c|[S.sub.t] = 1],

FP(c) = P[[I.sub.t] [greater than or equal to] c|[S.sub.t] = 0],

with [S.sub.t] [member of] {0, 1} indicating recessions and
expansions, respectively. TP(c) is typically referred to as the
true positive rate, and FP(c) is known as the false positive rate
for an index [I.sub.t] and particular observed value c. The relationship
between the two is described by the ROC curve. With the Cartesian
convention, this curve is given by

[{ROC(r), r}.sup.1.sub.r=0],

where ROC(r) = TP(c) and r = FP(c). In what follows, we describe
how to construct the ROC curve.

Using the data in figure 4 (p. 32), we find the fraction of
observations that fall outside and inside the shaded regions
denoting U.S. recessions according to the NBER for the alternative
CFNAI-MA3. These fractions are the unconditional probabilities
associated with expansions and recessions. To obtain conditional
probabilities, we use the following algorithm: For each value
between the minimum and maximum observations of an index, we find
the fraction of observations where that value and all subsequently
higher values fall outside the shaded regions. We then do the same
to find the fraction of observations that fall inside the shaded
regions. These two statistics are equivalent to the true and false
positive rates for separating expansions from recessions defined
previously. By plotting the true and false positive rates against
each other for every historical value of an index, we produce a
nonparametric estimate of its ROC curve.
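
The construction just described can be sketched in Python. The
code below is an illustration under stated assumptions rather than
the implementation behind the article's results: the function
names are hypothetical, expansions are coded as 1 and recessions
as 0 to match the definitions of TP(c) and FP(c) above, and the
area under the curve is approximated with the trapezoid rule.

```python
import numpy as np

def roc_curve(index, expansion):
    """Empirical ROC curve for an index and a 0/1 expansion indicator.

    For each observed value c, TP(c) = P[I >= c | S = 1] and
    FP(c) = P[I >= c | S = 0]. Thresholds run from high to low so
    that the curve starts at (0, 0) and ends at (1, 1).
    """
    values = np.asarray(index, dtype=float)
    state = np.asarray(expansion, dtype=bool)
    thresholds = np.sort(np.unique(values))[::-1]
    tp = np.array([(values[state] >= c).mean() for c in thresholds])
    fp = np.array([(values[~state] >= c).mean() for c in thresholds])
    # Prepend the origin: a threshold above every observation
    thresholds = np.concatenate(([np.inf], thresholds))
    tp = np.concatenate(([0.0], tp))
    fp = np.concatenate(([0.0], fp))
    return thresholds, fp, tp

def auroc(fp, tp):
    """Area under the ROC curve by the trapezoid rule."""
    fp = np.asarray(fp, dtype=float)
    tp = np.asarray(tp, dtype=float)
    return float(np.sum(np.diff(fp) * (tp[1:] + tp[:-1]) / 2.0))
```

An index whose expansion values all lie above its recession values
yields an area of 1, while an index that carries no information
about the state traces out the 45-degree line and yields 0.5.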

Berge and Jorda (2011) show that by calculating the AUROC we arrive
at an estimate of the ability of the index to delineate recessions
from expansions. As the area under the curve approaches 1, the more
predictive it is of U.S. expansions and recessions; its statistical
significance is judged relative to the area under the line from the
origin extending at a 45-degree angle (see the next paragraph for
more details). [1] It is also possible to compare the area under
two different curves to distinguish the statistical significance of
differences in predictive ability. This technique is commonly used
in the medical statistics literature to evaluate the ability of a
procedure or medical test to distinguish patients afflicted with a
condition from those who are not. [2]

[FIGURE B1 OMITTED]

Figure B1 displays the ROC curve for the alternative CFNAI-MA3
along with a line from the origin at a 45-degree angle. By
construction, this line has an AUROC equal to 0.5. The more the ROC
curve deviates in total above this 45-degree line, the higher an
index's AUROC will be. In addition, for an index's AUROC to exceed
0.5, it must have a slope greater than 1 at some point on the ROC
curve such that, for a given increase in the true positive rate,
the associated increase in the false positive rate is smaller. The
red dot on the curve marks the point at which it is no longer
possible to increase the true positive rate without producing more
false positives than are consistent with the observed relative
frequency of expansions and recessions.

Baker and Kramer (2007) show that the point on the curve denoted in
figure B1 by the red dot meets the decision-theoretic criteria for
a threshold rule, c, that equally penalizes type I (false positive)
and type II (false negative) classification errors for recessions
and expansions. To see this, consider the following utility
function:

[MATHEMATICAL EXPRESSION NOT REPRODUCIBLE IN ASCII]

where [U.sub.ij] is the utility (or disutility) associated with the
prediction i given that the true state of the business cycle,
[S.sub.t], is j, with {i, j} [member of] {0, 1}, and where [pi] is
the unconditional probability of an expansion. Utility maximization
implies the following first-order condition determining [c.sup.*]:

[partial derivative]ROC/[partial derivative]r =
(([U.sub.00] - [U.sub.10])/([U.sub.11] - [U.sub.01])) ((1 - [pi])/[pi]).

If we set the leading ratio of utilities to 1, this threshold
equates the slope of the ROC curve to the ratio of the
unconditional probabilities of expansion and recession. In doing
so, one is essentially equally weighting the net benefit of making
a type I error versus a type II error relative to correctly
predicting the true state of the business cycle.
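
A threshold of this kind can be located empirically, as in the
Python sketch below. Two assumptions are made here and are not
taken from the text: first, that expected utility has the form
[U.sub.11] TP(c) [pi] + [U.sub.01] (1 - TP(c)) [pi] + [U.sub.10]
FP(c) (1 - [pi]) + [U.sub.00] (1 - FP(c)) (1 - [pi]), which is one
form consistent with the first-order condition above; second, that
the leading ratio of utilities equals 1. Under both assumptions,
maximizing utility is equivalent to maximizing [pi] TP(c) - (1 -
[pi]) FP(c) over the observed values of the index. The function
name is hypothetical.

```python
import numpy as np

def equal_weight_threshold(index, expansion):
    """Observed value c maximizing pi * TP(c) - (1 - pi) * FP(c).

    Values at or above c are classified as expansions (S = 1).
    Ties are broken in favor of the lowest maximizing value.
    """
    values = np.asarray(index, dtype=float)
    state = np.asarray(expansion, dtype=bool)
    pi = state.mean()  # unconditional probability of an expansion
    best_c, best_u = None, -np.inf
    for c in np.unique(values):  # ascending order
        tp = (values[state] >= c).mean()
        fp = (values[~state] >= c).mean()
        u = pi * tp - (1.0 - pi) * fp
        if u > best_u:
            best_c, best_u = float(c), u
    return best_c
```

Because the search runs over observed values only, the result is
the lowest index value that the rule still classifies as an
expansion, not an interpolated point on a smoothed curve.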

[1] The procedure for evaluating statistical significance is
described in DeLong, DeLong, and Clarke-Pearson (1988).

[2] See Brave and Butters (2012a, 2012b) for further examples using
this approach to predict financial crises.

Figure 4 plots the time series of the CFNAI-MA3 and the alternative CFNAI-MA3 with NBER recession shading. Comparing the two indexes in figure 4, we note that the alternative CFNAI-MA3's improvement in AUROC over the CFNAI-MA3 stems largely from its ability to more accurately capture the timing of U.S. recessions prior to 1990. One way in which to see this is to examine periods where the alternative CFNAI-MA3 falls below the dashed line in figure 4. As described in box 5, the ROC framework can also be used to arrive at a single threshold value distinguishing NBER expansions from recessions that equally weights the desire to correctly capture both. The dashed line in the figure is our estimate of this threshold. At -0.7, this threshold for the alternative CFNAI-MA3 is in line with the value first put forth in Evans, Liu, and Pham-Kanter (2002) that has been used as a threshold for the CFNAI-MA3 and slightly above the value computed by Berge and Jorda (2011) using the same ROC methodology (-0.8). Examining values above and below -0.7 during the NBER recessions for the alternative CFNAI-MA3, we note an improvement in AUROC, largely resulting from the fact that it produces fewer false positives and false negatives during the 1969-70, 1973-75, 1980, and 1981-82 recessions. This is the case even though it is slightly ahead of the CFNAI-MA3 in the timing of several remaining recessions.

We can get a sense of what is driving the differences in timing of U.S. recessions across the two measures by breaking down the difference between the CFNAI-MA3 and the alternative CFNAI-MA3 into contributions from the various assumptions of the dynamic factor models building up to the alternative CFNAI-MA3. In essence, this calculation amounts to redisplaying the information already presented in figure 2 (p. 27) in a slightly different manner in order to highlight the impact of the cumulative changes across three-month moving averages of the CFNAI and DF-HAC and CDF-HAC indexes (that is, CFNAI-MA3, DF-HAC-MA3, and CDF-HAC-MA3) discussed previously around business cycle turning points. We can decompose the difference between the CFNAI and alternative CFNAI into 1) the difference between the CFNAI and DF-HAC index and 2) the difference between the DF-HAC index and the CDF-HAC index. (11) To arrive at the same measure for the difference between the CFNAI-MA3 and alternative CFNAI-MA3, we take three-month moving averages of all three indexes.

[FIGURE 3 OMITTED]

Figure 5 displays our decomposition of the difference between the CFNAI-MA3 and alternative CFNAI-MA3 into two components. The bars in the figure represent contributions to the total difference (represented by the dashed line in the figure) by these two components. The red bars capture the cumulative effect of incorporating dynamics in the static factor model and real GDP growth along with relaxing the PCA assumptions on the idiosyncratic error structure of the static factor model--that is, the CFNAI-MA3 minus the DF-HAC-MA3. We refer to this component as HAC in the figure as it is primarily the latter feature that dominates the contribution to the total difference. The blue bars capture the marginal effect of the measurement error (ME) we estimate in the Brauning and Koopman (2014) model that arises from bias in the use of PCA--that is, the DF-HAC-MA3 minus the CDF-HAC-MA3. While ME is mean zero by construction over the entire sample period, the large magnitude of many of its realizations in this figure suggests that the CFNAI-MA3 is likely biased. One can see from figures 4 and 5 that HAC primarily accounts for the better fit of the alternative CFNAI-MA3 (relative to the CFNAI-MA3) for the 1969-70 and 1973-75 recessions, while ME is mostly responsible for the improvement in fit for the 1980 and 1981-82 recessions.
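
The arithmetic of this decomposition is simple enough to sketch in Python. The snippet below is illustrative only: the function names and the array inputs are assumptions made here, and the three monthly index series would have to be supplied by the reader. It relies on the fact that a moving average is a linear operation, so the HAC and ME components of the smoothed indexes sum exactly to the total difference.

```python
import numpy as np

def ma3(x):
    """Trailing three-month moving average; the first two entries are NaN."""
    x = np.asarray(x, dtype=float)
    out = np.full(x.shape, np.nan)
    out[2:] = (x[2:] + x[1:-1] + x[:-2]) / 3.0
    return out

def decompose_difference(cfnai, df_hac, cdf_hac):
    """Split CFNAI-MA3 minus CDF-HAC-MA3 into HAC and ME components.

    HAC = CFNAI-MA3 minus DF-HAC-MA3
    ME  = DF-HAC-MA3 minus CDF-HAC-MA3
    """
    hac = ma3(cfnai) - ma3(df_hac)
    me = ma3(df_hac) - ma3(cdf_hac)
    total = ma3(cfnai) - ma3(cdf_hac)
    return hac, me, total
```

In any month, a negative total therefore means that the alternative CFNAI-MA3 lies above the CFNAI-MA3, and the signs of the two components show which set of modeling changes is responsible.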

[FIGURE 4 OMITTED]

More recently, measurement error has begun to play more of a secondary role in explaining the discrepancies between the CFNAI-MA3 and alternative CFNAI-MA3. This has primarily to do with the way in which each index accounts for the protracted weakness of personal consumption and housing indicators during the recovery from the 2007-09 recession and, to a lesser extent, the employment-related indicators as well. The alternative CFNAI-MA3 reinterprets what is due to idiosyncratic drivers of variance in the underlying data series versus what is due to common drivers on the basis of how it has related historically to real GDP growth. For the HAC component to be so strongly negative in figure 5 since 2007 implies that the alternative CFNAI-MA3 indicates growth in economic activity due to personal consumption and housing during the recovery has been greater than what has been indicated by the traditional CFNAI-MA3.

However, since 2007, real GDP growth on average has been weak enough in comparison with the alternative CFNAI-MA3 to suggest that the trend rate of real GDP growth has fallen. This result can be seen in figure 6, with our estimate of the trend rate of real GDP growth decreasing from 2.9 percent in the fourth quarter of 2007 to 2.4 percent in the fourth quarter of 2013. As a point of comparison, the Congressional Budget Office's (CBO) estimate of potential real GDP growth is also displayed in figure 6. Our estimate of the decrease in the trend rate of real GDP growth is somewhat smaller than the concurrent change in the CBO's estimate of potential real GDP growth--a decline from 2.4 percent in the fourth quarter of 2007 to 1.7 percent in the fourth quarter of 2013. That said, our estimate of trend GDP growth is also slightly higher than the CBO's estimate of potential real GDP growth for much of the past decade. This is the case even though over the full sample period (1967:Q1-2013:Q4) they exhibit a correlation coefficient of 0.85. The large negative HAC values from personal consumption and housing indicators since 2007 have been often wholly or partially offset by large positive measurement errors from the production and income indicators. This feature of the data has prevented the alternative CFNAI-MA3 from being even further above the CFNAI-MA3 during this period, masking the implied inference for the decline in trend GDP growth.

[FIGURE 5 OMITTED]

With recent HAC values near zero or slightly positive (see figure 5), the alternative CFNAI-MA3 suggests that the pervasiveness of the weakness in the household sector is currently more limited than previously thought according to the CFNAI-MA3. This development is a good omen for the continued expansion of the U.S. economy in 2014 if the ongoing recovery in the housing market persists. Recent negative ME values (see figure 5) also suggest that the impact of the weakness in production and income indicators in early 2014 on the CFNAI-MA3 will likely be transitory. While the alternative CFNAI-MA3 also fell into negative territory in February 2014 (see figure 4), it remained much closer to its historical average than the CFNAI-MA3. Using the nowcasting model for real GDP growth described in box 4 (p. 26) as of March 20, 2014, we estimate that real GDP in the first quarter of 2014 increased at an annual rate of 1.8 percent, which is 0.5 percentage points below our current estimate of 2.3 percent for trend real GDP growth. By comparison, the Blue Chip Economic Indicators consensus forecast for first quarter real GDP growth on March 10, 2014, was 1.9 percent. In the next section, we evaluate the historical performance of our nowcasting model.

[FIGURE 6 OMITTED]

Nowcasting real GDP growth

Real GDP is the broadest measure of U.S. economic activity, but it is produced with a significant lag of up to three months. Therefore, linking its current quarter growth rate to the more readily available monthly CFNAI with a nowcasting model has a natural appeal. Furthermore, nowcasts of real GDP growth produced using the CFNAI and similar indexes have been shown to be quite accurate in several instances. (12) To generate nowcasts, we incorporate annualized quarterly real GDP growth into all of our dynamic factor models. However, we show here that only the alternative CFNAI, which is based on the Brauning and Koopman (2014) methodology, significantly boosts the explanatory power of the dynamic factor model for real GDP growth, further suggesting that the PCA estimate of the index is indeed biased because of a lack of international trade, government spending, and other indicators that inform real GDP growth.

This can be seen in the second column of table 2 (p. 31), which displays in-sample root mean squared error (RMSE) ratios for the nowcasts from the three-month moving averages of the DF, DF-HC, DF-HAC, and CDF-HAC indexes. A number less than 1 indicates an improvement in fit for the quarterly real GDP growth data relative to traditional CFNAI-MA3 nowcasts based on the nowcasting model described in Brave and Butters (2013). While all of the dynamic-factor-based indexes demonstrate an in-sample RMSE ratio of less than 1, the improvement in relative fit for the CDF-HAC index, or alternative CFNAI, at 42 percent dwarfs the others. This is perhaps not surprising given the flexibility of the CDF method in matching the index to observed real GDP growth. A more convincing test of the ability of the Brauning and Koopman (2014) methodology to correct for potential bias due to a lack of international trade, government spending, and other indicators informing real GDP growth would be to test its ability to nowcast when current quarter real GDP growth is not observed.
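
For readers who want to reproduce this kind of comparison, the RMSE ratio can be computed as in the Python sketch below. The function names and inputs are hypothetical, and the snippet makes no claim about how table 2 was produced; it only shows the definition of the statistic, with the benchmark nowcasts in the denominator.

```python
import numpy as np

def rmse(actual, predicted):
    """Root mean squared error between realized and predicted values."""
    a = np.asarray(actual, dtype=float)
    p = np.asarray(predicted, dtype=float)
    return float(np.sqrt(np.mean((a - p) ** 2)))

def rmse_ratio(actual, candidate, benchmark):
    """RMSE of the candidate nowcasts relative to the benchmark nowcasts.

    A value below 1 means the candidate fits the realized data better.
    """
    return rmse(actual, candidate) / rmse(actual, benchmark)
```

If the improvement in fit is read as one minus this ratio, which is an assumption about how the percentages in the text are defined, then a 42 percent improvement would correspond to a ratio of about 0.58.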

As it turns out, the Brauning and Koopman (2014) methodology is also important for improving the out-of-sample accuracy of our nowcasts, though its relative improvement is not much larger than that achieved with the methodology for the DF-HAC index. To arrive at this conclusion, we estimated three-month moving averages of all four dynamic-factor-based indexes using a real-time archive of the CFNAI data series covering the period December 2003 through April 2013 and the available "vintage" of real GDP growth in those months from the Federal Reserve Bank of Philadelphia's Real-Time Data Set for Macroeconomists. (13) We then compared our nowcasts made in the months of each vintage of real GDP growth to the subsequent real-time GDP release being forecasted to compute out-of-sample RMSE ratios similar to the in-sample fits discussed before. (14) As the basis for comparison in this exercise, we used similarly constructed RMSE values based on the within-quarter nowcasting model described in Brave and Butters (2010). The out-of-sample RMSE ratios are shown in the third column of table 2 (p. 31). While all of the ratios are again less than 1, the three-month moving average of the CDF-HAC index (alternative CFNAI) is still the best model in this real-time setting, with a 19 percent improvement in forecast accuracy compared with the Brave and Butters (2010) nowcast.

The differential accuracy of the CDF-HAC-MA3 nowcasts in our real-time out-of-sample nowcasting exercise is not as large as what we find based on in-sample evidence. This result suggests to us that the advantage provided by allowing for measurement error in the CFNAI-MA3 in nowcasting real GDP growth is somewhat limited given our current nowcasting framework. In fact, when we correlate the forecast errors from our real-time exercise with the real-time contribution of net exports and government spending to real GDP growth, we obtain a correlation coefficient of 0.4. In many ways, however, we are not making full use of the flexibility provided by the Brauning and Koopman (2014) methodology. In future research, we plan to explore ways in which to improve on our results--by incorporating additional factors, by adding international trade and government spending indicators to the current list of 85, or by employing estimation methods that allow for more informed dynamics and/or parameter shrinkage.

Conclusion

By building on the existing framework of the CFNAI with Brauning and Koopman's (2014) method of collapsed dynamic factor analysis, we are able to readily extend and improve our existing methodology. Given the resulting alternative CFNAI's superior past performance in predicting current quarter U.S. real GDP growth and very high correlation with NBER recessions, it may very well be a better method to both nowcast real GDP growth and assess the state of U.S. business cycles than the current CFNAI. Brauning and Koopman's methodology also allows us to address several of the persistent criticisms of the CFNAI, including the problem of overweighting certain sectors of the U.S. economy and the important omissions of certain data series (for example, those concerning international trade and government spending) in nowcasting real GDP growth.

Another benefit of Brauning and Koopman's (2014) methodology is that it makes it possible to produce both current quarter predictions of real GDP growth and an estimate of the trend rate of real GDP growth with each new index release. As of March 20, 2014, we estimate that real GDP in the first quarter of 2014 increased at an annual rate of 1.8 percent, which is 0.5 percentage points below our current estimate of 2.3 percent for trend real GDP growth. While we are still in the process of investigating the best nowcasting model with which to achieve both of these goals, our work so far suggests that this is a promising direction for future research with the CFNAI. Our analysis here also has implications for the current interpretation of the index. While the alternative CFNAI fell into negative territory in early 2014, it suggests that the pervasiveness of the weakness in the household sector (as well as its drag on U.S. economic activity) is more limited than previously thought according to the traditional CFNAI and that the impact of the recent weakness in the production and income indicators on the index is likely to be transitory.

REFERENCES

Baker, S. G., and B. S. Kramer, 2007, "Peirce, Youden, and receiver operating characteristic curves," American Statistician, Vol. 61, No. 4, November, pp. 343-346.

Berge, T. J., and O. Jorda, 2011, "Evaluating the classification of economic activity into expansions and recessions," American Economic Journal: Macroeconomics, Vol. 3, No. 2, April, pp. 246-277.

Brauning, F., and S. J. Koopman, 2014, "Forecasting macroeconomic variables using collapsed dynamic factor analysis," International Journal of Forecasting, forthcoming.

Brave, S. A., 2008, "Economic trends and the Chicago Fed National Activity Index," Chicago Fed Letter, Federal Reserve Bank of Chicago, No. 250, May.

Brave, S. A., and R. A. Butters, 2013, "Estimating the trend rate of economic growth using the CFNAI," Chicago Fed Letter, Federal Reserve Bank of Chicago, No. 311, June.

--, 2012a, "Detecting early signs of financial instability," Chicago Fed Letter, Federal Reserve Bank of Chicago, No. 305, December.

--, 2012b, "Diagnosing the financial system: Financial conditions and financial stress," International Journal of Central Banking, Vol. 8, No. 2, June, pp. 191-239.

--, 2010, "Chicago Fed National Activity Index turns ten--Analyzing its first decade of performance," Chicago Fed Letter, Federal Reserve Bank of Chicago, No. 273, April.

DeLong, E. R., D. M. DeLong, and D. L. Clarke-Pearson, 1988, "Comparing the areas under two or more correlated receiver operating characteristic curves: A nonparametric approach," Biometrics, Vol. 44, No. 3, September, pp. 837-845.

Doz, C., D. Giannone, and L. Reichlin, 2012, "A quasi-maximum likelihood approach for large, approximate dynamic factor models," Review of Economics and Statistics, Vol. 94, No. 4, November, pp. 1014-1024.

Durbin, J., and S. J. Koopman, 2012, Time Series Analysis by State Space Methods, 2nd ed., Oxford Statistical Science Series, Vol. 38, Oxford, UK: Oxford University Press.

Evans, C. L., C. T. Liu, and G. Pham-Kanter, 2002, "The 2001 recession and the Chicago Fed National Activity Index: Identifying business cycle turning points," Economic Perspectives, Federal Reserve Bank of Chicago, Third Quarter, pp. 26-43, available at www.chicagofed.org/digital_assets/publications/economic_perspectives/2002/3qepart2.pdf.

Fisher, J. D. M., 2000, "Forecasting inflation with a lot of data," Chicago Fed Letter, Federal Reserve Bank of Chicago, No. 151, March.

Giannone, D., L. Reichlin, and D. Small, 2008, "Nowcasting: The real-time informational content of macroeconomic data," Journal of Monetary Economics, Vol. 55, No. 4, May, pp. 665-676.

Harvey, A. C., 1989, Forecasting, Structural Time Series Models and the Kalman Filter, Cambridge, UK: Cambridge University Press.

Jungbacker, B., S. J. Koopman, and M. van der Wel, 2011, "Maximum likelihood estimation for dynamic factor models with missing data," Journal of Economic Dynamics and Control, Vol. 35, No. 8, August, pp. 1358-1368.

Shumway, R. H., and D. S. Stoffer, 1982, "An approach to time series smoothing and forecasting using the EM algorithm," Journal of Time Series Analysis, Vol. 3, No. 4, July, pp. 253-264.

Stock, J. H., and M. W. Watson, 2002, "Macroeconomic forecasting using diffusion indexes," Journal of Business & Economic Statistics, Vol. 20, No. 2, April, pp. 147-162.

--, 1999, "Forecasting inflation," Journal of Monetary Economics, Vol. 44, No. 2, October, pp. 293-335.

--, 1998, "Median unbiased estimation of coefficient variance in a time-varying parameter model," Journal of the American Statistical Association, Vol. 93, No. 441, March, pp. 349-358.

Watson, M. W., and R. F. Engle, 1983, "Alternative algorithms for the estimation of dynamic factor, MIMIC and varying coefficient regression models," Journal of Econometrics, Vol. 23, No. 3, December, pp. 385-400.

NOTES

(1) Additional background information on the CFNAI and its method of construction is available at www.chicagofed.org/digital_assets/publications/cfnai/background/cfnai_background.pdf. A complete list of the 85 indicators, their associated categories, and their respective weights in the overall index is available at www.chicagofed.org/digital_assets/publications/cfnai/background/cfnai_indicators_list.pdf.

(2) See the next section and box 1 (p. 21) for details on principal components analysis (PCA) and on how the CFNAI is the first principal component of the 85 data series (that is, the single component common to each data series that explains the most variation across all 85).

(3) The terms nowcast and nowcasting are derived from combining the words now and forecasting. Nowcasting techniques are commonly used in economics because they permit economists to predict the present (and recent past) values of standard measures of the economy (such as real gross domestic product, or GDP), which are often determined only after a long delay.

(4) See box 3 (p. 24) for more details on CDF analysis.

(5) See box 1 (p. 21) for further details on the factor model representation of the CFNAI.

(6) Jungbacker, Koopman, and van der Wel (2011) also describe a full maximum likelihood estimator of the collapsed dynamic factor model. In order to make direct comparisons across methodologies, we only consider their EM algorithm estimation method in our article.

(7) Technically, the Stock and Watson (2002) methodology can also incorporate mixed-frequency data. However, because of that methodology's lack of dynamics, this takes place as an additional transformation of the data in its algorithm.

(8) The trend component captures long-run factors, such as potential growth in productivity, capital, and labor. In contrast, the cyclical component captures medium-run factors driving economic growth and is generally associated with the business cycle--the periodic fluctuations in economic activity around its long-term historical trend.

(9) Interestingly, Brave (2008) found similar results for the same two categories (namely, the employment, unemployment, and hours category and personal consumption and housing category) when looking at the impact of slow-moving changes in the average values of their data series over time.

(10) See Brave and Butters (2012a, 2012b) for examples with financial data.

(11) In other words, the difference between the CFNAI and alternative CFNAI is the sum of the difference between the CFNAI and DF-HAC index and the difference between the DF-HAC index and the CDF-HAC index.

(12) See, for instance, Brave and Butters (2010). Other examples using factor models to forecast GDP growth are Stock and Watson (2002) and Giannone, Reichlin, and Small (2008).

(13) This data set is available at www.philadelphiafed.org/research-and-data/real-time-center/real-time-data/.

(14) In making these comparisons, we eliminated quarters where GDP is subject to annual and benchmark revisions.

TABLE 1
Fraction of data variance explained by the index

                            CFNAI    DF    DF-HC   DF-HAC   CDF-HAC

Total                       0.29    0.28   0.27     0.26     0.20
Production and income       0.39    0.38   0.46     0.50     0.43
Employment, unemployment,   0.36    0.37   0.33     0.29     0.32
  and hours
Personal consumption and    0.08    0.08   0.05     0.03     0.04
  housing
Sales, orders, and          0.17    0.17   0.16     0.18     0.21
  inventories

Notes: The table displays the fraction of the overall variance of the
85 underlying indicators in the Chicago Fed National Activity Index
(CFNAI) that is explained by the CFNAI and each of the four
dynamic-factor-based indexes over the period March 1967 through
February 2014. In addition, it decomposes this fraction into the
share explained by each of the four broad categories of indicators
listed here. The four dynamic-factor-based indexes--DF, DF-HC,
DF-HAC, and CDF-HAC--are derived from methodologies based on
Giannone, Reichlin, and Small (2008), Doz, Giannone, and Reichlin
(2012), Jungbacker, Koopman, and van der Wel (2011), and Brauning and
Koopman (2014), respectively (see the text for further details).

Source: Authors' calculations based on data from Haver Analytics.
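
The variance fractions reported in table 1 can be illustrated with a minimal sketch. It assumes, as one plausible reading of the table notes, that the "Total" row is the average squared correlation between each standardized indicator and the index, and that each category row is that category's share of the summed squared correlations (so the four category rows sum to 1). The function name `variance_shares`, the synthetic panel, and the category labels are hypothetical and are not taken from the authors' calculations; the index here is a plain first principal component rather than any of the dynamic factor estimates.

```python
import numpy as np

def variance_shares(data, categories):
    """Return (fraction of variance explained by PC1, {category: share of that fraction})."""
    # Standardize each indicator to zero mean and unit variance.
    z = (data - data.mean(axis=0)) / data.std(axis=0)
    # First principal component from the SVD of the standardized panel.
    u, s, _ = np.linalg.svd(z, full_matrices=False)
    index = u[:, 0] * s[0]
    # Squared correlation of each indicator with the index.
    r2 = np.array([np.corrcoef(z[:, j], index)[0, 1] ** 2
                   for j in range(z.shape[1])])
    total = r2.mean()
    labels = np.asarray(categories)
    shares = {c: r2[labels == c].sum() / r2.sum()
              for c in sorted(set(categories))}
    return total, shares

# Synthetic example (assumption): one common factor plus noise, eight indicators.
rng = np.random.default_rng(0)
T, N = 200, 8
factor = rng.standard_normal(T)
loadings = np.array([1.0, 0.9, 0.8, 0.7, 0.3, 0.2, 0.5, 0.4])
panel = np.outer(factor, loadings) + rng.standard_normal((T, N))
cats = (["production"] * 2 + ["employment"] * 2
        + ["consumption"] * 2 + ["sales"] * 2)

total, shares = variance_shares(panel, cats)
print(round(total, 2), {c: round(v, 2) for c, v in shares.items()})
```

In this toy panel, the categories with larger loadings on the common factor receive larger shares, which mirrors how the production and employment categories dominate the shares in table 1.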

TABLE 2
AUROC for NBER recessions and RMSE ratios
for current quarter GDP growth predictions

                  In-sample    Out-of-sample
          AUROC   RMSE ratio    RMSE ratio

DF        0.95       0.89          0.98
DF-HC     0.95       0.93          0.99
DF-HAC    0.95       0.93          0.85
CDF-HAC   0.98       0.58          0.81

Notes: The table displays areas under the receiver operating
characteristic (ROC) curve (AUROC) and root mean squared error (RMSE)
ratios for current quarter real gross domestic product (GDP) growth
forecasts based on the three-month moving averages of the four
dynamic-factor-based alternatives to the Chicago Fed National
Activity Index (CFNAI). The four dynamic-factor-based indexes--DF,
DF-HC, DF-HAC, and CDF-HAC--are derived from methodologies based on
Giannone, Reichlin, and Small (2008), Doz, Giannone, and Reichlin
(2012), Jungbacker, Koopman, and van der Wel (2011), and Brauning and
Koopman (2014), respectively (see the text for further details). The
closer the AUROC value is to 1, the more accurate a
dynamic-factor-based index is in signaling U.S. recessions and
expansions as determined by the National Bureau of Economic Research
(NBER). An RMSE ratio of less than 1 indicates a dynamic-factor-based
index's forecast that is more accurate than a similar forecast based
on the traditional CFNAI using the nowcasting models described in
Brave and Butters (2013) for in-sample comparisons over the period
March 1967 through February 2014 and Brave and Butters (2010) for
out-of-sample comparisons over the period December 2003 through April
2013 (see the text for further details).

Sources: Authors' calculations based on data from the Federal Reserve
Bank of Philadelphia, Real-Time Data Set for Macroeconomists; and
Haver Analytics.
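
The two statistics in table 2 can be sketched as follows. The AUROC is computed here as the probability that a randomly chosen recession month has a lower index value than a randomly chosen expansion month, and the RMSE ratio as the RMSE of a candidate forecast divided by the RMSE of a benchmark forecast. The function names `auroc` and `rmse_ratio` and all numbers below are made up for illustration; they are not the authors' data, and the sketch does not reproduce the nowcasting models of Brave and Butters (2010, 2013).

```python
import numpy as np

def auroc(index, recession):
    """Area under the ROC curve when low index values signal recession."""
    index = np.asarray(index, dtype=float)
    recession = np.asarray(recession, dtype=bool)
    rec, expn = index[recession], index[~recession]
    # Compare every recession month with every expansion month;
    # ties count as one half.
    lower = (rec[:, None] < expn[None, :]).sum()
    ties = (rec[:, None] == expn[None, :]).sum()
    return (lower + 0.5 * ties) / (rec.size * expn.size)

def rmse_ratio(actual, candidate, benchmark):
    """RMSE of the candidate forecast relative to the benchmark forecast."""
    actual = np.asarray(actual, dtype=float)

    def rmse(forecast):
        errors = np.asarray(forecast, dtype=float) - actual
        return np.sqrt(np.mean(errors ** 2))

    return rmse(candidate) / rmse(benchmark)

# Illustrative values only (assumption): a three-month moving average of an
# index and a 0/1 recession indicator for the same months.
ma3 = [0.4, 0.1, -0.2, -1.1, -1.6, -0.5, 0.2, 0.3]
nber = [0, 0, 0, 1, 1, 0, 0, 0]
area = auroc(ma3, nber)

# Illustrative quarterly GDP growth with two competing nowcasts.
gdp = [2.0, 1.0, -1.0, 3.0]
ratio = rmse_ratio(gdp,
                   candidate=[1.5, 1.0, -0.5, 2.5],
                   benchmark=[1.0, 2.0, 0.0, 2.0])
print(area, round(ratio, 3))
```

A ratio below 1 means the candidate forecast has the smaller error, which is how the entries in the two RMSE ratio columns of table 2 are read relative to the traditional CFNAI.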