Nowcasting using the Chicago Fed National Activity Index.
Brave, Scott A. ; Butters, R. Andrew
Introduction and summary
The Chicago Fed National Activity Index (CFNAI) is a monthly index
of U.S. economic activity constructed from 85 data series (or
indicators) classified into four groups: production and income;
employment, unemployment, and hours; personal consumption and housing;
and sales, orders, and inventories. (1) The index is estimated as the
first principal component of the 85 data series, (2) and is essentially
a weighted average of the indicators, with their individual weights
representing the relative degree to which each indicator explains the
overall variation among them. The CFNAI is also normalized to reflect
deviations around a long-term historical rate of economic growth. As
such, a zero value of the index indicates that growth in economic
activity is proceeding along its long-term historical path; a negative
value indicates below-average growth, while a positive value indicates
above-average growth.
The CFNAI, which premiered in March 2001, was originally designed
as a leading indicator for inflation (Stock and Watson, 1999; and
Fisher, 2000). However, much of its current value derives from its
ability to capture U.S. business cycles (that is, the periodic
fluctuations in economic activity around its long-term historical trend)
and nowcast (3) U.S. real gross domestic product (GDP) growth (Evans,
Liu, and Pham-Kanter, 2002; and Brave and Butters, 2010). The index has
been shown to align with the historical timing of U.S. recessions
according to the National Bureau of Economic Research (NBER), with close
to 95 percent accuracy (Berge and Jorda, 2011). Moreover, the CFNAI has
the ability to signal in real time the onset and end of a recession--for
instance, the index did this for the 2001 and 2007-09 recessions within
one to three months of the NBER dates, with an average lead time of one
year prior to the official NBER announcements (Brave and Butters, 2010).
The CFNAI's success has been more mixed in terms of predicting real
GDP growth, although for the 2004-09 period its performance was on par
with the median current quarter forecast from the Federal Reserve Bank
of Philadelphia's Survey of Professional Forecasters (Brave and
Butters, 2010).
In this article, we consider an alternative version of the CFNAI
that is chiefly constructed using the methodology developed in Brauning
and Koopman (2014). Their method of collapsed dynamic factor (CDF)
analysis (4) offers several advantages over the CFNAI's traditional
methodology--principal components analysis (PCA)--when it comes to
estimating the index: first, through its incorporation of the dynamic
properties of the time series for the index (which PCA cannot exploit)
and, second, by further disentangling common drivers of the variation in
the underlying data series from idiosyncratic ones. Common drivers of
the CFNAI indicators are the types of macroeconomic shocks generally
associated with the business cycle, while idiosyncratic drivers include
shocks typically isolated to various specific sectors of the U.S.
economy, as captured in the four broad categories of the CFNAI
indicators. Moreover, the methodology of Brauning and Koopman (2014)
makes it possible to directly link the CFNAI to broad economic
indicators constructed at a lower frequency, such as quarterly real GDP
growth.
[FIGURE 1 OMITTED]
Figure 1 plots the history of the traditional monthly CFNAI and the
alternative CFNAI, which is largely based on applying the methodology of
Brauning and Koopman (2014), from March 1967 through February 2014. The
shaded periods in the figure represent U.S. recessions as identified by
the NBER. The alternative CFNAI shown here produces a superior in-sample
fit and out-of-sample projections of current quarter real GDP growth
while correlating more closely with NBER recessions than the traditional
CFNAI. These improvements depend on both the way in which the
correlation structure of the 85 underlying data series (at a certain
point in time and across time) is taken into account in the estimation
procedure and the particular way in which real GDP growth and its
dynamic properties are incorporated. We establish this fact by drawing
comparisons between the static factor model for the CFNAI and several
dynamic factor models that include quarterly real GDP growth but differ
from the Brauning and Koopman (2014) methodology in how they include it.
In the process of updating the CFNAI with the Brauning and Koopman
(2014) methodology, we also learn something about the nature of the
recovery from the most recent recession. Our application of CDF analysis
provides us with some additional context for the uneven pattern of
growth during the recovery: First, we show that there has been a
moderate decline in the trend rate of real GDP growth since December
2007; and second, we note that the share of variation among the
underlying data series for the CFNAI (particularly those in the personal
consumption and housing category and employment, unemployment, and hours
category) due to idiosyncratic drivers increased during the 2007-09
recession and the subsequent recovery. The second finding leads our
estimate of the alternative CFNAI during much of the recent recovery to
be higher than that of the traditional CFNAI given the weakness of these
data series, although the first finding somewhat offsets the impact on
U.S. real GDP growth of this upward revision in our assessment of
economic activity.
Box 1 Principal components analysis and factor analysis
Here, we explain the mathematics behind using PCA to construct the
traditional CFNAI. Let $x_t$ denote the N x 1 column vector of N data
series at time t. The first step is to form the N x T matrix of
data vectors $X_t$, where each row of this matrix contains T
observations normalized to have a mean of zero and a standard
deviation of one. [1] The eigenvector-eigenvalue decomposition of
the variance-covariance matrix $X_t X_t'/N$ then produces
a set of time-invariant weights referenced by the 1 x 85 row vector
w resulting from a transformation of the eigenvector associated
with the largest eigenvalue of this matrix. These weights are then
used to construct a weighted average of the $x_t$ such that the
resulting index is given by $I_t = w x_t$.
The underlying assumption about $X_t$ necessary to produce this
variance decomposition is that it admits a factor model
representation. This means that it can be additively decomposed
into the product of two vectors--an N x 1 column vector of
time-invariant factor loadings $\Gamma$ and a 1 x T time-varying
latent factor $F_t$--and a normally distributed mean zero
random variable $\varepsilon_t$ with variance-covariance matrix
$\sigma^2 I$:
$$X_t = \Gamma F_t + \varepsilon_t.$$
The values of $F_t$ and $\Gamma$ are then jointly estimated by
maximizing $\mathrm{Tr}[\Gamma' X_t X_t' \Gamma]$ subject to a
normalization constraint. [2] This linear optimization problem is
solved by setting the estimator $\hat{\Gamma}'/N$ equal to w. [3] The
estimated factor in our example, given by
$\hat{F}_t = \hat{\Gamma}' X_t / N$, corresponds to the CFNAI. As such,
it is the principal component common to all N = 85 indicators that
explains the largest amount of variation among them.
[1] Underlying the normalization of the data is the concept of
stationarity, or in this case the first and second moment
restrictions that the mean and variance of each indicator do not
vary over time. Each data series first receives a transformation to
make it stationary prior to its normalization. A list of
transformations can be found at
www.chicagofed.org/digital_assets/publications/cfnai/background/cfnaiindicatorsjist.pdf.
[2] The normalization constraints most commonly used with one
factor are $\Gamma'\Gamma/N = 1$ and $F_t'F_t/T = 1$.
[3] See Stock and Watson (2002) for more details on the connection
between PCA and factor analysis. Also, note that identification
here is achieved only up to the scale provided by the normalization
constraint on the factor loadings. To be able to interpret the
CFNAI, we use the normalization $F_t'F_t/T = 1$.
In the next section, we detail the traditional and several
alternative methods of constructing the CFNAI. Then, we further explain
the implications of our proposed update to the CFNAI using the Brauning
and Koopman (2014) framework. Next, we describe what is driving the
differences in timing of U.S. recessions across the Chicago Fed National
Activity Index's three-month moving average (CFNAI-MA3) and the
alternative CFNAI's three-month moving average (alternative
CFNAI-MA3). Afterward, we show how the alternative CFNAI-MA3 can be used
to nowcast annualized quarterly real GDP growth more accurately than the
traditional CFNAI-MA3 and several other dynamic-factor-based indexes.
Finally, we present our conclusions and comment on what they may imply
for U.S. economic activity over the near term.
Traditional and alternative methods of constructing the CFNAI
The traditional method of constructing the CFNAI--principal
components analysis--proceeds by means of an eigenvector-eigenvalue
decomposition of the variance-covariance matrix of the 85 underlying
data series. This static factor model description of the data, detailed
in box 1, produces a principal component for each eigenvalue of the
variance-covariance matrix. The eigenvector associated with the largest
eigenvalue of the matrix constitutes the weights applied to the data
series that are used to construct the first principal component, or what
we call the CFNAI. Stock and Watson (2002) show that this method of
constructing the CFNAI is capable of producing a consistent estimate of
the underlying static factor model of the data as the number of data
series and the number of time periods become large.
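The construction described above can be sketched in a few lines of
Python. This is only an illustration under stated assumptions, not the
production CFNAI code: the function name pca_index, the N x T array
layout, the division of the cross-product matrix by T, and the sign
convention are all choices made for the example.

```python
import numpy as np

def pca_index(X):
    """First-principal-component index from an N x T panel (hypothetical helper).

    Rows are data series and columns are months; every series is assumed
    to have been transformed to stationarity beforehand.
    """
    X = np.asarray(X, dtype=float)
    N, T = X.shape
    # Normalize every series to mean zero and standard deviation one.
    Z = (X - X.mean(axis=1, keepdims=True)) / X.std(axis=1, keepdims=True)
    # N x N variance-covariance matrix of the normalized data.
    S = Z @ Z.T / T
    eigvals, eigvecs = np.linalg.eigh(S)   # eigenvalues in ascending order
    v = eigvecs[:, -1]                     # eigenvector of the largest eigenvalue
    if v.sum() < 0:                        # the sign is arbitrary (assumed convention)
        v = -v
    gamma = np.sqrt(N) * v                 # loadings scaled so gamma'gamma / N = 1
    w = gamma / N                          # time-invariant weights
    index = w @ Z                          # weighted average of the indicators
    scale = np.sqrt(np.mean(index ** 2))   # rescale so the index has unit variance
    share = eigvals[-1] / eigvals.sum()    # share of total variance explained
    return index / scale, w / scale, share
```

A zero value of the returned index then corresponds to average growth
in the sample, and a value of one to growth one standard deviation
above average.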
The CFNAI is estimated monthly and released near the middle of each
month with a history from March 1967 through the month preceding that of
the release date. The lag of approximately one month between the last
month of the index and the release date is necessary because of
limitations on data availability. In addition to this one-month
production lag, many of the data series themselves are only available at
a further one- to two-month lag. The limited availability of data at the
time of estimation results in a variance-covariance matrix of less than
full rank, thereby making principal components analysis infeasible. To
circumvent this issue, we forecast each incomplete data series
separately up to the month in which the index is produced according to
individual autoregressive processes with five lags before the index is
constructed.
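A minimal sketch of this panel-completion step, assuming ordinary least
squares estimation with an intercept and iterated one-step forecasts
(the text specifies only "autoregressive processes with five lags", so
both details are assumptions, as is the helper name ar_forecast):

```python
import numpy as np

def ar_forecast(y, p=5, steps=1):
    """Fit an AR(p) by OLS and iterate forecasts `steps` months ahead."""
    y = np.asarray(y, dtype=float)
    # Each regressor row holds the p most recent lags, newest first.
    lags = np.array([y[t - p:t][::-1] for t in range(p, len(y))])
    Z = np.column_stack([np.ones(len(lags)), lags])
    beta, *_ = np.linalg.lstsq(Z, y[p:], rcond=None)
    history = list(y)
    forecasts = []
    for _ in range(steps):
        recent = np.array(history[-p:][::-1])
        nxt = beta[0] + beta[1:] @ recent
        forecasts.append(nxt)
        history.append(nxt)   # feed the forecast back in for the next step
    return np.array(forecasts)
```

A series that is missing its last two months would be extended with
ar_forecast(series, p=5, steps=2) before the principal component is
computed.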
This method of completing the panel of indicators in order to
construct a first principal component is very flexible but not unique or
even necessary. Stock and Watson (2002) also demonstrate how to produce
an index estimate when data are missing with the same desirable
statistical properties as PCA. Their methodology relies explicitly on
the incomplete data methods of the expectation-maximization (EM)
algorithm made popular by Watson and Engle (1983); and although it does
not take into account the serial correlation properties of each data
series like the current CFNAI procedure for inferring missing data or
the dynamic properties of the index itself, it does account for the
data's underlying factor representation to both estimate the index
and impute missing values. (5)
The Stock and Watson (2002) EM algorithm uses the information from
the complete, or "balanced," panel of indicators to make the
best possible prediction of the incomplete, or "unbalanced,"
panel of indicators. When applied to the construction of the CFNAI, it
begins by performing PCA on the subset of data series that are available
in all time periods. Missing values are then predicted based upon linear
regressions of each of the 85 data series on the first principal
component. Finally, PCA is repeated on the balanced panel of data, which
combines the observed and predicted data. This process continues until
the difference in the sum of the squared prediction errors between
iterations reaches a desired level of convergence.
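The iteration just described can be sketched as follows. This is a
simplified illustration rather than the Stock and Watson (2002)
algorithm in full: it assumes a single factor, a T x N panel that has
already been standardized, missing values coded as NaN, and
regressions without an intercept. The names first_pc and em_pca are
hypothetical.

```python
import numpy as np

def first_pc(Z):
    """First principal component of a complete T x N panel, scaled so f'f/T = 1."""
    u, s, vt = np.linalg.svd(Z, full_matrices=False)
    f = u[:, 0] * np.sqrt(Z.shape[0])
    return -f if vt[0].sum() < 0 else f   # assumed sign convention

def em_pca(X, tol=1e-6, max_iter=500):
    """Alternate between PCA and regression-based imputation of missing values."""
    X = np.asarray(X, dtype=float)
    missing = np.isnan(X)
    complete = ~missing.any(axis=0)
    # Start from PCA on the series that are observed in every period.
    f = first_pc(X[:, complete])
    filled = X.copy()
    prev_sse = np.inf
    for _ in range(max_iter):
        # Predict missing values from a regression of each series on the factor.
        for j in range(X.shape[1]):
            obs = ~missing[:, j]
            lam = f[obs] @ X[obs, j] / (f[obs] @ f[obs])
            filled[~obs, j] = lam * f[~obs]
        # Repeat PCA on the panel of observed and predicted data.
        f = first_pc(filled)
        loadings = f @ filled / (f @ f)
        resid = filled - np.outer(f, loadings)
        sse = np.sum(resid[~missing] ** 2)   # squared errors on observed data
        if abs(prev_sse - sse) < tol:
            break
        prev_sse = sse
    return f, filled
```

Observed values are never overwritten; only the NaN cells change from
one iteration to the next.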
Since the inception of the CFNAI in March 2001, several alternative
methods of constructing economic activity indexes that build on PCA have
been proposed. Each of these is also an example of factor analytic
methods; the differences across the methodologies mainly depend on how
variation due to common drivers versus idiosyncratic ones is decomposed
across data series and time periods. Here, we briefly describe a few of
these alternative methodologies, contrasting them with the traditional
methodology used for the CFNAI and each other before we explain the
collapsed dynamic factor model of Brauning and Koopman (2014). Boxes 2
and 3 present many of the technical details of these methods that are
omitted in the discussion that follows.
Giannone, Reichlin, and Small (2008) extended the static
representation of the factor model of Stock and Watson (2002) into a
dynamic factor model by incorporating both information from the cross
section of data series (at each point in time) and information on data
series across time into the process of estimating the index and imputing
missing values. Doz, Giannone, and Reichlin (2012) subsequently
provided an alternative EM algorithm with which to estimate the dynamic
factor model. In the first step, PCA is performed up to the point in
time for which all data series are available. The first principal
component from this static factor model is then used to obtain the
initial parameter values for the dynamic factor model shown in box 2.
The estimation of the CFNAI in this case proceeds by means of the Kalman
filter and smoother equations applied to this model. The resulting index
is then used to reestimate parameter values, and the process is repeated
until convergence of the model's log-likelihood is achieved as
shown in box 2.
In the methodology of Doz, Giannone, and Reichlin (2012), data
series that are unavailable each month are ignored for inferring the
value of the index, but are forecasted using information on the dynamic
properties of the index via the Kalman filter. Furthermore, unlike PCA,
the idiosyncratic error structure of the data can be relaxed to
accommodate unequal variances across unobserved idiosyncratic drivers of
the data series (that is, heteroskedasticity). These modifications of
the underlying factor model for the CFNAI are not costless, however, as
they come at the price of estimating a much larger number of parameters.
Besides the obvious increase in complexity and in the time and computing
power necessary to estimate and construct the index using dynamic factor
methods versus static factor methods, other potential drawbacks from
this richer class of factor models include the additional uncertainty
introduced when using the index to make out-of-sample projections of
inflation and economic growth as in Fisher (2000) and Brave and Butters
(2010).
Collapsed dynamic factor analysis as presented in Jungbacker,
Koopman, and van der Wel (2011) and explained in box 3 minimizes these
costs by transforming the static portion of the dynamic factor model in
such a way as to significantly reduce the number of estimated parameters
needed to run the Kalman filter and compute projections of auxiliary
variables of interest, such as real GDP growth. We follow their
methodology in order to incorporate serial correlation within data
series (idiosyncratic autocorrelation) in addition to heteroskedasticity
into the static factor model. All of the alternative methods for
constructing the CFNAI that we have presented thus far, however, remain
sensitive to the use of PCA as a starting point for estimation. (6) If
the PCA estimate of the index is biased even after accounting for its
dynamics, none of the dynamic factor models considered here is
guaranteed to produce an unbiased estimate of the index.
Brauning and Koopman (2014) provide an alternative transformation
of the static factor model that does not assume that PCA produces an
unbiased estimate of the index. In their framework, each data
series' factor loadings are fixed at their PCA values. Then, by
treating the PCA estimate of the index as a noisy indicator of the
"true" measure, their method reoptimizes the index such that
it explains the largest percentage of the variation in the PCA estimate
of the index that is consistent with both its own estimated dynamics and
that of an auxiliary "target" variable. The target variable is
described as a more comprehensive but perhaps less frequently available
indicator of the information set over which the other data series span.
The target variable is also unique in that it alone follows its own
estimated dynamics and loads directly on both current and past values of
the index, whereas in our application the dynamics of the index do not
depend on the target variable. In our application and theirs, real GDP
growth is used as the target variable.
Box 2 Dynamic factor analysis
PCA and traditional factor analysis are static estimation methods in
that they do not incorporate information from both the cross section
of data series and the information from across time. Dynamic factor
analysis instead makes use of variation in both forms. To do so, it
relies on signal extraction methods, such as the Kalman filter,
applied to a system of equations relating the latent factor, or the
CFNAI as in the example in box 1, to both the cross section of data
series at each point in time (a "measurement" or "observation"
equation) and the dynamic factors that drive its fluctuations over
time (a "state" equation).
Mathematically, this involves specifying the following state-space
representation:
$$X_t = \Gamma F_t + \varepsilon_t,$$
$$F_t = A F_{t-1} + v_t,$$
where $F_t$ is the 1 x T latent factor capturing a time-varying
common source of variation in the 85 x T matrix of indicators $X_t$;
$\Gamma$ is the 85 x 1 loadings onto the factor; and A is the
transition matrix describing the evolution of the latent factor over
time. We write the A parameter of the model assuming a first-order
autoregressive process (AR(1)) for $F_t$, which can be generalized to
an arbitrary number of lags, p. [1] The static factor model
representation of the CFNAI described in box 1 thus forms the
measurement equation of the state-space representation of the dynamic
factor model. Adding dynamics of some finite order to $F_t$ yields
its state equation.
Both $\varepsilon_t$ and $v_t$ are assumed to be independently
normally distributed mean zero random variables. We follow the
dynamic factor model of Doz, Giannone, and Reichlin (2012) and assume
that $\mathrm{Var}(\varepsilon_t) = H$ (an 85 x 85 diagonal matrix)
and $\mathrm{Var}(v_t) = 1$. [2] The signal extraction methods of the
Kalman filter and smoother are capable of estimating such a model
given the coefficient matrices of the measurement and state
equations, that is, $\Gamma$ and A, and the idiosyncratic error
variances along the diagonal of H. All of these parameters can be
consistently estimated from linear regressions involving $X_t$ and
the smoothed or PCA estimate of $F_t$ as demonstrated in Giannone,
Reichlin, and Small (2008).
With the model in state-space form and initial estimates of the
system matrices, the expectation-maximization (EM) algorithm outlined
by Shumway and Stoffer (1982) can be used to estimate the latent
factor $F_t$. At each iteration of the algorithm, one pass of the
data through the Kalman filter and smoother is made followed by
reestimating the system matrices. [3] The log-likelihood that results
is nondecreasing, and convergence is governed by its stability. [4]
This iterative estimation process combines the efficiency of
likelihood-based estimation of the latent factor with the consistency
of ordinary least squares (OLS) parameter estimates.
To see the relationship between the static and dynamic factor models,
consider the case where the transition matrix of the state equation,
A, is the zero matrix. That is, nullify the impact of dynamics for
the latent factor. Notice that if we specify that the
variance-covariance matrix of the measurement equation's error term
is proportional to the identity matrix (and based on the description
of PCA discussed in box 1), we end up with an estimate of the latent
factor that is proportional to the first principal component. For
this reason, our traditional methodology for the CFNAI can be
considered a special case of the dynamic factor model with a zero
transition matrix and a homoskedastic idiosyncratic error structure
(that is, the assumption of equal variances across unobserved
idiosyncratic drivers of the underlying data series).
[1] We choose p depending on the model being estimated, but all
models use either three or four lags.
[2] The latter restriction acts to set the scale of the dynamic
factor model just as the normalization on the scale of the factor
loadings used in PCA does for the static factor model.
[3] In addition, a small alteration in the least-squares step is
required to account for the fact that the unobserved components of
the model must first be estimated. See Durbin and Koopman (2012)
for further details.
[4] Our stability criterion where k references iteration is as
follows:
$$\frac{|\log L(k) - \log L(k-1)|}{(|\log L(k)| + |\log L(k-1)|)/2} < 10^{-6}.$$
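To make the state-space mechanics of this box concrete, here is a
minimal Python sketch of the Kalman filter pass for a one-factor model
with AR(1) dynamics, diagonal idiosyncratic variances, and unit state
innovation variance. It covers only the filtering step; the smoother
and the reestimation of the system matrices are omitted. The function
names, the stationary starting value for the factor, and the reading
of the stability criterion as a relative change in the log-likelihood
are assumptions made for the example.

```python
import numpy as np

def kalman_filter(X, gamma, h, a):
    """Filter X_t = gamma * F_t + eps_t, F_t = a * F_{t-1} + v_t, Var(v_t) = 1.

    X is N x T with NaN for unavailable observations, gamma holds the N
    loadings, h the N idiosyncratic variances, and |a| < 1 is required.
    """
    X, gamma, h = (np.asarray(v, dtype=float) for v in (X, gamma, h))
    T = X.shape[1]
    f_pred, p_pred = 0.0, 1.0 / (1.0 - a ** 2)   # stationary prior for the factor
    f_filt, p_filt, loglik = np.empty(T), np.empty(T), 0.0
    for t in range(T):
        obs = ~np.isnan(X[:, t])                 # series available this month
        if obs.any():
            g = gamma[obs]
            e = X[obs, t] - g * f_pred           # one-step prediction error
            S = p_pred * np.outer(g, g) + np.diag(h[obs])
            K = p_pred * np.linalg.solve(S, g)   # Kalman gain (S is symmetric)
            f_filt[t] = f_pred + K @ e
            p_filt[t] = p_pred * (1.0 - K @ g)
            _, logdet = np.linalg.slogdet(S)
            loglik -= 0.5 * (obs.sum() * np.log(2 * np.pi) + logdet
                             + e @ np.linalg.solve(S, e))
        else:
            f_filt[t], p_filt[t] = f_pred, p_pred
        f_pred = a * f_filt[t]                   # predict next month's factor
        p_pred = a ** 2 * p_filt[t] + 1.0
    return f_filt, p_filt, loglik

def converged(ll_new, ll_old, tol=1e-6):
    """Relative change in the log-likelihood between successive iterations."""
    return abs(ll_new - ll_old) / ((abs(ll_new) + abs(ll_old)) / 2.0) < tol
```

Series that are unavailable in a given month simply drop out of that
month's update, so the factor for that month is inferred from the
remaining series and its own dynamics.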
Scott A. Brave is a senior business economist in the Economic
Research Department at the Federal Reserve Bank of Chicago, and R.
Andrew Butters is a graduate student at the Kellogg School of
Management, Northwestern University. The authors thank Alejandro
Justiniano, Dick Porter, and an anonymous referee for helpful comments
and suggestions.
Economic Perspectives (ISSN 0164-0682) is published by the Economic
Research Department of the Federal Reserve Bank of Chicago. The views
expressed are the authors' and do not necessarily reflect the views of
the Federal Reserve Bank of Chicago or the Federal Reserve System.
Our use of the Brauning and Koopman (2014) methodology is motivated
by several persistent criticisms of perceived bias in the CFNAI. One
source of bias in the CFNAI
could stem from not including enough variables to span the space of U.S.
economic activity--for instance, by omitting international trade or
government spending indicators (which inform real GDP growth) as the
CFNAI currently does. Another source of bias in the CFNAI could be due
to a preponderance of data confined to one or more sectors of the
economy--for instance, the potential overweighting of manufacturing data
series that dominate the production and income category of indicators
and the CFNAI. Yet another source of bias in the CFNAI could result from
the omission of any additional common components in the CFNAI data
series in the estimation of the dynamic factor model. Here, we consider
only the likelihood of the first two potential sources of bias, but note
that our results remain sensitive to the possibility of the last one.
See box 3 for more details on how the Brauning and Koopman (2014)
methodology helps to correct for these potential sources of bias in the
CFNAI.
Box 3 Collapsed dynamic factor analysis
Collapsed dynamic factor analysis reflects its name. It begins by
applying a transformation to the measurement equation of the
dynamic factor model's state-space representation in order to
collapse its size to match the typically smaller size of the state
equation. In the context of the Jungbacker, Koopman, and van der
Wel (2011) model applied to the CFNAI, this amounts to
premultiplying the 85 x T matrix of indicators $X_t$ by the
transformation
$A_L = (\Gamma'\Omega^{-1}\Gamma)^{-1}\Gamma'\Omega^{-1}$,
where $\Omega$ is the variance-covariance matrix of
$\varepsilon_t$:
$$X^L_t = F_t + u_t.$$
The transformed measurement equation, shown here, then relates a
scalar, $X^L_t = A_L X_t$, with a unit factor loading to the latent
factor, $F_t$, and a mean zero normally distributed random scalar
$u_t$ with variance $H = (\Gamma'\Omega^{-1}\Gamma)^{-1}$. The state
equation is unaltered from the example in box 2.
Notice that the transformation here when applied to $X_t$ takes
the familiar form of the generalized least squares (GLS) solution
for the latent factor $F_t$ with $\Omega$ as the weight matrix.
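The GLS form of the collapsing transformation can be checked
numerically. The sketch below assumes a single factor, so that the
loadings are a vector and the collapsed error variance is the scalar
$(\Gamma'\Omega^{-1}\Gamma)^{-1}$; the function name collapse_gls and
the N x T layout of X are assumptions made for the example.

```python
import numpy as np

def collapse_gls(X, gamma, omega):
    """Collapse an N x T panel to a single series with GLS weights.

    Returns the 1 x N weights A_L, the collapsed series A_L X, and the
    variance of the collapsed measurement error.
    """
    X, gamma, omega = (np.asarray(v, dtype=float) for v in (X, gamma, omega))
    oinv_gamma = np.linalg.solve(omega, gamma)   # Omega^{-1} Gamma
    scale = gamma @ oinv_gamma                   # Gamma' Omega^{-1} Gamma (a scalar)
    A_L = oinv_gamma / scale                     # valid because Omega is symmetric
    return A_L, A_L @ X, 1.0 / scale
```

Because the weights satisfy A_L @ gamma == 1, the collapsed series
loads on the factor with a coefficient of exactly one, and less noisy
indicators receive proportionally more weight.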
Brauning and Koopman (2014) suggest the use of an alternative
transformation. In their example, $A_L = \hat{\Gamma}'/N$, where
$\hat{\Gamma}$ is the PCA estimate of the factor loadings of the
static factor model, $X^L_t = \hat{\Gamma}'X_t/N$ is the traditional
CFNAI as shown in box 1, and $u_t = \hat{\Gamma}'\varepsilon_t/N$.
Furthermore, H is not assumed to be a predetermined function of the
dynamic factor model's factor loadings and the variance-covariance
matrix of its idiosyncratic errors. It is instead estimated as an
additional parameter. The estimation of H is made possible by the
inclusion of an additional measurement equation containing a
"target" variable, which is real GDP growth in their example and
ours.
The random scalar $u_t$ in this context has the interpretation
of a "measurement error" between the PCA estimate of the CFNAI and
its dynamic factor counterpart. Notice that it is also a weighted
average of the idiosyncratic disturbances of the static factor
model, with the weights corresponding to the PCA factor loadings.
The implicit assumption maintained by Brauning and Koopman (2014)
to derive their transformation is that [MATHEMATICAL EXPRESSION NOT
REPRODUCIBLE IN ASCII]. Deviations from this assumption will
produce some approximation error as well in $u_t$.
We modify the Brauning and Koopman (2014) methodology for our
purposes by applying the transformation to an alternative
representation of the indicators, $\tilde{X}_t = X_t - \rho X_{t-1}$.
This modification allows us to draw finer
comparisons with the collapsed dynamic factor model of Jungbacker,
Koopman, and van der Wel (2011), which also allows for
heteroskedasticity and serial correlation in the idiosyncratic
errors $\varepsilon_t$ but assumes PCA produces an unbiased
estimate of the latent factor. To make this modification operative,
we first estimate the collapsed dynamic factor model of Jungbacker,
Koopman, and van der Wel (2011) to obtain estimates of the $\rho$
vector and construct $\tilde{X}_t$. [1] We then apply PCA to the
covariance matrix of $\tilde{X}_t$ to obtain
$A_L = \hat{\Gamma}'/N$ and proceed as described earlier in this
box. [2]
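The two steps of this modification, quasi-differencing each indicator
by its own serial correlation coefficient and then extracting weights
from the covariance matrix of the result, can be sketched as follows.
The sketch takes the $\rho$ vector as given (in the text it comes from
a first-stage collapsed dynamic factor model, which is not reproduced
here), and the helper names and sign convention are assumptions.

```python
import numpy as np

def quasi_difference(X, rho):
    """Return X_t - rho * X_{t-1} for an N x T panel and a length-N rho vector."""
    X = np.asarray(X, dtype=float)
    rho = np.asarray(rho, dtype=float)
    return X[:, 1:] - rho[:, None] * X[:, :-1]   # one observation is lost

def collapse_weights(X_tilde):
    """PCA loadings from the covariance matrix, and the weights A_L = loadings / N."""
    N = X_tilde.shape[0]
    cov = np.cov(X_tilde)                    # rows are treated as variables
    eigvals, eigvecs = np.linalg.eigh(cov)   # eigenvalues in ascending order
    v = eigvecs[:, -1]
    if v.sum() < 0:                          # assumed sign convention
        v = -v
    gamma_hat = np.sqrt(N) * v               # scaled so gamma'gamma / N = 1
    return gamma_hat, gamma_hat / N
```

Using the covariance rather than the correlation matrix means that
indicators whose quasi-differenced values vary more also receive more
weight.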
Our application of the Brauning and Koopman (2014) methodology also
requires an additional measurement equation relating quarterly real
GDP growth, $Y_t$, to its own lagged value, $Y_{t-3}$;
current and past values of the three-month moving average of the
CFNAI, $F^3_t$; and a time-varying intercept,
$T^3_t$. Real GDP growth in this framework acts to "clean"
the PCA estimate of the three-month moving average of the monthly
index by apportioning it in each quarter into a fragment that is
correlated with quarterly real GDP growth,
$\gamma_0 F^3_t + \sum_{k=1}^{3} \gamma_k F^3_{t-k}$, and a
fragment that is not, $T^3_t + \delta Y_{t-3} + v_t$, based on the
regression coefficients, $\gamma_k$ and $\delta$. The mean zero
normally distributed random variables $u_t$ and $v_t$ are assumed to
be independent. Box 4 provides more details on this particular
nowcasting specification:
$$Y_t = T^3_t + \delta Y_{t-3} + \gamma_0 F^3_t + \sum_{k=1}^{3} \gamma_k F^3_{t-k} + v_t.$$
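A simplified version of this regression can be sketched in Python.
Two simplifications are assumed and should be kept in mind: the
time-varying intercept is replaced by a constant, and the equation is
fit by ordinary least squares rather than within the state-space
model. The subscripts are read literally as months, so the lag of GDP
growth is the previous quarter and the lags of the moving average are
one to three months back. The function names and the storage of
quarterly growth in a monthly array (NaN except in quarter-end months)
are likewise assumptions.

```python
import numpy as np

def ma3(index):
    """Three-month moving average; the first two entries are NaN."""
    index = np.asarray(index, dtype=float)
    out = np.full(index.shape, np.nan)
    out[2:] = (index[2:] + index[1:-1] + index[:-2]) / 3.0
    return out

def fit_nowcast(y, index):
    """OLS fit of GDP growth on a constant, its lag, and current and past MA3 values.

    y is a monthly-length array holding quarterly growth at quarter-end
    months and NaN elsewhere. Returns the coefficients in the order
    [intercept, lagged growth, MA3 at t, t-1, t-2, t-3].
    """
    y = np.asarray(y, dtype=float)
    f3 = ma3(index)
    rows, target = [], []
    for t in np.flatnonzero(~np.isnan(y)):
        # Need a prior-quarter value and three valid lags of the average.
        if t >= 5 and not np.isnan(y[t - 3]):
            rows.append([1.0, y[t - 3], f3[t], f3[t - 1], f3[t - 2], f3[t - 3]])
            target.append(y[t])
    coef, *_ = np.linalg.lstsq(np.array(rows), np.array(target), rcond=None)
    return coef
```

A nowcast for the current quarter is then the inner product of the
fitted coefficients with the latest row of regressors.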
This errors-in-variables framework is estimated by Brauning and
Koopman (2014) by full maximum likelihood techniques. Here, to
maintain consistency with the way the other dynamic and collapsed
dynamic factor models are estimated, we instead use a variant of
the EM algorithm described in box 2 to estimate the transformed
state-space representation. In order to use the Jungbacker,
Koopman, and van der Wel (2011) estimate of the smoothed latent
factor in the first step, this process requires a restricted
least-squares regression of the PCA estimate of the factor on the
smoothed latent factor and an additional linear regression for the
target variable equation. Additional details on the estimation
process can be found in box 4.
[1] The maintained assumption in this exercise in order for our
estimate of $\rho$ to be unbiased is that
$\mathrm{Cov}(X_{t-1}, \xi_t) = 0$, where $\xi_t$ is a composite
error term comprising $\varepsilon_t$ and the contemporaneous
measurement error in the estimated factor.
[2] Because the indicators have already been demeaned and
standardized, they are measured in common units. Thus, obtaining
principal components from the covariance matrix instead of the
correlation matrix of $\tilde{X}_t$ allows us to incorporate
unequal variances across the indicators.
Implications of the update for the CFNAI
The estimation of the dynamic factor models for the CFNAI requires
only slight modifications to existing methods as shown in box 3. (7) To
be able to compare indexes based on alternative methodologies, we
include real GDP growth as an additional indicator for each of the
dynamic factor alternatives to the CFNAI's traditional methodology
discussed previously. This way we can highlight the joint role played by
including real GDP growth along with the dynamic factor elements
discussed in the previous section. Moreover, to capture the role played
by allowing for dynamics in the estimation process instead of relaxing
various PCA restrictions, we use a variant of the Giannone, Reichlin,
and Small (2008) methodology. In this case, the factor model for the
CFNAI is estimated using the Doz, Giannone, and Reichlin (2012) EM
algorithm, preserving the PCA restrictions on the idiosyncratic error
structure of the data but allowing for a dynamic process of the index to
be estimated.
Another benefit of the alternative estimation frameworks presented
in the previous section is that, following Brauning and Koopman (2014),
it becomes feasible to decompose real GDP growth into its trend and
cyclical components. (8) Based on our past work (Brave and Butters,
2010, 2013), this ability to decompose real GDP growth has turned out to
be vital to capturing changes in average real GDP growth over long
periods. Given this finding, we developed a specification that allows
for a time-varying intercept in the equation for quarterly real GDP
growth to capture changes over time in its trend rate of growth (see box
4). To capture cyclical movements, we follow Brave and Butters (2010) in
using one lag of quarterly real GDP growth in addition to current and
past values of the three-month moving average of the monthly index.
Figure 2 plots in separate panels the difference between the CFNAI
and each of the four dynamic-factor-based indexes. Simply adding dynamic
elements to the static factor model, as well as quarterly real GDP
growth in the construction of the index, produces small differences from
the traditional CFNAI. This can be seen in the difference between the
CFNAI and the first dynamic-factor-based index (labeled DF in panel A of
figure 2). Further relaxing the PCA restrictions on the idiosyncratic
error structure of the data has a more pronounced effect; this is
apparent in the difference between the CFNAI and the
dynamic-factor-based index with heteroskedastic errors (labeled DF-HC in
panel B of figure 2) and between the CFNAI and the dynamic-factor-based
index with heteroskedastic and serially correlated errors (labeled
DF-HAC in panel C). However, the difference from the traditional CFNAI
is most prominent for the dynamic-factor-based index based on the
methodology of Brauning and Koopman (2014) (labeled CDF-HAC in panel D
of figure 2)--which we refer to as the alternative CFNAI in figure 1 (p.
20). These results are detailed further in table 1 (p. 28), which
displays the cumulative effect on the explained variance of the 85
underlying data series for the traditional CFNAI from altering the
various assumptions underlying its static factor model. Each successive
addition to the static factor model for the CFNAI--from dynamics and
real GDP growth (first row, second column) to heteroskedastic errors
(first row, third column) and serially correlated idiosyncratic errors
(first row, fourth column)--reduces the explained variance of the 85
underlying data series by the index, but none more so than the Brauning
and Koopman (2014) methodology (first row, fifth column), which corrects
for bias arising from the use of PCA. It is important to note here that
the reductions in explained variance do not reflect a failure of the
dynamic factor model to account for variation among these data series at
a certain point in time or within them across time. Instead, such
reductions reflect the fact that more of the variation in these series
is estimated to arise from idiosyncratic drivers (including potential
bias due to the use of PCA) rather than common ones. The alternative
CFNAI (first row, fifth column) explains only 20 percent of the total
variance of the 85 data series--a reduction of almost one-third of the
total variance explained by the traditional CFNAI (first row, first
column) and a reduction of almost one-fourth of the total variance
explained by its closest counterpart, the DF-HAC index (first row,
fourth column).
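To make the notion of "explained variance" concrete, the following Python sketch computes the share of total variance that a first principal component accounts for in a panel of standardized series. It is only an illustration of the static (PCA) case, not the authors' code: the function name first_pc_explained_share and the simulated ten-series panel are assumptions introduced here.

```python
import numpy as np

def first_pc_explained_share(X):
    """Share of total variance explained by the first principal component.

    X is a (T x N) array whose columns are already demeaned and standardized.
    """
    # Squared singular values of X are the eigenvalues of X'X, so their
    # relative sizes give each component's share of total variance.
    s = np.linalg.svd(X, compute_uv=False)
    eig = s ** 2
    return eig[0] / eig.sum()

# Hypothetical data: 200 months of 10 series driven by one common factor.
rng = np.random.default_rng(0)
factor = rng.standard_normal(200)
loadings = rng.uniform(0.3, 0.9, size=10)
X = np.outer(factor, loadings) + rng.standard_normal((200, 10))
X = (X - X.mean(axis=0)) / X.std(axis=0)

share = first_pc_explained_share(X)
print(f"Share of variance explained by the first component: {share:.2f}")
```

In this sketch, attributing more of each series to idiosyncratic noise lowers the computed share, which mirrors the interpretation of the reductions in explained variance given above.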
BOX 4
The model for nowcasting real GDP growth
Our dynamic factor model for the CFNAI is given by the system of
equations in box 2 (p. 23), and is repeated here for convenience:
[X.sub.t] = [GAMMA][F.sub.t] + [[epsilon].sub.t],
[F.sub.t] = A[F.sub.t-1] + [v.sub.t].
To obtain the collapsed dynamic factor models discussed in the
text, we substitute the measurement equations described in box 3
(p. 24) for the first equation here.
The variant of this system based on Giannone, Reichlin, and Small
(2008) parameterizes the variance-covariance matrix of
[[epsilon].sub.t], or H, as [[sigma].sup.2]I, in accordance with
the description of PCA in box 1 (p. 21). The variant based on Doz,
Giannone, and Reichlin (2012) instead assumes a heteroskedastic H
with diagonal elements equal to [[sigma].sup.2.sub.i]. In addition
to allowing for heteroskedasticity, the variant based on
Jungbacker, Koopman, and van der Wel (2011) allows for
idiosyncratic serial correlation up to the first order, where we
choose the degree of serial correlation for each of the 85 data
series prior to estimating according to the Bayesian information
criterion. The CDF variant referenced in the text estimates H as a
scalar parameter according to Brauning and Koopman (2014).
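The choice between no serial correlation and first-order serial correlation for a single series can be sketched as below. The text states only that the Bayesian information criterion is used; the least-squares Gaussian form of the criterion, the function name bic_ar_order, and the simulated series are assumptions made for illustration.

```python
import numpy as np

def bic_ar_order(e):
    """Return 0 or 1: the AR order for a mean-zero series e preferred by BIC."""
    e = np.asarray(e, dtype=float)
    y, ylag = e[1:], e[:-1]          # common sample so both fits are comparable
    n = y.size
    # AR(0): no slope parameter, residual is the series itself.
    ssr0 = np.sum(y ** 2)
    # AR(1): one slope estimated by least squares (no intercept).
    rho = np.dot(ylag, y) / np.dot(ylag, ylag)
    ssr1 = np.sum((y - rho * ylag) ** 2)
    # Gaussian BIC up to a constant: n*log(SSR/n) + (number of slopes)*log(n).
    bic0 = n * np.log(ssr0 / n)
    bic1 = n * np.log(ssr1 / n) + np.log(n)
    return 1 if bic1 < bic0 else 0

# Hypothetical persistent idiosyncratic error with autoregressive coefficient 0.8.
rng = np.random.default_rng(0)
shocks = rng.standard_normal(300)
persistent = np.zeros(300)
for t in range(1, 300):
    persistent[t] = 0.8 * persistent[t - 1] + shocks[t]
print("Order chosen for the persistent series:", bic_ar_order(persistent))
```

Applying such a rule series by series before estimation fixes the serial correlation structure, so that only the autoregressive coefficients remain to be estimated within the model.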
We append to this model a nowcasting equation relating annualized
quarterly real GDP growth, [Y.sub.t], in each time period to its
own lagged value, [Y.sub.t-3]; current and past values of the
three-month moving average of the latent factor, [F.sup.3.sub.t];
and a time-varying intercept, [T.sup.3.sub.t]. We only observe
[Y.sub.t] in the third month of each quarter, so that this equation
strictly relates each quarterly realization of real GDP growth to
only the corresponding end-of-quarter value of [T.sup.3.sub.t]:
[Y.sub.t] = [T.sup.3.sub.t] + [delta][Y.sub.t-3] + [[gamma].sub.0][F.sup.3.sub.t] + [3.summation over (k=1)][[gamma].sub.k][F.sup.3.sub.t-k] + [v.sub.t].
To be able to estimate the model, we must first specify a dynamic
process for the latent time-varying intercept, [T.sup.3.sub.t], by
adding a second state equation to the model. We assume that it is
the quarterly average of a monthly process [T.sub.t] that follows a
random walk with drift parameter [alpha]:
[T.sub.t] = [alpha] + [T.sub.t-1] + [[eta].sub.t].
As such, [T.sup.3.sub.t] represents the time-varying mean of
quarterly real GDP growth conditional on the previous quarter's
value of real GDP growth [Y.sub.t-3] and current and past values of
[F.sup.3.sub.t], and can be interpreted as trend real GDP growth.
Furthermore, we assume that the nowcasting error [v.sub.t] and
[[eta].sub.t] are mean-zero, normally distributed random variables
with variances V and W, respectively, that are uncorrelated with
each other, with [[epsilon].sub.t], and with the factor innovation
[v.sub.t].
This particular specification of the nowcasting equation expands on
Brave and Butters (2010), in which we used the CFNAI to nowcast
real GDP growth, and is largely taken from the follow-up discussion
in Brave and Butters (2013). It is based on a decomposition of
trend and cyclical components for real GDP growth as in Brauning
and Koopman (2014), where the cyclical dynamics of real GDP growth
are assumed to be captured by lagged real GDP growth and current
and past values of the three-month moving average of the latent
factor. However, it also represents a departure from the
specification considered by Brauning and Koopman (2014), which uses
a different method of aggregation to relate real GDP growth to the
monthly latent factor, includes additional lags of real GDP growth,
and does not include a time-varying intercept.
Our model is estimated using a variant of the EM algorithms
described in boxes 2 and 3. The use of the Kalman filter requires
that we specify initial values for the mean and variance of
[F.sub.t] and [T.sub.t]. Here, we use the exact initialization
procedure described in Harvey (1989) for [F.sub.t], as well as a
diffuse initialization for [T.sub.t] by assuming that its initial
mean value is the estimated constant in the presample regression of
annualized quarterly real GDP growth on a constant in the 20
quarters prior to our sample beginning in March 1967 and setting
its initial variance to the variance of this estimate. From the
in-sample regression of annualized quarterly real GDP growth on a
constant, one lag of itself, and current and previous values of the
CFNAI-MA3, we then obtain our initial parameter estimates of
[delta], [gamma], and V. Initializing [alpha] at zero, we then obtain our
initial estimate of W according to the median unbiased estimation
procedure described in Stock and Watson (1998) applied to a
local-level unobserved components model for quarterly real GDP
growth. At subsequent iterations, [alpha] and W are then reestimated by
restricted linear regression using our estimate of [T.sub.t].
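The in-sample regression used to initialize [delta], [gamma], and V can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the function name initial_nowcast_ols, the month-indexing convention, the degrees-of-freedom correction for V, and the simulated data are all introduced here.

```python
import numpy as np

def initial_nowcast_ols(gdp_q, ma3_m, n_lags=3):
    """OLS of quarterly GDP growth on a constant, its own lag, and the
    end-of-quarter MA3 index plus n_lags monthly lags of the MA3 index.

    gdp_q : (Q,) annualized quarterly growth rates
    ma3_m : (3*Q,) monthly three-month moving average of the index, where
            month 3*q + 2 is the last month of quarter q (assumed convention)
    """
    gdp_q = np.asarray(gdp_q, dtype=float)
    ma3_m = np.asarray(ma3_m, dtype=float)
    rows, y = [], []
    for q in range(1, gdp_q.size):       # start at 1 so lagged growth exists
        m = 3 * q + 2                    # index of the quarter's third month
        index_terms = [ma3_m[m - k] for k in range(n_lags + 1)]
        rows.append([1.0, gdp_q[q - 1], *index_terms])
        y.append(gdp_q[q])
    Xmat, y = np.array(rows), np.array(y)
    beta, *_ = np.linalg.lstsq(Xmat, y, rcond=None)
    resid = y - Xmat @ beta
    V = resid @ resid / (y.size - Xmat.shape[1])   # residual variance
    return {"const": beta[0], "delta": beta[1], "gamma": beta[2:], "V": V}

# Hypothetical data: 80 quarters of growth generated from a simulated index.
rng = np.random.default_rng(0)
Q = 80
ma3_m = rng.standard_normal(3 * Q)
gdp_q = np.zeros(Q)
gdp_q[0] = 3.0
for q in range(1, Q):
    m = 3 * q + 2
    gdp_q[q] = (2.0 + 0.2 * gdp_q[q - 1] + 1.0 * ma3_m[m]
                + 0.5 * ma3_m[m - 1] + rng.standard_normal())
print(initial_nowcast_ols(gdp_q, ma3_m))
```

In the full model the constant is replaced by the time-varying intercept, so the regression constant here serves only as a starting point.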
Overall, correcting for bias is of greater importance than any
other modification in explaining the differences between the traditional
CFNAI and the alternative CFNAI according to the results in table 1.
However, the other modifications to the underlying static factor model
for the CFNAI reflected in the table are also worth highlighting. For
instance, the various dynamic-factor-based indexes exhibit very different
shares of explained variance by the index across the four broad
categories of indicators. Allowing for heteroskedastic errors shifts
explained variance toward the production and income category of
indicators and away from the other three categories (see second through
fifth rows, differences between second and third columns). Additionally
allowing for idiosyncratic autocorrelation has a similar effect but also
boosts the share of explained variance due to the sales, orders, and
inventories category (see second through fifth rows, differences between
third and fourth columns). The employment, unemployment, and hours
category and personal consumption and housing category are particularly
affected by the modifications to the idiosyncratic error structure of
the static factor model for the CFNAI. (9) For these reasons (and as
explained in box 3, p. 24), we deviate slightly from the Brauning and
Koopman (2014) model by continuing to account for both heteroskedastic
and serially correlated errors in the CDF-HAC index.
[FIGURE 2 OMITTED]
Additionally, allowing (and correcting) for bias from using PCA in
the estimation of the CFNAI, as shown in the fifth column of table 1,
serves to reapportion the explained variance shares slightly more
equally among the remaining three categories at the expense of the
production and income category of indicators. In fact, much of the bias
in the CFNAI that we estimate can be traced back to the contribution of
the production and income category of indicators. Hence, the concern
over potential overweighting of manufacturing data sources that dominate
this category of indicators appears to be valid. The end result is an
index (that is, the CDF-HAC index) that puts slightly more weight on the
sales, orders, and inventories and production and income categories than
the traditional CFNAI does (despite the correction for bias arising from
the latter category) and less weight on the personal consumption and
housing and employment, unemployment, and hours categories. Furthermore,
we should point out that although the difference in the personal
consumption and housing category's share of the fraction of data
variance explained by the CFNAI and the alternative CFNAI (the CDF-HAC
index) may at first seem small, its economic significance is anything
but small given the outsized contribution of this category to the
weakness in economic activity during the recent recession and subsequent
recovery. In fact, we find that a sizable portion of the upward revision
seen in the alternative CFNAI during the recovery can be traced back to
this result, as we discuss in the next section.
Capturing business cycles
One of the CFNAI's key successes has been its use as an
indicator of U.S. business cycles. Traditionally, the three-month moving
average of the index--the CFNAI-MA3--has been used for this purpose
on account of the volatile nature of the monthly CFNAI. We
follow this precedent here, but note that one clear benefit of the
Brauning and Koopman (2014) methodology is that it mitigates to some
degree the concern about the volatility of the monthly index. Using the
nonparametric method developed in Berge and Jorda (2011), we can
quantify the accuracy of both the CFNAI-MA3 and the three-month moving
average of the alternative CFNAI in capturing U.S. expansions and
recessions as defined by the NBER. (10) The receiver operating
characteristic (ROC) analysis framework that Berge and Jorda describe
produces a simple summary statistic in this regard (the area under the
receiver operating characteristic curve, or AUROC). We briefly explain
how we use this method next, while technical details for our ROC
analysis can be found in box 5.
Our use of ROC analysis can be explained graphically by a
histogram, as shown in figure 3. This figure plots the relative
frequency of every observed value of the alternative CFNAI-MA3
separately for values that occur during NBER recessions and expansions.
One can see from figure 3 that the alternative CFNAI-MA3 is in fact
quite accurate at separating recessions from expansions, as the
empirical distributions seldom overlap. The AUROC statistic measures the
degree of separation of the two distributions, such that the more
accurate an index is at distinguishing expansions from recessions, the
higher its AUROC value will be. As noted in box 5, it is even possible
to compare two AUROC values to assess whether or not their differences
are statistically significant. The CFNAI-MA3 has 94 percent accuracy in
describing NBER expansions and recessions, so surpassing its level of
accuracy in this respect is a tall task for any of the
dynamic-factor-based indexes to achieve; however, one--the three-month
moving average of the alternative CFNAI (CDF-HAC index)--does in fact
surpass the CFNAI-MA3's accuracy at the 95 percent confidence
level, with an AUROC of 98 percent. None of the other three-month moving
averages of the dynamic-factor-based indexes we considered were able to
produce a statistically significant improvement in AUROC compared with
the CFNAI-MA3, as shown in the first column of table 2. Yet this improvement
held for the alternative CFNAI regardless of whether or not we smoothed
through some of the monthly volatility by applying a three-month moving
average transformation prior to calculating the AUROC statistic. Thus,
the ability to capture U.S. business cycle properties that the NBER
deems most important appears to be a unique feature of the Brauning and
Koopman (2014) collapsed dynamic factor methodology.
BOX 5
Receiver operating characteristics analysis
ROC analysis applied to the CFNAI and its dynamic-factor-based
alternatives requires that we categorize each observation of an
index as falling within a recession or expansion. Following the
dating conventions for U.S. business cycles of the NBER, we then
need to construct these conditional probabilities:
TP(c) = P[[I.sub.t] [greater than or equal to] c|[S.sub.t] = 1],
FP(c) = P[[I.sub.t] [greater than or equal to] c|[S.sub.t] = 0],
with [S.sub.t] [member of] {0, 1} indicating recessions and
expansions, respectively. TP(c) is typically referred to as the
true positive rate, and FP(c) is known as the false positive rate
for an index [I.sub.t] and particular observed value c. The relationship
between the two is described by the ROC curve. With the Cartesian
convention, this curve is given by
[{ROC(r), r}.sup.1.sub.r=0],
where ROC(r) = TP(c) and r = FP(c). In what follows, we describe
how to construct the ROC curve.
Using the data in figure 4 (p. 32), we find the fraction of
observations that fall outside and inside the shaded regions
denoting U.S. recessions according to the NBER for the alternative
CFNAI-MA3. These fractions are the unconditional probabilities
associated with expansions and recessions. To obtain conditional
probabilities, we use the following algorithm: For each value
between the minimum and maximum observations of an index, we find
the fraction of observations where that value and all subsequently
higher values fall outside the shaded regions. We then do the same
to find the fraction of observations that fall inside the shaded
regions. These two statistics are equivalent to the true and false
positive rates for separating expansions from recessions defined
previously. By plotting the true and false positive rates against
each other for every historical value of an index, we produce a
nonparametric estimate of its ROC curve.
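The algorithm above can be sketched in Python as follows. This is an illustrative sketch rather than the authors' code: the function names roc_curve_points and auroc, the trapezoid rule for the area, and the four-observation toy data are assumptions made here. The state is coded 1 for expansions and 0 for recessions, matching the definitions of TP(c) and FP(c).

```python
import numpy as np

def roc_curve_points(index, state):
    """Nonparametric ROC curve for an index and a 0/1 state (1 = expansion).

    Returns (fp, tp) arrays, one point per threshold, running from (0, 0)
    to (1, 1).
    """
    index = np.asarray(index, dtype=float)
    state = np.asarray(state, dtype=int)
    # Thresholds: every observed value, highest first, plus +inf so that the
    # curve starts at the origin.
    thresholds = np.concatenate(([np.inf], np.sort(np.unique(index))[::-1]))
    expansions = index[state == 1]
    recessions = index[state == 0]
    tp = np.array([(expansions >= c).mean() for c in thresholds])
    fp = np.array([(recessions >= c).mean() for c in thresholds])
    return fp, tp

def auroc(index, state):
    """Area under the ROC curve by the trapezoid rule."""
    fp, tp = roc_curve_points(index, state)
    return float(np.sum(np.diff(fp) * (tp[1:] + tp[:-1]) / 2.0))

# Toy example: two recession months with low readings, two expansion months
# with high readings, so the two distributions do not overlap.
toy_index = [-2.0, -1.0, 1.0, 2.0]
toy_state = [0, 0, 1, 1]
print("AUROC:", auroc(toy_index, toy_state))
```

An index that separates the two states perfectly attains an AUROC of 1, while an index whose values carry no information about the state traces the 45-degree line and attains 0.5.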
Berge and Jorda (2011) show that by calculating the AUROC we arrive
at an estimate of the ability of the index to delineate recessions
from expansions. The closer the area under the curve is to 1, the more
predictive the index is of U.S. expansions and recessions; its statistical
significance is judged relative to the area under the line from the
origin extending at a 45-degree angle (see the next paragraph for
more details). [1] It is also possible to compare the area under
two different curves to distinguish the statistical significance of
differences in predictive ability. This technique is commonly used
in the medical statistics literature to evaluate the ability of a
procedure or medical test to distinguish patients afflicted with a
condition from those who are not. [2]
[FIGURE B1 OMITTED]
Figure B1 displays the ROC curve for the alternative CFNAI-MA3
along with a line from the origin at a 45-degree angle. By
construction, this line has an AUROC equal to 0.5. The more the ROC
curve deviates in total above this 45-degree line, the higher an
index's AUROC will be. In addition, for an index's AUROC to exceed
0.5, it must have a slope greater than 1 at some point on the ROC
curve such that, for a given increase in the true positive rate,
the associated increase in the false positive rate is smaller. The
red dot on the curve marks the point at which it is no longer
possible to increase the true positive rate without producing more
false positives than are consistent with the observed relative
frequency of expansions and recessions.
Baker and Kramer (2007) show that the point on the curve denoted in
figure B1 by the red dot meets the decision-theoretic criteria for
a threshold rule, c, that equally penalizes type I (false positive)
and type II (false negative) classification errors for recessions
and expansions. To see this, consider the following utility
function:
[MATHEMATICAL EXPRESSION NOT REPRODUCIBLE IN ASCII]
where [U.sub.ij] is the utility (or disutility) associated with the
prediction i given that the true state of the business cycle,
[S.sub.t], is j, with {i, j} [member of] {0, 1}, and where [pi] is
the unconditional probability of an expansion. Utility maximization
implies the following first-order condition determining [c.sup.*]:
[partial derivative]ROC/[partial derivative]r = ([U.sub.00] - [U.sub.10])/([U.sub.11] - [U.sub.01]) x (1 - [pi])/[pi].
If we set the leading ratio of utilities to 1, this threshold
equates the slope of the ROC curve to the ratio of the unconditional
probability of recession to that of expansion, (1 - [pi])/[pi]. In doing
so, one is essentially equally weighting the net benefit of making
a type I error versus a type II error relative to correctly
predicting the true state of the business cycle.
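The utility function itself is not reproducible in this text. For reference, an expected-utility expression that is consistent with the definitions of [U.sub.ij] and [pi] given above, and that yields the stated first-order condition when differentiated with respect to r, is sketched below in LaTeX notation. It is offered as an assumption about the omitted expression, not as a transcription of it.

```latex
U(r) = U_{11}\,ROC(r)\,\pi + U_{01}\,\bigl(1 - ROC(r)\bigr)\,\pi
     + U_{10}\,r\,(1 - \pi) + U_{00}\,(1 - r)\,(1 - \pi)

\frac{\partial U}{\partial r} = 0
\;\Longrightarrow\;
\frac{\partial ROC}{\partial r}
  = \frac{U_{00} - U_{10}}{U_{11} - U_{01}} \cdot \frac{1 - \pi}{\pi}
```

Under this form, setting the ratio of utility differences to 1 reduces the problem to maximizing [pi]TP(c) - (1 - [pi])FP(c) over the threshold c.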
[1] The procedure for evaluating statistical significance is
described in DeLong, DeLong, and Clarke-Pearson (1988).
[2] See Brave and Butters (2012a, 2012b) for further examples using
this approach to predict financial crises.
Figure 4 plots the time series of the CFNAI-MA3 and the alternative
CFNAI-MA3 with NBER recession shading. Comparing the two indexes in
figure 4, we note that the alternative CFNAI-MA3's improvement in
AUROC over the CFNAI-MA3 stems largely from its ability to more
accurately capture the timing of U.S. recessions prior to 1990. One way
in which to see this is to examine periods where the alternative
CFNAI-MA3 falls below the dashed line in figure 4. As described in box
5, the ROC framework can also be used to arrive at a single threshold
value distinguishing NBER expansions from recessions that equally
weights the desire to correctly capture both. The dashed line in the
figure is our estimate of this threshold. At -0.7, this threshold for
the alternative CFNAI-MA3 is in line with the value first put forth in
Evans, Liu, and Pham-Kanter (2002) that has been used as a threshold for
the CFNAI-MA3 and slightly above the value computed by Berge and Jorda
(2011) using the same ROC methodology (-0.8). Examining values above and
below -0.7 during the NBER recessions for the alternative CFNAI-MA3, we
note an improvement in AUROC, largely resulting from the fact that it
produces fewer false positives and false negatives during the 1969-70,
1973-75, 1980, and 1981-82 recessions. This is the case even though it
is slightly ahead of the CFNAI-MA3 in the timing of several remaining
recessions.
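A threshold of this kind can be computed from data along the following lines. The sketch assumes the equal-weighting case described in box 5, in which the chosen threshold maximizes the expansion probability times the true positive rate minus the recession probability times the false positive rate; the function name equal_weight_threshold and the toy data are hypothetical, and the code is not the authors' procedure.

```python
import numpy as np

def equal_weight_threshold(index, state):
    """Threshold c maximizing pi*TP(c) - (1 - pi)*FP(c), with 1 = expansion."""
    index = np.asarray(index, dtype=float)
    state = np.asarray(state, dtype=int)
    pi = state.mean()                    # unconditional probability of expansion
    exp_vals = index[state == 1]
    rec_vals = index[state == 0]
    best_c, best_u = None, -np.inf
    for c in np.unique(index):           # candidate thresholds: observed values
        tp = (exp_vals >= c).mean()      # expansions correctly at or above c
        fp = (rec_vals >= c).mean()      # recessions wrongly at or above c
        u = pi * tp - (1.0 - pi) * fp
        if u > best_u:
            best_c, best_u = c, u
    return best_c

# Toy example: three recession readings below three expansion readings.
toy_index = [-3.0, -2.0, -1.0, 0.5, 1.0, 2.0]
toy_state = [0, 0, 0, 1, 1, 1]
print("Threshold:", equal_weight_threshold(toy_index, toy_state))
```

Because candidate thresholds are restricted to observed values, the rule returns the lowest reading still classified as an expansion; any value between that reading and the next lower one would classify the sample identically.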
We can get a sense of what is driving the differences in timing of
U.S. recessions across the two measures by breaking down the difference
between the CFNAI-MA3 and the alternative CFNAI-MA3 into contributions
from the various assumptions of the dynamic factor models building up to
the alternative CFNAI-MA3. In essence, this calculation amounts to
redisplaying the information already presented in figure 2 (p. 27) in a
slightly different manner in order to highlight the impact of the
cumulative changes across three-month moving averages of the CFNAI and
DF-HAC and CDF-HAC indexes (that is, CFNAI-MA3, DF-HAC-MA3, and
CDF-HAC-MA3) discussed previously around business cycle turning points.
We can decompose the difference between the CFNAI and alternative CFNAI
into 1) the difference between the CFNAI and DF-HAC index and 2) the
difference between the DF-HAC index and the CDF-HAC index. (11) To arrive
at the same measure for the difference between the CFNAI-MA3 and
alternative CFNAI-MA3, we take three-month moving averages of all three
indexes.
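Because a moving average is a linear operation, the two-part decomposition carries over unchanged to the three-month moving averages. The short Python sketch below illustrates the arithmetic; the function names ma3 and decompose_difference and the simulated series are assumptions introduced here, not part of the original analysis.

```python
import numpy as np

def ma3(x):
    """Trailing three-month moving average; the first two entries are NaN."""
    x = np.asarray(x, dtype=float)
    out = np.full(x.shape, np.nan)
    out[2:] = (x[2:] + x[1:-1] + x[:-2]) / 3.0
    return out

def decompose_difference(cfnai, df_hac, cdf_hac):
    """Split CFNAI-MA3 minus CDF-HAC-MA3 into its two components."""
    first = ma3(cfnai) - ma3(df_hac)      # CFNAI-MA3 minus DF-HAC-MA3
    second = ma3(df_hac) - ma3(cdf_hac)   # DF-HAC-MA3 minus CDF-HAC-MA3
    total = ma3(cfnai) - ma3(cdf_hac)     # equals first + second
    return first, second, total

# Hypothetical monthly index values, for illustration only.
rng = np.random.default_rng(0)
cfnai = rng.standard_normal(24)
df_hac = cfnai + 0.2 * rng.standard_normal(24)
cdf_hac = df_hac + 0.3 * rng.standard_normal(24)
first, second, total = decompose_difference(cfnai, df_hac, cdf_hac)
print(np.allclose(first[2:] + second[2:], total[2:]))
```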
[FIGURE 3 OMITTED]
Figure 5 displays our decomposition of the difference between the
CFNAI-MA3 and alternative CFNAI-MA3 into two components. The bars in the
figure represent contributions to the total difference (represented by
the dashed line in the figure) by these two components. The red bars
capture the cumulative effect of incorporating dynamics in the static
factor model and real GDP growth along with relaxing the PCA assumptions
on the idiosyncratic error structure of the static factor model--that is, the
CFNAI-MA3 minus the DF-HAC-MA3. We refer to this component as HAC in the
figure as it is primarily the latter feature that dominates the
contribution to the total difference. The blue bars capture the marginal
effect of the measurement error (ME) we estimate in the Brauning and
Koopman (2014) model that arises from bias in the use of PCA--that is,
the DF-HAC-MA3 minus the CDF-HAC-MA3. While ME is mean zero by
construction over the entire sample period, the large magnitude of many
of its realizations in this figure suggests that the CFNAI-MA3 is likely
biased. One can see from figures 4 and 5 that HAC primarily accounts for
the better fit of the alternative CFNAI-MA3 (relative to the CFNAI-MA3)
for the 1969-70 and 1973-75 recessions, while ME is mostly responsible
for the improvement in fit for the 1980 and 1981-82 recessions.
[FIGURE 4 OMITTED]
More recently, measurement error has begun to play more of a
secondary role in explaining the discrepancies between the CFNAI-MA3 and
alternative CFNAI-MA3. This has primarily to do with the way in which
each index accounts for the protracted weakness of personal consumption
and housing indicators during the recovery from the 2007-09 recession
and, to a lesser extent, the employment-related indicators as well. The
alternative CFNAI-MA3 reinterprets what is due to idiosyncratic drivers
of variance in the underlying data series versus what is due to common
drivers on the basis of how it has related historically to real GDP
growth. For the HAC component to be so strongly negative in figure 5
since 2007 implies that the alternative CFNAI-MA3 indicates growth in
economic activity due to personal consumption and housing during the
recovery has been greater than what has been indicated by the
traditional CFNAI-MA3.
However, since 2007, real GDP growth on average has been weak
enough in comparison with the alternative CFNAI-MA3 to suggest that the
trend rate of real GDP growth has fallen. This result can be seen in
figure 6, with our estimate of the trend rate of real GDP growth
decreasing from 2.9 percent in the fourth quarter of 2007 to 2.4 percent
in the fourth quarter of 2013. As a point of comparison, the
Congressional Budget Office's (CBO) estimate of potential real GDP
growth is also displayed in figure 6. Our estimate of the decrease in
the trend rate of real GDP growth is somewhat smaller than the
concurrent change in the CBO's estimate of potential real GDP
growth--a decline from 2.4 percent in the fourth quarter of 2007 to 1.7
percent in the fourth quarter of 2013. That said, our estimate of trend
GDP growth is also slightly higher than the CBO's estimate of
potential real GDP growth for much of the past decade. This is the case
even though over the full sample period (1967:Q1-2013:Q4) they exhibit a
correlation coefficient of 0.85. The large negative HAC values from
personal consumption and housing indicators since 2007 have often been
wholly or partially offset by large positive measurement errors from the
production and income indicators. This feature of the data has prevented
the alternative CFNAI-MA3 from being even further above the CFNAI-MA3
during this period, masking the implied inference for the decline in
trend GDP growth.
[FIGURE 5 OMITTED]
With recent HAC values near zero or slightly positive (see figure
5), the alternative CFNAI-MA3 suggests that the pervasiveness of the
weakness in the household sector is currently more limited than
previously thought according to the CFNAI-MA3. This development is a
good omen for the continued expansion of the U.S. economy in 2014 if the
ongoing recovery in the housing market persists. Recent negative ME
values (see figure 5) also suggest that the impact of the weakness in
production and income indicators in early 2014 on the CFNAI-MA3 will
likely be transitory. While the alternative CFNAI-MA3 also fell into
negative territory in February 2014 (see figure 4), it remained much
closer to its historical average than the CFNAI-MA3. Using the
nowcasting model for real GDP growth described in box 4 (p. 26) as of
March 20, 2014, we estimate that real GDP in the first quarter of 2014
increased at an annual rate of 1.8 percent, which is 0.5 percentage
points below our current estimate of 2.3 percent for trend real GDP
growth. By comparison, the Blue Chip Economic Indicators consensus
forecast for first quarter real GDP growth on March 10, 2014, was 1.9
percent. In the next section, we evaluate the historical performance of
our nowcasting model.
[FIGURE 6 OMITTED]
Nowcasting real GDP growth
Real GDP is the broadest measure of U.S. economic activity, but it
is produced with a significant lag of up to three months. Therefore,
linking its current quarter growth rate to the more readily available
monthly CFNAI with a nowcasting model has a natural appeal. Furthermore,
nowcasts of real GDP growth produced using the CFNAI and similar indexes
have been shown to be quite accurate in several instances. (12) To
generate nowcasts, we incorporate annualized quarterly real GDP growth
into all of our dynamic factor models. However, we show here that only
the alternative CFNAI, which is based on the Brauning and Koopman (2014)
methodology, significantly boosts the explanatory power of the dynamic
factor model for real GDP growth, further suggesting that the PCA
estimate of the index is indeed biased because of a lack of
international trade, government spending, and other indicators that
inform real GDP growth.
This can be seen in the second column of table 2 (p. 31), which
displays in-sample root mean squared error (RMSE) ratios for the
nowcasts from the three-month moving averages of the DF, DF-HC, DF-HAC,
and CDF-HAC indexes. A number less than 1 indicates an improvement in
fit for the quarterly real GDP growth data relative to traditional
CFNAI-MA3 nowcasts based on the nowcasting model described in Brave and
Butters (2013). While all of the dynamic-factor-based indexes
demonstrate an in-sample RMSE ratio of less than 1, the improvement in
relative fit for the CDF-HAC index, or alternative CFNAI, at 42 percent
dwarfs the others. This is perhaps not surprising given the flexibility
of the CDF method in matching the index to observed real GDP growth. A
more convincing test of the ability of the Brauning and Koopman (2014)
methodology to correct for potential bias due to a lack of international
trade, government spending, and other indicators informing real GDP
growth would be to test its ability to nowcast when current quarter real
GDP growth is not observed.
As it turns out, the Brauning and Koopman (2014) methodology is
also important for improving the out-of-sample accuracy of our nowcasts,
though its relative improvement is not much larger than that achieved
with the methodology for the DF-HAC index. To arrive at this conclusion,
we estimated three-month moving averages of all four
dynamic-factor-based indexes using a real-time archive of the CFNAI data
series covering the period December 2003 through April 2013 and the
available "vintage" of real GDP growth in those months from
the Federal Reserve Bank of Philadelphia's Real-Time Data Set for
Macroeconomists. (13) We then compared our nowcasts made in the months
of each vintage of real GDP growth to the subsequent real-time GDP
release being forecasted to compute out-of-sample RMSE ratios similar to
the in-sample fits discussed before. (14) As the basis for comparison in
this exercise, we used similarly constructed RMSE values based on the
within-quarter nowcasting model described in Brave and Butters (2010).
The out-of-sample RMSE ratios are shown in the third column of table 2
(p. 31). While all of the ratios are again less than 1, the three-month
moving average of the CDF-HAC index (alternative CFNAI) is still the
best model in this real-time setting, with a 19 percent improvement in
forecast accuracy compared with the Brave and Butters (2010) nowcast.
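The RMSE ratios reported in table 2 follow a simple calculation, sketched below in Python with made-up numbers. The function names rmse and rmse_ratio and the toy nowcasts are assumptions for illustration. If the percentage improvement is read as one minus the ratio (also an assumption), a ratio of 0.81 would correspond to the 19 percent improvement cited above.

```python
import numpy as np

def rmse(forecast, actual):
    """Root mean squared error of a set of forecasts."""
    f = np.asarray(forecast, dtype=float)
    a = np.asarray(actual, dtype=float)
    return float(np.sqrt(np.mean((f - a) ** 2)))

def rmse_ratio(model_nowcasts, benchmark_nowcasts, actual):
    """RMSE of the candidate model relative to the benchmark model.

    A value below 1 means the candidate's nowcasts were more accurate.
    """
    return rmse(model_nowcasts, actual) / rmse(benchmark_nowcasts, actual)

# Hypothetical quarterly growth rates and two sets of nowcasts.
actual = np.array([2.0, 3.0, 1.0, 4.0])
benchmark = actual + np.array([1.0, -1.0, 1.0, -1.0])   # errors of 1 point
candidate = actual + np.array([0.5, -0.5, 0.5, -0.5])   # errors of 0.5 point
ratio = rmse_ratio(candidate, benchmark, actual)
print(f"RMSE ratio: {ratio:.2f}")
```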
The differential accuracy of the CDF-HAC-MA3 nowcasts in our
real-time out-of-sample nowcasting exercise is not as large as
what we find based on in-sample evidence. This result
suggests to us that the advantage provided by allowing for measurement
error in the CFNAI-MA3 in nowcasting real GDP growth is somewhat limited
given our current nowcasting framework. In fact, when we correlate the
forecast errors from our real-time exercise with the real-time
contribution of net exports and government spending to real GDP growth,
we obtain a correlation coefficient of 0.4. In many ways, however, we
are not making full use of the flexibility provided by the Brauning and
Koopman (2014) methodology. In future research, we plan to explore ways
in which to improve on our results--by incorporating additional factors,
by adding international trade and government spending indicators to the
current list of 85, or by employing estimation methods that allow for
more informed dynamics and/or parameter shrinkage.
Conclusion
By building on the existing framework of the CFNAI with Brauning
and Koopman's (2014) method of collapsed dynamic factor analysis,
we are able to readily extend and improve our existing methodology.
Given the resulting alternative CFNAI's superior past performance
in predicting current quarter U.S. real GDP growth and very high
correlation with NBER recessions, it may very well be a better method to
both nowcast real GDP growth and assess the state of U.S. business
cycles than the current CFNAI. Brauning and Koopman's methodology
also allows us to address several of the persistent criticisms of the
CFNAI, including the problem of overweighting certain sectors of the
U.S. economy and the important omissions of certain data series (for
example, those concerning international trade and government spending)
in nowcasting real GDP growth.
Another benefit of Brauning and Koopman's (2014) methodology
is that it makes it possible to produce both current quarter predictions
of real GDP growth and an estimate of the trend rate of real GDP growth
with each new index release. As of March 20, 2014, we estimate that real
GDP in the first quarter of 2014 increased at an annual rate of 1.8
percent, which is 0.5 percentage points below our current estimate of
2.3 percent for trend real GDP growth. While we are still in the process
of investigating the best nowcasting model with which to achieve both of
these goals, our work so far suggests that this is a promising direction
for future research with the CFNAI. Our analysis here also has
implications for the current interpretation of the index. Although the
alternative CFNAI fell into negative territory in early 2014, it
suggests that the weakness in the household sector (as well as its drag
on U.S. economic activity) is less pervasive than the traditional CFNAI
had indicated and that the impact of the recent weakness in the
production and income indicators on the index is likely to be
transitory.
REFERENCES
Baker, S. G., and B. S. Kramer, 2007, "Peirce, Youden, and
receiver operating characteristic curves," American Statistician,
Vol. 61, No. 4, November, pp. 343-346.
Berge, T. J., and O. Jorda, 2011, "Evaluating the
classification of economic activity into expansions and
recessions," American Economic Journal: Macroeconomics, Vol. 3, No.
2, April, pp. 246-277.
Brauning, F., and S. J. Koopman, 2014, "Forecasting
macroeconomic variables using collapsed dynamic factor analysis,"
International Journal of Forecasting, forthcoming.
Brave, S. A., 2008, "Economic trends and the Chicago Fed
National Activity Index," Chicago Fed Letter, Federal Reserve Bank
of Chicago, No. 250, May.
Brave, S. A., and R. A. Butters, 2013, "Estimating the trend
rate of economic growth using the CFNAI," Chicago Fed Letter,
Federal Reserve Bank of Chicago, No. 311, June.
--, 2012a, "Detecting early signs of financial
instability," Chicago Fed Letter, Federal Reserve Bank of Chicago,
No. 305, December.
--, 2012b, "Diagnosing the financial system: Financial
conditions and financial stress," International Journal of Central
Banking, Vol. 8, No. 2, June, pp. 191-239.
--, 2010, "Chicago Fed National Activity Index turns
ten--Analyzing its first decade of performance," Chicago Fed
Letter, Federal Reserve Bank of Chicago, No. 273, April.
DeLong, E. R., D. M. DeLong, and D. L. Clarke-Pearson, 1988,
"Comparing the areas under two or more correlated receiver
operating characteristic curves: A nonparametric approach,"
Biometrics, Vol. 44, No. 3, September, pp. 837-845.
Doz, C., D. Giannone, and L. Reichlin, 2012, "A quasi-maximum
likelihood approach for large, approximate dynamic factor models,"
Review of Economics and Statistics, Vol. 94, No. 4, November, pp.
1014-1024.
Durbin, J., and S. J. Koopman, 2012, Time Series Analysis by State
Space Methods, 2nd ed., Oxford Statistical Science Series, Vol. 38,
Oxford, UK: Oxford University Press.
Evans, C. L., C. T. Liu, and G. Pham-Kanter, 2002, "The 2001
recession and the Chicago Fed National Activity Index: Identifying
business cycle turning points," Economic Perspectives, Federal
Reserve Bank of Chicago, Third Quarter, pp. 26-43, available at
www.chicagofed.org/digital_assets/publications/
economic_perspectives/2002/3qepart2.pdf.
Fisher, J. D. M., 2000, "Forecasting inflation with a lot of
data," Chicago Fed Letter, Federal Reserve Bank of Chicago, No.
151, March.
Giannone, D., L. Reichlin, and D. Small, 2008, "Nowcasting:
The real-time informational content of macroeconomic data," Journal
of Monetary Economics, Vol. 55, No. 4, May, pp. 665-676.
Harvey, A. C., 1989, Forecasting, Structural Time Series Models and
the Kalman Filter, Cambridge, UK: Cambridge University Press.
Jungbacker, B., S. J. Koopman, and M. van der Wel, 2011,
"Maximum likelihood estimation for dynamic factor models with
missing data," Journal of Economic Dynamics and Control, Vol. 35,
No. 8, August, pp. 1358-1368.
Shumway, R. H., and D. S. Stoffer, 1982, "An approach to time
series smoothing and forecasting using the EM algorithm," Journal
of Time Series Analysis, Vol. 3, No. 4, July, pp. 253-264.
Stock, J. H., and M. W. Watson, 2002, "Macroeconomic
forecasting using diffusion indexes," Journal of Business &
Economic Statistics, Vol. 20, No. 2, April, pp. 147-162.
--, 1999, "Forecasting inflation," Journal of Monetary
Economics, Vol. 44, No. 2, October, pp. 293-335.
--, 1998, "Median unbiased estimation of coefficient variance
in a time-varying parameter model," Journal of the American
Statistical Association, Vol. 93, No. 441, March, pp. 349-358.
Watson, M. W., and R. F. Engle, 1983, "Alternative algorithms
for the estimation of dynamic factor, mimic and varying coefficient
regression models," Journal of Econometrics, Vol. 23, No. 3,
December, pp. 385-400.
NOTES
(1) Additional background information on the CFNAI and its method
of construction is available at www.chicagofed.org/digital_assets/
publications/cfnai/background/cfnai_background.pdf. A complete list of
the 85 indicators, their associated categories, and their respective
weights in the overall index is available at www.chicagofed.org/
digital_assets/publications/cfnai/background/cfnai_indicators_list.pdf.
(2) See the next section and box 1 (p. 21) for details on principal
components analysis (PCA) and on how the CFNAI is the first principal
component of the 85 data series (that is, the single component common to
each data series that explains the most variation across all 85).
(3) The terms nowcast and nowcasting are derived from combining the
words now and forecasting. Nowcasting techniques are commonly used in
economics because they permit economists to estimate the present (and
recent past) values of standard measures of the economy (such as real
gross domestic product, or GDP), which are often published only after a
long delay.
(4) See box 3 (p. 24) for more details on CDF analysis.
(5) See box 1 (p. 21) for further details on the factor model
representation of the CFNAI.
(6) Jungbacker, Koopman, and van der Wel (2011) also describe a
full maximum likelihood estimator of the collapsed dynamic factor model.
In order to make direct comparisons across methodologies, we only
consider their EM algorithm estimation method in our article.
(7) Technically, the Stock and Watson (2002) methodology can also
incorporate mixed-frequency data. However, because of that
methodology's lack of dynamics, this takes place as an additional
transformation of the data in its algorithm.
(8) The trend component captures long-run factors, such as
potential growth in productivity, capital, and labor. In contrast, the
cyclical component captures medium-run factors driving economic growth
and is generally associated with the business cycle--the periodic
fluctuations in economic activity around its long-term historical trend.
(9) Interestingly, Brave (2008) found similar results for the same
two categories (namely, the employment, unemployment, and hours category
and personal consumption and housing category) when looking at the
impact of slow-moving changes in the average values of their data series
over time.
(10) See Brave and Butters (2012a, 2012b) for examples with
financial data.
(11) In other words, the difference between the CFNAI and
alternative CFNAI is the sum of the difference between the CFNAI and
DF-HAC index and the difference between the DF-HAC index and the CDF-HAC
index.
(12) See, for instance, Brave and Butters (2010). Other examples
using factor models to forecast GDP growth are Stock and Watson (2002)
and Giannone, Reichlin, and Small (2008).
(13) This data set is available at
www.philadelphiafed.org/research-and-data/real-time-center/real-time-data/.
(14) In making these comparisons, we eliminated quarters where GDP
is subject to annual and benchmark revisions.
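Note 2 describes the CFNAI as the first principal component of the 85 data series. The following is a minimal sketch of that calculation on simulated data, not the procedure actually used to produce the index. The function name, the toy panel, and the choice to rescale the index to unit standard deviation are illustrative assumptions.

```python
import numpy as np

def first_principal_component(panel):
    """Return (index, weights, share) for a T x N panel of indicators."""
    x = np.asarray(panel, dtype=float)
    # Standardize each series to mean zero and unit variance.
    z = (x - x.mean(axis=0)) / x.std(axis=0)
    # Eigen-decomposition of the N x N correlation matrix
    # (eigh returns eigenvalues in ascending order).
    corr = (z.T @ z) / z.shape[0]
    eigvals, eigvecs = np.linalg.eigh(corr)
    weights = eigvecs[:, -1]              # eigenvector of the largest eigenvalue
    if weights.sum() < 0:                 # the sign is arbitrary; fix it
        weights = -weights
    index = z @ weights                   # weighted average of the indicators
    index = index / index.std()           # assumption: rescale to unit std. dev.
    share = eigvals[-1] / eigvals.sum()   # fraction of total variance explained
    return index, weights, share

# Simulated panel (hypothetical): one common factor plus idiosyncratic noise.
rng = np.random.default_rng(0)
common = rng.standard_normal(200)
panel = np.outer(common, [1.0, 0.8, 0.5, 0.2]) + 0.5 * rng.standard_normal((200, 4))
index, weights, share = first_principal_component(panel)
```

In this sketch the weights play the role described in the introduction: series that move more closely with the common component receive larger weights in the index.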
TABLE 1
Fraction of data variance explained by the index
                                      CFNAI   DF     DF-HC   DF-HAC   CDF-HAC
Total                                 0.29    0.28   0.27    0.26     0.20
Production and income                 0.39    0.38   0.46    0.50     0.43
Employment, unemployment, and hours   0.36    0.37   0.33    0.29     0.32
Personal consumption and housing      0.08    0.08   0.05    0.03     0.04
Sales, orders, and inventories        0.17    0.17   0.16    0.18     0.21
Notes: The table displays the fraction of the overall variance of the
85 underlying indicators in the Chicago Fed National Activity Index
(CFNAI) that is explained by the CFNAI and each of the four
dynamic-factor-based indexes over the period March 1967 through
February 2014. In addition, it decomposes this fraction into the
share explained by each of the four broad categories of indicators
listed here. The four dynamic-factor-based indexes--DF, DF-HC,
DF-HAC, and CDF-HAC--are derived from methodologies based on
Giannone, Reichlin, and Small (2008), Doz, Giannone, and Reichlin
(2012), Jungbacker, Koopman, and van der Wel (2011), and Brauning and
Koopman (2014), respectively (see the text for further details).
Source: Authors' calculations based on data from Haver Analytics.
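The quantities in table 1 can be illustrated with a short sketch. It assumes that the total is the average R-squared from regressing each standardized indicator on the index, and that a category's share is its indicators' portion of the summed R-squared. The article's exact calculation may differ, and the function name, category labels, and simulated data below are hypothetical.

```python
import numpy as np

def variance_explained(panel, index, categories):
    """Total fraction of variance explained by the index, plus category shares."""
    x = np.asarray(panel, dtype=float)
    f = np.asarray(index, dtype=float)
    z = (x - x.mean(axis=0)) / x.std(axis=0)   # standardized indicators
    f = (f - f.mean()) / f.std()               # standardized index
    # For standardized data, the R^2 from regressing a series on the
    # index equals their squared correlation.
    r2 = (z * f[:, None]).mean(axis=0) ** 2
    total = float(r2.mean())
    labels = np.asarray(categories)
    shares = {str(c): float(r2[labels == c].sum() / r2.sum())
              for c in np.unique(labels)}
    return total, shares

# Simulated data (hypothetical): four series in two made-up categories.
rng = np.random.default_rng(1)
factor = rng.standard_normal(300)
panel = np.outer(factor, [1.0, 0.9, 0.3, 0.2]) + rng.standard_normal((300, 4))
cats = ["production", "production", "housing", "housing"]
total, shares = variance_explained(panel, factor, cats)
```

Under this definition the category shares sum to one by construction, as they do (up to rounding) in each column of table 1.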
TABLE 2
AUROC for NBER recessions and RMSE ratios
for current quarter GDP growth predictions
                      In-sample     Out-of-sample
           AUROC      RMSE ratio    RMSE ratio
DF         0.95       0.89          0.98
DF-HC      0.95       0.93          0.99
DF-HAC     0.95       0.93          0.85
CDF-HAC    0.98       0.58          0.81
Notes: The table displays areas under the receiver operating
characteristic (ROC) curve (AUROC) and root mean squared error (RMSE)
ratios for current quarter real gross domestic product (GDP) growth
forecasts based on the three-month moving averages of the four
dynamic-factor-based alternatives to the Chicago Fed National
Activity Index (CFNAI). The four dynamic-factor-based indexes--DF,
DF-HC, DF-HAC, and CDF-HAC--are derived from methodologies based on
Giannone, Reichlin, and Small (2008), Doz, Giannone, and Reichlin
(2012), Jungbacker, Koopman, and van der Wel (2011), and Brauning and
Koopman (2014), respectively (see the text for further details). The
closer the AUROC value is to 1, the more accurate a
dynamic-factor-based index is in signaling U.S. recessions and
expansions as determined by the National Bureau of Economic Research
(NBER). An RMSE ratio of less than 1 indicates that the forecast based
on a dynamic-factor-based index is more accurate than a similar forecast
based on the traditional CFNAI using the nowcasting models described in
Brave and Butters (2013) for in-sample comparisons over the period
March 1967 through February 2014 and Brave and Butters (2010) for
out-of-sample comparisons over the period December 2003 through April
2013 (see the text for further details).
Sources: Authors' calculations based on data from the Federal Reserve
Bank of Philadelphia, Real-Time Data Set for Macroeconomists; and
Haver Analytics.
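The two statistics reported in table 2 can be sketched as follows. The AUROC is computed here in its rank (Mann-Whitney) form, which is one standard way to obtain the area under the ROC curve and is not necessarily the procedure used in the article. All function names and numbers below are made-up illustrations, not the article's data.

```python
from math import sqrt

def auroc(scores, labels):
    """Probability that a randomly chosen recession month has a higher
    score than a randomly chosen expansion month (ties count one-half)."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

def rmse(forecasts, actuals):
    """Root mean squared forecast error."""
    return sqrt(sum((f - a) ** 2 for f, a in zip(forecasts, actuals)) / len(actuals))

def rmse_ratio(candidate, benchmark, actuals):
    """Values below 1 mean the candidate forecast beats the benchmark."""
    return rmse(candidate, actuals) / rmse(benchmark, actuals)

# Hypothetical three-month moving averages of an index and recession flags.
index_ma3 = [0.3, -1.0, -0.9, -1.5, -0.4, 0.2]
recession = [0, 0, 1, 1, 0, 0]
# Low index values signal recession, so the score is the negative of the index.
area = auroc([-v for v in index_ma3], recession)     # 0.875 for this toy data

# Hypothetical GDP growth outcomes and two sets of nowcasts.
actual = [2.0, 1.0, 3.0]
ratio = rmse_ratio([2.5, 1.0, 2.5], [3.0, 1.0, 2.0], actual)   # 0.5 here
```

An AUROC of 0.5 corresponds to a signal no better than chance, which is why values close to 1 in table 2 indicate accurate classification of recessions and expansions.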