Determining the maximum number of uncorrelated strategies in a global portfolio.
Boon, Ling-Ni ; Ielpo, Florian
In a portfolio composed of different assets from the same class, or
numerous asset classes, the drivers of return variation may appear
elusive. Although there are idiosyncratic factors influencing variation
of the returns, there exist as well common factors that account for the
portfolio's collective variation. Uncovering and decomposing the
importance of these common drivers assist in cross-asset strategy
building in portfolio management--a topic of interest to investment
managers. One method to formalize the task of determining the maximum
number of uncorrelated strategies to include in a global portfolio is
the selection of the number of factors in a large-dimensional factor
model.
Factor models have been widely studied, mainly in macroeconomics
and asset pricing. In macroeconomics, they are used to determine the
factors that influence measures of the economy, or in policy analyses.
For example, Bernanke et al. [2005] introduce the factor-augmented
vector autoregressive (FAVAR) model to analyze monetary policy. Forni et
al. [2003] study the structure of the macroeconomy, while Favero et al.
[2005] compare static and dynamic principal components in estimating
macroeconomic variables. In consumer demand theory, Lewbel [1991]
applied factor models to budget share data to reveal information about
the demand system. In finance, Chamberlain and Rothschild [1983] extend
arbitrage pricing theory with factor models, which have since been used
not only to decompose risk and return into explicable and inexplicable
components but also to describe the returns' covariance structure
for prediction, and to construct portfolios with desired
characteristics, among others.
Factor models are categorized by three types of factors. There are
i) macroeconomic factors (observable economic or financial time series),
ii) fundamental factors (observable asset characteristics), and iii)
statistical factors (unobservable asset characteristics). An example of
a well-known single-factor model is the capital asset pricing model (CAPM), which describes the relationship between risk and return. The
observation that stocks with small capitalization and high
book-to-market ratio tend to perform better led Fama and French to
refine CAPM as a three-factor model. In this article, the focus is on
statistical factor models, and the factors are estimated using principal
component analysis (PCA).
PCA is frequently applied in financial research, especially in
studies on systemic risk. For instance, Kritzman et al. [2011] use
principal components as an implied measure of systemic risk. This
measure is extended in Kinlaw et al. [2012] to include centrality, a
feature encompassing an entity's vulnerability to failure, its
connectivity to other entities, and the risk of the entities to which it
is connected. More recently, Billio et al. [2012] use PCA to investigate
the connectedness of hedge funds, mutual funds, insurance companies, and
banks. The objective of these works is to capture changes in correlation
and causality among financial institutions, whereas ours is to
investigate correlation between the returns of different assets. As
another example, PCA is applied by Pukthuanthong and Roll [2009]
in their measure of global market integration. In all instances, PCA, as
a method to decompose returns into factors, proves to be an invaluable
tool, as its use by financial institutions such as the U.S. Office of
Financial Research makes clear.
The operational value of exposing common factors driving variances
in these markets is immense and is of fundamental interest to any asset
and risk manager. In addition to theoretical interest regarding the
suitability of methodologies on empirical data, portfolio managers would
be better able to gauge the potential risk factors from which they could
profit or lose. Given that this factor analysis comes with a measure of
the share of the price action that is due to each factor, it could also
help investment managers decide where to put their efforts to build
their market forecasts. For instance, because the level factor explains
more than 80% of the interest rate curve's movements, devoting a large
part of the forecasting effort to the level of rates makes more
sense than investing time in anticipating the slope's movements,
which explain less than 10% of the rates' variations. Similarly, the
risk appetite factor in the Global Macro Hedge Fund accounts for up to
45% of the data's variance and hence deserves heightened attention.
This is in contrast to Asian market movements, which explain only about
4% of variances; thus tracking them yields only a tenth of the insights
of tracking investors' sentiment.
Knowledge of a portfolio's factor structure is also
advantageous in risk management. In the event that the portfolio's
variance is explained by few factors that surround a theme, then the
portfolio manager should be wary about the high susceptibility of the
portfolio's return to events on that theme. Our factor analysis and
the dynamic measurement of the time-varying correlation between factors
are of high importance, as a sudden increase in the correlations can
spontaneously increase the risk of a seemingly well-diversified
portfolio. For all these reasons, measuring and tracking those factors
are essential steps in the construction of any portfolio.
Numerous methods to determine the number of factors in the case of
unobserved factors have been proposed. Arguably the most popular is the
information criteria (IC) method. IC is based on the idea that an
(r+1)-factor model has to fit at least as well as an r-factor model but is
less efficient. The well-known Akaike Information Criterion (AIC) and
Bayesian Information Criterion (BIC) cannot be directly adopted when we
have two-dimensional data (i.e., N as the number of assets, T as the
time span) because they are functions of N or T alone and hence fail to
consistently estimate the number of factors when the factors are
unobserved. Bai and Ng [2002] (BN) propose a set of six penalty
functions to replace the ones in AIC and BIC and establish conditions
ensuring consistency of their method. Despite its wide empirical
adoption, BN's criterion often does not converge, as demonstrated
in Forni et al. [2007]. Alessi et al. [2009] (ABC) refine BN's
criterion and demonstrate that their criterion has superior performance.
Alternate approaches include analyzing the factor loadings (Connor and
Korajczyk [1993]), tests on the matrix of the covariance eigenvalues of
returns (Kapetanios [2005], Onatski [2009], Onatski [2010]), numerous
tests on the rank of the covariance matrix (Lewbel [1991], Forni and
Reichlin [1998]), and a graphical method that is rarely adopted due to
its lack of theoretical basis (Donald [1997]).
By comparison of the criteria in a Monte Carlo study, ABC's is
deemed to be the overall best in terms of accuracy and precision.
Application of ABC's criterion to five datasets yields the
following number of factors: five for the Global Macro Hedge Fund
(GMHF), three for U.S. Treasury bond rates (USTB), two for commodity
prices, and one each for U.S. credit spreads (USCS) and currencies.
Total variation explained by the factors varies: 74% for GMHF, 94% for
USTB, 49% for USCS, 27% for commodity prices, and 59% for currencies.
Economic interpretation is attached to the factors according to correlation between the factors and the assets whose variances they
describe. The five factors for GMHF are associated with risk appetite,
commodities, the U.S. dollar, the Japanese market, and Asian stock
markets. Those for USTB fit the description of factors found in previous
research and are labelled as level, slope, and curvature. USCS's
sole factor corresponds to midrange risky assets, and that for
currencies is labelled as the carry factor. The pair of factors for
commodity prices is linked to energy and metal.
Stability of the number of factors over time is investigated by
testing the significance of correlation between factors. Even though by
definition of principal components the factors are in theory orthogonal
to each other, the estimated factors may not be so. Instantaneous
correlation between these estimated factors can be uncovered by
considering a rolling window over the time dimension. Correlation
between the factors over rolling windows does not yield an overarching
conclusion for all datasets regarding the hypothesis that during periods
of economic downturn, fewer factors are required to explain variances in
the data because of increased cross-market correlation. For traditional
asset classes, such as USTB, it is evident that correlation between
factors increased. Yet for commodity prices, it is less clear. When all
asset classes are considered together in the GMHF dataset, the
different factor interactions that are observed within each asset class
manifest in a more complex manner and suggest that cross-asset
correlation may be a leading indicator of economic cycles, because a
spike in correlation is observed prior to the 2007-2010 financial
crisis.
This article is organized as follows. We first present the factor
model and briefly explain the numerous criteria proposed to determine
the number of factors. After a Monte Carlo study for selected methods,
application of the methods to datasets relevant to the investment
management industry is carried out. We then interpret the results and
analyze their stability over time.
METHODOLOGY
The Factor Model
In the analysis of a large pool of assets' returns, a basic
mean--variance analysis requires estimation of a quadratically
increasing number of expected returns, standard deviations, and
correlation coefficients. (1) To overcome the difficulty of having to
estimate the statistics using long historical data to attain low
standard deviations, a linear factor model that relates the returns of
the portfolio to a finite number of factors can be introduced. An
r-factor approximate factor model is specified as
$$X_{it} = \lambda_i' F_t + e_{it}, \qquad i = 1, \ldots, N, \quad t = 1, \ldots, T \qquad (1)$$
where $X_{it}$ is the observed data for the $i$th
cross-section at time $t$, $F_t$ is the $r \times 1$ vector of common
factors, $\lambda_i$ is the $r \times 1$ vector of factor loadings, $e_{it}$ is
the idiosyncratic component, $\lambda_i' F_t$ is the
common component of $X_{it}$, and $'$ denotes the complex conjugate transpose of the matrix. $F_t$ and $e_t = (e_{1t},
e_{2t}, \ldots, e_{Nt})$ are assumed to be uncorrelated, and the
matrix composed of $\mathrm{cov}(e_i, e_j)$ is not necessarily diagonal
(i.e., it allows correlation between the asset returns' idiosyncratic
components), but the largest eigenvalue of the idiosyncratic
component's covariance matrix is bounded to limit the degree of
correlation. Hence, $X_t = (X_{1t}, X_{2t}, \ldots,
X_{Nt})$ is explained by both common factors and specific components.
Principal Component Analysis
Because $F_t$ is unobservable in our setup, we estimate its
value using principal component analysis. PCA relies on decomposition of
the correlation matrix of $X_t$ into its eigenvalues and
eigenvectors as $\mathrm{var}[X_t] = \Sigma = Q \Theta Q'$. $Q$ is the matrix of
eigenvectors and $\Theta$ the diagonal matrix of eigenvalues of $\mathrm{var}[X_t]$, with
$\theta_1, \ldots, \theta_N$ along its diagonal.
[ILLUSTRATION OMITTED]
For $i = 1, \ldots, N$,
$\theta_i / \sum_{k=1}^{N} \theta_k$ can be
interpreted as the amount of variation explained by the $i$th
factor. The eigenvector corresponding to the largest eigenvalue is a
scalar transformation of the first principal component, or factor. The
eigenvector corresponding to the second-largest eigenvalue is in the
same direction as the second factor, and so on. (2) Thus, estimating
$F_t$ involves computing the eigenvectors of the correlation matrix
of $X_t$.
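As a minimal sketch of the estimation step just described, the following Python snippet computes the eigendecomposition of a sample correlation matrix, the share of variation attributed to each eigenvalue, and the corresponding principal-component scores. The function name `pca_factors`, the simulated data, and the use of NumPy are illustrative assumptions and are not taken from the article.

```python
import numpy as np

def pca_factors(X, r):
    """Estimate r statistical factors from a T x N data matrix X via PCA."""
    # Standardize each column so that Z'Z / (T - 1) equals the correlation matrix of X.
    Z = (X - X.mean(axis=0)) / X.std(axis=0, ddof=1)
    corr = np.corrcoef(X, rowvar=False)      # N x N correlation matrix
    theta, Q = np.linalg.eigh(corr)          # eigh returns eigenvalues in ascending order
    order = np.argsort(theta)[::-1]          # re-sort so the largest eigenvalue comes first
    theta, Q = theta[order], Q[:, order]
    share = theta / theta.sum()              # theta_i / sum_k theta_k
    factors = Z @ Q[:, :r]                   # T x r principal-component scores
    return factors, Q[:, :r], share

# Simulated data (an assumption for illustration): one common driver plus noise.
rng = np.random.default_rng(0)
T, N = 500, 10
common = rng.standard_normal((T, 1))
X = common @ rng.uniform(0.5, 1.5, (1, N)) + 0.5 * rng.standard_normal((T, N))

factors, loadings, share = pca_factors(X, r=2)
print(np.round(share[:3], 3))                # share of variation for the first three eigenvalues
```

Because the scores are projections onto orthogonal eigenvectors of the full-sample correlation matrix, they are uncorrelated over the full sample; correlation can still appear within shorter subsamples.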
Methods Implemented to Determine the Number of Factors
The next natural question in the setup of the factor model is the
number of factors, r, to include. For the CAPM, r = 1 (i.e., the
well-known beta), whereas in the Fama--French model, r = 3 (i.e., the
same as in CAPM plus capitalization and book-to-market ratio). Unlike
the CAPM and the Fama--French model, we do not pre-specify r. Instead, r
is determined statistically via tests that rely on different properties
of the factor models. The first, introduced by Bai and Ng [2002], is an
information criterion akin to the well-known Akaike or Bayesian criterion
in regression analysis. The number of factors corresponds to the r-factor
model with the lowest value of a loss function. The loss function is the
sum of the squared residuals from the r-factor model plus a penalty
function that is increasing in r. (3) The second criterion is by Alessi
et al. [2009] (ABC), which applies Bai and Ng [2002] repeatedly to refine
the information criterion. We also implemented the tests introduced by
Connor and Korajczyk [1993] that rely on comparing the squared
residuals of ordinary least squares models with various numbers of
factors via a statistical test. The fourth and final criterion that we
implement is by Onatski [2009]. This is a statistical test based on the
eigenvalues of an r-factor model being bounded, whereas that of its
(r+1)st factor is unbounded.
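To make the information-criterion idea concrete, here is a hedged sketch of one penalty from the Bai and Ng [2002] family, commonly written as IC_p2(k) = ln V(k) + k ((N+T)/(NT)) ln min(N, T), where V(k) is the mean squared residual of the k-factor PCA fit. The article does not say which of the six penalties it uses, so the choice of IC_p2, the function name `bai_ng_icp2`, and the simulated data are assumptions for illustration only. This variant takes the logarithm of the residual variance rather than the raw sum of squares.

```python
import numpy as np

def bai_ng_icp2(X, r_max):
    """Return the k in 0..r_max minimizing an IC_p2-type criterion, plus all IC values."""
    T, N = X.shape
    Xc = X - X.mean(axis=0)                       # demean each series
    s = np.linalg.svd(Xc, compute_uv=False)       # singular values, largest first
    total = np.sum(s ** 2)                        # total sum of squares
    penalty = (N + T) / (N * T) * np.log(min(N, T))
    ic = []
    for k in range(r_max + 1):
        # V(k): mean squared residual left after removing the first k principal components
        v_k = (total - np.sum(s[:k] ** 2)) / (N * T)
        ic.append(np.log(v_k) + k * penalty)
    ic = np.array(ic)
    return int(np.argmin(ic)), ic

# Simulated panel with three strong factors (an assumption for illustration).
rng = np.random.default_rng(1)
T, N, r_true = 200, 50, 3
F = rng.standard_normal((T, r_true))
Lam = rng.standard_normal((N, r_true))
X = F @ Lam.T + rng.standard_normal((T, N))

r_hat, ic_values = bai_ng_icp2(X, r_max=8)
print(r_hat)
```

The ABC refinement, which repeats this type of evaluation over subsamples and rescaled penalties, is not reproduced here.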
All four methods are compared in a Monte Carlo study. The criteria
are evaluated based on their accuracy on simulated data. Seven different
sets of data are used, each with unique statistical
characteristics. (4) This approach allows us to assess the conditions
under which these criteria would perform well on empirical data. The
Monte Carlo study reveals that the ABC criterion is superior in accuracy
and precision across data with various characteristics, and thus we will
apply it to empirical data in the next section.
EMPIRICAL RESULTS
In this section, ABC's criterion is applied to five datasets
covering major asset classes such as equities, U.S. Treasury bonds,
credit spreads, currencies, and commodities. The estimated number of
factors is first identified and labelled. Then the stability of factors
over time is analyzed by testing the significance of correlation between
them and re-estimating the number of factors after splitting each
dataset along the time dimension according to the economic cycle.
Global Results
The Global Macro Hedge Fund dataset consists of major indexes,
government bonds, currency exchange rates, and futures of currency
exchange rates and oil, from January 1999 to March 2012. The data is of
weekly frequency because of its cross-boundary nature, which mitigates
the difficulty of reconciling market closing times worldwide. Next is a
dataset of daily closing rates of U.S. Treasury bonds (USTB) with
maturity of three months to thirty
years, from January 1997 to March 2012. The U.S. credit spreads (USCS)
data consists of daily closing rates categorized by industry (e.g.,
financial corporation, insurance, and energy), financial rating (e.g.,
AAA, AA, BBB), and duration (e.g., one to three years, three to five
years), from January 1997 to March 2012. Daily closing prices of
commodities such as gold, aluminum, natural gas, corn, wheat, the
S&P GSCI (Goldman Sachs Commodity Index), and so on from October
1998 to March 2012 are compiled in the dataset of commodity prices. The
fifth dataset, called Currencies, includes daily prices of the euro,
pound, Swiss franc, Japanese yen, Canadian dollar, Australian dollar,
New Zealand dollar, Norwegian krone, and Swedish krona in terms of U.S.
dollars from January 1999 to December 2012. BN, ABC, and CK criteria are
applied to log prices or log spread variations for the GMHF, USCS,
Commodity Prices, and Currencies datasets, whereas for USTB, variation
of rates is used because with two-year rates close to zero or negative,
their percentage variations are extreme. Composition of each dataset is
presented in Exhibit 1.
EXHIBIT 1
Composition of All Datasets and Results Summary

Range: January 1999 to March 2012
List of Assets in GMHF Dataset
  Stock Indexes (13 assets): Dow Jones, Nasdaq 100, Eurostoxx, FTSE,
    CAC 40, DAX, IBEX, SMI, Nikkei, TOPIX, Hang Seng, MSCI Singapore
    Free Index (Singapore).
  Bonds (12 assets): US 30Y, US 10Y, US 5Y, US 2Y, CAN 10Y, OK 10Y,
    GE 5Y, GE 2Y, UK 10Y, JP 10Y, OZ 10Y, ED 4 (Euro-Dollar bond).
  Currencies (29 assets): EUR-USD, USD-JPY, GBP-JPY, USD-CHF, USD-CAD,
    AUD-USD, EUR-GBP, EUR-CHF, EUR-SEK, EUR-NOK, EUR-PLN, EUR-AUD,
    EUR-CAD, CHF-JPY, GBP-JPY, AUD-JPY, AUD-NZD, AUD-CAD, USD-BRL,
    USD-SGD, USD-KRW, USD-TWD, USD-CNY, CAD-JPY, EUR-JPY, NOK-SEK,
    NZD-USD, USD-ZAR, USD.
  Futures (5 assets): CHF futures, JPY futures, AUD futures, CAD
    futures, GBP futures.
  Commodities (2 assets): Oil, S&P GSCI.
  Others (1 asset): Mini SP 500.

Range: January 1997 to March 2012
List of Assets in U.S. Treasury Bond Rates Dataset
  (Maturity) 3M, 6M, 1Y, 2Y, 3Y, 4Y, 5Y, 7Y, 8Y, 9Y, 10Y, 15Y, 20Y,
    25Y, 30Y.

Range: January 1997 to March 2012
List of Assets in U.S. Credit Spreads Dataset
  Master index, financial corporates, banks, insurance, industrials,
    capital goods, energy, utilities, consumer cyclicals, consumer
    noncyclicals, healthcare, AAA, AA, A, BBB, 1-3 years, 3-5 years,
    5-7 years, 7-10 years.

Range: October 1998 to March 2012
List of Assets in Commodity Prices Dataset
  Gold, silver, platinum, aluminum, copper, nickel, zinc, lead, WTI,
    Brent, gas-oil, natural gas, heating oil, corn, wheat, coffee,
    sugar, cocoa, cotton, soybean, rice. S&P GSCIs: agriculture,
    energy, industrial metals, precious metals.

Range: January 1999 to December 2012
List of Assets in Currencies Dataset
  Daily price in USD of: EUR, GBP, NOK, SEK, CHF, JPY, AUD, NZD, CAD.

Aside from implementing BN, ABC, and CK criteria on the datasets,
because the Monte Carlo study indicates that the selected criteria
generally have poorer performance when cross-section correlation exists,
the influence of such dependencies on the accuracy of results obtained
is evaluated by fitting a vector autoregressive (VAR) model to remove
dynamic linear dependence, then applying the same criteria to the
residuals. Using the Akaike Information Criterion to determine the
number of lags to include in the VAR model, the USTB and USCS datasets
are fitted with VAR(3), commodity prices with VAR(2), and GMHF and
currencies with VAR(1). BN and ABC criteria provide the same outcome
when the analysis is done on the residuals as is on the returns data.
CK's criterion yields slightly different estimates. Despite its
commendable performance in the Monte Carlo study, BN's criterion
fails to converge on all datasets: $r_{max}$ is always estimated.
CK's estimates do not always coincide with ABC's
estimates, but the latter are taken to be more accurate because of ABC's
better performance in the Monte Carlo study and because they are
invariant to whether linear dependencies exist in the data. Results are
presented in Exhibit 2.
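The prefiltering step described above can be sketched as follows: fit a VAR(p) by ordinary least squares for several candidate lag orders, choose the order with the lowest AIC, and keep the residuals for the factor-number criteria. This is a minimal NumPy illustration under stated assumptions; the helper names (`fit_var`, `select_var_by_aic`, `lag1_autocorr`), the AIC form ln det(Sigma) + 2 p N^2 / T, and the simulated data are not taken from the article, which does not describe its implementation.

```python
import numpy as np

def fit_var(Y, p):
    """OLS fit of a VAR(p) with intercept; returns the (T - p) x N residual matrix."""
    T, N = Y.shape
    # Regressors: a constant plus lags 1..p of every series.
    lags = [Y[p - j:T - j] for j in range(1, p + 1)]
    Z = np.hstack([np.ones((T - p, 1))] + lags)
    target = Y[p:]
    coef, *_ = np.linalg.lstsq(Z, target, rcond=None)
    return target - Z @ coef

def select_var_by_aic(Y, p_max):
    """Pick the lag order in 1..p_max minimizing AIC = ln det(Sigma) + 2 p N^2 / T_eff."""
    T, N = Y.shape
    best = None
    for p in range(1, p_max + 1):
        # Drop early rows so every lag order is estimated on the same sample.
        resid = fit_var(Y[p_max - p:], p)
        t_eff = resid.shape[0]
        sigma = resid.T @ resid / t_eff
        _, logdet = np.linalg.slogdet(sigma)
        aic = logdet + 2.0 * p * N * N / t_eff
        if best is None or aic < best[0]:
            best = (aic, p, resid)
    return best[1], best[2]

def lag1_autocorr(x):
    """Sample correlation between a series and its own first lag."""
    return np.corrcoef(x[1:], x[:-1])[0, 1]

# Simulated VAR(1) data (an assumption for illustration).
rng = np.random.default_rng(4)
T, N = 600, 3
Y = np.zeros((T, N))
for t in range(1, T):
    Y[t] = 0.5 * Y[t - 1] + rng.standard_normal(N)

p_hat, resid = select_var_by_aic(Y, p_max=4)
print(p_hat, resid.shape)
```

The residual matrix `resid` would then take the place of the returns when the factor-number criteria are applied.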
EXHIBIT 2
Summary Table of Results Using Various Criteria

Number of Factors Estimated by Method

Dataset               Bai and Ng (BN)             Connor and Korajczyk (CK) *   Alessi et al. (ABC)
Global Macro          r_max is always estimated   12 (5)                        5
U.S. Treasury         r_max is always estimated   4 (5)                         3
U.S. Credit Spreads   r_max is always estimated   1 (2)                         1
Commodity Prices      r_max is always estimated   3 (6)                         2
Currencies            r_max is always estimated   1 (2)                         1

In Exhibit 7, the number of factors estimated and the proportion of
variance explained by each factor, i.e., the ratio of the ith
eigenvalue to the sum of all eigenvalues of the covariance matrix of
the returns, are presented. The proportion of variances explained by the
estimated number of factors suggests the concentration of correlation.
An asset class or portfolio for which correlations across assets'
variations are high should have a high percentage explained. Using
ABC's criterion, the GMHF is estimated to have five factors,
which collectively explain about 74% of the variances in the dataset.
USTB has three factors that explain 94% of the variances. Commodity
prices are estimated to have two factors. Together, they explain 27% of
the variances, the lowest among the datasets considered.
Commodities' low concentration of correlation is consistent with
the findings of Gorton and Rouwenhorst [2004] on the asset class's
diversification potential and becomes the rationale for investors to
increase portfolio allocation to commodity assets (Daskalaki and
Skiadopoulos [2011]). Next, one factor each is estimated for U.S. credit
spreads and currencies, accounting for around 49% and 59% of the
variances, respectively.
EXHIBIT 7
Results Summary--All Datasets

Global Macro Hedge Fund
Factor           1                 2             3             4                 5
Label            Global Equities   Commodities   U.S. dollar   Japanese Market   Asian Stock Markets
Eigenvalue       0.135             0.041         0.022         0.015             0.014
Proportion (%)   45                13            7             4.8               4.2
Cumulative (%)   45                58            65            69.8              74

US Treasury Bond Rates
Factor           1        2        3
Label            Level    Slope    Curvature
Eigenvalue       12.305   1.469    0.676
Proportion (%)   80       10       4
Cumulative (%)   80       90       94

US Credit Spreads
Factor           1
Label            Mid-range risky assets
Eigenvalue       0.895
Proportion (%)   49

Commodity Prices
Factor           1        2
Label            Energy   Metals
Eigenvalue       0.206    0.181
Proportion (%)   15       13
Cumulative (%)   15       27

Currencies
Factor           1
Label            Carry factor
Eigenvalue       0.11
Proportion (%)   59

After obtaining the number of factors, the sensitivity of each
asset's return to the factor (i.e., its factor loading) is
investigated and presented in bar charts. This notion of sensitivity can
be extended to that of a portfolio, as it is merely the sum of the
sensitivities of the corresponding assets in the portfolio. Moreover,
portfolios that are unsusceptible to a particular factor can be
constructed by selecting assets such that their weighted sum (i.e.,
loadings treated as weights) is zero. Factor loadings also help in
identifying the factor, that is, in labelling the latent, hypothetical
factors according to their relationship with the assets. The absolute
correlations of the factors with the assets' returns are first sorted
in descending order, and assets are progressively added to the
selection, starting with those with the largest absolute correlation,
until the selection achieves an R-squared of at least 95% when used as
explanatory variables in a simple regression model for
the asset returns. These correlations are presented as bar plots in
Exhibits 3 through 6.
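The labelling procedure above leaves some room for interpretation. The sketch below takes one possible reading, which is an assumption rather than the authors' stated method: assets are ranked by absolute correlation with an estimated factor and added one at a time until an OLS regression of the factor on the selected assets reaches an R-squared of at least 95%. The helper names (`r_squared`, `label_factor`), the asset names, and the simulated returns are hypothetical.

```python
import numpy as np

def r_squared(y, X):
    """R^2 of an OLS regression of y on the columns of X, with an intercept."""
    Z = np.column_stack([np.ones(len(y)), X])
    beta, *_ = np.linalg.lstsq(Z, y, rcond=None)
    resid = y - Z @ beta
    return 1.0 - resid @ resid / np.sum((y - y.mean()) ** 2)

def label_factor(factor, returns, names, target=0.95):
    """Rank assets by |corr| with the factor and add them until R^2 >= target."""
    corr = np.array([np.corrcoef(factor, returns[:, i])[0, 1]
                     for i in range(returns.shape[1])])
    order = np.argsort(-np.abs(corr))            # largest absolute correlation first
    chosen = []
    for i in order:
        chosen.append(i)
        if r_squared(factor, returns[:, chosen]) >= target:
            break
    return [(names[i], corr[i]) for i in chosen]

# Simulated returns (an assumption): four assets load heavily on one driver.
rng = np.random.default_rng(2)
T, N = 400, 8
driver = rng.standard_normal(T)
betas = np.array([1.0, 0.9, 0.8, 0.7, 0.1, 0.1, 0.0, 0.0])
returns = np.outer(driver, betas) + 0.3 * rng.standard_normal((T, N))

# First principal component of the correlation matrix as the estimated factor.
Z = (returns - returns.mean(axis=0)) / returns.std(axis=0, ddof=1)
eigval, eigvec = np.linalg.eigh(np.corrcoef(returns, rowvar=False))
factor = Z @ eigvec[:, -1]                       # eigh sorts ascending, so take the last column

names = [f"asset_{i}" for i in range(N)]
selection = label_factor(factor, returns, names)
for name, c in selection:
    print(name, round(float(c), 3))
```

The printed correlations correspond to the kind of values a bar plot of factor-asset correlations would display for the selected assets.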
[ILLUSTRATION OMITTED]
[ILLUSTRATION OMITTED]
[ILLUSTRATION OMITTED]
[ILLUSTRATION OMITTED]
Global macro hedge fund. Exhibit 3 shows that Factor 1 of GMHF is
highly correlated with major equity indices worldwide. This finding
suggests that Factor 1 corresponds to a risk appetite factor. During
bullish periods, investors are more willing to take risks and hence
prefer to invest in equities. Conversely, during bearish periods,
investments are diverted into bonds, which have lower risk. Factor 2
relates to oil and GSCI, and hence it is labelled as a commodities
factor. Factor 3 is a dollar factor because of its association with the
U.S. dollar. Factor 4 represents the Japanese market, whereas Factor 5
is linked to Asian markets.
U.S. treasury bonds. Prior research by Dai and Singleton [2000] and
Litterman and Scheinkman [1991] has shown that observed variation in bond
prices is explained by three factors: level, slope, and curvature. These
terms describe the shift of the yield curve in response to a shock. A
level shock shifts the curve in a parallel manner, resulting in an
almost equal effect on bonds of all maturities. The slope factor
implies larger shocks for bonds with small maturity compared with bonds
with longer maturity. Its name is derived from the effect of the yield
curve becoming less steep as a result of a slope shock. Curvature
affects medium-term interest rates, and hence it presents itself as a
"hump" on the yield curve. (5) Indeed, Exhibit 4 shows that
Factor 1 has relatively uniform correlation across bonds of all
maturity, just as a level factor would. Correlation for Factor 2 changes
sign once and has a larger correlation, and hence a larger impact, on
bonds with smaller maturity, as should a slope factor. Factor 3 has a
hump in its correlation figure, fitting the description of a curvature
factor. Thus, the findings are consistent with existing results.
U.S. credit spreads. The sole factor for USCS is correlated to the
A-rated investments, financial corporates, and industrial credit spreads
that dominate those with utilities and consumer cyclicals as shown in
Exhibit 4, implying that it is associated with assets having mid to low
credit risk. Investments with low credit risk and conventionally small
credit spreads, such as AAA-rated assets and Treasury bonds, are not
among those that possess the highest correlation with the factor,
suggesting that they play a relatively smaller role in the movement of
returns. Similarly, industries responsible for providing goods with low
substitutability, such as utilities, healthcare, and energy, along with
those that are highly dependent on the state of the economy, such as
consumer cyclicals (e.g., entertainment, automotive, and so on), possess
lower correlation with the factor. In comparison with other datasets,
assets in USCS have a relatively more uniform correlation (i.e., all
above 50%) with the factor.
Commodity prices. Interpretation for the two factors of commodity
prices is straightforward: Factor 1 is the energy factor. Factor 2,
being correlated with numerous metals, is the metals factor, as is
evident in Exhibit 5. Similar to the case of USTB, the number of factors
in commodity datasets is commonly studied. Daskalaki et al. [2013]
investigate common components of a cross-section of commodity futures
data, using numerous asset pricing models intended for equities, macro
and equity-motivated factor models, and principal component factor
models, to find that none of them satisfactorily prices commodity
assets. The authors attribute this poor pricing model performance to
heterogeneity in commodity markets, as well as segmentation of equity
and commodity markets. Our results are consistent with Daskalaki et
al.'s because among all datasets, commodity prices' factors
collectively explain the least amount of variance in the data, an
observation that supports the commodity diversification effect made
popular by Gorton and Rouwenhorst [2004], and has motivated investors to
increase their portfolio allocation in this asset class (Daskalaki and
Skiadopoulos [2011]). Furthermore, identification of factors by
commodity group and low correlation across these groups suggest a
sectorial framework when studying the commodity market.
Currencies. For the currencies dataset, a single factor is
estimated by ABC. It is labelled as the carry factor, because it is
correlated to currencies with high average rates, such as the Australian
dollar, Norwegian krone, and Swedish krona, as shown in Exhibit 6. It
could also be understood as the dollar factor, because all currency
prices are positively correlated with the U.S. dollar. To the best of
our knowledge, no empirical evidence exists in the literature regarding
the number of factors in currency datasets.
STABILITY ANALYSIS
Because all datasets span more than 10 years, including the most
recent financial crisis in 2008, it is of interest to know whether the
number of estimated factors is stable over time. Literature on the
stability of factor models is sparse. Most authors have either assumed
stability or relied on some graphical method. Bliss [1997] divided his
sample into three subperiods and investigated the factor loadings on
each. His hypothesis is that if the factor loadings appear similar
through all subperiods, then the factor structure is stable. Perignon
and Villa [2002] found that factor loadings are stable but factor
volatility varies over time. Chantziara and Skiadopoulos [2008] analyzed
the term structure of petroleum futures by splitting the sample into two
subperiods. Because the PCA results are similar in both samples, the
authors conclude that the factor structure is stable. Attempts to devise
formal tests include those by Audrino et al. [2005] and Philip et al.
[2007], which focus on the term structure of interest rates. The former
relies on testing equivalence of the factor loadings on subperiods,
whereas the latter involves constructing a bootstrap distribution for
the test statistic. Evaluation of the aforementioned research is a
free-standing topic worthy of a full-length research paper and thus is
not done here.
To investigate the stability of the factors on our dataset, we
attempt two approaches. The first assumes that the factor structure is
stable but the correlation between the factors may not be. Correlation
between factors when the factor structure is stable is closely related
to the concentration of eigenvalues. To reveal the evolution of the
relationship between factors, the t-test for significance of correlation
is used. Next, discarding the assumption of a stable factor structure,
the dataset is split into expansion and contraction economic periods,
and the corresponding number of estimated factors in each subperiod is
obtained. To take into consideration that financial market performance is
a leading indicator of the macroeconomic situation, the periods of
contraction and expansion are lagged by a negative number of months,
from -1 to -12. In other words, the number of factors prior to the start
of a contraction or expansion period is computed to determine if the
change in number of factors occurs before the economy takes a turn in
its cycle. The key difference between the two approaches is that by
assuming the factor structure is stable in the former, estimation of the
number of factors is done only once, and the focus is on their dynamics,
specifically correlation, over time. In the latter, the number of
factors is estimated for each variation in the economic cycle using
ABC's criterion.
Significance of Correlation between Factors
In this section, the factor structure estimated in the previous
section is assumed to be stable, but the dynamics between factors evolve
over time. Correlation between the factors is tested over a six-month
rolling window for GMHF and a one-year rolling window for U.S. Treasury
bond rates as well as for commodity prices, using the t-test for
significance of correlation. Exhibits 8 and 9 plot the number of
uncorrelated factors as a result of the t-test using the test statistic
$t = \rho \sqrt{(N-2)/(1-\rho^2)}$, with $\rho$ = correlation between two
factors and $N$ = sample size. Under the null hypothesis that $\rho = 0$,
that is, that the correlation between the factors is insignificant, $t$
follows a Student's t distribution with $N-2$ degrees of freedom.
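A rough sketch of the rolling test follows. It computes the statistic above for every pair of factors in each window and counts the pairs whose absolute value exceeds a two-sided critical value. The article does not spell out how the number of uncorrelated factors is derived from the pairwise tests, so the sketch stops at the count of significantly correlated pairs. The function names, the 26-observation window (roughly six months of weekly data), the critical value 2.064 (the tabulated 5% two-sided value for 24 degrees of freedom), and the simulated factors are assumptions for illustration.

```python
import numpy as np

def corr_t_stat(rho, n):
    """t = rho * sqrt((n - 2) / (1 - rho^2)), the statistic for H0: rho = 0."""
    return rho * np.sqrt((n - 2) / (1.0 - rho ** 2))

def rolling_significant_pairs(factors, window, t_crit):
    """For each rolling window, count the factor pairs with |t| > t_crit."""
    T, r = factors.shape
    iu = np.triu_indices(r, k=1)                 # index pairs (i, j) with i < j
    counts = []
    for start in range(T - window + 1):
        block = factors[start:start + window]
        rho = np.corrcoef(block, rowvar=False)[iu]
        t = corr_t_stat(rho, window)
        counts.append(int(np.sum(np.abs(t) > t_crit)))
    return np.array(counts)

# Simulated factors (an assumption): independent in a calm regime, then
# driven by a common shock in a stress regime.
rng = np.random.default_rng(3)
calm = rng.standard_normal((150, 3))
shock = rng.standard_normal((100, 1))
stress = 0.5 * rng.standard_normal((100, 3)) + shock
factors = np.vstack([calm, stress])

counts = rolling_significant_pairs(factors, window=26, t_crit=2.064)
print(counts[:125].mean(), counts[150:].mean())  # calm-only vs. stress-only windows
```

In this simulation the count of significantly correlated pairs rises once the common shock appears, which is the pattern the rolling analysis is designed to detect.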
[ILLUSTRATION OMITTED]
[ILLUSTRATION OMITTED]
Plots of the number of uncorrelated factors suggest that the
factors tend to be correlated during recession periods. Increased
correlation between factors is particularly obvious for USTB. Exhibit 8
shows a marked drop of the number of uncorrelated factors to only one
factor during the most recent financial crisis. This result is
attributable to historically low interest rates as investors sought safe
investments. Because the factors were identified by their impact on
bonds of different maturity--i.e., Factor 1 affects bonds of all
maturity evenly, Factor 2 has a large impact on small maturity bonds,
and Factor 3 has the highest influence on bonds of midterm
maturity--having all factors collapse onto a single factor suggests that
the prolonged near-zero short-term interest rates and low long-maturity
rates have removed the disproportionate impact of shocks on the yield
curve on bonds of varying maturity. Hence, the level, slope, and
curvature factors merged into a single factor.
Factors for commodity prices appear to be insensitive to
economic conditions, because the number of uncorrelated factors remains
stable at two throughout the period studied. This result is in line with
that of Kat and Oomen [2006], who find weak correlation across commodity
groups. Indeed, energy and metals, the two factors identified, are
distinct commodity groups. Moreover, energy and metals are essential to
many industries and are not substitutable; they represent immutable
drivers of return among commodity assets. Thus, the sectoral view of
the commodity market stays valid through ups and downs in the economy.
Because USCS and currencies have only one factor each, stability
analysis is not applicable. Out of curiosity, however, we overestimate the
number of factors at five each and apply the same t-tests to investigate
the evolution of the number of uncorrelated factors over time. Exhibit 9
for USCS shows that the number of uncorrelated factors fluctuates
between one and two, and thus it does not provide conclusive evidence
that the number of factors is lower during economic crises. As for
currencies, Exhibit 9 demonstrates that the number of factors stays
constant at one for most of the period, except in early 1999, when
two factors are occasionally estimated. This period coincides with the
introduction of the euro, which could have given rise to an interim
factor influencing the returns.
For the 5% test on GMHF, the number of uncorrelated factors is
never five (the number estimated using ABC's criterion over the
entire horizon) during recessions. Because this number is also fewer than
five in many other instances, it is less clear whether higher
correlation between factors is a unique feature of recessions. To
further investigate this idea, the mean absolute correlation between
factors is plotted in Exhibit 10, which shows a pronounced
spike in correlation in 2007. Hence, preceding the most recent financial
crisis, correlation between factors rose drastically, followed by
fluctuations in correlation. It could also be argued that the number of
factors fell before the recession, an observation that motivates the
analysis in the next subsection. With as many as five factors, the
dynamics between factors do evolve over time, as shown by the mean
absolute correlation plot, but do not necessarily materialize as a clearly
lower number of factors during recessions.
[ILLUSTRATION OMITTED]
Stability of the factors depends on the asset class. For USTB,
which is considered to carry the lowest risk among those considered, the
number of factors that affect its return is lower during recessions,
especially for the one between late 2007 and 2010. The factors for
commodity prices seem invariant to the economic climate. When
combined in the GMHF portfolio, these interactions become more complex
and are realized as fluctuating correlation between factors.
Estimated Factors by Economic Cycles
Because financial markets are often thought of as leading
indicators of economic cycles, it is possible that the change in the
number of factors occurs prior to contractions in the economy. To
investigate this, the contraction periods as determined by NBER are
lagged by a negative number of months. For example, "Lag = -1
month" of the most recent financial crisis between December 2007
and June 2009 refers to the interval November 2007 to May 2009. The
results are presented in Exhibit 11. For USTB, the number of factors
fluctuates between one and five during expansion periods, supporting the
view that correlation between factors changes prior to recessions. In
contrast, GMHF and U.S. credit spreads have a constant number of
factors throughout expansion periods. As for commodities, the number of
factors is stable, consistent with the results of the t-test for
significance of correlation. The number of factors in the currencies dataset
falls to zero prior to contraction periods. Therefore, the view of
financial market performance as a leading indicator of economic cycles is
supported only by USTB rates and, to a lesser extent, by currencies.
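The lagging of contraction periods described above can be sketched as simple month arithmetic. The helper names below are hypothetical and only illustrate how a (year, month) interval is shifted by a negative lag; the example reproduces the "Lag = -1 month" case quoted in the text.

```python
def shift_month(year, month, lag):
    """Shift a (year, month) pair by `lag` months; negative lags move earlier."""
    index = year * 12 + (month - 1) + lag     # months counted from year 0
    return index // 12, index % 12 + 1

def lag_period(start, end, lag):
    """Shift both end points of a contraction period by the same lag."""
    return shift_month(*start, lag), shift_month(*end, lag)

# December 2007 to June 2009, lagged by -1 month.
print(lag_period((2007, 12), (2009, 6), -1))
```

Under these assumptions the shifted interval runs from November 2007 to May 2009, matching the example in the text; lags of -3, -6, and -12 months are obtained the same way.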
CONCLUSION
This article investigates the number of cross-asset uncorrelated
strategies available to portfolio managers. After reviewing the
literature on existing approaches to determine the number of factors,
four methods are selected and tested. The Monte Carlo analysis suggests
that the criterion by Alessi et al. [2009] is most reliable.
Applying the criteria to five datasets yields the following
estimated numbers of factors by ABC, in parentheses: Global Macro Hedge
Fund (5), U.S. Treasury bond rates (3), U.S. credit spreads (1),
commodity prices (2), and currencies (1). Plots of the number of
uncorrelated factors do not all support the hypothesis of increased
cross-market correlation during economic recession. Evidence is
strongest for the U.S. Treasury bond rates, which is most likely the
result of U.S. macroeconomic policies post--financial crisis, but least
strong for commodity prices, aligned with the observation of low
correlation between commodity groups claimed in previous studies. GMHF,
composed of a mix of these assets except for credit spreads,
demonstrates a combination of the observed outcome on the rest of the
datasets, yielding fluctuating correlation between the factors during
the economic downturns. The results regarding stability must be
considered with caution, however, because not every financial crisis
is succeeded by a recession period. Thus, there could be a change in
correlation between assets occurring outside the NBER-determined
recession periods. With more information on the factors driving the
returns in different asset classes, investors would have a better
understanding of the common sources of risk. Depending on the asset
class in mind, a strategy built on exploiting these common sources of
risk may have to take the economic climate into account.
EXHIBIT 11 Number of Estimated Factors by Dataset, by Business Cycle
Expansions and Contractions, Lagged by Negative Number of Months

                 Global Macro Hedge Fund     U.S. Treasury Bond Rates
Lag (months)     Expansion   Contraction     Expansion   Contraction
  0              5           6               3           4
 -3              6           6               5           3
 -6              5           5               2           3
-12              5           4               2           3

                 U.S. Credit Spreads
Lag (months)     Expansion   Contraction
  0              1           1
 -3              1           1
 -6              1           1
-12              1           2

                 Commodities                 Currencies
Lag (months)     Expansion   Contraction     Expansion   Contraction
  0              2           2               1           1
 -3              2           2               1           0
 -6              2           2               1           1
-12              2           2               1           0
APPENDIX A
METHODS TO DETERMINE THE NUMBER OF FACTORS TO INCLUDE
The tests considered in this study are as follows.
Bai and Ng [2002] (BN)
The estimated number of factors by BN is the integer corresponding
to the lowest value of the loss function $V(r,F^r) + r\,g(N,T)$, or
$\log(V(r,F^r)) + r\,\sigma^2 g(N,T)$, with [MATHEMATICAL
EXPRESSION NOT REPRODUCIBLE IN ASCII] whereby $F^r$ is the matrix of
r factors, $\Lambda = (\lambda_1^r, \ldots, \lambda_N^r)$, $g(N,T)$ is the
penalty for over-fitting, and r is a constant. $\sigma^2$ is a
consistent estimate of
$\frac{1}{NT}\sum_{i=1}^{N}\sum_{t=1}^{T} E[e_{it}^2]$,
which in practice can be replaced by
[MATHEMATICAL EXPRESSION NOT REPRODUCIBLE IN ASCII], with $r_{max}$ a
pre-specified maximum number of factors considered. Two examples of
$g(N,T)$ are $g_1(N,T) = \frac{N+T}{NT}\ln(\min(\sqrt{N},\sqrt{T})^2)$,
which is frequently used in empirical works, and
$g_2(N,T) = \frac{(N+T-k)\ln(NT)}{NT}$, which has been shown to possess
good properties when errors are cross-correlated (Bai and Ng [2008]).
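The selection rule can be sketched as follows. This is a minimal illustration, not the authors' code: it assumes that factors are estimated by principal components on standardized data, so that V(r) is the average squared residual left after removing the first r components, and it uses the first loss function with the penalty $g_1(N,T)$. The function name, the simulated panel, and the seed are hypothetical.

```python
import numpy as np

def bn_number_of_factors(X, r_max):
    """Estimate the number of factors of a (T, N) panel by
    minimising V(r) + r * g_1(N, T) over r = 0, ..., r_max."""
    T, N = X.shape
    Z = (X - X.mean(axis=0)) / X.std(axis=0)      # standardise each series
    s = np.linalg.svd(Z, compute_uv=False)        # singular values, descending
    share = s ** 2 / (N * T)                      # variance captured per component
    # V[r]: average squared residual after removing the first r components
    V = share.sum() - np.concatenate(([0.0], np.cumsum(share[:r_max])))
    # g_1(N, T); note ln(min(sqrt N, sqrt T)^2) = ln(min(N, T))
    g = (N + T) / (N * T) * np.log(min(N, T))
    loss = V + np.arange(r_max + 1) * g
    return int(np.argmin(loss))

# Simulated panel with three factors (assumed design, for illustration only).
rng = np.random.default_rng(0)
T, N, r = 300, 100, 3
factors = rng.standard_normal((T, r))
loadings = rng.standard_normal((N, r))
X = factors @ loadings.T + rng.standard_normal((T, N))
print(bn_number_of_factors(X, r_max=8))
```

Standardizing the data keeps the unscaled penalty on a comparable footing with V(r); with raw data, a variance-scaled penalty would be the more natural choice.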
Alessi et al. [2009] (ABC)
Alessi et al. [2009] propose a refinement of BN that multiplies the
penalty function by a constant, c, as follows: $V(r,F^r) + r\,c\,g(N,T)$, or
$\log(V(r,F^r)) + r\,c\,\sigma^2 g(N,T)$. The estimated number
of factors remains the one yielding the lowest value of
these modified loss functions. Furthermore, the authors suggest
evaluating the loss functions over random subsamples of the data to find
an estimate that is insensitive to the sample size and neighboring
values of c. A detailed explanation of the role of c is provided in
Hallin and Liska [2007], while generation of the random subsamples is
described in Alessi et al. [2009]. This criterion has been shown to
provide a solution when BN's criterion fails, and is not any more
complex in implementation because it requires, in essence, multiple
repetitions of BN.
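To see why the choice of c matters, the sketch below evaluates the first modified loss function for a few values of c. The residual variances V(0), ..., V(4) and the penalty value are hypothetical numbers chosen for illustration, and the subsampling and stability search of the full procedure are deliberately left out.

```python
def estimated_factors(V, g, c):
    """Return the r minimising V[r] + r * c * g, where V[r] is the
    residual variance of a model with r factors."""
    losses = [v + r * c * g for r, v in enumerate(V)]
    return min(range(len(losses)), key=losses.__getitem__)

# Hypothetical residual variances and penalty value (not from the study).
V = [1.00, 0.60, 0.30, 0.28, 0.27]
g = 0.05
for c in (0.1, 1.0, 10.0):
    print(c, estimated_factors(V, g, c))
```

A very small c under-penalizes and selects the largest model considered, while a very large c selects no factors at all; this sensitivity is what motivates looking for an estimate that is stable across neighboring values of c.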
Connor and Korajczyk [1993] (CK)
An alternate approach developed by Connor and Korajczyk [1993]
(CK) is based on the idea that the $(r+1)$st factor of an r-factor model
can have nontrivial factor loadings for some assets but only a
small proportion of them. A statistical test is developed to
determine whether the $(r+1)$st factor is pervasive. It proceeds by
running two regressions by ordinary least squares (OLS), one with r
factors and another with r + 1 factors. The adjusted squared residuals,
$\sigma_{it} = \epsilon_{it}^2 / (1 - (r+1)/T - r/N)$, with
$\epsilon_{it}$ as the OLS estimated residuals, are computed. A
cross-sectional mean of the $\sigma_{it}$, defined as
$\mu_t^N = \sum_{i=1}^{N} \sigma_{it} / N$, is calculated
next for both regression models. Then each even month's
$\mu_t^N$ for the regression with r + 1 factors is subtracted
from the odd month's $\mu_t^N$ for the r-factor model,
giving a value $\Delta^N$. Under the null hypothesis that the model
has r factors, $\Delta^N \pi^{-1/2}$, with $\pi$ as the
covariance matrix of $\Delta$, is asymptotically standard normal as
$N \to \infty$; hence in practice, a t-test is carried out on
the estimates $\Delta^N \pi^{-1/2}$. In order to establish the
distribution of the idiosyncratic components, the authors made the
assumption of homoskedasticity across time periods (pointed out in Bai
and Ng [2002] to be undesirable) and that $e_i = (e_{i1},
e_{i2}, \ldots, e_{iT})$, $i = 1, \ldots, \infty$, is a mixing process.
(6)
Onatski [2009]
Drawing upon the property that an r-factor panel of data has
unbounded first r largest eigenvalues of the covariance matrix of
$X_t$, and a bounded $(r+1)$st eigenvalue, Onatski [2009]
developed a statistical test for $H_0: r = r_0$ versus
$H_1: r_0 < r \le r_1$, with r as
the number of factors, and $r_1$ and $r_0$ as the upper and lower
bounds for the number of factors, which are determined by prior
knowledge. Beginning from $r_0$, for each successive r, the discrete
Fourier transforms (DFTs), [MATHEMATICAL EXPRESSION NOT REPRODUCIBLE IN
ASCII] are computed at prespecified frequencies $w_j$. By El Karoui
[2006], $X_t$ is asymptotically distributed as Tracy--Widom. The
test statistic is [MATHEMATICAL EXPRESSION NOT REPRODUCIBLE IN ASCII]
whereby $\gamma_i$ is the i-th largest eigenvalue of the covariance
matrix of $X_t$. R is essentially a measure of the curvature at the
would-be breakpoint of the frequency-domain scree plot postulated by the
alternative hypothesis that the model has more than $r_0$ factors,
but fewer than $r_1$ factors. Critical values for the test are
provided in Onatski [2009] for up to r = 18 factors. Similar to the case
of CK, imposing a valid distribution requires fairly strong assumptions,
such as having idiosyncratic components that follow a Gaussian
distribution.
APPENDIX B
MONTE CARLO STUDY
Before choosing one method over the others, we perform a Monte
Carlo test to evaluate their relative performance on simulated data with
various properties. The experimental design employs seven data-generating
processes (DGPs) that differ in the relationship between elements of
the idiosyncratic components of the following model:
$F_{tj}$ and $\lambda_{ij}$ are normally distributed with zero
mean and unit variance. This is similar to the DGPs used in Alessi et
al. [2009].
1. Homoskedastic idiosyncratic component, same variance for the
common and idiosyncratic component: $e_{it} \sim N(0,1)$ and $r = \theta$.
2. Heteroskedastic idiosyncratic component, same variance for the
common and idiosyncratic component:
3. Homoskedastic idiosyncratic component, common component has a
larger variance than the idiosyncratic component: $e_{it} \sim N(0,1)$ and
$r = 2\theta$.
4. Homoskedastic idiosyncratic component, common component has a
smaller variance than the idiosyncratic component: $e_{it} \sim N(0,1)$ and
$r = \theta/2$.
5. Small cross-section correlation across idiosyncratic parts, same
variance for the common and idiosyncratic component:
$e_{it} = v_{it} + \sum_{h=-H, h \neq 0}^{H} \beta v_{i-h,t}$,
$v_{it} \sim N(0,1)$, and $r = \theta$.
6. Serial correlation across idiosyncratic parts, common component
has a smaller variance than the idiosyncratic component:
$e_{it} = \rho e_{i,t-1} + v_{it}$, $e_{it} \sim N(0,1)$,
$v_{it} \sim N(0,1)$, $r = \theta$, and $r < \theta/(1-\rho^2)$.
7. Serial and small cross-section correlation across idiosyncratic
parts, common component has a larger variance than the idiosyncratic
component:
$e_{it} = \rho e_{i,t-1} + v_{it} + \sum_{h=-H, h \neq 0}^{H} \beta v_{i-h,t}$,
$e_{it} \sim N(0,1)$, $v_{it} \sim N(0,1)$, $r = \theta$, and
$r < \theta/(1-\rho^2)$.
For all seven DGPs, we test for the pairs of cross-section and time
dimensions (N,T) = {(70,70),(100,120),(150,500)}. The true number of
factors, r, is chosen to be 1, 3, 5, 8, 10, and 15, all of which are
consistent with the requirement r < min{N,T}. The corresponding
$r_{max}$ for BN and ABC, and the upper bound on CK and Onatski's
tests, is $r_{max}$ = 8 when r = 1, 3, 5; $r_{max}$ = 15 when r = 8, 10;
and $r_{max}$ = 20 when r = 15. CK and Onatski's tests always begin
with the lower bound of 1, to reflect that in many financial datasets,
it is unlikely to have prior knowledge beyond the belief that there
should be at least one factor, given that the data indeed have a factor
structure. The serial and cross-section correlation parameters of the
idiosyncratic components are $\rho$ = 0.5 and $\beta$ = 0.2, while
H = max{N/20, 10}. Five hundred Monte Carlo replications are performed
for each instance. Additionally, to test ABC's criterion, the parameters
to determine the random subsamples are $n_j = \frac{3}{4}N$ (see Alessi
et al. [2009]) and $c_{max}$ = 13 with step size 0.01.
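The serially and cross-sectionally correlated idiosyncratic components of DGPs 5 to 7 can be generated along the following lines. This is a sketch under stated assumptions rather than the code used in the study: neighbours falling outside the cross-section are simply dropped, the recursion starts without a burn-in period, and the function name and seed are hypothetical.

```python
import numpy as np

def idiosyncratic_errors(v, rho=0.5, beta=0.2, H=10):
    """Build e_it = rho * e_{i,t-1} + v_it + beta * sum_{0 < |h| <= H} v_{i-h,t}
    from a (T, N) array v of i.i.d. shocks."""
    T, N = v.shape
    cross = np.zeros_like(v)
    for h in range(1, H + 1):
        cross[:, h:] += v[:, :-h]     # neighbour i - h (edges dropped: assumption)
        cross[:, :-h] += v[:, h:]     # neighbour i + h
    u = v + beta * cross              # cross-sectionally correlated shocks
    e = np.zeros_like(v)
    e[0] = u[0]                       # no burn-in (assumption)
    for t in range(1, T):
        e[t] = rho * e[t - 1] + u[t]  # AR(1) recursion over time
    return e

rng = np.random.default_rng(0)
N, T = 70, 70
H = max(N // 20, 10)                  # H = max{N/20, 10} as in the text
e = idiosyncratic_errors(rng.standard_normal((T, N)), rho=0.5, beta=0.2, H=H)
print(e.shape)
```

Setting beta to zero gives purely serially correlated errors, and setting rho to zero gives purely cross-sectionally correlated errors, so one helper covers the three correlated designs.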
BN's criterion has perfect performance for DGPs 1 to 4,
correctly identifying the number of factors. When the DGP demonstrates
cross-section or serial correlation across the idiosyncratic components,
however, such as in DGPs 5 to 7, the criterion slightly overestimates the
number of factors. There is no obvious effect of dimensions (i.e., N and
T) on the results, which is consistent with BN's claim that their
criterion yields precise estimates for min{N,T} > 40. Although ABC
does not display perfect performance as does BN for DGPs 1 to 4, as it is
generally plagued by mild overestimation, its performance is more
accurate for DGP 5. This result is similar to ABC's own Monte Carlo
study. Even though BN has stellar performance in most cases, adoption of
ABC is justified because most financial portfolio time series
demonstrate cross-section and serial correlation.
The performance of Onatski's criterion pales in comparison
with the rest of the criteria in almost all cases, because it estimates
that the true number of factors is 1 close to 60% of the time, 2 about
30%, and 3 about 10% of the time, being insensitive to the true number
of factors. This outcome could result from having the lower bound of the
test always set at one, a choice that reflects the case that when the test
is implemented on actual data, no prior knowledge is available to
determine the lower bound. An upper bound, however, can be set because
it must be less than min{N, T}, and methods such as ABC are developed to
be less sensitive to the upper bound; hence a larger upper bound can
always be selected.
CK's test has a similar tendency of underestimating the true
number of factors as 1 close to or exceeding 50% of the time when
r >= 5 and the time dimension is small. However,
for the sample with N = 150, T = 500, i.e., large cross-section and time
dimensions, the test performs reasonably well for all DGPs except for
DGPs 2 and 7, correctly identifying the number of factors at least 50% of
the time for DGPs 1-4 and 6, and overestimating the number of factors by 1
for DGP 5. In the case of DGP 2, the number of estimated factors is 1 more
than 80% of the time, regardless of the actual number of factors, time, and
cross-section dimensions.
In general, the Monte Carlo study substantiates BN's criterion
as superior not only in accuracy of estimates but also in ease of
implementation. (7) In cases when BN's criterion does not perform well,
either because of cross-sectional or serial dependence or because of
difficulty in estimating $r_{max}$, ABC's criterion should be
executed. CK's test may perform well when the time and
cross-section dimensions are large.
REFERENCES
Alessi, L., M. Barigozzi, and M. Capasso. "A Robust Criterion
for Determining the Number of Factors in Approximate Factor
Models." Technical report, 2009.
Audrino, F., G. Barone-Adesi, and A. Mira. "The Stability of
Factor Models of Interest Rates." Journal of Financial
Econometrics, Vol. 3, No. 3 (2005), pp. 422-441.
Bai, J., and S. Ng. "Determining the Number of Factors in
Approximate Factor Models." Econometrica, Vol. 70, No. 1 (2002),
pp. 191-221.
---. "Large Dimensional Factor Analysis." Foundations
and Trends in Econometrics, Vol. 3, No. 2 (2008), pp. 89-163.
Bernanke, B., J. Boivin, and P.S. Eliasz. "Measuring the
Effects of Monetary Policy: A Factor-Augmented Vector Autoregressive
(FAVAR) Approach." Quarterly Journal of Economics, Vol. 120, No. 1
(2005), pp. 387-422.
Billio, M., M. Getmansky, A.W. Lo, and L. Pelizzon.
"Econometric Measures of Connectedness and Systemic Risk in the
Finance and Insurance Sectors." Journal of Financial Economics,
Vol. 104, No. 3 (2012), pp. 535-559.
Bliss, R.R. "Movements in the Term Structure of Interest
Rates." Economic Review, 4Q (1997), pp. 16-33.
Chamberlain, G., and M. Rothschild. "Arbitrage, Factor
Structure, and Mean--Variance Analysis on Large Asset Markets."
Econometrica, Vol. 51, No. 5 (1983), pp. 1281-1304.
Chantziara, T., and G.S. Skiadopoulos. "Can the Dynamics of
the Term Structure of Petroleum Futures Be Forecasted? Evidence from
Major Markets." Energy Economics, Vol. 30, No. 3 (2008), pp.
962-985.
Connor, G., and R.A. Korajczyk. "A Test for the Number of
Factors in an Approximate Factor Model." Journal of Finance, Vol. 48,
No. 4 (1993), pp. 1263-1291.
Dai, Q., and K.J. Singleton. "Specification Analysis of Affine
Term Structure Models." Journal of Finance, Vol. 55, No. 5 (2000),
pp. 1943-1979.
Daskalaki, C., A. Kostakis, and G. Skiadopoulos. "Are There
Common Factors in Commodity Futures Returns?" Working paper, 2013.
Available at http://papers.ssrn.com/sol3/papers.cfm?abstract_id=2056186.
Daskalaki, C., and G. Skiadopoulos. "Should Investors Include
Commodities in Their Portfolios After All? New Evidence." Journal
of Banking and Finance, Vol. 35, No. 10 (2011), pp. 2606-2626.
Donald, S.G. "Inference Concerning the Number of Factors in a
Multivariate Nonparametric Relationship." Econometrica, Vol. 65,
No. 1 (1997), pp. 103-132.
El Karoui, N. "Tracy--Widom Limit for the Largest Eigenvalue
of a Large Class of Complex Wishart Matrices." Annals of
Probability, Vol. 35, No. 2 (2006), pp. 663-714.
Favero, C.A., M. Marcellino, and F. Neglia. "Principal
Components at Work: The Empirical Analysis of Monetary Policy with Large
Data Sets." Journal of Applied Econometrics, Vol. 20, No. 5 (2005),
pp. 603-620.
Forni, M., D. Giannone, M. Lippi, and L. Reichlin. "Opening
the Black Box: Structural Factor Models with Large Cross-Sections."
Working Paper Series 712, European Central Bank, 2007.
Forni, M., M. Lippi, and L. Reichlin. "Opening the Black Box:
Structural Factor Models versus Structural VARs." CEPR Discussion
Paper 4133, Centre for Economic Policy Research, 2003.
Forni, M., and L. Reichlin. "Let's Get Real: A Factor
Analytical Approach to Disaggregated Business Cycle Dynamics." ULB
Institutional Repository 2013/10147, Universite Libre de Bruxelles,
1998.
Goff, J., ed. "Economic Letter: What Makes the Yield Curve
Move?" No. 2003-15, Federal Reserve Bank of San Francisco, 2003.
Gorton, G., and K.G. Rouwenhorst. "Facts and Fantasies about
Commodity Futures." NBER Working Papers 10595, National Bureau of
Economic Research, Inc., 2004.
Hallin, M., and R. Liska. "Determining the Number of Factors
in the General Dynamic Factor Model." Journal of the American
Statistical Association, 102 (2007), pp. 603-617.
Jolliffe, I. Principal Component Analysis, 2nd ed. New York: Springer,
2002.
Kapetanios, G. "A Testing Procedure for Determining the Number
of Factors in Approximate Factor Models with Large Datasets."
Journal of Business & Economic Statistics, Vol. 28, No. 3 (2010),
pp. 397-409.
Kat, H.M., and R.C.A. Oomen. "What Every Investor Should Know
about Commodities, Part II: Multivariate Return Analysis."
Technical Report 33, Cass Business School, 2006.
Kinlaw, W., M. Kritzman, and D. Turkington. "Toward
Determining Systemic Importance." The Journal of Portfolio
Management, Vol. 38, No. 4 (2012), pp. 100-111.
Kritzman, M., Y. Li, S. Page, and R. Rigobon. "Principal
Components as a Measure of Systemic Risk." The Journal of Portfolio
Management, Vol. 37, No. 4 (2011), pp. 112-126.
Lewbel, A. "The Rank of Demand Systems: Theory and
Nonparametric Estimation." Econometrica, Vol. 59, No. 3 (1991), pp.
711-730.
Litterman, R., and J. Scheinkman. "Common Factors Affecting
Bond Returns." The Journal of Fixed Income, Vol. 1, No. 1 (1991),
pp. 54-61.
National Bureau of Economic Research. "U.S. Business Cycle
Expansions and Contractions." 2012. Available online at
http://www.nber.org/cycles.html
Onatski, A. "Testing Hypotheses about the Number of Factors in
Large Factor Models." Econometrica, Vol. 77, No. 5 (2009), pp.
1447-1479.
---. "Determining the Number of Factors from Empirical
Distribution of Eigenvalues." Review of Economics and Statistics, Vol. 92,
No. 4 (2010), pp. 1004-1016.
Perignon, C., and C. Villa. "Permanent and Transitory Factors
Affecting the Dynamics of the Term Structure of Interest Rates."
Technical report, International Center for Financial Asset Management
and Engineering, 2002.
Philip, D., C. Kao, and G. Urga. "Testing for Instability in
Factor Structure of Yield Curves." Technical report, Cass Business
School, 2007.
Pukthuanthong, K., and R. Roll. "Global Market Integration: An
Alternative Measure and Its Application." Journal of Financial
Economics, Vol. 94, No. 2 (2009), pp. 214-232.
[TABLE OMITTED]
Disclaimer
The content of this article is the sole responsibility of the
authors. It does not necessarily reflect the views of Amundi, Lombard
Odier Asset Management, their staff members or clients.
ENDNOTES
The authors thank participants of the Computational and Financial
Econometrics (CFE) Conference 2012 for insightful comments.
(1.) Mean--variance analysis of an N-asset portfolio requires
$(N^2+3N)/2$ estimates.
(2.) See Jolliffe [2002] for the theory and applications of PCA.
(3.) For more details on Bai and Ng [2002] and all tests described
below, please refer to the appendixes.
(4.) More details on the simulated data are provided in the
appendixes.
(5.) Refer to Goff [2003] for more details and figures illustrating
these factors.
(6.) Definition of a Mixing Process: Let [MATHEMATICAL EXPRESSION
NOT REPRODUCIBLE IN ASCII] whereby $\mathcal{G}$ and $\mathcal{H}$ are
$\sigma$-algebras. So $\alpha$ is the maximal difference between the joint
probability of events in $\mathcal{G}$ and $\mathcal{H}$. The mixing
coefficient is [MATHEMATICAL EXPRESSION
NOT REPRODUCIBLE IN ASCII] whereby $F_a^b$ is the
$\sigma$-algebra generated [MATHEMATICAL EXPRESSION NOT REPRODUCIBLE IN
ASCII]. If $\alpha(m) \to 0$ as $m \to \infty$,
then $\{e_i\}_{i=1}^{\infty}$ is called strong mixing, or
$\alpha$-mixing.
(7.) Links to the codes are
http://www.columbia.edu/~sn2294/research.html,
http://www.barigozzi.eu/mb/Codes.html, and
http://www.columbia.edu/~ao2027/, respectively (last accessed November
28, 2012).
LING-NI BOON is a research analyst at Amundi in Paris, France.
lingni.boon@amundi.com
FLORIAN IELPO is a fund manager at Lombard Odier Investment
Management in Geneva, Switzerland. f.ielpo@lombardodier.com