Risky arbitrage, limits of arbitrage, and nonlinear adjustment in the dividend-price ratio.
Gallagher, Liam A. ; Taylor, Mark P.
MARK P. TAYLOR (*)
I. INTRODUCTION
The present value model of stock prices has generally been rejected
on U.S. data, as in Campbell and Shiller (1987, 1988a, 1988b). (1) While
this may be taken as evidence against the efficient markets hypothesis,
an alternative view would be that the instantaneous arbitrage assumed in
the simple present value model is too restrictive. In recent arbitrage
models developed by, inter alios, Grossman and Miller (1988), De Long,
Shleifer, Summers, and Waldmann (1990) and Campbell and Kyle (1993),
arbitrage is generally less than perfect because arbitrageurs face
either fundamental or noise trader risk. In particular, given that the
actions of noise traders may lead to greater fundamental mispricing of
an asset, perceived deviations of asset prices from their fundamental
values represent risky arbitrage opportunities, as in Shleifer and
Summers (1990). (2) Thus, small deviations from fundamentals may not be
arbitraged because the perceived gains may not be enough to outweigh
this risk. Given, however, a distribution of degrees of risk aversion
across smart traders, arbitrage will increase as the degree of
fundamental mispricing increases, so that arbitrage is stabilizing and
becomes more stabilizing in extreme circumstances. Traditional arbitrage
models, therefore, imply a degree of nonlinearity in asset price
dynamics, for example, as in Chiang, Davidson, and Okunev (1997). We
term this broad approach the "risky arbitrage" hypothesis.
This approach may be contrasted with the more recent "limits
of arbitrage" hypothesis, suggested by Shleifer and Vishny (1997),
in which arbitrage activity is viewed in an agency context. In the
Shleifer-Vishny model, arbitrageurs (the agents) have access to funds
mainly from outside investors (the principals), who will generally gauge
the ability of arbitrageurs--and hence decide on the amount of funds to
allocate to them--based on their past performance. Since the track
record of smart arbitrageurs is likely to be poorest when prices have
deviated far from their fundamental values, the implication of the
limits of arbitrage hypothesis is that arbitrage is likely to be least
effective in returning prices to fundamental values when investor
sentiment has driven them far away. (3) That is, "[w]hen arbitrage
requires capital, arbitrageurs can become most constrained when they
have the best opportunities, i.e., when the mispricing they have bet
against gets even worse" (Shleifer and Vishny [1997, 37]). In
contrast, the risky arbitrage models are without agency problems and
arbitrageurs are more aggressive when prices move further from
fundamental values.
In this article we provide a simple test of these two alternative
views of arbitrage activity by nonlinear time series modeling of the
aggregate log dividend-price ratio, such that adjustment towards
equilibrium varies nonlinearly with the size of the deviation from
equilibrium.
The remainder of the paper is set out as follows. Section II
presents a simple test of the risky arbitrage hypothesis against the
alternative of the limits of arbitrage hypothesis using the logarithmic present value model and recently developed techniques in parametric
nonlinear modeling. The procedure for selecting the appropriate modeling
of the log dividend-price ratio is outlined in section III. Section IV
describes the data and presents the empirical results. Section V
extends the earlier empirical results to allow for time-varying
returns in the present value representation. Section VI concludes the
study.
II. THE LOG DIVIDEND-PRICE RATIO, NONLINEARITY, ARBITRAGE, AND
ADJUSTMENT
The loglinear present value model can be expressed as:
(1) $y_t = d_t - p_t = -\sum_{j=0}^{\infty} \rho^{j} E_t \Delta d_{t+1+j} + k^{*}$
where [p.sub.t] and [d.sub.t] are the log of real stock prices and
real dividends, respectively, [rho] = [(1+R).sup.-1], where R is a
constant discount rate equal to the average dividend-price ratio, and
[k.sup.*] is a constant, for example, as in Campbell and Shiller
(1988b). When [p.sub.t] and [d.sub.t] are first-difference stationary,
I(1), equation (1) implies that they are also cointegrated with a
cointegrating vector [1, -1]', that is, that the log dividend-price
ratio, [y.sub.t] = [d.sub.t] - [p.sub.t], is stationary.
The loglinear representation of the present value model of stock
prices also implies a number of highly nonlinear cross-equation
restrictions similar to those encountered in rational expectations
models, as in Campbell and Shiller (1988b), Cuthbertson (1996), and
Campbell, Lo, and MacKinlay (1997). Campbell and Shiller (1988b) show
that, given [rho], the restrictions can be simplified to a linear form.
A test of the cross-equation restrictions, with constant expected
excess returns, is a Wald test for zero coefficients in a
regression of [[xi].sub.t] on lagged [y.sub.t] and [DELTA][d.sub.t],
with the asset return [[xi].sub.t] [equivalent to] k + [rho][p.sub.t] +
(1 - [rho])[d.sub.t] - [p.sub.t-1] = k - [rho][y.sub.t] + [y.sub.t-1] +
[DELTA][d.sub.t], and the constant k = -log([rho]) - (1 -
[rho])log(1/[rho] - 1). (4) Campbell and Shiller (1988b) also
demonstrate that a weak test of the model is that the log dividend-price
ratio Granger-causes changes in log dividends.
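The two expressions for the asset return given above can be checked numerically. The following is a minimal sketch, assuming NumPy and hypothetical log price and log dividend arrays; the function names are illustrative and are not taken from the paper.

```python
import numpy as np


def log_linear_constant(rho):
    """Constant k = -log(rho) - (1 - rho) * log(1/rho - 1)."""
    return -np.log(rho) - (1.0 - rho) * np.log(1.0 / rho - 1.0)


def approx_return(p, d, rho):
    """xi_t = k + rho*p_t + (1 - rho)*d_t - p_{t-1}, for t = 1, ..., T-1."""
    p = np.asarray(p, dtype=float)
    d = np.asarray(d, dtype=float)
    k = log_linear_constant(rho)
    return k + rho * p[1:] + (1.0 - rho) * d[1:] - p[:-1]


def approx_return_from_ratio(p, d, rho):
    """Same return written as k - rho*y_t + y_{t-1} + (d_t - d_{t-1}), y = d - p."""
    p = np.asarray(p, dtype=float)
    d = np.asarray(d, dtype=float)
    y = d - p
    k = log_linear_constant(rho)
    return k - rho * y[1:] + y[:-1] + np.diff(d)
```

Both functions return a series one observation shorter than the inputs, and they agree up to floating-point error because the second form follows from substituting [p.sub.t] = [d.sub.t] - [y.sub.t] into the first.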
In aggregate U.S. stock market data, it is well known that the log
dividend series is very close to a random walk, possibly with drift, as
in Cochrane (1994) and Campbell, Lo, and MacKinlay (1997). Let this
drift parameter, the average dividend growth, be c. Then the
right-hand-side of (1) collapses to a constant [k.sup.**] = -[c/(1 -
[rho])] + [k.sup.*]. In this simple formulation, [k.sup.**] becomes the
fundamental value of the log dividend-price ratio, so that stocks are
judged to be fundamentally mispriced as their real price deviates from a
constant multiple (-[k.sup.**]) of the level of real dividends. A simple
test of the risky arbitrage hypothesis against the alternative of the
limits of arbitrage hypothesis then becomes a test of whether [y.sub.t]
adjusts towards [k.sup.**] more quickly or more slowly as the size of
the deviation of [y.sub.t] from [k.sup.**] grows.
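The collapse of (1) to a constant can be written out in one step. The sketch below assumes that expected dividend growth equals the drift c at every horizon and that 0 < [rho] < 1, so that the geometric series converges.

```latex
\begin{aligned}
y_t &= -\sum_{j=0}^{\infty} \rho^{j} E_t \Delta d_{t+1+j} + k^{*} \\
    &= -c \sum_{j=0}^{\infty} \rho^{j} + k^{*} \\
    &= -\frac{c}{1-\rho} + k^{*} \equiv k^{**}.
\end{aligned}
```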
A parsimonious parametric time series model of nonlinear mean
reversion which has been shown to approximate well a broad range of
nonlinearity is the smooth transition autoregressive (STAR) model, as in
Terasvirta (1994). The exponential smooth transition autoregressive
model of order q [ESTAR(q)] may be written:
(2) $y_t = \pi_0 + \sum_{i=1}^{q} \pi_i y_{t-i} + \left\{ \pi_0^{*} + \sum_{i=1}^{q} \pi_i^{*} y_{t-i} \right\} \times \left\{ 1 - \exp\left[ -\gamma (y_{t-g} - c^{*})^{2} \right] \right\} + u_t$
where [y.sub.t] is assumed stationary and ergodic, [u.sub.t] is a
stochastic disturbance term, and [gamma] > 0. (5) The exponential
transition function F([y.sub.t-g])=1-exp[-[gamma][([y.sub.t-g] -
[c.sup.*]).sup.2]] is U-shaped and bounded between zero and unity, with
the (smoothness) parameter [gamma] determining the speed of the
transition process between extreme regimes. The middle regime
corresponds to F = 0, [y.sub.t-g] = [c.sup.*], when (2) becomes a linear
AR(q) model:
(3) $y_t = \pi_0 + \sum_{i=1}^{q} \pi_i y_{t-i} + u_t.$
The outer regime corresponds to the limit |[y.sub.t-g] - [c.sup.*]|
[right arrow] [infinity], when F = 1 and
(2) becomes a different AR(q) model:
(4) $y_t = (\pi_0 + \pi_0^{*}) + \sum_{i=1}^{q} (\pi_i + \pi_i^{*}) y_{t-i} + u_t.$
Intermediate values of F will result in the dynamics governed by a
linear combination of (3) and (4) with the weights given by (1-F) and F
respectively. Global stability of the ESTAR(q) model requires
(5) $\sum_{i=1}^{q} (\pi_i + \pi_i^{*}) < 1.$
Now, if
(6) $\sum_{i=1}^{q} (\pi_i + \pi_i^{*}) < \sum_{i=1}^{q} \pi_i,$
then this would imply that the degree of mean reversion grows as
the deviation from the fundamental equilibrium grows, consistent with
the risky arbitrage hypothesis. On the other hand, if the reverse
inequality holds, then the degree of mean reversion shrinks as the
degree of mispricing grows, consistent with the limits of arbitrage
hypothesis. Thus, from (6), we can deduce that support for the risky
arbitrage hypothesis would be provided if the estimated value of
[summation over (q/i=1)] [[pi].sup.*.sub.i] were negative and
significantly different from zero, while support for the limits of
arbitrage hypothesis would be provided if this sum were positive and
significantly different from zero.
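The role of the transition function in shifting the dynamics between the two regimes can be illustrated with a short sketch. It assumes NumPy; the function names and any coefficient values passed to them are hypothetical, and the sign check ignores statistical significance, which the test described above requires.

```python
import numpy as np


def estar_transition(y_lag, gamma, c_star=0.0):
    """F = 1 - exp(-gamma * (y_{t-g} - c*)^2), which lies between 0 and 1."""
    y_lag = np.asarray(y_lag, dtype=float)
    return 1.0 - np.exp(-gamma * (y_lag - c_star) ** 2)


def effective_ar_sum(pi, pi_star, F):
    """Sum of AR coefficients when the transition function equals F.

    F = 0 gives the middle regime, sum(pi); F = 1 gives the outer
    regime, sum(pi + pi_star).
    """
    return float(np.sum(pi) + F * np.sum(pi_star))


def arbitrage_reading(pi_star):
    """Sign of sum(pi_star) only; significance testing is not done here."""
    s = float(np.sum(pi_star))
    if s < 0:
        return "risky arbitrage"
    if s > 0:
        return "limits of arbitrage"
    return "linear"
```

For intermediate values of F, `effective_ar_sum` returns the weighted combination of the sums in the two regimes, with weights (1 - F) and F.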
III. LINEARITY TESTING AND MODEL SELECTION
Terasvirta (1994) suggests testing linearity against ESTAR by first
specifying the appropriate order of the autoregressive component, q,
which may be chosen from an examination of the partial
autocorrelation function (PACF) of [y.sub.t] in the usual fashion. For a
given value of the delay parameter g, Granger and Terasvirta (1993) and
Terasvirta (1994) show that appropriate tests of the null hypothesis of
linearity against an alternative hypothesis of nonlinear adjustment may
be based on the artificial regression:
(7) $y_t = \beta_{00} + \sum_{i=1}^{q} \left( \beta_{1i} y_{t-i} + \beta_{2i} y_{t-i} y_{t-g} + \beta_{3i} y_{t-i} y_{t-g}^{2} + \beta_{4i} y_{t-i} y_{t-g}^{3} \right) + \epsilon_t.$
Since (7) may be viewed as a reparameterization of (2), with an
unrestricted third-order Taylor series expansion of the transition
function, an appropriate simple test of nonlinearity is clearly an
F-test, [F.sub.1], of the following restrictions on (7):
(8) $H_{01}: \beta_{2i} = \beta_{3i} = \beta_{4i} = 0, \quad i = 1, \ldots, q$
against the alternative that [H.sub.01] is not valid.
If the transition function is of the exponential family discussed
above, however, third-order terms vanish in its Taylor series expansion,
see Granger and Terasvirta (1993). Intuitively, because the exponential
transition function is U-shaped as a function of [y.sub.t-g], it will be
better approximated by a quadratic than by a cubic. Moreover, given that
we shall examine the behavior of [y.sub.t] with its mean removed, if the
dividend-price ratio averaged over the whole sample period has been
close to the equilibrium level, we would also expect the ESTAR model (2)
to satisfy [[pi].sup.*.sub.0] = [c.sup.*] = 0. If (7) is interpreted as
the Taylor series expansion of (2), this would further imply
[[beta].sub.2i] = 0 in (7). This reasoning therefore suggests the
following sequence of tests:
(9a) $H_{04}: \beta_{4i} = 0, \quad i = 1, \ldots, q$
(9b) $H_{03}: \beta_{3i} = 0 \mid \beta_{4i} = 0, \quad i = 1, \ldots, q$
(9c) $H_{02}: \beta_{2i} = 0 \mid \beta_{4i} = 0, \quad i = 1, \ldots, q$
where we might denote the relevant Wald-statistics for (9a), (9b),
and (9c) respectively by [W.sub.4], [W.sub.3], and [W.sub.2]. If the true
model is ESTAR, we would expect not to reject [H.sub.04] but to reject
[H.sub.03], and if in addition the sample mean value of [y.sub.t] is
close to the equilibrium value, we would expect not to reject
[H.sub.02].
Of course, in practice g is not known. We therefore follow the
procedure suggested by Granger and Terasvirta (1993) and Terasvirta
(1994) for selecting g. This involves testing the null hypothesis
[H.sub.01] for a range of values of g = 1, 2, ..., G, and in each case
calculating the Wald statistic [W.sub.1](g). The delay parameter is then
chosen as the value of g that maximizes [W.sub.1](g) over g = 1, ..., G.
Although it might be thought that maximizing the test statistic in this
fashion would generate substantial pre-test bias, the Monte Carlo evidence of Terasvirta (1994) suggests that this should only lead to
slight bias in the test size. If this procedure leads to linearity being
rejected in favor of an ESTAR(q) model, we follow Tong (1990) in
estimating (2) by nonlinear least squares, which provides estimators
that are consistent and asymptotically normally distributed. We use
heteroskedasticity-robust forms of these Wald statistics in our
empirical work (see White 1980).
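The testing sequence and the choice of delay described in this section can be sketched as follows. This is a minimal illustration, assuming NumPy and a demeaned series y; the function names are hypothetical, and the covariance estimator is the basic White (1980) form without small-sample corrections, which may differ from the exact variant used in the empirical work.

```python
import numpy as np


def artificial_regression(y, q, g):
    """Regressand and regressors of (7).

    Columns: constant, y_{t-i}, y_{t-i}*y_{t-g}, y_{t-i}*y_{t-g}^2,
    y_{t-i}*y_{t-g}^3, each block running over i = 1, ..., q.
    """
    y = np.asarray(y, dtype=float)
    m, T = max(q, g), len(y)
    lags = np.column_stack([y[m - i:T - i] for i in range(1, q + 1)])
    s = y[m - g:T - g][:, None]
    X = np.column_stack([np.ones(T - m), lags, lags * s,
                         lags * s ** 2, lags * s ** 3])
    return y[m:], X


def robust_wald(yv, X, idx):
    """Heteroskedasticity-robust Wald statistic for zero coefficients in idx."""
    XtX_inv = np.linalg.inv(X.T @ X)
    b = XtX_inv @ X.T @ yv
    e = yv - X @ b
    meat = (X * e[:, None] ** 2).T @ X
    V = XtX_inv @ meat @ XtX_inv
    bi = b[idx]
    return float(bi @ np.linalg.solve(V[np.ix_(idx, idx)], bi))


def linearity_tests(y, q, g):
    """W1 for (8) and W4, W3, W2 for the sequence (9a)-(9c)."""
    yv, X = artificial_regression(y, q, g)
    b2 = np.arange(q + 1, 2 * q + 1)
    b3 = np.arange(2 * q + 1, 3 * q + 1)
    b4 = np.arange(3 * q + 1, 4 * q + 1)
    X_no4 = X[:, :3 * q + 1]  # regression with beta_4i = 0 imposed
    return {
        "W1": robust_wald(yv, X, np.concatenate([b2, b3, b4])),
        "W4": robust_wald(yv, X, b4),
        "W3": robust_wald(yv, X_no4, b3),
        "W2": robust_wald(yv, X_no4, b2),
    }


def select_delay(y, q, G):
    """Delay g in 1, ..., G with the largest W1(g), plus all W1 values."""
    stats = {g: linearity_tests(y, q, g)["W1"] for g in range(1, G + 1)}
    return max(stats, key=stats.get), stats
```

Under the null, each statistic is asymptotically chi-squared with degrees of freedom equal to the number of restrictions (3q for W1, q for the others). The sketch returns the statistics only and leaves the comparison with critical values to the user.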
IV. EMPIRICAL RESULTS WITH U.S. DATA
Quarterly data on aggregate U.S. stock market prices and dividends,
for the 1926i--97iv period, were obtained from the data base of the
Center for Research in Securities Prices (CRSP) of the University of
Chicago. The stock price and dividend data are from the CRSP Indices
files, and the consumer price series used to deflate the nominal series
is from the Stocks, Bonds, Bills, and Inflation (SBBI) series (Ibbotson and
Associates). Table 1 reports some summary statistics on the series of
interest. (6) The sample autocorrelations of the price and dividend
series reveal some degree of persistence in each series as they tend to
die away slowly. The first-order autocorrelation values close to one
suggest that the series are non-stationary. The unit root tests confirm
that we cannot reject the null hypothesis of I(1) behavior for the
price, dividend and returns series at standard significance levels.
The results from two tests of cointegration are reported in Table
2. Although much of the analysis in the present article is predicated on
the assumption that adjustment in the dividend-price ratio is nonlinear,
Balke and Fomby (1997) have shown that standard cointegration procedures
work reasonably well in the nonlinear case and suggest that their
results are likely to hold for smooth transition models, although a
cautious note in interpreting the cointegration tests in the presence of
nonlinearity is provided by Corradi, Swanson, and White (2000). However,
even in the presence of nonlinearity, the Monte Carlo evidence of
Corradi, Swanson, and White (2000) shows that, in large samples, the
augmented Dickey-Fuller (ADF) test can still be used to signal the
presence of a unit root so long as a trend is included in the auxiliary
regression (i.e., so long as the [[tau].sub.[tau]] form of the ADF
statistic is used).
The ADF test with a lag length set at four rejects the null
hypothesis of no cointegration. If we impose a unit slope coefficient
and test the stationarity of the log dividend-price ratio, the ADF test
statistic is -3.47 (see Table 1). On the basis of these unit root test
statistics, we reject the hypothesis that the log dividend-price ratio
is non-stationary at the 5 percent level.
Further evidence is provided by the Johansen (1988) maximum
likelihood estimation technique which strongly rejects the null
hypothesis of no cointegration, with the long-run equilibrium given by
[d.sub.t] - 0.702[p.sub.t]. (7) Moreover, the likelihood ratio statistic
for the hypothesis that [d.sub.t] and [p.sub.t] are cointegrated with a
cointegrating vector [1, -1]', asymptotically distributed as
[[chi].sup.2](1) under the null hypothesis, is equal to 5.47 with a
marginal significance level of 2 percent. Notwithstanding Balke and
Fomby's (1997) evidence concerning the Johansen test in the
presence of nonlinearity, Corradi, Swanson, and White (2000) show that
it may be subject to substantial size distortions in such circumstances.
Hence, we are not unduly worried by the formal rejection, at the nominal
5 percent level, of the hypothesis that the cointegrating vector is [1,
-1]'; this may also be due to a failure to allow for time-varying
returns (see section V). With these caveats, the cointegration results
nevertheless suggest that the log present value model holds with the
adjustment to the long-run equilibrium given by the mean-reverting log
dividend-price ratio. (8)
Unit root tests suggest that [y.sub.t] and [[xi].sub.t] are
stationary for the implied annualized discount rates of 3.02 percent
([rho] = 0.9926) and 2.46 percent ([rho] = 0.9940). (9) The Wald tests
that asset returns are unpredictable, that is, that the coefficients on
the lagged [DELTA][d.sub.t] and [y.sub.t] are jointly zero, are
strongly significant at less than the 0.0001 percent level (see Table
3). Thus, for quarterly U.S. stock prices, and similarly to other
studies (for example, Campbell and Shiller [1988b]), the log present
value model is statistically rejected at all conventional significance
levels. A weak test of the present value model (i.e., that the log
dividend-price ratio Granger-causes changes in log dividends) is
supported by the data.
Given that there does appear to be a long-run relationship between
dividends and prices, however, this may be due to less than
instantaneous arbitrage, as discussed above, resulting in the log
dividend-price ratio following a nonlinear mean-reverting process.
In modeling and testing for nonlinearity we use the demeaned log
dividend-price ratio, [y.sub.t], presented in Figure 1. Visual
examination of the series does not suggest the presence of a regime
shift in the data or the presence of outliers which might spuriously generate evidence of nonlinearity. Moreover, there appeared to be no
strong visual evidence of variation in the mean of the series. This was
also confirmed by simple, recursive Chow tests on the estimated mean
(not reported). The series yields evidence of positive skewness (see
Table 1) and the computed Jarque-Bera statistic of 9.87 reveals
statistically significant non-normality at the 1 percent level. For a
large part of the 1926-56 period, the log dividend-price ratio is above
the mean (of the full series) and below the mean in the succeeding
period. Examination of the PACF of [y.sub.t] (not reported but available
on request) revealed significant correlations up to order four.
Accordingly the linearity tests are based on the artificial regression
(7) with q set equal to four. Table 4, which reports tests of linearity,
provides strong evidence of nonlinearity: [W.sub.1] rejects linearity at
the near zero percent level for g = 5 and the [W.sub.4], [W.sub.3], and
[W.sub.2] statistics strongly suggest that an ESTAR(4) model with g = 5
and [[pi].sub.0] = [c.sup.*] = 0 is the most appropriate
parameterization for [y.sub.t].
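The normality statistic quoted in the preceding paragraph can be related to its reference distribution with a short calculation. The sketch below assumes NumPy and uses the standard Jarque-Bera formula, which is asymptotically chi-squared with two degrees of freedom under normality; the function names are illustrative.

```python
import math

import numpy as np


def jarque_bera(x):
    """Return (JB, skewness, kurtosis) with JB = n/6 * (S^2 + (K - 3)^2 / 4)."""
    x = np.asarray(x, dtype=float)
    n = x.size
    z = x - x.mean()
    m2 = np.mean(z ** 2)
    skew = np.mean(z ** 3) / m2 ** 1.5
    kurt = np.mean(z ** 4) / m2 ** 2
    jb = n / 6.0 * (skew ** 2 + (kurt - 3.0) ** 2 / 4.0)
    return float(jb), float(skew), float(kurt)


def jb_pvalue(jb):
    """Upper-tail probability of a chi-squared(2) variate, exp(-jb / 2)."""
    return math.exp(-jb / 2.0)
```

Evaluating `jb_pvalue(9.87)` gives a marginal significance level below 0.01, in line with the rejection of normality at the 1 percent level reported in the text.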
As is common in the analysis of asset price data, see for example,
Campbell, Lo, and MacKinlay (1997), the modeling of [y.sub.t] as an
ESTAR(4) process indicated substantial autoregressive conditional
heteroskedasticity (ARCH; Engle [1982]) in the innovations. This
additional nonlinearity was therefore captured by modeling [y.sub.t] as
an ESTAR(4)-ARCH(1) process. (10) In estimating the nonlinear model we
follow Terasvirta (1994) and standardize the exponent of the transition
function F by dividing it by [[sigma].sup.2.sub.y], the sample variance
of [y.sub.t], and choose a starting value for the standardized
smoothness parameter equal to 1. The estimated nonlinear model is
(11).
[MATHEMATICAL EXPRESSION NOT REPRODUCIBLE IN ASCII]
[MATHEMATICAL EXPRESSION NOT REPRODUCIBLE IN ASCII]
[R.sup.2] = 0.7066 DW = 1.9601
ARR(4) = 1.1609 [[sigma].sup.2.sub.y] = 0.0847
The figures in parentheses are standard errors and t-ratios are
given in braces. [R.sup.2] is the proportion of the variation in
[y.sub.t] explained by the model; DW is the Durbin-Watson statistic;
ARR(4) denotes a Lagrange multiplier test statistic for up to
fourth-order autocorrelation of the residuals, as in Eitrheim and
Terasvirta (1996).
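The estimation step can be illustrated with a simplified sketch. It assumes NumPy, imposes [[pi].sub.0] = [[pi].sup.*.sub.0] = [c.sup.*] = 0 on a demeaned series, and replaces a general nonlinear least squares optimizer with a grid search over the standardized smoothness parameter: for a fixed value of that parameter, (2) is linear in the remaining coefficients, so ordinary least squares can be applied at each grid point. The ARCH component is not modeled, and the function names are hypothetical rather than the authors' routine.

```python
import numpy as np


def estar_design(y, q, g, gamma):
    """Regressand and regressors of the restricted ESTAR(q) for fixed gamma.

    The exponent of F is divided by the sample variance of y, so gamma
    is the standardized smoothness parameter.
    """
    y = np.asarray(y, dtype=float)
    m, T = max(q, g), len(y)
    lags = np.column_stack([y[m - i:T - i] for i in range(1, q + 1)])
    F = 1.0 - np.exp(-gamma * y[m - g:T - g] ** 2 / np.var(y))
    return y[m:], np.column_stack([lags, lags * F[:, None]])


def fit_estar(y, q, g, gammas):
    """Grid search over gamma with OLS for (pi, pi_star) at each point."""
    best = None
    for gamma in gammas:
        yv, X = estar_design(y, q, g, gamma)
        coef, *_ = np.linalg.lstsq(X, yv, rcond=None)
        ssr = float(np.sum((yv - X @ coef) ** 2))
        if best is None or ssr < best["ssr"]:
            best = {"gamma": float(gamma), "pi": coef[:q],
                    "pi_star": coef[q:], "ssr": ssr}
    return best
```

A grid centered on 1, the starting value mentioned above, is a natural choice; the grid minimizer could then be passed to a numerical optimizer as a starting point. Standard errors are not computed in this sketch.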
A simple linear modeling of [y.sub.t] as an AR(4) process reveals a
slightly lower goodness-of-fit than the nonlinear modeling of [y.sub.t]:
[MATHEMATICAL EXPRESSION NOT REPRODUCIBLE IN ASCII]
[R.sup.2] = 0.68 DW = 1.89 ARR(4) = 0.79
Since there was evidence of ARCH effects, we also modeled [y.sub.t]
as an AR(4)-ARCH(1) process:
[MATHEMATICAL EXPRESSION NOT REPRODUCIBLE IN ASCII]
[MATHEMATICAL EXPRESSION NOT REPRODUCIBLE IN ASCII]
[R.sup.2] = 0.68 DW = 1.91 ARR(4) = 0.70
and the results are similar to those of the simple AR(4) model.
Investigating more parsimonious ESTAR models reveals the variables
which can be omitted from the final nonlinear model specification. After
estimating the unrestricted model and deleting terms insignificant at
the five percent level, the parsimonious estimated nonlinear model is
[MATHEMATICAL EXPRESSION NOT REPRODUCIBLE IN ASCII]
[MATHEMATICAL EXPRESSION NOT REPRODUCIBLE IN ASCII]
[R.sup.2] = 0.7018 DW = 1.9641
ARR(4) = 1.2287 LR(4) = 0.3578
LR(4) denotes a likelihood ratio statistic for the parsimonious
restrictions implicit in the estimated model against the unrestricted
model. The estimated model clearly fits well, with well determined
coefficients and satisfactory diagnostics. The dynamics of the model are
interesting, since the first four estimated parameter values of the
ESTAR(4) model sum to greater than unity, implying that the log
dividend-price ratio actually tends to move away from the long-run
equilibrium implied by the present value model when it is in its close
neighborhood and is mean reverting when far away from its equilibrium
level.
The coefficient of determination in our preferred ESTAR-ARCH model,
at 0.70, is superior to that of the linear specifications, at 0.68,
albeit slightly so. The significant evidence of nonlinearity presented
in Table 4, however, shows that the linear models are actually
misspecified, so that, although there is only a marginal increase in
goodness of fit by allowing for nonlinearity, there is increased
confidence in the ability of the model to capture the stock price
dynamics.
The scatter plot of the estimated transition function against
[y.sub.t-5], given in Figure 2, shows that the log
dividend-price ratio is in fact more or less symmetrically distributed
around the estimated mean. This is confirmed by a simple count, which
reveals that some 47 percent of the deviations are above the mean, with
the remainder, roughly 53 percent, below. It is also apparent
from Figure 2 that for large deviations from the long-run equilibrium
there is some evidence of a fast adjustment back towards the
equilibrium, suggesting support for the risky arbitrage hypothesis. This
impression is confirmed formally, with the outer regime coefficient of
-0.620 significantly different from zero at the 0.001 percent level (in
the unrestricted model, the sum of the outer regime coefficients,
[summation over (4/i=1)][[pi].sup.*.sub.i], is -0.616 and is
significantly different from zero at the 0.442 percent level). Thus, the
degree of mean reversion increases significantly with the size of the
deviation of the log dividend-price ratio from the long-run equilibrium
level suggested by the fundamentals.
On the other hand, Figure 2 reveals that while mean reversion
increases with the degree of mispricing, the speed of mean reversion is
still relatively low for all but the largest deviations from
equilibrium. Shleifer and Vishny (1997) note that arbitrage is likely to
be weakest in sectors of the market in which arbitrage is particularly
risky, such as more volatile sectors. Accordingly, by concentrating on
the market index, we may be masking important limits to arbitrage effects in particular sectors, and this suggests an avenue for future
research.
V. TIME-VARYING EXPECTED RETURNS
The present value model defined by equation (1) assumes that
expected stock returns are constant. Recent empirical evidence, as in
Timmermann (1995) and Campbell, Lo, and MacKinlay (1997), suggests that
expected stock returns may be time-varying. A loglinear approximation of
the present value model outlined in Cuthbertson (1996, 347) and
Campbell, Lo, and MacKinlay (1997, 260-64), is given by:
(10) $y_t = d_t - p_t = \frac{k}{1-\rho} + E_t\left[ \sum_{j=0}^{\infty} \rho^{j} \left( -\Delta d_{t+1+j} + r_{t+1+j} \right) \right]$
where [r.sub.t+1] [equivalent to] log([P.sub.t+1] + [D.sub.t+1]) -
log([P.sub.t]) [approximately equal to] k + [rho] [p.sub.t+1] + (1 -
[rho])[d.sub.t+1] - [p.sub.t] is the ex post stock return, [rho]
[equivalent to] 1/[1 + exp(d - p)], k = -log([rho]) - (1 - [rho])
log(1/[rho] - 1), and d - p is the average log dividend-price ratio.
[P.sub.t] and [D.sub.t] are the real stock price and dividend series,
respectively. If [d.sub.t] and [p.sub.t] are each generated by an I(1)
process, then (10) implies that [y.sub.t] will be a stationary process
if and only if the stock return series [r.sub.t] is generated by a
stationary, I(0) process. In practice, Campbell, Lo, and MacKinlay
(1997) and Timmermann (1995) point out that, at least with U.S. data,
[r.sub.t] appears to be generated by a highly persistent process which
may be hard to distinguish from an I(1) process. By rearranging (10), we
redefine [y.sub.t] as:
(11) $y_t = d_t - p_t - \frac{1}{1-\rho} r_t = \frac{k}{1-\rho} + E_t\left( \sum_{j=0}^{\infty} \rho^{j} \left\{ -\Delta d_{t+1+j} + \frac{1}{1-\rho} \Delta r_{t+1+j} \right\} \right)$
which suggests testing for the stationarity of [y.sub.t] by testing
for cointegration between the log dividend-price ratio and the stock
return. We may then test for nonlinear adjustment in the behavior of the
redefined [y.sub.t], as outlined above.
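The rearrangement leading from (10) to (11) can be spelled out as follows. The sketch assumes 0 < [rho] < 1 and that the sums converge, and writes each future return as the current return plus cumulated changes.

```latex
\begin{aligned}
r_{t+1+j} &= r_t + \sum_{i=0}^{j} \Delta r_{t+1+i}, \\
\sum_{j=0}^{\infty} \rho^{j} r_{t+1+j}
  &= \frac{r_t}{1-\rho}
   + \sum_{i=0}^{\infty} \Delta r_{t+1+i} \sum_{j=i}^{\infty} \rho^{j}
   = \frac{r_t}{1-\rho}
   + \frac{1}{1-\rho} \sum_{i=0}^{\infty} \rho^{i} \Delta r_{t+1+i}.
\end{aligned}
```

Substituting this into (10), using E_t r_t = r_t, and moving r_t/(1 - \rho) to the left-hand side gives (11).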
Testing for stationarity of the log dividend-price ratio may be
problematic in the time-varying returns model as given by equation (11),
however. Redefining [y.sub.t] = [d.sub.t] - [p.sub.t] - [1/(1 -
[rho])][r.sub.t], where [r.sub.t] [equivalent to] log([P.sub.t] +
[D.sub.t]) - log([P.sub.t-1]) implies [y.sub.t] is a stationary process
if [d.sub.t], [p.sub.t], and [r.sub.t] are cointegrated. The ADF test
statistics reported in Table 1 do not reject the null hypothesis that
[d.sub.t] and [p.sub.t] are generated by unit root processes but reject
the hypothesis that [r.sub.t] has a unit root. However, further examination
of the partial autocorrelation function finds that [r.sub.t] is highly
persistent, with a root just within rather than actually on the unit
circle. This suggests that, in a finite-sample context, it may be
fruitful to test for a cointegrating relationship of the form of the
left side of (11), i.e., to treat [r.sub.t] as a unit root process.
We test for cointegration between the log dividend-price ratio and
the ex post rate of return. The Johansen (1988) maximum likelihood
technique strongly rejects the null hypothesis of no cointegration in
favor of one cointegrating vector at the 5 percent significance level,
with a cointegrating vector [1, -1, -1/(1 - [rho])]'. A likelihood
ratio test that the first two elements of this vector were indeed [1,
-1]' yielded a statistic insignificant at the 5 percent level. The
stationary long-run equilibrium is given by [y.sub.t] = [d.sub.t] -
[p.sub.t] - 98.6[r.sub.t] + 6.2, implying that [rho] = 0.98986. (12) An
alternative approach to calculating [rho] is to set it equal to
1/[1 + exp(d - p)], where d - p is the average log dividend-price ratio. This
yields a [rho] of 0.9895 which is very close to the [rho] implied by the
long-run equilibrium, as computed above. Therefore, it is no surprise
that both values of [rho] generate very similar results.
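The two ways of obtaining [rho] described above amount to simple arithmetic, sketched below in plain Python. The function names are illustrative; the only inputs taken from the text are the coefficient 98.6 on [r.sub.t] and the value 0.9895.

```python
import math


def rho_from_coint_coef(coef):
    """Solve coef = 1 / (1 - rho) for rho."""
    return 1.0 - 1.0 / coef


def rho_from_mean_log_dp(mean_log_dp):
    """rho = 1 / [1 + exp(d - p)], with d - p the average log dividend-price ratio."""
    return 1.0 / (1.0 + math.exp(mean_log_dp))


def mean_log_dp_from_rho(rho):
    """Inverse of rho_from_mean_log_dp: d - p = log(1/rho - 1)."""
    return math.log(1.0 / rho - 1.0)
```

With the estimated coefficient, `rho_from_coint_coef(98.6)` rounds to 0.98986, the value quoted in the text, and `mean_log_dp_from_rho(0.9895)` recovers the average log dividend-price ratio implied by the second approach.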
In modeling and testing for nonlinearity we use the deseasonalized
[y.sub.t]. Examination of the PACF of [y.sub.t] revealed significant
correlations up to order one, setting q = 1 for the linearity tests.
Table 5 provides strong evidence of nonlinearity: [W.sub.1] rejects
linearity at the near zero percent level for g = 4 and the [W.sub.4],
[W.sub.3] and [W.sub.2] statistics strongly suggest that an ESTAR(1)
model with g = 4 and [[pi].sub.0] = [c.sup.*] = 0 is the most
appropriate parameterization for [y.sub.t]. The estimated nonlinear
model is
[MATHEMATICAL EXPRESSION NOT REPRODUCIBLE IN ASCII]
[R.sup.2] = 0.13 DW = 2.04
ARR(4) = 1.59 [[sigma].sup.2.sub.y] = 119.05
Investigating more parsimonious models reveals the variables which
can be omitted from the final nonlinear model specification. After
estimating the unrestricted model and deleting terms insignificant at
the 5 percent level, the parsimonious estimated nonlinear model is
[MATHEMATICAL EXPRESSION NOT REPRODUCIBLE IN ASCII]
[R.sup.2] = 0.13 DW = 2.04
ARR(4) = 1.59 LR(1) = 0.04
LR(1) denotes a likelihood ratio statistic for the parsimonious
restrictions implicit in the estimated model against the unrestricted
model. The estimated model clearly fits well, with well determined
coefficients and satisfactory diagnostics. The dynamics of the model are
interesting, quite different from the constant returns case, since the
parameter estimates imply mean reversion even in the neighborhood of
equilibrium, although the speed of mean reversion rises as the degree of
mispricing increases.
As in the constant expected returns case, the outer regime
coefficient is statistically significantly different from zero at
standard significance levels. The coefficient value of -1.0482 suggests
that the degree of mean reversion increases significantly with the size
of the deviation from the present value representation. Furthermore,
the scatter plot of the estimated transition function against
[y.sub.t-4], given in Figure 3, shows that, for large deviations from
the long-run equilibrium there is evidence of a very fast adjustment
back towards the equilibrium. Thus, the time-varying expected returns
representation of the present value model also supports the risky
arbitrage hypothesis.
VI. CONCLUSION
The research reported in this paper represents a first attempt to
discriminate between recent alternative hypotheses concerning arbitrage
in financial markets. The evidence presented reveals that the market log
dividend-price ratio is approximated well by an ESTAR-ARCH model, so
that adjustment towards the long-run equilibrium implied by the
loglinear version of the present value model is nonlinear. The
parameters of the estimated nonlinear models imply significantly
increasing mean-reverting adjustment as the degree of mispricing rises.
These findings are consistent with the risky arbitrage hypothesis.
Further research might concentrate on particular sectors of the
market rather than focusing on the market index--for example, those with
higher than average volatility where the risks to arbitrage are greatest
and its effect is therefore likely to be weakest, as indicated in
Shleifer and Vishny (1997).
Gallagher: Economics Fellow, University of Oxford, United Kingdom,
and Department of Economics, University College Cork, Cork, Ireland.
Phone +353-21-4902974, Fax +353-21-4273920, E-mail l.gallagher@ucc.ie
Taylor: Centre for Economic Policy Research, United Kingdom, and
Professor of Economics and Finance, Warwick Business School, University
of Warwick, Coventry CV4 7AL, United Kingdom. Phone +44-24-765-72832,
Fax +44-24-765-73013, E-mail mark.taylor@wbs.warwick.ac.uk
(*.) We wish to thank three anonymous referees for their helpful
comments and suggestions.
(1.) There exist a number of competing theories that explain the
deviation of the market and fundamental values (represented by expected
value of future discounted dividends), including noise traders, fads,
and speculative bubbles, as in De Long, Shleifer, Summers and Waldmann
(1990). These theories suggest that stock prices move away from their
fundamental value for periods of time. Stock prices deviate from
fundamentals in a highly persistent way that reflects a random walk
process, see for example, Summers (1986) and Shleifer and Vishny (1997).
Furthermore, Summers (1986, 599) noted that "[r]isk-averse
speculators will only be willing to take limited positions when they
perceive valuation errors. Hence errors will not be eliminated unless
they are widely noticed." Therefore, although stock prices may
reflect their fundamentals in the long run, they may deviate
substantially from their fundamentals for long periods of time, as in De
Long, Shleifer, Summers and Waldmann (1990).
(2.) Shleifer and Summers (1990) provide an accessible summary of
the fundamental and noise trader risks facing arbitrageurs and the role
of investor sentiment in driving stock prices away from fundamentals.
(3.) Frankel and Froot (1986) and Goodhart (1988) apply a similar
approach to foreign exchange rates, suggesting that smart money may have
less and less influence on the market exchange rate as it moves away
from the fundamental level recognized by smart money. Kirman's
(1993) "epidemics of opinion" model is also closely related.
(4.) In order to avoid the problem that [p.sub.t] and [d.sub.t] are
not measured contemporaneously, in testing the cross-equation
restrictions, we follow Campbell and Shiller (1988b) in constructing
[y.sub.t] as [d.sub.t-1] - [p.sub.t].
(5.) The ESTAR model can be viewed as a generalization of the
exponential autoregressive (EAR) model with [k.sup.*] = [c.sup.*] = 0,
or as a generalization of a special case of a double-threshold
autoregressive (TAR) model, as in Terasvirta (1994).
(6.) The quarterly log dividend series reveals some degree of
seasonality. For this reason, the real log dividend series was
deseasonalized by regressing the series on seasonal dummies, and the
resulting deseasonalized series was used for empirical estimation.
(7.) Gonzalo and Lee (1998) suggest that both ADF and Johansen
cointegration tests be employed in testing for cointegration.
(8.) Previous evidence of cointegration between log real stock
prices and dividends is mixed; see, for example, Campbell and Shiller
(1988a, 1988b) and Cuthbertson, Hayes, and Nitzsche (1997). The majority
of studies find weak support for the cointegrating relationship and a
stationary log dividend-price ratio.
(9.) The estimates of [rho] are derived from OLS and Johansen ML
estimation of the cointegrating vector of real stock prices and real
dividends and are consistent with previous studies; see, for example,
Campbell and Shiller (1987), Campbell and Shiller (1988b), and
Cuthbertson, Hayes, and Nitzsche (1997).
(10.) In fact, we tried a range of generalized ARCH processes, as in
Bollerslev (1987), and found that a simple ARCH(1) formulation was
adequate in terms of the significance of estimated parameters.
(11.) Corradi, Swanson, and White (2000) show that the consistency
of estimated cointegrating parameters in a nonlinear setting is only
guaranteed in a very specialized first-order Markovian modeling
situation. Although we have imposed a particular form of the
cointegrating vector, [1, -1]', this suggests that the usual
superconsistency results associated with cointegration in the linear
case, as in Stock (1987), are not guaranteed in the present application.
Some caution should therefore be exercised, for example, in interpreting
the standard errors of the estimated parameters in our models, since
they are conditioned on an estimate of the cointegrating vector.
(12.) The value of [rho] is consistent with previous studies; see,
for example, Campbell, Lo, and MacKinlay (1997, 261).
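The deseasonalization described in note 6 can be illustrated with a short Python sketch. It assumes a regression on a full set of four quarterly dummies and no other regressors, in which case the OLS fitted values are the quarter-specific means and the residuals are the series minus its own-quarter mean. The function name, the toy data, and the choice to add back the overall mean are assumptions made for illustration; this is not the authors' code.

```python
from statistics import mean


def deseasonalize_quarterly(series, first_quarter=1):
    """Remove quarter-specific means from a quarterly series.

    Regressing a series on a full set of four quarterly dummies (and
    nothing else) gives fitted values equal to each quarter's mean, so
    the OLS residuals are the series minus its own-quarter mean.
    The overall mean is added back to preserve the level (an assumption).
    """
    # Quarter index (0..3) of every observation.
    quarters = [(first_quarter - 1 + i) % 4 for i in range(len(series))]
    quarter_means = {
        q: mean(x for x, qi in zip(series, quarters) if qi == q)
        for q in set(quarters)
    }
    overall = mean(series)
    return [x - quarter_means[q] + overall for x, q in zip(series, quarters)]


# Hypothetical toy data: a linear trend plus a fixed quarterly pattern.
pattern = [0.3, -0.1, -0.4, 0.2]
raw = [0.01 * t + pattern[t % 4] for t in range(40)]
adjusted = deseasonalize_quarterly(raw)
```

After adjustment the four quarter-specific means of `adjusted` coincide, which is the sense in which the seasonal pattern has been removed.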
REFERENCES
Balke, N. S., and T. B. Fomby. "Threshold Cointegration."
International Economic Review, 38(3), 1997, 627-45.
Bollerslev, T. "A Conditional Heteroskedastic Time Series
Model for Speculative Prices and Rates of Return." Review of
Economics and Statistics, 69, 1987, 542-47.
Campbell, J. Y., and A. Kyle. "Smart Money, Noise Trading, and
Stock Price Behavior." Review of Economic Studies, 60(1), 1993,
1-34.
Campbell, J. Y., A. W. Lo, and A. C. MacKinlay. The Econometrics of
Financial Markets. Princeton, NJ: Princeton University Press, 1997.
Campbell, J. Y., and R. J. Shiller. "Cointegration and Tests
of Present Value Models." Journal of Political Economy, 95(5),
1987, 1062-88.
-----. "Stock Prices, Earnings, and Expected Dividends."
Journal of Finance, 43(3), 1988a, 661-76.
-----. "The Dividend-Price Ratio and Expectations of Future
Dividends and Discount Factors." Review of Financial Studies, 1(3),
1988b, 195-228.
Chiang, R., I. Davidson, and J. Okunev. "Some Further
Theoretical and Empirical Implications Regarding the Relationship
Between Earnings, Dividends and Stock Prices." Journal of Banking
and Finance, 21, 1997, 17-35.
Cochrane, J. H. "Permanent and Transitory Components of GNP and Stock Prices." Quarterly Journal of Economics, 109(436), 1994,
241-65.
Corradi, V., N. R. Swanson, and H. White. "Testing for
Stationarity-Ergodicity and for Comovements Between Nonlinear Discrete
Time Markov Processes." Journal of Econometrics, 96(1), 2000,
39-73.
Cuthbertson, K. Quantitative Financial Economics. Chichester, U.K.:
John Wiley & Sons, 1996.
Cuthbertson, K., S. Hayes, and D. Nitzsche. "The Behaviour of
UK Stock Prices and Returns: Is the Market Efficient?" Economic
Journal, 107(443), 1997, 986-1008.
De Long, J. B., A. Shleifer, L. H. Summers, and R. J. Waldmann.
"Noise Trader Risk in Financial Markets." Journal of Political
Economy, 98(4), 1990, 703-38.
Dickey, D., and W. A. Fuller. "Distribution of the Estimators
for Autoregressive Time Series with a Unit Root." Journal of the
American Statistical Association, 74, 1979, 427-31.
Eitrheim, Ø., and T. Terasvirta. "Testing the
Adequacy of Smooth Transition Autoregressive Models." Journal of
Econometrics, 74(1), 1996, 59-75.
Engle, R. F. "Autoregressive Conditional Heteroskedasticity
with Estimates of the Variance of United Kingdom Inflation."
Econometrica, 50(4), 1982, 987-1008.
Frankel, J. A., and K. A. Froot. "The Dollar as an Irrational
Speculative Bubble: A Tale of Fundamentalists and Chartists." The
Marcus Wallenberg papers on International Finance, vol. 1, 1986, 27-55.
Gonzalo, J., and T.-H. Lee. "Pitfalls in Testing for Long Run
Relationships." Journal of Econometrics, 86(1), 1998, 129-54.
Goodhart, C. A. E. "The Foreign Exchange Market: A Random Walk
with a Dragging Anchor." Economica, 55, 1988, 437-60.
Granger, C. W. J., and T. Terasvirta. Modelling Nonlinear Economic
Relationships. Oxford, U.K.: Oxford University Press, 1993.
Grossman, S. J., and M. H. Miller. "Liquidity and Market
Structure." Journal of Finance, 43(3), 1988, 617-33.
Johansen, S. "Statistical Analysis of Cointegration
Vectors." Journal of Economic Dynamics and Control, 12(2/3), 1988,
231-54.
Kirman, A. P. "Ants, Rationality and Recruitment."
Quarterly Journal of Economics, 108, 1993, 137-56.
MacKinnon, J. G. "Critical Values for Cointegration
Tests," in Long-Run Economic Relationships, edited by R. F. Engle
and C. W. J. Granger. Oxford, U.K.: Oxford University Press, 1991,
267-76.
Shleifer, A., and L. H. Summers. "The Noise Trader Approach to
Finance." Journal of Economic Perspectives, 4(2), 1990, 19-33.
Shleifer, A., and R. W. Vishny. "The Limits of
Arbitrage." Journal of Finance, 52(1), 1997, 35-55.
Stock, J. H. "Asymptotic Properties of Least Squares
Estimators of Cointegrating Vectors." Econometrica, 55, 1987,
1035-56.
Summers, L. H. "Does the Stock Market Rationally Reflect
Fundamental Values?" Journal of Finance, 41(3), 1986, 591-601.
Terasvirta, T. "Specification, Estimation, and Evaluation of
Smooth Transition Autoregressive Models." Journal of the American
Statistical Association, 89(425), 1994, 208-18.
Timmermann, A. "Cointegration Tests of Present Value Models
with a Time-Varying Discount Factor." Journal of Applied
Econometrics, 10, 1995, 17-31.
Tong, H. Non-Linear Time Series: A Dynamical System Approach.
Oxford, U.K.: Oxford University Press, 1990.
White, H. "A Heteroskedasticity-Consistent Covariance Matrix Estimator and a Direct Test for Heteroskedasticity." Econometrica,
48, 1980, 817-38.
[Figure 1 omitted]
[Figure 2 omitted]
[Figure 3 omitted]
TABLE 1
Summary Statistics and Unit Root Tests
                                DF            ADF-[[tau].sub.[tau]]  ADF-[[tau].sub.[mu]]  Skew         Kurt
[d.sub.t]                       -2.794        -2.434                 -1.258                -0.666 (*)   -0.534
[DELTA][d.sub.t]                -35.688 (*)   -6.550 (*)             -6.561 (*)            -0.358 (*)   2.895 (*)
[p.sub.t]                                     -1.294                 -2.571                -0.229 (*)   -1.059 (*)
[DELTA][p.sub.t]                              -7.668 (*)             -7.670 (*)            -0.316 (*)   8.439 (*)
([d.sub.t] - [p.sub.t])                       -3.470 (*)             -2.974 (*)            0.458 (*)    -0.133
[DELTA]([d.sub.t] - [p.sub.t])                -8.097 (*)             -8.099 (*)            0.332 (*)    1.686 (*)
[r.sub.t]                                     -7.686 (*)             -7.676 (*)            0.364 (*)    8.532 (*)
[DELTA][r.sub.t]                              -12.365 (*)            -12.349 (*)           0.906 (*)    12.967 (*)
                                Autocorrelation, [rho](k)
                                [rho](1)    [rho](2)    [rho](3)    [rho](4)    [rho](5)    [rho](6)
[d.sub.t]                       .90 (*)     .89 (*)     .85 (*)     .89 (*)     .81 (*)     .79 (*)
[DELTA][d.sub.t]                -.64 (*)    .38 (*)     -.56 (*)    .76 (*)     -.53 (*)    .33 (*)
[p.sub.t]                       .97 (*)     .94 (*)     .91 (*)     .88 (*)     .86 (*)     .83 (*)
[DELTA][p.sub.t]                -.05        .01         .16         -.18 (*)    .01         .01
([d.sub.t] - [p.sub.t])         .77 (*)     .74 (*)     .63 (*)     .65 (*)     .53 (*)     .49 (*)
[DELTA]([d.sub.t] - [p.sub.t])  -.42 (*)    .16         -.29 (*)    .33 (*)     -.19 (*)    .11
[r.sub.t]                       -.06        .01         .16         -.18 (*)    .01         .01
[DELTA][r.sub.t]                -.53 (*)    -.04        .23 (*)     -.25 (*)    .09         .09
Notes: The sample period is 1926i-97iv. [rho](k) = autocorrelation between
[x.sub.t] and [x.sub.t-k], for x = {d, p, d - p, [DELTA]d, [DELTA]p,
[DELTA](d - p), r, [DELTA]r}. [d.sub.t] is the log real
dividend series, [p.sub.t] is the log real stock price series, [r.sub.t]
is the expected stock return series, and [DELTA] = (1 - L) denotes the
first difference. An asterisk denotes significantly different from zero
at the 5 percent level. Skew and Kurt denote standard skewness and
kurtosis statistics. The unit root tests are the Dickey-Fuller (DF) and
the augmented Dickey-Fuller (ADF) tests with a constant, and with and
without a time trend, for the null hypothesis that the series has a unit
root; see Dickey and Fuller (1979). The ADF unit root test is of the
[[tau].sub.[tau]] form if it includes a time trend and of the
[[tau].sub.[mu]] form if it does not. The lag truncation of four was
chosen as this ensured the absence of serial correlation in the residuals
of the ADF regression. The critical values for [[tau].sub.[tau]] and
[[tau].sub.[mu]] are -3.43 and -2.87 at the 5% level, and -3.99 and -3.46
at the 1% level of significance, respectively; see MacKinnon (1991).
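The descriptive statistics reported in Table 1 can be sketched in a few lines of Python. The sketch below is illustrative only: it uses the conventional moment-based definitions, and it assumes that Kurt is measured as excess kurtosis (relative to the normal value of 3), which the negative entries in the table suggest but the notes do not state. All function names and the toy series are hypothetical.

```python
def first_difference(x):
    """Apply (1 - L): x_t - x_{t-1}."""
    return [x[t] - x[t - 1] for t in range(1, len(x))]


def autocorrelation(x, k):
    """Sample autocorrelation between x_t and x_{t-k}."""
    n = len(x)
    m = sum(x) / n
    denom = sum((v - m) ** 2 for v in x)
    num = sum((x[t] - m) * (x[t - k] - m) for t in range(k, n))
    return num / denom


def skewness(x):
    """Third standardized sample moment."""
    n = len(x)
    m = sum(x) / n
    m2 = sum((v - m) ** 2 for v in x) / n
    m3 = sum((v - m) ** 3 for v in x) / n
    return m3 / m2 ** 1.5


def excess_kurtosis(x):
    """Fourth standardized sample moment minus 3 (assumed convention)."""
    n = len(x)
    m = sum(x) / n
    m2 = sum((v - m) ** 2 for v in x) / n
    m4 = sum((v - m) ** 4 for v in x) / n
    return m4 / m2 ** 2 - 3.0


# Hypothetical toy series standing in for a log price or dividend series.
toy = [1.0, 2.0, 3.0, 4.0, 5.0]
rho_1 = autocorrelation(toy, 1)
d_toy = first_difference(toy)
```

Applying the same functions to a series and to its first difference reproduces the row layout of Table 1 (level and [DELTA] rows) for any input data.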
TABLE 2
Results from Cointegration Tests
[d.sub.t] = -0.855 + 0.580[p.sub.t]       DW = 0.83
            (0.033)  (0.021)              [R.sup.2] = 0.73
            [25.932] [25.561]             ADF = -4.20
Johansen Maximum Likelihood Estimation
[d.sub.t] = 0.702[p.sub.t]
LR Test for cointegrating vector [1,-1]': [[chi].sup.2](1) = 5.47; P-value = 0.02
[H.sub.0]   [lambda]-Max   10% Critical Value   Trace   10% Critical Value
r = 0       17.43          10.60                18.64   13.31
r ≤ 1       1.21           2.71                 1.21    2.71
Notes: r denotes the number of cointegrating vectors. The lag truncation
of 4 was chosen using the Ljung-Box Q-statistic to ensure whiteness of
the residuals of the VAR and of the ADF regression (the ADF test is of
the [[tau].sub.[tau]] form). The critical values of the Johansen
cointegration tests
are those reported in CATS in RATS. The sample period is 1926i-97iv.
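Two of the entries in Table 2 can be cross-checked with a small Python sketch. It relies on two standard facts rather than anything specific to the authors' software: the upper-tail probability of a [[chi].sup.2](1) variate equals erfc(sqrt(x/2)), and in a bivariate system the trace statistic for r = 0 is the sum of the two [lambda]-max statistics. The variable and function names are hypothetical.

```python
from math import erfc, sqrt


def chi2_sf_1df(x):
    """Upper-tail probability P(X > x) for a chi-squared(1) variate.

    If X = Z**2 with Z standard normal, then
    P(X > x) = P(|Z| > sqrt(x)) = erfc(sqrt(x / 2)).
    """
    return erfc(sqrt(x / 2.0))


# LR test of the restriction [1, -1]' on the cointegrating vector (Table 2).
lr_stat = 5.47
p_value = chi2_sf_1df(lr_stat)

# Lambda-max statistics for H0: r = 0 and H0: r <= 1 (Table 2).
lambda_max = [17.43, 1.21]
trace_r0 = sum(lambda_max)   # trace statistic for r = 0
trace_r1 = lambda_max[1]     # trace statistic for r <= 1

print(f"p-value of LR test: {p_value:.4f}")
print(f"trace (r = 0): {trace_r0:.2f}; trace (r <= 1): {trace_r1:.2f}")
```

Under these assumptions the computed values agree with the reported P-value of 0.02 (to two decimals) and with the reported trace statistics of 18.64 and 1.21.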
TABLE 3
Tests of the Present Value Model
[rho] = 0.9926 (3.02% discount rate)
AIC/BIC selects a lag length of five
Test of present value model: [[chi].sup.2](10) = 756.548; P-value < 0.0001%
[R.sup.2] = 0.738; DW = 1.996; SSE = 6.725
[rho] = 0.9940 (2.46% discount rate)
AIC/BIC selects a lag length of five
Test of present value model: [[chi].sup.2](10) = 756.404; P-value < 0.0001%
[R.sup.2] = 0.738; DW = 1.996; SSE = 6.734
Granger Tests
[DELTA][d.sub.t] equation: [R.sup.2] = 0.561; [y.sub.t] Granger-causes [DELTA][d.sub.t] at 0.001%
[y.sub.t] equation: [R.sup.2] = 0.875; [DELTA][d.sub.t] Granger-causes [y.sub.t] at < 0.001%
Notes: The information criteria are the Akaike information criterion
(AIC) and the Bayes information criterion (BIC). White's
heteroskedasticity-consistent covariance matrix estimator is used in
constructing standard errors and test statistics. The sample period
is 1926i-97iv.
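The discount rates quoted beside each value of [rho] in Table 3 can be approximated with the short Python sketch below. The mapping used, an annual rate of (1/[rho])^4 - 1 for a quarterly discount factor [rho], is an assumption: the table does not state how the authors converted [rho] into a rate. Under that assumption the first value of [rho] gives the reported 3.02%, while the second comes out a couple of basis points below the reported 2.46%, a gap that may simply reflect rounding of [rho] to four decimals. The function name is hypothetical.

```python
def annual_discount_rate(rho, periods_per_year=4):
    """Annualized discount rate implied by a per-period discount factor.

    Assumes compounding over `periods_per_year` periods:
    (1 / rho) ** periods_per_year - 1.
    """
    return (1.0 / rho) ** periods_per_year - 1.0


# Values of rho taken from Table 3 (quarterly data).
for rho in (0.9926, 0.9940):
    rate = annual_discount_rate(rho)
    print(f"rho = {rho:.4f}: implied annual discount rate = {rate:.2%}")
```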
TABLE 4
p-values for the Linearity Tests of the Log Dividend-Price
Ratio, [y.sub.t]: AR(4)
         [W.sub.1]   [W.sub.4]   [W.sub.3]   [W.sub.2]
g = 1    0.3434      0.3761      0.1613      0.6353
g = 2    0.2768      0.9702      0.0207      0.6817
g = 3    0.5036      0.2171      0.5668      0.6387
g = 4    0.0090      0.0505      0.0159      0.3089
g = 5    0.0004      0.5697      0.0000      0.2308
g = 6    0.0253      0.0879      0.0334      0.3199
g = 7    0.2126      0.4848      0.2115      0.1776
g = 8    0.2979      0.3133      0.2846      0.3725
Notes: The sample period is 1926i-97iv. The long-run equilibrium
adjustment is given by the demeaned log dividend-price ratio, [y.sub.t]
= [d.sub.t] - [p.sub.t]. The artificial regression (7), used to
calculate the linearity Wald tests, is based on q set equal to four.
All test statistics were constructed using heteroskedasticity-robust
methods (see White [1980]).
TABLE 5
p-values for the Linearity Tests of the Log Dividend-Price
Ratio, [y.sub.t]: AR(1)
         [W.sub.1]   [W.sub.4]   [W.sub.3]   [W.sub.2]
g = 1    1.354e-6    0.2089      1.190e-7    0.5169
g = 2    0.1708      0.3220      0.5505      0.0549
g = 3    2.45e-10    1.413e-9    0.0018      0.2665
g = 4    2.58e-15    6.490e-6    6.77e-12    0.0603
g = 5    0.3874      0.9607      0.8348      0.0836
g = 6    0.4662      0.6761      0.9319      0.1231
g = 7    4.549e-5    0.2964      0.0004      0.0026
g = 8    0.0624      0.0224      0.9643      0.1482
Notes: The sample period is 1926i-97iv. The long-run equilibrium
adjustment is given by the demeaned log dividend-price ratio, [y.sub.t]
= [d.sub.t] - [p.sub.t]. The artificial regression (7), used to
calculate the linearity Wald tests, is based on q set equal to four.
All test statistics were constructed using heteroskedasticity-robust
methods (see White [1980]).
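One common way to read tables of this kind, following the rule proposed in Terasvirta (1994), is to pick the delay g at which linearity is rejected most strongly, that is, the g with the smallest p-value. The Python sketch below applies that rule to the [W.sub.1] columns of Tables 4 and 5. Whether the authors used exactly this rule, and this column, is an assumption; the function name is hypothetical, and the p-values are transcribed from the tables.

```python
# W_1 p-values transcribed from Tables 4 and 5, for delays g = 1, ..., 8.
w1_ar4 = [0.3434, 0.2768, 0.5036, 0.0090, 0.0004, 0.0253, 0.2126, 0.2979]
w1_ar1 = [1.354e-6, 0.1708, 2.45e-10, 2.58e-15,
          0.3874, 0.4662, 4.549e-5, 0.0624]


def strongest_rejection(p_values):
    """Return (g, p) for the delay with the smallest p-value (g starts at 1)."""
    idx = min(range(len(p_values)), key=lambda i: p_values[i])
    return idx + 1, p_values[idx]


g_ar4, p_ar4 = strongest_rejection(w1_ar4)
g_ar1, p_ar1 = strongest_rejection(w1_ar1)
print(f"AR(4): strongest rejection of linearity at g = {g_ar4} (p = {p_ar4})")
print(f"AR(1): strongest rejection of linearity at g = {g_ar1} (p = {p_ar1})")
```

On these inputs the rule selects g = 5 for the AR(4) specification and g = 4 for the AR(1) specification.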
ABBREVIATIONS
ADF: augmented Dickey-Fuller
ARCH: autoregressive conditional heteroskedasticity
DW: Durbin-Watson
CRSP: Center for Research in Securities Prices
ESTAR: exponential smooth transition autoregressive
PACF: partial autocorrelation function
SBBI: Stock, Bonds, Bills, and Inflation
STAR: smooth transition autoregressive