Article information

Title: Risky arbitrage, limits of arbitrage, and nonlinear adjustment in the dividend-price ratio.
Authors: Gallagher, Liam A.; Taylor, Mark P.
Journal: Economic Inquiry
Print ISSN: 0095-2583
Publication year: 2001
Issue: October
Language: English
Publisher: Western Economic Association International
Keywords: Arbitrage; Economics; Securities; Securities industry; Securities prices

Risky arbitrage, limits of arbitrage, and nonlinear adjustment in the dividend-price ratio.

Gallagher, Liam A. ; Taylor, Mark P.

LIAM A. GALLAGHER and MARK P. TAYLOR (*)

I. INTRODUCTION

The present value model of stock prices has generally been rejected on U.S. data, as in Campbell and Shiller (1987, 1988a, 1988b). (1) While this may be taken as evidence against the efficient markets hypothesis, an alternative view would be that the instantaneous arbitrage assumed in the simple present value model is too restrictive. In recent arbitrage models developed by, inter alios, Grossman and Miller (1988), De Long, Shleifer, Summers, and Waldmann (1990) and Campbell and Kyle (1993), arbitrage is generally less than perfect because arbitrageurs face either fundamental or noise trader risk. In particular, given that the actions of noise traders may lead to greater fundamental mispricing of an asset, perceived deviations of asset prices from their fundamental values represent risky arbitrage opportunities, as in Shleifer and Summers (1990). (2) Thus, small deviations from fundamentals may not be arbitraged because the perceived gains may not be enough to outweigh this risk. Given, however, a distribution of degrees of risk aversion across smart traders, arbitrage will increase as the degree of fundamental mispricing increases, so that arbitrage is stabilizing and becomes more stabilizing in extreme circumstances. Traditional arbitrage models, therefore, imply a degree of nonlinearity in asset price dynamics, for example, as in Chiang, Davidson, and Okunev (1997). We term this broad approach the "risky arbitrage" hypothesis.

This approach may be contrasted with the more recent "limits of arbitrage" hypothesis, suggested by Shleifer and Vishny (1997), in which arbitrage activity is viewed in an agency context. In the Shleifer-Vishny model, arbitrageurs (the agents) have access to funds mainly from outside investors (the principals), who will generally gauge the ability of arbitrageurs (and hence decide on the amount of funds to allocate to them) based on their past performance. Since the track record of smart arbitrageurs is likely to be poorest when prices have deviated far from their fundamental values, the implication of the limits of arbitrage hypothesis is that arbitrage is likely to be least effective in returning prices to fundamental values when investor sentiment has driven them far away. (3) That is, "[w]hen arbitrage requires capital, arbitrageurs can become most constrained when they have the best opportunities, i.e., when the mispricing they have bet against gets even worse" (Shleifer and Vishny [1997, 37]). In contrast, the risky arbitrage models are without agency problems and arbitrageurs are more aggressive when prices move further from fundamental values.

In this article we provide a simple test of these two alternative views of arbitrage activity by nonlinear time series modeling of the aggregate log dividend-price ratio, such that adjustment towards equilibrium varies nonlinearly with the size of the deviation from equilibrium.

The remainder of the paper is set out as follows. Section II presents a simple test of the risky arbitrage hypothesis against the alternative of the limits of arbitrage hypothesis using the logarithmic present value model and recently developed techniques in parametric nonlinear modeling. The procedure for selecting the appropriate modeling of the log dividend-price ratio is outlined in section III. Section IV describes the data and presents the empirical results. Section V extends the earlier empirical results to allow for time-varying returns in the present value representation. Section VI concludes the study.

II. THE LOG DIVIDEND-PRICE RATIO, NONLINEARITY, ARBITRAGE, AND ADJUSTMENT

The loglinear present value model can be expressed as:

(1) $y_t = d_t - p_t = -\sum_{j=0}^{\infty} \rho^{j} E_t \Delta d_{t+1+j} + k^{*}$

where $p_t$ and $d_t$ are the log of real stock prices and real dividends, respectively, $\rho = (1+R)^{-1}$, where $R$ is a constant discount rate equal to the average dividend-price ratio, and $k^{*}$ is a constant, for example, as in Campbell and Shiller (1988b). When $p_t$ and $d_t$ are first-difference stationary, I(1), equation (1) implies that they are also cointegrated with a cointegrating vector $[1, -1]'$, that is, that the log dividend-price ratio, $y_t = d_t - p_t$, is stationary.

The loglinear representation of the present value model of stock prices also implies a number of highly nonlinear cross-equation restrictions similar to those encountered in rational expectations models, as in Campbell and Shiller (1988b), Cuthbertson (1996), and Campbell, Lo, and MacKinlay (1997). Campbell and Shiller (1988b) show that, given $\rho$, the restrictions can be simplified to a linear form. A test of the cross-equation restrictions, with constant expected excess returns, is a Wald test for zero coefficients in a regression of $\xi_t$ on lagged $y_t$ and $\Delta d_t$, with the asset return $\xi_t \equiv k + \rho p_t + (1 - \rho) d_t - p_{t-1} = k - \rho y_t + y_{t-1} + \Delta d_t$, and the constant $k = -\log(\rho) - (1 - \rho)\log(1/\rho - 1)$. (4) Campbell and Shiller (1988b) also demonstrate that a weak test of the model is that the log dividend-price ratio Granger-causes changes in log dividends.

In aggregate U.S. stock market data, it is well known that the log dividend series is very close to a random walk, possibly with drift, as in Cochrane (1994) and Campbell, Lo, and MacKinlay (1997). Let this drift parameter, the average dividend growth, be $c$. Then the right-hand side of (1) collapses to a constant $k^{**} = -[c/(1 - \rho)] + k^{*}$. In this simple formulation, $k^{**}$ becomes the fundamental value of the log dividend-price ratio, so that stocks are judged to be fundamentally mispriced as their real price deviates from a constant multiple ($-k^{**}$) of the level of real dividends. A simple test of the risky arbitrage hypothesis against the alternative of the limits of arbitrage hypothesis then becomes a test of whether $y_t$ adjusts towards $k^{**}$ more quickly or more slowly as the size of the deviation of $y_t$ from $k^{**}$ grows.
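The collapse of the right-hand side of (1) to a constant can be checked in one line. The step below is a sketch that assumes $E_t \Delta d_{t+1+j} = c$ at every horizon $j$ (log dividends follow a random walk with drift $c$) and $0 < \rho < 1$, so that the geometric series converges:

```latex
\begin{aligned}
-\sum_{j=0}^{\infty} \rho^{j} E_t \Delta d_{t+1+j} + k^{*}
  &= -c \sum_{j=0}^{\infty} \rho^{j} + k^{*} \\
  &= -\frac{c}{1-\rho} + k^{*} \equiv k^{**}.
\end{aligned}
```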

A parsimonious parametric time series model of nonlinear mean reversion which has been shown to approximate well a broad range of nonlinearity is the smooth transition autoregressive (STAR) model, as in Terasvirta (1994). The exponential smooth transition autoregressive model of order q [ESTAR(q)] may be written:

(2) $y_t = \pi_0 + \sum_{i=1}^{q} \pi_i y_{t-i} + \{\pi^{*}_0 + \sum_{i=1}^{q} \pi^{*}_i y_{t-i}\} \times \{1 - \exp[-\gamma (y_{t-g} - c^{*})^{2}]\} + u_t$

where $y_t$ is assumed stationary and ergodic, $u_t$ is a stochastic disturbance term, and $\gamma > 0$. (5) The exponential transition function $F(y_{t-g}) = 1 - \exp[-\gamma (y_{t-g} - c^{*})^{2}]$ is U-shaped and bounded between zero and unity, with the (smoothness) parameter $\gamma$ determining the speed of the transition process between extreme regimes. The middle regime corresponds to $F = 0$, $y_{t-g} = c^{*}$, when (2) becomes a linear AR(q) model:
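The shape of the transition function can be checked numerically. The following is a minimal sketch of F as defined above; the function name and the parameter values in the loop are illustrative assumptions, not estimates from the paper.

```python
import math


def estar_transition(y_lag: float, gamma: float, c_star: float = 0.0) -> float:
    """Exponential transition function F = 1 - exp(-gamma * (y_lag - c_star)**2)."""
    if gamma <= 0:
        raise ValueError("gamma must be positive")
    return 1.0 - math.exp(-gamma * (y_lag - c_star) ** 2)


# Hypothetical deviations from equilibrium, with gamma = 1 chosen only for illustration.
for deviation in (0.0, 0.1, 0.5, 2.0):
    print(deviation, round(estar_transition(deviation, gamma=1.0), 4))
```

F equals zero at the equilibrium value, is symmetric in the sign of the deviation, and approaches one as the absolute deviation grows, which is what makes it U-shaped.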

(3) $y_t = \pi_0 + \sum_{i=1}^{q} \pi_i y_{t-i} + u_t$.

The outer regime corresponds to the limit $|y_{t-g} - c^{*}| \to \infty$, when $F = 1$ and (2) becomes a different AR(q) model:

(4) $y_t = (\pi_0 + \pi^{*}_0) + \sum_{i=1}^{q} (\pi_i + \pi^{*}_i) y_{t-i} + u_t$.

Intermediate values of F will result in the dynamics governed by a linear combination of (3) and (4) with the weights given by (1-F) and F respectively. Global stability of the ESTAR(q) model requires

(5) $\sum_{i=1}^{q} (\pi_i + \pi^{*}_i) < 1$.

Now, if

(6) $\sum_{i=1}^{q} (\pi_i + \pi^{*}_i) < \sum_{i=1}^{q} \pi_i$,

then this would imply that the degree of mean reversion grows as the deviation from the fundamental equilibrium grows, consistent with the risky arbitrage hypothesis. On the other hand, if the reverse inequality holds, then the degree of mean reversion shrinks as the degree of mispricing grows, consistent with the limits of arbitrage hypothesis. Thus, from (6), we can deduce that support for the risky arbitrage hypothesis would be provided if the estimated value of $\sum_{i=1}^{q} \pi^{*}_i$ were negative and significantly different from zero, while support for the limits of arbitrage hypothesis would be provided if this sum were positive and significantly different from zero.
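The logic of this sign test can be sketched in a few lines. The helper names and coefficient values below are hypothetical, and the sketch ignores statistical significance, which the test in the text additionally requires.

```python
def regime_persistence(pi, pi_star, F):
    """Sum of AR coefficients when the transition function takes the value F.

    Weights (1 - F) and F are applied to the inner regime (3) and outer regime (4).
    """
    if not 0.0 <= F <= 1.0:
        raise ValueError("F must lie in [0, 1]")
    inner = sum(pi)
    outer = sum(p + s for p, s in zip(pi, pi_star))
    return (1.0 - F) * inner + F * outer


def favoured_hypothesis(pi_star):
    """Read off the sign of the summed outer-regime shift (significance not checked)."""
    total = sum(pi_star)
    if total < 0:
        return "risky arbitrage"
    if total > 0:
        return "limits of arbitrage"
    return "neither"


# Hypothetical ESTAR(1) coefficients, for illustration only.
pi, pi_star = [0.9], [-0.5]
for F in (0.0, 0.5, 1.0):
    print(F, round(regime_persistence(pi, pi_star, F), 3))
print(favoured_hypothesis(pi_star))
```

A smaller sum of autoregressive coefficients means faster mean reversion, so a negative shift makes adjustment quicker as F rises towards one.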

III. LINEARITY TESTING AND MODEL SELECTION

Terasvirta (1994) suggests testing linearity against ESTAR by first specifying the appropriate order of the autoregressive components, q, and suggests choosing this from an examination of the partial autocorrelation function (PACF) of $y_t$ in the usual fashion. For a given value of the delay parameter g, Granger and Terasvirta (1993) and Terasvirta (1994) show that appropriate tests of the null hypothesis of linearity against an alternative hypothesis of nonlinear adjustment may be based on the artificial regression:

(7) $y_t = \beta_{00} + \sum_{i=1}^{q} (\beta_{1i} y_{t-i} + \beta_{2i} y_{t-i} y_{t-g} + \beta_{3i} y_{t-i} y^{2}_{t-g} + \beta_{4i} y_{t-i} y^{3}_{t-g}) + \epsilon_t$.

Since (7) may be viewed as a reparameterization of (2), with an unrestricted third-order Taylor series expansion of the transition function, an appropriate simple test of nonlinearity is clearly an F-test, $F_1$, of the following restrictions on (7):

(8) $H_{01}$: $\beta_{2i} = \beta_{3i} = \beta_{4i} = 0$, $i = 1, \ldots, q$

against the alternative that $H_{01}$ is not valid.

If the transition function is of the exponential family discussed above, however, third-order terms vanish in its Taylor series expansion, see Granger and Terasvirta (1993). Intuitively, because the exponential transition function is U-shaped as a function of $y_{t-g}$, it will be better approximated by a quadratic than by a cubic. Moreover, given that we shall examine the behavior of $y_t$ with its mean removed, if the dividend-price ratio averaged over the whole sample period has been close to the equilibrium level, we would also expect the ESTAR model (2) to satisfy $\pi^{*}_0 = c^{*} = 0$. If (7) is interpreted as the Taylor series expansion of (2), this would further imply $\beta_{2i} = 0$ in (7). This reasoning therefore suggests the following sequence of tests:

(9a) $H_{04}$: $\beta_{4i} = 0$, $i = 1, \ldots, q$

(9b) $H_{03}$: $\beta_{3i} = 0 \mid \beta_{4i} = 0$, $i = 1, \ldots, q$

(9c) $H_{02}$: $\beta_{2i} = 0 \mid \beta_{4i} = 0$, $i = 1, \ldots, q$

where we might denote the relevant Wald statistics for (9a), (9b), and (9c) respectively by $W_4$, $W_3$, and $W_2$. If the true model is ESTAR, we would expect not to reject $H_{04}$ but to reject $H_{03}$, and if in addition the sample mean value of $y_t$ is close to the equilibrium value, we would expect not to reject $H_{02}$.

Of course, in practice g is not known. We therefore follow the procedure suggested by Granger and Terasvirta (1993) and Terasvirta (1994) for selecting g. This involves testing the null hypothesis $H_{01}$ for a range of values of g = 1, 2, ..., G, and in each case calculating the Wald statistic $W_1(g)$. The delay parameter $\hat{g}$ is then chosen such that $W_1(\hat{g}) = \sup_{g} W_1(g)$, g = 1, ..., G. Although it might be thought that maximizing the test statistic in this fashion would generate substantial pre-test bias, the Monte Carlo evidence of Terasvirta (1994) suggests that this should only lead to slight bias in the test size. If this procedure leads to linearity being rejected in favor of an ESTAR(q) model, we follow Tong (1990) in estimating (2) by nonlinear least squares, which provides estimators that are consistent and asymptotically normally distributed. We use heteroskedasticity-robust forms of these Wald statistics in our empirical work (see White 1980).
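The delay-selection procedure can be sketched with the artificial regression (7). The code below is an illustration only: it computes a plain F-version of the test of $H_{01}$ by comparing restricted and unrestricted residual sums of squares, not the heteroskedasticity-robust Wald statistics used in the paper. The hand-rolled OLS solver, the function names, and the simulated ESTAR(1) data are all assumptions made for the example.

```python
import math
import random


def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[piv] = m[piv], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for k in range(col, n + 1):
                m[r][k] -= f * m[col][k]
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (m[i][n] - sum(m[i][k] * x[k] for k in range(i + 1, n))) / m[i][i]
    return x


def rss(X, y):
    """Residual sum of squares from an OLS regression of y on the columns of X."""
    k = len(X[0])
    xtx = [[sum(r[i] * r[j] for r in X) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * yi for r, yi in zip(X, y)) for i in range(k)]
    beta = solve(xtx, xty)
    return sum((yi - sum(b * xi for b, xi in zip(beta, r))) ** 2 for r, yi in zip(X, y))


def linearity_f(y, q, g, start=None):
    """F-statistic for H01 in regression (7): beta_2i = beta_3i = beta_4i = 0 for all i."""
    start = max(q, g) if start is None else start
    X0, X1, dep = [], [], []
    for t in range(start, len(y)):
        lags = [y[t - i] for i in range(1, q + 1)]
        z = y[t - g]
        linear = [1.0] + lags
        X0.append(linear)
        X1.append(linear + [lag * z ** p for p in (1, 2, 3) for lag in lags])
        dep.append(y[t])
    rss0, rss1 = rss(X0, dep), rss(X1, dep)
    df1, df2 = 3 * q, len(dep) - (4 * q + 1)
    return ((rss0 - rss1) / df1) / (rss1 / df2)


def select_delay(y, q, max_g):
    """Pick the delay g in 1..max_g with the largest statistic, on a common sample."""
    start = max(q, max_g)
    stats = {g: linearity_f(y, q, g, start) for g in range(1, max_g + 1)}
    return max(stats, key=stats.get), stats


# Simulated ESTAR(1) data with hypothetical parameters, for illustration only.
random.seed(0)
y = [0.0]
for _ in range(600):
    prev = y[-1]
    F = 1.0 - math.exp(-2.0 * prev ** 2)
    y.append(0.9 * prev - 0.8 * F * prev + random.gauss(0.0, 0.5))

g_hat, stats = select_delay(y, q=1, max_g=4)
print(g_hat, {g: round(v, 2) for g, v in stats.items()})
```

Using one common estimation sample for every candidate g keeps the statistics comparable when the maximum is taken.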

IV. EMPIRICAL RESULTS WITH U.S. DATA

Quarterly data on aggregate U.S. stock market prices and dividends, for the 1926i-97iv period, were obtained from the data base of the Center for Research in Securities Prices (CRSP) of the University of Chicago. The stock price and dividend data are from the CRSP Indices files, and the consumer price series used to deflate the nominal series is from the Stocks, Bonds, Bills, and Inflation (SBBI) series (Ibbotson and Associates). Table 1 reports some summary statistics on the series of interest. (6) The sample autocorrelations of the price and dividend series reveal some degree of persistence in each series as they tend to die away slowly. The first-order autocorrelation values close to one suggest that the series are non-stationary. The unit root tests confirm that we cannot reject the null hypothesis of I(1) behavior for the price, dividend and returns series at standard significance levels.

The results from two tests of cointegration are reported in Table 2. Although much of the analysis in the present article is predicated on the assumption that adjustment in the dividend-price ratio is nonlinear, Balke and Fomby (1997) have shown that standard cointegration procedures work reasonably well in the nonlinear case and suggest that their results are likely to hold for smooth transition models, although a cautious note in interpreting the cointegration tests in the presence of nonlinearity is provided by Corradi, Swanson, and White (2000). However, even in the presence of nonlinearity, the Monte Carlo evidence of Corradi, Swanson, and White (2000) shows that, in large samples, the augmented Dickey-Fuller (ADF) test can still be used to signal the presence of a unit root so long as a trend is included in the auxiliary regression (i.e., so long as the $\tau_{\tau}$ form of the ADF statistic is used).

The ADF test with a lag length set at four rejects the null hypothesis of no cointegration. If we impose a unit slope coefficient and test the stationarity of the log price-dividend ratio, the ADF test statistic is -3.47 (see Table 1). On the basis of these unit root test statistics, we reject the hypothesis that the log dividend-price ratio is non-stationary at the 5 percent level.

Further evidence is provided by the Johansen (1988) maximum likelihood estimation technique, which strongly rejects the null hypothesis of no cointegration, with the long-run equilibrium given by $d_t - 0.702 p_t$. (7) Moreover, the likelihood ratio statistic for the hypothesis that $d_t$ and $p_t$ are cointegrated with a cointegrating vector $[1, -1]'$, asymptotically distributed as $\chi^{2}(1)$ under the null hypothesis, is equal to 5.47 with a marginal significance level of 2 percent. Notwithstanding Balke and Fomby's (1997) evidence concerning the Johansen test in the presence of nonlinearity, Corradi, Swanson, and White (2000) show that it may be subject to substantial size distortions in such circumstances. Hence, we are not unduly worried by the formal rejection, at the nominal 5 percent level, of the hypothesis that the cointegrating vector is $[1, -1]'$; this may also be due to a failure to allow for time-varying returns (see section V). With these caveats, the cointegration results nevertheless suggest that the log present value model holds with the adjustment to the long-run equilibrium given by the mean-reverting log dividend-price ratio. (8)

Unit root tests suggest that $y_t$ and $\xi_t$ are stationary for the implied annualized discount rates of 3.02 percent ($\rho = 0.9926$) and 2.46 percent ($\rho = 0.9940$). (9) The Wald tests that asset returns are unpredictable, that is, that the coefficients on lagged $\Delta d_t$ and $y_t$ are jointly zero, are strongly significant at less than the 0.0001 percent level (see Table 3). Thus, for quarterly U.S. stock prices, and similarly to other studies (for example, Campbell and Shiller [1988b]), the log present value model is statistically rejected at all conventional significance levels. A weak test of the present value model (i.e., that the log dividend-price ratio Granger-causes dividends) is supported by the data. Given that there does appear to be a long-run relationship between dividends and prices, however, this may be due to less than instantaneous arbitrage, as discussed above, resulting in the log dividend-price ratio following a nonlinear mean-reverting process.

In modeling and testing for nonlinearity we use the demeaned log dividend-price ratio, $y_t$, presented in Figure 1. Visual examination of the series does not suggest the presence of a regime shift in the data or the presence of outliers which might spuriously generate evidence of nonlinearity. Moreover, there appeared to be no strong visual evidence of variation in the mean of the series. This was also confirmed by simple, recursive Chow tests on the estimated mean (not reported). The series yields evidence of positive skewness (see Table 1) and the computed Jarque-Bera statistic of 9.87 reveals statistically significant non-normality at the 1 percent level. For a large part of the 1926-56 period, the log dividend-price ratio is above the mean (of the full series) and below the mean in the succeeding period. Examination of the PACF of $y_t$ (not reported but available on request) revealed significant correlations up to order four. Accordingly the linearity tests are based on the artificial regression (7) with q set equal to four. Table 4, which reports tests of linearity, provides strong evidence of nonlinearity: $W_1$ rejects linearity at the near zero percent level for g = 5 and the $W_4$, $W_3$, and $W_2$ statistics strongly suggest that an ESTAR(4) model with g = 5 and $\pi_0 = c^{*} = 0$ is the most appropriate parameterization for $y_t$.

As is common in the analysis of asset price data, see for example, Campbell, Lo, and MacKinlay (1997), the modeling of $y_t$ as an ESTAR(4) process indicated substantial autoregressive conditional heteroskedasticity (ARCH; Engle [1982]) in the innovations. This additional nonlinearity was therefore captured by modeling $y_t$ as an ESTAR(4)-ARCH(1) process. (10) In estimating the nonlinear model we follow Terasvirta (1994) and standardize the exponent of the transition function F by dividing it by $\sigma^{2}_{y}$, the sample variance of $y_t$, and choosing a starting value for the standardized smoothness parameter equal to 1. The estimated nonlinear model is (11)
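One way to see why the exponent is standardized is that dividing by the sample variance makes the smoothness parameter scale-free, so a starting value of 1 is sensible whatever the units of the series. The sketch below illustrates this with made-up observations; the function name and data are assumptions, not the paper's series.

```python
import math
import statistics


def standardized_transition(y_lag, gamma, sample_var, c_star=0.0):
    """Transition function with its exponent divided by the sample variance of y."""
    return 1.0 - math.exp(-gamma * (y_lag - c_star) ** 2 / sample_var)


# Hypothetical demeaned observations, for illustration only.
y = [0.12, -0.30, 0.45, -0.05, 0.22, -0.41, -0.03]
scaled = [100.0 * v for v in y]  # the same series expressed in different units

var_y = statistics.variance(y)
var_scaled = statistics.variance(scaled)

F_raw = [standardized_transition(v, 1.0, var_y) for v in y]
F_scaled = [standardized_transition(v, 1.0, var_scaled) for v in scaled]
print([round(f, 4) for f in F_raw])
```

Rescaling the series leaves every value of the standardized transition function unchanged for a given value of the smoothness parameter.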

[MATHEMATICAL EXPRESSION NOT REPRODUCIBLE IN ASCII]

[MATHEMATICAL EXPRESSION NOT REPRODUCIBLE IN ASCII]

$R^{2}$ = 0.7066 DW = 1.9601

ARR(4) = 1.1609 $\sigma^{2}_{y}$ = 0.0847

The figures in parentheses are standard errors and t-ratios are given in braces. $R^{2}$ is the proportion of the variation in $y_t$ explained by the model; DW is the Durbin-Watson statistic; ARR(4) denotes a Lagrange multiplier test statistic for up to fourth-order autocorrelation of the residuals, as in Eitrheim and Terasvirta (1996).

A simple linear modeling of $y_t$ as an AR(4) process reveals a slightly lower goodness-of-fit than the nonlinear modeling of $y_t$:

[MATHEMATICAL EXPRESSION NOT REPRODUCIBLE IN ASCII]

$R^{2}$ = 0.68 DW = 1.89 ARR(4) = 0.79

Since there was evidence of ARCH effects, we also modeled $y_t$ as an AR(4)-ARCH(1) process:

[MATHEMATICAL EXPRESSION NOT REPRODUCIBLE IN ASCII]

[MATHEMATICAL EXPRESSION NOT REPRODUCIBLE IN ASCII]

$R^{2}$ = 0.68 DW = 1.91 ARR(4) = 0.70

and the results are similar to those of the simple AR(4) model.

Investigating more parsimonious ESTAR models reveals the variables which can be omitted from the final nonlinear model specification. After estimating the unrestricted model and deleting terms insignificant at the five percent level, the parsimonious estimated nonlinear model is

[MATHEMATICAL EXPRESSION NOT REPRODUCIBLE IN ASCII]

[MATHEMATICAL EXPRESSION NOT REPRODUCIBLE IN ASCII]

$R^{2}$ = 0.7018 DW = 1.9641

ARR(4) = 1.2287 LR(4) = 0.3578

LR(4) denotes a likelihood ratio statistic for the parsimonious restrictions implicit in the estimated model against the unrestricted model. The estimated model clearly fits well, with well determined coefficients and satisfactory diagnostics. The dynamics of the model are interesting, since the first four estimated parameter values of the ESTAR(4) model sum to greater than unity, implying that the log dividend-price ratio actually tends to move away from the long-run equilibrium implied by the present value model when it is in its close neighborhood and is mean reverting when far away from its equilibrium level.
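These dynamics, locally explosive near equilibrium but mean reverting far from it, can be illustrated with a deliberately simplified deterministic skeleton. The sketch below uses a first-order model with delay one rather than the estimated ESTAR(4) with g = 5; the inner coefficient of 1.02 and the smoothness parameter of 1 are hypothetical, while the shift of -0.62 merely echoes the order of magnitude of the outer-regime coefficient reported in the text.

```python
import math


def skeleton_path(y0, inner, shift, gamma, steps):
    """Iterate y_t = (inner + shift * F(y_{t-1})) * y_{t-1} with no disturbance term."""
    path = [y0]
    for _ in range(steps):
        prev = path[-1]
        F = 1.0 - math.exp(-gamma * prev ** 2)
        path.append((inner + shift * F) * prev)
    return path


# Hypothetical parameters: inner coefficient above one, negative outer-regime shift.
near = skeleton_path(0.01, inner=1.02, shift=-0.62, gamma=1.0, steps=2000)
far = skeleton_path(5.0, inner=1.02, shift=-0.62, gamma=1.0, steps=2000)

print(round(near[1] / near[0], 4))  # growth factor close to equilibrium, above one
print(round(far[1] / far[0], 4))    # contraction factor far from equilibrium, below one
print(round(near[-1], 4), round(far[-1], 4))
```

In this skeleton a small deviation initially grows and a large one shrinks, and both paths settle at the deviation where the effective coefficient equals one. With a disturbance term added the series would instead wander around equilibrium while being pulled back from large deviations.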

The coefficient of determination in our preferred ESTAR-ARCH model, at 0.7, is superior to that of the linear specifications at 0.68, albeit slightly so. The significant evidence of nonlinearity presented in Table 4, however, shows that the linear models are actually misspecified, so that, although there is only a marginal increase in goodness of fit by allowing for nonlinearity, there is increased confidence in the ability of the model to capture the stock price dynamics.

The scatter plot of the estimated transition function against $y_{t-5}$, given in Figure 2, shows that the log dividend-price ratio is in fact more or less symmetrically distributed around the estimated mean, and this is confirmed by a simple count which reveals that some 47 percent of the deviations are above the mean, with the remaining more or less half of the sample below. It is also apparent from Figure 2 that for large deviations from the long-run equilibrium there is some evidence of a fast adjustment back towards the equilibrium, suggesting support for the risky arbitrage hypothesis. This impression is confirmed formally, with the outer regime coefficient of -0.620 significantly different from zero at the 0.001 percent level (in the unrestricted model, the sum of the outer regime coefficients, $\sum_{i=1}^{4} \pi^{*}_i$, is -0.616 and is significantly different from zero at the 0.442 percent level). Thus, the degree of mean reversion increases significantly with the size of the deviation of the log dividend-price ratio from the long-run equilibrium level suggested by the fundamentals.

On the other hand, Figure 2 reveals that while mean reversion increases with the degree of mispricing, the speed of mean reversion is still relatively low for all but the largest deviations from equilibrium. Shleifer and Vishny (1997) note that arbitrage is likely to be weakest in sectors of the market in which arbitrage is particularly risky, such as more volatile sectors. Accordingly, by concentrating on the market index, we may be masking important limits to arbitrage effects in particular sectors, and this suggests an avenue for future research.

V. TIME-VARYING EXPECTED RETURNS

The present value model defined by equation (1) assumes that expected stock returns are constant. Recent empirical evidence, as in Timmermann (1995) and Campbell, Lo, and MacKinlay (1997), suggests that expected stock returns may be time-varying. A loglinear approximation of the present value model outlined in Cuthbertson (1996, 347) and Campbell, Lo, and MacKinlay (1997, 260-64), is given by:

(10) $y_t = d_t - p_t = [k/(1 - \rho)] + E_t[\sum_{j=0}^{\infty} \rho^{j} (-\Delta d_{t+1+j} + r_{t+1+j})]$

where $r_{t+1} \equiv \log(P_{t+1} + D_{t+1}) - \log(P_t) \approx k + \rho p_{t+1} + (1 - \rho) d_{t+1} - p_t$ is the ex post stock return, $\rho \equiv 1/[1 + \exp(d - p)]$, $k = -\log(\rho) - (1 - \rho)\log(1/\rho - 1)$, and $d - p$ is the average log dividend-price ratio. $P_t$ and $D_t$ are the real stock price and dividend series, respectively. If $d_t$ and $p_t$ are each generated by an I(1) process, then (10) implies that $y_t$ will be a stationary process if and only if the stock return series $r_t$ is generated by a stationary, I(0) process. In practice, Campbell, Lo, and MacKinlay (1997) and Timmermann (1995) point out that, at least with U.S. data, $r_t$ appears to be generated by a highly persistent process which may be hard to distinguish from an I(1) process. By rearranging (10), we redefine $y_t$ as:

(11) $y_t = d_t - p_t - [1/(1 - \rho)] r_t = [k/(1 - \rho)] + E_t(\sum_{j=0}^{\infty} \rho^{j} \{-\Delta d_{t+1+j} + [1/(1 - \rho)] \Delta r_{t+1+j}\})$

which suggests testing for the stationarity of $y_t$ by testing for cointegration between the log dividend-price ratio and the stock return. We may then test for nonlinear adjustment in the behavior of the redefined $y_t$, as outlined above.
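The rearrangement from (10) to (11) rests on rewriting the discounted sum of future returns in terms of the current return and future changes in returns. A sketch of that step, assuming $0 < \rho < 1$ and using the identity $r_{t+1+j} = r_t + \sum_{i=0}^{j} \Delta r_{t+1+i}$:

```latex
\begin{aligned}
\sum_{j=0}^{\infty} \rho^{j} r_{t+1+j}
  &= \sum_{j=0}^{\infty} \rho^{j} \Bigl( r_t + \sum_{i=0}^{j} \Delta r_{t+1+i} \Bigr) \\
  &= \frac{r_t}{1-\rho} + \sum_{i=0}^{\infty} \Delta r_{t+1+i} \sum_{j=i}^{\infty} \rho^{j} \\
  &= \frac{r_t}{1-\rho} + \frac{1}{1-\rho} \sum_{i=0}^{\infty} \rho^{i} \Delta r_{t+1+i}.
\end{aligned}
```

Substituting this expression into (10), taking expectations at time t (with $r_t$ known at t), and moving $r_t/(1-\rho)$ to the left-hand side gives (11).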

Testing for stationarity of the log dividend-price ratio may be problematic in the time-varying returns model as given by equation (11), however. Redefining $y_t = d_t - p_t - [1/(1 - \rho)] r_t$, where $r_t \equiv \log(P_t + D_t) - \log(P_{t-1})$, implies $y_t$ is a stationary process if $d_t$, $p_t$, and $r_t$ are cointegrated. The ADF test statistics reported in Table 1 support the null hypothesis that $d_t$ and $p_t$ are generated by unit root processes and reject the hypothesis that $r_t$ has a unit root. However, further examination of the partial autocorrelation function finds that $r_t$ is highly persistent, with a root just within rather than actually on the unit circle. This suggests that, in a finite-sample context, it may be fruitful to test for a cointegrating relationship of the form of the left side of (11), i.e., to treat $r_t$ as a unit root process.

We test for cointegration between the log dividend-price ratio and the ex post rate of return. The Johansen (1988) maximum likelihood technique strongly rejects the null hypothesis of no cointegration in favor of one cointegrating vector at the 5 percent significance level, with a cointegrating vector $[1, -1, -1/(1 - \rho)]'$. A likelihood ratio test that the first two elements of this vector were indeed $[1, -1]'$ yielded a statistic insignificant at the 5 percent level. The stationary long-run equilibrium is given by $y_t = d_t - p_t - 98.6 r_t + 6.2$, implying that $\rho = 0.98986$. (12) An alternative approach to calculate $\rho$ is to set it equal to $1/[1 + \exp(d - p)]$, where $d - p$ is the average log dividend-price ratio. This yields a $\rho$ of 0.9895, which is very close to the $\rho$ implied by the long-run equilibrium, as computed above. Therefore, it is no surprise that both values of $\rho$ generate very similar results.
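The two routes to $\rho$ described above involve only simple arithmetic, sketched below. The function names are illustrative; the inputs 98.6 and 0.9895 are the figures quoted in the text.

```python
import math


def rho_from_return_coefficient(coef: float) -> float:
    """Invert coef = 1 / (1 - rho), the coefficient on the return in the cointegrating vector."""
    return 1.0 - 1.0 / coef


def rho_from_mean_log_dp(mean_log_dp: float) -> float:
    """rho = 1 / [1 + exp(d - p)], with d - p the average log dividend-price ratio."""
    return 1.0 / (1.0 + math.exp(mean_log_dp))


def mean_log_dp_from_rho(rho: float) -> float:
    """Inverse of rho_from_mean_log_dp."""
    return math.log(1.0 / rho - 1.0)


rho_coint = rho_from_return_coefficient(98.6)
print(round(rho_coint, 5))  # 0.98986

# Average log dividend-price ratio implied by rho = 0.9895.
implied_mean = mean_log_dp_from_rho(0.9895)
print(round(implied_mean, 3))
```

The first calculation reproduces the value of 0.98986 implied by the estimated long-run equilibrium.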

In modelling and testing for nonlinearity we use the deseasonalised $y_t$. Examination of the PACF of $y_t$ revealed significant correlations up to order one, setting q = 1 for the linearity tests. Table 5 provides strong evidence of nonlinearity: $W_1$ rejects linearity at the near zero percent level for g = 4 and the $W_4$, $W_3$, and $W_2$ statistics strongly suggest that an ESTAR(1) model with g = 4 and $\pi_0 = c^{*} = 0$ is the most appropriate parameterization for $y_t$. The estimated nonlinear model is

[MATHEMATICAL EXPRESSION NOT REPRODUCIBLE IN ASCII]

$R^{2}$ = 0.13 DW = 2.04

ARR(4) = 1.59 $\sigma^{2}_{y}$ = 119.05

Investigating more parsimonious models reveals the variables which can be omitted from the final nonlinear model specification. After estimating the unrestricted model and deleting terms insignificant at the 5 percent level, the parsimonious estimated nonlinear model is

[MATHEMATICAL EXPRESSION NOT REPRODUCIBLE IN ASCII]

$R^{2}$ = 0.13 DW = 2.04

ARR(4) = 1.59 LR(1) = 0.04

LR(1) denotes a likelihood ratio statistic for the parsimonious restrictions implicit in the estimated model against the unrestricted model. The estimated model clearly fits well, with well determined coefficients and satisfactory diagnostics. The dynamics of the model are interesting, quite different from the constant returns case, since the parameter estimates imply mean reversion even in the neighborhood of equilibrium, although the speed of mean reversion rises as the degree of mispricing increases.

As in the constant expected returns case, the outer regime coefficient is statistically significantly different from zero at standard significance levels. The coefficient value of -1.0482 suggests that the degree of mean reversion increases significantly with the size of the deviation from the present value representation. Furthermore, the scatter plot of the estimated transition function against $y_{t-4}$, given in Figure 3, shows that, for large deviations from the long-run equilibrium, there is evidence of a very fast adjustment back towards the equilibrium. Thus, the time-varying expected returns representation of the present value model also supports the risky arbitrage hypothesis.

VI. CONCLUSION

The research reported in this paper represents a first attempt to discriminate between recent alternative hypotheses concerning arbitrage in financial markets. The evidence presented reveals that the market log dividend-price ratio is approximated well by an ESTAR-ARCH model, so that adjustment towards the long-run equilibrium implied by the loglinear version of the present value model is nonlinear. The parameters of the estimated nonlinear models imply significantly increasing mean-reverting adjustment as the degree of mispricing rises. These findings are consistent with the risky arbitrage hypothesis.

Further research might concentrate on particular sectors of the market rather than focusing on the market index, for example those with higher than average volatility, where the risks to arbitrage are greatest and its effect is therefore likely to be weakest, as indicated in Shleifer and Vishny (1997).

Gallagher: Economics Fellow, University of Oxford, United Kingdom, and Department of Economics, University College Cork, Cork, Ireland. Phone +353-21-4902974, Fax +353-21-4273920, E-mail I.gallagher@ucc.ie

Taylor: Centre for Economic Policy Research, United Kingdom, and Professor of Economics and Finance, Warwick Business School, University of Warwick, Coventry CV4 7AL, United Kingdom. Phone +44-24-765-72832, Fax +44-24-765-73013, E-mail mark.taylor@wbs.warwick.ac.uk

(*.) We wish to thank three anonymous referees for their helpful comments and suggestions.

(1.) There exist a number of competing theories that explain the deviation of the market and fundamental values (represented by the expected value of future discounted dividends), including noise traders, fads, and speculative bubbles, as in De Long, Shleifer, Summers and Waldmann (1990). These theories suggest that stock prices move away from their fundamental value for periods of time. Stock prices deviate from fundamentals in a highly persistent way that reflects a random walk process, see for example, Summers (1986) and Shleifer and Vishny (1997). Furthermore, Summers (1986, 599) noted that "[r]isk-averse speculators will only be willing to take limited positions when they perceive valuation errors. Hence errors will not be eliminated unless they are widely noticed." Therefore, although stock prices may reflect their fundamentals in the long run, they may deviate substantially from their fundamentals for long periods of time, as in De Long, Shleifer, Summers and Waldmann (1990).

(2.) Shleifer and Summers (1990) provide an accessible summary of the fundamental and noise trader risks facing arbitrageurs and the role of investor sentiment in driving stock prices away from fundamentals.

(3.) Frankel and Froot (1986) and Goodhart (1988) apply a similar approach to foreign exchange rates, suggesting that smart money may have less and less influence on the market exchange rate as it moves away from the fundamental level recognised by smart money. Kirman's (1993) "epidemics of opinion" model is also closely related.

(4.) In order to avoid the problem that $p_t$ and $d_t$ are not measured contemporaneously, in testing the cross-equation restrictions, we follow Campbell and Shiller (1988b) in constructing $y_t$ as $d_{t-1} - p_t$.

(5.) The ESTAR model can be viewed as a generalization of the exponential autoregressive (EAR) model with $k^* = c^* = 0$, or as a generalization of a special case of a double-threshold autoregressive (TAR) model, as in Terasvirta (1994).

(6.) The quarterly log dividend series reveals some degree of seasonality. For this reason, the real log dividend series was deseasonalized by regressing the series against seasonal dummies, and the deseasonalized series was used for empirical estimation.

(7.) Gonzalo and Lee (1998) suggest that both ADF and Johansen cointegration tests be employed in testing for cointegration.

(8.) Previous evidence of cointegration between log real stock prices and dividends is mixed; see, for example, Campbell and Shiller (1988a, 1988b) and Cuthbertson, Hayes, and Nitzsche (1997). The majority of studies find weak support for the cointegrating relationship and a stationary log dividend-price ratio.

(9.) The estimates of $\rho$ are derived from OLS and Johansen ML estimation of the cointegrating vector of real stock prices and real dividends and are consistent with previous studies; see, for example, Campbell and Shiller (1987, 1988b) and Cuthbertson, Hayes, and Nitzsche (1997).

(10.) In fact, we tried a range of generalized ARCH processes, as in Bollerslev (1987), and found that a simple ARCH(1) formulation was adequate in terms of the significance of estimated parameters.

(11.) Corradi, Swanson, and White (2000) show that the consistency of estimated cointegrating parameters in a nonlinear setting is only guaranteed in a very specialized first-order Markovian modeling situation. Although we have imposed a particular form of the cointegrating vector, $[1, -1]'$, this suggests that the usual superconsistency results associated with cointegration in the linear case, as in Stock (1987), are not guaranteed in the present application. Some caution should therefore be exercised, for example, in interpreting the standard errors of the estimated parameters in our models, since they are conditioned on an estimate of the cointegrating vector.

(12.) The value of $\rho$ is consistent with previous studies; see, for example, Campbell, Lo, and MacKinlay (1997, 261).
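
The deseasonalization described in note 6 can be sketched in Python as follows. This is a minimal illustration, not the authors' code: regressing a quarterly series on a full set of four seasonal dummies (without a separate intercept) is equivalent to subtracting each quarter's own mean, and the sketch additionally adds back the overall mean so that the level of the series is preserved. That last step, the function name, and the `first_quarter` parameter are assumptions made for the example.

```python
def deseasonalize_quarterly(series, first_quarter=1):
    """Remove quarter-specific means from a quarterly series.

    Equivalent to taking residuals from an OLS regression on four quarterly
    dummies, then (by assumption) adding back the overall sample mean.
    first_quarter is the quarter (1-4) of the first observation.
    """
    if not series:
        raise ValueError("series must not be empty")
    quarters = [(first_quarter - 1 + i) % 4 for i in range(len(series))]

    # Quarter-specific means are the OLS coefficients on the dummies.
    sums = [0.0] * 4
    counts = [0] * 4
    for q, value in zip(quarters, series):
        sums[q] += value
        counts[q] += 1
    quarter_means = [sums[q] / counts[q] if counts[q] else 0.0 for q in range(4)]

    overall_mean = sum(series) / len(series)
    return [value - quarter_means[q] + overall_mean
            for q, value in zip(quarters, series)]


if __name__ == "__main__":
    # A purely seasonal toy series: the pattern 1, 2, 3, 4 repeated.
    toy = [1.0, 2.0, 3.0, 4.0] * 3
    print(deseasonalize_quarterly(toy))
```

On the toy series every observation equals its own quarter mean, so the adjusted series is constant at the overall mean.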

REFERENCES

Balke, N. S., and T. B. Fomby. "Threshold Cointegration." International Economic Review, 38(3), 1997, 627-45.

Bollerslev, T. "A Conditional Heteroskedastic Time Series Model for Speculative Prices and Rates of Return." Review of Economics and Statistics, 69, 1987, 542-47.

Campbell, J. Y., and A. Kyle. "Smart Money, Noise Trading, and Stock Price Behavior." Review of Economic Studies, 60(1), 1993, 1-34.

Campbell, J. Y., A. W. Lo, and A. C. MacKinlay. The Econometrics of Financial Markets. Princeton, NJ: Princeton University Press, 1997.

Campbell, J. Y., and R. J. Shiller. "Cointegration and Tests of Present Value Models." Journal of Political Economy, 95(5), 1987, 1062-88.

-----. "Stock Prices, Earnings, and Expected Dividends." Journal of Finance, 43(3), 1988a, 661-76.

-----. "The Dividend-Price Ratio and Expectations of Future Dividends and Discount Factors." Review of Financial Studies, 1(3), 1988b, 195-228.

Chiang, R., I. Davidson, and J. Okunev. "Some Further Theoretical and Empirical Implications Regarding the Relationship Between Earnings, Dividends and Stock Prices." Journal of Banking and Finance, 21, 1997, 17-35.

Cochrane, J. H. "Permanent and Transitory Components of GNP and Stock Prices." Quarterly Journal of Economics, 109(436), 1994, 241-65.

Corradi, V., N. R. Swanson, and H. White. "Testing for Stationarity-Ergodicity and for Comovements Between Nonlinear Discrete Time Markov Processes." Journal of Econometrics, 96(1), 2000, 39-73.

Cuthbertson, K. Quantitative Financial Economics. Chichester, U.K.: John Wiley & Sons, 1996.

Cuthbertson, K., S. Hayes, and D. Nitzsche. "The Behaviour of UK Stock Prices and Returns: Is the Market Efficient?" Economic Journal, 107(443), 1997, 986-1008.

De Long, J. B., A. Shleifer, L. H. Summers, and R. J. Waldmann. "Noise Trader Risk in Financial Markets." Journal of Political Economy, 98(4), 1990, 703-38.

Dickey, D., and W. A. Fuller. "Distribution of the Estimators for Autoregressive Time Series with a Unit Root." Journal of the American Statistical Association, 74, 1979, 427-31.

Eitrheim, Ø., and T. Terasvirta. "Testing the Adequacy of Smooth Transition Autoregressive Models." Journal of Econometrics, 74(1), 1996, 59-75.

Engle, R. F. "Autoregressive Conditional Heteroskedasticity with Estimates of the Variance of United Kingdom Inflation." Econometrica, 50(4), 1982, 987-1008.

Frankel, J. A., and K. A. Froot. "The Dollar as an Irrational Speculative Bubble: A Tale of Fundamentalists and Chartists." The Marcus Wallenberg Papers on International Finance, vol. 1, 1986, 27-55.

Gonzalo, J., and T.-H. Lee. "Pitfalls in Testing for Long Run Relationships." Journal of Econometrics, 86(1), 1998, 129-54.

Goodhart, C. A. E. "The Foreign Exchange Market: A Random Walk with a Dragging Anchor." Economica, 55, 1988, 437-60.

Granger, C. W. J., and T. Terasvirta. Modelling Nonlinear Economic Relationships. Oxford, U.K.: Oxford University Press, 1993.

Grossman, S. J., and M. H. Miller. "Liquidity and Market Structure." Journal of Finance, 43(3), 1988, 617-33.

Johansen, S. "Statistical Analysis of Cointegration Vectors." Journal of Economic Dynamics and Control, 12(2/3), 1988, 231-54.

Kirman, A. P. "Ants, Rationality and Recruitment." Quarterly Journal of Economics, 108, 1993, 137-56.

MacKinnon, J. G. "Critical Values for Cointegration Tests," in Long-Run Economic Relationships, edited by R. F. Engle and C. W. J. Granger. Oxford, U.K.: Oxford University Press, 1991, 267-76.

Shleifer, A., and L. H. Summers. "The Noise Trader Approach to Finance." Journal of Economic Perspectives, 4(2), 1990, 19-33.

Shleifer, A., and R. W. Vishny. "The Limits of Arbitrage." Journal of Finance, 52(1), 1997, 35-55.

Stock, J. H. "Asymptotic Properties of Least Squares Estimators of Cointegrating Vectors." Econometrica, 55, 1987, 1035-56.

Summers, L. H. "Does the Stock Market Rationally Reflect Fundamental Values?" Journal of Finance, 41(3), 1986, 591-601.

Terasvirta, T. "Specification, Estimation, and Evaluation of Smooth Transition Autoregressive Models." Journal of the American Statistical Association, 89(425), 1994, 208-18.

Timmermann, A. "Cointegration Tests of Present Value Models with a Time-Varying Discount Factor." Journal of Applied Econometrics, 10, 1995, 17-31.

Tong, H. Non-Linear Time Series: A Dynamical System Approach. Oxford, U.K.: Oxford University Press, 1990.

White, H. "A Heteroskedasticity-Consistent Covariance Matrix Estimator and a Direct Test for Heteroskedasticity." Econometrica, 48, 1980, 817-38.

[Figure 1 omitted]

[Figure 2 omitted]

[Figure 3 omitted]

TABLE 1

Summary Statistics and Unit Root Tests


                          DF            ADF-$\tau_\tau$    ADF-$\tau_\mu$     Skew          Kurt

$d_t$                     -2.794        -2.434             -1.258             -0.666 (*)    -0.534
$\Delta d_t$              -35.688 (*)   -6.550 (*)         -6.561 (*)         -0.358 (*)    2.895 (*)
$p_t$                                   -1.294             -2.571             -0.229 (*)    -1.059 (*)
$\Delta p_t$                            -7.668 (*)         -7.670 (*)         -0.316 (*)    8.439 (*)
$d_t - p_t$                             -3.470 (*)         -2.974 (*)         0.458 (*)     -0.133
$\Delta(d_t - p_t)$                     -8.097 (*)         -8.099 (*)         0.332 (*)     1.686 (*)
$r_t$                                   -7.686 (*)         -7.676 (*)         0.364 (*)     8.532 (*)
$\Delta r_t$                            -12.365 (*)        -12.349 (*)        0.906 (*)     12.967 (*)

                          Autocorrelation, $\rho(k)$
                          $\rho(1)$    $\rho(2)$    $\rho(3)$    $\rho(4)$    $\rho(5)$    $\rho(6)$

$d_t$                     .90 (*)      .89 (*)      .85 (*)      .89 (*)      .81 (*)      .79 (*)
$\Delta d_t$              -.64 (*)     .38 (*)      -.56 (*)     .76 (*)      -.53 (*)     .33 (*)
$p_t$                     .97 (*)      .94 (*)      .91 (*)      .88 (*)      .86 (*)      .83 (*)
$\Delta p_t$              -.05         .01          .16          -.18 (*)     .01          .01
$d_t - p_t$               .77 (*)      .74 (*)      .63 (*)      .65 (*)      .53 (*)      .49 (*)
$\Delta(d_t - p_t)$       -.42 (*)     .16          -.29 (*)     .33 (*)      -.19 (*)     .11
$r_t$                     -.06         .01          .16          -.18 (*)     .01          .01
$\Delta r_t$              -.53 (*)     -.04         .23 (*)      -.25 (*)     .09          .09

Notes: The sample period is 1926i-97iv. $\rho(k)$ is the autocorrelation
between $x_t$ and $x_{t-k}$, for $x = \{d, p, d - p, \Delta d, \Delta p,
\Delta(d - p), r, \Delta r\}$. $d_t$ is the log real dividend series,
$p_t$ is the log real stock price series, $r_t$ is the expected stock
return series, and $\Delta = (1 - L)$ denotes the first difference. An
asterisk denotes significantly different from zero at the 5 percent
level. Skew and Kurt denote standard skewness and kurtosis statistics.
The unit root tests are the Dickey-Fuller (DF) and the Augmented
Dickey-Fuller (ADF) tests with constant and with and without time trend,
for the null hypothesis that the series has a unit root; see Dickey and
Fuller (1979). The ADF unit root test is of the $\tau_\tau$ form if it
includes a time trend and of the $\tau_\mu$ form if it does not. A lag
truncation of four was chosen because it ensured the absence of serial
correlation in the residuals of the ADF regression. The critical values
for $\tau_\tau$ and $\tau_\mu$ are -3.43 and -2.87 at the 5% level, and
-3.99 and -3.46 at the 1% level of significance, respectively; see
MacKinnon (1991).
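
The autocorrelations $\rho(k)$ and the first-difference operator defined in the notes above can be computed along the following lines. This is a minimal pure-Python sketch under the usual sample-autocorrelation convention (full-sample mean and variance in the denominator); the function names are hypothetical, and the paper does not state which estimator or software was used.

```python
def first_difference(series):
    """Apply Delta = (1 - L): return x_t - x_{t-1} for t = 1, ..., n - 1."""
    return [series[t] - series[t - 1] for t in range(1, len(series))]


def autocorrelation(series, lag):
    """Sample autocorrelation between x_t and x_{t-lag}."""
    n = len(series)
    if not 0 < lag < n:
        raise ValueError("lag must satisfy 0 < lag < len(series)")
    mean = sum(series) / n
    denom = sum((x - mean) ** 2 for x in series)
    if denom == 0.0:
        raise ValueError("series has zero variance")
    numer = sum((series[t] - mean) * (series[t - lag] - mean)
                for t in range(lag, n))
    return numer / denom


if __name__ == "__main__":
    # Toy data only: an alternating series has strongly negative rho(1).
    toy = [1.0, -1.0] * 50
    for k in range(1, 7):
        print(f"rho({k}) = {autocorrelation(toy, k):.2f}")
```

Applying `autocorrelation` to a level series and to its `first_difference` gives the two kinds of rows reported in the table for each variable.
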
TABLE 2

Results from Cointegration Tests


$d_t = -0.855 + 0.580 p_t$          DW = 0.83
       (0.033)   (0.021)             $R^2$ = 0.73
       [25.932]  [25.561]            ADF = -4.20

Johansen Maximum Likelihood Estimation
$d_t = 0.702 p_t$
LR test for cointegrating vector $[1, -1]'$: $\chi^2(1) = 5.47$; P-value = 0.02

$H_0$         $\lambda$-Max    10% Critical Values    Trace    10% Critical Values

$r = 0$       17.43            10.60                  18.64    13.31
$r \le 1$     1.21             2.71                   1.21     2.71

Notes: r denotes the number of cointegrating vectors. The lag truncation
of 4 was chosen using the Ljung-Box Q-statistic to ensure whiteness of
the VAR (and ADF regression--the ADF test is of the $\tau_\tau$ form)
residuals. The critical values of the Johansen cointegration tests are
those reported in CATS in RATS. The sample period is 1926i-97iv.
TABLE 3

Tests of the Present Value Model

$\rho = 0.9926$ (3.02% discount rate)
 AIC/BIC selects a lag length of five
 Test of present value model: $\chi^2(10) = 756.548$; P-value < 0.0001%
 $R^2 = 0.738$; DW = 1.996; SSE = 6.725
$\rho = 0.9940$ (2.46% discount rate)
 AIC/BIC selects a lag length of five
 Test of present value model: $\chi^2(10) = 756.404$; P-value < 0.0001%
 $R^2 = 0.738$; DW = 1.996; SSE = 6.734
Granger Tests
 $\Delta d_t$ equation: $R^2 = 0.561$; $y_t$ Granger-causes $\Delta d_t$ at 0.001%
 $y_t$ equation: $R^2 = 0.875$; $\Delta d_t$ Granger-causes $y_t$ at < 0.001%

Notes: The information criteria are the Akaike information criterion
(AIC) and the Bayes information criterion (BIC). White's
heteroskedasticity-consistent covariance matrix estimator is used in
constructing standard errors and test statistics. The sample period
is 1926i-97iv.
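
The discount rates quoted in parentheses in the table can be related to the quarterly values of $\rho$ with a short calculation. The sketch below assumes that $\rho$ is a quarterly discount factor and that the annual rate is obtained by compounding over four quarters, that is, $\rho^{-4} - 1$; this mapping and the function name are assumptions for illustration, since the table does not state the formula used.

```python
def annual_discount_rate(rho, periods_per_year=4):
    """Annual discount rate implied by a per-period discount factor rho.

    Assumes compounding over periods_per_year periods: rho ** (-m) - 1.
    """
    if not 0.0 < rho <= 1.0:
        raise ValueError("rho must lie in (0, 1]")
    return rho ** (-periods_per_year) - 1.0


if __name__ == "__main__":
    for rho in (0.9926, 0.9940):
        rate = annual_discount_rate(rho)
        print(f"rho = {rho:.4f} -> annual discount rate = {100 * rate:.2f}%")
```

Under this assumption, 0.9926 gives about 3.02%, matching the table. For 0.9940 the same formula gives about 2.44% rather than the reported 2.46%, a gap that could arise if the reported $\rho$ is rounded to four decimal places.
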
TABLE 4

p-values for the Linearity Tests of the Log Dividend-Price Ratio, $y_t$: AR(4)

           $W_1$     $W_4$     $W_3$     $W_2$

g = 1      0.3434    0.3761    0.1613    0.6353
g = 2      0.2768    0.9702    0.0207    0.6817
g = 3      0.5036    0.2171    0.5668    0.6387
g = 4      0.0090    0.0505    0.0159    0.3089
g = 5      0.0004    0.5697    0.0000    0.2308
g = 6      0.0253    0.0879    0.0334    0.3199
g = 7      0.2126    0.4848    0.2115    0.1776
g = 8      0.2979    0.3133    0.2846    0.3725

Note: The sample period is 1926i-97iv. The long-run equilibrium
adjustment is given by the demeaned log dividend-price ratio, $y_t =
d_t - p_t$. The artificial regression (7), used to calculate the
linearity Wald tests, is based on q set equal to four. All test
statistics were constructed using heteroskedasticity-robust methods
(see White [1980]).
TABLE 5

p-values for the Linearity Tests of the Log Dividend-Price Ratio, $y_t$: AR(1)

           $W_1$       $W_4$       $W_3$       $W_2$

g = 1      1.354e-6    0.2089      1.190e-7    0.5169
g = 2      0.1708      0.3220      0.5505      0.0549
g = 3      2.45e-10    1.413e-9    0.0018      0.2665
g = 4      2.58e-15    6.490e-6    6.77e-12    0.0603
g = 5      0.3874      0.9607      0.8348      0.0836
g = 6      0.4662      0.6761      0.9319      0.1231
g = 7      4.549e-5    0.2964      0.0004      0.0026
g = 8      0.0624      0.0224      0.9643      0.1482

Notes: The sample period is 1926i-1997iv. The long-run equilibrium
adjustment is given by the demeaned log dividend-price ratio, $y_t =
d_t - p_t$. The artificial regression (7), used to calculate the
linearity Wald tests, is based on q set equal to four. All the test
statistics were constructed using heteroskedasticity-robust methods
(see White [1980]).

ABBREVIATIONS

ADF: augmented Dickey-Fuller

ARCH: autoregressive conditional heteroskedasticity

DW: Durbin-Watson

CRSP: Center for Research in Security Prices

ESTAR: exponential smooth transition autoregressive

PACF: partial autocorrelation function

SBBI: Stocks, Bonds, Bills, and Inflation

STAR: smooth transition autoregressive