Fiscal readjustments in the United States: a nonlinear time-series analysis.
Cipollini, Andrea; Fattouh, Bassam; Mouratidis, Kostas
I. INTRODUCTION
The recent deterioration in the U.S. budget deficit has raised serious
concerns about the long-run sustainability of U.S. fiscal policy. In
addressing this issue, many studies have examined whether U.S. fiscal
policy respects the intertemporal government budget constraint. This
constraint implies that Ponzi games in which the government rolls over
its debt in full every period by borrowing to cover both principal and
interest payments are ruled out as a viable option for government
finances. The no-Ponzi game restriction, which is regarded as synonymous
with sustainability, requires that today's government debt is
matched by an excess of future primary surpluses over primary deficits
in present value terms. This condition imposes testable restrictions on
the time-series properties of key fiscal measures such as the stock of
public debt, the budget deficit, and the long-run relationship between
government expenditures and revenues.
In a seminal article, Hamilton and Flavin (1986) suggest that a
sufficient condition for the intertemporal budget constraint to hold is
for the deficit inclusive of interest payments to be stationary. Wilcox
(1989) extends the work of Hamilton and Flavin by allowing stochastic interest rates and nonstationarity in the noninterest surplus. He shows
that when the sustainability condition holds, the present value of the
stock of public debt should be stationary with an unconditional mean
of zero. Trehan and Walsh (1988) generalize the Hamilton and Flavin
result and show that if debt and deficits are integrated of order 1, and
if interest rates are constant, then a necessary and sufficient
condition for sustainability is that debt and primary balances
(net-of-interest deficits) are cointegrated. Other studies examine the
time-series properties of government spending and revenues. For
instance, Hakkio and Rush (1991) show that a necessary condition for the
intertemporal budget constraint to hold is the existence of cointegration
between government expenditure (inclusive of interest payments) and
government revenues. Quintos (1995) expands on Hakkio and Rush (1991)
and introduces the concept of a "strong" sustainability condition, which
implies that the undiscounted public debt is finite in the long run.
More recent work has emphasized the importance of nonlinearity in
U.S. fiscal policy. Such nonlinearity may arise if fiscal
authorities react differently depending on whether the deficit has reached a
certain threshold deemed to be unacceptable or unsustainable. Bertola
and Drazen (1993) develop a framework that allows for trigger points in
the process of fiscal adjustment such that significant adjustments in
budget deficit may take place only when the ratio of deficit to output
reaches a certain threshold. This may reflect the existence of political
constraints that block deficit cuts, which are relaxed only when the
deficit reaches a sufficiently high level deemed to be unsustainable
(Alesina and Drazen 1991; Bertola and Drazen 1993).
Recent studies have found strong evidence of nonlinearity in U.S.
fiscal policy. Using an exponential smooth transition autoregressive
model and long-span data set starting from 1916, Sarno (2001) provides
evidence of nonlinear mean reversion in the U.S. debt-to-gross domestic
product (GDP) ratio. By using a threshold autoregressive model, Arestis,
Cipollini, and Fattouh (2004) provide evidence of threshold effects such
that policymakers will intervene to reduce per capita deficit only when
it reaches a certain threshold.
In line with the above studies, we provide new evidence of strong
nonlinearity in U.S. fiscal policy. We contribute to the existing
literature by extending the analysis of U.S. fiscal adjustment from a
single-equation setting to a multivariate one using a nonlinear vector
error correction model (VECM). This extension adds value both in terms
of our economic understanding of the fiscal adjustment process in the
United States and in assessing the forecasting power of the model. First,
using a multivariate threshold cointegration model, we are able to
identify whether the government's solvency constraint in the United
States is achieved through tax increases, spending cuts, or a
combination of both. The issue of which specific item of the budget
ensures fiscal readjustments has received considerable attention among
U.S. policymakers and has recently been the focus of much heated debate.
For instance, Rubin, Orszag, and Sinai (2004) argue that "balancing
the budget for the longer term will require a combination of expenditure
restraint and revenue increases." The authors believe that
"the single most important act Congress and the Administration
could take at this point to rein in the budget over the next decade would
be to re-establish the budget rules that existed in the 1990s. These put
caps on discretionary spending and required that reductions in taxes or
increases in mandatory spending be paid for with other tax increases or
spending cuts." A study by the Congressional Budget Office (2003)
also cautioned that "economic growth alone is unlikely to bring the
nation's long term fiscal position into balance."
The contribution of the academic literature to this debate has been
very limited. Alesina and Perotti (1995) find evidence that for fiscal
adjustment to be permanent and effective, the focus must be on the level
of expenditure rather than on taxation. (1) They argue that tax
increases ease fiscal problems only temporarily. Temporary tax increases
may also be very difficult to reverse, and as such, tax-driven deficit
cuts may induce high tax ratios. Furthermore, raising taxes is
unpopular, and there are doubts whether such a strategy can in fact
increase government revenues. Bohn (1991) and Crowder (1997) rely on the
government's intertemporal solvency condition to analyze the
performance of fiscal stabilization plans over a long-term data span.
Specifically, the budget item series showing most of the
error-correcting dynamics is the one bearing most of the fiscal
readjustment burden. Crowder (1997) shows that the large U.S. deficits
in the 1980s and early 1990s have been primarily caused by increases in
government spending rather than falls in tax revenues. Thus, in order to
restore the intertemporal budget constraint, the bulk of fiscal
readjustment should occur through government spending cuts rather than
increases in tax revenues. Bohn (1991) shows that regardless of the
shock that caused the high budget deficit, historically these deficits
have been corrected by a combination of both spending cuts and tax
increases. Auerbach (2000) finds that both components of U.S. fiscal
policy have been responsive to fluctuations in the deficit, although the
response from government spending has been much more important.
Our results reveal several important findings. First, they provide
support for the existence of trigger points in U.S. fiscal policy.
Specifically, we find strong evidence of nonlinearity in the fiscal
process where adjustment occurs only when the real deficit per capita
reaches a certain threshold. Below this threshold, there seem to be no
significant error correction effects, which may suggest that
policymakers become sensitive to large deficits only when the deficit
reaches a very "high" level deemed to be unacceptable or
unsustainable. More importantly, we find that government expenditure
shows the strongest error-correcting dynamics, and hence, the bulk of
fiscal adjustment seems to occur through spending cuts rather than
increases in tax revenue.
In addition to gaining better understanding of the U.S. fiscal
adjustment process, we evaluate the out-of-sample density forecast and
probability forecast performance of the estimated model. Our results
highlight an additional advantage from generalizing the model from a
single-equation to a multivariate setting. Specifically, the results of
out-of-sample density forecast and probability forecasts suggest that
there is an improvement in forecast performance when we move from a
univariate autoregressive (AR) model to a multivariate model. We also
compare the out-of-sample forecast performance of the linear and
threshold models. In a recent survey, Granger (2001) concludes that a
major weakness of the literature on nonlinear models is that little is
known about the out-of-sample forecasting properties of different
nonlinear models or about how their out-of-sample forecast performance
compares with that of linear models. The empirical findings suggest that,
although the threshold VECM has a slightly better probability forecast
performance than the linear VECM, the density forecast performance of
both the linear and the nonlinear VECMs is similar for the long horizon
(e.g., 2 yr ahead), and thus, we cannot recommend the use of the
threshold VECM over simple linear models for forecasting purposes.
Similar results have been found recently in the context of exchange
rate forecasting (see, for instance, Rapach and Wohar 2006). This suggests that
although nonlinear models are useful to gain a better understanding of
the U.S. fiscal policy, they do not necessarily provide more reliable
forecasts.
This paper is organized as follows. Section II describes the
empirical methodology, while Section III presents the empirical results.
Section IV summarizes and concludes.
II. EMPIRICAL METHOD
A. Threshold Cointegration
A VECM fitted to both G, the real government expenditure per
capita, and R, the real government revenue per capita, is used to test
whether there is any evidence of public finance sustainability and to
test which of the two fiscal series carries the burden of fiscal
readjustment (if any). Many empirical studies have concentrated on
estimating the following linear VECM (where, for simplicity, we fix the
VECM lag order to 1):
(1) Δx_t = μ + α w_{t-1} + Γ Δx_{t-1} + u_t,

where x_t = (G_t, R_t)′, μ is a two-dimensional vector of intercepts,
w_{t-1} = G_{t-1} - β R_{t-1} is the error correction term, α is a
two-dimensional vector of speed of adjustment coefficients, Γ collects
the short-run dynamics coefficients, and u_t is the error term vector.
According to Quintos (1995), the deficit is "strongly" sustainable if
the I(1) processes R_t and G_t are cointegrated and β = 1, while it is
"weakly" sustainable if they are cointegrated and 0 < β < 1.
Weak sustainability implies that the government constraint holds, but
the undiscounted debt process is exploding at a rate that is less than
the growth rate of the economy. Although this case is consistent with
sustainability, it is inconsistent with the ability of the government to
market its debt in the long run. Thus, in this paper, we only test
for the "strong" sustainability condition and set β = 1. (2) By setting
β = 1, the error correction term becomes the real deficit per capita.
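To fix ideas, with β set to 1, Equation (1) can be estimated equation
by equation with OLS. The following is a minimal Python sketch under
the notation above; the function name linear_vecm and the numpy-based
layout are our own choices, not the authors' GAUSS implementation.

import numpy as np

def linear_vecm(g, r):
    """OLS estimation of the two-equation linear VECM (1) with beta fixed at 1.

    g, r : 1-D arrays of real per capita expenditure and revenue (levels).
    Returns a (2, 4) coefficient matrix: rows are the dG and dR equations,
    columns are [intercept, w_{t-1}, dG_{t-1}, dR_{t-1}].
    """
    dg, dr = np.diff(g), np.diff(r)
    w = g - r                                   # error correction term: deficit per capita
    # Regressors dated t-1; dependent variables dated t.
    X = np.column_stack([np.ones(len(dg) - 1), w[1:-1], dg[:-1], dr[:-1]])
    Y = np.column_stack([dg[1:], dr[1:]])
    coef, *_ = np.linalg.lstsq(X, Y, rcond=None)
    return coef.T                               # one row per equation

The row of the returned coefficient matrix attached to w_{t-1} is the
estimate of the speed of adjustment vector α in Equation (1).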
As argued above, Equation (1) may not be the most appropriate means
to characterize the fiscal adjustment process, for there may exist
trigger points in the process of fiscal adjustment. Hence, in this
study, we focus on the following threshold VECM:
(2) Δx_t = μ_1 + α_1 w_{t-1} + Γ_1 Δx_{t-1} + u_t, if w_{t-1} ≤ γ,
    Δx_t = μ_2 + α_2 w_{t-1} + Γ_2 Δx_{t-1} + u_t, if w_{t-1} > γ.
The model given by Equation (2) allows us to test whether there are
significant asymmetries in the adjustment of per capita government
revenues and per capita government expenditure to their long-run
equilibrium level, depending on the level of the deficit per capita,
w_{t-1} = G_{t-1} - R_{t-1}. In particular, if the real deficit per
capita exceeds the trigger point γ, then there is a switch in the speed
of adjustment coefficients from α_1 to α_2, as well as in the other
short-run dynamics parameters.
Hansen and Seo (2002) suggest estimating the model given by Equation (2)
through maximum likelihood under the assumption that the errors
u_t are iid Gaussian. The Gaussian likelihood is

(3) L_n(γ) = -(n/2) log|Σ| - (1/2) ∑_{t=1}^{n} u_t′ Σ^{-1} u_t,

where

u_t = Δx_t - (μ_1 + α_1 w_{t-1} + Γ_1 Δx_{t-1}) d_{1t}(γ)
          - (μ_2 + α_2 w_{t-1} + Γ_2 Δx_{t-1}) d_{2t}(γ),

with the indicator function d_{1t}(γ) taking the value 1 if the deficit
is below the trigger point γ and 0 otherwise, and d_{2t}(γ) equal to
1 - d_{1t}(γ). In order to detect nonlinearity, Hansen and Seo (2002)
use a Lagrange multiplier (LM) statistic to test H_0 (linear
cointegration) against H_1 (threshold cointegration). If the
cointegrating vector is known and equal to β_0 (in our study, it is
fixed at unity), then the LM test is given by

(4) SupLM^0 = sup_{γ_L ≤ γ ≤ γ_U} LM(β_0, γ),

where [γ_L, γ_U] is the search region for the threshold.
Given that the asymptotic distribution of
the test statistic cannot in general be tabulated, bootstrapped p
values are computed using both a fixed regressor and a parametric
bootstrap method, as described by Hansen and Seo (2002).
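To illustrate the estimation step, the sketch below performs the
concentrated-likelihood grid search over the threshold γ: each candidate
γ splits the sample into two regimes, each regime is estimated by OLS
(equivalent to Gaussian MLE), and γ is chosen to minimize log|Σ̂(γ)|.
This is a stripped-down reading of Hansen and Seo (2002); their
heteroskedasticity-robust SupLM statistic and bootstrap p values are not
reproduced here, and the 5% trimming share is our assumption about a
typical choice.

import numpy as np

def threshold_grid_search(g, r, trim=0.05, n_grid=300):
    """Grid search for the threshold gamma in the two-regime VECM (2).

    For each candidate gamma, the model is estimated by OLS regime by
    regime and gamma is chosen to minimise log|Sigma_hat(gamma)|.
    """
    dg, dr = np.diff(g), np.diff(r)
    w = g - r
    X = np.column_stack([np.ones(len(dg) - 1), w[1:-1], dg[:-1], dr[:-1]])
    Y = np.column_stack([dg[1:], dr[1:]])
    wlag = w[1:-1]                                 # threshold variable w_{t-1}
    lo, hi = np.quantile(wlag, [trim, 1 - trim])   # keep >= trim share per regime
    best = (np.inf, None)
    for gamma in np.linspace(lo, hi, n_grid):
        d1 = wlag <= gamma
        if d1.sum() < X.shape[1] + 1 or (~d1).sum() < X.shape[1] + 1:
            continue                               # not enough points in a regime
        U = np.empty_like(Y)
        for d in (d1, ~d1):                        # regime-by-regime OLS
            b, *_ = np.linalg.lstsq(X[d], Y[d], rcond=None)
            U[d] = Y[d] - X[d] @ b
        sigma = U.T @ U / len(U)
        logdet = np.linalg.slogdet(sigma)[1]
        if logdet < best[0]:
            best = (logdet, gamma)
    return best[1]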
B. Out-of-Sample Density Forecasts
To further motivate the use of threshold VECM, we explore whether
our proposed model is superior to both the univariate model and the
linear model in terms of its out-of-sample forecast performance.
Traditionally, evaluating the forecast accuracy of models has been based
on point forecasts, often using the root mean square error (RMSE). The
empirical evidence suggests that linear models frequently outperform
nonlinear models on the basis of the
RMSE criterion alone. (3) However, several studies have recently
emphasized the importance of evaluating forecast performance on the
basis of an estimate of the complete probability distribution of the
possible future outcomes of the series (i.e., a density forecast) as
opposed to point forecasting.
More specifically, only under certainty equivalence (e.g.,
policymakers with a quadratic loss function and linear dynamics for the
predicted variable) can the RMSE be used as a criterion for choosing an
optimal forecast. (4)
If certainty equivalence does not hold, then it is important to
focus not only on the first moments but also on the overall density of
forecasts. The density forecasts are generated through stochastic
simulation, and we give in the Appendix a detailed description of this
method.
First, we produce the density forecasts for changes in both
government spending and tax revenues using a univariate AR model. Then,
we produce the marginal density forecasts of the two series, ΔG and ΔR.
We also produce the conditional density forecasts of government spending
changes and tax revenue changes, ΔG|ΔR and ΔR|ΔG, respectively.
Finally, we produce the joint density forecasts of government spending
changes and tax revenue changes, (ΔG|ΔR) × ΔR and (ΔR|ΔG) × ΔG,
respectively. We consider three different forecast horizons,
h, equal to one, four, and eight quarters ahead. For the purpose of
density forecast evaluation, in line with Clements and Smith (2000), for
a given forecast horizon h, we calculate the probability integral
transforms (PITs) of the actual realizations, [y.sub.t], of each fiscal
series over the forecast evaluation period with respect to the
model's forecast densities, given by
{p_t(y_t)}_{t=1}^{n}. Therefore, we evaluate the PIT:

(5) z_t = ∫_{-∞}^{y_t} p_t(u) du,

for t = 1, ..., n. When the model's forecast density
corresponds to the true predictive density, the sequence z_t is
iid U(0, 1). In line with Diebold, Gunther, and Tay (1998) and Clements
and Smith (2000), we use informal data analysis to check whether the PIT
sequence is iid U(0, 1). Evaluating the accuracy of density predictions
therefore consists of assessing uniformity using Probability-Probability (PP)
plots. (5) Specifically, we plot the empirical distribution function of
the PIT against the 45° line, with critical values defining the
confidence intervals obtained from Miller (1956). Then, in order to
assess whether the PIT series is iid, we use the Lagrange multiplier
test for the null of serial independence of (PIT_t - m)^j
for integer j up to order 3, where m is the mean of the PIT series. (6)
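As a sketch of this evaluation step (the helper names are ours, and the
forecast densities are assumed to be represented by simulated draws, as
in the Appendix), the PIT can be computed as the fraction of simulated
forecasts falling below each realization, and the independence check as
an LM (T·R²) test from a regression of the centered PIT moment on its
own lags:

import numpy as np
from scipy import stats

def pit_from_simulations(sims, actuals):
    """PIT of each realization w.r.t. the simulated forecast density.

    sims    : (n, m) array, m simulated h-step forecasts per evaluation date.
    actuals : (n,) realized values.
    """
    return (sims <= actuals[:, None]).mean(axis=1)

def lm_iid_test(z, moment=1, lags=4):
    """LM test of no serial correlation in (z - zbar)^moment, used to
    check the iid U(0,1) property of the PIT sequence."""
    x = (z - z.mean()) ** moment
    Y = x[lags:]
    X = np.column_stack([np.ones(len(Y))] +
                        [x[lags - k:-k] for k in range(1, lags + 1)])
    beta, *_ = np.linalg.lstsq(X, Y, rcond=None)
    resid = Y - X @ beta
    r2 = 1 - resid.var() / Y.var()
    lm = len(Y) * r2
    return lm, stats.chi2.sf(lm, lags)             # statistic, p value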
Furthermore, we consider the Berkowitz (2001) approach to evaluate
the accuracy of density forecasts. (7) Specifically, we take the inverse
of the Gaussian cumulative distribution function of each component of
the sequence PIT, which gives PIT*. Under the null of iid U(0, 1) for
the sequence PIT, the series PIT* becomes a standard Gaussian random
variable. In order to test for normality of PIT*, Berkowitz (2001)
suggested a likelihood ratio test for the joint null of normality and
iid in PIT*. The test statistic is

LR_B = -2[L(0, 1, 0) - L(μ̂, σ̂², ρ̂)],

where L(μ̂, σ̂², ρ̂) is the value of the maximum likelihood function of
an AR(1) model fitted to PIT*, with μ̂ and ρ̂ the estimated intercept
and autoregressive coefficient, respectively, and σ̂ the estimated
standard deviation of the AR(1) residuals. Under the null, LR_B has a
χ²(3) distribution.
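A minimal sketch of this test, using the conditional AR(1) likelihood as
an approximation to the exact one (berkowitz_lr and the clipping of the
PIT away from 0 and 1 are our own choices):

import numpy as np
from scipy import stats

def berkowitz_lr(z):
    """Berkowitz (2001) LR test: transform the PIT by the inverse normal
    CDF, fit an AR(1), and test mu = 0, rho = 0, sigma = 1 jointly
    (chi-squared with 3 degrees of freedom)."""
    zs = stats.norm.ppf(np.clip(z, 1e-6, 1 - 1e-6))   # PIT* series
    y, ylag = zs[1:], zs[:-1]
    X = np.column_stack([np.ones(len(y)), ylag])
    (mu, rho), *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - mu - rho * ylag
    sig2 = resid.var()
    # Conditional log likelihoods under the alternative and the null.
    ll_alt = stats.norm.logpdf(resid, scale=np.sqrt(sig2)).sum()
    ll_null = stats.norm.logpdf(y).sum()               # N(0,1), iid
    lr = -2.0 * (ll_null - ll_alt)
    return lr, stats.chi2.sf(lr, 3)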
Stochastic simulation is used both to produce forecasts under
any type of scenario (i.e., the density forecast) and to generate
forecasts for particular types of scenarios. Specifically, we are
interested in generating the probability forecasts for two types of
events (for probability forecast analysis, see Clements 2005; Galvao
2006). The first one is defined by negative changes in government
spending and the second one by positive changes in tax revenues. Using
the simulation method described in the Appendix, we produce 1,000
h-step-ahead forecasts for government spending changes (conditional on
the available information set), and we count how many of these forecasts
are negative. This number divided by 1,000 gives the probability
forecast for the government spending series. The same methodology is
applied to generate the probability forecast for the tax revenue series.
We repeat this exercise by increasing the overall sample by one
additional observation till we reach the end of the forecast evaluation
period. We use the following indicators of probability forecast accuracy
(Galvao 2006):
QPS = (1/T) ∑_{t=1}^{T} 2 (P_t - R_t)²,

LPS = -(1/T) ∑_{t=1}^{T} [(1 - R_t) ln(1 - P_t) + R_t ln(P_t)],

where P_t and R_t are the probability forecast and the
actual realization of the event one is interested in predicting.
The QPS score ranges from 0 to 2, with 0 indicating perfect
accuracy; the LPS score ranges from 0 to ∞. LPS and QPS imply
different loss functions, with large mistakes more heavily penalized under the LPS.
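For concreteness, both scores can be computed in a few lines (a small
sketch; clipping the probabilities away from 0 and 1 to keep the
logarithms finite is our own safeguard):

import numpy as np

def qps_lps(p, r, eps=1e-9):
    """Quadratic and logarithmic probability scores for event forecasts.

    p : (T,) probability forecasts; r : (T,) 0/1 realizations.
    QPS lies in [0, 2], LPS in [0, inf); 0 is perfect in both cases.
    """
    p = np.clip(p, eps, 1 - eps)                 # keep the logs finite
    qps = np.mean(2.0 * (p - r) ** 2)
    lps = -np.mean((1 - r) * np.log(1 - p) + r * np.log(p))
    return qps, lps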
III. EMPIRICAL ANALYSIS
A. Data and Data Sources
The data set used in this study comprises quarterly observations
from the first quarter of 1947 to the last quarter of 2004. We examine
the dynamics of real per capita expenditure and real per capita
revenues, and hence, we only focus on the strong sustainability
condition, as Quintos (1995) has described. We first collect data on the
nominal current federal expenditure (inclusive of interest payments) and
current federal revenues (seasonally adjusted). We deflate both series
by implicit GDP deflator to obtain real values. The series are then
deflated by population to obtain real per capita government expenditure
and real per capita government revenues. All the data have been obtained
from the Federal Reserve Economic Data database available from the
Federal Reserve Bank of St. Louis.
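The transformation from the nominal aggregates to the real per capita
series is mechanical; the following pandas sketch records it, assuming
the three inputs are Series on a common quarterly index and that the
deflator is a base-100 index (both assumptions are ours; no particular
database mnemonics are implied):

import pandas as pd

def real_per_capita(nominal, deflator, population):
    """Convert a nominal quarterly series into real per capita terms.

    nominal, deflator, population : pd.Series on the same quarterly index.
    The deflator is assumed to be an index with base period = 100.
    """
    real = nominal / (deflator / 100.0)          # deflate by the GDP deflator
    return real / population                     # then scale by population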
B. In-Sample Forecasting Analysis
The augmented Dickey-Fuller (ADF) and the Phillips-Perron tests for
the null of unit root (see the first two columns of Table 1) suggest
that we cannot reject the null hypothesis of nonstationarity in the
levels of real per capita government expenditure and real per capita
government revenue. These findings are also confirmed by the tests
developed by Ng and Perron (2001) under GLS detrending using the
modified AIC to select the optimal lag order. Specifically, for the
government expenditure series, the MZ_a^GLS and ADF^GLS tests
suggest that we cannot reject the null of a unit root at any significance
level. For the tax revenue series, according to the MZ_a^GLS test,
we cannot reject the null of a unit root, whereas
using the ADF^GLS, we cannot reject the null of a unit root at the 1%
significance level.
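As an illustration of the first leg of this testing sequence, the ADF
test can be run with statsmodels as below (the Ng-Perron GLS-detrended
tests are not sketched here, as they are not part of this routine):

from statsmodels.tsa.stattools import adfuller

def adf_report(series, name):
    """ADF test for a unit root, with the lag order chosen by AIC."""
    stat, pval, usedlag, nobs, crit, _ = adfuller(series, autolag="AIC")
    print(f"{name}: ADF = {stat:.3f}, p = {pval:.3f}, lags = {usedlag}, "
          f"5% critical value = {crit['5%']:.3f}")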
Before carrying on with cointegration analysis, we select the VECM
lag length. The results are reported in Table 2 Panel A for the linear
VECM and Table 2 Panel B for the threshold VECM. As can be seen from
this table, both the AIC and BIC statistics pick a lag of 1. This holds
for both the linear and the threshold VECMs. (8)
We next test for the existence of threshold effects in the VECM
using the SupLM statistic. As can be seen from Table 3, the
SupLM^0 statistic suggests the presence of threshold effects,
with the null hypothesis of no threshold rejected at the 10%
level. The Wald tests also point in the same direction. The null
hypothesis that the error correction coefficients and dynamic
coefficients are the same in both regimes can be rejected at 5% and 1%
levels, respectively.
The parameter estimates were calculated by minimization of
log|Σ̂(γ)| over 300 grid points for the threshold parameter
γ. The estimates are reported in Table 4. The estimated threshold
is $8.859 per capita, which implies that the first regime occurs when
the real deficit per capita is less than or equal to $8.859. This regime
contains 82% of the sample observations. The second regime occurs when
the real deficit per capita is above the threshold of $8.859. Following
Hansen and Seo (2002), we label the first regime as the
"typical" regime and the second regime as the
"unusual" regime. The results in Table 4 show that the typical
regime has no significant error correction effects, with the
coefficients on the lagged error correction terms in both the
ΔR_t and ΔG_t equations insignificant at the conventional
levels. In contrast, error correction effects occur only in the unusual
regime--that is, when the real deficit per capita has risen above the
estimated threshold. Interestingly, the results indicate that fiscal
readjustment occurs through spending cuts rather than through increases in
tax revenue: while the estimated coefficient on the error correction
term in the government expenditure equation is large and highly
significant, the estimated coefficient on the error correction term in
the revenue per capita equation is quite small and not significant at
the conventional levels.
In Figure 1, we plot the deviations of the real deficit per capita
from the estimated threshold point estimate over the sample period. Note
that in this figure, positive values identify the unusual regime,
whereas the negative values identify the typical regime. Figure 1
clearly shows that there have been four major shifts from the typical to
the unusual regime in the real deficit per capita dynamics. First, a
major shift occurred in the second quarter of 1975 during the peak of the 1973 oil
crisis, which plunged the U.S. economy into a deep recession. A second
major shift occurred in the third quarter of 1981. This shift, which occurred during
the Reagan presidency, corresponds to the effects of the legislation
passed by the Congress aimed at cutting personal income taxes over the
next 3 yr (the 1981 Economic Recovery Tax Act). Since the tax cuts were
not met by equal cuts in government spending, the federal budget went
into large deficit and remained so for a considerable period of time. It
is only in the third quarter of 1987 that we witness a regime shift back toward the
typical regime. This switch reflects in part the intensive political and
economic debate in the Congress and in the media and the efforts made by
fiscal authorities to reduce the large and growing budget deficit. These
efforts were manifested in the Tax Reform Act of 1986 and the Balanced
Budget and Emergency Deficit Control Act, which called for progressive
reduction in the deficit and the achievement of a balanced budget by the
early 1990s (Ippolito 1990).
[FIGURE 1 OMITTED]
Despite the efforts made to balance the budget, another major shift
(the third one) from the typical to the unusual regime occurred in
the second quarter of 1991. This switch occurred during the presidency
of George Bush senior and corresponds closely to the recession that
plagued the U.S. economy at the beginning of his term and later to the budgetary
requirements of the Gulf War. In the second quarter of 1994, there was a switch to
the typical regime, which lasted for the rest of the 1990s. This
coincided with President Clinton's move to the White House and the
importance he attached to balancing the budget in his economic policy.
Finally, in the fourth quarter of 2002, there was a switch from the typical to the
unusual regime. This switch corresponds to the current President
Bush's administration with its emphasis on cutting taxes and
boosting defense and security outlays, which has caused large budget
deficits.
C. Out-of-Sample Forecasting Analysis
We compare the out-of-sample forecast performance of the linear
model and the threshold cointegration model. We leave out the last 64
observations of the sample for density forecast evaluation. More
specifically, the forecast evaluation period for the one-quarter-ahead
predictions starts from the first quarter of 1989, which corresponds to
the beginning of the George Bush senior administration, and ends in the
last quarter of 2004.
In order to produce out-of-sample forecasts, we estimate
recursively three different model specifications (univariate AR, linear
VECM, and nonlinear VECM). We concentrate on one-quarter-, 1-yr-, and
2-yr-ahead predictions. As for the one-quarter-ahead projections, we
consider, initially, the sample that ends in the last quarter of 1988,
and then, we increase the sample by one observation each time period
till we reach a sample period that ends in the third quarter of 2004. In
order to produce four-quarters-ahead predictions, we consider,
initially, the sample that ends in the first quarter of 1988, and then,
we increase the sample by one observation each time period till we reach
a sample period that ends in the last quarter of 2003. Finally, to
produce eight-quarters-ahead predictions, we consider, initially, the
sample that ends in the first quarter of 1987, and then, we increase the
sample by one observation each time period till we reach a sample period
that ends in the last quarter of 2002.
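All three exercises share the same expanding-window logic, which can be
sketched generically as follows; fit and forecast stand for whichever of
the three model specifications is being evaluated, and the function
names are ours:

import numpy as np

def recursive_forecasts(data, first_end, h, fit, forecast):
    """Expanding-window ('recursive') h-step-ahead forecasts.

    data      : (T, k) array of observations.
    first_end : index of the last observation in the initial estimation sample.
    fit       : callable, sample -> fitted model.
    forecast  : callable, (model, sample, h) -> k-vector forecast.
    """
    preds, targets = [], []
    for end in range(first_end, len(data) - h):
        sample = data[: end + 1]                 # grow the sample one obs at a time
        model = fit(sample)
        preds.append(forecast(model, sample, h))
        targets.append(data[end + h])
    return np.asarray(preds), np.asarray(targets)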
The out-of-sample point forecast evaluation in Table 5 shows that
the evidence is inconclusive: the RMSE values corresponding to the point
forecast of the government spending series obtained from the different
models are close to each other at the different forecast horizons.
Moreover, although the nonlinear VECM is the worst in the
one-quarter-ahead point prediction of tax revenues, the different models
have a similar performance for the 1- and 2-yr forecast horizons.
The results from Table 6 suggest that none of the models proposed
is capable of providing a good density forecast for the tax revenue
series. Specifically, although the PP plots for the PIT sequence (see
the right-hand-side panel of Figures 2-8) show the 45° line
inside the confidence interval bands for all the model specifications
and for most of the forecast horizons (with the exception of the
1-yr-ahead density forecast from the univariate AR model--see Figure 2),
the Ljung-Box test suggests evidence of serial correlation in the first
and third moments of the PIT sequence. (9) As for government spending,
there is an improvement in density prediction performance when we move
from the univariate AR model to the multivariate model and as we
consider a forecast horizon longer than one quarter. In particular, even
though there is no evidence of serial correlation in the first, second,
and third moments of the PIT sequence corresponding to AR density
forecasts (Table 6), the corresponding PP plots show the 45°
line outside the confidence interval bands (see the left-hand-side panel
of Figure 2) for density prediction over 1 and 2 yr, respectively.
As for the multivariate models, the density forecast performance
for the linear and threshold VECM specifications is similar for the long
horizon (e.g., 2 yr ahead). The Ljung-Box test suggests absence of
serial correlation in the first, second, and third moments of the PIT
sequence for the marginal, conditional, and joint density forecasts of
government spending produced by both the linear and the nonlinear VECMs
and for any forecast horizon (Table 6). However, the PP plots for the
PIT associated with the threshold VECM marginal, conditional, and joint
density forecasts of government spending have the 45° line
inside the confidence interval bands only when we consider an
eight-step-ahead forecast horizon (Figures 6-8). From Figures 3-5, we
can observe that the PP plots for the PIT associated with the linear
VECM marginal, conditional, and joint density forecasts of government
spending have the 45° line inside the confidence interval bands
for any forecast horizon.
Using the Berkowitz (2001) test, from Table 7, we can observe that
the strongest rejection of the null hypothesis of normality and iid for
the inverse of the cumulative (standardized) Gaussian distribution with
respect to the PIT sequence applies to the AR and linear VECM. When we
use a threshold VECM, the null hypothesis is only marginally not
rejected for the conditional and joint density
(one-step-ahead) forecasts of government spending changes and for the
marginal and joint density (eight-step-ahead) forecasts of government
spending changes.
[FIGURE 2 OMITTED]
[FIGURE 3 OMITTED]
[FIGURE 4 OMITTED]
[FIGURE 5 OMITTED]
[FIGURE 6 OMITTED]
[FIGURE 7 OMITTED]
[FIGURE 8 OMITTED]
Finally, the probability forecast exercise confirms the results
obtained from the density forecast evaluation. As mentioned in the
Section II above, we are interested in evaluating the model forecast
performance regarding events that can be associated with fiscal
readjustments, and these are either positive changes in tax revenues or
negative changes in government spending. Therefore, as also noted above,
we need to compute probability forecasts and evaluate them in terms of
QPS and LPS scores. As for government spending (Table 8 Panel A), the
best performer for any type of prediction horizon is the nonlinear VECM
where QPS and LPS scores are considerably lower than the corresponding
ones for the AR model. As for the tax revenues (Table 8 Panel B), the
worst probability forecast performance is the one associated with the
nonlinear VECM for the one-quarter-ahead probability forecast. There are
gains from moving from a univariate AR modeling framework to a multivariate
model if the prediction horizon is either 1 or 2 yr ahead, and the
nonlinear VECM is the best performer (in terms of QPS and LPS scores) if
the forecast horizon is 2 yr ahead.
IV. CONCLUSIONS
In this paper, we investigate empirically the U.S.
government's intertemporal solvency condition and assess whether
the government's solvency constraint has been achieved mainly
through tax increases, spending cuts, or a combination of both. Using a
threshold vector error correction estimation procedure, we find evidence
that government authorities would intervene only when the deficit per
capita had reached a certain threshold. Our results show that the bulk
of fiscal adjustment occurs through spending cuts rather than increases
in tax revenue.
In terms of forecasting, the picture is mixed. By evaluating the
out-of-sample density forecast performance of the estimated model, we
show that there is an improvement in forecast performance when we move
from a univariate AR model specification to a multivariate model.
However, we find that the forecasting performance of both linear and
nonlinear VECMs is similar at the long horizon (e.g., 2 yr ahead), and
thus, we cannot recommend the use of the threshold VECM over simple
linear models for forecasting purposes.
This suggests that our proposed model could be improved upon and
should be evaluated in comparison with not only alternative multivariate
nonlinear models but also multivariate linear models with structural
breaks. One might also consider a time trend or an indicator of the U.S.
business cycle as an additional threshold variable (beyond the
government deficit) in the nonlinear multivariate model. Recently,
Galvao (2006) has found that the U.S. term spread performs well in
predicting U.S. industrial production using a threshold VAR with both a
time trend and the term spread as threshold variables.
These extensions can prove very fruitful avenues for future
research.
ABBREVIATIONS
ADF: Augmented Dickey-Fuller
AIC: Akaike Information Criterion
AR: Autoregressive
BIC: Bayesian Information Criterion
GDP: Gross Domestic Product
GLS: Generalized Least Squares
iid: Independent and Identically Distributed
LM: Lagrange Multiplier
LPS: Logarithmic Probability Score
PIT: Probability Integral Transform
PP: Probability-Probability
QPS: Quadratic Probability Score
RMSE: Root Mean Square Error
VECM: Vector Error Correction Model
APPENDIX A1. GENERATION OF JOINT DENSITY FORECAST OF LINEAR AND
NONLINEAR VECMS USING STOCHASTIC SIMULATION
The stochastic simulation method explained in Galvao (2006) is used
to produce the joint density forecasts. Define x_t as the vector
of endogenous variables (ΔG, ΔR)′ and X^t =
{x_{t-1}, x_{t-2}, ..., x_1} as the history at time t.
Given an estimate Â of the parameters of the linear VECM x_t = f(X^{t-1};
A) + u_t and the sample covariance matrix of the residuals, Σ̂, a
trial sequence of forecasts x_{t+1}, x_{t+2}, x_{t+3}, ...,
x_{t+h} is built as follows. A random vector u_{t+1} is drawn
from the distribution u ~ N(0, Σ̂), and it is used to calculate
x̂_{t+1}, given X^t and Â. Then, x̂_{t+1} is added
to the "history" to form X̂^{t+1}. This procedure is
continued until the sequence of forecasts {x_{t+1},
x_{t+2}, x_{t+3}, ..., x_{t+h}} is complete. This sequence
can be called S_1, and the same trial is repeated to obtain a set
of 1,000 forecast sequences. In the case of threshold models, the
forecasting model can also be written as x_t =
f^j(X^{t-1}; β^j) + ε^j_t, where j =
1, 2 indexes the two regimes. Therefore, given Σ̂^1 and
Σ̂^2, which are the estimated residual covariance matrices for the two
regimes, the forecast sequence is obtained as follows. Given
the one-step-ahead point forecast, either the vector u^1_{t+h}
is drawn from u^1 ~ N(0, Σ̂^1) or the vector
u^2_{t+h} is drawn from u^2 ~ N(0, Σ̂^2), depending
on whether the deficit is below or above the estimated threshold. The
realizations of this vector of innovations are then used to calculate
x̂_{t+1}, given X^t and the estimated parameters. Then, x̂_{t+1} is added
to the history to form X̂^{t+1}. This procedure is continued until the
sequence of forecasts {x_{t+1}, x_{t+2}, x_{t+3},
..., x_{t+h}} is complete. Each such sequence can be called S_m, and
the same trial is repeated to obtain a set of 1,000 forecast sequences.
For each sequence of forecasts S_m (with m describing the mth
scenario), we pick the last vector of observations, x_{t+h}.
The first component of this vector describes the joint model prediction
for the (change in the) government spending series associated with
scenario m, and the second component of this vector describes the joint
model prediction for the (change in the) tax revenue series associated
with scenario m.
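A compact sketch of this simulation for the linear case, reusing the
coefficient layout of the linear_vecm sketch from Section II (the
threshold case would, at each step, switch the residual covariance
matrix by regime, as described above):

import numpy as np

def simulate_joint_density(g, r, coef, sigma, h, n_rep=1000, seed=0):
    """Stochastic simulation of the h-step joint density from the linear VECM.

    coef  : (2, 4) coefficients from the linear_vecm sketch
            (columns: intercept, w_{t-1}, dG_{t-1}, dR_{t-1}).
    sigma : (2, 2) residual covariance matrix.
    Returns an (n_rep, 2) array of simulated (dG_{t+h}, dR_{t+h}).
    """
    rng = np.random.default_rng(seed)
    out = np.empty((n_rep, 2))
    for m in range(n_rep):
        gg, rr = g[-2:].tolist(), r[-2:].tolist()   # enough history for one lag
        for _ in range(h):
            x = np.array([1.0, gg[-1] - rr[-1], gg[-1] - gg[-2], rr[-1] - rr[-2]])
            u = rng.multivariate_normal(np.zeros(2), sigma)
            dg, dr = coef @ x + u                   # one simulated step
            gg.append(gg[-1] + dg)
            rr.append(rr[-1] + dr)
        out[m] = (gg[-1] - gg[-2], rr[-1] - rr[-2])
    return out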
APPENDIX A2. GENERATION OF CONDITIONAL DENSITY FORECAST OF LINEAR AND
NONLINEAR VECMS USING STOCHASTIC SIMULATION
The methodology used to generate each forecast sequence S (from which
we pick the last observation) is similar to the method
described in Appendix A1. The only difference is that one of the two
innovations is fixed at a specific value, which yields the conditional
density forecast. In particular, if we fix the innovation to tax
revenues at the sample mean of that series, and if we let the other
shock (i.e., the one affecting government spending) take 1,000
realizations from Gaussian random draws, then we are able to generate
the density forecast of government spending conditional on the sample
mean value of tax revenues. Conversely, if we fix the innovation to
government spending at the sample mean of that series, and if we let
the shock affecting tax revenues take 1,000 realizations from Gaussian
random draws, then we obtain the density forecast of tax revenues
conditional on the sample mean value of government spending.
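On one reading of this appendix, the innovation pairs fed into the
recursion could be generated as below: the tax revenue shock is pinned
at its sample mean (approximately zero for least squares residuals),
while the spending shock is drawn from its own Gaussian marginal. This
is our interpretation rather than the authors' exact scheme.

import numpy as np

def conditional_innovations(sigma, n_rep=1000, seed=0):
    """Innovation draws for the conditional density of dG given dR:
    the dR shock is fixed at its (zero) sample mean and the dG shock
    is drawn from N(0, sigma_GG)."""
    rng = np.random.default_rng(seed)
    u_g = rng.normal(0.0, np.sqrt(sigma[0, 0]), size=n_rep)
    return np.column_stack([u_g, np.zeros(n_rep)])   # (u_G, u_R) pairs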
APPENDIX A3. GENERATION OF MARGINAL DENSITY FORECAST OF LINEAR AND
NONLINEAR VECMS USING STOCHASTIC SIMULATION
The methodology used to generate each forecast sequence S (from which
we pick the last observation) is similar to the method
described in Appendix A1. However, the simulation is calibrated to the
sample standard deviation of each series rather than to
the overall sample covariance matrix. Specifically, the only difference
from the method described in Appendix A1 is that the realizations of an
iid shock (standardized Gaussian random draws) are multiplied
by the sample standard deviation of government spending, thereby
obtaining the marginal density forecast of government spending. Similarly,
multiplying the realizations of an iid shock (standardized
Gaussian random draws) by the sample standard deviation of tax revenues
yields the marginal density forecast of tax revenues.
APPENDIX A4. GENERATION OF DENSITY FORECAST OF A UNIVARIATE AR
MODEL USING STOCHASTIC SIMULATION
Given the estimation of an AR(1) model for each of the two series, the
density forecast at horizon h for a series is given by

x_{t+h} = â_0 (1 + â_1 + ... + â_1^{h-1}) + â_1^h x_t
          + (â_1^{h-1} u_{t+1} + ... + u_{t+h}),

where â_0 and â_1 are the estimated intercept and
autoregressive coefficient of each series, respectively.
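Equivalently, the AR(1) density forecast can be built by direct
simulation, iterating the fitted recursion h times over a vector of
trajectories (a small sketch with our own naming; iterating the
recursion reproduces the closed-form expression above):

import numpy as np

def ar1_density_forecast(x, a0, a1, sig, h, n_rep=1000, seed=0):
    """h-step density forecast from a fitted AR(1), by simulating
    n_rep trajectories forward from the last observation x."""
    rng = np.random.default_rng(seed)
    draws = np.full(n_rep, float(x))
    for _ in range(h):
        draws = a0 + a1 * draws + rng.normal(0.0, sig, size=n_rep)
    return draws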
REFERENCES
Alesina, A., and R. Drazen. "Why Are Stabilizations
Delayed?" American Economic Review, 81, 1991, 1170-88.
Alesina, A., and R. Perotti. "Fiscal Expansion and Adjustment
in OECD Countries." Economic Policy, 21, 1995, 206-47.
Arestis, P., A. Cipollini, and B. Fattouh. "Threshold Effects
in the U.S. Budget Deficit." Economic Inquiry, 42, 2004, 214-22.
Auerbach, A. J. "Formation of Fiscal Policy: The Experience of
the Past Twenty-Five Years." FRBNY Economic Policy Review, 6, 2000,
9-23.
Berkowitz, J. "Testing Density Forecasts, with Applications to
Risk Management." Journal of Business and Economic Statistics, 19,
2001, 465-74.
Bertola, G., and A. Drazen. "Trigger Points and Budget Cuts:
Explaining the Effects of Fiscal Austerity." American Economic
Review, 83, 1993, 11-26.
Bohn, H. "Budget Balance through Revenue or Spending
Adjustments?" Journal of Monetary Economics, 27, 1991, 333-59.
Christoffersen, P. F., and F. Diebold. "Optimal Prediction
under Asymmetric Loss." Econometric Theory, 13, 1997, 808-17.
Clements, M. P. Evaluating Econometric Forecasts of Economic and
Financial Variables. Basingstoke, United Kingdom: Palgrave Macmillan,
2005.
Clements, M. P., and J. Smith. "Evaluating the Forecast
Densities of Linear and Nonlinear Models Applications to Output Growth
and Unemployment." Journal of Forecasting, 19, 2000, 255-76.
Congressional Budget Office. "The Long-Term Budget
Outlook." Washington, DC: CBO Publications Office, 2003.
Crowder, W. "The U.S. Federal Intertemporal Budget Constraint:
Restoring Equilibrium through Increased Revenues or Decreased
Spending." Manuscript, 1997.
Cunado, J., L. A. Gil-Alana, and F. Perez de Gracia. "Is the
US Fiscal Deficit Sustainable? A Fractionally Integrated Approach."
Journal of Economics and Business, 56, 2004, 501-26.
Diebold, F. X., T. A. Gunther, and A. S. Tay. "Evaluating
Density Forecast." International Economic Review, 39, 1998, 863-83.
Diebold, F. X., and J. A. Nason. "Non-Parametric Exchange Rate
Prediction." Journal of International Economics, 28, 1990, 315-32.
Galvao, A. B. C. "Structural Break Threshold VAR for
Predicting US Recessions Using the Spread." Journal of Applied
Econometrics, 21, 2006, 463-87.
Granger, C. W. J. "An Overview of Nonlinear Macroeconometric
Empirical Models." Macroeconomic Dynamics, 5, 2001, 466-81.
Hakkio, C. S., and M. Rush. "Is the Budget Deficit 'Too
Large'?" Economic Inquiry, 29, 1991, 429-45.
Hamilton, J. D., and M. A. Flavin. "On the Limitation of
Government Borrowing: A Framework for Empirical Testing." American
Economic Review, 76, 1986, 808-19.
Hansen, B., and B. Seo. "Testing for Two Regime Threshold
Cointegration in Vector Error Correction Models." Journal of
Econometrics, 110, 2002, 293-318.
Ippolito, D. S. Uncertain Legacies: Federal Budget Policy
from Roosevelt through Reagan. Charlottesville, VA: University Press of
Virginia, 1990.
Martin, G. M. "US Deficit Sustainability: A New Approach Based
on Multiple Endogenous Breaks." Journal of Applied Econometrics,
15, 2000, 83-105.
Miller, L. H. "Table of Percentage Points of Kolmogorov
Statistics." Journal of American Statistical Association, 51, 1956,
111-21.
Ng, S., and P. Perron. "Lag Length Selection and the
Construction of Unit Root Tests with Good Size and Power."
Econometrica, 69, 2001, 1519-54.
Quintos, C. E. "Sustainability of the Deficit Process with
Structural Shifts." Journal of Business and Economic Statistics,
13, 1995, 409-17.
Rapach, D., and M. Wohar. "The Out-of-Sample Forecasting
Performance of Nonlinear Models of Real Exchange Rate Behavior."
International Journal of Forecasting, 22, 2006, 341-61.
Rubin, R. E., P. R. Orszag, and A. Sinai. "Sustained Budget
Deficits: Longer-Run U.S. Economic Performance and the Risk of Financial
and Fiscal Disarray." Paper presented at the AEA-NAEFA Joint
Session, Allied Social Science Associations Annual Meeting, 2004.
Sarno, L. "The Behavior of US Public Debt: A Non-Linear
Perspective." Economics Letters, 74, 2001, 119-25.
Sarno, L., and G. Valente. "Comparing the Accuracy of Density
Forecasts from Competing Models." Journal of Forecasting, 23, 2004,
541-57.
Spanos, A. Probability Theory and Statistical Inference:
Econometric Modeling with Observational Data. Cambridge: Cambridge
University Press, 1999.
Trehan, B., and C. E. Walsh. "Common Trends, the Government
Budget Constraint and Revenue Smoothing." Journal of Economic
Dynamics and Control, 12, 1988, 425-44.
Wilcox, D. W. "The Sustainability of Government Deficits:
Implications of the Present Value Borrowing Constraint." Journal of
Money, Credit and Banking, 21, 1989, 291-306.
(1.) Alesina and Perotti (1995) use the long-run, cyclically
adjusted primary deficit to identify periods of fiscal readjustment.
Specifically, a very tight fiscal policy in year t occurs when the
cyclically adjusted deficit decreases by more than 1.5% of GDP. A
successful fiscal adjustment in year t occurs when a tight fiscal policy
implemented in year t is such that the gross debt-to-GDP ratio in year t
+ 3 is at least 5 percentage points lower than that in year t.
(2.) Most recent empirical studies also suggest evidence of strong
sustainability, either without regime shifts, as shown by Cunado,
Gil-Alana, and Perez de Gracia (2004), or with regime shifts (Arestis,
Cipollini, and Fattouh 2004; Martin 2000).
(3.) Diebold and Nason (1990) give four reasons why nonlinear
models, although they have better in-sample fit than linear models, may
fail to dominate in terms of out-of-sample forecast performance based on
the RMSE (see also Clements and Smith 2000).
(4.) Christoffersen and Diebold (1997) show that under asymmetric
loss, the optimal forecast is the conditional mean plus a bias term,
which depends on both the forecaster's loss function and the
conditional variance of the predicted variable.
(5.) PP plots provide a visual inspection of the discrepancy
between shapes created by the patterns of points on a plot and a
reference straight line.
(6.) A high order is chosen because, as noted by Diebold, Gunther,
and Tay (1998), dependence may be present in higher moments.
(7.) Recently, an alternative approach to evaluate the accuracy of
density forecast has been suggested by Sarno and Valente (2004).
(8.) For robustness, we also estimated the VECM with two lags. The
results are very similar to those obtained with one lag, and to save
space, we do not report them. The results are available from the authors
upon request.
(9.) It is worth noting that empirical distribution tests,
such as PP plots, are valid only under the assumption that the PIT
follows an iid process (Spanos 1999).
ANDREA CIPOLLINI, BASSAM FATTOUH and KOSTAS MOURATIDIS *
* The authors wish to thank three anonymous referees. All the
computations have been carried out using GAUSS. The authors also wish to
thank Serena Ng, Bruce Hansen, and Byeongseon Seo for making their
GAUSS routines available.
Cipollini: University of Essex, School of Accounting, Finance and
Management, Wivenhoe Park, CO4 3SQ Colchester, UK. Phone +44 1206872314,
E-mail acipol@essex.ac.uk
Fattouh: Department for Financial and Management Studies, CeFiMS,
SOAS, University of London, Thornhaugh Street, Russell Square, London
WC1H 0XG, United Kingdom. Tel 0044-(20)78984053, Fax 0044-(20) 78984089,
E-mail bf11@soas.ac.uk
Mouratidis: Swansea University, School of Business and Economics
Swansea, Singleton Park, Swansea, SA2 8PP, Wales UK. Phone +44 (0) 1792
295364, Fax +44 (0) 1792 295626, E-mail k.mouratidis@swan.ac.uk
TABLE 1
Unit Root Tests on the Level of the Series R and G

       ADF       PP      ADF^GLS   MZ_a^GLS
R    -0.339    0.534      0.960      1.051
G     0.498   -0.514      2.053      1.633
TABLE 2
Lag Order for Linear VECM and Threshold VECM

(A) Linear VECM
Lag Order      AIC       BIC
1            -9.781    -6.887
2            -9.474    -5.156
3            -6.78     -1.055
4            -0.409     6.711

(B) Threshold VECM
Lag Order      AIC       BIC
1           -12.87     -7.082
2           -11.24     -2.609
3            -6.782    -1.055
4             7.798    22.04
TABLE 3
Tests for Threshold Cointegration (β = 1)

Lagrange multiplier threshold test statistic    18.700
Fixed regressor asymptotic p value                .062
Bootstrap p value                                 .085

Wald Test for Equality of Coefficients across Regimes
Dynamic coefficients:   Wald test = 23.19, p value = .000
VECM coefficients:      Wald test = 6.12,  p value = .046

Notes: The p values for the LM threshold test were
obtained by 5,000 bootstrap replications.
TABLE 4
Estimates of the Threshold VECM (β = 1)

Threshold estimate = 8.859

Regime 1 (w_{t-1} ≤ 8.859; 82% of observations)
                   ΔG                ΔR
Intercept         0.312 (0.066)     0.068 (0.101)
w_{t-1}          -0.010 (0.014)     0.023 (0.022)
ΔG_{t-1}         -0.216 (0.135)    -0.040 (0.099)
ΔR_{t-1}         -0.094 (0.055)     0.101 (0.138)

Regime 2 (w_{t-1} > 8.859; 18% of observations)
                   ΔG                ΔR
Intercept         3.137 (1.160)     0.988 (1.336)
w_{t-1}          -0.242 (0.096)    -0.012 (0.113)
ΔG_{t-1}         -0.142 (0.125)    -0.514 (0.246)
ΔR_{t-1}         -0.088 (0.084)    -0.697 (0.153)

Notes: Standard errors in parentheses.
TABLE 5
RMSE for Point Forecasts

                           AR                Linear VECM          Threshold VECM
Forecast Horizon (h)    ΔG      ΔR          ΔG      ΔR           ΔG      ΔR
1                      0.823   1.507       0.824   1.517        0.821   1.781
4                      0.808   1.503       0.774   1.465        0.784   1.484
8                      0.795   1.509       0.807   1.462        0.827   1.514

Notes: The RMSE associated with the point forecasts has been obtained
by recursive estimation of the AR, linear VECM, and nonlinear VECM
models, using the sample running from the first quarter of 1989 to the
last quarter of 2004 as the forecast evaluation period.
TABLE 6
LM Test for iid of PIT

(A) One-Quarter-Ahead Forecasts

                        Moment 1   Moment 2   Moment 3
AR
  ΔG                      .143       .560       .228
  ΔR                      .002       .284       .051
Linear VECM
  ΔG                      .127       .510       .209
  ΔR                      .006       .104       .051
  ΔG|ΔR                   .126       .515       .211
  ΔR|ΔG                   .008       .120       .076
  (ΔG|ΔR) × ΔR            .122       .500       .211
  (ΔR|ΔG) × ΔG            .006       .112       .0753
Threshold VECM
  ΔG                      .130       .463       .189
  ΔR                      .014       .132       .077
  ΔG|ΔR                   .126       .469       .199
  ΔR|ΔG                   .015       .138       .0824
  (ΔG|ΔR) × ΔR            .126       .548       .185
  (ΔR|ΔG) × ΔG            .003       .228       .057

(B) Four-Quarters-Ahead Forecasts

                        Moment 1   Moment 2   Moment 3
AR
  ΔG                      .147       .465       .362
  ΔR                      .001       .326       .093
Linear VECM
  ΔG                      .174       .544       .264
  ΔR                      .004       .199       .050
  ΔG|ΔR                   .162       .624       .245
  ΔR|ΔG                   .0103      .161       .116
  (ΔG|ΔR) × ΔR            .154       .539       .249
  (ΔR|ΔG) × ΔG            .006       .108       .0593
Threshold VECM
  ΔG                      .097       .592       .148
  ΔR                      .012       .119       .075
  ΔG|ΔR                   .108       .657       .188
  ΔR|ΔG                   .007       .095       .0304
  (ΔG|ΔR) × ΔR            .095       .625       .147
  (ΔR|ΔG) × ΔG            .011       .105       .053

(C) Eight-Quarters-Ahead Forecasts

                        Moment 1   Moment 2   Moment 3
AR
  ΔG                      .134       .459       .319
  ΔR                      .002       .301       .099
Linear VECM
  ΔG                      .165       .568       .272
  ΔR                      .006       .152       .058
  ΔG|ΔR                   .172       .606       .265
  ΔR|ΔG                   .008       .175       .089
  (ΔG|ΔR) × ΔR            .168       .505       .251
  (ΔR|ΔG) × ΔG            .008       .147       .0656
Threshold VECM
  ΔG                      .121       .526       .181
  ΔR                      .008       .081       .057
  ΔG|ΔR                   .107       .592       .175
  ΔR|ΔG                   .009       .063       .028
  (ΔG|ΔR) × ΔR            .094       .512       .137
  (ΔR|ΔG) × ΔG            .007       .097       .041

Notes: The table records the p values for χ² LM tests of
serial correlation (up to fourth order) for the first, second, and
third moments of the PIT series. ΔG|ΔR and ΔR|ΔG denote conditional
density forecasts; (ΔG|ΔR) × ΔR and (ΔR|ΔG) × ΔG denote joint density
forecasts.
TABLE 7
Berkowitz Test

                    ΔG      ΔR     ΔG|ΔR   ΔR|ΔG   (ΔG|ΔR) × ΔR   (ΔR|ΔG) × ΔG

(A) One-Quarter-Ahead Forecasts
AR                 .000    .000      --      --         --             --
Linear VECM        .000    .000    .000    .000       .000           .000
Threshold VECM     .000    .000    .072    .000       .104           .000

(B) Four-Quarters-Ahead Forecasts
AR                 .013    .000      --      --         --             --
Linear VECM        .000    .000    .000    .000       .000           .000
Threshold VECM     .010    .000    .013    .000       .012           .000

(C) Eight-Quarters-Ahead Forecasts
AR                 .000    .000      --      --         --             --
Linear VECM        .000    .000    .000    .000       .000           .000
Threshold VECM     .074    .000    .031    .000       .048           .000

Notes: The entries are the p values of the Berkowitz (2001)
likelihood ratio test for the joint null of normality and iid in PIT*,
which is the inverse of the cumulative normal distribution applied to
the PIT. The univariate AR model produces marginal density forecasts
only.
TABLE 8
Probability Forecast Evaluation

(A) Government Spending
                          QPS     LPS
AR               h = 1   .467    .660
                 h = 4   .514    .707
                 h = 8   .497    .690
Linear VECM      h = 1   .473    .664
                 h = 4   .443    .635
                 h = 8   .441    .633
Nonlinear VECM   h = 1   .464    .652
                 h = 4   .434    .626
                 h = 8   .440    .632

(B) Tax Revenues
                          QPS     LPS
AR               h = 1   .478    .671
                 h = 4   .505    .698
                 h = 8   .499    .692
Linear VECM      h = 1   .496    .692
                 h = 4   .478    .671
                 h = 8   .479    .672
Nonlinear VECM   h = 1   .584    .873
                 h = 4   .481    .678
                 h = 8   .470    .662

Notes: The entries are the QPS and LPS scores for the one-, four-, and
eight-quarters-ahead probability forecasts.