Rising inequality: transitory or persistent? New evidence from a panel of U.S. tax returns.
DeBacker, Jason; Heim, Bradley; Panousi, Vasia; et al.
VII. The Role of the Federal Tax System
This section explores the role of the federal tax system in the
increase in income inequality over our sample period. In particular, we
examine whether the trend in inequality for after-tax household income
differs materially from that for pre-tax income. As discussed in section
II.B, our measure of after-tax household income reflects all federal
personal income taxes (obtained from Form 1040), including all
refundable tax credits such as the earned income tax credit and the
child tax credit, as well as payroll taxes (calculated using information
from W-2 forms).
The last two columns of table 4 present point estimates and
standard errors for our ECM estimated on after-tax household income
using both our male-headed households sample and our broader sample of
all households. Figure 9 plots the total, persistent, and transitory
variances of both pre-tax and after-tax household income for the sample
of all households. As the figure shows, the total variance of after-tax
income is on average 0.10 squared log point, or roughly 15 percent,
smaller than the variance of pre-tax income, reflecting the overall
progressivity of the federal tax system. The effect of the tax system in
reducing income inequality appears relatively stable over the sample
period, but for the period as a whole, pre-tax household income
inequality increased by more than after-tax income inequality (0.13
versus 0.08 squared log point). That is, the tax system appears to have
reduced the increase in household income inequality over the sample
period. Nonetheless, as was already seen in figure 1, this attenuating
effect was insufficient to alter the broad trend toward rising
inequality for after-tax household income.
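The magnitudes quoted in this paragraph can be combined in a short back-of-the-envelope calculation. The sketch below uses only the rounded figures reported above (0.13, 0.08, 0.10, and 15 percent); the derived quantities and the variable names are illustrative and inherit the rounding of the inputs.

```python
# Rounded figures quoted in the text (squared log points unless noted).
pretax_increase = 0.13     # rise in pre-tax variance over the sample period
aftertax_increase = 0.08   # rise in after-tax variance over the sample period
avg_level_gap = 0.10       # average pre-tax minus after-tax variance
avg_gap_share = 0.15       # the same gap as a share of pre-tax variance

# Part of the pre-tax increase that does not appear in after-tax income.
offset = pretax_increase - aftertax_increase
offset_share = offset / pretax_increase

# Average pre-tax variance implied by "0.10, or roughly 15 percent".
implied_pretax_level = avg_level_gap / avg_gap_share

print(f"increase offset by taxes: {offset:.2f} "
      f"({offset_share:.0%} of the pre-tax rise)")
print(f"implied average pre-tax variance: {implied_pretax_level:.2f}")
```

Under these rounded inputs the tax system offsets about 0.05 squared log point of the pre-tax increase, which is consistent with the statement that the attenuation was real but too small to reverse the upward trend.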
[FIGURE 9 OMITTED]
The relatively constant effect of the federal tax system on
reducing the level of inequality during our sample period might appear
surprising in light of the high-profile reductions in marginal tax
rates, especially at the high end of the income distribution, in 2001
and 2003. However, the changes in top marginal tax rates were
accompanied by (smaller) reductions in marginal tax rates for other
income groups as well as by significant expansions of the earned income
tax credit and the child tax credit. Our results suggest that the net
effect on after-tax income inequality of all these changes to the
federal tax system was relatively small. (46)
VIII. Conclusions
We have used a confidential panel of tax returns from the Internal
Revenue Service to analyze the role of persistent and transitory income
components in changes in inequality in male labor earnings and total
household income, both before and after taxes, in the United States over
the period 1987-2009. We first documented an increase in inequality in
male earnings and in pre-tax and after-tax household income in our data
during this period, consistent with what other studies have documented
using different data sets. We then examined the contributions of persistent
and transitory income components to this increase in inequality, as
measured by the cross-sectional variance of log income.
We have used two broad sets of methods in our analysis. First, we
employed a variety of simple nonparametric decomposition methods that
use a strict definition of transitory income, which is not allowed to be
serially correlated, and a broad definition of persistent income, which
captures income with varying degrees of persistence. Second, we employed
rich nonstationary error components models of income dynamics, which
fully specify the process that generates income over time, and
essentially decomposed income into a highly persistent piece and another
transitory piece that allows for some limited degree of serial
correlation. Our paper is the first to estimate rich nonstationary ECMs
of income on U.S. administrative data, and among the first to apply
nonstationary ECMs to household-level income. Here the quality and
significant size of our data set allow us to obtain very precise
estimates of our models.
Overall, our data yield very robust results for the trends in the
variance of persistent and transitory income components. For male labor
earnings, we find that the variance of the persistent component of
earnings increased over the sample period, but the variance of the
transitory component did not. Hence the increase in male earnings
inequality was driven entirely by the increase in the persistent
component, thus reflecting an increase in persistent inequality. For
household income, both before and after taxes, the increase in
inequality over this period derived mostly (although not entirely) from
the persistent component. The increase in the variance of the transitory
component of total household income reflects an increase in the
transitory variance of spousal labor earnings and investment income. We
also find evidence that the federal tax system helped reduce the
increase in household income inequality, but this attenuating effect was
insufficient to significantly alter the broad trend toward rising
inequality.
Our findings, along with economic theory, suggest that the increase
in income inequality observed in roughly the last two decades should
translate into increases in consumption inequality and is therefore
likely to be welfare-reducing, at least according to most social welfare
functions. Although measurement problems with household consumption data
in the United States have made it difficult to convincingly measure the
increase in consumption inequality, some recent studies that attempt to
control for these measurement issues, such as Aguiar and Bils (2012) and
Attanasio, Hurst, and Pistaferri (2012), suggest that it was indeed
substantial. This is consistent with our findings of a large role of the
persistent component of income in rising income inequality.
APPENDIX A
An Alternative Nonstationary ECM Specification
A few papers (for instance, Heathcote, Storesletten, and Violante
2010, Blundell, Pistaferri, and Preston 2008, and Heathcote, Perri, and
Violante 2010) have estimated versions of an alternative nonstationary
ECM specification, in which the variance of persistent shocks can change
over calendar time, but which are simpler along other dimensions of the
model. Here we present estimates for a version of this alternative
specification in order to check the robustness of our results. The
general model can be expressed as
(A.1) [[xi].sup.i.sub.a,t] = [[bar.[lambda]].sub.t][[alpha].sup.i] +
[p.sup.i.sub.a,t] + [[tau].sup.i.sub.a,t]
(A.2) [p.sup.i.sub.a,t] = [psi][p.sup.i.sub.a-1,t-1] +
[[phi].sub.t][[eta].sup.i.sub.a,t]
(A.3) [[tau].sup.i.sub.a,t] = [[pi].sub.t][[epsilon].sup.i.sub.a,t]
+ [[theta].sub.1][[pi].sub.t-1][[epsilon].sup.i.sub.a-1,t-1] +
[[theta].sub.2][[pi].sub.t-2][[epsilon].sup.i.sub.a-2,t-2]
(A.4) [[alpha].sup.i] ~ i.i.d.(0, [[sigma].sup.2.sub.[alpha]]),
[[eta].sup.i.sub.a,t] ~ i.i.d.(0, [[sigma].sup.2.sub.[eta]]),
[[epsilon].sup.i.sub.a,t] ~ i.i.d.(0, [[sigma].sup.2.sub.[epsilon]]).
In this specification the [[bar.[lambda]].sub.t] parameter multiplies
the [[alpha].sup.i] component only, and a new set of parameters
[[phi].sub.t] allows the variance of the persistent shocks
[[eta].sup.i.sub.a,t] to change over calendar time. (Note that the
parameters [[bar.[lambda]].sub.t] are different from the
[[lambda].sub.t] in our baseline model, since the
[[bar.[lambda]].sub.t] allow only the value of [[alpha].sup.i] to change
over time, and not that of the persistent characteristics
[p.sup.i.sub.a,t].) The previous studies typically use a
simpler version of this model that excludes the
[[bar.[lambda]].sub.t][[alpha].sup.i] component from equation A.1. For
our purposes the inclusion of the [[bar.[lambda]].sub.t][[alpha].sup.i]
component is necessary, because we cannot remove the income variation
that is due to characteristics such as education, and in our context it
is key to allow the prices of such characteristics to change over time.
(47)
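To make the role of the year-specific loadings concrete, the following sketch computes the transitory variance implied by equations A.3 and A.4: because the shocks are i.i.d. with a common variance, the variance of the transitory component in year t is that variance times the sum of the squared loadings on the current and two lagged shocks. The function name, the dictionary layout, and the treatment of pre-sample years (loading set to 1) are assumptions made for illustration; the sketch also ignores the fact that the youngest ages have fewer than two lagged shocks.

```python
def transitory_variance(pi, theta1, theta2, sigma2_eps):
    """Variance of the transitory component in each year under A.3 and A.4.

    pi maps calendar year -> loading pi_t. Years before the first key are
    assumed (for this sketch only) to have a loading of 1.
    """
    first_year = min(pi)

    def load(year):
        return pi[year] if year >= first_year else 1.0

    return {
        t: sigma2_eps * (load(t) ** 2
                         + (theta1 * load(t - 1)) ** 2
                         + (theta2 * load(t - 2)) ** 2)
        for t in sorted(pi)
    }


# Illustrative call using the male-earnings estimates reported in table A.1
# for the first three sample years.
pi_male = {1987: 1.0000, 1988: 1.0867, 1989: 1.0160}
for year, value in transitory_variance(pi_male, 0.2396, 0.1353, 0.1749).items():
    print(year, round(value, 4))
```

Because the moving-average coefficients are well below 1, most of the year-to-year movement in this variance comes from the current-year loading.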
Table A.1 presents point estimates and standard errors for the
above model for male earnings and for total pre-tax household income,
the latter using our sample of all households. Figure A.1 shows the
corresponding decompositions of the cross-sectional variance of male
earnings. Note that the component of the variance labeled
"persistent" is the sum of the contributions of both
[[bar.[lambda]].sub.t][[alpha].sup.i] and
[p.sup.i.sub.a,t] to the cross-sectional variance. As in our baseline
ECM, the persistent variance component displays a clearly increasing
trend, rising from 0.39 squared log point in 1987 to 0.50 squared log
point in 2009. Fitting a linear time trend to this series yields an
estimated trend coefficient of 0.0041 (with a standard error of 0.0003),
similar to that obtained with our baseline nonstationary specification.
The transitory part of the variance, the lowest line in figure A.1,
again exhibits no trend (an estimated linear time trend yields a
coefficient of essentially zero).
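The trend coefficients reported in this appendix come from fitting a linear time trend to each estimated variance series. A minimal sketch of such a fit is given below. It assumes an ordinary least squares regression on a constant and the calendar year with the classical standard error, which the text does not spell out, and it uses a hypothetical straight-line series between the two endpoints quoted above, since the year-by-year estimates are not reproduced here.

```python
import math


def linear_trend(years, series):
    """Return (slope, standard error) from an OLS fit of series on year."""
    n = len(years)
    x_bar = sum(years) / n
    y_bar = sum(series) / n
    sxx = sum((x - x_bar) ** 2 for x in years)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(years, series))
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    # Residual sum of squares and the classical OLS standard error of the slope.
    rss = sum((y - intercept - slope * x) ** 2 for x, y in zip(years, series))
    se = math.sqrt(rss / (n - 2) / sxx)
    return slope, se


# Hypothetical placeholder series: a straight line from 0.39 (1987) to 0.50 (2009).
years = list(range(1987, 2010))
series = [0.39 + (0.50 - 0.39) * (t - 1987) / (2009 - 1987) for t in years]
slope, se = linear_trend(years, series)
print(f"slope = {slope:.4f}, se = {se:.4f}")
```

By construction the placeholder series has a slope of 0.005 per year and a standard error of zero. The coefficient of 0.0041 reported in the text comes from the estimated series itself, not from its endpoints.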
Figure A.2 separates the persistent variance component in this
model into the contributions of the terms
[[bar.[lambda]].sub.t][[alpha].sup.i] and [p.sup.i.sub.a,t]. As the
figure shows, the increase in this component is driven by an increase in
the variance of [[bar.[lambda]].sub.t][[alpha].sup.i], whereas the
variance of [p.sup.i.sub.a,t] fluctuates but does not exhibit any clear
trend. As the absence of a trend in var([p.sup.i.sub.a,t]) implies, the
estimated variance of the persistent shocks
([[phi].sup.2.sub.t][[sigma].sup.2.sub.[eta]]) in table A.1 varies
substantially from year to year but has remained relatively stable on
average over our sample period. (48) For the question addressed in this
paper, these results are very similar to those obtained with our
baseline model.
[FIGURE A.1 OMITTED]
[FIGURE A.2 OMITTED]
[FIGURE A.3 OMITTED]
Figure A.3 shows the decomposition, using the alternative model, of
the cross-sectional variance of total pre-tax household income for our
sample of all households. Here, too, the results are similar to those
obtained with our baseline specification. There is a clear rising trend
in the persistent component of the variance, and this increase is
concentrated in the first half of the sample period. The transitory
variance component fluctuates but overall is largely flat, except
perhaps for a small increase in the last few years of the period.
Fitting a linear time trend to the persistent and transitory variance
components yields trend coefficients of 0.0055 (0.0005) and 0.0005
(0.0005), respectively. Again, most of the increase in the
cross-sectional variance of total pre-tax household income was driven by
the variance of the persistent component of income. In fact, this
specification implies that the transitory variance component played even
less of a role than in our baseline model (compare the 0.0005 estimated
trend coefficient on the transitory variance component with the 0.0013
coefficient shown in the bottom right panel of table 5).
[FIGURE A.4 OMITTED]
Figure A.4 shows the contributions of
var([[bar.[lambda]].sub.t][[alpha].sup.i]) and var([p.sup.i.sub.a,t]) to
the persistent variance component in the same decomposition and
indicates that, similar to the case of male earnings, the increase in
the persistent variance component was driven by an increase in
var([[bar.[lambda]].sub.t][[alpha].sup.i]), that is, by an increase in
[[bar.[lambda]].sub.t]. Fitting a linear time trend to the
var([p.sup.i.sub.a,t]) series yields a trend coefficient of 0.0008
(0.0005), implying only a minor increase of about 0.02 squared log point
over 23 years.
Overall, for the question asked in this paper, the results obtained
with this alternative specification are very similar to those obtained
with our baseline model.
Table A.1. Estimates of the Alternative Nonstationary Error Components Model (a)

                                  Male labor earnings                 Pre-tax household income, all households
Parameter                         Persistent        Transitory        Persistent        Transitory
                                  component         component         component         component

[[sigma].sup.2.sub.[alpha]]       0.1458 (0.0235)                     0.1313 (0.0187)
[[bar.[lambda]].sub.t] polynomial (b)
[b.sub.1]                         0.0136 (0.0419)                     0.0170 (0.0425)
[b.sub.2] (x 10)                  0.0190 (0.0584)                     0.0488 (0.0567)
[b.sub.3] (x 100)                -0.0175 (0.0337)                    -0.0443 (0.0316)
[b.sub.4] (x 1000)                0.0038 (0.0069)                     0.0097 (0.0064)
[psi]                             0.9619 (0.0058)                     0.9693 (0.0041)
[[sigma].sup.2.sub.[eta]]         0.0296 (0.0040)                     0.0248 (0.0025)
[[theta].sub.1]                                     0.2396 (0.0163)                     0.2877 (0.0114)
[[theta].sub.2]                                     0.1353 (0.0179)                     0.1703 (0.0142)
[[sigma].sup.2.sub.[epsilon]]                       0.1749 (0.0166)                     0.1533 (0.0122)
[[phi].sub.t] or [[pi].sub.t] (c)
1987                              1.0000            1.0000            1.0000            1.0000
1988                              1.1382 (0.3929)   1.0867 (0.0571)   1.3067 (0.2610)   1.0068 (0.0496)
1989                              1.2149 (0.2613)   1.0160 (0.0584)   0.9528 (0.3390)   1.0198 (0.0443)
1990                              0.8679 (0.3494)   0.9985 (0.0530)   0.9199 (0.3456)   0.9910 (0.0465)
1991                              1.0022 (0.2808)   0.9845 (0.0536)   1.0393 (0.2798)   0.9619 (0.0418)
1992                              0.7569 (0.3553)   1.0887 (0.0505)   0.0028 (0.3144)   1.0833 (0.0474)
1993                              1.1759 (0.2280)   1.0444 (0.0483)   1.2418 (0.2059)   1.0038 (0.0492)
1994                              0.0051 (0.2890)   1.0659 (0.0509)   0.1117 (0.2974)   1.0396 (0.0522)
1995                              1.2536 (0.1737)   1.0071 (0.0580)   1.2674 (0.1697)   0.9879 (0.0520)
1996                              0.7085 (0.3096)   1.0319 (0.0567)   0.8273 (0.2178)   1.0129 (0.0525)
1997                              1.0256 (0.1961)   0.9884 (0.0552)   0.8840 (0.1912)   1.0257 (0.0544)
1998                              0.7510 (0.2669)   1.0236 (0.0599)   1.1198 (0.1457)   1.0285 (0.0550)
1999                              0.9245 (0.2239)   1.0070 (0.0578)   0.4886 (0.2776)   1.0629 (0.0520)
2000                              0.8369 (0.2959)   1.0463 (0.0654)   0.8887 (0.2519)   1.0652 (0.0555)
2001                              1.1827 (0.1766)   0.9787 (0.0613)   1.2008 (0.1599)   0.9645 (0.0527)
2002                              1.2415 (0.1344)   0.9859 (0.0538)   1.1278 (0.1612)   0.9388 (0.0502)
2003                              0.7567 (0.1758)   1.0239 (0.0601)   1.1011 (0.1411)   0.9511 (0.0494)
2004                              1.0366 (0.1428)   0.9898 (0.0534)   1.1579 (0.0958)   1.0118 (0.0470)
2005                              0.7861 (0.2186)   1.0202 (0.0565)   0.9594 (0.1427)   1.0476 (0.0481)
2006                              1.0178 (0.1622)   1.0685 (0.0582)   1.1553 (0.1235)   1.0720 (0.0478)
2007                              0.7559 (0.2257)   1.0508 (0.0553)   0.7027 (0.1846)   1.1380 (0.0482)
2008                              1.2584 (0.1457)   1.0277 (0.0544)   1.1606 (0.1313)   1.0100 (0.0474)
2009                                                1.0208 (0.0557)                     0.9954 (0.0491)
Source: Authors' calculations using SOI data.
(a.) Estimates of equations A.1 through A.4 using a minimum distance
estimator (see section V). Bootstrap standard errors based on 200
replications are in parentheses.
(b.) See appendix D for specification of the polynomial.
(c.) Panel reports estimates of parameters [[phi].sub.t] (for the
persistent component) and [[pi].sub.t] (for the transitory component)
corresponding to each year of the sample period (1987-2009); parameters
are normalized to equal 1 in 1987 (see appendix D).
APPENDIX B
KSS and GM Methods
Let [[xi].sup.i.sub.t] be residual log income, where t is the
calendar year, and where the age index a is suppressed for convenience.
In the KSS methodology, the persistent variance in year t is
var([1/P][[summation].sup.t+k.sub.j=t-k][[xi].sup.i.sub.j]), where
k = (P - 1)/2, and where the variance is computed across all individuals
(or households) for whom
[1/P][[summation].sup.t+k.sub.j=t-k][[xi].sup.i.sub.j] is defined for a
given t. The transitory variance at t is var([[xi].sup.i.sub.t] -
[1/P][[summation].sup.t+k.sub.j=t-k][[xi].sup.i.sub.j]). Following
Kopczuk, Saez, and Song (2010), we set P = 5.
In the GM methodology, let N be the number of individuals,
[T.sub.i] [less than or equal to] P the number of years (within the
P-year window) that person i is observed, [bar.[[xi].sub.i]] the
person-specific average residual log income over [T.sub.i] years,
[bar.[xi]] the mean of residual log income across the full sample, and
[bar.T] the mean years covered by the window over the individuals in the
sample. Then, the exact formula (within each fixed-size window) for the
transitory variance is [[hat.[sigma]].sup.2.sub.[upsilon]] =
[1/N][[summation].sup.N.sub.i=1][1/([T.sub.i] -
1)][[summation].sub.t][([[xi].sup.i.sub.t] -
[bar.[[xi].sub.i]]).sup.2], and for the persistent variance is [1/(N -
1)][[summation].sup.N.sub.i=1][([bar.[[xi].sub.i]] -
[bar.[xi]]).sup.2] - [[[hat.[sigma]].sup.2.sub.[upsilon]]/[bar.T]].
The persistent and transitory variances from GM are similar,
although not identical, to the KSS ones. The main difference lies in the
presence of the term -([[hat.[sigma]].sup.2.sub.[upsilon]]/[bar.T]) in the persistent
GM variance (see Gottschalk and Moffitt 2009, footnote 2).
Note that Gottschalk and Moffitt use P = 9 (rather than our P = 5
in the main text). This slightly reduces the share of the total variance
attributed to the persistent component, and slightly increases the share
attributed to the transitory component, but has no effect on the trends
of the two components.
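The two nonparametric decompositions described in this appendix can be sketched in a few lines. The code below is an illustration under stated assumptions, not the authors' implementation: the panel is a hypothetical dictionary mapping each person to a dictionary of residual log income by year, sample variances (dividing by n - 1) are used for the KSS variances, and the GM transitory variance is taken to be the average within-person sample variance, the standard variance-components form.

```python
from statistics import mean, variance


def kss_decomposition(panel, t, P=5):
    """KSS split for year t using a centered P-year window (P odd)."""
    k = (P - 1) // 2
    window = range(t - k, t + k + 1)
    averages, deviations = [], []
    for xi in panel.values():
        if all(year in xi for year in window):  # window mean must be defined
            avg = mean(xi[year] for year in window)
            averages.append(avg)
            deviations.append(xi[t] - avg)
    return variance(averages), variance(deviations)


def gm_decomposition(panel, window):
    """GM split within one window; persons need at least two observed years."""
    window = list(window)
    rows = [[xi[y] for y in window if y in xi] for xi in panel.values()]
    rows = [r for r in rows if len(r) >= 2]
    n = len(rows)
    person_means = [mean(r) for r in rows]
    grand_mean = mean(v for r in rows for v in r)
    t_bar = mean(len(r) for r in rows)
    # Average within-person variance (assumed form of the transitory variance).
    transitory = sum(
        sum((v - m) ** 2 for v in r) / (len(r) - 1)
        for r, m in zip(rows, person_means)
    ) / n
    # Between-person variance of the means, corrected by transitory / T-bar.
    persistent = (sum((m - grand_mean) ** 2 for m in person_means) / (n - 1)
                  - transitory / t_bar)
    return persistent, transitory


# Toy panel: three people whose residual income never changes over 1990-94.
toy = {name: {year: level for year in range(1990, 1995)}
       for name, level in [("a", 1.0), ("b", 3.0), ("c", 5.0)]}
print(kss_decomposition(toy, 1992))
print(gm_decomposition(toy, range(1990, 1995)))
```

In the toy panel all dispersion is between people, so both methods assign the entire variance to the persistent component. With real data the two differ by the correction term in the GM persistent variance.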
APPENDIX C
Estimation of the Error Components Model
This appendix provides details of our minimum distance estimator.
As mentioned in the text, the estimator matches the model's
theoretical variances and autocovariances (specified in levels) to their
empirical counterparts. In particular, given any triplet (a, t, k) of
normalized age a, calendar year t, and lead k, the error components
model in equations 6 through 9 implies a specific parametric form for
each autocovariance of residual income, such as cov([[xi].sub.a,t],
[[xi].sub.a+k,t+k]). For instance, for (a = 2, t = 1995, k = 0), this
would be the variance (since k = 0) in the incomes across all
individuals of age 26 in year 1995. These theoretical variances and
autocovariances, denoted by cov(a, t, k), are functions of the model
parameters [[sigma].sup.2.sub.[alpha]], [psi],
[[sigma].sup.2.sub.[eta]], [[sigma].sup.2.sub.[epsilon]],
[[theta].sub.1], and [[theta].sub.2], as well as [[lambda].sub.t] and
[[pi].sub.t] for t = 1987, ..., 2009. We estimate these model parameters
by minimizing the distance between, on the one hand, the theoretical
variances and autocovariances implied by the model, and on the other,
their empirical counterparts, which we compute from our longitudinal tax
return data for a = 1, ..., 36; t = 1987, ..., 2009; and k = 0, ..., 22.
This yields 7,912 variances and autocovariances that are matched in
estimation. Our minimum distance estimator uses a diagonal matrix as the
weighting matrix, with weights equal to the inverse of the number of
observations used to compute each empirical statistical moment. (49) We
do not use an optimal weighting matrix, for reasons discussed in Altonji
and Segal (1996).
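The count of matched moments can be checked by enumerating the triplets (a, t, k). The sketch below assumes that a moment is available whenever both members of the pair fall inside the sample, that is, when a + k does not exceed 36 and t + k does not exceed 2009. This availability rule is an interpretation of the text, and the function name is illustrative.

```python
def count_moments(max_age=36, first_year=1987, last_year=2009, max_lead=22):
    """Count (a, t, k) triplets whose lead observation stays in the sample."""
    count = 0
    for k in range(max_lead + 1):
        for a in range(1, max_age + 1):
            for t in range(first_year, last_year + 1):
                if a + k <= max_age and t + k <= last_year:
                    count += 1
    return count


print(count_moments())
```

Under this rule the enumeration gives 7,912, the number of variances and autocovariances reported above.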
APPENDIX D
Moment Conditions
Let a be "normalized age" or "potential
experience," defined as a = age - 25 + 1, or years starting with
age 25. Then, the theoretical moments implied by our baseline error
components model in equations 6 through 9 are as follows:
[MATHEMATICAL EXPRESSION NOT REPRODUCIBLE IN ASCII],
where 1[.] is an indicator function equal to either 0 or 1.
For t = 1987, 2 [less than or equal to] a [less than or equal to]
36,
var([p.sub.a,1987]) = [[sigma].sup.2.sub.[eta]] [(1 -
[[psi].sup.2a])/(1 - [[psi].sup.2])].
For 1987 [less than or equal to] t [less than or equal to] 2009, a
= 1,
var([p.sub.1,t]) = [[sigma].sup.2.sub.[eta]].
For 1988 [less than or equal to] t [less than or equal to] 2009, 2
[less than or equal to] a [less than or equal to] 36,
var([p.sub.a,t]) = [[psi].sup.2] var([p.sub.a-1,t-1]) +
[[sigma].sup.2.sub.[eta]].
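The first-year expression for var([p.sub.a,1987]) is the closed form of the same recursion that applies in later years, started from the age-1 variance. The sketch below checks this numerically. The parameter values are placeholders borrowed from the alternative-specification estimates in table A.1 purely for illustration, and the function names are hypothetical.

```python
def var_p_closed_form(a, psi, sigma2_eta):
    """Closed form used for the first sample year: geometric sum of shocks."""
    return sigma2_eta * (1 - psi ** (2 * a)) / (1 - psi ** 2)


def var_p_recursive(a, psi, sigma2_eta):
    """Iterate var(p_a) = psi^2 * var(p_{a-1}) + sigma2_eta from var(p_1)."""
    v = sigma2_eta  # normalized age a = 1
    for _ in range(2, a + 1):
        v = psi ** 2 * v + sigma2_eta
    return v


psi, sigma2_eta = 0.9619, 0.0296  # illustrative values only
for a in (1, 2, 10, 36):
    closed = var_p_closed_form(a, psi, sigma2_eta)
    recursive = var_p_recursive(a, psi, sigma2_eta)
    print(a, round(closed, 4), round(recursive, 4))
```

The two columns agree at every age, which is why a single recursion can be used for all years once the first-year variances are initialized with the closed form.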
To obtain identification, we impose the normalization
[[lambda].sub.t] = [[pi].sub.t] = 1 for all calendar years t [less than
or equal to] 1987, where 1987 is the first year in the sample. Parameter
[[lambda].sub.t] (normalized) is restricted to lie on a fourth-order
polynomial of the following form: for 1988 [less than or equal to] t
[less than or equal to] 2009, [MATHEMATICAL EXPRESSION NOT REPRODUCIBLE
IN ASCII], where [??] = t - 1987. (50)
APPENDIX E
Sample Age Distribution by Calendar Year
                Age (years) (a)
        Male earnings sample    All households sample (b)
Year    Mean    SD              Mean    SD
1987    39      9.9             39      10.0
1988    39      9.8             39      10.0
1989    39      9.8             39      9.9
1990    39      9.7             40      9.8
1991    39      9.6             40      9.8
1992    40      9.7             40      9.8
1993    40      9.6             40      9.7
1994    40      9.6             40      9.7
1995    40      9.6             40      9.8
1996    40      9.6             40      9.8
1997    40      9.6             41      9.8
1998    41      9.7             41      9.8
1999    41      9.6             41      9.8
2000    41      9.6             41      9.8
2001    41      9.7             41      9.9
2002    41      9.7             41      9.9
2003    41      9.7             42      10.0
2004    41      9.8             42      10.0
2005    41      9.9             42      10.1
2006    41      10.0            42      10.2
2007    41      10.0            42      10.2
2008    42      10.1            42      10.3
2009    42      10.1            42      10.3
Source: Authors' calculations using SOI data.
(a.) SD = standard deviation.
(b.) Age is that of the primary filer.
ACKNOWLEDGMENTS We are grateful to the editors, to our discussants
Greg Kaplan, Lindsay Owens, and David Grusky, and to Chris Carroll for
extremely useful feedback and suggestions. We thank Joe Altonji, Eric
Engen, Michael Golosov, Michael Palumbo, Emmanuel Saez, Dan Sichel, and
Paul Smith for very helpful comments and discussions. We also thank the
participants at the Brookings Panel and at numerous other seminars and
conferences. The views presented here are solely those of the authors
and do not necessarily represent those of the Treasury Department, the
Board of Governors of the Federal Reserve System, or members of their
staffs. The authors report no relevant conflicts of interest.
References
Abowd, John, and David Card. 1989. "On the Covariance Structure of
Hours and Earnings Changes." Econometrica 57, no. 2: 411-45.
Aguiar, Mark, and Mark Bils. 2012. "Has Consumption Inequality
Mirrored Income Inequality?" Princeton University and University of
Rochester (May).
Altonji, Joseph, and Lewis M. Segal. 1996. "Small Sample Bias
in GMM Estimation of Covariance Structure." Journal of Business and
Economic Statistics 14, no. 3: 353-66.
Altonji, Joseph, Anthony Smith, and Ivan Vidangos. Forthcoming.
"Modeling Earnings Dynamics." Econometrica.
Attanasio, Orazio, Eric Battistin, and Hide Ichimura. 2007.
"What Really Happened to Consumption Inequality in the US?" In
Hard-to-Measure Goods and Services: Essays in Honor of Zvi Griliches,
edited by E. Berndt and C. Hulten. University of Chicago Press.
Attanasio, Orazio, Eric Battistin, and Mario Padula. 2011.
"Inequality in Living Standards since 1980: Income Tells Only a
Small Part of the Story." Washington: American Enterprise
Institute.
Attanasio, Orazio, Erik Hurst, and Luigi Pistaferri. 2012.
"The Evolution of Income, Consumption, and Leisure Inequality in
the US, 1980-2010." Working Paper no. 17982. Cambridge, Mass.:
National Bureau of Economic Research (April).
Autor, David, Lawrence F. Katz, and Melissa S. Kearney. 2008.
"Trends in U.S. Wage Inequality: Revising the Revisionists."
Review of Economics and Statistics 90: 300-23.
Baker, Michael. 1997. "Growth-Rate Heterogeneity and the
Covariance Structure of Life-Cycle Earnings." Journal of Labor
Economics 15: 338-75.
Baker, Michael, and Gary Solon. 2003. "Earnings Dynamics and
Inequality among Canadian Men, 1976-1992: Evidence from Longitudinal
Income Tax Records." Journal of Labor Economics 21: 289-321.
Blundell, Richard, Luigi Pistaferri, and Ian Preston. 2008.
"Consumption Inequality and Partial Insurance." American
Economic Review 98: 1887-1921.
Bound, John, and George E. Johnson. 1992. "Changes in the
Structure of Wages in the 1980s: An Evaluation of Alternative
Explanations." American Economic Review 82: 371-92.
Card, David, and Thomas Lemieux. 1996. "Wage Dispersion,
Returns to Skill, and Black-White Wage Differentials." Journal of
Econometrics 74, no. 2: 319-61.
Carroll, Christopher. 1992. "The Buffer-Stock Theory of
Saving: Some Macroeconomic Evidence." BPEA, no. 2: 61-135.
Carroll, Christopher, and Andrew Samwick. 1997. "The Nature of
Precautionary Wealth." Journal of Monetary Economics 40, no. 1:
41-71.
Celik, Sule, Chinhui Juhn, Kristin McCue, and Jesse Thompson. 2012.
"Recent Trends in Earnings Volatility: Evidence from Survey and
Administrative Data." B.E. Journal of Economic Analysis and Policy
12, no. 2: article 1 (December).
Congressional Budget Office. 2008. "Recent Trends in the
Variability of Individual Earnings and Household Income."
Washington. www.cbo.gov/ftpdocs/95xx/doc9507/06-30-Variability.pdf.
Dynan, Karen E., Douglas W. Elmendorf, and Daniel E. Sichel. 2012.
"The Evolution of Household Income Volatility." B.E. Journal
of Economic Analysis and Policy 12, no. 2: article 3 (December).
Gottschalk, Peter, and Robert Moffitt. 1994. "The Growth of
Earnings Instability in the U.S. Labor Market." BPEA, no. 2:
217-72.
--. 2009. "The Rising Instability of U.S. Earnings."
Journal of Economic Perspectives 23: 3-24.
Guvenen, Fatih. 2009. "An Empirical Investigation of Labor
Income Processes." Review of Economic Dynamics 12: 58-79.
Haider, Steven J. 2001. "Earnings Instability and Earnings
Inequality of Males in the United States, 1967-1991." Journal of
Labor Economics 19: 799-836.
Hause, John C. 1980. "The Fine Structure of Earnings and the
On-the-Job Training Hypothesis." Econometrica 48: 1013-29.
Heathcote, Jonathan, Fabrizio Perri, and Gianluca L. Violante.
2010. "Unequal We Stand: An Empirical Analysis of Economic
Inequality in the United States, 1967-2006." Review of Economic
Dynamics 13: 15-51.
Heathcote, Jonathan, Kjetil Storesletten, and Gianluca L. Violante.
2010. "The Macroeconomic Implications of Rising Wage Inequality in
the United States." Journal of Political Economy 118, no. 4:
681-722.
Hryshko, Dmytro. 2012. "Labor Income Profiles Are Not
Heterogeneous: Evidence from Income Growth Rates." Quantitative
Economics 3: 177-209.
Juhn, Chinhui J., Kevin M. Murphy, and Brooks Pierce. 1993.
"Wage Inequality and the Rise in Returns to Skill." Journal of
Political Economy 101: 410-42.
Katz, Lawrence F., and David Autor. 1999. "Changes in the Wage
Structure and Earnings Inequality." In Handbook of Labor Economics,
vol. III, edited by Orley Ashenfelter and David Card. Amsterdam: North
Holland.
Katz, Lawrence F., and Kevin M. Murphy. 1992. "Changes in
Relative Wages, 1963-87: Supply and Demand Factors." Quarterly
Journal of Economics 107: 35-78.
Kopczuk, Wojciech, Emmanuel Saez, and Jae Song. 2010.
"Earnings Inequality and Mobility in the United States: Evidence
from Social Security Data since 1937." Quarterly Journal of
Economics 125: 91-128.
Krueger, Dirk, and Fabrizio Perri. 2006. "Does Income
Inequality Lead to Consumption Inequality? Evidence and Theory."
Review of Economic Studies 73: 163-93.
Lemieux, Thomas. 2008. "What Do We Really Know about Changes
in Wage Inequality?" University of British Columbia.
Lillard, Lee A., and Yoram Weiss. 1979. "Components of
Variation in Panel Earnings Data: American Scientists 1960-1970."
Econometrica 47: 437-54.
Lillard, Lee A., and Robert J. Willis. 1978. "Dynamic Aspects
of Earning Mobility." Econometrica 46: 985-1012.
Low, Hamish, Costas Meghir, and Luigi Pistaferri. 2010. "Wage
Risk and Employment Risk over the Life Cycle." American Economic
Review 100, no. 4: 1432-67.
MaCurdy, Thomas E. 1982. "The Use of Time Series Processes to
Model the Error Structure of Earnings in a Longitudinal Data
Analysis." Journal of Econometrics 18: 83-114.
Meghir, Costas, and Luigi Pistaferri. 2004. "Income Variance
Dynamics and Heterogeneity." Econometrica 72: 1-32.
Moffitt, Robert, and Peter Gottschalk. 1995. "Trends in the
Covariance Structure of Earnings in the U.S.: 1969-1987." Johns
Hopkins University and Boston College.
--. 2011. "Trends in the Transitory Variance of Male Earnings
in the U.S., 1970-2004." Working Paper no. 16833. Cambridge, Mass.:
National Bureau of Economic Research.
Murphy, Kevin M., and Finis Welch. 1992. "The Structure of
Wages." Quarterly Journal of Economics 107: 285-326.
Piketty, Thomas, and Emmanuel Saez. 2003. "Income Inequality
in the United States, 1913-1998." Quarterly Journal of Economics
118: 1-39.
--. 2007. "How Progressive Is the U.S. Federal Tax System? A
Historical and International Perspective." Journal of Economic
Perspectives 21: 3-24.
Primiceri, Giorgio, and Thijs van Rens. 2009. "Heterogeneous
Life-Cycle Profiles, Income Risk and Consumption Inequality."
Journal of Monetary Economics 56: 20-39.
Sabelhaus, John, and Jae Song. 2009. "Earnings Volatility
across Groups and Time." Working paper. University of Maryland and
U.S. Social Security Administration.
--. 2010. "The Great Moderation in Micro Labor Earnings."
Journal of Monetary Economics 57: 391-403.
Shin, Donggyun, and Gary Solon. 2011. "Trends in Men's
Earnings Volatility: What Does the Panel Study of Income Dynamics
Show?" Journal of Public Economics 95: 973-82.
Slesnick, Daniel T. 2001. "Consumption and Social Welfare:
Living Standards and Their Distribution in the United States."
Cambridge University Press.
Comments and Discussion
COMMENT BY GREG KAPLAN
This paper by Jason DeBacker and coauthors provides a new
perspective on the much-documented rise in income inequality in the
United States, by exploiting confidential data on labor earnings and
household income from the Internal Revenue Service (IRS). The IRS data
contain information from a large panel of tax returns over the period
from 1987 to 2009. The authors use these data to ask whether the recent
rise in inequality is mostly due to persistent or to transitory factors.
Other authors have answered this question using survey data,
predominantly from the Panel Study of Income Dynamics (PSID), and for
earlier periods. But this paper breaks new ground in its use of
high-quality administrative data to decompose the rise in inequality in
the 1990s and 2000s.
DeBacker and his coauthors reach a stark conclusion: all of the
recent rise in inequality in male earnings is due to persistent factors;
transitory factors have made no contribution to the increase in
inequality. Their findings for total household income are similar but
less extreme. The authors reach these conclusions using two different
approaches. First, they employ simple nonparametric methods, which
effectively measure the persistent component of income as a rolling
average of income in a given number of adjacent years, and the
transitory component as the residual from this rolling average. Second,
they estimate error components models (ECMs) for earnings. The ECM
approach involves specifying and estimating the parameters of a
time-varying stochastic process for income. The persistent and
transitory components are then inferred from the estimated model. The
authors' conclusions about the relative importance of persistent
versus transitory factors are consistent across the two methods.
In this discussion I will elaborate on three issues that are
related to these findings, focusing exclusively on the ECM analysis of
male labor earnings. First, I will use data from the PSID to investigate
how the particular choice of ECM framework may have influenced the
authors' conclusions. In doing so I will distinguish between
factors that are fixed at the time of entry into the labor market, and
shocks that are realized after entry. I will attempt to shed light on
which of these factors is responsible for the increase in the persistent
variance. I will also explain how an increase in the variance of shocks
that occurred before 1987 could be responsible for the observed increase
in inequality from 1987 to 2009 even in the absence of any changes in
the labor market during this period. Second, I will use the PSID data to
investigate the importance of changes in the returns to education in
accounting for the authors' findings. I will show that the findings
are mostly consistent across the two data sets and are not substantially
affected by controlling for education. Third, I will highlight an issue
that the authors do not address, but that is a natural one to raise in
light of their findings, and given their access to the IRS data: in
which part of the income distribution is the recent rise in inequality
concentrated? I will conduct a simple decomposition using the PSID data
to investigate this issue.
How do the publicly available PSID data compare with the
confidential IRS data used by the authors? The baseline sample of male
earners from the IRS contains 221,099 person-year observations on 20,859
individuals over the period 1987-2009. In all of the analyses that
follow, I use a sample of male heads of households from the PSID that
imposes the same selection criteria for age and minimum annual earnings
as the authors impose on the IRS data. The resulting sample contains
70,479 person-year observations on 6,778 individuals over the period
1970-2008 (the data are biennial after 1996). Thus, the IRS sample is
about three times the size of the PSID sample, both in terms of
individuals and in terms of individual-year observations.
My figure 1 plots inequality in male earnings, as measured by the
standard deviation of the logarithm, in the two data sets over time. For
the period over which the two samples overlap, the trends in inequality
are very similar. The level of inequality is about 0.1 log point higher
in the IRS data, likely because of undersampling of very high earners in
the PSID. Moreover, the IRS series appears far less noisy than the PSID
series, which reinforces the view that the IRS data are useful for
reevaluating questions that have been addressed using PSID data in the
existing literature, such as the cyclicality of idiosyncratic labor
income risk (compare, for example, the difference in the increase in
inequality during the 1990-91 recession in the two series in this
figure).
Figure 1 also puts in perspective the magnitude of the rise in
inequality that DeBacker and coauthors decompose. Although inequality
has undoubtedly increased between 1987 and 2009 in the IRS data, the
magnitude of the increase is smaller (about 0.05 log point) than that in
the 1970s and 1980s in the PSID data (about 0.15 log point). Both data
sets have advantages and disadvantages. The IRS data set is cleaner and
larger and has better coverage at the top of the earnings distribution.
Yet it is confidential and lacks data on demographic information, such
as education. The PSID data, on the other hand, are publicly available
and contain many demographic and financial variables.
[FIGURE 1 OMITTED]
The ECM framework that DeBacker and his coauthors employ is one of
many possible choices. Consider the following parametric model for
residual log earnings of individual i in year t, [[xi].sup.i.sub.t]:
(1) [MATHEMATICAL EXPRESSION NOT REPRODUCIBLE IN ASCII],
where [[epsilon].sup.i.sub.t] and [[eta].sup.i.sub.t] are mean-zero
i.i.d. (over time) shocks with constant variances
[[sigma].sup.2.sub.[epsilon]] and [[sigma].sup.2.sub.[eta]], and
[[alpha].sup.i] is a mean-zero fixed effect with variance
[[sigma].sup.2.sub.[alpha]]. The authors refer to the component
([[alpha].sup.i] + [p.sup.i.sub.t]) as the persistent component and to
[[tau].sup.i.sub.t] as the transitory component. I will adopt the same
terminology. The process in equation 1 differs from the one in the paper
only in that the transitory component is modeled as an MA(1) rather than
an MA(2) process. This difference is not consequential and helps to
simplify the analysis.
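Because equation 1 is not reproduced in this text, the following is only a sketch of a process consistent with the verbal description: a fixed effect, an autoregressive persistent component with parameter \psi, and an MA(1) transitory component with parameter \theta (both parameters are named later in the discussion). The exact layout and any initial condition on the persistent component are assumptions.

```latex
\begin{align*}
\xi^i_t  &= \alpha^i + p^i_t + \tau^i_t, \\
p^i_t    &= \psi\, p^i_{t-1} + \eta^i_t, \\
\tau^i_t &= \epsilon^i_t + \theta\, \epsilon^i_{t-1}.
\end{align*}
```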
Decomposing changes over time in the variance of residual log
earnings requires allowing some or all of the parameters in equation 1
to change over time. There are many ways to do this. One natural way,
which I will refer to as version A, allows the variances of the two
shocks to change over time, and the price of the fixed effect to change
over time. Thus, in version A the variances of the two shocks become
[[sigma].sup.2.sub.[epsilon]t] and [[sigma].sup.2.sub.[eta]t], and the
first line in equation 1 is modified to read
(2) [[xi].sup.i.sub.t] = [[lambda].sub.[alpha],t][[alpha].sup.i] +
[p.sup.i.sub.t] + [[tau].sup.i.sub.t],
where a normalization is imposed on [[lambda].sub.[alpha],0]. In
this interpretation the ECM changes over time for two reasons:
individuals experience persistent and transitory shocks that are drawn
from a more or less dispersed distribution, and the market price of an
individual's fixed skills is changing over time.
Figure 2 shows the results from estimating ECM version A using the
PSID data. The estimate of the autoregressive parameter, [psi], is
0.962, and the estimate of the moving average parameter, [theta], is
0.215. To keep the procedure as close as possible to that in the paper,
I have restricted the price of skills, [[lambda].sub.[alpha],t], and the
variance of persistent shocks, [[sigma].sup.2.sub.[eta]t], to lie on
fourth-degree polynomials in t. The variance of the transitory shock,
[[sigma].sup.2.sub.[epsilon]t], is left unrestricted. Consistent with a
large existing literature, the estimates reveal that the variance of
persistent shocks increased from the late 1970s to the late 1980s, but
was then constant until the mid-2000s before starting to rise again. The
variance of the fixed component,
[[lambda].sup.2.sub.[alpha],t][[sigma].sup.2.sub.[alpha]], also increased
during the 1970s and 1980s, but then declined substantially in the 1990s
and early 2000s.
The implied variance of the total persistent component,
[[lambda].sup.2.sub.[alpha],t][[sigma].sup.2.sub.[alpha]] + var([p.sup.i.sub.t]), is shown in the
left-hand panel of figure 3. The PSID estimates of ECM version A suggest
that the variance of the persistent component of income increased
sharply from 1975 to 1990, but was flat (or declined slightly) between
1990 and 2005. After 2005 the variance began to increase again. The
behavior of the variance of the persistent component in the 1990s
contrasts with DeBacker and coauthors' finding of an increase in
the 1990s in the IRS data. Yet given the estimated variances in figure
2, one might be surprised that the PSID estimates do not reveal an even
larger decline in the variance of the persistent component: that graph
shows that the 1990s were a period with no increase in the variance of
persistent shocks, while the variance of the fixed component declined
substantially. The reason why the variance of the total persistent
component does not decline more is that even though there was no
increase in the variance of persistent shocks during this period, the
earlier increases in [[sigma].sub.[eta]t] during the 1980s led to a
continued increase in the variance of the persistent component,
var([p.sup.i.sub.t]), well into the 1990s. This occurs because it takes
time for the older cohorts who were subject to the small shocks of the
1970s to be replaced by the younger cohorts who were subject to large
shocks for their entire working life.
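The cohort mechanism can be illustrated with a small numerical sketch. It assumes that each cohort enters the labor market with no accumulated shocks, works for 35 years, and that cohorts are equal in size; the shock variances are hypothetical, and only the autoregressive parameter (0.962) is taken from the PSID estimate reported above.

```python
import numpy as np

def persistent_variance_path(shock_var, psi=0.962, career_len=35):
    """Cross-sectional var(p_t) by year when each cohort enters with p = 0
    and works for `career_len` years; cohorts are assumed equal in size."""
    by_age = np.zeros(career_len)          # var(p) at each age in the previous year
    path = np.empty(len(shock_var))
    for t, s2 in enumerate(shock_var):
        new = np.empty(career_len)
        new[0] = s2                                # entrants carry only this year's shock
        new[1:] = psi**2 * by_age[:-1] + s2        # older workers carry decayed past shocks
        by_age = new
        path[t] = by_age.mean()                    # average across age groups
    return path

# Hypothetical shock variance: low through 1979, rising during the 1980s,
# then constant from 1989 onward. Early years serve as a warm-up period.
years = np.arange(1930, 2010)
shock_var = np.interp(years, [1979, 1989], [0.01, 0.02])
path = persistent_variance_path(shock_var)
v = dict(zip(years.tolist(), path))
```

In this sketch `v[2005]` exceeds `v[1990]` even though the shock variance is identical in the two years, because workers who accumulated part of their history under the smaller shocks are only gradually replaced.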
[FIGURE 2 OMITTED]
[FIGURE 3 OMITTED]
The cohort effect that arises from changes in [[sigma].sub.[eta]t]
is something to bear in mind when interpreting the findings in this
paper. The IRS data begin only in 1987, which is exactly when the
variance of the persistent shocks levels off in the PSID data. Thus if,
as one might expect, there was also an increase in [[sigma].sub.[eta]t]
before 1987 in the IRS data, one would expect to see an increase in the
variance of the persistent component in the 1990s. This increase would
not be due to changes that occurred after 1987, yet estimation using the
authors' strategy with IRS data would necessarily attribute the
increase to a change that occurred after 1987, since their framework
cannot handle lagged effects of pre-1987 changes. Unfortunately, little
can be done about this given the available data, and a similar criticism
might apply to the PSID estimates regarding changes that occurred before
1970.
An alternative way to allow the parameters in equation 1 to change
over time is to fix [[sigma].sub.[eta]] but modify the first line of the
equation to read
(3) [[xi].sup.i.sub.t] = [[lambda].sub.[alpha],t][[alpha].sup.i] +
[[lambda].sub.p,t][p.sup.i.sub.t] + [[tau].sup.i.sub.t].
I will refer to this model as ECM version B. Here the
interpretation is that the dispersion of the persistent shocks that hit
individuals does not change over time. Instead the accumulation of these
shocks, [p.sup.i.sub.t], is interpreted as slow movement in a stock of
individual-specific human capital or skills, which command a price in
the labor market [[lambda].sub.p,t]. The price of these skills is
allowed to change over time, which leads to changes in the
cross-sectional variance of residual earnings. The conceptual
distinction between [[alpha].sup.i] and [p.sup.i.sub.t] in this
interpretation is that [[alpha].sup.i] reflects skills that are
determined at the time of entry into the labor market, whereas
[p.sup.i.sub.t] reflects skills that continue to evolve stochastically after entry. Finally, one could also impose the restriction that
[[lambda].sub.[alpha],t] = [[lambda].sub.p,t] = [[lambda].sub.t], so
that the first line of equation 1 reads
(4) [[xi].sup.i.sub.t] = [[lambda].sub.t]([[alpha].sup.i] +
[p.sup.i.sub.t]) + [[tau].sup.i.sub.t].
I will refer to this model as ECM version C. Here the
interpretation is that the market does not distinguish between the value
of skills obtained before entry into the labor market (such as formal
education) and the value of skills acquired later in life (such as
on-the-job training or job-specific human capital). This is the
interpretation that the authors adopt, since version C is the
specification that the authors estimate with the IRS data.
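To make the comparison concrete, the variance of the total persistent component implied by each version can be written out as follows. This is a sketch that assumes the fixed effect and the accumulated shocks are uncorrelated, an assumption not stated explicitly in the text. In version A the time variation in var(p) comes from the changing shock variance, whereas in versions B and C the shock variance is fixed and the time variation comes through the prices.

```latex
\begin{align*}
\text{Version A:}\quad & \lambda_{\alpha,t}^{2}\,\sigma_{\alpha}^{2} + \operatorname{var}(p^i_t), \\
\text{Version B:}\quad & \lambda_{\alpha,t}^{2}\,\sigma_{\alpha}^{2} + \lambda_{p,t}^{2}\,\operatorname{var}(p^i_t), \\
\text{Version C:}\quad & \lambda_{t}^{2}\left(\sigma_{\alpha}^{2} + \operatorname{var}(p^i_t)\right).
\end{align*}
```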
How does the choice of ECM affect one's conclusions about the
rise in the persistent variance of earnings? My figure 3 attempts to
answer this question by reporting estimates of versions B and C from the
PSID as well as of version A. The left-hand panel shows that the
variance of the total persistent component is essentially identical in
all three versions (the three versions also deliver very similar
estimates for the autoregressive and the moving-average parameters).
Thus, to the extent that these findings carry over to the IRS data, it
is unlikely that the authors' conclusions about the rise in the
variance of the total persistent component would have been changed by
adopting either version A or version B.
Although the three versions of the ECM yield the same estimates
over time for the variance of the total persistent component, they yield
very different estimates for how this variance is divided between
factors that are fixed at the time of entry to the labor market,
[[alpha].sup.i], and factors that evolve stochastically over time,
[p.sup.i.sub.t]. These differences are illustrated in the right-hand
panel of figure 3, which shows the variance of the fixed effect,
[[lambda].sup.2.sub.[alpha],t][[sigma].sup.2.sub.[alpha]], for each of
the three versions. Version A, which allows for the size of persistent
shocks to change over time, attributes a much bigger role to movements
in the price of fixed skills in accounting for changes in the variance
of the persistent component, compared with either version B or version
C. The distinction between cross-sectional variation in earnings due to
fixed factors and variation due to the realization of shocks is
potentially important. First, the two views of the increase in earnings
inequality may have different implications for the increase in
consumption inequality (and thus welfare) in a structural life cycle
model of intertemporal consumption choice, since the impact of the
changes in [[lambda].sub.t] depends crucially on the assumptions one
makes about how these changes enter workers' information sets.
Second, the appropriate policy interventions for influencing the
earnings distribution are different: the latter view points to the
importance of labor market interventions, whereas the former points to
education interventions.
[FIGURE 4 OMITTED]
Cross-sectional variation in the fixed effect, [[alpha].sup.i], is
partly due to cross-sectional differences in observed education and
partly due to cross-sectional differences in unobserved cognitive and
noncognitive skills. Given the importance of changes in [[lambda].sub.t]
in accounting for the change in earnings inequality in the IRS data, it
is natural to ask whether these changes reflect an increase in returns
to traditional measures of education or an increase in returns to the
unobserved components of skills. This question cannot be answered with
the IRS data, but it can be answered with the PSID data. To address
this, my figure 4 presents estimates using data on residual log
earnings, [[xi].sup.i.sub.t], that are constructed in two different
ways. The lines labeled "without education controls" are
estimates based on data where [[xi].sup.i.sub.t] is constructed as the
residual from a regression of log earnings on a full set of age dummies
in each year. This is the same approach followed by DeBacker and
coauthors. The lines in figure 4 labeled "with education
controls" are estimates based on data where [[xi].sup.i.sub.t] is
constructed as the residual from a regression of log earnings on a full
set of age dummies, education dummies, and education x age interactions
in each year.
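The two constructions of residual log earnings can be sketched as follows. Regressing log earnings in each year on a full set of dummies (and, in the second case, all of their interactions) is equivalent to subtracting the mean within each cell, which is what the hypothetical helper below does. The simulated data and variable names are assumptions for illustration only.

```python
import numpy as np

def residualize_by_cell(log_earnings, year, *groups):
    """Residual from a year-by-year regression of log earnings on a full set
    of dummies for every combination of the grouping variables (a saturated
    regression), computed by subtracting the cell mean."""
    y = np.asarray(log_earnings, dtype=float)
    keys = np.column_stack([np.asarray(year)] + [np.asarray(g) for g in groups])
    _, cell = np.unique(keys, axis=0, return_inverse=True)
    cell = cell.ravel()
    cell_mean = np.bincount(cell, weights=y) / np.bincount(cell)
    return y - cell_mean[cell]

# Hypothetical sample with an age profile and a return to education.
rng = np.random.default_rng(1)
n = 5000
year = rng.integers(1990, 1993, n)
age = rng.integers(25, 30, n)
educ = rng.integers(0, 3, n)
y = 0.02 * age + 0.3 * educ + rng.normal(0.0, 0.5, n)

res_age = residualize_by_cell(y, year, age)         # "without education controls"
res_edu = residualize_by_cell(y, year, age, educ)   # "with education controls"
```

The gap between `res_age.var()` and `res_edu.var()` is the part of residual dispersion that is accounted for by observed education.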
The left-hand panel of figure 4 displays parameter estimates of ECM
version A with and without education controls. Both the estimates of the
variance of the fixed effects
[[lambda].sup.2.sub.[alpha],t][[sigma].sup.2.sub.[alpha]] and the
variance of persistent shocks [[sigma].sup.2.sub.[eta]t] are affected by
the education controls. At least half of the increase in the variance of
the fixed effects and the subsequent decline between 1970 and 2000 is
due to returns to education, but the increase in the variance in the
2000s is the same in both specifications. This result is useful in
interpreting DeBacker and coauthors' findings, since they cannot
control for education in the IRS data. Using the PSID findings as a
guide, one might conclude that the recent increase in the market price
of skills that the authors document would remain largely unchanged if
they were able to control for education. It appears that the increase is
driven by an increase in the returns to unobserved skills rather than
returns to formal education.
The right-hand panel of figure 4 offers an alternative perspective
on the likely effect of controlling for education on DeBacker and
coauthors' findings, by estimating ECM version C (the authors'
preferred specification) with and without education controls on the PSID
data. These estimates also indicate that the biggest differences in
trends under the two specifications occur before the 1990s, further
reinforcing the view that the increase in the variance of the persistent
component in the IRS data reflects an increase in returns to unobserved
skills within education groups.
Before concluding, I will raise one additional issue that the
authors do not tackle, but that could be addressed with their IRS data.
The authors focus their analysis on determining whether the recent
increase in earnings inequality has been persistent or transitory in
nature, and conclude that it is entirely the former. In addition, one
might ask which individuals have been most affected by this increase in
the variance of the persistent component. Specifically, many researchers
and policymakers are interested in understanding whether changes in
inequality affect mostly high-earnings individuals, low-earnings
individuals, or individuals in the middle of the income distribution.
The IRS data set is well suited to address this issue, again because it
is larger and cleaner than the PSID (particularly at the top of the
distribution). One possible approach to answering this question is to
decompose the cross-sectional variance of log earnings (or residual log
earnings), [y.sup.i.sub.t], in each year as follows:
(5) [MATHEMATICAL EXPRESSION NOT REPRODUCIBLE IN ASCII],
where [r.sup.i.sub.t] is the rank of individual i in the year t
earnings distribution. The first two terms in equation 5 are the
variances of earnings within the bottom and the top half of the earnings
distribution, respectively. The third term in equation 5 is the
component of the variance of earnings that is due to the difference in
average earnings between the top half and the bottom half. The
decomposition here focuses on the overall cross-sectional variance, but
the panel nature of the IRS data lends itself to a similar decomposition
of only the persistent component of earnings, for example by first
employing the authors' simple nonparametric methods.
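Since equation 5 is not reproduced in this text, the sketch below shows one decomposition that matches the verbal description: a weighted within-bottom-half variance, a weighted within-top-half variance, and a between-halves term. The specific weights (the population shares of each half, and their product for the between term) follow from the law of total variance and are an assumption about the exact form of equation 5. The simulated earnings are hypothetical.

```python
import numpy as np

def half_split_decomposition(y):
    """Split the cross-sectional variance of y into a within-bottom-half term,
    a within-top-half term, and a term due to the gap in mean earnings
    between the two halves. The three terms sum to the total variance."""
    y = np.sort(np.asarray(y, dtype=float))
    n_bottom = y.size // 2
    bottom, top = y[:n_bottom], y[n_bottom:]
    w_b, w_t = bottom.size / y.size, top.size / y.size   # population shares
    within_bottom = w_b * bottom.var()
    within_top = w_t * top.var()
    between = w_b * w_t * (top.mean() - bottom.mean()) ** 2
    return within_bottom, within_top, between

# Hypothetical log earnings for a single year.
rng = np.random.default_rng(2)
y = rng.normal(10.0, 0.7, 10001)
parts = half_split_decomposition(y)
```

Applying the function year by year and normalizing each component to its value in a base year would give series comparable in spirit to those plotted in figure 5.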
[FIGURE 5 OMITTED]
My figure 5 displays the results from implementing this
decomposition in the PSID. All three components are normalized to 1 in
1970. The figure shows that since the early 1980s, there has been
essentially no increase in the variance of earnings in the bottom half
of the distribution. By contrast, the variance within the top half of
the distribution has increased steadily since 1980 and continues to
rise. The gap between average earnings in the two halves of the
distribution has also continued to widen in recent years. Thus, the PSID
data suggest that there are important asymmetries in the earnings
distribution and that the recent increase in inequality is a more
complicated phenomenon than just changes in dynamics of the first and
second moments of the earnings process.
Given these asymmetries, a useful step forward for the literature
would be to move toward richer, possibly nonlinear, models of earnings
dynamics that can shed light on the complicated changes in the earnings
distribution observed in recent years. This paper is a useful starting
point. The IRS data set, a large panel of earnings data that is mostly
free of measurement error and top-coding, is an ideal resource for such
an investigation. Efforts to further improve this data set could lead to
large benefits for researchers, policymakers, and ultimately the welfare
of individuals. Such efforts might be focused on extending the sample
back before 1987 or on making a suitably anonymized version of the data
available for wider use.
COMMENT BY LINDSAY A. OWENS and DAVID B. GRUSKY (1)
It has long been argued that the ongoing increase in income and
earnings inequality cannot be well understood until it is decomposed
into persistent and transitory components. The persistent component
pertains to the inequality generated by the permanent characteristics of
individuals (their education, unobserved ability, and the like), whereas
the transitory component pertains to the inequality generated by
temporary shocks (such as a temporary illness, transitory unemployment,
or a change in jobs). It is not implausible that the takeoff in income
inequality partly reflects the emergence of a labor market that is
increasingly subject to transitory shocks in the form of a growing risk
of unemployment, underemployment, or job change. If this is indeed the
case, it might change our understanding of both the sources of the
takeoff and its implications for social welfare.
The key contribution of Jason DeBacker and his coauthors in this
paper is to bring a large panel of tax returns to bear on this debate.
The results reveal that the entire rise in inequality in male earnings,
and most of that in household income, is attributable to an increase in
the dispersion of the persistent component.
We leave it to others to comment on the models, the data, and other
technical features of this analysis. It suffices for our purposes to
stress that the analysis is noteworthy because of the extraordinary data
upon which it rests. The confidential panel of Internal Revenue Service
(IRS) tax returns delivers unusually high-quality earnings and income
data for an unusually large sample. Moreover, the authors apply an
impressive range of parametric and nonparametric approaches to the IRS
data, with reassuringly similar results. The authors also supplement the
more conventional analyses of earnings data with revealing
analyses of pre-tax and after-tax household income. For all of these
reasons, the authors have contributed an important paper, and their
results merit close attention.
We are so impressed with the paper that we are inclined to
stipulate that it is a major contribution, forgo the usual internal
critique, and instead take on the task of considering how the analyses
might be usefully elaborated upon in light of the opportunities that the
IRS data open up. We approach this question from the point of view of
better understanding the welfare implications of inequality. The
long-standing presumption in this regard is that, insofar as the takeoff
in inequality is mainly generated by an increase in transitory shocks,
it is less consequential for welfare because individuals can always
borrow against future income and smooth out the effects of such shocks.
The takeoff in inequality might therefore be understood from a welfarist stance as entailing little more than the nuisance of engaging in more
smoothing than had before been necessary.
This comment will consider whether considerations of welfare are
indeed adequately understood in these terms. We first suggest that a
welfarist stance, if rigorously adopted, instead leads us to privilege
the concept of lifetime income and to move toward IRS-based analyses of
trends in lifetime income. We next argue for extending the
characteristic focus on intraindividual transfers to a more encompassing
consideration of interindividual transfers.
THE CASE FOR A LIFETIME INCOME APPROACH The simple point with which
we begin is that, insofar as one is willing to assume away liquidity
constraints that prevent smoothing, it seems appropriate to do so
wholeheartedly and move directly to analyzing data on lifetime income.
The obvious virtue of this approach is that it obviates the need to
parameterize the potentially complicated ways in which a shock may or
may not have short-term or long-term effects. If, for example, a lottery winner decides to immediately exit the labor market, this decision will
ultimately be revealed in his or her lifetime income. The same applies
to such shocks as pregnancy, unemployment, job shifting, or receipt of
program benefits (such as the earned income tax credit, food stamps, or
unemployment insurance). Although the authors very elegantly model how
the income effects of such shocks tend to dissipate over time, an
attractively nonparametric alternative is simply to examine trends in
the inequality of lifetime income, an approach that is approximately
equivalent to applying the method of Kopczuk, Saez, and Song (2010) with
a very large P parameter (where P refers to the number of years over
which income is averaged).
What makes this nonparametric approach attractive? If one cares
about the welfare implications of inequality, surely the first cut at
understanding those implications is to examine the first moment of each
individual's own distribution of income across years. The
presumption, in other words, is that individuals operating under a veil
of ignorance about their own distribution of future annual earnings
would, more than anything else, want to know how much they will make on
average per year (as well as the number of years they will have
earnings). It follows that the inequality of those lifetime averages,
calculated separately for each birth cohort, would speak rather directly
to matters of welfare, arguably more directly than any of the parametric
or nonparametric approaches deployed in this paper. That said, we well
appreciate that conventional parametric and nonparametric approaches are
useful for a host of other objectives, including making inferences about
consumption and consumption inequality. It must also be conceded that a
lifetime income approach implies a rather delayed reading of trends,
because each birth cohort enters the series only after its members
complete their labor force participation. This is clearly a disadvantage
insofar as real-time reporting is desired. We are merely suggesting that
a lifetime income approach is but one additional tool that happens to be
especially useful when one is making judgments about welfare.
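A minimal sketch of the calculation we have in mind follows. It assumes a balanced panel in which every individual is observed for a full career, and it summarizes inequality by the variance of the log of average annual income within each birth cohort; the simulated data, the choice of inequality measure, and the function name are all illustrative assumptions.

```python
import numpy as np

def lifetime_inequality_by_cohort(income, cohort):
    """Variance of log average annual income, computed separately for each
    birth cohort. `income` is a (people x years) array in levels."""
    income = np.asarray(income, dtype=float)
    cohort = np.asarray(cohort)
    log_avg = np.log(income.mean(axis=1))      # each person's lifetime average
    return {int(c): float(log_avg[cohort == c].var()) for c in np.unique(cohort)}

# Hypothetical data: two cohorts with different dispersion in permanent income.
rng = np.random.default_rng(3)
cohort = np.repeat([1940, 1950], 1000)
spread = np.where(cohort == 1940, 0.4, 0.7)
permanent = np.exp(rng.normal(10.0, spread))
shocks = np.exp(rng.normal(0.0, 0.3, size=(2000, 40)))
income = permanent[:, None] * shocks
result = lifetime_inequality_by_cohort(income, cohort)
```

Because averaging over a full career largely washes out year-to-year shocks, the resulting cohort series mainly reflects dispersion in lifetime resources.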
It bears noting that such an approach entails a shift of emphasis
from period analyses to cohort analyses of trends in income inequality.
This shift is attractive because it allows one to better capture the
effects of forces that operate in cohort-specific ways. For example,
recessions have especially prominent effects on birth cohorts that come
of age during the recession itself, and these effects in turn serve to
suppress lifetime earnings (see Kahn 2010). Insofar as recessions are
inequality enhancing (because they hit poorly credentialed workers the
hardest), a cohort approach will reveal that effect especially clearly.
There is good reason to believe that other important sources of the
trend in inequality (such as changes in schooling institutions or
early-childhood antipoverty interventions) likewise operate in
cohort-targeted ways that will be obscured by the field's typical
emphasis on period effects.
It is also attractive to focus on cohorts because the invidious comparisons that individuals make tend to feature their same-age peers.
As life unfolds, individuals compete in schools and in the workplace
with members of their own birth cohort, and the outcome of that
age-specific competition is likely to affect self-assessments. We
expect, for example, that individuals will be more troubled and jealous when they see their same-age peers benefiting disproportionately from
the takeoff than when members of some distant birth cohort are the
principal beneficiaries. It follows that a cohort approach is especially
relevant to considerations of welfare insofar as social comparison
processes and their subjective fallout are taken into account.
THE INEQUALITY-EXAGGERATING EFFECTS OF INTERINDIVIDUAL TRANSFERS
For those interested in making judgments about welfare, lifetime income
is of interest because it is assumed that, without any constraints on
liquidity, individuals can freely borrow against their future income
stream or freely draw on savings from past streams. This form of
borrowing or saving may be understood as an intraindividual transfer
from the past or future to the present. If the transitory variance has
risen substantially, as some (Heathcote, Perri, and Violante 2010) have
claimed, then the takeoff is presumably less troubling because such
transfers can smooth out these transitory shocks. The literature has
thus focused on the possibility that the usual cross-sectional analyses
may overstate the welfare consequences of rising inequality.
The purpose of this section is to shift the focus to various types
of interindividual transfers that, if properly taken into account, may
lead to the conclusion that the welfare consequences of the takeoff are
in fact worse than is usually supposed. That is, whereas a consideration
of intraindividual transfers may lead one to conclude that the welfare
costs of the takeoff in inequality are overstated, a consideration of
interindividual transfers leads to precisely the opposite conclusion. We develop this
argument by considering the welfare effects of interindividual transfers
between spouses, among households within a neighborhood, and between
parents and children.
Interspousal transfers To illustrate the argument, we begin by
considering the well-known tendency of spouses to pool income, a type of
interindividual transfer that motivates the field's long-standing
interest in analyzing household or family income inequality. This
pooling will increase inequality insofar as there is some amount of
income-based "marital homogamy" in which high-income men tend
to marry high-income women. In the United States, this form of homogamy
is intensifying over time (Schwartz 2010, Mare and Schwartz 2006), a
development that contributes to the takeoff in inequality. As Schwartz
(2010) reports, the correlation between the earnings of spouses almost
tripled between 1967 and 2003, leading in turn to an approximately 25
percent rise in the earnings inequality of families.
Although conventional analyses of individual income inequality will not
reflect this transfer-based source of rising inequality, there is, of
course, a long tradition of analyzing family or household inequality (in
which the effects of such homogamy are "built in").
It is striking, however, that critics of conventional
cross-sectional analyses of individual income inequality often complain
about the possible inequality-exaggerating effects of ignoring
intraindividual transfers without acknowledging the opposing
inequality-suppressing effects of ignoring interindividual transfers
(within households). If one type of transfer-induced bias is to be
corrected, then surely the other, opposing bias should be corrected as
well. This selective acknowledgment of "transfer bias" cannot
be explained by differences in the reliability with which such transfers
can be effected. To the contrary, spouses tend to pool income relatively
freely on the basis of informal agreements (see Bennett 2013), whereas
individuals typically have to engage more formally with friends,
parents, or financial intermediaries when seeking to borrow from these
sources against their future income. The resulting constraints on
liquidity can be substantial (Blank and Barr 2009). This suggests that,
if anything, the bias arising from ignoring interindividual transfers
should be more troubling than that arising from ignoring intraindividual
transfers.
Interneighbor transfers The example of transfers between spouses
is, of course, well known. What is perhaps less appreciated is that
residential neighbors also engage in pooling and that, by virtue of
rising residential segregation, this pooling is leading to a more
unequal distribution of valued goods (Reardon and Bischoff 2011). The
key dynamic here is again a growth in segregation. That is, just as
spouses have increasingly similar incomes (marital homogamy is rising),
so too neighborhoods are becoming increasingly homogeneous by income
(residential segregation is rising). This means that high-income
families are increasingly likely to be living in high-income
neighborhoods that give them indirect access to the considerable
resources of their neighbors. Because neighborhood goods are often
financed by property taxes, it is advantageous to live with high-income
neighbors who will contribute substantially to schools, parks, police
protection, fire protection, local government, and other public goods.
The ongoing takeoff in residential segregation means that this
particular advantage, like the advantage of marrying a high-income
spouse, increasingly accrues to those with relatively high incomes
themselves. This advantage is concealed in conventional analyses of
individual income inequality because the "income" takes the
form of in-kind resources.
The analogy between these two types of interindividual transfers is
by no means perfect. Most obviously, in the United States one makes no
overt payment (no dowry) for the privilege of marrying a high-income
spouse, whereas one does overtly pay for the privilege of living in a
high-income neighborhood. It is accordingly possible that, as
neighborhoods become increasingly income segregated, the resulting
interneighborhood differences in public goods advantages come to be
reflected in the purchase price of homes, thus complicating any effort
to understand the effects of this rising segregation on inequality. The
second main difference is that spouses typically engage in quite
substantial income pooling, whereas residential neighbors are far less
collectivist, in effect pooling their income only for a relatively small
number of local public goods. The total effects of interneighbor
transfers on inequality are, as a result, likely to be comparatively
limited.
Intergenerational transfers The third type of interindividual
transfer of interest occurs between generations of a family as well as
between relatives of the same generation (such as siblings). This type
of transfer is closely related to the previous two: it may be understood
either as entailing transfers among members of a "virtual
neighborhood" defined by kinship ties, or as entailing transfers
among members of a "virtual household" that extends beyond
those actually living together. Under either interpretation, the key
force at work is again rising segregation, which now expresses itself as
growing intergenerational elasticities of income. This force, if indeed
it is at work, implies that the offspring of high-income families are
increasingly likely to find themselves ensconced in virtual households
that provide them with access to high-income parents, high-income
grandparents, and high-income siblings. It is unclear, however, whether
such elasticities are indeed increasing. In a recent review, Chul-In Lee
and Gary Solon (2009) conclude that available estimates on trends in
intergenerational elasticities are "highly imprecise" (p.
766), mainly because the available data sets (principally the Panel
Study of Income Dynamics) are extremely small.
There is nonetheless good reason to worry that these elasticities
are on the rise (see Krueger 2012). If indeed they are, what does it
mean for our understanding of trends in income inequality? It suggests
that high-income offspring may be more likely to receive gifts or
substantial inheritances that then generate investment income. Because
these income transfers are at least partly revealed as individual income
(among the offspring), they will not be concealed in conventional
individual analyses of income inequality. However, many of the transfers
again take an in-kind form, such as unreported gifts, access to lavish
parental vacation homes, or "parental buffering" of children
when they experience unemployment or other labor market difficulties.
The provision of such goods will tend to increase inequality insofar as
they are disproportionately available to high-income offspring.
CONCLUSION It is testimony to our high regard for the analysis in
this paper that, rather than carry out the usual critique of its methods
or conclusions, we have instead sought to consider various extensions of
that analysis. We began by suggesting that the welfare implications of
inequality might be better understood by supplementing the usual
parametric approach with a nonparametric analysis of lifetime income
inequality. The IRS tax data are well suited to the cohort analysis that
such an approach implies.
We have also argued that an exclusive focus on intraindividual
transfers may have distracted scholars from appreciating how various
interindividual transfers may create inequalities that conventional
individual analyses miss. Because high-income individuals are
increasingly embedded in networks that provide access to income or
in-kind benefits provided by others (spouses, parents, extended
families, neighbors), existing models of individual income inequality
may understate the welfare implications of rising inequality, a bias
that is precisely the opposite of that emphasized by those who attend
exclusively to intraindividual transfers. It is unclear why the field
has been so captivated by intraindividual transfers when the
countervailing effects of interindividual transfers may be more
important. The IRS data provide an opportunity to develop models that
can at once capture changes in inequality as well as these possible
changes in income dependencies within households, neighborhoods, and
extended families.
REFERENCES FOR THE OWENS AND GRUSKY COMMENT
Bennett, Fran. 2013. "Researching Within-Household
Distribution: Overview, Developments, Debates, and Methodological
Challenges." Journal of Marriage and Family 75, no. 3: 582-97.
Blank, Rebecca M., and Michael S. Barr, eds. 2009. Insufficient
Funds: Savings, Assets, Credit and Banking among Low-Income Households.
New York: Russell Sage.
Kahn, Lisa B. 2010. "The Long-Term Labor Market Consequences of
Graduating from College in a Bad Economy." Labour Economics 17, no.
2: 303-16.
Krueger, Alan B. 2012. "The Rise and Consequences of
Inequality in the United States." Washington: Executive Office of
the President. www.whitehouse.gov/sites/default/files/krueger_cap_speech_final_remarks.pdf.
Lee, Chul-In, and Gary Solon. 2009. "Trends in
Intergenerational Income Mobility." Review of Economics and
Statistics 91: 766-72.
Mare, Robert D., and Christine R. Schwartz. 2006. "Educational
Assortative Mating and the Family Background of the Next Generation: A
Formal Analysis." Sociological Theory and Methods 21: 253-77.
Reardon, Sean, and Kendra Bischoff. 2011. "Growth in
Residential Segregation of Families by Income, 1970 to 2009."
US2010 Research Brief. New York: Russell Sage.
Schwartz, Christine R. 2010. "Earnings Inequality and the
Changing Association between Spouses' Earnings." American
Journal of Sociology 115, no. 5: 1524-57.
(1.) The Stanford Center on Poverty and Inequality is supported by
grant number AE00101 from the U.S. Department of Health and Human
Services, Office of the Assistant Secretary for Planning and Evaluation
(awarded by the Substance Abuse and Mental Health Services Administration).
The contents of this comment are solely the responsibility of the
authors and do not necessarily represent the official views of the U.S.
Department of Health and Human Services, Office of the Assistant
Secretary for Planning and Evaluation.
GENERAL DISCUSSION John Haltiwanger noted that job destruction
rates and unemployment inflow rates had declined over the authors'
study period. If job flows and unemployment are treated in the model as
transitory shocks to income, those trends should be driving the
temporary component in income inequality downward, to the point where
the permanent component alone might account for, or more than account
for, the observed results.
William Brainard agreed with the discussants' suggestion that
the authors address the differences between their tax data and other
data sets in widespread use. He also pointed out that there is
substantial heterogeneity in individuals' lifetime income profiles.
Some occupations have a period of apprenticeship, which causes the
profile for those workers to be flat initially; unionized workers, in
contrast, have a very different pattern. Because the authors' model
does not account for these individual differences, Brainard thought, all
of them would show up in the permanent component, when in fact they are
caused by interaction with the individual's education and other
factors. Brainard suggested that the authors take the structural
differences between individuals more thoroughly into account by
including age and education covariates.
Justin Wolfers requested that the authors clarify how they
distinguished between permanent and transitory shocks. In reply, Greg
Kaplan described the method with reference to a random walk model. In
each period a shock occurs that either increases or decreases the
individual's income. The sum of these shocks over time was taken to
be the permanent component for the individual, and a time-varying,
universal weighting factor was applied to that sum. The more traditional
method, Kaplan noted, would be to apply the weighting factor to the
individual shocks rather than to their sum.
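The distinction Kaplan draws can be made concrete with a small numerical sketch. The function names, the four-period horizon, and the shock and weight values below are illustrative assumptions, not the authors' code or estimates. The sketch only contrasts applying a time-varying weight to the accumulated sum of shocks with applying it to each shock before the shocks are accumulated.

```python
import numpy as np

def weight_on_sum(shocks, weights):
    """Apply the period-t weight to the running sum of all shocks up to t."""
    return weights * np.cumsum(shocks)

def weight_on_shocks(shocks, weights):
    """Apply the period-t weight to the period-t shock, then accumulate."""
    return np.cumsum(weights * shocks)

# Hypothetical values: one unit shock per period, weight doubled in period 2 only.
shocks = np.array([1.0, 1.0, 1.0, 1.0])
weights = np.array([1.0, 2.0, 1.0, 1.0])

on_sum = weight_on_sum(shocks, weights)        # [1, 4, 3, 4]
on_shocks = weight_on_shocks(shocks, weights)  # [1, 3, 4, 5]
print(on_sum, on_shocks)
```

In the first version the temporary rise in the weight affects the component only while the weight is elevated. In the second, the reweighted shock stays in the sum in every later period.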
Christopher Carroll called the authors' method an interesting
innovation but observed that, since their data were also novel, it was
difficult to determine what portion of the difference between their
results and those of others working in this area was being driven by
their modeling choice and what portion by their novel data set. He urged
the authors to go back and apply the simplest standard model to their
data, to serve as a benchmark, and from there do further analysis to see
what cannot be explained by that simple model.
Carol Graham noticed an upward tick in permanent income inequality
and a downward one in transitory inequality in the authors' data
around 2007. She wondered whether those movements represented merely
transient phenomena or whether what was happening in that period might
explain some of the difference between permanent and transitory income.
Replying to a comment made by David Grusky in his formal
discussion, Robert Gordon questioned whether the income homogeneity of
neighborhoods is in fact increasing. His own impression was that
demographic changes have been making income more heterogeneous as blacks
move into the suburbs and back to the South while Hispanics and affluent
whites move into the inner cities. New York's East Village, for
example, was uniformly poor 30 years ago, but more recently the boom in
the city's financial services and entertainment industries had
brought some very wealthy people into the neighborhood, leading to a mix
of incomes. Gordon also challenged Grusky's implication that a
decline in intergenerational mobility was not yet in evidence. He cited
recent findings that the United States today has one of the lowest
levels of intergenerational mobility among developed economies. One can
almost predict, Gordon added, that this trend will persist, as it is
being reinforced by the behaviors of those at both the top and the
bottom: the wealthy are taking pains to ensure that their children learn
foreign languages (and economics), while the share of children in the
poorest third of the white population living with both parents continues
to decline.
Gita Gopinath commented that an increase in permanent income
inequality will have implications for consumption inequality, and thus
that looking at consumption decisions should make it easier to determine
whether a given income shock is transitory or permanent. Gopinath was
curious to know whether the paper's results were driven by the
fixed effects or the random walk component of the income shock. The
answer, she thought, could help determine whether today's income
inequality was caused by a widening difference in payoffs between high-
and low-ability workers.
Richard Cooper agreed with Brainard that the authors'
distinction between permanent and temporary income was highly suspect.
He also thought it would be valuable to compare individuals'
reported W-2 (wage) income with their income reported on Schedules C and
D (business income and capital gains, respectively) of their IRS Form
1040. That information could help determine how much income inequality
is due to proprietary income and how much to earnings from labor. Gordon
remarked that a paper by Thomas Piketty and Emmanuel Saez had done just
such an analysis and found that the increase in inequality came mainly
from labor earnings. A caveat to that finding, however, was that stock
options--an important contributor today to incomes at the top--are
inappropriately reported as labor earnings.
Responding to the discussion, Ivan Vidangos argued that the
distinction between permanent and transitory components was necessarily
fairly arbitrary. In the real world shocks can be very transitory, very
permanent, or anywhere in between, but one has to draw the line
somewhere. Their strategy was to select two points near the ends of the
continuum and see if the results differed dramatically. They had
experimented with many different specifications, including one that
indicated that permanent factors were capturing 87 percent of the
variance and another that put it at 36 percent, but in all cases the
trends showed that the rise in inequality was driven by the permanent
component.
JASON DEBACKER
Middle Tennessee State University
BRADLEY HEIM
Indiana University
VASIA PANOUSI
Board of Governors of the
Federal Reserve System
SHANTHI RAMNATH
U.S. Department of the Treasury
IVAN VIDANGOS
Board of Governors of the
Federal Reserve System
(1.) In this study our baseline measure of income inequality is the
cross-sectional variance (that is, the variance across all individuals
or households in our sample at a given time) in the logarithm of annual
income. We use the terms "persistent inequality" and
"persistent variance" to refer to the variance of the
persistent component of income. Therefore, an increase in inequality is
called "persistent" if it is driven by an increase in the
variance of the persistent component of income. A similar interpretation
will apply to "transitory inequality" and "transitory
variance."
(2.) The analysis was conducted at and approved by the U.S.
Treasury Department to ensure that the strictest confidentiality is
preserved.
(3.) Throughout the paper, we refer to error components models as
nonstationary if model parameters are allowed to change over calendar
time so as to capture changes over time in the distribution of income
(including its dispersion).
(4.) For instance, Kopczuk, Saez, and Song (2010) use longitudinal
earnings data from SSA records to document that inequality in annual
earnings among men has been rising since around 1970. See also the
earlier contributions by Bound and Johnson (1992), Katz and Murphy
(1992), Murphy and Welch (1992), Juhn, Murphy, and Pierce (1993), Katz
and Autor (1999), and more recently, Autor, Katz, and Kearney (2008).
(5.) Baker and Solon (2003) find broadly similar results for Canada
using administrative data.
(6.) Heathcote, Perri, and Violante (2010) document patterns in
inequality over time in a number of variables at the individual and the
household level. Their decomposition of changes in the variance of
earnings into transitory and persistent components is not the main focus
of their paper. Also, they use hourly wages, rather than annual
earnings, and estimate a simpler error components model. Our approach is
closer to that of Moffitt and Gottschalk (2011).
(7.) In our online appendix, however, we present some results
suggesting that the transitory component might play more of a role in
the PSID data than in administrative data. Online appendixes for papers
in this volume may be found at the Brookings Papers website,
www.brookings.edu/about/projects/bpea, under "Past Editions."
(8.) Blundell, Pistaferri, and Preston (2008) find an increase in
the variance of persistent income shocks in the early 1980s, followed by
an increase in the variance of transitory shocks in the late 1980s. We
cannot directly compare our results with theirs, as our sample periods
barely overlap.
(9.) Dynan, Elmendorf, and Sichel (2012) find a continuous increase
in the volatility of male earnings in the PSID over the 1967-2004
period. However, their measure of earnings includes income from
self-employment and hence is not directly comparable to ours or to that
of the studies mentioned above.
(10.) The fraction of U.S. households filing tax returns is
generally around 90 to 95 percent (see, for example, Piketty and Saez
2003). Most households who do not file taxes are low-income households.
Therefore, our data might miss some changes in income inequality at the
bottom of the income distribution. However, we do not view this as a
first-order concern, because, as documented by Autor, Katz, and Kearney
(2008) and Kopczuk, Saez, and Song (2010), changes in income inequality
over our sample period have been concentrated in the upper part of the
income distribution.
(11.) On tax returns in which a married couple is filing jointly,
the primary filer is the individual listed first on Form 1040. This is
usually, although not always, the husband. On tax returns of single
filers, the primary filer is the individual who filed the return.
(12.) The full 1987 stratified random sample actually consisted of
two parts: the random sample mentioned in the text and a high-income
oversample. We do not use the high-income oversample in our analysis in
this paper.
(13.) In addition, it is well known that changes in income at low
levels of income can unduly affect estimates of models of the income
process. Two commonly used approaches to address this issue are to
exclude low-income observations or to left-censor them. Given the issues
discussed above, we choose to exclude them.
(14.) This is the same threshold as used by Kopczuk, Saez, and Song
(2010). The threshold equals $2,575 in 2004 and is indexed for other
years by nominal average wage growth. In the online appendix we check
the sensitivity of our results to setting lower and higher minimum
thresholds.
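The indexing rule in this footnote amounts to scaling the 2004 threshold by the ratio of the average nominal wage in a given year to its 2004 value. The sketch below shows that rule; the function name and the wage index values are made up for illustration and are not actual wage data.

```python
BASE_YEAR = 2004
BASE_THRESHOLD = 2575.0  # dollars in 2004, as stated in the footnote

def minimum_threshold(year, wage_index):
    """Scale the 2004 threshold by nominal average wage growth relative to 2004."""
    return BASE_THRESHOLD * wage_index[year] / wage_index[BASE_YEAR]

# Hypothetical index values for illustration only (not actual wage data).
toy_wage_index = {2003: 96.0, 2004: 100.0, 2005: 104.0}

thresholds = {year: minimum_threshold(year, toy_wage_index) for year in toy_wage_index}
for year, value in sorted(thresholds.items()):
    print(year, round(value, 2))
```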
(15.) For household income the figures use our "all
households" sample. In our "male-headed households"
sample, the cross-sectional variance (of the log) increases by 0.22
squared log point for pre-tax and 0.17 squared log point for after-tax
household income.
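The two figures in this footnote can be checked approximately against the standard deviations that table 1 reports for the male-headed households sample in 1987 and 2009. The check below squares those standard deviations and takes the difference. Because table 1 rounds to two decimals, the result is only approximate.

```python
# Standard deviations of log household income, male-headed households sample,
# as reported in table 1 (rounded there to two decimals).
sd = {
    "pre_tax": {1987: 0.78, 2009: 0.91},
    "after_tax": {1987: 0.73, 2009: 0.84},
}

def variance_change(sd_start, sd_end):
    """Change in the variance implied by two standard deviations."""
    return sd_end ** 2 - sd_start ** 2

pre_tax_change = variance_change(sd["pre_tax"][1987], sd["pre_tax"][2009])
after_tax_change = variance_change(sd["after_tax"][1987], sd["after_tax"][2009])
print(round(pre_tax_change, 2), round(after_tax_change, 2))  # 0.22 0.17
```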
(16.) Furthermore, in the online appendix we examine the robustness
of our results to alternative treatments of household size and
composition.
(17.) Indeed, for most specifications of an income process,
volatility and the variance of transitory income changes tend to move
closely together, although in many cases volatility also captures part
of the variance of persistent income changes. See Shin and Solon (2011)
for a detailed discussion.
(18.) For 1-year changes the estimated coefficient is 0.00037, with
a standard error of 0.00050. This coefficient would imply an increase of
less than 0.01 in the standard deviation over 23 years. For 2-year
changes the coefficient is 0.00046, with a standard error of 0.00058.
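The arithmetic behind this footnote is simple enough to spell out. The sketch below multiplies each reported trend coefficient by the 23-year horizon and divides it by its standard error. The variable names are illustrative, and reading the ratio as a t-ratio is an assumption about how the trend regressions were specified.

```python
YEARS = 23  # horizon stated in the footnote

# (coefficient, standard error) as reported in the footnote
estimates = {
    "1-year changes": (0.00037, 0.00050),
    "2-year changes": (0.00046, 0.00058),
}

results = {}
for label, (coef, se) in estimates.items():
    implied_increase = coef * YEARS  # cumulative change implied by the linear trend
    t_ratio = coef / se              # coefficient relative to its standard error
    results[label] = (implied_increase, t_ratio)
    print(f"{label}: implied increase {implied_increase:.4f}, t-ratio {t_ratio:.2f}")
```

For 1-year changes the implied cumulative increase is about 0.0085, and both ratios are below 1.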
(19.) Note that, by taking averages across periods, this method
attenuates somewhat the increase in both persistent and transitory
inequality, and thereby in total inequality, constructed here as the sum
of its persistent and transitory parts.
(20.) That is, to compute the variance of [[alpha].sub.i] and
[[epsilon].sub.it] in a given year t, the method treats the data in the
P-year window centered around t as if they were the entire data set
available.
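A window-based decomposition of this kind can be sketched as follows. This is a simplified illustration, not the paper's exact formulas: it assumes a balanced panel, takes each individual's mean over the P-year window as the persistent part and the deviations from that mean as the transitory part, and applies no bias correction. The function name, the simulated panel, and its parameter values are hypothetical.

```python
import numpy as np

def window_decomposition(log_income, center, window=5):
    """Split the variance inside a window of years into two parts.

    log_income: array of shape (n_individuals, n_years), balanced panel.
    center:     column index of the year t at the center of the window.
    window:     odd number of years P in the window.
    Returns (persistent, transitory) variances computed from the window only.
    """
    if window % 2 == 0:
        raise ValueError("window must be odd so that it can be centered on t")
    half = window // 2
    if center - half < 0 or center + half >= log_income.shape[1]:
        raise ValueError("window extends beyond the available years")
    block = log_income[:, center - half : center + half + 1]
    person_mean = block.mean(axis=1)
    # Variance across individuals of their within-window average.
    persistent = person_mean.var()
    # Mean squared deviation of each observation from the individual's average.
    transitory = ((block - person_mean[:, None]) ** 2).mean()
    return persistent, transitory

# Simulated panel for illustration: fixed individual effect plus i.i.d. noise.
rng = np.random.default_rng(0)
alpha = rng.normal(0.0, 0.7, size=(2000, 1))
eps = rng.normal(0.0, 0.4, size=(2000, 9))
panel = alpha + eps

persistent, transitory = window_decomposition(panel, center=4, window=5)
total = panel[:, 2:7].var()  # pooled variance over the same window
print(persistent, transitory, total)
```

In this simplified version the two parts add up exactly to the pooled variance within the window, which mirrors the construction of total inequality as the sum of its persistent and transitory parts.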
(21.) The difference between the KSS and GM methods essentially
reflects a "bias correction term" in the random effects formula upon
which the GM decomposition is based. For the exact formulas used by the
GM method, see appendix B. Also see the discussion of the method in
Gottschalk and Moffitt (2009).
(22.) The lines in the figure labeled "ECM-predicted"
correspond to predicted values from the nonstationary model that we
introduce in the next section and are discussed in section V.D.
(23.) More precisely, and as we discuss below, the objective in
estimation is to match the entire set of variances and autocovariances
that can be computed from the data.
(24.) Stationary, univariate error components models have been
estimated in a large number of papers. An incomplete list includes the
early contributions of Lillard and Willis (1978), Lillard and Weiss
(1979), and MaCurdy (1982). See also Carroll (1992), Baker (1997),
Carroll and Samwick (1997), and more recently, Guvenen (2009) and
Hryshko (2012). Richer, multivariate stationary models have recently
been estimated in Low, Meghir, and Pistaferri (2010) and Altonji, Smith,
and Vidangos (forthcoming).
(25.) The index a actually represents "normalized age" or
"potential experience," defined as a = age - 25 + 1, or years
starting with age 25.
(26.) The covariates [X.sup.i.sub.a,t] used for the g(*) component
in these regressions correspond exactly to the discussion in section
III. The residuals [[??].sup.i.sub.a,t] obtained from equation 1 are
thus identical to the residuals discussed in section III, and equation 1
formalizes their definition. As noted in section III, the regressions
are run separately by calendar year.
(27.) For the variance profile, a value of [psi] of exactly 1 would
imply an exactly linear increase in the variance of [p.sup.i.sub.a,t] as
a function of age. For the autocovariance function, the decline in the
covariances after the first couple of years in the model is entirely
determined by the value of [psi]. The slow gradual decline seen in the
data requires a value of [psi] that is close to, but smaller than, 1.
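Both statements follow from a short calculation. As a sketch, suppose the persistent component obeys a first-order autoregression with coefficient [psi], starts from zero, and receives shocks that are uncorrelated over time with constant variance. The zero initial condition and the suppression of the individual and calendar-time indices are simplifying assumptions made here for illustration.

```latex
\begin{aligned}
p_a &= \psi\, p_{a-1} + \eta_a, \qquad p_0 = 0,\\
\operatorname{Var}(p_a) &= \sigma^2_{\eta} \sum_{j=0}^{a-1} \psi^{2j}
  = \begin{cases}
      a\,\sigma^2_{\eta}, & \psi = 1,\\[2pt]
      \sigma^2_{\eta}\,\dfrac{1-\psi^{2a}}{1-\psi^{2}}, & |\psi| < 1,
    \end{cases}\\
\operatorname{Cov}(p_a, p_{a+k}) &= \psi^{k}\,\operatorname{Var}(p_a), \qquad k \ge 0.
\end{aligned}
```

With [psi] equal to 1 the variance grows linearly in a. With [psi] slightly below 1 the covariance at lag k shrinks by the factor [psi] raised to the power k, which produces a slow, gradual decline.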
(28.) See, for example, Meghir and Pistaferri (2004), Baker (1997),
MaCurdy (1982), and Abowd and Card (1989).
(29.) Our estimation methodology is discussed in the next section,
in the more general context of our nonstationary model, which nests the
stationary specifications presented here.
(30.) One difference is that, as should be expected, our estimate
of parameter [[sigma].sup.2.sub.[alpha]] is larger than the estimates
typically found by studies using residuals that have removed the effects
of education.
(31.) As already noted, the lines labeled "ECM-predicted"
in figure 4 show the fit of the nonstationary version of this model and
are discussed in section V.D.
(32.) Card and Lemieux (1996) provide evidence in support of this
idea.
(33.) In appendix A we present results for an alternative
nonstationary specification in which the [[lambda].sub.t] parameters
multiply the [[alpha].sup.i] component only, and in which the variances
of the persistent shocks are allowed to vary over time. The results from
that alternative specification are consistent with the results obtained
with our baseline model.
(34.) Using a quadratic or a cubic polynomial instead yields
similar results. In general, we have found that restricting the
[[lambda].sub.t] parameters to lie on a polynomial has little effect on
the trend captured by the [[lambda].sub.t] series. The restriction also
has little effect on the model's ability to match the trend in the
total variance, since the [[pi].sub.t] parameters pick up the transitory
part of the variation in the (fully unrestricted) [[lambda].sub.t].
Results for the unrestricted [[lambda].sub.t] are presented in the
online appendix and yield similar conclusions.
(35.) The "ECM-predicted" series are constructed in the
same way as the "empirical" series, but using the theoretical
moments implied by the estimated model rather than the empirical
moments.
(36.) We could also use the estimated model to compute similar
decompositions for any age group, or for any age distribution. In the
online appendix we perform the decomposition assuming a constant age
distribution, and the results are essentially unchanged.
(37.) We do not show separately the empirical cross-sectional
variance of log residual male earnings because it looks
indistinguishable from the top line in figure 5. However, the latter
differs somewhat from the variance of log male earnings shown in figure
1, because figure 5 uses residuals that have removed the variation in
earnings that is due to age, whereas figure 1 uses the raw data.
(38.) Along the same lines, previous versions of this paper
included specifications where the transitory component followed an
ARMA(1, 1) process, which exhibited more persistence than the transitory
component in the ECM model presented above (in those specifications, the
[p.sup.i.sub.a,t] component was restricted to a random walk). As
suggested by the previous discussion, those specifications attributed a
larger share of the total variance to the transitory component at any
given point in time, but the results for the trends were essentially
identical.
(39.) Using the sample of all households, on average over
1987-2009, male labor earnings account for about 54 percent of total
household income, female labor earnings for 26 percent, retirement and
transfer income for 5 percent, investment income for 8 percent, and
business income for 7 percent.
(40.) It also adds some household observations for which labor
earnings of the male filer are below the minimum threshold, but for
which total household income is above the minimum threshold.
(41.) In the online appendix we investigate the robustness of our
results to alternative treatments of household size and composition.
(42.) Note that the total variance of household income in figure 8
is lower in any given year than the total variance of male earnings
shown earlier. The reason is that these are variances of residuals,
which in the case of household income have removed all variation
explained by household size and composition. If we were to compare the
raw data instead, the variance of household income would be larger than
that of male earnings, as seen in figure 1.
(43.) As already noted in section I, in the online appendix we
present estimates of our nonstationary ECM, and the corresponding
variance decompositions, for a sample of male labor earnings and total
household income from the PSID. In the PSID samples, the transitory
variance component appears to have played more of a role for both male
earnings and total household income.
(44.) We analyze increasingly broad income aggregates, rather than
individual income categories separately, because for many households,
income from at least some of these individual categories is zero. The
large number of zero-income observations makes it difficult to estimate
the ECM separately for each income category.
(45.) We use our male-headed households sample so as not to
confound the effects of moving to broader measures of income with the
effects of moving to broader samples.
(46.) See, however, Piketty and Saez (2007), who find a decrease in
progressivity between 1960 and 2004, which was driven primarily by
changes in corporate taxes and in estate and gift taxes, which are not
included in our analysis.
(47.) The inclusion of the [MATHEMATICAL EXPRESSION NOT
REPRODUCIBLE IN ASCII] component renders the estimation of the model
more challenging. Indeed, we have found the estimation of this model to
be much less numerically stable than that of our baseline ECM, and the
estimates of the variance of the persistent innovations ([MATHEMATICAL
EXPRESSION NOT REPRODUCIBLE IN ASCII]) are very noisy. As in the case of
our baseline ECM, we impose smoothness restrictions on the [MATHEMATICAL
EXPRESSION NOT REPRODUCIBLE IN ASCII] series by restricting it to a
fourth-degree polynomial, for the reasons discussed in section V.B. We
thank Greg Kaplan for sharing computer code that helped with the
estimation of this specification. Note also that in this specification
the timing of the effects of changes in model parameters on changes in
income inequality is different from that in our baseline model, because
of the presence of the [[phi].sub.t] parameters (changes in the variance
of persistent shocks). In particular, in the alternative ECM, changes in
the variance of persistent shocks will have lagged effects on income
inequality. To see this, suppose for simplicity that [psi] = 1, so that
[p.sup.i.sub.a,t] is a random walk and the persistent shocks
[[eta].sup.i.sub.a,t] accumulate over time. Next, suppose, for example,
that the variance of persistent shocks experiences a one-time permanent
increase in year t (there is a one-time permanent jump in
[[phi].sub.t]). Then, over time, as new cohorts enter the adult (ages
25-60) population, they will face the larger persistent shocks, and
these shocks accumulate over time. Therefore, the one-time permanent
increase in the variance of persistent shocks in year t would continue
to lead to increases in inequality in future periods, as younger cohorts
(facing larger persistent shocks) replace the older cohorts (which have
accumulated smaller persistent shocks over their lifetime). One
implication of this is that, if the model in equations A.1 through A.4
were the correct representation of the world (and especially if [psi] =
1), and if it were the case that the variance of persistent shocks had
increased permanently some time before 1987 (the beginning of our
sample), then part of the increase in income inequality after 1987 would
be the result of the increase in the variance of the persistent shocks
before 1987. Our baseline ECM would likely attribute such changes in
inequality to [[lambda].sub.t]. We thank Greg Kaplan for making this
observation.
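The lagged effect described in this footnote can be illustrated with a small deterministic calculation. The sketch assumes [psi] = 1, equal-sized age groups for ages 25 to 60 (a = 1, ..., 36), one shock per year of potential experience, and no time-varying weights. The function name and the two shock variances are hypothetical values chosen only to show the mechanism.

```python
import numpy as np

def cross_sectional_variance(year, jump_year, var_before, var_after, n_ages=36):
    """Variance of a random-walk component across equal-sized age groups.

    A person with normalized age a has accumulated a shocks, received in the
    years year-a+1, ..., year. Shocks drawn in jump_year or later have
    variance var_after; earlier shocks have variance var_before.
    """
    accumulated = []
    for a in range(1, n_ages + 1):
        shock_years = np.arange(year - a + 1, year + 1)
        shock_vars = np.where(shock_years >= jump_year, var_after, var_before)
        accumulated.append(shock_vars.sum())
    # With zero-mean components, the population variance is the average
    # of the age-specific variances.
    return float(np.mean(accumulated))

# One-time permanent doubling of the shock variance in year 0 (hypothetical).
variance_by_year = {
    year: cross_sectional_variance(year, jump_year=0, var_before=0.01, var_after=0.02)
    for year in range(-5, 46)
}
for year in (-1, 0, 10, 35, 40):
    print(year, round(variance_by_year[year], 4))
```

Under these assumptions the cross-sectional variance keeps rising for 36 years after the jump, until every cohort in the population has faced only the larger shocks, and is flat thereafter.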
(48.) According to this model specification (and our data), there
has been no distinct trend in the variances of persistent or transitory
shocks in our sample period. All the increase in the variance of the
persistent component of earnings comes from an increase in the
"price" of permanent characteristics. This is entirely
consistent with our findings from our baseline ECM, where the rise in
the variance comes from an increase in the price of permanent and
persistent characteristics. One might ask, both in the context of this
alternative model and in the context of our baseline ECM, to what extent
this increase in the price of certain permanent or persistent
characteristics represents increases in the returns to observable
characteristics (such as education and experience) versus unobservable
ones. The large causal literature on earnings and wage inequality in
labor economics indicates that the answer is both, as it generally finds
increases in inequality both between and within narrowly defined
education and experience groups (see, for instance, Lemieux 2008).
(49.) We have also estimated the model using the identity matrix as
weighting matrix. The results (not reported) are very similar.
(50.) When [[lambda].sub.t] is unrestricted, we use the
normalization [[pi].sub.2008] = [[pi].sub.2009], since in that case
[[lambda].sub.t] and [[pi].sub.t] cannot be identified separately in the
last year of the sample, t = 2009. Results for the unrestricted version
are presented in the online appendix.
Table 1. Descriptive Statistics for the Income Measures, 1987-2009 (a)
Log of male earnings
Year No. of obs. Mean SD
1987 8,180 10.38 0.78
1988 8,670 10.36 0.81
1989 9,019 10.33 0.81
1990 9,081 10.33 0.81
1991 8,891 10.32 0.81
1992 8,899 10.32 0.83
1993 9,240 10.29 0.84
1994 9,354 10.30 0.83
1995 9,522 10.32 0.83
1996 9,498 10.33 0.83
1997 9,608 10.37 0.82
1998 9,806 10.39 0.83
1999 9,865 10.43 0.82
2000 9,933 10.45 0.82
2001 9,978 10.46 0.82
2002 9,946 10.45 0.84
2003 9,895 10.42 0.84
2004 9,980 10.43 0.84
2005 10,048 10.43 0.84
2006 10,317 10.43 0.85
2007 10,574 10.44 0.84
2008 10,505 10.42 0.85
2009 10,290 10.39 0.87
Total or average 221,099 10.38 0.83
Log of pre-tax household income
Columns: Year; male-headed households (No. of obs., Mean, SD); all households (No. of obs., Mean, SD)
1987 8,161 10.65 0.78 12,789 10.45 0.85
1988 8,643 10.65 0.81 13,217 10.45 0.87
1989 8,991 10.63 0.83 13,625 10.43 0.88
1990 9,048 10.63 0.82 13,871 10.42 0.88
1991 8,858 10.60 0.83 14,058 10.40 0.88
1992 8,875 10.62 0.84 14,227 10.40 0.90
1993 9,215 10.61 0.85 14,461 10.39 0.90
1994 9,339 10.62 0.85 14,669 10.39 0.90
1995 9,494 10.64 0.85 14,980 10.40 0.92
1996 9,466 10.66 0.86 14,931 10.42 0.93
1997 9,588 10.69 0.87 15,253 10.45 0.93
1998 9,784 10.72 0.88 15,626 10.49 0.94
1999 9,830 10.77 0.88 15,772 10.52 0.94
2000 9,896 10.79 0.88 15,956 10.53 0.95
2001 9,939 10.78 0.87 16,114 10.53 0.94
2002 9,905 10.77 0.88 16,155 10.53 0.93
2003 9,848 10.76 0.87 16,198 10.51 0.94
2004 9,920 10.76 0.89 16,339 10.52 0.96
2005 10,001 10.75 0.90 16,540 10.51 0.96
2006 10,272 10.77 0.92 16,944 10.52 0.97
2007 10,516 10.77 0.92 17,469 10.51 0.97
2008 10,468 10.74 0.90 17,427 10.49 0.95
2009 10,247 10.72 0.91 17,354 10.45 0.95
Total or average 220,304 10.70 0.86 353,975 10.47 0.92
Log of after-tax household income
Columns: Year; male-headed households (No. of obs., Mean, SD); all households (No. of obs., Mean, SD)
1987 8,155 10.48 0.73 12,783 10.29 0.80
1988 8,634 10.48 0.76 13,211 10.29 0.82
1989 8,982 10.46 0.78 13,616 10.27 0.82
1990 9,045 10.45 0.77 13,859 10.26 0.82
1991 8,849 10.43 0.77 14,045 10.25 0.82
1992 8,867 10.45 0.79 14,216 10.25 0.83
1993 9,209 10.44 0.79 14,457 10.24 0.84
1994 9,336 10.46 0.79 14,668 10.25 0.83
1995 9,487 10.47 0.79 14,987 10.26 0.84
1996 9,459 10.49 0.80 14,944 10.28 0.85
1997 9,582 10.52 0.80 15,252 10.31 0.85
1998 9,775 10.56 0.82 15,630 10.35 0.86
1999 9,826 10.60 0.82 15,773 10.38 0.86
2000 9,893 10.62 0.82 15,958 10.39 0.87
2001 9,936 10.62 0.81 16,117 10.39 0.86
2002 9,899 10.62 0.81 16,161 10.40 0.85
2003 9,843 10.62 0.82 16,198 10.39 0.86
2004 9,916 10.63 0.83 16,342 10.41 0.88
2005 9,995 10.62 0.84 16,541 10.40 0.88
2006 10,270 10.63 0.86 16,960 10.41 0.89
2007 10,511 10.63 0.87 17,474 10.40 0.90
2008 10,463 10.61 0.84 17,425 10.38 0.88
2009 10,241 10.59 0.84 17,355 10.36 0.87
Total or average 220,173 10.54 0.81 353,972 10.33 0.85
Source: Authors' calculations using data from the Statistics of
Income Division (SOI) of the Internal Revenue Service.
(a.) See sections II.B and II.C in the text for definitions of the
income measures and of the samples, respectively. SD = standard
deviation.
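As a rough cross-check on the summary statistics above, squaring a standard deviation of log income gives a variance in squared log points. The minimal sketch below does this for the period-average SDs of the all-households sample. The variable names are illustrative, and squaring an average SD is only an approximation to the average of the yearly variances, so the result should be read as indicative rather than as a reproduction of the model-based variances plotted in figure 9.

```python
# Period-average SDs of log household income, all households,
# taken from the "Total or average" rows of the table above.
sd_pretax = 0.92
sd_aftertax = 0.85

# Variance in squared log points is the square of the SD.
var_pretax = sd_pretax ** 2
var_aftertax = sd_aftertax ** 2

# Absolute and proportional reduction in variance from pre-tax to after-tax.
gap = var_pretax - var_aftertax
share = gap / var_pretax

print(f"pre-tax variance:   {var_pretax:.4f}")
print(f"after-tax variance: {var_aftertax:.4f}")
print(f"gap: {gap:.4f} squared log points ({share:.1%} of pre-tax variance)")
```

On these inputs the gap is about 0.12 squared log point, or roughly 15 percent of the pre-tax variance, which is of the same order as the difference discussed in section VII.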
Table 2. Estimated Linear Time Trends of Persistent and Transitory
Variance in Male Labor Earnings (a)
Estimated component and decomposition method
Persistent component
KSS GM Error components
method method model
Coefficient on linear 0.0037 0.0037 0.0038
time trend (0.0002) (0.0002) (0.0003)
p value 0.000 0.000 0.000
[R.sup.2] 0.95 0.94 0.89
Estimated component and decomposition method
Transitory component
KSS GM Error components
method method model
Coefficient on linear 0.0000 0.0001 0.0001
time trend (0.0002) (0.0002) (0.0004)
p value 0.947 0.610 0.746
[R.sup.2] 0.00 0.02 0.01
Source: Authors' regressions using SOI data.
(a.) Each column reports results of an ordinary least squares
regression of the persistent or the transitory component of the
variance in male labor earnings, as calculated by the indicated
decomposition method, on a constant (not reported) and a linear
trend. Standard errors are in parentheses.
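The regression described in note (a) can be sketched as follows: a yearly variance series is regressed by ordinary least squares on a constant and a linear time trend, and the slope, its standard error, and the R-squared are reported. The series used here is synthetic and purely hypothetical (the yearly variance components are not listed in the table), and the function name linear_trend_ols is an illustrative choice, not the authors' code.

```python
import numpy as np

def linear_trend_ols(y):
    """OLS of y on a constant and a linear trend t = 0, 1, ..., n-1."""
    y = np.asarray(y, dtype=float)
    n = y.size
    t = np.arange(n, dtype=float)
    X = np.column_stack([np.ones(n), t])          # constant and trend
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)  # [intercept, slope]
    resid = y - X @ beta
    sigma2 = (resid @ resid) / (n - 2)            # residual variance, 2 parameters
    cov = sigma2 * np.linalg.inv(X.T @ X)         # classical OLS covariance
    se = np.sqrt(np.diag(cov))
    r2 = 1.0 - (resid @ resid) / np.sum((y - y.mean()) ** 2)
    return {"slope": beta[1], "se": se[1], "t_stat": beta[1] / se[1], "r2": r2}

# Hypothetical series for 23 years (1987 through 2009): a trend of 0.0037
# per year plus a small deterministic wiggle. These are NOT the paper's data.
years = np.arange(23)
synthetic_variance = 0.30 + 0.0037 * years + 0.002 * np.sin(years)

result = linear_trend_ols(synthetic_variance)
print(result)
```

A p value like those in the table would follow from comparing the t statistic with a Student t distribution with n - 2 degrees of freedom.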
Table 3. Estimates of Stationary Error Components Models (a)
Income measure and
sample
Pre-tax household income
Male labor Male-headed All
Parameter earnings households households
Unrestricted model
[[sigma].sup.2.sub.[alpha]] 0.1968 0.1885 0.1960
(0.0018) (0.0018) (0.0016)
[psi] 0.9623 0.9717 0.9669
(0.0010) (0.0012) (0.0007)
[[sigma].sup.2.sub.[eta]] 0.0293 0.0183 0.0269
(0.0007) (0.0006) (0.0006)
[[sigma].sup.2.sub.[epsilon]] 0.1826 0.1405 0.1577
(0.0034) (0.0038) (0.0032)
[[theta].sub.1] 0.2286 0.3072 0.2766
(0.0144) (0.0191) (0.0148)
[[theta].sub.2] 0.1231 0.2131 0.1639
(0.0151) (0.0206) (0.0154)
Restricted model ([psi] = 1)
[[sigma].sup.2.sub.[alpha]] 0.2431 0.2162 0.2391
(0.0014) (0.0014) (0.0013)
[[sigma].sup.2.sub.[eta]] 0.0093 0.0076 0.0095
(0.0001) (0.0001) (0.0001)
[[sigma].sup.2.sub.[epsilon]] 0.2069 0.1512 0.1756
(0.0035) (0.0040) (0.0033)
[[theta].sub.1] 0.3477 0.3830 0.3875
(0.0116) (0.0168) (0.0127)
[[theta].sub.2] 0.2895 0.3276 0.3313
(0.0145) (0.0207) (0.0160)
Income measure and
sample
After-tax household
income
Male-headed All
Parameter households households
Unrestricted model
[[sigma].sup.2.sub.[alpha]] 0.1533 0.1579
(0.0015) (0.0013)
[psi] 0.9805 0.9770
(0.0011) (0.0007)
[[sigma].sup.2.sub.[eta]] 0.0135 0.0187
(0.0005) (0.0004)
[[sigma].sup.2.sub.[epsilon]] 0.1199 0.1387
(0.0031) (0.0026)
[[theta].sub.1] 0.3066 0.2772
(0.0186) (0.0136)
[[theta].sub.2] 0.2185 0.1734
(0.0203) (0.0142)
Restricted model ([psi] = 1)
[[sigma].sup.2.sub.[alpha]] 0.1713 0.1854
(0.0012) (0.0011)
[[sigma].sup.2.sub.[eta]] 0.0072 0.0089
(0.0001) (0.0001)
[[sigma].sup.2.sub.[epsilon]] 0.1262 0.1492
(0.0032) (0.0026)
[[theta].sub.1] 0.3608 0.3528
(0.0168) (0.0119)
[[theta].sub.2] 0.2998 0.2852
(0.0202) (0.0142)
Source: Authors' calculations using SOI data.
(a.) Estimates of equations 2 through 5 in the text using a minimum
distance estimator (see section V.C). Asymptotic standard errors are
in parentheses.
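To give a sense of the magnitudes implied by the transitory parameters above, the sketch below computes the variance of the transitory component under the assumption, made here purely for illustration, that it follows an MA(2) process with innovation variance [[sigma].sup.2.sub.[epsilon]] and coefficients [[theta].sub.1] and [[theta].sub.2]. Equations 2 through 5 are not reproduced in this part of the text, so the functional form is an assumption rather than the authors' stated specification, and the function name ma2_variance is hypothetical.

```python
def ma2_variance(sigma2_eps, theta1, theta2):
    """Variance of e_t + theta1*e_{t-1} + theta2*e_{t-2} with Var(e) = sigma2_eps.

    Assumes an MA(2) transitory component; this form is an illustrative
    assumption, not taken from the paper's equations.
    """
    return sigma2_eps * (1.0 + theta1 ** 2 + theta2 ** 2)

# Unrestricted-model point estimates for the all-households sample,
# copied from the table above.
pre_tax_all = ma2_variance(sigma2_eps=0.1577, theta1=0.2766, theta2=0.1639)
after_tax_all = ma2_variance(sigma2_eps=0.1387, theta1=0.2772, theta2=0.1734)

print(f"implied transitory variance, pre-tax:   {pre_tax_all:.4f}")
print(f"implied transitory variance, after-tax: {after_tax_all:.4f}")
```

Under this assumption the implied transitory variance is about 0.174 for pre-tax income and about 0.154 for after-tax income.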
Table 4. Estimates of Nonstationary Error Components Model (a)
Income measure and
sample
Pre-tax household
income
Male labor Male-headed All
Parameter earnings households households
Persistent component
[[sigma].sup.2.sub.[alpha]] 0.1742 0.1566 0.1701
(0.0027) (0.0027) (0.0022)
[psi] 0.9631 0.9751 0.9687
(0.0010) (0.0012) (0.0008)
[[sigma].sup.2.sub.[eta]] 0.0246 0.0129 0.0209
(0.0008) (0.0005) (0.0005)
[[lambda].sub.t] polynomial (b)
[b.sub.1] 0.0226 0.0275 0.0132
(0.0048) (0.0055) (0.0039)
[b.sub.2] (x 10) -0.0273 -0.0198 0.0019
(0.0089) (0.0103) (0.0071)
[b.sub.3] (x 100) 0.0151 0.0073 -0.0060
(0.0061) (0.0071) (0.0049)
[b.sub.4] (x 1000) -0.0029 -0.0012 0.0016
(0.0014) (0.0016) (0.0111)
Transitory component
[[theta].sub.1] 0.2343 0.3273 0.2905
(0.0141) (0.0181) (0.0133)
[[theta].sub.2] 0.1262 0.2306 0.1762
(0.0148) (0.0198) (0.0140)
[[sigma].sup.2.sub.[epsilon]] 0.1834 0.1354 0.1493
(0.0119) (0.0101) (0.0100)
[[pi].sub.87] (c) 1.0000 1.0000 1.0000
[[pi].sub.88] 1.0792 1.0737 1.0642
(0.0447) (0.0557) (0.0499)
[[pi].sub.89] 1.0352 1.0715 1.0493
(0.0443) (0.0521) (0.0456)
[[pi].sub.90] 0.9763 0.9597 1.0015
(0.0439) (0.0553) (0.0459)
[[pi].sub.91] 0.9611 0.9666 0.9853
(0.0469) (0.0549) (0.0476)
[[pi].sub.92] 1.0266 1.0058 1.0141
(0.0544) (0.0586) (0.0503)
[[pi].sub.93] 1.0342 0.9858 1.0130
(0.0480) (0.0589) (0.0493)
[[pi].sub.94] 0.9657 0.9304 0.9573
(0.0479) (0.0551) (0.0445)
[[pi].sub.95] 0.9925 0.9584 0.9997
(0.0449) (0.0553) (0.0466)
[[pi].sub.96] 0.9798 0.9604 1.0039
(0.0430) (0.0516) (0.0441)
[[pi].sub.97] 0.9628 1.0012 1.0126
(0.0447) (0.0559) (0.0457)
[[pi].sub.98] 0.9684 1.0396 1.0584
(0.0438) (0.0574) (0.0466)
[[pi].sub.99] 0.9548 1.0224 1.0226
(0.0442) (0.0488) (0.0405)
[[pi].sub.00] 0.9785 1.0029 1.0217
(0.0497) (0.0556) (0.0443)
[[pi].sub.01] 0.9665 0.9652 0.9760
(0.0466) (0.0581) (0.0453)
[[pi].sub.02] 1.0284 1.0175 0.9769
(0.0496) (0.0543) (0.0435)
[[pi].sub.03] 1.0155 0.9576 1.0044
(0.0457) (0.0548) (0.0455)
[[pi].sub.04] 0.9909 1.0385 1.0872
(0.0503) (0.0623) (0.0476)
[[pi].sub.05] 0.9810 1.0941 1.1010
(0.0497) (0.0612) (0.0482)
[[pi].sub.06] 1.0379 1.1863 1.1457
(0.0513) (0.0624) (0.0507)
[[pi].sub.07] 0.9854 1.1695 1.1512
(0.0521) (0.0645) (0.0520)
[[pi].sub.08] 1.0335 1.0562 1.0522
(0.0483) (0.0613) (0.0489)
[[pi].sub.09] 1.0763 1.0989 1.0555
(0.0479) (0.0625) (0.0500)
Income measure and
sample
After-tax household
income
Male-headed All
Parameter households households
Persistent component
[[sigma].sup.2.sub.[alpha]] 0.1328 0.1445
(0.0024) (0.0020)
[psi] 0.9831 0.9784
(0.0012) (0.0007)
[[sigma].sup.2.sub.[eta]] 0.0103 0.0158
(0.0004) (0.0004)
[[lambda].sub.t] polynomial (b)
[b.sub.1] 0.0269 0.0126
(0.0056) (0.0040)
[b.sub.2] (x 10) -0.0279 -0.0090
(0.0106) (0.0074)
[b.sub.3] (x 100) 0.0143 0.0041
(0.0073) (0.0050)
[b.sub.4] (x 1000) -0.0029 -0.0009
(0.0017) (0.0011)
Transitory component
[[theta].sub.1] 0.3230 0.2880
(0.0184) (0.0131)
[[theta].sub.2] 0.2316 0.1827
(0.0202) (0.0139)
[[sigma].sup.2.sub.[epsilon]] 0.1123 0.1264
(0.0089) (0.0086)
[[pi].sub.87] (c) 1.0000 1.0000
[[pi].sub.88] 1.0910 1.0815
(0.0597) (0.0519)
[[pi].sub.89] 1.0911 1.0651
(0.0559) (0.0475)
[[pi].sub.90] 0.9915 1.0242
(0.0571) (0.0470)
[[pi].sub.91] 0.9676 0.9935
(0.0576) (0.0489)
[[pi].sub.92] 1.0191 1.0318
(0.0606) (0.0505)
[[pi].sub.93] 0.9989 1.0384
(0.0608) (0.0495)
[[pi].sub.94] 0.9209 0.9432
(0.0567) (0.0455)
[[pi].sub.95] 0.9273 0.9869
(0.0572) (0.0475)
[[pi].sub.96] 0.9409 0.9974
(0.0536) (0.0449)
[[pi].sub.97] 1.0140 1.0197
(0.0578) (0.0475)
[[pi].sub.98] 1.0503 1.0764
(0.0595) (0.0485)
[[pi].sub.99] 1.0430 1.0331
(0.0510) (0.0419)
[[pi].sub.00] 1.0320 1.0374
(0.0584) (0.0457)
[[pi].sub.01] 0.9653 0.9743
(0.0605) (0.0461)
[[pi].sub.02] 1.0059 0.9835
(0.0556) (0.0445)
[[pi].sub.03] 0.9712 1.0222
(0.0569) (0.0471)
[[pi].sub.04] 1.0560 1.1077
(0.0641) (0.0490)
[[pi].sub.05] 1.1173 1.1317
(0.0642) (0.0493)
[[pi].sub.06] 1.2276 1.1971
(0.0664) (0.0511)
[[pi].sub.07] 1.2034 1.1978
(0.0688) (0.0534)
[[pi].sub.08] 1.0700 1.0815
(0.0646) (0.0506)
[[pi].sub.09] 1.0909 1.0707
(0.0641) (0.0510)
Source: Authors' calculations using SOI data.
(a.) Estimates of equations 6 through 9 in the text using a minimum
distance estimator (see section V.C). Asymptotic standard errors are
in parentheses.
(b.) See appendix D for specification of the polynomial.
(c.) Parameters [[pi].sub.87] through [[pi].sub.09] correspond to the
years of the sample period (1987-2009) and are normalized to equal 1 in
1987 (see appendix D).
Table 5. Estimated Linear Time Trends of Persistent and Transitory
Variance in Pre-Tax Household Income (a)
Estimated component and decomposition
method
Persistent component
KSS GM Error components
Sample method method model
Male-headed households
Coefficient on linear 0.0050 0.0048 0.0040
time trend variable (0.0003) (0.0003) (0.0005)
p value 0.000 0.000 0.000
[R.sup.2] 0.94 0.93 0.73
All households
Coefficient on linear 0.0056 0.0054 0.0048
time trend variable (0.0004) (0.0004) (0.0005)
p value 0.000 0.000 0.000
[R.sup.2] 0.94 0.93 0.80
Estimated component and decomposition
method
Transitory component
KSS GM Error components
Sample method method model
Male-headed households
Coefficient on linear 0.0007 0.0009 0.0016
time trend variable (0.0001) (0.0001) (0.0006)
p value 0.000 0.000 0.010
[R.sup.2] 0.78 0.92 0.28
All households
Coefficient on linear 0.0008 0.0010 0.0013
time trend variable (0.0001) (0.0001) (0.0005)
p value 0.000 0.000 0.010
[R.sup.2] 0.77 0.85 0.28
Source: Authors' regressions using SOI data.
(a.) Each column in each panel reports results of an ordinary least
squares regression of the persistent or the transitory component of
the variance in household income, as calculated by the indicated
decomposition method, on a constant (not reported) and a linear time
trend. Standard errors are in parentheses.