
Article information

  • Title: Measuring income and wealth at the top using administrative and survey data.
  • Authors: Bricker, Jesse ; Henriques, Alice ; Krimmel, Jacob
  • Journal: Brookings Papers on Economic Activity
  • Print ISSN: 0007-2303
  • Year of publication: 2016
  • Issue: March
  • Language: English
  • Publisher: Brookings Institution
  • Keywords: Economic surveys; Finance; Financial statistics; Income distribution

Measuring income and wealth at the top using administrative and survey data.


Bricker, Jesse ; Henriques, Alice ; Krimmel, Jacob ; et al.


ABSTRACT Most available estimates of U.S. wealth and income concentration indicate that the top shares are high and have been rising in recent decades, but there is some disagreement about specific levels and trends. Household surveys are the traditional data source used to measure the top shares, but recent studies using administrative tax records suggest that these survey-based top share estimates may not be capturing all of the increasing concentration. In this paper, we reconcile the divergent top share estimates, showing how the choices of data sets and methodological decisions affect levels and trends. Relative to the new and most widely cited top share estimates based on administrative tax data alone, our preferred estimates for both wealth and income concentration are lower and have been rising less rapidly in recent years.

**********

Understanding the determinants and effects of wealth and income inequality is a mainstay of political economy. Within the general topic of inequality, the study of the top wealth and income shares garners particular interest. Measuring and explaining wealth and income concentration has challenged economists at least since Vilfredo Pareto (1896) and Simon Kuznets (1953), and the high-quality, micro-level administrative tax data that have recently been made available are generating renewed interest in the shares of resources controlled by the top wealth and income groups. Indeed, the striking trends in top U.S. wealth and income shares reported in the most widely cited studies based on these newly available administrative data sets are now accepted as facts to be embraced and potentially addressed by policymakers. These observations about levels and trends in top wealth and income shares have begun to transcend academic debates, entering the mainstream political arena through best sellers such as those by Raghuram Rajan (2010), Joseph Stiglitz (2012), and Thomas Piketty (2014), and through political movements such as Occupy Wall Street.

Despite the political controversies generated by the estimated top wealth and income shares, relatively little attention has been paid to these estimates' sensitivity to data and methodology. (2) For example, using administrative income tax data, Emmanuel Saez and Gabriel Zucman (2016) estimate that the top 1 percent (by wealth) had a wealth share of 42 percent in 2013, up from 29 percent in 1992. However, the Survey of Consumer Finances (SCF), which combines administrative and survey data, shows less than half the increase in the top 1 percent's wealth share, rising from 30 percent in 1992 to 36 percent in 2013 (figure 1). (3) Similarly, Piketty and Saez (2003) (4) show that the top 1 percent (by income) had a 23 percent income share in 2012, an increase of 10 percentage points since 1992. The SCF shows a 20 percent income share for the top 1 percent in 2012, an increase of 8 percentage points since 1991 (figure 2). (5) Differences in levels and trends in the top wealth and income shares at higher fractiles, such as the top 0.1 percent, are even more striking. (6)

The goals of this paper are to investigate why the various types of data and approaches are giving different answers about top wealth and income shares, and to provide preferred estimates that reflect what can best be gleaned from all the available data, including macro data. The two main sources of micro data used here are administrative tax records and the SCF household survey. These data sources rely on different wealth and income concepts as well as different measurements of wealth and income. In this paper we document that resolving these conceptual and measurement differences also resolves most of the difference in wealth and income concentration estimates from the two data sources.

[FIGURE 1 OMITTED]

[FIGURE 2 OMITTED]

In the case of wealth, concentration measures derived from administrative income tax records can yield improbable results and are sensitive to model assumptions. There are no administrative wealth data in the United States, so "administrative" estimates of wealth must infer wealth by capitalizing taxable income through a common rate of return on asset types. Wealth inferred in this way is heavily dependent on model parameters, and wealth share estimates can be sensitive to small deviations in assumed rates of return. For instance, the return on fixed-income assets of the wealthy assumed by Saez and Zucman (2016) implies as much as four times more wealth than does a market rate of return, and two times more wealth than rates of return estimated from estate tax filings. When wealth concentration is reestimated, changing only the return on fixed-income assets to either of these alternate rates of return, the trend and level of wealth concentration over the past 10 years are identical to SCF estimates that are constrained to use administrative data wealth concepts and units of measurement. Essentially, the entire difference in wealth concentration estimates is due to assumptions about measurement and data construction.
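The sensitivity to the assumed rate of return can be seen in a stylized version of the capitalization step. The symbols below are illustrative assumptions rather than notation from the studies discussed here: $y$ is an observed taxable income flow, $r$ is the assumed common rate of return on the asset type, and $\hat{W}$ is the wealth inferred from the flow.

```latex
\hat{W}(r) = \frac{y}{r},
\qquad
\frac{\hat{W}(r_{\text{low}})}{\hat{W}(r_{\text{high}})} = \frac{r_{\text{high}}}{r_{\text{low}}}
```

Under this sketch, inferred wealth scales inversely with the assumed return. A given interest flow capitalized at a hypothetical 1 percent return implies four times as much fixed-income wealth as the same flow capitalized at a hypothetical 4 percent return.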

Adjusting income concepts and the unit of measurement generally also brings estimated income shares in the administrative tax data (Piketty and Saez 2003) and the SCF into agreement. However, neither data set is able to provide a full accounting of total personal income in the United States.

The central goal of this paper, then, is to go beyond reconciliation and provide preferred top share estimates of wealth and income. These preferred estimates marry the concepts from macro data to micro data and cover the full target population, which is all U.S. families. We provide evidence that augmenting the SCF gets us close to this ideal. Overall, the top share estimates derived in this paper show much lower and less rapidly increasing top shares than the widely cited values from the Saez and Zucman (2016) and Piketty and Saez (2003) studies mentioned above (figures 1 and 2). (7)

To produce new and improved estimates of wealth and income concentration, we begin by considering the preferred concept of wealth and income from an economic point of view. The preferred concept of wealth includes all assets over which a family has a legal claim that can be used to finance its present and future consumption. This concept mirrors the household wealth concept used in the Financial Accounts of the United States (FA) because it includes a family's liabilities and both its financial and nonfinancial assets, as well as its rights to defined-benefit (DB) pensions. (8) The preferred income concept includes all income received by a family, whether it is fully taxed, partially taxed, or untaxed. This concept mirrors the personal income category in the National Income and Product Accounts (NIPA). Both the FA and NIPA are aggregate data, however, and micro data sets are needed for distributional analysis.

Several challenges must be confronted when estimating wealth and income distributions with micro data, such as the SCF and the administrative tax data. The first is that micro data sets do not include every FA wealth concept or every NIPA income concept. Untaxed income, such as the value of employer-provided health insurance and some government transfer income, is never collected in the income tax data and is only sometimes collected in a survey. The SCF wealth estimate typically does not include DB pensions, while most forms of consumer debt cannot be estimated when wealth is inferred from income tax data.

A second estimation challenge concerns differences in population coverage and measurement between these micro data sets. Household surveys are generally thought to reliably cover the full income and wealth distribution, save perhaps the very top. Administrative tax data can reliably cover the top, but coverage suffers at the bottom of the distribution because many families are not required to file tax returns.

Differences in measurement also arise in the units of analysis, which are tax units in the income tax data and the family in a household survey. There are many more tax units (161 million) than families (122 million). Families in the bottom 99 percent are often split into multiple tax units, but a tax unit in the top 1 percent is almost always a family. Counting the top 1 percent (1.61 million) of tax units, then, effectively includes more families than counting the top 1 percent (1.22 million) of families in a survey.

In addition to the conceptual, coverage, and unit-of-analysis difficulties that plague efforts to measure either income or wealth concentration, estimating top wealth shares using administrative tax data introduces yet another potential source of errors. Wealth can only be measured indirectly in income tax data--meaning that wealth is inferred mainly by "capitalizing" income flows--which is at the heart of the approach taken by Saez and Zucman (2016). (9) In a survey like the SCF, wealth is measured directly by asking families about their balance sheets. Accounting for these measurement differences by constraining the SCF to match administrative tax data concepts resolves the discrepancies between the various top wealth share estimates. In particular, the evidence given here and by Wojciech Kopczuk (2015b) shows the sensitivity of wealth inferred from income tax data.

By marrying the concepts from the macro data to the micro data, we can provide preferred top share estimates that cover the full target population: all U.S. families. We provide evidence that augmenting the SCF gets us close to this ideal. We first demonstrate that the SCF represents the full family income and wealth distribution, save for the Forbes 400. By augmenting the SCF household survey along these lines, and by aligning the preferred wealth and income concepts and measurement laid out above, we derive preferred top share estimates.

Our preferred estimates for wealth shares at the top are lower and growing more slowly than in the widely cited capitalized administrative tax data from Saez and Zucman (2016), but this is mostly for methodological reasons, especially the specific capitalization factors used to estimate certain types of wealth (cited above). Indeed, our preferred top wealth share estimates are quite similar to the published SCF values--because one adjustment, adding the Forbes 400, pulls up the SCF top wealth shares; and another adjustment, distributing DB pension wealth, pushes top shares down by a similar amount (figure 1).

Our preferred estimates for top income shares are also lower and rising less rapidly than the recent and widely cited estimates from Piketty and Saez (2003), which were derived from administrative tax data (figure 2). However, those administrative tax data income shares are similar (on an equivalent basis) to SCF top shares, and thus the preferred income top shares are also lower and growing more slowly than published estimates based on the SCF. The differences in levels for incomes at the top (by income) are affected to some extent by the choice of measuring incomes for tax units versus families; but in the end, the wedge in the trends between our preferred and the available top income share estimates is largely driven by the failure of the available micro data to capture cash and in-kind transfers, which are growing rapidly as a share of total income over time. (10)

The reasons for focusing on both wealth and income in one paper are mostly practical. Wealth and income are strongly correlated, so the decisions about how to measure top wealth shares are not neatly separated from the decisions about how to measure top income shares. Indeed, the principle of capitalizing specific income flows forms the basis for wealth inferences in the administrative income tax data and is also used to infer who should be surveyed in the SCF. (11) This process ties top wealth and income share estimates together in an important way.

In addition to the statistical issues, there is also an important conceptual reason for considering both wealth and income concentration in the same paper. Neither income nor wealth concentration tells us everything we want to know about key questions in political economy; but together, the two tell us most of what we want to know. The top income shares are interesting because changes in the flow of returns from current production suggest that something may be amiss in how factor payments are being determined. And the top wealth shares are interesting above and beyond top income shares because disproportionate or increasing control over the level of economic resources may reflect increasing and persistent income concentration--assuming the rich are saving more of their increased incomes--but it could also be driven by trends in relative asset prices and heterogeneous returns on assets. Though dynastic wealth may be less important today than in the past in determining the wealthiest (Kopczuk 2015a), both wealth and income concentration may reflect and shape inequality of opportunity (Yellen 2014).

Some distributional shifts in income might be attributable to fundamental economic factors such as skill-biased technological change, but this probably does not explain increased income concentration within the top 1 percent. Institutional factors may be having an impact across factors of production generally (capital versus labor) and within factors (managerial versus production labor), such that those with the highest incomes are able to capture even higher future shares. Conversely, changes in the way that labor is compensated may be mechanically affecting measured top income shares if (unmeasured) health care and retirement costs are disproportionately pushing down incomes for the nonrich.

One specific concern is that wealth concentration may feed on itself if undue political influence is being exercised by those who can (sometimes independently) finance election campaigns and generate an even more favorable tax or regulatory environment for themselves in subsequent periods. The primary concerns about the effects of rising wealth inequality involve investment and economic growth. Rising wealth concentration may intensify financing constraints for the nonwealthy, affecting investment in education, entrepreneurship, and other types of risk-taking for those with diminished resources. As with incomes, however, it is important to consider what may be driving the estimates of top wealth shares before recommending policies to address those trends.

Identifying the potential biases in top wealth and income share estimates begins with a comprehensive discussion of data and concepts, which is the subject of section I of this paper. Section II then focuses on deriving the preferred estimates for top wealth shares, and section III focuses on top income shares. For both wealth and income, in the course of generating the preferred top shares, we also show how to reconcile the existing SCF and administrative tax data top share estimates. The reconciliation shows that the first-order divergence between the SCF and administrative tax data is basically conceptual in nature, and not a problem of population coverage. The reconciliations generally involve the differences between micro and macro concepts, the unit of analysis, whether and how certain groups are represented in the micro data, and potential survey reporting for different types of incomes.

I. Measuring Wealth and Income Concentration: Concepts and Data Sources

Our starting points for measuring top wealth and income shares are the aggregate concepts and estimates of household sector net worth and income built into the Financial Accounts of the United States and the National Income and Product Accounts. The distributional analysis itself is based on two distinct (but related) micro data sets. Top income and wealth shares are first estimated using the Survey of Consumer Finances, a household survey micro data set collected by the Federal Reserve Board. The top income and wealth shares are then estimated from administrative income tax data produced by the Statistics of Income (SOI) Division of the Internal Revenue Service. These SOI administrative micro tax data are the direct source of the top income shares in Piketty and Saez (2003), the indirect source of the top wealth shares in Saez and Zucman (2016), and the basis for drawing the sample of SCF high-end respondents.

This section describes how the various wealth concepts, income concepts, aspects of population coverage, and units of analysis compare and contrast across these four data sets. Thus, it sets the stage for developing preferred estimates of the top wealth shares in section II, and the top income shares in section III.

I.A. Wealth Concepts and Data

Our starting point for measuring wealth concentration is the concept of net worth owned by the household sector, as embodied in the FA. (12) From an economic point of view, this concept of wealth includes all assets over which a family has a legal claim that can be used to finance its present and future consumption. The net worth of a family is its assets net of liabilities.

This definition excludes some wealth under the control of a family--most notably charitable foundations--as well as expected future Social Security payments. We exclude foundations because a family does not consume goods and services from the assets in the foundation, even though they may be able to consume (nontangible) reputational benefits. (13) We exclude expected future net Social Security benefits mostly for practical reasons. The Social Security wealth measure that one would like to capture is the present value of expected future benefits less expected future taxes, but one would need to make a number of assumptions and projections to actually implement those calculations, beginning with whether or how promised but unfunded benefits will actually be paid. However, given the generally progressive nature of Social Security, it is clear that adding estimates of Social Security wealth would push the more expansive concentration numbers below our preferred estimates. (14)

Our unit of organization is the family, rather than the individual or tax unit, because decisions about future and current consumption are usually made with at least some weight from and consideration for all members of the immediate family. (15) Tax units are frequently families, but tax-filing rules often split one family into many tax units.

There is little difference in the conceptual measure of wealth across the micro data (SCF and administrative tax) and macro data (FA). The FA include assets held in the nonprofit sector, and though it is possible to separate nonprofit real estate holdings, financial assets owned by nonprofits are always included in the overall household net worth measure in the FA. (16)

There are, however, key differences in how various balance sheet items are estimated in the two sets of micro data, as shown in table 1. The most notable difference is that income-generating financial and business assets are estimated in the administrative tax data by applying "gross capitalization" to the observed income flows, while those assets are estimated directly in the SCF through the survey questionnaire. A key assumption in gross capitalization is that all assets of a given type earn a single rate of return, and thus there is a direct relationship between the stock and the flow. (17)

Implementing the gross capitalization approach also requires choosing a gross capitalization factor for each asset type, which Saez and Zucman (2016) solved by using the ratio of a given FA asset balance to the corresponding aggregate administrative tax data flow. This approach generates micro-level wealth totals that, by construction, match the macro-level wealth totals. However, any mismatch between the micro and macro data concepts will lead to bias in capitalization factors and a misallocation of wealth. For example, if the FA aggregate for some asset includes holdings of nonprofit institutions, whereas the micro income flows do not, then too much wealth will be assigned (per $1 of income) at the micro level. Similarly, if the micro data miss small income flows--say, the modest interest earned on checking and savings accounts in a low-interest-rate environment--the corresponding FA assets will be assigned only to those families with large and reported interest flows. These possibilities are more than theoretical, as we show later in the paper that implausible capitalization factors are the key to understanding differences between the survey and administrative tax data estimates for top wealth shares.
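The mechanics described above can be illustrated with a minimal sketch. It is not the procedure used in any of the cited studies: the function names, the asset balances, and the per-unit income flows are all hypothetical, chosen only to show how a macro aggregate that includes nonprofit holdings raises the wealth assigned per $1 of reported income, and how a unit with no reported flow is assigned no wealth.

```python
def capitalization_factor(fa_asset_balance, aggregate_tax_flow):
    """Ratio of a macro asset balance to the matching aggregate micro income flow."""
    if aggregate_tax_flow <= 0:
        raise ValueError("aggregate income flow must be positive")
    return fa_asset_balance / aggregate_tax_flow


def capitalize(flows, factor):
    """Infer each unit's asset holding by scaling its income flow by a common factor."""
    return [flow * factor for flow in flows]


# Hypothetical inputs (arbitrary units, not taken from the paper).
household_assets = 900.0          # assets actually held by households
nonprofit_assets = 100.0          # assets held by nonprofits but included in the aggregate
reported_flows = [0.0, 1.0, 4.0, 15.0]  # income flows observed for four units

total_flow = sum(reported_flows)

# Factor when the macro concept matches the micro concept.
factor_matched = capitalization_factor(household_assets, total_flow)

# Factor when the macro aggregate also contains nonprofit holdings.
factor_mismatched = capitalization_factor(household_assets + nonprofit_assets, total_flow)

wealth_matched = capitalize(reported_flows, factor_matched)
wealth_mismatched = capitalize(reported_flows, factor_mismatched)

print("factor (matched concepts):   ", factor_matched)
print("factor (mismatched concepts):", factor_mismatched)
print("inferred wealth (matched):   ", wealth_matched)
print("inferred wealth (mismatched):", wealth_mismatched)
```

In this toy example the inferred micro totals match whichever aggregate is used, by construction, so the nonprofit holdings end up on household balance sheets in proportion to reported income. The first unit, which reports no flow, is assigned no wealth even if it actually holds a low-yielding account.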

Assets that do not generate observable income flows, such as housing and pension wealth, are allocated in the gross capitalization framework using correlations with other observables in the administrative tax data, such as property taxes and wages, and are benchmarked to available external sources, such as the SCF or published Internal Revenue Service statistics. Again, those assets are measured directly in the SCF, along with nonmortgage liabilities for which there are no useful correlates in the tax data that can be used for distribution. The one asset category that requires inference in the SCF is DB pension wealth. The approach for distributing future DB claims in our preferred top share estimates involves using the survey reports of wages, current DB coverage, and years in a plan for those still working, and current benefits for those already receiving benefits. (18)

I.B. Income Concepts and Data

Our starting point for estimating top income shares is the concept of personal income (PI), as measured in the NIPA. (19) PI is a very broad concept, and is meant to capture all forms of income received by individuals, nonprofit institutions serving households, private noninsured welfare funds, and trust funds. It includes income that is taxed, partly taxed (such as Social Security benefits), and untaxed (mostly transfers, whether cash or in-kind). In particular, we augment the family-level income data in the SCF (which already include market income, Social Security benefits, and some transfers) with estimates of employer health insurance benefits, Medicare benefits, Medicaid benefits, food stamps, and other in-kind transfer payments.

We recognize that there are a variety of ways to measure a "more complete" income (Congressional Budget Office 2014; Burkhauser, Larrimore, and Simon 2012; Burkhauser and others 2012; Smeeding and Thompson 2011), and that the definition of income may depend on the economic exercise. We take great comfort, however, from the fact that top income shares based on our measure of income have the same level and trend as the Congressional Budget Office's measure, which is another hybrid of administrative and survey data (see the online appendix for more detail).

In this section we discuss the conceptual differences between administrative tax data, the SCF, and NIPA, thereby establishing the underpinnings for our preferred top shares estimates presented later in the paper. Although our starting point for measuring top income shares is PI, we acknowledge that there are some irreconcilable differences between the micro and macro data, a key timing adjustment, and one notable addition on the micro side, for realized capital gains. (20) These differences are highlighted in table 2.

In many ways the SCF and administrative tax data are closely related, and are generally consistent with the concept of NIPA PI. Most forms of income from current production--including wages and salaries, business income, interest and dividends paid directly to persons, and other smaller types of "market" income--are conceptually (and empirically) similar in the two micro data sources. To some extent this is by construction, because the SCF income module invites respondents to refer to their income tax returns when answering those questions. The two sets of micro data are in turn mostly consistent with the NIPA in those categories, though NIPA makes adjustments for the underreporting of proprietors' incomes and imputes certain incomes, such as the rental value of owned housing and the value of financial services provided by banks.

The two sets of micro data both count realized capital gains as part of the core income measure, while NIPA does not count capital gains in PI. The NIPA exclusion is based on fundamental national income accounting principles. That is, capital gains are not tied directly to current production; nor do they constitute a transfer from one sector to another. However, for the purpose of measuring top income shares, we choose to include realized gains because they do constitute a flow of current resources over which the family has control.

The treatment of retirement incomes is also different in the micro and the macro data. In the NIPA, and again, based on the principle that incomes should be derived from current production or arising from transfers across sectors, retirement income occurs when employers contribute to retirement plans on their employees' behalf, or when the retirement assets generate interest and dividends. The actual payment of retirement benefits is a mixed bag in the NIPA, with withdrawals and benefits paid from private plans not included, and payments from government plans showing up as transfer income. In the micro data, employer contributions and capital income earned by retirement plans are generally unobserved, but withdrawals are (though to a differing degree in the SCF and administrative tax data) generally observed.

To some extent the appropriate treatment of retirement income cannot be separated from the frequency over which incomes are being measured. On a lifetime basis, it would not matter whether private retirement income was counted as it was accrued or when it was paid out, but the distinction does matter when using annual data. Given the availability of cash flow-oriented micro data at an annual frequency, the top shares estimates we present are based on realized benefits, which implicitly adjusts the NIPA PI concept for a portion of "net saving" in retirement plans, where net saving is new contributions plus interest and dividends earned on plan assets, less pension benefits paid. However, the fact that some new employee contributions (employee-paid Social Security taxes) to retirement plans are still counted (in the micro data) as part of nonretirement income means that the adjustment is only partial.
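The net saving definition in the preceding paragraph can be written compactly. The notation is an assumption introduced here for illustration: in year $t$, $C_t$ denotes new contributions to retirement plans, $R_t$ the interest and dividends earned on plan assets, $B_t$ the pension benefits paid, and $S_t$ net saving in retirement plans.

```latex
S_t = C_t + R_t - B_t,
\qquad
Y_t^{\text{realized}} = Y_t^{\text{PI}} - S_t
```

The second expression describes what a complete switch from an accrual basis to a realized-benefit basis would look like, with $Y_t^{\text{PI}}$ standing for the NIPA concept. As noted above, the adjustment actually embedded in the micro data is only partial, so this should be read as a stylized benchmark rather than the exact relationship.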

The more substantial conceptual differences between our preferred income top share estimates and those available in the micro data are associated with nontaxable government transfers and in-kind compensation. In principle, the SCF captures government cash transfers, but the administrative tax data by construction do not, and the rising share of transfers in NIPA PI means that less total income is being distributed over time when using either micro data set. (21) Neither the SCF nor the administrative tax data make any adjustment for in-kind compensation and transfers, which, especially through employer-provided health care plans and the major government health care programs, have roughly doubled as a share of total NIPA PI since 1988. Our conceptually preferred measure for top income shares allocates these missing income pieces, which brings our overall income concept close to NIPA PI. The remaining conceptual differences are in the imputations and retirement income timing, as discussed above.

I.C. Population Coverage and Units of Analysis

The population of interest in our analysis of top wealth and income shares is all U.S. households. In some ways, this is a simplistic statement, because households are the ultimate claimants on all private incomes and wealth. However, there is substantial private income received and wealth owned by nonprofit institutions that should be excluded, and that is not completely feasible to sort out given the available macro data. In addition to these sectoral coverage issues, there are also differences in population coverage and measurement across the distribution of households, with administrative income tax data generally perceived to be more accurate at the top of the distribution, and household surveys like the SCF thought to provide better coverage at the bottom. These comparisons are further confounded by the differences in the unit of observation across the micro data, with the administrative data collected for tax units, and the survey data collected for households.

Table 3 summarizes the differences in population coverage and the unit of analysis across the four data sets with which we are working. The first key difference between the two sets of micro data is the unit of analysis. In the U.S. income tax data, observations are for tax filing units, not families. The number of tax units (about 161 million in 2012) is approximately 30 percent higher than the number of families (122 million in the SCF). (22)

Most of the tax units at the very top are also families, meaning that many of those observed as a single family in the survey data but multiple tax units in the tax data are found in the bottom 99 percent of the wealth and income distribution. In the 2010 SCF, for example, fewer than 3 percent of coupled families in the top 1 percent filed separately, while about 17 percent of couples in families in the bottom 99 percent filed separately. The implication, then, is that any tax-unit-based top share fractile estimate effectively covers a population that may include 30 percent more family units than the fractile suggests.
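The arithmetic behind this implication can be checked with a short sketch. The population counts are the 2012 figures quoted in the text; the variable and function names are hypothetical, and the calculation assumes, as a simplification, that every tax unit in the top group corresponds to exactly one family.

```python
# Counts quoted in the text for 2012, in millions.
TAX_UNITS_M = 161.0
FAMILIES_M = 122.0


def top_group_size(population_m, fractile):
    """Number of units (in millions) in the top `fractile` of a population."""
    return population_m * fractile


top1_tax_units = top_group_size(TAX_UNITS_M, 0.01)  # top 1 percent of tax units
top1_families = top_group_size(FAMILIES_M, 0.01)    # top 1 percent of families

# Simplifying assumption: each top tax unit is a single family, so the top
# 1 percent of tax units corresponds to this fraction of all families.
effective_family_fractile = top1_tax_units / FAMILIES_M

# Relative excess of tax units over families.
excess_units = TAX_UNITS_M / FAMILIES_M - 1.0

print(f"top 1% of tax units: {top1_tax_units:.2f} million")
print(f"top 1% of families:  {top1_families:.2f} million")
print(f"effective family fractile: {effective_family_fractile:.2%}")
print(f"excess of tax units over families: {excess_units:.0%}")
```

Under that assumption, a "top 1 percent" defined over tax units reaches roughly the top 1.3 percent of families, which is the sense in which the two fractiles are not directly comparable.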

There are many reasons to prefer the household (or family, which is close to household) as the unit of analysis for measuring top wealth and income shares. Many of the tax units residing in multiple-tax-unit families are dependent filers with very low incomes, and therefore they are effectively sharing resources with the other members of the household (usually their parents) who are able to claim them on their taxes. The same can be argued for unmarried partners sharing living arrangements and resources but filing taxes separately. It makes sense to pool their resources when characterizing their share of income or wealth. One can argue that roommates who are not sharing resources could be treated as separate units; but in the end, the issue is really about what one means when measuring the wealth or income shares of "the" top 1 percent. Is this the top 1.22 million families in 2012, or the top 1.61 million tax units? Our preferred estimate is based on families, and the substantial difference between the total counts of families and tax units will turn out to be a key driver of the wedge between existing estimates of the levels of top wealth and income shares.

Sectoral coverage matters both when comparing the SCF with administrative tax data and when comparing the two sets of micro data with the two sets of macro data. The micro data sets do not attempt to measure wealth and income received by nonprofit institutions, and the only available adjustment on the macro side is in the FA balance sheet measure, which separates the real estate holdings of nonprofit institutions. This sectoral overlap becomes important when thinking about the total income or wealth in the denominator of the concentration measures, and whether, for example, a given income flow or asset holding should be allocated to a given top shares group or spread more evenly throughout the distribution. In particular, the capitalization approach to estimating top wealth shares relies on administrative income tax data flows calibrated to FA levels. This approach will assign nonprofit, nonhousing asset holdings across groups based on measured incomes, exacerbating any differences in actual wealth holdings.

There is also a key difference between the micro data sets in population coverage, and this has a potentially first-order bearing on estimated top shares. The goal of the SCF is to survey the entire noninstitutional population using a standard, nationally representative, area probability sample along with the "list" sample derived from administrative tax returns, which is designed to correct for low survey response rates among wealthy families. (23) The members of the Forbes 400 in the year the sample is drawn are explicitly excluded from the SCF sample. (24) In our preferred top wealth and income share estimates, we add in the Forbes 400, but there is some question as to whether the SCF captures the rest of the top of the distribution, particularly those just below the Forbes 400 (see more on this in the next section).

The population coverage for administrative income tax data is necessarily limited to the population that files income taxes. Although there are many more tax units than there are families, there are many families (low-income and retired) where no individual or couple is required to file a tax return. Indeed, of the 161 million estimated tax units in 2012, only 145 million actually filed tax returns. Using other household survey data, Piketty and Saez (2003) supplement the tax-based income-concentration measures by increasing the denominator (total income) to account for nonfilers. (25)
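The effect of the nonfiler adjustment on a top income share can be illustrated with a minimal sketch. Only the 2012 unit counts (161 million tax units, 145 million filers) come from the text; the dollar amounts, the assumed income per nonfiler, and the function name are hypothetical, and the sketch is not the procedure used by Piketty and Saez (2003).

```python
TAX_UNITS_MILLIONS = 161   # estimated tax units in 2012 (from the text)
FILERS_MILLIONS = 145      # tax units that actually filed (from the text)


def top_income_share(top_income, filer_income, nonfiler_income=0.0):
    """Top-group income divided by total income, optionally including nonfilers."""
    return top_income / (filer_income + nonfiler_income)


nonfilers_millions = TAX_UNITS_MILLIONS - FILERS_MILLIONS

# Hypothetical amounts in billions of dollars (illustration only).
top_income = 1_800.0
filer_income = 9_000.0
assumed_dollars_per_nonfiler = 20_000  # an assumption, not a figure from the paper

# millions of units * dollars per unit = millions of dollars; / 1,000 gives billions.
nonfiler_income = nonfilers_millions * assumed_dollars_per_nonfiler / 1_000

share_filers_only = top_income_share(top_income, filer_income)
share_with_nonfilers = top_income_share(top_income, filer_income, nonfiler_income)
```

Because nonfiler income enters only the denominator, the adjusted share is mechanically lower than the filers-only share; how much lower depends entirely on the income imputed to nonfilers.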

Both the SCF and the administrative income tax data face challenges vis-a-vis population coverage. The coverage challenge for the administrative tax data is mostly about nonfilers, and, to some extent, the coverage problems cannot be cleanly separated from the concept of income being measured, because the income composition of nonfilers is very different from the income composition of filers. The SCF also faces issues in capturing certain types of income, but the more immediate concern is whether the SCF actually captures the top of the distribution, as the sampling strategy is designed to accomplish.

I.D. Does the SCF Capture the Top End?

It is difficult to argue with the presumption that administrative tax data should provide better estimates of top wealth and income shares, because participation in the administrative data is required by law, and traditional household surveys are well known to suffer from an underrepresentation of very wealthy families. (26) In addition, administrative tax data are subject to audit, and thus (again) one presumes that income and other reporting will be more accurate in those data. Unlike most other household surveys, the SCF is designed to overcome the underrepresentation problem, because administrative tax data are used to select the sample, and rigorous targeting and accounting for wealthy families' participation ensures that those families are properly represented. Also, SCF cases are reviewed for internal consistency (to some extent guided by the administrative sampling data), but this review process may fail to capture all reporting errors. In this subsection we show that the SCF does a very good job identifying and surveying wealthy families, though there may be some downward bias in capturing certain types of income at the very top.

The SCF strategy begins with the view that a combination of survey and administrative data is better than either in isolation. The benefit of the survey component is straightforward, in that the data collector can control the population being studied and the specific wealth and income concepts being measured. However, for the purposes of studying top wealth and income shares, this benefit can be dwarfed by a failure to survey wealthy families. Measuring top wealth and income shares by expanding on simple random sampling in a traditional household survey is not a viable solution, because thin tails at the top lead to enormous sampling variability, and disproportional nonparticipation at the top biases down top share estimates.

The SCF effectively overcomes the problems of thin tails and differential nonparticipation by oversampling at the top, relying on administrative data derived from tax records, and by verifying that the top is represented using targeted response rates in several high-end strata. (27) The SCF "list" sample actually comprises seven strata, where the first basically overlaps the address-based random sample, and the remaining strata identify increasingly wealthy groups of families up to (but not including) the Forbes 400. In very general terms, the top four strata in any given year, made up of roughly 1,000 SCF families, effectively represent the top 1 percent of all families. The targeted response rates in the list sample do vary across strata in an expected manner, with participation rates falling as predicted wealth rises. The response rate in the wealthiest SCF stratum is about 12 percent, increasing to 25 percent in the second-wealthiest stratum, 30 percent in the third-wealthiest, 40 percent in the fourth- and fifth-wealthiest, and then about 50 percent in the two least-wealthy. These high-end response rates are considerably lower than the roughly 70 percent response rate observed in the SCF area probability sample.

The fact that participation rates are lower for very wealthy SCF families does not mean that the sample is biased by underrepresentation at the very top, however; it just reflects the fact that very wealthy families are much more difficult to contact and then, given contact, are less likely to participate in the survey. Sample weights are systematically varied across the top strata in order to correct for the differential nonresponse. The important question is whether the families that eventually participate in the survey, thus representing their respective wealth stratum, are statistically distinguishable from sampled nonparticipants. (28) Indeed, a regular step in the SCF's quality control process involves comparing and contrasting participants and nonparticipants within a stratum, in order to identify these sorts of potential biases. These comparisons are based on looking at administrative data incomes in the years preceding the survey. (29)
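The weighting logic described here can be illustrated with a minimal sketch. It assumes the simplest possible adjustment, in which each participant's base sampling weight is divided by the response rate of its stratum; the actual SCF weighting procedure is more elaborate and is not reproduced here. The stratum numbering, the base weights, and the function names are hypothetical; the response rates are the approximate figures quoted above.

```python
# Approximate list-sample response rates quoted in the text.
# Hypothetical numbering: stratum 1 is the least wealthy, stratum 7 the wealthiest.
RESPONSE_RATE = {1: 0.50, 2: 0.50, 3: 0.40, 4: 0.40, 5: 0.30, 6: 0.25, 7: 0.12}


def nonresponse_adjusted_weight(base_weight, stratum):
    """Inflate a participant's base weight by the inverse of the stratum response rate."""
    return base_weight / RESPONSE_RATE[stratum]


def represented_families(n_sampled, base_weight, stratum):
    """Families represented by one stratum's participants after the adjustment."""
    n_participants = n_sampled * RESPONSE_RATE[stratum]
    return n_participants * nonresponse_adjusted_weight(base_weight, stratum)
```

Under this simple scheme the participants in a stratum represent the same number of families as the full sampled stratum would have, whatever the response rate. The adjustment is unbiased only if participants resemble nonparticipants within the stratum, which is the question the comparisons in this subsection address.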

The administrative data underlying the SCF sampling are consistent with participants being representative of nonparticipants within each high-end stratum. The distributions of total incomes for SCF participants are similar to those of sampled nonrespondents (top panel of figure 3). Moving from the fourth-highest stratum to the highest stratum, one sees the substantial nonlinearity of incomes that characterizes the top end, as each successive log scale for income shifts to the right in dramatic fashion. The range of incomes in the top four SCF strata completely covers the top 1 percent in an overlapping way--meaning, for example, that the top of the fourth-highest stratum overlaps with the bottom of the third-highest stratum, and so on. The capital income distributions of SCF respondents are also similar to those of nonrespondents (bottom panel of figure 3), and the nonlinearity in incomes as one moves from the fourth-highest to the highest stratum is even more dramatic. (30)

In general, statistical tests confirm the visual indication that participants and sampled nonparticipants within strata have very similar income distributions. The null hypothesis is that the two distributions come from the same underlying distribution, and the test statistics generally fail to reject the null hypothesis, using a rank-sum test (either Kolmogorov-Smirnov or Wilcoxon). The specific results vary by year and across strata, but in the 2013 sample, the null hypothesis was rejected for only the second-highest stratum for total income. (31)
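A two-sample comparison of this kind can be sketched as follows. The example computes the two-sample Kolmogorov-Smirnov statistic by hand with the standard library and compares it with the usual large-sample critical value, which is only approximate for samples as small as these. The log-income values and function names are invented for illustration; they are not SCF or SOI data, and this is not the authors' code.

```python
import math
from bisect import bisect_right


def ks_statistic(sample_a, sample_b):
    """Largest vertical gap between the two empirical CDFs."""
    a, b = sorted(sample_a), sorted(sample_b)
    gap = 0.0
    for x in a + b:
        cdf_a = bisect_right(a, x) / len(a)
        cdf_b = bisect_right(b, x) / len(b)
        gap = max(gap, abs(cdf_a - cdf_b))
    return gap


def ks_critical_value(n, m, alpha=0.05):
    """Large-sample approximation to the rejection threshold at level alpha."""
    c = math.sqrt(-0.5 * math.log(alpha / 2))
    return c * math.sqrt((n + m) / (n * m))


# Hypothetical log incomes within one stratum (not real data).
participants = [13.1, 13.4, 13.8, 14.0, 14.3, 14.9, 15.2, 15.8]
nonparticipants = [13.0, 13.5, 13.7, 14.1, 14.4, 14.8, 15.3, 15.9]

d = ks_statistic(participants, nonparticipants)
reject = d > ks_critical_value(len(participants), len(nonparticipants))
```

For these invented samples the statistic falls well below the threshold, so the null hypothesis of a common underlying distribution is not rejected, which is the pattern the text reports for most strata and years.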

[FIGURE 3 OMITTED]

Focusing on the means of the distributions across strata, average total incomes for both participants and sampled nonparticipants in the fourth-highest stratum are generally about $500,000, whereas the average total incomes in the highest stratum are above $50 million (top panel of figure 4, shown, again, on a log scale). The averages for total income versus capital income only differ noticeably for the fourth-highest and third-highest strata (bottom panel of figure 4). In the top two strata, average total income is dominated by and effectively equivalent to capital income. As with differences in the distributions, one can test for differences in the means by income measure, stratum, and year. In general, the tests fail to reject the null hypothesis that the means for participants and sampled nonparticipants are the same. (32)

In addition to average levels, one can also compare SCF respondents and nonrespondents in terms of observable presurvey income volatility. This metric also shows that SCF participants are similar to nonrespondents for both total income (top panel of figure 5) and capital income (bottom panel of figure 5). Income at the top is known to be much more volatile than in the rest of the income distribution, and the trend seems to be toward higher relative volatility at the top. (33) In the SCF sampling data, for the top four strata covering the top 1 percent, about one-fourth of 2013 families experienced income changes below -50 percent or above +50 percent. The similarity between SCF respondents and nonrespondents means that potential distortionary effects from sampling families with very high or very low transitory income shocks are not a problem.

Although it would violate SCF protocol to directly evaluate the accuracy of any given SCF respondent's reported income, it is possible to get an estimate of reported income accuracy, on average, using two distributional comparisons against the entire SOI data set for a given survey year. The first approach is to compare the growth distribution of incomes reported by SCF respondents with the growth distribution observed in the SOI administrative data for families with comparable income levels. The second approach involves looking at how many SCF families report incomes above the published SOI thresholds, and how much income in total is reported by those in a given top income group. (34)

[FIGURE 4 OMITTED]

[FIGURE 5 OMITTED]

[FIGURE 6 OMITTED]

High-income and high-wealth families typically have volatile incomes. For example, in the complete 2011 SOI data set, about 60 percent of the families with an adjusted gross income (AGI) greater than $500,000 realized a decline in AGI in their 2012 tax filing (figure 6, right bars). At the tails, about 22 percent of the families in 2011 with an AGI greater than $500,000 had a decline in income of 50 percent or more, and about 11 percent had an increase in income of 50 percent or more. However, of the 2011 SOI families with an AGI greater than $500,000 that responded to the SCF, about 74 percent reported an annual income decline (survey-reported income relative to the last year of administrative sampling income), and nearly 32 percent reported a decline in income of 50 percent or more (figure 6, left bars). Thus, although the patterns of income change in figure 6 are broadly similar, some high-income SCF respondents may be, on net, underreporting 2012 income, and the SCF data editing process does not correct for this underreporting. One possible explanation is that many high-income SCF families had not filed their taxes at the time of their interview, so they may have been unaware of their actual 2012 income during the interview.

In addition to comparing growth rate distributions, it is possible to look at whether the SCF is capturing the very top of the SOI income distribution in any given year. One of the (now regular) tables published in the SOI Bulletin shows income thresholds for various top share groups, along with the amount of income earned above these thresholds. (36) Thus, it is possible to look at various SOI cutoffs (for the top 10 percent, top 1 percent, and top 0.1 percent) and investigate whether the SCF finds the right number of families above these cutoffs, and the right amount of total income above the threshold. These comparisons are far from perfect, because the SCF is set up on a family basis while SOI is organized in tax units, and (although SCF respondents are asked to refer to their tax returns) the value of income they report may differ from the AGI concept in the SOI tables. (37) Indeed, the modest biases one expects show up clearly: The SCF has more families above any given threshold and generally more income (additional family income will increase a given tax unit's income, which pushes a few more families over the threshold) except for the top 0.1 percent, for which the SCF finds roughly the same total income (the tax unit versus family distinction is less important as one gets closer to the very top). It is particularly important that we do not observe any trend in how well the SCF captures top incomes over time.

Though the SCF covers the top end of the income distribution, other comparisons of SCF and SOI incomes by source suggest that more general reporting challenges for capital income--such as interest, dividend, and business income--are likely affecting top families. For example, Barry Johnson and Kevin Moore (2008) show that aggregate total income in the SCF generally matches total aggregate income published by SOI, but the aggregates of some forms of capital income in the SCF appear to be understated, while wages and other types of income are overstated relative to the tax data. Saez and Zucman (2016) also state that the capital income concentration in the SCF is lower than the capital income concentration in the income tax data, and argue that this is evidence that the SCF is not capturing the top of the distribution.

How can the SCF capture the top of the income distribution and match total taxable income but have understated capital income shares? We argue that understated capital income in the SCF is mainly due to the classification of income. Wages as a share of the total income of the wealthiest SCF families has grown more than in the tax data since 2001. (38) We concede that some of what respondents call "wages" may, in fact, be "business income," as the two could be thought of interchangeably by business owners. Business income is the largest source of capital income in both the SCF and the income tax data. (39)

The question posed at the beginning of this section is whether the SCF accomplishes its goal of identifying and surveying high-end families. The answer is basically yes, though given the restriction on auditing respondents, there will always be some uncertainty about exactly who is being included and whether their reported incomes are accurate. The importance of showing that the SCF captures families at the very top is, in one sense, a first-order point for our purposes here. But in another sense, it is just a corollary to the fact established later in the paper that, after being made conceptually equivalent, top wealth and income shares in the SCF and administrative tax data are effectively the same. Given that the populations in the two sets of micro data are effectively aligned, the more salient questions involve what we should be measuring conceptually, and how we should be measuring these desired concepts.

II. Top Wealth Shares in Administrative and Survey Data

Wealth concentration has been at the center of recent media discussions (Feldstein 2015; Harwood 2015; Wolfers 2015) and academic discussions (Auerbach and Hassett 2015; Mankiw 2015; Piketty 2015; Weil 2015). In addition to concerns about the causes and effects of rising wealth concentration, some of the debate exists because different wealth concentration estimates paint contrasting pictures about what is actually happening. Published SCF household survey estimates indicate that wealth concentration at the top is high but increasing slowly (Bricker and others 2014), with a trajectory similar to that for estate tax data (Kopczuk and Saez 2004), though the level of wealth concentration is higher in the SCF. The inferences about top wealth shares using capitalized income tax data (Saez and Zucman 2016) indicate much higher and more rapidly growing wealth shares at the top of the wealth distribution, which has led to a substantial widening between levels of estimated wealth concentration in recent years.

In this section we present our preferred estimates of top wealth shares, and we show how these estimates compare with and contrast to both published SCF and gross capitalization estimates. Our preferred top share estimate is constructed by starting with the SCF wealth measures, adding the estimated wealth of the Forbes 400, and then distributing the value of DB pensions as measured in the FA. As described in section I.A, this preferred concept of wealth includes all assets (net of liabilities) over which a family has a legal claim that can be used to finance its present and future consumption.

We also investigate the source of divergence in growth rates and levels by constraining the SCF to conceptually match Saez and Zucman (2016). Using this approach, we are able to confirm that the differentials in wealth concentration are not attributable to the wealth concept per se, nor to population coverage or survey-reporting errors, and are, in fact, attributable to assumptions and methodology.

II.A. Preferred Estimates of the Top Wealth Shares

In all the estimates discussed here, the top wealth shares in the United States are very high and have been increasing over time. The top panel of figure 1 shows the estimated share of wealth owned by the top 1 percent for the period 1989-2013 based on three different measures, and the bottom panel of figure 1 shows the same for the top 0.1 percent wealth shares. In general, the estimated top wealth shares using the gross capitalization method applied to administrative tax data produced by Saez and Zucman (2016) are higher and have been growing more rapidly than the top wealth shares in published SCF estimates, and are also higher than those based on our preferred measure.

Our preferred measure of the top wealth shares begins with the published SCF Bulletin concept and estimates, next adds the wealth known to be missing because the Forbes 400 is excluded from the SCF sample, and then adds the value of DB pensions. (40) With these two adjustments, the preferred measure is conceptually equivalent to household sector net worth in the FA, but excludes nonprofit institutions. (41) Thus, the measure encompasses all the private resources available to families for present and future consumption. Most of this wealth is "marketable," in the sense of being available to trade for current consumption, with the exception of DB wealth, but this reflects private claims on future consumption.

The preferred measure shows slower growth in wealth concentration than in Saez and Zucman (2016). In fact, the preferred top shares' growth rate is very similar to the SCF. (42) Estimates of top wealth shares for both the top 1 percent and the top 0.1 percent were closer across the methods in the early years of the SCF than they are now, but differential growth rates have led to very different levels in recent years. In the most recent period, the preferred estimate of the top 1 percent wealth share is about 33 percent of total wealth, while the capitalized income value is nearly 42 percent. In a proportional sense, the divergence in the most recent years is even larger for the top 0.1 percent, with the preferred measure showing a share just under 15 percent of total wealth, and the capitalized income value more than 22 percent. The different measures all agree that wealth concentration is increasing within the top 1 percent, though the gross capitalization estimates are the most extreme in this regard.

II.B. Reconciling the Wealth Concentration Estimates

If the SCF sampling strategy does a good job capturing the top end of the wealth distribution, and SCF respondents do a good job reporting the values of their assets and liabilities, what is causing the substantial divergence between estimated top wealth shares in the SCF-based preferred and gross capitalization measures? Our approach to answering this question involves constraining the SCF to be conceptually and empirically similar to the gross capitalization estimates, and showing that most of the divergence is eliminated. In particular, when we measure top wealth shares after constraining SCF totals to match FA aggregates and adjusting the number of families in the top fractile to be consistent with tax unit counts, most of the recent level differences are eliminated, or at least are brought within the range of SCF statistical confidence.

The effects of constraining the SCF-based preferred top wealth share estimates to be conceptually and empirically equivalent to the gross capitalization estimates are shown in the top panel of figure 7 for the top 1 percent, and in the bottom panel of figure 7 for the top 0.1 percent. The first adjustment, which involves moving from the "Preferred" line to the "Preferred, FA concepts and values" line, is based on calibrating the sum of SCF values to match FA values across asset and liability categories. In general, the SCF and FA aggregates track very well over long periods of time. (43) There are notable differences in levels and trends, however. Most important, the SCF finds a higher and (since 2001) more rapidly rising estimate for the value of owner-occupied housing, which has pushed up the ratio of SCF to FA net worth in recent years. (44) Thus, when the SCF house values (and other asset and liability categories) are scaled to match the corresponding FA aggregates, owner-occupied housing is disproportionately scaled down. This differential rescaling is important, because the divergence in owner-occupied housing aggregates implies that benchmarking administrative data to FA instead of the SCF lowers wealth more below the top fractiles than above them, and more so for the top 0.1 percent than even the top 1 percent.
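The mechanics of this first adjustment can be sketched with a stylized two-group example. All holdings and aggregates below are invented, and the category names and function names are hypothetical; the sketch only illustrates why scaling owner-occupied housing down more than other categories raises the measured top share when housing is held disproportionately below the top.

```python
def calibrate_to_fa(holdings, survey_totals, fa_totals):
    """Rescale each category so that survey totals equal the FA aggregates."""
    return {cat: value * fa_totals[cat] / survey_totals[cat]
            for cat, value in holdings.items()}


def top_share(top, rest):
    """Share of total net worth held by the top group."""
    top_sum, rest_sum = sum(top.values()), sum(rest.values())
    return top_sum / (top_sum + rest_sum)


# Invented holdings in arbitrary units: the top group holds mostly financial
# assets, while everyone else holds mostly owner-occupied housing.
top = {"housing": 10.0, "financial": 30.0}
rest = {"housing": 40.0, "financial": 20.0}
survey_totals = {cat: top[cat] + rest[cat] for cat in top}
fa_totals = {"housing": 40.0, "financial": 50.0}  # FA housing below the survey total

share_before = top_share(top, rest)
share_after = top_share(calibrate_to_fa(top, survey_totals, fa_totals),
                        calibrate_to_fa(rest, survey_totals, fa_totals))
```

In this example the top share rises from 40 percent to roughly 42 percent solely because the housing category is scaled down, with no change in who holds what.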

[FIGURE 7 OMITTED]

The second set of constraints imposed on the SCF adjustment involves shifting the top fractile cutoffs to be on a tax unit instead of a household basis. (45) The shift from the "Preferred, FA concepts and values" lines in both panels of figure 7 reflects the impact of imposing this constraint, and the lines labeled "Preferred, FA concepts and values, tax units" are again noticeably shifted up. We also add the shaded area around the second constrained top share estimates, which represents the 95 percent confidence interval. (46) Indeed, all the differences in recent top 1 percent wealth shares are effectively eliminated when we constrain the SCF, and all but the most recent periods are reconciled for the top 0.1 percent. The exercise does raise questions about why, for example, the SCF top 1 percent wealth shares are above the capitalized values in the early years of the survey, and why the top 0.1 percent shares have been growing much more rapidly in recent years. But the magnitude of the adjustments and range of the confidence intervals make it clear that top wealth shares are very sensitive to the specific data and methods being used.
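The direction of the tax unit adjustment can be illustrated with a minimal sketch. The counts of 1.22 million families and 1.61 million tax units in the top 1 percent are the 2012 figures cited in section I; the wealth values are synthetic, and the function name is hypothetical. As a strong simplification, the same ranked list of units is used on both bases and only the number of units counted as "the top" changes, whereas in practice mapping families to tax units also changes the ranking itself.

```python
def top_share_by_count(wealth, n_top):
    """Share of total wealth held by the n_top wealthiest units."""
    ranked = sorted(wealth, reverse=True)
    return sum(ranked[:n_top]) / sum(ranked)


# Size of the top 1 percent in 2012, in millions, on each basis (from the text).
TOP_FAMILIES = 1.22
TOP_TAX_UNITS = 1.61

# Synthetic population of 10,000 units with a heavy upper tail (not real data).
wealth = [1_000_000 / (rank + 1) for rank in range(10_000)]

n_family_basis = 100  # top 1 percent of the 10,000 synthetic units
n_tax_unit_basis = round(n_family_basis * TOP_TAX_UNITS / TOP_FAMILIES)

share_family_basis = top_share_by_count(wealth, n_family_basis)
share_tax_unit_basis = top_share_by_count(wealth, n_tax_unit_basis)
```

Counting about one-third more units in the top group necessarily raises the measured top share, which matches the direction of the upward shift described for figure 7.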