The missing "one-offs": the hidden supply of high-achieving, low-income students.
Hoxby, Caroline ; Avery, Christopher
ABSTRACT We show that the vast majority of low-income high
achievers do not apply to any selective college. This is despite the
fact that selective institutions typically cost them less, owing to
generous financial aid, than the two-year and nonselective four-year
institutions to which they actually apply. Moreover, low-income high
achievers have no reason to believe they will fail at selective
institutions since those who do apply are admitted and graduate at high
rates. We demonstrate that low-income high achievers' application
behavior differs greatly from that of their high-income counterparts
with similar achievement. The latter generally follow experts'
advice to apply to several "peer," a few "reach,"
and a couple of "safety" colleges. We separate low-income high
achievers into those whose application behavior is similar to that of
their high-income counterparts ("achievement-typical") and
those who apply to no selective institutions
("income-typical"). We show that income-typical students are
not more disadvantaged than the achievement-typical students. However,
in contrast to the achievement-typical students, income-typical students
come from districts too small to support selective public high schools,
are not in a critical mass of fellow high achievers, and are unlikely to
encounter a teacher who attended a selective college. We demonstrate
that widely used policies--college admissions recruiting, campus visits,
college mentoring programs--are likely to be ineffective with
income-typical students. We suggest that effective policies must depend
less on geographic concentration of high achievers.
**********
In this study we show that a large number--probably the vast
majority--of very high-achieving students from low-income families do
not apply to a selective college or university. (1) This is in contrast
to students with the same test scores and grades who come from
high-income backgrounds: they are overwhelmingly likely to apply to a
college whose median student has achievement much like their own. This
gap is puzzling because we find that the subset of high-achieving,
low-income students who do apply to selective institutions are just as
likely to enroll and progress toward a degree at the same pace as
high-income students with equivalent test scores and grades. Added to
the puzzle is the fact that very selective institutions not only offer
students much richer instructional, extracurricular, and other
resources, but also offer high-achieving, low-income students so much
financial aid that these students would often pay less to attend a
selective institution than the far less selective or nonselective
postsecondary institutions that most of them do attend.
We attempt to unravel this puzzle by characterizing low-income,
very high-achieving students in the U.S. using a rich array of data,
including individual-level data on every student who takes one of the
two college assessments, the ACT and the SAT. We divide the low-income,
very high-achieving students into those who apply similarly to their
high-income counterparts ("achievement-typical" behavior) and
those who apply in a very dissimilar manner ("income-typical"
behavior). We do this because we are interested in why some low-income
high achievers appear to base their college-going on their achievement,
whereas others base it on their income. We find that income-typical
students are fairly isolated from other high achievers, both in terms of
geography and in terms of the high schools they attend. In fact, their
lack of concentration is such that many traditional strategies for
informing high-achieving students about college--for instance, college
admissions staff visiting high schools, or after-school programs that
provide mentoring--would be prohibitively expensive. We also show that
income-typical students have a negligible probability of meeting a
teacher, high school counselor, or schoolmate from an older cohort who
attended a selective college.
In contrast, we show that achievement-typical students are highly
concentrated. Some of these low-income students attend a small number of
"feeder" high schools that contain a critical mass of high
achievers. Some feeder schools admit students on the basis of an exam or
previous grades; others are magnet schools; still others contain a
subpopulation of low-income students in a student body that is generally
affluent. Since these high schools are nearly all located in the largest
school districts of very large metropolitan areas (not even in
medium-size metropolitan areas), their students are far from
representative of high-achieving, low-income students in general.
Moreover, we show evidence that suggests that these schools may be
"tapped out"--that their students are already so intensively
recruited by selective colleges that further recruitment may merely
shift students among similar, selective colleges, and not cause students
to change their college-going behavior in more fundamental ways.
The evidence that we present is descriptive, not causal. This is an
important distinction. For instance, we cannot assert that a
high-achieving, low-income student would act like an achievement-typical
student rather than an income-typical student if he or she were moved to
a large metropolitan area with a high school that practices selective
admission. Moreover, we do not assert that income-typical students would
have higher welfare if they applied to college in the same way that
achievement-typical and high-income high achievers do. We leave such
causal tests for related studies in which we conduct randomized,
controlled interventions. Nevertheless, our descriptive evidence makes
three important contributions. First, it documents that the number of
low-income high achievers is much greater than college admissions staff
generally believe. Since admissions staff see only the students who
apply, they very reasonably underestimate the number who exist. Second,
our evidence suggests hypotheses for why so many low-income high
achievers apply to colleges in a manner that may not be in their best
interest, and that is certainly different from what similarly
high-achieving, high-income students do. Most of our hypotheses are
related to the idea that income-typical students--despite being
intelligent, literate, and on colleges' search lists (that is, the
lists to which selective colleges mail brochures)--lack information or
encouragement that achievement-typical students have because they are
part of local, critical masses of high achievers. Third, our descriptive
evidence allows us to explain why some traditional interventions are
unlikely to change the situation and allows us to identify other
interventions that could plausibly do so.
Our previous work (Avery, Hoxby, and others 2006) was perhaps the
first to identify the phenomena described in this paper, but there is
now a small literature on the topic of "undermatching." We
especially note the work of William Bowen, Matthew Chingos, and Michael
McPherson (2009), Eleanor Dillon and Jeffrey Smith (2012), and Amanda
Pallais (2009). Relative to those studies, our study's strengths
are its comprehensiveness (we analyze the entire population of
high-achieving students, not a sample); our complete characterization of
each U.S. high school, including its history of sending its students to
college; our ability to map students to their exact high schools and
neighborhoods (this allows us to investigate exactly what they
experience); and our use of accurate administrative data to identify
students' aptitude, application behavior, college enrollment, and
on-time degree completion. The sheer comprehensiveness and accuracy of
our data are what allow us to test key hypotheses about why some
high-achieving, low-income students are income-typical and others are
achievement-typical. Our data also allow us to assess which
interventions might plausibly (and cost-effectively) alter such
behavior.
The paper is organized as follows. In the next section we present
some background on college policies directed toward low-income high
achievers. In section II we describe our data sources. In section III we
present a descriptive portrait of very high-achieving U.S.
students--their family incomes, parents' education, race,
ethnicity, and geography. In section IV we show that high-achieving
students' college application behavior differs greatly by family
income. We also show that, conditional on applying to a college,
students' enrollment, college grades, and degree receipt do not
differ by family income (among students with similar incoming
qualifications). In section V we divide low-income high achievers into
achievement-typical and income-typical groups. We then compare factors
that might affect the college application behavior of these groups. In
section VI we consider several interventions commonly directed toward
low-income high achievers, and we demonstrate that they are likely to be
cost-prohibitive for income-typical students. To drive the point home,
we contrast colleges' difficulty in identifying low-income high
achievers with their ease in identifying top athletes. In section VII we
conclude by discussing which hypotheses we have eliminated and which
still need testing, and we speculate on the sort of interventions that
could plausibly test whether income-typical students' welfare would
be greater if they were better informed.
I. Background on College Policies Directed toward Low-Income High
Achievers
Many students from low-income families have poor college outcomes:
they do not attend college, they drop out before attaining a degree,
they earn so few credits each term that they cannot graduate even in 1.5
times the "correct" time to degree, or they attend
institutions with such poor resources that even when they do graduate,
they earn much less than the median college graduate. These poor college
outcomes are often attributed to low-income students being less
academically prepared for college and less able to pay for college.
These are certainly valid concerns. As we show later, high-income
students (those from families in the top income quartile) are in fact
much more likely to be high achievers at the end of high school than are
low-income students. Nevertheless, some low-income students are very
high achievers: at the end of high school, they have grades and college
assessment scores that put them in the top 10 percent of students who
take either the ACT or the SAT college assessment exam or, equivalently,
the top 4 percent of all U.S. secondary school students.
High-achieving, low-income students are considered very desirable
by selective colleges, private and public, which are eager to make their
student bodies socioeconomically diverse without enrolling students who
are unprepared for their demanding curricula. The ultimate evidence of
colleges' eagerness is their financial aid policies, which, as we
shall show, are very generous toward such students. However, we have
also observed this eagerness personally among hundreds of college
leaders and their admissions staff. Many spend considerable amounts on
recruiting the low-income students who do apply and on (not always
successful) programs designed to increase their numbers of low-income
applicants. There are many reasons for selective institutions to prefer
socioeconomic diversity. These include, to name just a few, a deep
respect for merit regardless of need; the fact that students whose lives
were transformed by highly aided college education tend to be the most
generous donors if they do become rich; a belief that a diverse student
body makes instruction and research more productive; and pressure from
society.
In recent years, selective schools' aid for low-income high
achievers has become so generous that such students' out-of-pocket
costs of attendance are zero at the nation's most competitive
schools, and small at other very selective schools. Figure 1 shows the
distribution of annual income in 2008 for families with a child in the
12th grade--a good indicator for a family having a child of
college-going age in the next year. The 20th percentile of this
distribution was $35,185. Table 1 shows the out-of-pocket costs
(including loans) such a student would have experienced in the 2009-10
school year at a variety of selective and nonselective institutions. The
table is organized based on institutions' selectivity as classified
by Barron's Profiles of American Colleges: most competitive, very
competitive, competitive 4-year institutions, nonselective 4-year
institutions, and (nonselective by definition) community colleges and
other 2-year institutions. Table 1 also shows the colleges'
comprehensive cost for a student who needs no financial aid (the
"sticker price") and their instructional expenditure per
student. What the table reveals is that a low-income student who can
gain admission to one of the most selective colleges in the U.S. can
expect to pay less to attend a very selective college with maximum
instructional expenditure than to attend a nonselective 4-year college
or 2-year institution. In short, the table demonstrates the strong
financial commitment that selective colleges have made toward becoming
affordable to low-income students. (2)
[FIGURE 1 OMITTED]
In related work (Avery, Hoxby, and others 2006), we analyze Harvard
University's introduction of zero costs for students with annual
family incomes of $40,000 and below starting in 2005. (Harvard is a
relevant option for the students we analyze in this paper.)
Harvard's policy was
quickly imitated or outdone by the institutions with which it most
competes: Yale, Princeton, Stanford, and some others. All such
institutions subsequently raised the bar on what they considered to be a
low enough income to merit zero costs, to the point where even students
from families with income above the U.S. median can often attend such
institutions for free. Although less well endowed institutions followed
suit to a lesser extent (usually by setting the bar for zero costs at a
lower family income than the aforementioned institutions did), the
result was very low costs for low-income students at selective
institutions, as table 1 shows.
In our other work we show that Harvard's policy change had
very little effect--at least, very little immediate effect--on the
income composition of its entering class. We estimate that it increased
the number of low-income students by approximately 20, in a class of
more than 1,600 (Avery, Hoxby, and others 2006, table 1, top row).
Interestingly, this very modest effect was not a surprise to many
college admissions staff. They explained that there was a small pool of
low-income high achievers who were already "fully tapped," so
that additional aid and recruiting could do little except shift them
among institutions that were fairly similar. Put another way, they
believed that the overall pool of high-achieving, low-income students
was inelastic. Many felt that they had already tried every means open to
them for recruiting low-income students: guaranteeing need-blind
admission, (3) disproportionately visiting high schools with large
numbers of free-lunch-eligible students, (4) sending special letters to
high achievers who live in high-poverty ZIP codes, (5) maintaining
strong relationships with guidance counselors who reliably direct
low-income applicants to them, (6) coordinating with or even running
college mentoring programs for low-income students, (7) paying a
third-party organization for a guaranteed minimum number of low-income
enrollees, (8) sponsoring campus visits for students from local high
schools known to serve low-income families, and personally contacting
students whose essays suggest that they might be disadvantaged. Although
the admissions staff believed that they might succeed in diversifying
their student bodies by poaching from other selective schools or
lowering their admissions standards for low-income students, they did
not expect additional aid together with more of the same recruiting
methods to affect matters much. (9) (Note that the methods we use in
this paper to identify low-income students are not available to college
admissions staff.) (10)
In this paper, we show that--viewed one way--the admissions staff
are correct. The pool of high-achieving, low-income students who apply
to selective colleges is small: for every high-achieving, low-income
student who applies, there are from 8 to 15 high-achieving, high-income
students who apply. Viewed another way, however, the admissions staff
are too pessimistic: the vast majority of high-achieving, low-income
students do not apply to any selective college. There are, in fact, only
about 2 high-achieving, high-income students for every high-achieving,
low-income student in the population. The problem is that most
high-achieving, low-income students do not apply to any selective
college, so they are invisible to admissions staff. Moreover, we will
show that they are unlikely to come to the attention of admissions staff
through traditional recruiting channels.
II. Data Sources and Identifying High-Achieving, Low-Income
Students
We attempt to identify the vast majority of U.S. students who are
very high achieving. Specifically, we are interested in students who are
well prepared for college and who would be very likely to be admitted to
the majority of selective institutions (if they applied). Thus, as
mentioned above, we choose students whose college assessment scores
place them in the top 10 percent of test takers based on either the SAT
I (combined math and verbal) or the ACT (comprehensive). (11) Since only
about 40 percent of U.S. secondary school students take a college
assessment, these students are in the top 4 percent of U.S. students. We
include in our target group only those students who self-report a grade
point average of A- or higher in high school. In practice, this
criterion for inclusion hardly matters once we condition on having test
scores in the top 10 percent. (12)
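The step from "top 10 percent of test takers" to "top 4 percent of U.S. students" is a simple product of the two shares given above. The short sketch below only restates that arithmetic; the variable names are illustrative, and the calculation assumes (as the text appears to) that students who take neither test would not have scored in the top 10 percent of test takers.

```python
# Shares taken from the text; the variable names are illustrative only.
share_taking_assessment = 0.40    # about 40 percent take the ACT or SAT I
top_share_of_test_takers = 0.10   # top 10 percent of test takers

# Assumption: non-test-takers would not have placed in the top 10 percent
# of test takers, so the two shares can simply be multiplied.
share_of_all_students = share_taking_assessment * top_share_of_test_takers

print(f"{share_of_all_students:.0%} of all secondary school students")
```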
Our key data come from the College Board and ACT, both of which
supplied us with student-level data on everyone in the high school
graduating class of 2008 who took either the ACT or the SAT I. (13)
Apart from students' test score history, these data sets contain
students' high school identifiers, self-reported grades, race and
ethnicity, and sex. Validation exercises have shown that students
self-report their grades quite accurately to the College Board and ACT
(with just a hint of upward bias), probably because students perceive
the organizations as playing a semiofficial role in the college
application process (Freeberg 1988). The data also contain answers to
numerous questions about students' high school activities and their
plans for college.
Importantly, the College Board and ACT data contain a full list of
the colleges to which students have sent their test scores. Except in
rare circumstances, a student cannot complete an application to a
selective college without having the College Board or ACT send his or
her verified test scores to the college. Thus, score sending is
necessary but not sufficient for a completed application. Put another
way, score sending may exaggerate but cannot understate the set of
selective colleges to which a student applies. Past studies have found
that score sending corresponds closely with actual applications to
selective colleges (Card and Krueger 2005, Avery and Hoxby 2004).
Students who are admitted under an Early Decision or Early Action
program often do not apply to colleges other than the one that admitted
them early. However, such students typically send scores to all of the
schools to which they would have applied had the Early school not
admitted them (Avery, Glickman, Hoxby, and Metrick 2013). Thus, it is
somewhat better for our purposes to observe score sending than actual
applications: score sending more accurately reveals the set of selective
colleges to which the student would have applied. Note, however, that as
most 2-year colleges and some nonselective colleges do not require
verified ACT or SAT I scores, we do not assume that a student who sends
no scores is applying to no postsecondary institutions. Rather, that
student is applying to no selective institution.
For some of our analyses, we need to know where students actually
enrolled and whether they are on track to attain a degree on time (June
2012 for baccalaureate degrees for the class of 2008). We therefore
match students to their records at the National Student Clearinghouse,
which tracks enrollment and degree receipt. We match all low-income high
achievers and a 25 percent random sample of high-income high achievers.
We do not match all students for reasons of cost.
The addresses in the data are geocoded for us at the Census block
level, the smallest level of Census geography (22 households on
average). We match each student to a rich description of his or her
neighborhood. The neighborhood's racial composition, sex
composition, age composition, and population density are matched at the
block level. Numerous sociodemographic variables are matched at the
block group level (556 households on average): several moments of the
family income distribution, adults' educational attainment,
employment, the occupational distribution, several moments of the house
value distribution, and so on. We also merge in income data from the
Internal Revenue Service (IRS) at the ZIP code level.
In addition to these data on the graduating class of 2008, we have
parallel data for previous cohorts of students who took an ACT or a
College Board test. (We have one previous cohort for the ACT and more
than 10 previous cohorts for the College Board tests.) We use the
previous cohort data in a few ways that will become clear below.
We create a profile of every high school, public and private, in
the U.S., using administrative data on enrollment, graduates, basic
school characteristics, and sociodemographics. The sources are the
Common Core of Data (United States Department of Education 2009a) and the
Private School Survey (United States Department of Education 2009b). By
summarizing our previous cohort data at the high school level, we also
create profiles for each school of their students' usual test
scores, application behavior, and college plans. For instance, we know
how many students from the high school typically apply to each selective
college or to any given group of selective colleges. Finally, we add
high schools' test scores, at the subgroup level, for each
state's statewide test mandated by the No Child Left Behind Act of
2001. These scores are all standardized to have a zero mean and a
standard deviation of 1.
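As an illustration of the standardization just described, the sketch below rescales a handful of scores to have a zero mean and a standard deviation of 1. The scores are invented for the example, and the choice of reference distribution (here, simply the scores in hand) is an assumption; the text does not say over which population each state's scores are standardized.

```python
import numpy as np

# Hypothetical scale scores on one state's test (invented for illustration).
raw_scores = np.array([310.0, 325.0, 340.0, 355.0, 370.0])

# Subtract the mean and divide by the population standard deviation,
# so that the result has mean 0 and standard deviation 1.
standardized = (raw_scores - raw_scores.mean()) / raw_scores.std()

print(standardized)
```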
We estimate a student's family income rather than rely on the
student's self-reported family income. We do this for a few
reasons. First, both the College Board's and the ACT's family
income questions provide a series of somewhat wide income
"bins" as potential answers. Second, although the College
Board's questionnaire appears to elicit unbiased self-reports of
family income, students make substantial unsystematic mistakes when
their data are compared to their verified data used in financial aid
calculations (the CSS Profile data). Third, about 62 percent of students
simply do not answer the College Board's family income question.
Fourth, although the ACT's questionnaire elicits a high response
rate, its question refers to the fact that colleges offer more generous
financial aid to students with lower family incomes. This framing
apparently induces students to underestimate their family incomes: we
find that students often report family incomes that are lower than the
10th percentile of family income in their Census block group.
We predict students' family income using all the data we have
on previous cohorts of College Board students, matched to their CSS
Profile records (data used by financial aid officers to compute grants
and loans). That is, using previous cohorts, we regress family income, as
recorded in accurate administrative data, on all of our Census variables,
the IRS income variables, the high school profile variables, and the
student's own race and ethnicity. In practice, the income variables
from the Census have the most explanatory power. Our goal is simply to
maximize explanatory power, and many of the variables we include are
somewhat multicollinear. We choose predicted income cutoffs to minimize
Type I error (false positives) in declaring a student to be low-income.
Specifically, we choose cutoffs such that, in previous cohorts, only 8
percent of students who are not actually in the bottom quartile of the
income distribution are predicted to be low-income. We recognize that by
minimizing Type I error, we expand Type II error, but it is less
worrisome for our exercise if we mistakenly classify a low-income
student as middle-income than if we do the reverse. This is because we
wish to characterize the college-going behavior of students who are
low-income. Since we also find that there are more high-achieving,
low-income students than college admissions staff typically believe, we
make decisions that will understate rather than overstate the
low-income, high-achieving population.
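The logic of this procedure can be sketched in a few lines. What follows is a minimal illustration under stated assumptions, not our actual specification: the data are synthetic, a single hypothetical covariate (block_group_income) stands in for the full set of Census, IRS, and high school variables, and the regression is ordinary least squares in levels. The cutoff on predicted income is set so that about 8 percent of students who are not truly in the bottom quartile are predicted to be low-income.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic stand-in for a previous cohort (NOT the actual data).
n = 5000
block_group_income = rng.lognormal(mean=11.0, sigma=0.5, size=n)  # hypothetical covariate
family_income = block_group_income * rng.lognormal(mean=0.0, sigma=0.4, size=n)

# Step 1: regress verified family income on the covariate (OLS in levels).
X = np.column_stack([np.ones(n), block_group_income])
coef, *_ = np.linalg.lstsq(X, family_income, rcond=None)
predicted_income = X @ coef

# Step 2: flag students who are truly in the bottom quartile of this cohort.
truly_low = family_income <= np.quantile(family_income, 0.25)

# Step 3: pick the predicted-income cutoff so that about 8 percent of the
# students who are NOT truly low-income fall at or below it (Type I error).
cutoff = np.quantile(predicted_income[~truly_low], 0.08)
predicted_low = predicted_income <= cutoff

false_positive_rate = predicted_low[~truly_low].mean()  # targeted at 0.08
miss_rate = (~predicted_low[truly_low]).mean()          # Type II error, not targeted

print(f"false positives: {false_positive_rate:.3f}, misses: {miss_rate:.3f}")
```

Holding the false positive rate fixed in this way leaves the miss rate to be whatever the predictive power of the regression allows, which is the trade-off described above.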
More generally, it is not important for our exercise that our
measure of income be precise. What matters for our exercise is that the
students we analyze are, in fact, capable of gaining admission at
selective colleges--at which time the college's financial aid
policies will be implemented. We are confident that the students we
analyze are capable of being admitted because we are using the same
score data and similar grade data to what the colleges themselves use.
Also, we show later that we can accurately predict the colleges at which
students enroll, conditioning on the colleges to which they applied. We
would not be able to make such accurate predictions if we lacked
important achievement and other data that colleges use in their
admissions processes.
Hereafter, we describe as low-income any student whose estimated
family income is at or below the cutoff for the bottom quartile of the
2008 distribution of incomes among families who had a child in his or
her senior year of high school: $41,472. (14) We describe as high-income
any student whose estimated family income is at or above the cutoff for
the top quartile of the same distribution: $120,776. See figure 1 for
other percentiles.
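Stated as a rule, the classification looks like the sketch below. The two cutoffs are the ones given above; the function name and the "middle" label for students between the cutoffs are illustrative choices rather than terms defined in this paper.

```python
LOW_INCOME_CUTOFF = 41_472     # bottom-quartile cutoff, from the text
HIGH_INCOME_CUTOFF = 120_776   # top-quartile cutoff, from the text


def income_group(estimated_family_income: float) -> str:
    """Classify a student by estimated (not self-reported) family income."""
    if estimated_family_income <= LOW_INCOME_CUTOFF:
        return "low-income"
    if estimated_family_income >= HIGH_INCOME_CUTOFF:
        return "high-income"
    return "middle"  # illustrative label for the second and third quartiles


for income in (35_185, 80_000, 150_000):
    print(income, income_group(income))
```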
III. A Portrait of High-Achieving Students in the U.S.
Who and where are the high-achieving students in the United States?
In this section, we briefly characterize them, leaving more detailed
analysis of the low-income, high-achieving group for later.
Figure 2 shows that 34 percent of high achievers have estimated
family income in the top quartile and 27 percent have estimated family
income in the third quartile. That is, high-income families are
overrepresented in the high-achieving population. However, 22 percent
and 17 percent of high achievers have estimated family incomes in,
respectively, the second and bottom quartiles. We estimate that there
are at least 25,000 and probably about 35,000 low-income high achievers
in each cohort in the United States. (15)
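The quartile shares reported here are the basis for the statement in section I that the population contains only about 2 high-achieving, high-income students for every high-achieving, low-income student. The sketch below simply redoes that division; the dictionary and variable names are illustrative.

```python
# Percent of high achievers in each estimated-income quartile (figure 2).
quartile_shares = {"top": 34, "third": 27, "second": 22, "bottom": 17}

# The four shares should account for all high achievers.
assert sum(quartile_shares.values()) == 100

# High-income high achievers per low-income high achiever in the population.
population_ratio = quartile_shares["top"] / quartile_shares["bottom"]

print(population_ratio)
```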
Table 2 shows that among high achievers, those who are from
higher-income families do have slightly higher college assessment
scores, but the difference is small. The average low-income high
achiever scores at the 94.1th percentile. The average high-income high
achiever scores at the 95.7th percentile.
Data on the parental education of high achievers are unfortunately
very incomplete, because ACT takers are not asked to report their
parents' education, and 52 percent of SAT I takers fail to answer
the question about their parents' education. Moreover, SAT I takers
are apparently less likely to report their parents' education when
it is low. We base this assessment on the observation that parents'
education is more likely to be missing for students who live in Census
block groups with low adult education. For what they are worth, however,
the data on the parents' education are shown in figure 3. (16) More
precisely, we show the greater of the father's reported educational
attainment and the mother's reported educational attainment. Of
students who report their parents' education, 50.7 percent say that
at least one parent has a graduate degree, 27.9 percent say that at
least one parent has a baccalaureate degree, and another 6 percent cite
"some graduate school" (but no degree); 11.6 percent claim
that at least one parent has an associate's degree or "some
college or trade school" (but no degree), and only 3.8 percent
report neither parent having more than a high school diploma. Perhaps
the most interesting thing about the parents' education data is
that they seem to indicate that high achievers are reluctant to report
that they have poorly educated parents. This is in contrast to the
family income data from the same College Board questionnaire. Many
students did not answer the income question, but those who did answered
it in an unbiased (albeit fairly inaccurate) way.
Figure 4 displays information on high achievers' race and
ethnicity, which 98 percent of students voluntarily report on the ACT or
the College Board questionnaire. Of all high achievers, 75.8 percent say
that they are white non-Hispanic, and another 15.0 percent say that they
are Asian. The remaining 9.2 percent of high achievers are associated
with an underrepresented minority, (17) either Hispanic (4.7 percent),
black non-Hispanic (1.5 percent), Native American (0.4 percent), or
mixed race/ethnicity (2.6 percent). If we focus on low-income high
achievers only (figure 5), we see that 15.4 percent are underrepresented
minorities. Interestingly, the entire increase in this share comes out
of the percentage who are white. Asians make up 15.2 percent of
low-income high achievers, almost identical to their share of all high
achievers.
A key takeaway from figure 5 is that a student's being an
underrepresented minority is not a good proxy for his or her being
low-income. Thus, if a college wants its student body to exhibit income
diversity commensurate with the income diversity among high achievers,
it cannot possibly attain this goal simply by recruiting students who
are underrepresented minorities. If admissions staff do most of their
outreach to low-income students by visiting schools that are largely
Hispanic and black, the staff should realize that this strategy may lead
to a student body that is diverse on specific racial and ethnic
dimensions but that is not diverse in terms of family income.
[FIGURE 6 OMITTED]
Figure 6 is a choropleth map showing the number of high-achieving
students in each county of the United States. Counties are an imperfect
unit of observation because some are large in land area and some are
small. Nevertheless, they are the most consistent political units in the
United States. (18) The darker the county's coloring, the more
high-achieving students it contains. What the map demonstrates is that
critical masses of high-achieving students are most likely to be found
in the urban counties in southern New England (Massachusetts,
Connecticut, Rhode Island), the Mid-Atlantic (New York, New Jersey,
eastern Pennsylvania), southern Florida, and coastal California from the
Bay Area to San Diego. The other critical masses are more scattered, but
a person familiar with U.S. geography can pick out Chicago (especially),
Houston, Dallas-Fort Worth, Atlanta, and some smaller cities. In short,
if one's goal were to visit every county where one could gather at
least 100 high achievers, one could concentrate entirely on a limited
number of cities on the East and West Coasts and a few cities in
between.
[FIGURE 7 OMITTED]
This geographic concentration arises partly because
high-income, highly educated parents are somewhat concentrated in the
aforementioned areas, and such parents, as we have shown, are somewhat
more likely to have high-achieving children. However, part of the
concentration is due purely to population density. That is, even if
children in all counties were equally likely to be high-achieving, there
would still be critical masses of them in densely populated counties,
and vice versa. The choropleth map in figure 7 illustrates the role of
population density by showing the number of high-achieving students per
17-year-old in each county. The darker a county is, the higher its
decile on this relative measure. The map makes it clear that this
relative measure is far less concentrated than the absolute measure that
favors densely populated counties. In fact, one can see a belt of
counties that tend to produce high achievers running from Minnesota and
the Dakotas south through Missouri and Kansas. A good number of counties
in Appalachia, Indiana, and the West outside of coastal California also
tend to produce high achievers. In short, if one's goal were to
meet a nationally representative sample of high achievers, one's
trip could not be concentrated on a limited number of counties on the
coasts and a few cities in between.
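The contrast between the absolute measure of figure 6 and the relative measure of figure 7 can be sketched as follows. This is only an illustration: the function name, the county labels, and the counts are made up, and the tie-handling and decile cutoffs used for the actual map are not stated in the text.

```python
def per_capita_deciles(high_achievers, seventeen_year_olds):
    """Return each county's decile (1 = lowest, 10 = highest) on the
    high-achievers-per-17-year-old measure. Ties are broken arbitrarily."""
    rates = {
        county: high_achievers[county] / seventeen_year_olds[county]
        for county in high_achievers
        if seventeen_year_olds.get(county, 0) > 0
    }
    ordered = sorted(rates, key=rates.get)  # ascending by rate
    n = len(ordered)
    return {county: (rank * 10) // n + 1 for rank, county in enumerate(ordered)}


# Hypothetical counties: "dense" has far more high achievers in absolute
# terms, but "rural" has more of them per 17-year-old.
counts = {"dense": 400, "rural": 12}
cohort = {"dense": 40000, "rural": 600}
deciles = per_capita_deciles(counts, cohort)
```

In this made-up example the densely populated county dominates on the count but ranks below the rural county on the per-capita measure, which is the pattern the two maps are meant to separate.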
IV. College Applications, Enrollment, and Degree Receipt among
High-Achieving Students in the U.S.
In this section, we analyze the college application choices,
enrollment decisions, and on-time degree receipt of high-achieving
students in the United States, paying attention to how low-income
students' behavior does or does not differ from that of high-income
students. Because colleges in the United States are so varied and large
in number, we characterize them by the college assessment score of their
median student, expressed as a percentile of the national college
assessment test score distribution. This statistic, although admittedly
insufficient to describe colleges fully, has important qualities. First,
it is probably the single best, simple indicator of selectivity--much
better than a college's admissions rate, for instance (Avery,
Glickman, Hoxby, and Metrick, 2013). Second, when an expert college
counselor advises students on how to choose a portfolio of schools to
which to apply, he or she usually tells students to apply to a few
schools that are a "reach," four or more schools that are
"peer" or "match," and one or more schools that are
"safe." Similar advice is widely available on the Internet
sites of college advising organizations with a strong reputation,
including the College Board and the ACT. Expert college counselors use
schools' median test scores to define "reach" schools
(typically, those whose median score is more than 5 percentiles above
the student's own), "peer" schools (typically, those
where the school's median score is within 5 percentiles of the
student's own), and "safety" schools (typically, those
whose median score is 5 to 15 percentiles below the student's own).
(19) Naturally, the exact cutoffs for these categories vary from expert
to expert, and high-achieving students are often advised to apply to
their state's public flagship university, even if it falls below
the safety zone. (20) High-achieving students are generally advised to
apply to at least eight schools.
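The counselors' rule of thumb can be written as a small classifier. This is a sketch under the "typical" cutoffs quoted above; the function name and the "below safety" label are ours, the treatment of gaps exactly at the boundaries is an assumption, and, as noted, experts differ on the exact cutoffs.

```python
def classify_college(college_median_pct, student_pct):
    """Label a college relative to a student, using the typical cutoffs
    described in the text (percentile gaps of 5 and 15)."""
    gap = college_median_pct - student_pct
    if gap > 5:
        return "reach"          # median more than 5 percentiles above the student
    if gap >= -5:
        return "peer"           # median within 5 percentiles of the student
    if gap >= -15:
        return "safety"         # median 5 to 15 percentiles below the student
    return "below safety"       # e.g. many state flagships for a high achiever


# A student at the 92nd percentile considering four hypothetical colleges.
labels = [classify_college(m, 92) for m in (99, 90, 80, 50)]
```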
IV. A. College Application Behavior: A Graphical Analysis
In this subsection, we provide graphical evidence of what
students' application portfolios look like. This presentation is
somewhat informal but useful for fixing ideas and defining categories
before we move to the formal econometric analysis in the next
subsection. In what follows, an "application" is defined as
sending a test score to a college. (21)
Figure 8 is a histogram of the application portfolios of
high-income students. It is important to understand how this and
subsequent figures are constructed. On the horizontal axis is the
difference between the applied-to college's median test score and
the student's own score, in percentiles. Thus, if an application is
located at zero, the student is applying to a peer school whose median
student has exactly the same score. An application at, say, +8 is a
reach, and an application at, say, -13 is a safety. Since nonselective
colleges do not require their students to take college assessments (and
thus do not report a median student score), an application to a
nonselective school is placed at -94, which is zero minus the average
percentile score of high-achieving students in the data. It is not
obvious where to place applications to nonselective schools, but -94 has
the advantage that such applications cannot be mistaken for applications
to a school that is selective but that sets a very low bar.
Each student is given a weight of 1 in the histogram, and this
weight is split evenly over that student's applications. This is to
ensure that the histogram does not overrepresent the behavior of
students who apply to more schools, since, after all, each student will
enroll at just one (initially at least). Thus, if a student puts all of
his or her eggs in one basket and applies to a single +8 school, that
student's full weight of 1 will show up in the +8 bar. If a student
applies to one +8 school, one +6 school, one +4 school, and so on down
to one -8 school, one-ninth of that student's weight will show up in
each of the relevant bars. Note that each bar is 2 percentiles wide.
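The weighting scheme can be made concrete with a short sketch. It assumes each application is recorded as the percentile gap between the college's median score and the student's own score, with `None` standing in for a nonselective college; the function name, that encoding, and the choice to label each 2-percentile bar by its left edge are our assumptions, not details given in the text.

```python
import math
from collections import defaultdict

NONSELECTIVE_POSITION = -94  # placement of nonselective colleges, per the text


def portfolio_histogram(portfolios, bar_width=2):
    """portfolios: one list of percentile gaps per student (None = nonselective).
    Each student contributes a total weight of 1, split evenly over his or
    her applications."""
    bars = defaultdict(float)
    for gaps in portfolios:
        if not gaps:
            continue  # a student with no applications adds nothing
        share = 1.0 / len(gaps)
        for gap in gaps:
            position = NONSELECTIVE_POSITION if gap is None else gap
            left_edge = math.floor(position / bar_width) * bar_width
            bars[left_edge] += share
    return dict(bars)


# The two students described in the text: one applies to a single +8 school,
# the other to nine schools at +8, +6, ..., -8.
hist = portfolio_histogram([[8], list(range(8, -10, -2))])
```

Here the +8 bar receives 1 from the first student and one-ninth from the second, and the bars sum to the number of students rather than the number of applications.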
[FIGURE 8 OMITTED]
Figure 8 shows that high-income students largely follow the advice
of expert counselors. The bulk of their applications are made to peer
schools. They apply to some reach schools as well, but they are
mechanically limited in the extent to which they can do this: there are
no reach schools for slightly more than half of the high-achieving
students we study. (22) High-income high achievers also apply fairly
frequently to safety schools. Although not shown in the figure, it is
noteworthy that many such students apply to their state's flagship
university. These schools vary greatly in selectivity, so that some such
applications are in the safe range, but other applications to flagships
appear far more safe than anyone would think necessary. For instance, an
application by a high achiever to a flagship with a median score at the
50th percentile would end up at -40 to -50. Nevertheless, applying to
these schools may be well-advised (see note 20).
[FIGURE 9 OMITTED]
The reader might be surprised to find that high-achieving,
high-income students apply to some colleges that are nonselective on
academic grounds. However, the schools in question are often specialty
schools: music conservatories, art or design schools, drama or
performing arts schools, cooking schools, and so on. Some of these are
highly selective on nonacademic dimensions.
Figure 9 shows that unlike the high-income high achievers, few
low-income high achievers follow the advice of expert counselors. More
than 40 percent of the mass in the histogram loads on nonselective
schools. (This is an underestimate because scores are not sent to some
nonselective schools. If we included every nonselective enrollment as a
nonselective application, the nonselective bar on the histogram would
rise by 5.1 percentage points.) (23) Moreover, the nonselective colleges
to which low-income students apply are rarely of the specialty type
mentioned above. They are often local community colleges or local 4-year
institutions with meager resources per student and low graduation rates.
Much of the height of the nonselective bar is due to the fact that many
low-income high achievers apply only to nonselective colleges, or to a
nonselective college and a barely selective college.
Figure 10 overlays the histograms for low-income, middle-income,
and high-income students who are high-achieving. It cuts off the portion
of the histogram that shows nonselective colleges so as to focus on
application choices among colleges that are selective to at least some
degree. The behavior of the middle-income
students (those from families in the two middle quartiles of the family
income distribution) is about midway between that of their low- and
high-income counterparts. Moreover, even within the subset of
applications that are made to selective colleges, high-income students
apply much more to peer colleges, and low-income students apply much
more to colleges far below the safety level.
Figure 11 contains four panels. The top left-hand panel shows, for
all high-achieving, low-income students, the histogram of the most
selective college to which each student applied. The top right-hand
panel shows the same histogram for high-achieving, high-income students.
The bottom left-hand panel shows the histogram for the second most
selective college to which a low-income student applied (or the most
selective, for students who applied to a single college). The bottom
right-hand panel shows the same histogram for high-income students.
These histograms reveal that the vast majority of high-income high
achievers' most selective applications fall within 10 percentiles
of their test scores. Their second most selective applications are sent
to less competitive, but not much less competitive, schools: the vast
majority fall between +10 and -15 percentiles. In contrast, low-income
high achievers send their most selective applications to the entire
range of colleges: nonselective and -60 to +10. Their second most
selective applications are, again, to less competitive (but not
necessarily much less competitive) schools. All of this suggests that
there may be two distinguishable types of low-income high achievers:
those who apply much as their high-income counterparts do, and those who
apply in a manner that is very different.
[FIGURE 10 OMITTED]
In fact, 53 percent of low-income high achievers fit the profile we
will hereafter describe as income-typical: they apply to no school whose
median score is within 15 percentiles of their own, and they do apply to
at least one nonselective college. At the other extreme, 8 percent of
low-income high achievers apply in a manner that is similar to what is
recommended and to what their high-income counterparts do: they apply to
at least one peer college, at least one safety college with a median
score not more than 15 percentiles lower than their own, and apply to no
nonselective colleges. We hereafter designate such students as
achievement-typical, noting that once a student fits the above criteria,
he or she usually applies to several peer colleges, much as high-income
students do.
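The two definitions above can be expressed as a classification rule. The sketch below is an approximation: it borrows the typical 5-percentile peer cutoff from the counselors' advice described earlier, reads "within 15 percentiles" as an absolute gap, and uses `None` for a nonselective college. The function name and these encoding choices are ours, and the exact operational definitions behind the 53 and 8 percent figures may differ.

```python
def classify_portfolio(gaps):
    """gaps: percentile gaps (college median minus student score) for each
    application, with None marking a nonselective college."""
    selective = [g for g in gaps if g is not None]
    has_nonselective = any(g is None for g in gaps)
    near_own_level = any(abs(g) <= 15 for g in selective)
    has_peer = any(abs(g) <= 5 for g in selective)
    has_safety = any(-15 <= g < -5 for g in selective)

    if has_nonselective and not near_own_level:
        return "income-typical"
    if has_peer and has_safety and not has_nonselective:
        return "achievement-typical"
    return "other"


examples = {
    "local nonselective plus a far-below-safety school": classify_portfolio([None, -40]),
    "peers plus a safety": classify_portfolio([0, 2, -10]),
    "local nonselective plus one peer": classify_portfolio([None, 3]),
}
```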
[FIGURE 11 OMITTED]
The remaining 39 percent of low-income high achievers use
application strategies that an expert would probably regard as odd. For
instance, some students apply only to one local nonselective
college and one extremely selective and well-known college--Harvard, for
instance. No expert would advise such a strategy because the probability
of getting into an extremely selective, well-known college is low if a
student applies to just one--even if the student's test scores and
grades are typical of the college's students. Moreover, such a
strategy reveals that the student is interested in extremely selective
institutions yet is not applying to the other schools that are, for most
purposes, indistinguishable from the one to which he or she applied.
Another strategy that appears is a student applying to a single public
college in his or her state that is selective but is much less selective
than the state's flagship university. Although about half of these
application choices could be motivated by distance from home, the other
half cannot because the flagship university is nearer. Another strategy
that falls into the idiosyncratic category is a student applying to a
single private college outside his or her state that is selective, but
much less selective and much poorer in resources than the student's
private peer colleges would be. Such choices are odd because although
the private peer colleges might offer fewer scholarships that are
explicitly merit-based, they offer much more generous need-based aid, so
that the student would pay less to attend and would enjoy substantially
more resources. Furthermore, it is almost never sensible for a
low-income student to apply to a single private, selective college: such
a student can use competing aid offers to improve the aid package at his
or her most preferred college.
We have described a few salient strategies that appear among
low-income high achievers who are neither achievement-typical nor
income-typical. However, most of these students' portfolios do not
evince any pattern that can be readily described. Thus, below we turn to
an econometric analysis, in which we can simultaneously consider a large
number of factors correlated with students' application choices.
IV. B. College Application Behavior: An Econometric Analysis
In this subsection we assess the factors that are associated with a
student's choice of his or her application portfolio, using a
conditional logit model in which a student can apply to all colleges in
the United States but decides to apply only to some. This model is based
on a random utility framework and assumes that the student prefers all
colleges to which he or she applies over the colleges to which he or she
does not apply. We do not assume anything about the student's
preference ordering within the colleges to which he or she applied. (24)
Each possible college matched with each student is an observation, and
the dependent variable is a binary variable equal to 1 if the student
submits an application to the college and zero otherwise.
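The data layout implied by this setup, one observation per student-college pair with a binary outcome, can be sketched as follows. The identifiers and the helper name are hypothetical; the sketch shows only the shape of the estimation sample, not the estimator itself.

```python
from itertools import product


def build_observations(applications, colleges):
    """applications: dict mapping a student id to the set of college ids to
    which that student sent scores. Returns one row per student-college pair."""
    rows = []
    for student, college in product(sorted(applications), sorted(colleges)):
        rows.append({
            "student": student,
            "college": college,
            # 1 if the student applied to this college, 0 otherwise
            "applied": 1 if college in applications[student] else 0,
        })
    return rows


# Two hypothetical students and three hypothetical colleges.
applications = {"s1": {"A", "B"}, "s2": {"C"}}
colleges = {"A", "B", "C"}
rows = build_observations(applications, colleges)
```

With every college in the country in the choice set, the number of rows is the number of students times the number of colleges, and most outcomes are zero.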
The explanatory variables we consider are the difference between a
school's median test score and the student's own test score if
positive, the same difference if negative, (25) an indicator for the
school's being nonselective, the distance between the
student's home and the school, the square of this distance, an
indicator for the school being the most proximate, an indicator for the
school being public, an indicator for the school being in-state for the
student, an indicator for the school being the flagship university of
the student's state of residence, the sticker price of the college,
the likely net cost of the college for the student, and the
student-oriented resources per student at the college. We fully interact
these explanatory variables with indicators for the student being
low-income, high-income, or in between. Thus, we estimate separate
coefficients for each income group. In the tables we do not show the
coefficients for the middle-income group because they nearly always fall
between those of the high- and low-income students, but the coefficients
are available upon request.
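Two of the constructions above, splitting the score difference into a positive and a negative part and interacting every covariate with income-group indicators, are illustrated below. The variable names are ours. Entering the negative part with its sign (rather than as an absolute value) and setting both parts to zero for nonselective schools are assumptions, since the text does not specify either.

```python
def score_gap_terms(college_median, student_score):
    """Split the score difference so that reaching and playing safe can
    carry different coefficients. college_median is None if nonselective."""
    if college_median is None:
        return {"gap_positive": 0.0, "gap_negative": 0.0, "nonselective": 1}
    gap = college_median - student_score
    return {
        "gap_positive": max(gap, 0.0),   # nonzero only for reach-side schools
        "gap_negative": min(gap, 0.0),   # nonzero only for safety-side schools
        "nonselective": 0,
    }


def interact_with_income(terms, income_group, groups=("low", "middle", "high")):
    """Full interaction: each covariate gets one column per income group,
    which is nonzero only for the student's own group."""
    return {
        f"{name}_x_{group}": (value if group == income_group else 0)
        for group in groups
        for name, value in terms.items()
    }


# A low-income student at the 92nd percentile and a college whose median
# student is at the 80th percentile.
columns = interact_with_income(score_gap_terms(80, 92), "low")
```

Because each covariate appears once per group, estimating this specification yields a separate coefficient for each income group, as described above.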
Table 3 shows the results of this estimation. The coefficients are
expressed as odds ratios so that a coefficient greater than 1 means that
an increase in the covariate is associated with an increase in the
probability that the student applies to the school, all other covariates
held constant. Based on our graphical analysis, we expect to find very
different coefficients for low- and high-income students, and we do.
(26) Note that, although it is convenient to describe the coefficients
as though they literally revealed preference, they should not be given
such a strong interpretation or a causal interpretation. For instance,
students might "disfavor" distance not because distance itself
generates negative utility but because distant schools have, say,
distinct cultures that the student dislikes.
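For readers less familiar with odds ratios, the arithmetic can be sketched as follows. The numbers are hypothetical and are not taken from table 3, and treating the application decision as a simple binary odds is a simplification of the conditional logit setting.

```python
import math


def odds_ratio(coefficient):
    """Convert a logit coefficient into an odds ratio."""
    return math.exp(coefficient)


def shift_probability(baseline_probability, ratio):
    """Apply an odds ratio to a baseline probability, holding all other
    covariates constant, and return the implied new probability."""
    odds = baseline_probability / (1.0 - baseline_probability)
    new_odds = odds * ratio
    return new_odds / (1.0 + new_odds)


# A coefficient of zero is an odds ratio of 1 (no association); an odds
# ratio of 2 moves a hypothetical 20 percent probability to one-third.
no_effect = odds_ratio(0.0)
doubled = shift_probability(0.2, 2.0)
```

An odds ratio below 1 works the same way in reverse: the covariate is associated with a lower probability of applying.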
We find that high-income students strongly favor reach colleges and
disfavor safety colleges (those for which the score difference is
negative). Per percentile of difference, this effect is much stronger on
the reach side than on the safety side, but recall that high-achieving
students can only reach a bit whereas they can apply to very safe
schools. High-income students strongly dislike nonselective
institutions. They also dislike higher net costs but (all else equal)
like higher sticker prices. This is probably because higher sticker
prices are associated with higher per-student resources, a
characteristic they also like. High-income students dislike distance,
but the quadratic term indicates that they dislike it only up to a
point, after which they are fairly indifferent. They have a mild
preference for in-state schools and their state's flagship
university. They do not have a statistically significant preference for
publicly controlled schools.
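The "only up to a point" reading of the quadratic term can be illustrated with a short derivation. The symbols below are illustrative notation of ours, not values or notation from table 3; in odds-ratio terms, the pattern corresponds to a ratio below 1 on distance and a ratio above 1 on its square.

```latex
% Illustrative notation only: d is distance, and \beta_1, \beta_2 are the
% (log-odds) coefficients on distance and distance squared.
\[
  g(d) = \beta_1 d + \beta_2 d^2, \qquad \beta_1 < 0,\ \beta_2 > 0 .
\]
% The marginal effect of an extra mile shrinks toward zero as d grows,
% and vanishes at d^{*}:
\[
  \frac{\partial g}{\partial d} = \beta_1 + 2\beta_2 d = 0
  \quad\Longleftrightarrow\quad
  d^{*} = -\frac{\beta_1}{2\beta_2} .
\]
```

Near the distance d* an additional mile has roughly no association with the probability of applying, which is the sense in which students are "fairly indifferent" beyond a point.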
The low-income students exhibit several immediate contrasts. Such
students strongly favor nonselective colleges. This was obvious in the
graphical evidence. They do not disfavor schools whose median scores are
lower than theirs. They slightly disfavor schools with higher sticker
prices (recall that these were favored by high-income students) and do
not have a preference for net costs that is statistically significantly
different from zero. Low-income students do favor schools with higher
expenditure per student, but not nearly as much as high-income students
do. Distance is strongly disfavored for schools within 100 miles but,
thereafter, low-income students are fairly indifferent to it. Low-income
students favor in-state schools somewhat more than high-income students
do, but low-income students do not exhibit a preference in favor of
their state's flagship university. They slightly favor publicly
controlled colleges.
Table 4 repeats the estimation but interacts the covariates with
indicators for high-income students, middle-income students, low-income
achievement-typical students, low-income income-typical students, and
other low-income students. The estimated coefficients for
achievement-typical students are fairly similar to those for high-income
students. It is the income-typical students whose coefficients are
strikingly different. Of course, these results are somewhat by design,
given the way we categorized low-income students into
achievement-typical and income-typical groups. However, the coefficients
validate the categorization: achievement-typical students do pursue
application strategies similar to those of high-income students. In the next
section we assess which factors predict a student being
achievement-typical and which predict a student being income-typical.