Economics of Education.
Hoxby, Caroline M.
Over the past few years, the number of Working Papers issued by
NBER's Economics of Education Program has grown rapidly, with about
five new papers added each month. To cope with the large number of
excellent submissions, a spring program meeting has been added to each
year's events, which already included a fall meeting, a Summer
Institute program, and programs dedicated to special issues. This is all
to say that education continues to be an extremely productive and
exciting area of research in economics. I attribute this to three
phenomena. First, policymakers are actively experimenting with
education-related policies, and this creates a great deal of useful
variation for researchers to analyze. Indeed, there is a virtuous circle between economic analysis and policy innovation because economics is the
inspiration for, or intertwined with, many policies: school choice,
accountability, savings and aid plans for college, incentive pay for
teachers, reducing the barriers to entry into teaching, and so on.
Second, the education program draws upon the talents of economists who
come from a variety of fields, and this makes for an exciting dynamic
owing to the opportunities for arbitrage of ideas and methods across
fields. Third, and by no means least important, is the continued, rapid
rise in the quantity and quality of data available to researchers.
Researchers may differ on the substantive effects of state
accountability laws and the federal No Child Left Behind Act, but no
researcher would deny that these laws have created a deluge of data,
much of which is longitudinal. Because of coincidence, imitation, and
similar causes, researchers' access to rich data on colleges and
foreign schools has also risen dramatically. A few states have even
created "K-20" databases that allow us to track a
student's progress from his pre-kindergarten or kindergarten entry
to his final college course. The promise of such data is immense.
Although program members continue to focus most of their research on the
United States, they are increasingly taking advantage of foreign
countries' data and willingness to conduct policy experiments. As a
result, many of the methodological advances that are launched on U.S.
data spread quickly to research around the world.
All program reviews are necessarily selective and this one is no
exception. Three major themes in recent work deserve special mention:
the effects of teachers, peer effects, and the complexity of college
students' choices. Toward the end of this review, I describe other
themes that are currently receiving less attention but are likely to
emerge as absorbing topics soon.
Teachers
It is a commonplace that teachers matter, perhaps because nearly
everyone can remember a teacher or teachers who strongly influenced his
life. Thus, economists' inability to find consistent empirical
evidence to support the idea that teachers matter has been a substantial
puzzle. For years, most studies of teachers' effects depended on
regressing students' achievement on the characteristics of their
teachers: experience, highest degree, certification, and so on. Such
studies often suffered from a selection problem--essentially, more
qualified teachers had a tendency to gravitate to schools that served
students from more privileged backgrounds. (Below, I shall have more to
say on this tendency.) The selection problem caused researchers to
overestimate the effect of teachers' credentials on achievement,
yet still there was no consensus among studies that teachers'
characteristics affected students.
This puzzle has been largely resolved in the past couple of years,
owing to studies that directly estimate teachers' effects on
achievement using longitudinal data. With a generous amount of data, the
method is fairly straightforward: students' achievement is divided,
statistically, into student fixed effects, grade fixed effects, year
fixed effects, and teacher fixed effects. Jonah E. Rockoff has done
seminal work on this topic. (1) While statistical decisions do arise,
most authors uncover fairly large differences in the effects of teachers
who teach the same grade in the same school, use the same materials, and
draw students fairly randomly from the same population. For instance,
estimates often suggest that the best teacher may raise achievement by
as much as half a standard deviation more per year than the worst
teacher who operates in identical circumstances. In other words, we are
not wrong to recall that "teacher X" raised our achievement.
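The decomposition described above can be written compactly. This is only a sketch under assumed notation (the symbols are introduced here for illustration and are not taken from any particular study): let A_{igjt} be the achievement of student i in grade g, taught by teacher j in year t.

```latex
% Illustrative notation only: the symbols are assumptions made for this sketch.
A_{igjt} = \alpha_i + \gamma_g + \delta_t + \theta_j + \varepsilon_{igjt}
% \alpha_i : student fixed effect
% \gamma_g : grade fixed effect
% \delta_t : year fixed effect
% \theta_j : teacher fixed effect
% \varepsilon_{igjt} : residual
```

In this notation, the comparison of the best and worst teachers operating in identical circumstances is a comparison of the largest and smallest estimated \theta_j among teachers in the same school and grade.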
Once researchers have calculated teachers' empirical effects,
these become a powerful dependent variable that can be used to explore
the effects of policy on the teaching workforce. One of the first things researchers did with the computed teachers' effects was investigate
whether they were closely related to the teacher credentials upon which
achievement was traditionally regressed. The answer was generally no:
credentials do not explain teacher effects for the most part. (The
exception is that very inexperienced teachers have worse effects, but
even the effects of increased experience plateau after four to five
years.) This brings us to the most recent work, which examines policy
changes designed to affect teachers. Donald Boyd, Pamela Grossman,
Hamilton Lankford, Susanna Loeb, and James Wyckoff (11844) and Thomas J.
Kane, Rockoff, and Douglas O. Staiger (12155) investigate New York City's recent decision to allow people from a wider variety of
backgrounds to teach: not just people who attain certification through
regular channels, but also people with alternative forms of
certification or temporary teaching licenses ("Teaching
Fellows," Teach for America, international exchange programs, and
so on). Both studies conclude that differences in certification explain
only a small fraction (if any) of the variation in achievement:
differences among teachers with the same certification dwarf the
differences associated with certification. The rather striking
implication of the evidence is that it may make sense for schools to
focus their energy on ex post selection--that is, retaining teachers who
empirically demonstrate good effects in their first few years, and not
retaining others. Kane, Rockoff, and Staiger conclude that, even after
one takes account of the effects that an ex post selection policy would
have on teacher turnover (and, therefore, on inexperience), the evidence
"suggests that selecting high quality teachers at the time of hire
may be difficult ... The large observable differences in teacher
effectiveness ex post suggest that districts should use performance on
the job, rather than initial certification status to improve average
teacher effectiveness."
The estimation of teacher effects and the subsequent finding that
they are largely unrelated to credentials reconcile a good deal of
other evidence and allow a relatively clear picture to emerge. For
instance, I mentioned earlier that regressions of student achievement on
teacher credentials produce inconsistent evidence. In retrospect, it is
easier to see that the studies that suggested that credentials affected
achievement substantially were those that did a poor job of controlling
for teachers' tendency to gravitate toward more advantaged
students. Recent working papers lay out increasingly rich evidence of
this tendency. Eric A. Hanushek, John F. Kain, Daniel M. O'Brien,
and Steven G. Rivkin (11154) and Charles T. Clotfelter, Helen F. Ladd,
and Jacob L. Vigdor (11936) demonstrate that, perhaps because they
experience only trivial wage changes when they switch schools, teachers
who are able to make voluntary switches move to schools where students
are more affluent, higher achieving, and less likely to be minorities.
Boyd, Lankford, and Loeb (9953) show that teachers strongly prefer to
teach where they live. This makes sense if the reduction in child care
costs and the increase in neighborly amenities associated with proximity
outweigh the (usually small) wage gains associated with teaching in a
distant school, especially one located in a difficult neighborhood.
Suppose that a state were, rather, to implement a policy whereby a
teacher would earn a bonus if she taught in a school that served
disadvantaged students. Would anyone respond? Clotfelter, Elizabeth
Glennie, Ladd, and Vigdor (12285) examine a North Carolina program that
offered a modest bonus of $1,800 to certified math, science, and special
education teachers who chose to work in high-poverty or academically
failing secondary schools. Their findings suggest that teachers do
respond, primarily by leaving at a slower than expected rate.
Andrew Leigh ("Teacher Pay and Teacher Aptitude", Spring 2006
program meeting) offers further evidence that teachers respond to higher
pay. Using data on the test scores of everyone admitted to Australian
universities between 1989 and 2003, he shows that a 1 percent rise
in starting-teacher salaries boosts the average aptitude of students
entering teacher education courses by 0.6 percentile ranks. The North
Carolina and Australia studies suggest that pay can be used to change
the pool of prospective teachers available to a school, but this may be
a far less direct way of improving teacher performance than simply
paying teachers more when they raise achievement. Victor Lavy (10622)
examines a pay-for-performance program in Israel, exploiting a natural
experiment in teachers' assignment to the program. He demonstrates
that teachers who experienced incentive-based pay raised their
students' performance on high school exams. Because Florida is
currently implementing a substantial pay-for-performance scheme, we are
likely to learn more about this topic in future years.
Peer Effects
Investigation of peer effects, broadly construed, is perhaps the
single most active area at present within the economics of education.
This is sometimes difficult to explain to policymakers because there are
no policies known as peer effect policies. Instead, understanding how
peer effects function is crucial to analyzing numerous other policies,
including selective college admissions, school tracking, desegregation,
school choice, bilingual education, and even school finance. Put another
way, peer effects are fundamental parameters that, properly estimated,
are needed for numerous other analyses. In the context of education,
economists define peer effects broadly: the effect that any student has
on any other student, regardless of the channel by which the effect
operates. That is, peer effects are not just one student's teaching
another but may include phenomena such as one student's affecting
the way a classroom operates, or a teacher teaches, and thereby
influencing his classmates.
Two problems make estimation of peer effects challenging, and
program members have made significant progress on both fronts. First,
identifying peer effects is difficult because they can be confounded
with numerous forms of selection. Most obviously, students X and Y might
be similar and spend a lot of time together. Are they similar because Y
influences X, or because similar students become friends, or because an
administrator recognized their initial similarity and forced them to
spend time together by making them roommates, putting them in the same
class, and so on? (Other identification problems plague the
estimation of peer effects, but selection is the main one in
practice.) Second, most policies that turn on peer effects implicitly
assume that they are non-linear, yet it is often difficult to find data
or methods with the power to identify non-linear effects. Linear peer
effects are not terribly interesting for policy because they imply that
if one person gains from the reassignment of a peer, there is an equal,
offsetting loss for another person. Thus, no amount of rearranging
peers, as might occur if policymakers were to alter desegregation
programs or college admissions, could produce an outcome that was
unambiguously better for society. In contrast, if peer effects are
non-linear, it is possible that some arrangements of peers are better
for everyone (or are, at least, much better for many people and only a
bit worse for a few people).
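The argument about linear versus non-linear peer effects can be checked with a small numerical sketch. Everything in it is an assumption made for illustration: the six ability values, the 0.5 coefficient, and the square-root peer function are hypothetical and are not estimates from any study. Each student's achievement is modeled as own ability plus a function of classmates' mean ability.

```python
from statistics import mean


def class_outcomes(abilities, peer_term):
    """Each student's achievement: own ability plus a function of classmates' mean ability."""
    results = []
    for i, own in enumerate(abilities):
        peers = abilities[:i] + abilities[i + 1:]  # everyone in the class except student i
        results.append(own + peer_term(mean(peers)))
    return results


def total_achievement(classrooms, peer_term):
    """Sum of achievement over all students in all classrooms."""
    return sum(sum(class_outcomes(c, peer_term)) for c in classrooms)


def linear_peers(m):
    return 0.5 * m         # hypothetical linear peer effect


def concave_peers(m):
    return 0.5 * m ** 0.5  # hypothetical non-linear (concave) peer effect


# The same six hypothetical students, grouped in two different ways.
tracked = [[1.0, 1.0, 1.0], [9.0, 9.0, 9.0]]
mixed = [[1.0, 9.0, 1.0], [9.0, 1.0, 9.0]]

# Change in total achievement when moving from tracked to mixed classrooms.
linear_gap = total_achievement(mixed, linear_peers) - total_achievement(tracked, linear_peers)
concave_gap = total_achievement(mixed, concave_peers) - total_achievement(tracked, concave_peers)
print(linear_gap, concave_gap)
```

Under the linear function the two groupings give the same total, so every gain from rearranging is offset by an equal loss elsewhere. Under the non-linear function the totals differ. The direction of the difference here follows only from the concave shape chosen for the sketch and says nothing about which arrangement is better in practice.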
Program members have made great progress on identifying peer
effects by finding natural and policy experiments that rearrange students. I introduced a method (7867) that exploits natural variation
in cohorts within a school; Andreas Ammermueller and Jorn-Steffen
Pischke (12180) applied it to data on European primary schools, and
Weili Ding and Steven F. Lehrer (12305) applied it to data on Chinese
secondary schools. Both studies find evidence of significant peer
effects in achievement, and the latter study suggests that they are
non-linear (a point to which I will return). Eric D. Gould, Lavy, and M.
Daniele Paserman (10844) apply the same method to a particularly
interesting problem: the effect of an influx of immigrant students. They
examine Israeli schools in which one grade experiences a substantial
influx of immigrant students and an adjoining grade does not. Their
results suggest that the immigrant students have no or only a slight
effect overall but have an adverse effect on non-immigrant students who
come from disadvantaged backgrounds. Two papers make powerful use of the
method by applying it to military colleges, which arrange incoming
students into very distinct units and strictly control cross-unit
fraternization. Scott Carrell, Frederick Malmstrom, and James West (Fall
2005 program meeting) and Carrell, Richard Fullerton, West, and Robert
Gilchrist (Summer Institute 2006) find evidence of significant peer
effects in academic achievement, athletic performance, and even
cheating. Finally, Zeynep Hansen, Hideo Owan, and Jie Pan (12251) use
variation in the study groups to which students are assigned in business
school courses. They find that male-dominated groups perform worse, both
in group work and on exams taken individually, than do
female-dominated or gender-balanced groups.
Other papers exploit policy differences among schools that are
otherwise very similar. Philip J. Cook, Robert MacCoun, Clara Muschkin,
and Vigdor (12471) exploit differences in whether sixth grade is the top
primary school grade or the bottom middle school grade. If the former is
the case, then sixth graders are exposed mainly to younger peers. If the
latter is true, then sixth graders are exposed mainly to older peers.
The authors find that sixth graders exposed to older peers are more
likely to have disciplinary incidents and that the differences persist
in the seventh and eighth grades, when all of the students are in middle
school. Hanushek and Ludger Woessmann (11124) compare students across
educational systems that "track" earlier and later. In the
latter systems, students' classrooms remain heterogeneous longer.
Additional papers make use of explicit randomized experiments. Lisa
Sanbonmatsu, Jeffrey R. Kling, Greg J. Duncan, and Jeanne Brooks-Gunn
(11909) use data from the Moving to Opportunity experiment, in which
some families who apply for housing vouchers are induced to move out of
high poverty areas. Compared to children in the control group, the
children in the (randomized) treatment group are exposed to peers from
higher-income families. The authors "had hypothesized that reading
and math test scores would be higher among children in families [who
move out of high poverty neighborhoods, but] ... the results show no
significant effects on test scores for any age group among over 5000
children ages 6 to 20 in 2002 who were assessed four to seven years
after randomization." This finding--an absence of peer
effects--conflicts somewhat with the results of the aforementioned
studies, but the Moving to Opportunity experiment alters families'
lives on more dimensions than the typical school rearrangement does.
Thomas S. Dee (11660) puts the randomization in the Tennessee STAR
experiment (which was designed for analyzing class size) to unusual
purpose: understanding the peer effects of teachers. Although the
application strains the "peer effects" nomenclature and
"role model effects" might be more natural, the study
nevertheless belongs in this section. In it, Dee finds that students
assigned to own-gender teachers have higher achievement, are more
engaged, and are more positively perceived in school.
Many, though not all, of the above papers have difficulty
identifying non-linear peer effects, primarily because the typical
experiment (natural or otherwise) does not rearrange a sufficient number
of students in a sufficiently diverse number of ways. In other words,
the studies typically lack the power to discover non-linear effects.
Ding and Lehrer's paper (12305) is something of an exception. Its
authors suggest that students who are initially high achieving benefit
more from having high achieving schoolmates than do students who are
initially low achieving. However, Gretchen Weingarth Salyer and I
(Spring 2006 program meeting) illustrate the most intense testing for
non-linearities by examining more than 80,000 students exposed to
reassignments in a large North Carolina school district. We test nine
models of peer effects and find evidence of substantial non-linearities.
For example, we find that students are disproportionately influenced by
students who are initially like them. And, if a student who is initially
very low achieving is "dropped" into a classroom, his presence
most affects other students who were fairly low achieving themselves.
Another result is that, while some classroom heterogeneity is fine,
excessive heterogeneity reduces all students' achievement: the
evidence against bi-modal or "schizophrenic" classrooms is
particularly strong.
College Students and their Choices
It is only a bit of an exaggeration to say that economic research
on higher education used to focus on only two questions: what was the
return to college education (where "college" was a generic
thing) and whether "policy X" made students more likely to
attend college. In College Choices: The Economics of Where to Go, When
to Go and How to Pay for It, I predicted the proximate demise of these
two questions owing to the fact that, at least for American students,
they are not where the action is. (2) Most students who are at all
interested in college now at least try attending some institution of
higher education, but there is enormous variation in the sorts of
institutions they attend, the curricula to which they are exposed,
whether they persist and earn a degree, and how quickly they earn
credits. It is increasingly naive to expect a college-related policy to
have its main effects on the attendance margin as opposed to the
"which college", "whether a degree", or "when a
degree" margins. It is also naive to treat all postsecondary
education as the same: a year is a year is a year, regardless of the
curriculum delivered, the institution's resources, or the time the
student devotes to the effort (full-time or part-time, for instance).
Thus, I am not only unsurprised but also glad to see that, by what
appears to be a wholly natural evolution, program members are
increasingly investigating questions about how a student's college
choices, in all their complexity, affect his outcomes.
Several papers consider persistence to the college degree and
achievement in college classes. In practice, these are closely related
topics because, once a student starts performing poorly in college, he
is likely to stop persisting and may never (or only much later) earn a
degree. Failure to persist is particularly common among students from
disadvantaged backgrounds, students whose secondary school achievement
was poor, and students who enroll in non-selective institutions. This is
not to say that any of these factors is causal--for instance, being
disadvantaged does not necessarily cause a student to drop out--but they
suggest where the investigation should begin. Eric P. Bettinger and
Bridget T. Long (10369, 11325) examine the effect of college remediation
courses. These courses, which are many students' first
postsecondary experience, are controversial. On the one hand, they may
provide useful transitional experiences for students whose poor
preparation would cause them to fail regular college courses. On the
other hand, remediation increases the total number of courses a student
must take before attaining his degree, thereby perhaps discouraging
students who see a long plod ahead of them. Using rich administrative
data from Ohio, where colleges differ in how they assign students to
remediation, the authors find that both phenomena (encouragement and
discouragement) exist. Being placed in remedial courses increases a
student's probability of dropping out or transferring to a less
selective college. However, actually completing a remedial course (the
treatment on the treated effect) increases a student's persistence
in college. The authors conclude that "remediation may serve ... to
re-sort students across schools"--in other words, to help them find
the institution most likely to serve their needs. Josh Angrist, Kevin
Lang, and Philip Oreopoulos (Summer Institute 2006) examine an explicit
experiment in which a college randomized students to receive financial
incentives for good grades, receive support services, or receive both.
They find that, at the end of a year, the financial incentives have
modestly improved the grades of female students, especially those who
studied more in high school. John Bound and Sarah Turner (12424)
investigate whether college students are more likely to persist when
they attend a college with more resources. This is not an easy question
because of self-selection: students who are more able and more motivated
are admitted to colleges with more resources. However, the authors
exploit the fact that states rarely increase the resources of their
public institutions in line with the size of the cohort ready to attend
college. Therefore, students in "crowded cohorts" get fewer
resources, all else equal, and the authors link this deprivation to
decreased persistence.
Several papers examine how financial aid affects students. This is
a classic topic, but the new twists are that authors examine persistence
and the college selected. Authors have also greatly improved the methods
used. Whereas numerous previous papers depended upon variation in
financial aid that was fairly obviously endogenous (meritorious students
got more, students admitted to selective colleges got more, states gave
more when fiscal times were good, poorer students got more), recent
papers often exploit a discontinuity in aid formulae or an experiment.
For instance, Kane (9703) compares students on one and the other side of
an (ex ante unknowable) discontinuity in California's aid formula.
He also (10658) examines a policy change that made the District of
Columbia's residents eligible for in-state tuition at Maryland and
Virginia public colleges. The studies find that a $1,000 reduction in
cost causes a modest increase in the probability that a student will
attend college at all (by 0.3 percentage points in the former study, by
about 0.9 percentage points in the latter) but causes substantial shifts
in which college students choose. In several years, when long-term
outcomes can be investigated, researchers will be able to see whether
the aid allowed students to attend colleges that were merely more
expensive (though not to them) or to attend colleges that were truly
better investments, thereby suggesting that liquidity constraints had
previously kept students from attending the optimal college. Christopher
Avery, Kaitlin Burek, Clement Jackson, Glen Pope, Mridula Raman, and I
(12029) examine a Harvard policy that eliminated or greatly reduced
expenses for students from families with less than $60,000 in income.
While the actual change in aid was modest and the number of students who
matriculated as a result was modest as well, the policy greatly
increased applications from students with low-income backgrounds. This
suggests that disadvantaged students may fail to understand their
opportunities to get aid and may need information as much as they need a
generous aid formula. This theme is taken up by Susan M. Dynarski and
Judith E. Scott-Clayton (12227) who show that much of the complexity in
aid formulas, presumably the source of bafflement, serves very little
purpose in terms of identifying aid recipients and determining the
dollars for which they qualify.
Finally, several studies examine the effect of a college's
curriculum on student outcomes. Daniel S. Hamermesh and Steven G. Donald
(10809) use a combination of survey and administrative data to produce
estimates of the earnings effect of various college majors. The study is
a convincing improvement over previous research because of its
authors' unusual ability to control for pre-existing factors, such
as incoming achievement and background, with very rich data and precise
measures of course taking. Ofer Malamud (Fall 2005 program meeting)
investigates the trade-off between forcing a student to choose his major
early (thereby increasing his coursework in the area of his eventual
degree) and allowing him to choose it later (thereby increasing his
likelihood of being well matched to a major because he has had more
opportunity to learn about fields before being forced to choose). The
optimal timing of such choices has long been a puzzle. The study, which
exploits institutional differences between Scottish and English
universities, demonstrates that students who choose their major later
are less likely to switch out of the field after college but that,
conditional on staying in a field, students who choose their major early
attain higher starting wages. Finally, Ronald G. Ehrenberg, George
Jakubson, Jeffrey Groen, Eric So, and Joseph Price (12065) analyze an
unusual policy experiment in which some graduate programs were given
funding to alter their structure in ways intended to increase
students' probability of and speed in getting their doctoral
degrees. This study is especially noteworthy for demonstrating how
mutually beneficial the relationship between institutions (in this case,
the Mellon Foundation) and researchers can be. An institution wants to
learn how to use funds well to produce particular outcomes; researchers
need to find policy experiments that allow them to identify the effects
of policy.
Emerging Themes
As more accountability programs are implemented, studies will
increasingly trace their effect on students. Signs of this appear in
Edward P. Lazear's work (10932), which provides insights into the
incentives generated by accountability programs; Hanushek and Margaret
E. Raymond's study (10591), which uses the staggered implementation
of states' accountability programs to assess early effects on
achievement; and Christiana Stoddard and Peter Kuhn's paper
(11970), which investigates whether teachers work more hours when under
pressure from accountability programs. Construing accountability more
broadly, one can learn about the impacts of high school exit exams from
Dee and Brian A. Jacob (12199) or Francisco Martorell (Summer Institute 2005), or
the effects of financial incentives for students to perform from Michael
Kremer, Edward Miguel, and Rebecca Thornton (10971).
Working out empirical methods to deal with general equilibrium problems in education continues to be a challenge. General equilibrium
is especially relevant to issues like school choice, school finance, the
relationship between housing markets and schools, and desegregation.
Progress is being made, however. Patrick Bayer and Robert McMillan
(11802), Bayer, McMillan, and Kim Rueben (11095), and Bayer, Fernando
Ferreira, and McMillan (10871) all display innovative methods of
identification that exploit, but do not attempt to set aside,
equilibrium properties of the market for education. On school finance,
Katherine Baicker and Nora Gordon (10701), Ilyana Kuziemko and I
(10722), Christian A. L. Hilber and Christopher J. Mayer (10804), and
Kane, Staiger, and Stephanie K. Riegg (11347) explore links between
house prices, intergovernmental aid for schools, and local school
budgets. The linkages make it challenging to design effective
redistributive aid among schools but do allow one generation to help
finance another's education. Finally, Nora Gordon, Elizabeth
Cascio, Sarah Reber, and Ethan Lewis (Fall 2005 program meeting) offer a
striking new interpretation of school desegregation in the South, which
they demonstrate was, to a large extent, a response to federal financial
incentives (especially Title I) rather than explicit court orders and
the like.
Connections between health and education have often been neglected,
but a number of interesting papers suggest that a new wind is blowing.
David M. Cutler and Adriana Lleras-Muney (12352) provide an overall
introduction; Justin McCrary and Heather Royer (12329) investigate
whether more education makes women better mothers in terms of infant
health; and Ding, Lehrer, J. Niels Rosenquist, and Janet
Audrain-McGovern (12304) use data on genetic markers to evaluate the
causal impact of health on education. Much of the relationship between
health and education is associated with infancy and early childhood,
where health, nutrition, and the environment may have disproportionate
effects on cognitive development. This, in turn, may affect a
person's later education, which may, in turn, affect the
environment she provides for her infant. Janet Currie and Enrico Moretti
(11567), Sandra E. Black, Paul J. Devereux, and Kjell Salvanes (11796),
and Eric I. Knudsen, James J. Heckman, Judy L. Cameron, and Jack P.
Shonkoff (12298) explore these linkages.
Summing Up
It is striking that many of the themes that I identified as
emerging in my last program review have now been explored in a good
number of studies. It is also striking how quickly new topics in the
economics of education are emerging. While some of the appearance of
novelty in this program review is deliberate (I have de-emphasized
studies in areas that are well-trodden), much of the novelty simply
reflects the evolution of the program, which continues to develop
rigorous methods for investigating problems of fundamental importance
and policy relevance.
Caroline M. Hoxby *
(1) J.E. Rockoff, "The Impact of Individual Teachers on
Student Achievement: Evidence from Panel Data," American Economic
Review, Papers and Proceedings, May 2004.
(2) C.M. Hoxby, College Choices: The Economics of Where to Go, When
to Go and How to Pay for It, Chicago: University of Chicago Press, 2004.