Do introductory economics students learn more if their instructor has a Ph.D.?
T. Aldrich Finegan and John J. Siegfried
The single greatest impediment to good teaching of economics, of
course, is that we only qualify to do it by attending economics graduate
school. That was the most intellectually stifling experience of my life
and a model of how not to teach.
- Randall Bartlett (1993)
After reviewing predictions of an impending shortage of Ph.D.s a
few years ago, Ronald Ehrenberg (1991) asked a more fundamental
question: "Suppose that a 'shortage' of American
doctorates does occur in the future. Would this have a substantial
negative effect on academe?"
Ehrenberg concluded that there would be little effect of a Ph.D.
shortage on either the research productivity of faculty at American
colleges and universities or on the flow of students, especially the
most talented ones, into doctoral study. Only a cataclysmic Ph.D.
shortage, according to Ehrenberg, would be likely to affect those
universities that generate virtually all of the academic research. And
few doctoral students earn their undergraduate degrees at the
institutions that would likely experience a declining proportion of
their faculty holding a Ph.D. in the event of a shortage.
On another important issue, however - the likely effect of a Ph.D.
shortage on the quality of undergraduate instruction - Ehrenberg was
equivocal. Numerous studies find no difference in the final examination
scores of introductory economics students taught by regular faculty and
by graduate students (Siegfried and Fels, 1979; Siegfried and Walstad,
1990). Those studies do not address the pertinent question, however,
since "graduate students" include many individuals who will
eventually earn Ph.D.s, and few who have much teaching experience. If
there is a chronic Ph.D. shortage, its effect will be manifested in more
experienced "regular" faculty who do not hold an earned Ph.D.
or expect to earn one in the near future.
Ehrenberg identified a number of studies that have correlated teaching evaluations with the terminal degree status of faculty. But
those studies are all dated, rely on students' impressions of what
they learned rather than on objective measures of learning, and also
produce conflicting results. Few of them controlled for factors other
than degree status that might affect teaching ratings. The quality of
the evidence led Ehrenberg to identify this as "an area that
clearly warrants new research." This paper is a response to that
call.
I. Data and Research Design
Our data come from 53 introductory macroeconomics classes taught by
38 different regular instructors at 29 different colleges, and from 64
introductory microeconomics classes taught by 48 different regular
instructors at 34 different colleges. The distribution of the classes by
type of institution is reported in Table 1. These data were collected
during the norming of the third edition of the Test of Understanding
College Economics (TUCE III) (Saunders, 1994). We use them to explore
whether the introductory economics students of regular full-time faculty
who hold a Ph.D. learn more than the introductory economics students of
regular faculty who do not hold a Ph.D.
The value added of Ph.D. training on undergraduate student learning
is likely to be greatest in more advanced undergraduate courses. On the
other hand, introductory courses in economics account for a large share
of the instructional burden of economics faculty. About 40 percent of
faculty undergraduate teaching assignments, and almost 60 percent of
undergraduate economics enrollments, are in introductory economics
courses (Siegfried and Wilkinson, 1982, p. 136). If a shortage of Ph.D.s
were to develop, it is likely that those faculty who had not earned a
Ph.D. would be assigned disproportionately to teach introductory
economics. Thus, the effects of a potential shortage of Ph.D. economists
on student learning should be apparent in introductory economics
courses.
[TABULAR DATA FOR TABLE 1 OMITTED]
The TUCE III norming sample contained a total of 189 introductory
economics classes. We lost 51 observations by omitting the 20 classes
taught by instructors who were graduate students and 31 classes taught
by visitors and adjunct faculty. We lost another 21 classes for which
we could not assemble a complete data set. Thus instructors in our
sample are regular tenured or tenure-track faculty, the kind likely to
be affected by a chronic shortage of new Ph.D.s. There are a total of 75
different faculty in the sample; 59 of them held earned Ph.D.s and 16 of
them did not. Those with a Ph.D. reported 15.2 years of teaching
experience on average; those holding only a Master's degree reported an average of 16.9 years of teaching experience.
We use classes as the unit of observation for several reasons.
First, because the main focus of our analysis is on an instructor
characteristic - highest degree earned - and there is one instructor per
class, it seems inappropriate to weight larger classes more heavily in
the analysis than smaller classes. Second, suppose that department
chairs tend to assign instructors who do not hold a Ph.D. to smaller
classes. Then, the relationship between teachers' highest degree
earned and students' learning could be obscured by the large number
of observations from large classes if individual students were used as
the unit of observation. Third, unobserved characteristics of students
may affect their learning of introductory economics. Since these
unobserved characteristics are likely to vary less (on average) across
classes than across individuals, using class average data minimizes the
chance that one of these characteristics might affect our empirical
results. And, fourth, using individual student observations risks
finding statistically significant results with no substantive meaning
because of the "too-large-sample-size" phenomenon (Kennedy,
1992, p. 65).
Our approach is to relate several measures of proficiency in
teaching to various student and faculty inputs that might be expected to
affect how much students have learned. One of these factors is the
instructor's highest earned degree.
This data set contains a variety of measures of student learning.
All of the 5,876 students in the 117 sample classes took the same TUCE
III pretest and TUCE III post-test. Fifty-seven percent of the students
completed a student questionnaire, on which they were asked to rate how
much they learned in the course, and to rate the teaching effectiveness
of their instructor.
From these data we formed four measures of teaching effectiveness:
(1) POSTTUCE, the score on the TUCE III post-test (given after the
course was over); (2) VALADD, a measure of value added - the post-test
TUCE score minus pre-test score; (3) LEARNING, students' subjective
impression of how much they had learned in the course; and (4)
EVALUATION, students' evaluation of their instructor's
teaching effectiveness. These four measures serve as alternate dependent
variables.(1)
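To make the construction of these measures concrete, the sketch below shows how the class-level variables could be assembled from student records. It is a minimal sketch only: the file name and column names are illustrative, not the actual TUCE III file codes.

```python
import pandas as pd

# Hypothetical student-level file; column names are illustrative only.
students = pd.read_csv("tuce3_students.csv")

# Classes, not students, are the unit of observation, so collapse the
# student records to class means.
classes = students.groupby("class_id").agg(
    POSTTUCE=("post_tuce", "mean"),       # TUCE III post-test score
    PRETUCE=("pre_tuce", "mean"),         # TUCE III pre-test score
    LEARNING=("self_learning", "mean"),   # self-assessed learning rating
    EVALUATION=("instr_rating", "mean"),  # instructor effectiveness rating
)

# Value added: post-test mean minus pre-test mean for each class.
classes["VALADD"] = classes["POSTTUCE"] - classes["PRETUCE"]
```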
A few words on how these dependent variables differ may be useful.
The average score on the post-course TUCE (POSTTUCE) has the merit of
objectivity and simplicity: it measures how much knowledge (of the kind
testable by multiple choice questions) a given class possessed at the
end of the course, not how much it had learned in the course. The
distinction is important because most students begin a course in
principles of economics with some understanding of the subject matter,
and because the average level of pre-course understanding (at least as
measured by pre-test scores, PRETUCE) varies considerably across
classes.(2) Further, we find statistically significant associations
across classes between PRETUCE and POSTTUCE and between PRETUCE and some
independent variables (including, across micro classes, the
instructor's degree). All this argues for the need to control for
PRETUCE in some regressions. Given the statistical hazards of using this
variable on the right hand side of the regression (see Kennedy, 1994),
we do not do so in regressions explaining POSTTUCE. Instead, an
alternate dependent variable (i.e., VALADD) measures the absolute
difference between POSTTUCE and PRETUCE.
The subjective evaluations by students regarding how much they have
learned in their course (LEARNING) and how highly they rate the teaching
effectiveness of the instructor (EVALUATION) are of interest because
some elements of learning may not be well measured by the TUCE. Beyond
that, student opinions matter - even if uncorrelated with objective
learning - because students are the ultimate consumers in this industry.
Fortunately, we find a highly significant positive association across
both macro and micro classes between LEARNING and both TUCE-based
measures of achievement. But there is little correlation between the
objective measures of learning and students' rating of their
teachers' effectiveness.(3) The weak link between how well students
do on the TUCE and how highly they rate their instructors is not
surprising, considering the many other factors that no doubt influence
such assessments.
Each measure of teaching effectiveness is regressed on the
instructor's terminal degree status and a set of control variables.
We estimate separate regressions for the macro principles courses and
for the micro principles courses in our sample and for subsets of macro
and micro classes taught at comprehensive universities and liberal arts
colleges. These are the institutions that would most likely feel the
pinch from a shortage of new Ph.D.s. Research and doctoral universities
would likely continue to hire Ph.D.s even during a shortage; two-year
colleges hire few Ph.D.s even when they are plentiful.
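In generic form, each of these regressions can be written, with classes indexed by c and DEGREE defined in the next paragraph, as

$$Y_c = \alpha + \beta\,\mathrm{DEGREE}_c + \sum_{k} \gamma_k X_{kc} + \varepsilon_c,$$

where $Y_c$ is one of POSTTUCE, VALADD, LEARNING, or EVALUATION, and the $X_{kc}$ are the control variables introduced below.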
The instructor's terminal degree is measured by a binary variable (DEGREE) which equals one if the instructor holds an earned
Ph.D., and is zero otherwise. (All of the instructors who did not have a
Ph.D. had a Master's degree.) How an instructor's highest
degree should influence teaching effectiveness is not self-evident. On
the one hand, instructors with more graduate training are likely to have
learned more economics and might therefore be better prepared to teach
elementary economics. In addition, to the extent that instructional
skills are learned by observation, instructors with a Ph.D. have had an
opportunity to observe a larger number of their own professors'
instructional methods. On the other hand, the curriculum of most Ph.D.
programs includes little formal teacher training; and as the headnote quotation suggests, some graduates take a dim view of the net
contribution of their graduate school experience to the effective
teaching of undergraduates. In particular, there is seldom any training
in how to motivate and inspire beginning students of economics. Further,
because the Ph.D. is a research degree, it may attract individuals who
are more interested in research than teaching or who emphasize complex,
subtle arguments at the expense of teaching fundamental principles. If
so, instructors holding earned Ph.D.s might invest less in (and care
less about) teaching, or confuse students more than they enlighten them,
and accordingly, be less effective instructors.
The twelve control variables fall into four groups: characteristics
of the instructors, students, and the schools in our sample; and the
circumstances under which students took the post-TUCE test. A brief
introduction of each variable follows.
Characteristics of instructors. Besides the instructor's
highest earned degree, we have included a dummy variable (TEACHSEX) for
the instructors' sex (1 if female, 0 if male) and the average
subjective rating of the class, on a five-point scale, as to how well
the instructor spoke English (ENGLISH). We have no a priori expectation
as to how the sex of the instructor would influence student learning and
assessments of teaching effectiveness, but this factor could be
important, at least in subjective ratings, and is exogenous to the
instructor's degree. We expect perceived proficiency in English to
have a positive association with student performance.
Characteristics of schools. In assessing the influence of
instructor's highest degree on how much students learn, it is
important to control as best we can for the quality of the students
enrolled in the courses in our sample, inasmuch as faculty without a
Ph.D. degree are clearly underrepresented in the more selective colleges
and universities.(4) Our measure of school selectivity (SELECTIVITY) is
an estimate of the combined SAT scores (verbal and quantitative) at the
25th percentile of the distribution for all undergraduates in each
school.(5) We expect to find more learning (objective and subjective),
but not necessarily higher instructor ratings, in more selective
schools.
We also included a dummy variable for small classes (SMALLCLASS),
which identifies classes with enrollments of 30 or fewer students.
Thanks to the greater attention that individual students can get in
small classes, and perhaps to the use of more effective pedagogical methods in smaller classes, we expect the estimated coefficient on this
variable to have a positive sign.
Characteristics of students. First, we control for the cumulative
grade point average of the students in the class in all previous courses
at the same school (GPA). Students who were taking principles as first
semester freshmen obviously had to be excluded from this average. After
we control for school characteristics, GPA may be positively related to
academic effort and ability, which should lead to more learning in
principles of economics, but not necessarily to a higher evaluation of
the instructor.(6) So we expect a positive sign for GPA for all
dependent variables except EVALUATION.
In regressions explaining LEARNING and EVALUATION, we also control
for the average grade that students expected to receive in micro or
macro principles (EXPECTGRADE). After controlling for school
characteristics and students' GPA, we surmise that EXPECTGRADE will
pick up mainly the instructor's grading standards. Lower grading
standards should contribute to higher instructor ratings (McKenzie,
1975; Seiver, 1983; Mehdizadeh, 1990) but may or may not color student
assessments of their actual learning (we shall see).
Given the common perception that freshmen find introductory
economics more challenging than do other students, we include a variable
(FRESH) that measures the fraction of the class who reported having
completed less than 24 semester or quarter hours of previous academic
work. The expected sign of its estimated coefficient is negative.
We also include a variable (STUDSEX) that controls for the fraction
of the students in the class who were female. Earlier research (Lumsden
and Scott, 1987) has found that female students tend to do a little less
well than male students on multiple choice tests. After controlling for
school selectivity and class GPA, classes with relatively more women
might therefore score a little lower on our objective tests of learning.
But in light of how tenuous this expectation is, we enter STUDSEX as an
unsigned variable.
Our last two controls for student characteristics relate to the
outside jobs held by some students in our sample. One (PCTREGJOB)
measures the fraction of the class who held full-time jobs (defined here
as requiring 30 hours per week or more); the other (PCTPTJOB) records
the fraction holding part-time jobs (under 30 hours). We have the
impression that most students holding full-time jobs carry lighter than
normal courseloads, whereas most students holding part-time jobs carry
normal courseloads. We surmise that a part-time job is likely to reduce
the study time of a full-time student, leading to a negative expected
sign for the estimated coefficient on this variable, except when
explaining EVALUATION. It is less clear that holding a full-time job
would also have a negative effect on learning if the students holding
such jobs are carrying lighter academic loads. Such students also tend
to be older. Full-time and part-time students may also differ in
academic goals and motivation. We therefore do not predict the expected
sign of the estimated coefficient of PCTREGJOB.
Variables related to the POSTTUCE. In those regressions using
POSTTUCE and VALADD as the dependent variable, we included a binary
variable (COUNT) that distinguishes whether a student's score on
the post-TUCE exam counted towards his or her course grade. It did so in
64 percent of the macro classes and in 61 percent of the micro classes.
There is evidence that student motivation was affected by whether the
exam counted (Kennedy and Siegfried, 1995). We thus expect exam scores
to be higher in those classes where it did.(7)
Finally, there was a small variation across classes in how many
questions on the post-test students were told to answer (30 or 33,
depending on the class's coverage of international economics), and
considerably more variation in how much time students were given to
complete the test. To cope with this problem we created a variable
(RELTIME) equal to the average number of minutes each student had to
answer each question. (The mean was roughly 1.6 minutes with a standard
deviation of 0.3 minutes in each kind of principles course.) We also
normalized our dependent variables, POSTTUCE and VALADD, to control for
the number of questions on the post-test.
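Continuing the illustrative sketch above, these two adjustments might be implemented as follows. The paper does not spell out its exact normalization, so the proportional rescaling to a 30-question basis is our assumption, and the column names remain ours.

```python
# Assume test-administration fields (the same for every student in a
# class) were carried along when collapsing to class means.
admin = students.groupby("class_id").agg(
    minutes_allowed=("minutes_allowed", "first"),
    n_questions=("n_questions", "first"),  # 30 or 33
)
classes = classes.join(admin)

# RELTIME: average minutes available per question.
classes["RELTIME"] = classes["minutes_allowed"] / classes["n_questions"]

# Rescale POSTTUCE and VALADD to a common 30-question basis; this
# proportional adjustment is an assumption, not the authors' formula.
scale = 30 / classes["n_questions"]
classes["POSTTUCE"] *= scale
classes["VALADD"] *= scale
```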
II. Empirical Results
The sixteen regressions in this study (all estimated by ordinary
least squares) contain 184 regression coefficients. In light of the
generally similar results for both TUCE-based dependent variables within
each kind of principles class and the similar results for many of the
control variables, we do not report all 184 estimated regression
coefficients. Table 2 reports the definition, mean, and standard
deviation of each dependent variable by kind of course and the adjusted
R² of each regression for each dependent variable. Table 4
assembles the definitions of our independent variables, their means and
standard deviations in all classes combined, and provides an overview of
how each performs in our regressions.(8) Table 4 reports the regression
coefficients and t-values of the variable of special interest,
instructor's degree (DEGREE) in all regressions.(9) Highlights of
the findings for other control variables are summarized in the text.(10)
The most striking result from Table 2 is the similarity in the
means and standard deviations of each dependent variable for macro and
micro principles courses and their respective school-kind subsets. The
mean number of questions answered correctly on the POSTTUCE is slightly
higher in micro classes (15.3 versus 13.9 in macro), but this difference
corresponds to a higher PRETUCE mean in micro. Thus, the all-class means
for VALADD are almost identical - 4.67 and 4.56, respectively.
Of the 13 explanatory variables in this study, five were expected
to carry the same sign in all regressions, while another four were
expected to have consistent signs in all runs except those explaining
EVALUATION (see Table 3). Eighty-two percent of the 104 signed
regression coefficients turned out to have the "right" sign;
the part-time job variable accounts for 10 of the 19 "wrong"
signs. Only one coefficient obtaining an unexpected sign would have been
statistically significant at the 5 percent level using a two-tail
test.(11) On the other hand, only 21 percent of all unsigned
coefficients and only 28 percent of all coefficients differ
significantly from zero based on the appropriate one- or two-tail
t-test.
We turn now to our findings for key variables.
Instructor's highest degree. It is hard to find compelling
evidence in Table 4 that full-time instructors with Ph.D.s in economics
are more effective teachers of introductory economics than those
full-time instructors holding only an M.A. degree. In fact, a simple
tally of regression coefficients would suggest the opposite conclusion.
On closer inspection, however, the data seem to render a split decision.
For macro principles courses, all four of the coefficients for
DEGREE explaining objective measures of achievement or learning (our
TUCE-based measures) are positive, but all are small and none is even
close to statistical significance. While the null hypothesis cannot be
rejected, neither can we reject the alternative hypothesis that students
taking macro principles from instructors with a Ph.D. do a little better
on the TUCE. In micro principles courses, however, three of the four
coefficients for DEGREE explaining objective measures of learning are
negative, and both of these coefficients in the regressions using all of
the micro classes in our sample are significant at the 0.05 level or
better. The inference that students taking micro principles learn less
from Ph.D. instructors would be even stronger had the results for the
subset of micro classes from liberal arts colleges and comprehensive
universities been equally decisive; instead, these two coefficients have
mixed signs and are not significant. One reason may be the small number
of microeconomics classes in this subset (only 4) taught by instructors
with M.A. degrees.
Earlier in this paper we suggested several reasons why instructors
with Ph.D. degrees might prove, on balance, to be less effective
teachers of introductory economics. Unfortunately, none of these
conjectures can readily explain why only micro instructors should suffer
this disadvantage.
Finally, in regressions explaining student assessments of learning
and their instructors' teaching effectiveness, seven of the eight
coefficients for DEGREE are negative but none is significant. Whatever
drives such assessments in introductory economics classes, the
instructor's degree seems to play an unimportant role.
Student assessments of instructors' English. If holding a
Ph.D. does little, if anything, to boost an instructor's course
evaluations, speaking English well - at least to students' ears -
does a lot. All eight of the coefficients for this variable explaining
self-assessed learning and the overall rating of the instructor are
positive and significant at the .01 level. Their magnitude implies that
a one standard deviation improvement in English proficiency is
associated with about one-sixth of a point higher rating of subjective
learning and one-fourth of a point higher rating of the
instructor's teaching effectiveness.(12)
But does better English mean more "real" learning, as
measured by our TUCE variables? For microeconomics classes, the answer
is an emphatic "yes": here, a one point advantage in
student-assessed English goes hand in hand with about two more right
answers on the POSTTUCE and in value added; and all four of these
coefficients are also statistically significant. While better English is
not associated with significantly better scores on the TUCE in
introductory macro, all four of the coefficients for instructor's
English explaining TUCE-based achievement measures in macro classes do
have positive signs, and they are one-third to two-thirds as large as
their counterparts in micro classes.
Instructor's sex. Other things held constant, the sex of the
instructor is not associated with how well students in principles
courses do on the TUCE. Five of the eight TUCE-related coefficients for this
dummy variable (1 = female) are positive, but none is as large as its
standard error. But on subjective assessments of learning and the
instructor's teaching effectiveness, women teachers and their
courses get higher marks in seven out of eight runs; and two of these
differentials (for EVALUATION in the subset of macro courses and in the
complete set of micro courses) are statistically significant at the .05
level or better.
School selectivity and class size. The expected positive
association between student achievement and the selectivity of the
college or university, as measured by our estimate of the school's
combined SAT score at the 25th percentile, is quite robust.
Fourteen of the 16 regression coefficients for SELECTIVITY are positive
and 10 are significant at the 0.05 level. In micro classes, a 100-point
advantage on the first quartile combined SAT average is associated with
a one question gain in VALADD (the t-value is 4.8). The association is
weaker in macro classes, where the gains are about two-thirds as large.
These gains on objective measures of learning are accompanied by higher
subjective assessments of learning, although the measured benefit of a
100 point SAT gain is quite small (about 0.07 points). And, not
surprisingly, we find no association between school selectivity and
instructor ratings: either better students expect more of their teachers
or the quality of instruction is independent of student abilities.
[TABULAR DATA FOR TABLE 2 OMITTED]
[TABULAR DATA FOR TABLE 3 OMITTED]
[TABULAR DATA FOR TABLE 4 OMITTED]
The anticipated benefits from taking introductory economics in a
small class receive little support from our study. Although 15 of the 16
regression coefficients for the dummy variable identifying a class of 30
or fewer students emerge with a positive sign, only two of them are
statistically significant at the 0.05 level. One is in the regression
explaining self-assessed learning by students in all 64 micro principles
classes; the other explains instructors' ratings in the subset of
32 macro classes. The lack of any significant association between
objective measures of learning and class size is consistent with other
findings on this issue (see Kennedy and Siegfried, 1995, for a review of
this literature).
Characteristics of students. The associations between how much
students learn and their personal characteristics are, we think, more
reliably assessed from data using individuals rather than classes as
observations.(13) Since the focus of this study lies elsewhere, we have
not examined these associations with data at the individual student
level. Instead, we offer a brief summary of the results for those
variables that seek to control for interclass differences in these
characteristics.
First, the class's average cumulative GPA in earlier courses
is positively associated with TUCE-measured learning in all eight
regressions and is statistically significant in four of them. Curiously,
other things equal, classes with higher GPAs report having learned
slightly less economics than other classes, although none of these
negative coefficients would be significant at the 0.05 level using a
two-tail test. And in macro principles, classes with better GPAs rated
the overall effectiveness of their instructors significantly lower. The
latter association could reflect differences in real teaching
proficiency or in students' evaluation standards.
The average grade that students expected to receive in the course
is always positively related to how much they believe they have learned
in the course and how highly they rate their instructor, but only three
of the eight regression coefficients are significant. Happily, even the
largest coefficient for EXPECTGRADE explaining EVALUATION is only 0.33,
which means that an instructor would have to raise (arbitrarily) course
grades in macro principles by one whole letter grade to reap a one-third
of a point gain in his or her student evaluations. In micro principles
classes, this variable's coefficient is a trivial 0.05.(14)
Classes with a larger than average fraction of freshmen did a
little less well on the TUCE in macro principles; but only in the
POSTTUCE run for the subset of macro classes was FRESH statistically
significant. In micro principles these coefficients were small and
thoroughly mixed. In neither macro nor micro classes was there evidence
that freshmen thought they learned less or gave lower marks to their
instructors.(15)
Our variable measuring the fraction of students who were women
produced mixed results of a different kind, and we are not sure how to
interpret them. In micro principles, classes with relatively more women
reported a little more self-assessed learning and gave their instructors
significantly higher evaluations; in macro principles, however, such
classes reported significantly less learning but essentially the same
instructor ratings. On the two objective indices of learning, all but
one of the eight coefficients for STUDSEX are negative and five are
significant, suggesting that female students learned less economics than
male students.
Finally, there are also some surprises in the results for the
outside job variables. Contrary to our expectations, there is no
evidence that holding a part-time job reduces either objective or
subjective indices of learning. In fact, PCTPTJOB has a positive sign in
10 of the 12 regressions explaining all kinds of learning, although only
one coefficient would have been significant at the .05 level using a
two-tail test. Holding regular jobs (requiring 30 hours or more each
week) is also associated with more objectively measured achievement, and
in micro principles classes this association is significant at the .01
level for both TUCE-based measures.
The idea that holding a full-time job somehow promotes the learning
of micro principles, ceteris paribus, seems implausible. Rather, the
small minority of students who hold such jobs (about 14 percent of our
sample) probably differ in other important respects (e.g., course load,
maturity) from other students. The fairly strong negative association
between PCTREGJOB and school selectivity (r = -.48) supports this
conjecture, but the question deserves further research.(16)
TUCE-test variables. As anticipated, giving students more time to
answer the post-TUCE test and counting that test in determining course
grades were both associated with higher scores on that test, although
only two of the 15 positive regression coefficients for these variables
were significant at the .05 level. Both of the significant results
involved micro principles.
III. Conclusion
Using data from a sample of 53 courses in introductory
macroeconomics and 64 courses in introductory microeconomics, this study
has tried to determine if objective and self-assessed measures of
student learning are higher or lower in classes taught by full-time
instructors with a Ph.D. degree than in classes taught by full-time
instructors holding only a Masters degree. After controlling for many
other characteristics of instructors, schools, and students, we find no
significant difference in objective measures of learning between classes
in macroeconomic principles taught by Ph.D.-holding instructors and
similar classes taught by instructors with only an M.A. degree, while
classes in micro principles taught by Ph.D.-holding faculty learn
substantially and significantly less. For neither subject do we observe
a significant net association between instructor's highest degree
and students' own average assessment of how much they have learned
or how highly they rate their instructor.(17) The results suggest that
if a shortage of Ph.D. economists were to appear, it would not reduce
the learning of students taking introductory economics. Whether it would
have an impact on student learning in more advanced courses, where the
additional economics training of Ph.D.s may matter more, remains to be
seen.
Notes
1. In an earlier draft of this paper we reported separate results
for a fifth dependent variable, namely, VALADD divided by the difference
between a perfect score and the pre-test score. While we hoped that this
relative index of amount learned might yield better results than VALADD,
the t-values from these two specifications were so similar that we
decided to drop the gap-closing index.
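In symbols, this discarded gap-closing index (call it GAPCLOSE; the name is ours) was

$$\mathrm{GAPCLOSE}_c = \frac{\mathrm{POSTTUCE}_c - \mathrm{PRETUCE}_c}{\mathrm{MAX} - \mathrm{PRETUCE}_c},$$

where MAX denotes a perfect score on the post-test.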
2. The mean (standard deviation) of pre-TUCE scores in our sample
of courses was 9.3 (1.1) questions in our macro principles classes and
10.7 (2.4) questions in our micro classes. A mean of 7.5 would result
from random guessing (one chance in four on each of 30 questions).
3. The simple correlations between LEARNING and our two TUCE
variables are near +.54 for macro classes and about +.40 across micro
classes. In contrast, the correlations between EVALUATION and the TUCE
variables are much weaker (about +.05 for macro classes, between +.11
and +.24 for micro).
4. The simple association between DEGREE and our measure of school
selectivity (defined below) is +0.42 across macro principles classes,
+0.54 across micro classes.
5. The TUCE III data file contains data on the SAT and ACT scores
of some of the students who were enrolled in principles of economics
courses but no school-wide test scores. SAT data, however, are missing
for quite a number of our observations. Thus we secured values of
SELECTIVITY from external sources, including The College Entrance
Examination Board and an issue of U.S. News and World Report. For
schools with no reported first quartile SAT scores, we estimated
SELECTIVITY from (a) the reported median combined SAT score, (b) the
equivalent median SAT score based on the reported median ACT score, or
(c) the mean combined SAT score of the category of school (two year
colleges). At one point we considered naming this variable MONGREL!
6. Students presumably attach a positive value to learning, but how
highly they rate an instructor will also depend on a host of other
considerations. Further, better students may have different expectations
of instructor performance.
7. Experiments with a more sophisticated variable measuring how
much POSTTUCE counted produced similar results to those reported below.
8. Most independent variables had similar means and standard
deviations in micro classes and macro classes, as well as within the
school-kind subset of each. DEGREE is an important exception (see Table
4); a few others are mentioned either in the text or footnotes.
9. Because we use class means as observations, one might expect our
residuals to be correlated with the number of students in each class.
The estimated coefficients are inefficient when such heteroskedasticity
is present. It turns out, however, that the residuals are not correlated
with the number of students in each class.
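A diagnostic of the kind this note describes might look like the following sketch, using statsmodels and scipy; the control set is abbreviated and the column names are ours.

```python
import statsmodels.api as sm
from scipy import stats

# Fit one of the class-level regressions (controls abbreviated here).
X = sm.add_constant(classes[["DEGREE", "ENGLISH", "SELECTIVITY", "GPA"]])
fit = sm.OLS(classes["VALADD"], X).fit()

# With class means as observations, one worry is that residual spread
# shrinks as enrollment grows. A simple check: correlate the absolute
# residuals with class size.
r, p = stats.pearsonr(fit.resid.abs(), classes["class_size"])
print(f"corr(|residual|, class size) = {r:.2f} (p = {p:.2f})")
```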
10. The complete set of results from all regressions will be sent
to interested readers on request.
11. This is the coefficient for PCTPTJOB explaining POSTTUCE in all
micro principles classes (b = 0.46, t = 2.14).
12. There is virtually no association between perceived proficiency
in an instructor's English and his or her highest degree: the
nonsignificant simple correlation is -.13 in macro classes and -.14 in
micro classes.
13. The problem is that with aggregated data, controlling for the
fraction of the class with some characteristic X (say, being female)
runs a greater risk of picking up unknown but important associations
between percent X and other determinants of learning than one would
likely find in a random sample of individual students, some of whom are
X and others of whom are not. As a result, coefficients for percent X
are more likely to be biased than those for the dummy variable, X.
14. It is possible that classes with higher expected grades in
principles had learned more economics, but the simple correlations
between EXPECTGRADE and our TUCE-based measures of learning provide
slender support for this hypothesis (all these coefficients are
positive, but only two - both for POSTTUCE - are statistically
significant).
15. The fraction of students in an introductory economics course
who were freshmen may be negatively related to the fraction who had
taken a previous college level course in economics. In most cases, this
previous course would have been the first course in principles in that
college's sequence. How, if at all, this sequence (macro before
micro or vice versa) influences student learning in each introductory
course is still an unsettled issue. A recent study (Lopus and Maxwell,
1995) based on TUCE scores from 5,940 students in 53 universities in
1989-90 found that a prior course in macro principles was significantly
related to higher pre- and post-test TUCE scores in micro principles,
but no similar benefits resulted from the reverse sequence.
In an early stage of our study, we added to our VALADD regressions
a variable (PRIORECON) measuring the fraction of students in each class
who reported having taken three or more hours of college level economics
prior to the current term. While this variable had a positive regression
coefficient in both micro and macro classes, its t-value (near 0.3) in
each regression was not significant. When PRIORECON was included, the
estimated regression coefficient for instructor's degree was
virtually unchanged in the macro class regression and only ten percent
smaller in the micro class regression. The zero-order correlation
between instructor's degree and the fraction of students who had
had a prior economics course was only +.07 in macro classes and +.02 in
micro classes. Consequently, we omitted this variable from subsequent
runs.
16. While the TUCE questionnaire did not ask students for their
age, it did ask for the number of semester or quarter hours of courses
in which they were currently enrolled. As we surmised, those in the
sample who held full time jobs were carrying lighter academic loads than
other students (11.8 hours versus 14.1 hours, respectively); but the 2.3
hour difference was much smaller than we had anticipated.
17. It is possible that the students of instructors with a Ph.D.
degree perform no better on the TUCE exam than the students of
instructors who do not hold a Ph.D. because the TUCE measures better
what is taught by the M.A. instructors. For example, Ph.D. instructors
may assign more optional chapters in textbooks and/or include material
in their courses that is not usually covered in a basic textbook.
Because the TUCE is designed to evaluate students' understanding of
those basic principles of economics that are included in almost all
introductory courses, it may fail to measure everything learned by the
students of instructors with Ph.D.s, and so underestimate their
understanding of economics.
On the other hand, the TUCE is designed to measure students'
skills in applying economics to real world problems. Two-thirds of the
exam consists of applications, in contrast to one-third of the exam
devoted to recognition and understanding skills (Saunders, 1991, p. 33).
If the experience Ph.D. instructors gain during their additional years
of study helps them teach applications more effectively than instructors
who stopped with an M.A. degree, the students of Ph.D. instructors
should have an advantage over other students on the TUCE. There is,
however, no evidence from our empirical results of any advantage
accruing to students of Ph.D. instructors.
References
Bartlett, Randall, "Empty Busses: Thoughts on Teaching
Economics," Eastern Economic Journal (Fall 1993), Vol. 19, pp.
441-446.
Ehrenberg, Ronald, "Should Policies Be Pursued to Increase the
Flow of New Doctorates?," Chapter 10 in Charles T. Clotfelter,
Ronald G. Ehrenberg, Malcolm Getz and John J. Siegfried, Economic
Challenges in Higher Education (Chicago: University of Chicago Press,
1991), pp. 233-240.
Kennedy, Peter, A Guide to Econometrics, Third Edition (Cambridge,
MA: M.I.T. Press, 1992).
Kennedy, Peter, "How Much Bias from Using Pretest as a
Regressor?" paper presented to the Western Economic Association
Meetings, Vancouver, B.C., July 1994.
Kennedy, Peter and John J. Siegfried, "Class Size and
Achievement in Introductory Economics: Evidence from the TUCE III
Data," forthcoming, Economics of Education Review.
Lopus, Jane S. and Nan L. Maxwell, "Should We Teach
Microeconomic Principles Before Macroeconomic Principles?,"
Economic Inquiry (April 1995), Vol. 33, No. 2, pp. 336-350.
Lumsden, Keith G. and Alex Scott, "The Economics Student
Reexamined: Male-Female Differences in Comprehension," Journal of
Economic Education (Fall 1987), Vol. 18, pp. 365-375.
McKenzie, Richard B., "The Economic Effects of Grade Inflation
on Instructor Evaluations: A Theoretical Approach," Journal of
Economic Education (Spring 1975), Vol. 6, pp. 99-109.
Mehdizadeh, Mostafa, "Loglinear Models of Student Course
Evaluation," Journal of Economic Education (Winter 1990), Vol. 21,
pp. 7-21.
Saunders, Phillip, "The Third Edition of the Test of
Understanding in College Economics," American Economic Review (May
1991), Vol. 81, No. 2, pp. 32-37.
Saunders, Phillip, The TUCE III Data Set: Background Information
and File Codes (documentation, summary tables, and five 3.5 inch
double-sided, high density disks in ASCII format). (New York: National
Council on Economic Education, 1994.)
Seiver, Daniel A., "Evaluations and Grades: A Simultaneous
Framework," Journal of Economic Education (Summer 1983), Vol. 14,
pp. 32-38.
Siegfried, John J. and Rendigs Fels, "Research on Teaching
College Economics: A Survey," Journal of Economic Literature
(September 1979), Vol. 17, pp. 923-969.
Siegfried, John J. and William Walstad, "Research on Teaching
College Economics," Chapter 20 in Phillip Saunders and William
Walstad, eds., The Principles of Economics Course: A Handbook for
Instructors (New York: McGraw-Hill, 1990), pp. 270-286.
Siegfried, John J. and James T. Wilkinson, "The Economics
Curriculum in the United States: 1980," American Economic Review
(May 1982), Vol. 72, pp. 125-138.
T. Aldrich Finegan and John J. Siegfried are professors of
economics, Department of Economics and Business Administration,
Vanderbilt University, Nashville, TN 37235. We thank Donald Coffin, John
Olsen, W. Lee Hansen, and an anonymous referee for comments on an
earlier draft. Hao Zhang provided indispensable computational assistance.