A suggested evaluation metric instrument for faculty members at colleges and universities.
Ridley, Dennis ; Collins, Jennifer
INTRODUCTION
The purpose of this study is to introduce an instrument for
evaluating university faculty members. The study is unique in its
teaching evaluation metric, which is based on the product of a teaching
effectiveness coefficient and the number of student contact hours. The
methodology for determining the teaching effectiveness coefficient is
the main contribution of this study. However, the responsibilities of
faculty members are not limited to teaching; they include teaching,
research and service.
The distribution of assigned responsibility and the allocation of
effort among these components affect the ability to perform each of the
others. Therefore, it is necessary to present a complete measure of
total performance. This study accordingly includes a research
evaluation component and a service evaluation component. These are not
offered as new concepts, but care is taken to identify the elements of
research and service that are time consuming, and that therefore
depend on the time that must be allocated to
teaching. The time allocated to teaching depends on the number and size
of the classes assigned to the professor.
There is one feature of the research evaluation metric that may be
non-traditional. Basic research papers may ultimately be those with the
greatest impact. Unfortunately, the impact of basic research is not
easily seen until much later, after the related applications are
developed and the commercial value is realized. Therefore, by formally
recognizing a limited number of non-refereed, newsworthy follow-up
articles, wider dissemination to non-specialists is encouraged and a
shorter time to impact may be realized. This study puts forth a
comprehensive performance evaluation method for faculty members at
colleges and universities.
BACKGROUND OF THIS STUDY
In order to be objective, the instrument is based on a quantitative
approach, in which all performance criteria are published in advance of
the evaluation period. After decades of quantitative growth in higher
education, consensus is emerging on the need to establish a valid and
reliable evaluation system of teaching (Wolfer & Johnson 2003; Ma
2005). The instrument put forth in this paper will allow professors to
determine how they can best set their own goals and objectives, and be
confident that they will be recognized and rewarded accordingly.
Extensive research has been conducted on student evaluations and
their validity (e.g., Kozub 2008; Ryan, Anderson & Birchler 1980;
McNatt 2010; McPherson, Jewell, & Kim 2009). Yunker and Yunker
(2003) found a negative relationship between student evaluations and
student achievement. Grade expectations and first impressions have also
been noted to possibly affect student ratings of instructors (Centra
2003; Buchert et al. 2008). Marshall (2005) observed that conventional
teacher supervision and evaluation methods were ineffective and
inefficient. Most student evaluations of teachers create highly
skewed distributions that require institutions to use percentile
rankings of instructors (Clayson & Haley 2011).
The inability to feel confident in the reward/effort ratio is
demotivating, in that some professors may perceive that, to a large
extent, it does not matter what they do, so long as it is politically
correct. Perhaps the worst outcome of a poor evaluation system is when
the evaluation system discourages high academic performance (Coker et
al. 1980; Weinstein 1987). There is also evidence that student
evaluations infringe on the academic freedom of faculty (Haskell 1997;
Ryan et al. 1980; Dershowitz 1994; Stern & Flynn 1995).
Teaching styles vary from professor to professor, and so do
student learning styles. It is therefore imperative that an objective
teaching metric be utilized to cover such a wide spectrum of approaches
to education. Xu (2012) proposes a comprehensive teaching model that
uses experts, students and examination of teaching. However, Xu's
(2012) multi-method approach involves contact with the subject under
evaluation. The teaching evaluation metric put forth in this paper is
in effect a no-contact, non-intrusive, non-confrontational,
non-threatening, non-coercive peer review of how well each professor
prepares students to perform in all the other professors' classes. The
proposed measure ranks each professor's contribution to what the team
of professors has collectively established as their institutional goal.
This professor evaluation metric includes teaching, service and
research contributions made by professors. One source of concern for
current evaluation metrics is student opinion regarding the quality of
teaching. This in turn has called into question the value of research.
The assumption there is that the issue is one of teaching versus
research. Faculty members would like to be considered as scholars and
not just teachers. They believe that research and teaching are
complementary and not competing activities (Sharobeam & Howard
2002).
The professorial evaluation metric is presented first. This allows
us to set up and define the ultimate purpose and utility of the
evaluation instrument. The teaching, research and service evaluation
metrics are then presented, in that order. In each case the metric is
fully illustrated by way of a small made-up example involving four
professors and ten students. Three of the four professors teach in the
same department and are to be evaluated in this study. The fourth
professor teaches in a college of general studies and would be evaluated
(not done in this paper) with a different peer group.
THE PROFESSORIAL EVALUATION METRIC
The following professorial evaluation metric (PEM), used to
determine a professorial evaluation score (PES), is designed to
incorporate measures of teaching, research and service, in a way that is
objective. It includes a teaching evaluation metric (TEM), used to
determine a teaching evaluation score (TES); a research evaluation
metric (REM), used to determine a research evaluation score (RES); and a
service evaluation metric (SEM), used to determine a service evaluation
score (SES).
The PES is an overall measure of a professor's contribution,
expressed as a fraction of the total contribution of all professors in
the instructional unit. The instructional unit may be defined as a
department, a professional school or college, or a university. The PEM
accounts for uneven distribution of effort and prior assignment of
responsibility between teaching, research and service, between
professors, and between different time periods. It is used for annual
evaluations, merit reward, tenure and promotion. Professorial
contributions require time to take effect.
A 5-year total PES will measure long, continuous and productive
contributions. The PEM requires that each professor be assigned a unique
identification code. For the purpose of giving the most general possible
description of the model, assume that the number of professors in an
instructional unit is k. Then let the professor code be j where j=1, 2,
3, ... k. The professorial evaluation score for the jth professor is
determined as follows:
$$PES_j = T_j \, TES_j + R_j \, RES_j + S_j \, SES_j \qquad (1)$$
Where
$TES_j$ = fraction of the total professorial teaching contribution
made by the jth professor,
$RES_j$ = fraction of the total research contribution made by the
jth professor,
$SES_j$ = fraction of the total service contribution made by the
jth professor,
$T_j$ = fraction of the jth professor's assignment of
responsibility given to teaching ($T_j \geq 0.25$, i.e., an average of
at least one 3-hour course per semester, to maximize the TES
contribution to the PES),
$R_j$ = fraction of the jth professor's assignment of
responsibility given to research ($R_j \geq 0.20$),
$S_j$ = fraction of the jth professor's assignment of
responsibility given to service ($0.05 \leq S_j \leq 0.1$),
$T_j + R_j + S_j = 1$ (assigned prior to the evaluation period,
then revised later to maximize the PES),
j = 1, 2, 3, ..., k, and k = number of professors in the instructional
unit.
Details of the models for determining TES, RES and SES are given
separately, in the sections on the TEM, REM and SEM.
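As an illustration of equation (1), and of the normalization of each
weighted score to a fraction of the instructional unit total as applied
in Table 2 below, the following short Python sketch may be used. The
function name and input layout are illustrative assumptions, not part
of the model itself.

    def pes_scores(assignments, scores):
        # assignments[j] = (T, R, S) for professor j, with T + R + S = 1
        # scores[j] = (TES, RES, SES) for professor j
        weighted = [t * tes + r * res + s * ses
                    for (t, r, s), (tes, res, ses) in zip(assignments, scores)]
        total = sum(weighted)
        return [w / total for w in weighted]  # per unit PES, sums to 1

    # Figures from Table 2: Dr. Knowhow, Dr. Wiseman, Dr. Brainstorm
    print(pes_scores(
        [(0.25, 0.70, 0.05), (0.60, 0.30, 0.10), (0.70, 0.20, 0.10)],
        [(0.181, 0.143, 0.333), (0.217, 0.340, 0.333), (0.602, 0.517, 0.333)],
    ))  # approximately the per unit PES values of Table 2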
Application To The Reward System
The score obtained from the above evaluation may be applied to the
granting of annual merit raises, tenure and promotion according to the
criteria given in Table 1. None of the models discussed in this paper
can provide any absolute measure of performance. Each score merely
reflects a measure of relative performance. Therefore, the criteria are
stated in a manner that is consistent with a professor's relative
position in their peer ranking. The parameters that define the criteria
are flexible, but they must be established by agreement prior to the
evaluation period.
The purpose of the peer faculty vote is to verify the correctness
of the procedure and the evaluation data. It is not a value judgment of
the candidate. The value judgment is already built into the evaluation
criteria. The following example will help to clarify the calculation of
a PES.
Example: Calculating the PES
Consider a small example based on prior assignments of
responsibility and performance scores as shown in Table 2. The
calculations for the three individual performance scores will be shown
later in the relevant section. The three professors listed teach in the
same professional college, and are being evaluated. The weighted average
scores are calculated first. These scores are then expressed as a
fraction of the total for all three professors. It is now a simple
matter to apply these values of per unit PES to the reward criteria.
These assignments of responsibility are based on previously
determined goals and objectives; professorial credentials, strengths,
interests and abilities to contribute to the university; and the
university's immediate teaching needs to cover courses and long
term needs for research, service and development. The application of
these prior assignments of responsibility is not unlike management by
objectives.
However, such a design would limit professorial ingenuity for the
re-deployment of effort as unexpected opportunities present themselves.
For example, a professor may have accepted a high teaching and service
assignment of responsibility. Then, quite unexpectedly, a journal
expresses early interest in a paper that the professor submitted for
publication. The paper is conditionally accepted, subject to significant
additional data collection and evaluation. The professor already has a
full load of work to do but may decide to risk working overtime, without
additional pay, to complete the paper as required, in an attempt to
secure a final letter of acceptance during the evaluation year.
In order to encourage dynamic re-deployment of effort during the
evaluation period, a process of management by dynamic objective (MBDO)
is employed. The assignments of responsibility are changed at the end of
the evaluation period, so as to maximize each professor's final
weighted average score, subject to the previously determined
institutional constraints. The final results are shown in Table 3.
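Because the weighted average is linear in the responsibility fractions,
the end-of-period reassignment can be computed greedily: start each
fraction at its lower bound, then allocate the remaining weight to the
highest-scoring components within their bounds. A minimal Python sketch
under the constraints stated earlier (the function name and the upper
bounds implied by the other lower bounds are illustrative assumptions):

    def mbdo(tes, res, ses):
        lo = {"T": 0.25, "R": 0.20, "S": 0.05}
        hi = {"T": 0.75, "R": 0.70, "S": 0.10}  # implied by the other lower bounds
        score = {"T": tes, "R": res, "S": ses}
        alloc = dict(lo)
        spare = 1.0 - sum(lo.values())
        # pour the remaining weight into the best-scoring components first
        for key in sorted(score, key=score.get, reverse=True):
            add = min(spare, hi[key] - alloc[key])
            alloc[key] += add
            spare -= add
        return alloc

    print(mbdo(tes=0.181, res=0.143, ses=0.333))
    # Dr. Knowhow: approximately T = 0.70, R = 0.20, S = 0.10, as in Table 3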
The Teaching Evaluation Metric
The following teaching evaluation metric is designed to replace
subjective methods of teaching evaluation, with a scientific method.
Subjective methods are arbitrary methods, based entirely on the opinion
of an administrator. Evaluation scores are assigned annually for
professors who teach academic courses. These numbers are not necessarily
tied to teaching innovation, methodology, workload or currency of
teaching material. They are not necessarily tied to the purpose of the
university, which is learning (Coker et al. 1980; Weinstein 1987).
They are based primarily on the degree to which the professor's
teaching philosophy is in accord with that of the administrator.
Furthermore, the score is likely to be reduced when the evaluator
hears complaints from students. While some complaints may be justified,
some are not. For example, a complaint that the professor does not show
up for classes, or does not give feedback through graded tests, is
justifiable. Too often the complaint is against those
professors who have high grading standards, and who are accused of
failing students without good cause. Professors such as these are almost
invariably attempting to build quality, raise student intellectual
curiosity and responsibility and improve study habits.
Given the proper institutional support, rigorous professors raise
the educational level from memorization and regurgitation to critical
thinking, understanding and intellectual leadership. Eventually the
grades and passing rates must increase. However, any attempt to change
student behaviour for the better may be met with severe criticism,
drastic reductions in evaluation scores and job termination.
To the extent that they corroborate preconceived ideas about a
professor, administrators may incorporate formal university sponsored
student evaluations of professors. These evaluations are known to be
popularity contests, inversely related to learning (Coker et al. 1980;
Weinstein 1987). As an alternative to simply studying hard as they
should do, failing students may collectively choose to exercise
political pressure against a professor. In that case administrators may
make political choices between students and professor.
An objective teaching evaluation metric will serve all stakeholders
well. It is scientific, and is based on quantifiable data that is tied
to learning. The cumulative student grade point average is regressed on
the fraction of the number of credit hours that students are taught by
each professor. Each regression coefficient measures the marginal rate
at which the corresponding professor contributes to student learning as
measured by the average number of cumulative grade points earned,
ceteris paribus. The teaching evaluation score is the total contribution
to learning, and is calculated from the product of the rate of
contribution (adjusted for grade inflation) and the total number of
student credit hours taught by the professor. Professors who do in fact
make a high contribution to the evaluation metric are protected from
student criticism. Other professors may attend their classes to see what
to do. Failing students will be forced to focus their efforts on
improving their own performance.
The Regression Model
The TEM is based on the following regression model:
$$y_i = \beta_0 + \beta_1 x_{i1} + \beta_2 x_{i2} + \cdots + \beta_j x_{ij} + \cdots + \beta_k x_{ik} + \epsilon_i, \qquad i = 1, 2, \ldots, n \qquad (2)$$
Where
$y_i$ = cumulative grade point average (re-centered around $c = 2$)
of the ith student,
$x_{ij}$ = fraction of the total number of semester hours that the
ith student was taught by the jth professor, $j = 1, 2, \ldots, k$,
$\beta_0$ = regression parameter representing the extent to which
grade point average is unaffected by direct contact hours within the
instructional unit,
$\beta_j$ = regression parameter containing information regarding
the impact that the jth professor has on student grade point average;
the errors $\epsilon_i$ are independent and normally distributed with
zero mean and variance $\sigma^2$,
k = number of professors in the instructional unit,
n = number of students.
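For concreteness, the following Python sketch fits equation (2) by
ordinary least squares to the data of Table 4; any statistics package
would serve equally well, and the layout shown is an illustrative
assumption.

    import numpy as np

    # y: re-centered cumulative GPA of each student (Table 4)
    # columns of X: fraction of hours taken from professors 1, 2 and 3
    y = np.array([2.008, 1.842, 2.675, 2.008, 2.275,
                  1.942, 1.500, 0.942, 3.033, 1.875])
    X = np.array([[0.300, 0.400, 0.300], [0.300, 0.100, 0.300],
                  [0.400, 0.300, 0.300], [0.300, 0.100, 0.300],
                  [0.300, 0.300, 0.400], [0.300, 0.100, 0.300],
                  [0.000, 0.250, 0.250], [0.300, 0.400, 0.000],
                  [0.300, 0.400, 0.300], [0.300, 0.300, 0.400]])
    A = np.column_stack([np.ones(len(y)), X])     # prepend an intercept column
    beta, *_ = np.linalg.lstsq(A, y, rcond=None)  # [beta0, beta1, beta2, beta3]
    b = beta[0] + beta[1:]                        # teaching effectiveness b_j
    # compare beta and b with the estimates reported in Table 5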
Assuming that the department or college allocates the budget for
annual raises, then as far as the distribution of merit raises is
concerned, there is no advantage in evaluating and ranking professors
other than by department or college. There are also good mathematical
reasons to narrow the focus. If all of the professors in the university
were included in the regression model, then the university wide matrix
of independent variables would be sparse. Also, since the total number
of hours taken by each student is equal to the sum of the number of
contact hours with each professor, all rows in the matrix would sum to
one, giving rise to multi-collinearity.
It will be assumed that students have not all been taught the same
number of hours by each and every professor in the instructional unit.
If multi-collinearity arises for any other reason, it will be assumed
that there is some combination of departments that will break up the
correlation.
The marginal rate at which the jth professor contributes to student
grade point average, ceteris paribus, is given by $\beta_j$ grade
points per contact hour of instruction. Assuming that grade points
measure learning, $\beta_j$ represents the institution- and
instructional-unit-specific teaching effectiveness of the jth
professor, in the presence of all contributions by all the other
professors. It is also assumed that each professor contributes to
student learning in some general way, through advising or any number of
other indirect ways, and that such learning is reflected in $\beta_0$.
Therefore, teaching credit is determined from the teaching
effectiveness coefficient $b_j = \hat{\beta}_0 + \hat{\beta}_j$, where
the hats denote estimated parameters. It reflects the jth professor's
knowledge, proficiency, ability to impart knowledge, contribution to
student intellectual development and study habits, ability to leverage
the contributions to date made by all other professors, and
contribution to student ability to perform in the professor's course,
as well as in other courses taken at the university. In order to
correct for grade inflation and differences in grading standards, the
grades reported for each class are re-centered around a grade of
$c = 2$ points before totalling up the grade points. For each class,
the re-centered grade values are the original grade values minus the
average grade value for the class, plus 2.0 (the alternative to
re-centering the grades would be to simply use standardized test scores
for $y_i$). Therefore, this is a professorial peer evaluation of the
preparedness of each other's students. It is conducted by the best
experts that the university has to offer. Furthermore, the evaluation
is kept honest by grade re-centering (the average grade is the same for
all professors).
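The re-centering step itself is a one-line transformation per class. A
minimal Python sketch (function name illustrative), checked against
footnote (1) of Appendix 1:

    def recenter(class_grades, c=2.0):
        # shift every grade in the class so that the class average becomes c
        mean = sum(class_grades) / len(class_grades)
        return [g - mean + c for g in class_grades]

    # Appendix 1, footnote (1): class grades A, B, C, C on the 4-point scale
    print(recenter([4, 3, 2, 2]))  # [3.25, 2.25, 1.25, 1.25]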
The teaching evaluation score ([TES.sub.j]) for the jth professor
is based on a combination of the teaching effectiveness coefficient
([b.sub.j]) and the teaching workload. It is measured by the total
contribution to the number of student credit hours earned by students
who were taught by the jth professor, expressed as a fraction of the
contribution to the grand total number of student credit hours made by
all professors.
$$TES_j = \frac{b_j \sum_{i=1}^{n} H_{ij}}{\sum_{j=1}^{k} \left( b_j \sum_{i=1}^{n} H_{ij} \right)} \qquad (3)$$
Where $H_{ij}$ represents the number of contact hours that the
jth professor taught the ith student in the evaluation year. If the ith
student was not taught by the jth professor during the evaluation year,
then $H_{ij} = 0$.
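Given the estimated coefficients and each professor's evaluation-year
contact hours, equation (3) is a simple normalization. A minimal Python
sketch, using the figures of Table 5 (function name illustrative):

    def tes(b, hours):
        # b[j] = teaching effectiveness; hours[j] = sum over students of H_ij
        contrib = [bj * hj for bj, hj in zip(b, hours)]
        total = sum(contrib)
        return [c / total for c in contrib]

    print(tes([1.9903, 1.4144, 3.6451], [16, 27, 29]))
    # approximately [0.181, 0.217, 0.602], as in Table 5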
If it were true that large class sizes lower teaching
effectiveness, then the teaching effectiveness coefficient would be
lowered. However, multiplying the teaching effectiveness coefficient by
the number of student credit hours will increase the TES, and thereby
offset the effect of class size on the TES. In order to assist in
maximizing teaching effectiveness, the university should attempt to
equalize and reduce class sizes. Where large class sizes are
unavoidable, technology may be used to mitigate reductions in teaching
effectiveness.
Example: Calculating the TES
Consider the data given in Appendix 1. In addition to the three
business school professors, there is a fourth professor who teaches
outside of the immediate instructional unit. The first step is to
calculate the regression coefficients. The data required for the
regression analysis is extracted and presented in Table 4.
The dependent variable is student GPA, obtained by dividing the
cumulative number of grade points for each student by the number of
credit hours taken by the student. The independent variables are the
fraction of the number of hours that the student has taken from each
professor, obtained by dividing the number of credit hours taken from
each professor by the total number of hours taken by the student. The
estimated regression coefficients, the teaching effectiveness
coefficients, and the calculations for TES for the 1995-1996 evaluation
year are summarized in Table 5.
The Research Evaluation Metric
Each faculty member will be assigned a total research score based
on the number and quality of each published paper, and the number of
authors. The quality of the publication will be determined purely from
the journal in which it appears. That is, the editorial review process
will be accepted without question. A published rank listing of journals
may be used to classify each journal. Alternatively, the dean and/or
faculty members of the instructional unit determine the journal rank
scores. However, in each case the professor has the responsibility for
submitting proof of the editorial policy, so that the journal may be
classified. A publication in a high-ranking journal will contribute a
higher score. However, it will most likely require more work and a
longer time to publication. It is the professor's responsibility to
review the list of journals, their rank and score, and to publish papers
where the reward/effort ratio is most favourable to them, given all of
their duties and responsibilities. Decisions regarding this trade-off
are left to the individual professor. The total research score is the
sum of the products of the journal rank score and the author
contribution score for each paper.
For the purpose of annual evaluations, only the papers
corresponding to the evaluation year are considered. However, a
professor may elect to defer consideration of a publication to later
years so as to smooth out bunching of unusually good and lean years.
This is important since in any one year there are only so many rewards
to go around. This option encourages the professor to publish as much as
possible as soon as possible.
The categories, the rank score and the criteria for ranking the
publication outlets are given in Table 6. These categories are well
differentiated. Scholarly books/monographs containing a theoretical
contribution (category AAAA) provide no financial reward to the
professor. They are like category A & AA papers, except that they
are more expansive and more time consuming to produce. Therefore, these
receive twice the credit of category A papers. There is no limit to the
number of these that receive research credit.
For the purpose of annual faculty evaluations, tenure and
promotion, the journals ranked A & AA carry the same score. However,
the special rank AA classification for journals is used to denote that
by virtue of the special nature of these publications, they are superior
to journals ranked A. The AA status may be used for awarding special
recognition to professors who choose the most difficult task for making
their contribution in the form of new theory and methodology. There is
no limit to the number of these that receive research credit.
Commercial teaching textbooks/monographs and professional
books/monographs carry their own financial awards, and professors cannot
expect to be rewarded twice. Such books contain a rearrangement of old
knowledge, but they also disseminate research. Also, the harvesting of
the research is a research related activity. Furthermore, it is time
consuming. On balance, these receive research points that are equivalent
to one category A research paper. However, the book is not completed
research, therefore, in order to contribute to the total research score
credit, the number of commercial publications may not exceed the number
of rank A & AA journal articles.
The research points granted for Category B and C publications are
intended to encourage wide dissemination of the Category A & AA
research findings, to non-specialists. This can increase the impact of
research. They are valued at approximately one third and one tenth of a
category A paper, respectively. However, the research must first be
done. Therefore, in order to contribute to the total research score
credit, the number of rank B journal articles may not exceed 3 times the
number of rank A & AA journal articles, and the number of rank C
journal articles may not exceed 10 times the number of rank A & AA
journal articles.
Additional categories are always possible. However, too many
categories, which are not highly differentiated, may become a source of
counterproductive contention. A sample listing of journals is given in
Table 7.
Consider the 1995-1996 evaluation year. Each publication and the
corresponding rank score, author contribution score, and total research
score for Dr. Brainstorm are listed in Table 8. The name position code
is used to distribute credit for the publication. The format of the name
position code is (p/n), where n denotes the number of authors and p
denotes the position in which the author appears in the list of authors,
p=1 being the first position and p=n being the nth position. For single
authored papers the name position credit is 1. For the case of two
authors the name position credit is 0.6 for the first author and 0.4 for
the second author. For the case of three authors the name position
credit is 0.5 for the 1st author and 0.25 each for the other authors.
For the case of four authors the name position credit is 0.4 for the 1st
author and 0.2 each for the other authors. For the case of five or more
authors the name position credit is 1 divided by the number of authors.
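The name position credit schedule can be stated compactly in code. A
minimal Python sketch (function name illustrative):

    def name_position_credit(p, n):
        # credit for the author in position p of n authors (p = 1 is first)
        if n == 1:
            return 1.0
        if n == 2:
            return 0.6 if p == 1 else 0.4
        if n == 3:
            return 0.5 if p == 1 else 0.25
        if n == 4:
            return 0.4 if p == 1 else 0.2
        return 1.0 / n  # five or more authors share credit equally

    # e.g., a rank A paper (100 points) with name position 2/5:
    print(100 * name_position_credit(2, 5))  # 20.0, as in Table 8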
RES is the total research score expressed as a fraction of the sum
of the total research scores for all professors. The total research
score and RES for each professor are summarized in Table 9.
The Service Evaluation Metric
The category, code number and score for each of the various
volunteer service activities are given in Table 10. The categories are
highly differentiated by the number and diversity of the organization
being served. The activities listed are those recognized as enhancing
education in general, the roles of the department, college and
university, and/or as benefiting the general welfare of the local,
regional and national communities. The scores increase in increments of
5 points as the activity moves further away from the familiar territory
of the department or college. In all categories, the chairperson/officer
activity receives four times the score of a regular member activity.
Additional categories are always possible. However, too many categories,
which are not highly differentiated, may become a source of
counterproductive contention. Alternatively, the service code scores are
determined by the dean and/or faculty members of the instructional unit.
Consider the following data on service activities for the
1995/96 evaluation year. Dr. Knowhow chaired a committee that raised
$1,000,000 for the American Red Cross. Dr. Wiseman chaired an AACSB
re-accreditation visitation team to Clemson University, and a
departmental committee. Dr. Brainstorm served as president of the
national association of university professors.
SES is the total service score expressed as a fraction of the sum
of the total service scores for all professors. The total service score
and SES for each professor are summarized in Table 11.
CONCLUDING REMARKS
This study provides the methodology, with illustrations, for a
comprehensive objective system for evaluating university professors. The
system is flexible, in that the department or college within the
university may determine the defining parameters for the evaluation
criteria. However, the system is objective because these parameters must
be established by agreement, prior to the period of evaluation.
Manipulation
An interesting question to ask about this or any evaluation system
is: is there a way to beat it? One would be foolish to say no
emphatically. Any system is subject to manipulation of one kind or
another, and in that sense there may be no such thing as a foolproof
system. So let us take a pre-emptive look at the clearly identifiable
ways to influence the outcomes, and at ways to mitigate the effects of
such manipulation. The research and
service metrics are fairly straightforward. The list and merit of each
creditable activity must be established by agreement, ahead of the
evaluation period. However, the component that is most complex and which
appears to be open to some manipulation is the TEM. Here is what can be
observed.
The TEM merely associates professors with student grade point
average. The only worthwhile way for a professor to unduly influence the
TEM is to teach more students with above average GPA than students with
below average GPA. The TEM assumes that students are assigned, or assign
themselves randomly to professors. However, an academic director may
make systematic assignments of professors to certain types of students.
For example one professor may teach mostly freshmen while another may
teach mostly seniors. If it were true that grade point average is
correlated with the total number of credit hours taken, or with
freshman, sophomore, junior and senior classifications, etc., then the
errors from the regression equations will exhibit heteroscedasticity.
This condition of systematic change in error variance with fitted
grade point average is easily rectified by using generalized least
squares estimators. Barring that, professors may simply ask to be
assigned to certain teaching activities that they believe to be most
beneficial, by a system of rotation. If the systematic assignment of
high or low ability students to certain professors is the concern, that
too is easily corrected by modifying the regression model to include
student achievement (S.A.T. or A.C.T.) scores. Recall also that the
effect of grade inflation strategies was eliminated by grade
re-centering prior to estimating the regression coefficients.
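As one concrete way to carry out the suggested correction (an assumed
approach, not one prescribed by the metric itself), squared OLS
residuals can be regressed on the fitted values to estimate the
variance function, and the model then refit by weighted least squares.
A Python sketch using the statsmodels package:

    import numpy as np
    import statsmodels.api as sm

    def wls_refit(X, y):
        # ordinary least squares first
        A = sm.add_constant(X)
        ols = sm.OLS(y, A).fit()
        # crude variance model: squared residuals versus fitted GPA
        var_fit = sm.OLS(ols.resid ** 2, sm.add_constant(ols.fittedvalues)).fit()
        weights = 1.0 / np.clip(var_fit.fittedvalues, 1e-6, None)
        # refit, down-weighting the high-variance observations
        return sm.WLS(y, A, weights=weights).fit()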
So, how else can the professor unduly influence the TEM? It appears
that one way is to develop a reputation. This is the professor's
prerogative. There is no law against it, and it cannot be prevented.
However, students, where possible, may select professors based on their
reputation. Since there is no reward or punishment for grade inflation
or deflation, that is, for variations in grading standard, a professor
may implement tough or relaxed grading standards with impunity. If the
reputation is one of high expectations and high grading standards, the
professor may attract students who are highly desirous of learning and
competing on that basis. Those students may be high GPA students. If on
the other hand, the reputation is one of low expectations and relaxed
grading standards, the professor may attract low GPA students.
Therefore, it may be fair to say to each professor "be careful
what you wish for because you just might get it!" Considering that
the objective is to raise standards and performance, this particular
strategy for developing a good reputation may not be a bad thing. It
allows the injection of student opinion, accounting for student
qualifications as measured by GPA. Furthermore, it may eliminate the
need for written student evaluations of professors. However, assume the
worst-case scenario, that student evaluations would have been a
popularity contest for professors anyhow. Then, in the TEM process,
where in effect students perform evaluations by virtue of which
professor they select, the professor can only benefit by being popular
with the high GPA students. Unlike regular student evaluations, there
is an incentive to get it right, since the price of a selection error
is borne by both student and professor. Perhaps this should be
encouraged. However, the obvious way to mitigate it is to conceal the
names of all professors prior to the time when the students register
for classes.
Teamwork
One truly outstanding, desirable and unpreventable way of
influencing the TEM is for a professor to follow the progress of his or
her own students. The professor simply provides follow up advice, help,
and additional instruction to students taught by that professor, thereby
helping them to raise the grades that they earn in their other classes.
The other professors of those students can only benefit by simple
association. Those other professors must, however, make their own
contribution if they are to maximize their own benefits. It is not
clear how this could lead to negative outcomes. It encourages maximum
cooperation and professorial performance.
A university education requires time to become most effective. It
is important not to sacrifice long-term goals for short-term goals. At
the same time, we wish to be responsive to rapidly changing theories,
methodologies and technologies within the emerging knowledge base, as
well as through available educational tools. By evaluating professors
based on the sum of the last 5 yearly total overall professorial
evaluation scores, we can encourage a combination of both short term and
long term goals.
Obviously the TEM cannot teach us how to teach. Furthermore, a
regression model cannot by itself determine cause and effect
relationships. What we can reasonably theorize is that we can identify
those professors with whom, if students have more instructional time,
the students will on average earn a higher GPA.
Empowerment
The PEM empowers the professor through his or her own performance
and involvement. In particular, a highly motivational feature of the PEM is
its management by dynamic objective (MBDO), which rewards optimal
re-deployment of effort by reassignment of responsibility. Each
professor optimizes their own PEM score by personal allocation of their
own efforts in teaching, research and service.
Automation
The data required for the TEM is available from computerized
student academic transcripts. The data for the REM and the SEM may be
fed into the computer. The computer can be programmed to evaluate each
professor automatically at the end of each academic year. Each professor
can be provided with a detailed personalized report, including all
individual performance statistics, and final ranked scores in each
category.
Appendix 1
Student transcript

Professor   Name                   Professor code   College
1           Dr. Jack Knowhow       KNJ              School of Business
2           Dr. Tony Wiseman       WIT              School of Business
3           Dr. Peter Brainstorm   BRP              School of Business
4           Dr. Art History        HIA              General Studies
Student Name          Hours  Course    Semester &   Prof.  Grade  Hrs.  Centered GP        Centered
                      Taken            Year taught                                         Cum. GP
1  Sherly Studyhard   10     course 1  Sum 1990     KNJ    A      3     3.25x3 = 9.75 (1)
                             course 2  Spr 1996     WIT    C      1     1.33x1 = 1.33
                             course 3  Fall 1995    BRP    F      3     0.00x3 = 0.00
                             course 4  Fall 1995    WIT    B      3     3.00x3 = 9.00      20.08
2  Henry Punctual     10     course 1  Sum 1990     KNJ    B      3     2.25x3 = 6.75
                             course 2  Fall 1995    WIT    A      1     2.67x1 = 2.67
                             course 3  Fall 1995    BRP    D      3     1.00x3 = 3.00
                             art hist  Spr 1996     HIA    C      3     2.00x3 = 6.00      18.42
3  John Plodder       10     course 1  Fall 1995    KNJ    A      3     3.25x3 = 9.75
                             course 2  Spr 1996     KNJ    C      1     2.00x1 = 2.00
                             course 3  Fall 1995    BRP    B      3     3.00x3 = 9.00
                             course 4  Fall 1995    WIT    C      3     2.00x3 = 6.00      26.75
4  Jenny Quickstudy   10     course 1  Sum 1990     KNJ    C      3     1.25x3 = 3.75
                             course 2  Spr 1996     WIT    C      1     1.33x1 = 1.33
                             course 3  Fall 1995    BRP    B      3     3.00x3 = 9.00
                             art hist  Spr 1996     HIA    C      3     2.00x3 = 6.00      20.08
5  Joe Learner        10     course 1  Sum 1990     KNJ    C      3     1.25x3 = 3.75
                             course 2  Fall 1995    BRP    A      1     2.50x1 = 2.50
                             course 3  Spr 1996     BRP    A      3     2.50x3 = 7.50
                             course 4  Fall 1995    WIT    B      3     3.00x3 = 9.00      22.75
6  Helen Examace      10     course 1  Fall 1995    KNJ    B      3     2.25x3 = 6.75
                             course 2  Fall 1995    WIT    C      1     0.67x1 = 0.67
                             course 3  Fall 1995    BRP    C      3     2.00x3 = 6.00
                             art hist  Spr 1996     HIA    C      3     2.00x3 = 6.00      19.42
7  Merit Scholar      12     art hist  Spr 1996     HIA    C      3     2.00x3 = 6.00
                             art hist  Spr 1996     HIA    C      3     2.00x3 = 6.00
                             course 3  Fall 1995    BRP    D      3     1.00x3 = 3.00
                             course 4  Fall 1995    WIT    D      3     1.00x3 = 3.00      18.00
8  Jefferson High     10     course 1  Fall 1995    KNJ    D      3     0.25x3 = 0.75
                             course 2  Fall 1995    WIT    A      1     2.67x1 = 2.67
                             art hist  Spr 1996     HIA    C      3     2.00x3 = 6.00
                             course 4  Fall 1995    WIT    F      3     0.00x3 = 0.00      9.42
9  Stan Bookworm      10     course 1  Spr 1996     KNJ    C      3     2.00x3 = 6.00
                             course 2  Spr 1996     WIT    A      1     3.33x1 = 3.33
                             course 3  Fall 1995    BRP    A      3     4.00x3 = 12.00
                             course 4  Fall 1995    WIT    B      3     3.00x3 = 9.00      30.33
10 Yolette Senior     10     course 1  Fall 1995    KNJ    B      3     2.25x3 = 6.75
                             course 2  Fall 1995    BRP    B      1     1.50x1 = 1.50
                             course 3  Spr 1996     BRP    B      3     1.50x3 = 4.50
                             course 4  Fall 1995    WIT    C      3     2.00x3 = 6.00      18.75
Class grade rosters by professor (class grade average in parentheses):
KNJ: Sum 1990 course 1: ABCC (2.75); Fall 1995 course 1: ABDB (2.75);
Spr 1996 course 1: C (2); Spr 1996 course 2: C (2).
WIT: Fall 1995 course 2: ACA (3.33); Fall 1995 course 4: BCBDFBC (2);
Spr 1996 course 2: CCA (2.67).
BRP: Fall 1995 course 2: AB (3.5); Fall 1995 course 3: FDBBCDA (2);
Spr 1996 course 3: AB (3.5).
HIA: Spr 1996 art hist: CCCCCC (2).
(1) Class grade average = (4+3+2+2)/4 = 2.75. Centered grade =
4 - 2.75 + 2 = 3.25. Centered grade points = 3.25x3 = 9.75.
REFERENCES
Buchert, S., Laws, E.L., Apperson, J.M., & Bregman, N.J.
(2008). First Impressions and Professor Reputation: Influence on Student
Evaluations of Instruction. Social Psychology of Education 11, 397-408.
Centra, J. A. (2003). Will Teachers Receive Higher Student
Evaluations by Giving Higher Grades and Less Course Work? Research in
Higher Education 44, 495-518.
Clayson, D.E., & Haley, D.A. (2011). Are Students Telling Us
the Truth? A Critical Look at the Student Evaluation of Teaching.
Marketing Education Review 21, 101-112.
Coker, H., Medley, D.M., & Soar, R.S. (1980). How Valid Are
Expert Opinions About Effective Teaching? Phi Delta Kappan 62, 31-149.
Dershowitz, A. (1994). Contrary to popular opinion. New York:
Berkley Books.
Haskell, R. E. (1997). Academic Freedom, Tenure, and Student
Evaluation of Faculty: Galloping Polls in the 21st Century. Education
Policy Analysis Archives 5, from http://olam.ed.asu.edu/epaa/v5n6.html
Kozub, R. M. (2008). Student Evaluations of Faculty: Concerns and
Possible Solutions. Journal of College Teaching & Learning 5, 35.
Ma, X. Y. (2005). Establishing Internet Student-Assessing of
Teaching Quality System to Make the Assessment Perfect. Heilongjiang
Researches on Higher Education 6, 94-96.
McNatt, D. B. (2010). Negative Reputation and Biased Student
Evaluations of Teaching: Longitudinal Results From a Naturally Occurring
Experiment. Academy of Management Learning and Education 9, 225-242.
McPherson, M. A., Jewell, R.T., & Kim, M. (2009). What
Determines Student Evaluation Scores? A Random Effects Analysis of
Undergraduate Economics Classes. Eastern Economic Journal 35, 37-51.
Ryan, J. J., Anderson, J.A., & Birchler, A.B. (1980). Student
Evaluations: The Faculty Responds. Research in Higher Education 12,
317-333.
Sharobeam, M. H., & Howard, K. (2002). Teaching Demands Versus
Research Productivity. Journal of College Science Teaching 31, 436-441.
Stern, J., & Flynn, P.D. (1995). Students propose a course of
action for grade inflation. The Bucknellian, from
www.bucknell.edu/bucknellian/sp95/03-02-95/ops/4165.html
Weinstein, L. (1987). Good Teachers Are Needed. Bulletin of the
Psychonomic Society 25, 273-274.
Wolfer, T. A., & Johnson, M.M. (2003). Re-evaluating Student
Evaluation of Teaching: The Teaching Evaluation Form. Journal of Social
Work Education 39, 111-121.
Xu, Y. (2012). Developing a Comprehensive Teaching Evaluation
System for Foundation Courses with Enhanced Validity and Reliability.
Educational Technology Research and Development 60, 821-837.
Yunker, P., & Yunker, J. (2003). Are Student Evaluations Of
Teaching Valid? Evidence From An Analytical Business Core Course.
Journal of Education for Business 78, 313-317.
Dennis Ridley
Jennifer Collins
Florida A&M University
Dennis Ridley studied Electrical Engineering at Middlesex
University in England and the University of the West Indies, where he
received the Master of Science degree in Computer Methods in Power
Systems Analysis. He received his Ph.D. degree in Engineering Management
from Clemson University. He has the distinction of a US patent,
publication in the Journal of the Royal Statistical Society, U.S. State
Department Fulbright Senior Specialist at Kharkov University in Ukraine
and Harvard Business School certificate in The Art & Craft of
Discussion Leadership. He is a Professor at Florida A&M University,
and a Faculty Associate in the Department of Scientific Computing at
Florida State University. He is widely published in many fields, and his
professional societies include the Institute for Operations Research and
Management Science and the International Institute of Forecasters, among
others.
Jennifer M. Collins is an Associate Professor of Management in the
School of Business and Industry at Florida A&M University in
Tallahassee, Florida. Dr. Collins holds a Ph.D. in Management from
Florida Atlantic University. She teaches Human Resource Management,
Strategies for Entrepreneurial Decision Making, Organizational Behavior,
Strategic Management and Business Policy courses. Her research interests
include: employee creativity, student learning assessment, and strategic
human resource management.
Table 1
Application of Professorial Evaluation Score to Merit Criteria

Annual merit raise = PES x amount of money allocated to merit raises.
Score required for tenure (after 5 years of service): last 5 years'
total PES ranks in the top 25% of peer faculty.
Score required for promotion to associate professor: last 5 years'
total PES ranks in the top 50% of peer faculty.
Score required for promotion to full professor: last 5 years'
total PES ranks in the top 25% of peer faculty.

In each case a simple majority vote (by Robert's Rules of Order)
of the peer faculty is also required. Publications can be deferred to
later years so as to smooth out bunching of unusually good and lean years.
Table 2
Calculation of Professorial Evaluation Score

                        Assignment of Responsibility
Name of Professor       T >= 0.25    R >= 0.2    0.05 <= S <= 0.1
Dr. Jack Knowhow        0.25         0.7         0.05
Dr. Tony Wiseman        0.60         0.3         0.10
Dr. Peter Brainstorm    0.70         0.2         0.10

                        Performance Evaluation Scores
Name of Professor       TES      RES      SES      Weighted Average   PES
Dr. Jack Knowhow        0.181    0.143    0.333    0.162              0.164
Dr. Tony Wiseman        0.217    0.340    0.333    0.266              0.270
Dr. Peter Brainstorm    0.602    0.517    0.333    0.558              0.566
Total for all professors =                         0.986              1.000

PES = (T x TES) + (R x RES) + (S x SES)
Table 3
Maximizing the Professorial Evaluation Score

                        Assignment of Responsibility
Name of Professor       T >= 0.25    R >= 0.2    0.05 <= S <= 0.1
Dr. Jack Knowhow        0.70         0.20        0.10
Dr. Tony Wiseman        0.25         0.70        0.05
Dr. Peter Brainstorm    0.75         0.20        0.05

                        Performance Evaluation Scores
Name of Professor       TES      RES      SES      Weighted Average   PES
Dr. Jack Knowhow        0.181    0.143    0.333    0.175              0.166
Dr. Tony Wiseman        0.217    0.340    0.333    0.309              0.293
Dr. Peter Brainstorm    0.602    0.517    0.333    0.572              0.541
Total for all professors =                         1.056              1.000

PES = (T x TES) + (R x RES) + (S x SES)
Table 4
Data Extracted for Regression Analysis
(GP and H from Appendix 1, each divided by the total number of hours taken)

i     y_i      x_i1     x_i2     x_i3
1     2.008    0.300    0.400    0.300
2     1.842    0.300    0.100    0.300
3     2.675    0.400    0.300    0.300
4     2.008    0.300    0.100    0.300
5     2.275    0.300    0.300    0.400
6     1.942    0.300    0.100    0.300
7     1.500    0.000    0.250    0.250
8     0.942    0.300    0.400    0.000
9     3.033    0.300    0.400    0.300
10    1.875    0.300    0.300    0.400

y = GP / total number of hours taken. x = H / total number of hours taken.
Table 5
Calculation of TES

Name                   Code   Number (j)   Estimated beta_0   Estimated beta_j
Dr. Jack Knowhow       KNJ    1            0.2295             1.7608
Dr. Tony Wiseman       WIT    2            0.2295             1.1849
Dr. Peter Brainstorm   BRP    3            0.2295             3.4156

Name                   b_j      Σ_i H_ij   b_j x Σ_i H_ij   TES_j
Dr. Jack Knowhow       1.9903   16         31.8448          0.181
Dr. Tony Wiseman       1.4144   27         38.1888          0.217
Dr. Peter Brainstorm   3.6451   29         105.7079         0.602
Σ_j b_j Σ_i H_ij =                         175.7415         1.000

TES_j = (b_j Σ_i H_ij) / (Σ_j b_j Σ_i H_ij) for the 1995/1996
evaluation year.
Table 6
Journal Rank Score and Description

Publication Rank   Score   Description
AAAA               200     Scholarly book/monograph with a theoretical
                           contribution.
AA                 100     Refereed journal. Theory and methods.
                           Rigorous validation.
A                  100     Refereed journal. Applied. Data analysis.
                           Rigorous validation.
A                  100     Commercial teaching textbook/monograph.
A                  100     Commercial professional book/monograph.
B                  30      Refereed Proceedings, Professional.
C                  10      Opinion piece. Preprint. Technical report.
Table 7
Sample List of Journals

Sample List of Journals                        Rank   Score
Int. Journal of Production Economics           AA     100
Int. Transactions in Operational Research      AA     100
Decision Sciences                              AA     100
Computers and Industrial Engineering           AA     100
Management Science                             AA     100
The Journal of Business Forecasting            A      100
Harvard Business Review                        A      100
The Review of Business                         A      100
All locally refereed university research       B      30
  publications
Business Week                                  C      10
Wall Street Journal                            C      10
Example: Calculating the RES
Table 8
Research Evaluation for Dr. Peter Brainstorm

Paper/Book Title              Journal                      Score   Author     Credit
                                                                   position
The inverse Douglas-Cobb      Int. J. of Production        100     1/1        100x1 = 100
function.                     Economics
Temporal price elasticity:    Int. Transactions in         100     2/5        100x0.2 = 20
a new theory.                 Operational Research
Income elasticity of pork     The American Economist       100     1/3        100x0.5 = 50
in the Canadian market.
A survey of triple entry      Refereed Proceedings of      30      2/10       30x0.1 = 3
accounting methods            the 9th Conference of the
                              Decision Sciences Institute
Trickle down economics        Financial Quarterly          10      2/4        10x0.2 = 2
can impact the deficit.
Trickle up economics can      Wall Street Journal          10      1/2        10x0.6 = 6
impact the deficit

TOTAL RESEARCH SCORE = 181

Total research score = Σ over all papers (score based on journal
rank x author contribution).
Table 9
Calculation of Research Evaluation Score
Name of Professor Total Research Score RES
Dr. Jack Knowhow 50 0.143
Dr. Tony Wiseman 119 0.340
Dr. Peter Brainstorm 181 0.517
All professor total = 350 1.000
RES = Total research score / Σ research scores for all professors.
Table 10
List of Service Activities

Function              Organization                                    Code   Score
Member                department/college committee/task force etc.   1      5
Chairperson/Officer   department/college committee/task force etc.   2      20
Member                university wide committee/task force etc.      3      10
Chairperson/Officer   university wide committee/task force etc.      4      40
Member                local community committee/task force etc.      5      15
Chairperson/Officer   local community committee/task force etc.      6      60
Member                regional committee/task force etc.             7      20
Chairperson/Officer   regional committee/task force etc.             8      80
Member                national committee/task force etc.             9      25
Chairperson/Officer   national committee/task force etc.             10     100
Table 11
Calculation of Service Evaluation Score

Name of professor      Activity Code   Score    Total Service Score   SES
Dr. Jack Knowhow       10              100      100                   0.333
Dr. Tony Wiseman       8, 2            80, 20   100                   0.333
Dr. Peter Brainstorm   10              100      100                   0.333
All professor total =                           300                   1.000

SES = Total service score / Σ service scores for all professors.