LOAN PERFORMANCE AND RACE.
MARTIN, ROBERT E. ; HILL, R. CARTER
R. CARTER HILL [*]
Recent studies find evidence of racial discrimination in mortgage
markets. Although these studies explore loan approval rates for whites
versus minorities, they do not specifically consider loan performance,
either in the form of default rates or loan administration costs. This
study considers discrimination in the used car credit market, where the
collateral is not subject to location externalities, collateral value
and quality do not vary as much as in real estate, and the loan terms
are shorter. We find administration costs and default rates are higher
for minorities than for whites, controlling for age, income, home
ownership, wealth, occupation, loan terms, and geographic location. (JEL
K2)
I. INTRODUCTION
Credit discrimination occurs whenever minorities are denied credit
on the basis of race or are offered loans at terms different than the
terms offered similarly situated whites. [1] Economists identify two
types of discrimination: preference (taste)-based discrimination and
information (statistical)-based discrimination. [2] The legal definition
of discrimination makes no such distinction. In their statistical
studies, Munnell, Brown, McEneaney, and Tootell [1996], Carr and
Megbolugbe [1994], and Gabriel and Rosenthal [1991] find evidence of
discrimination, whereas most economic theorists, such as Becker [1993],
and commentators, such as Brimelow [1993] and Brimelow and Spencer
[1993], argue competition will greatly reduce, if not eliminate,
discrimination.
In Section III, we demonstrate why, under regulation, statistical
discrimination may increase with increasing competition. This anomalous result occurs because the legal definition does not distinguish between
the two types of discrimination. Some economists implicitly assume
credit markets are unregulated when considering the effect of
competition on discrimination. Antidiscrimination regulation changes the
effect of competition. In unregulated markets, taste-based
discrimination is unprofitable and increasing competition will reduce
this form of discrimination. If the industry's employees or white
borrowers have a taste for discrimination, competition will not
necessarily eliminate taste-based discrimination. When regulation is
binding and the economic foundation for statistical discrimination
exists, discrimination is profitable and increasing competition may
increase discrimination. The effect of regulation and competition on
discrimination depends critically on the information basis for
discrimination (the relationship between loan performance and race).
Therefore, the appropriate empirical question is: Does a statistical
basis for discrimination exist? This question cannot be answered by loan
approval studies. Rather, it must be addressed by evaluating how loans
perform after they are approved. Further, the appropriate public policy
issue is: Should society regulate statistical discrimination?
This article is an empirical inquiry into the relationship between
race and loan performance. We measure loan performance by estimating the
same marginal default rates that lenders use to score credit
applications and by the variables that have the most impact on loan
administration costs. We find that race matters, regrettably in the
fashion suggested by stereotypes. In general, minorities have higher
default rates and contribute to higher administration costs. In other
words, the foundation for statistical discrimination appears to exist.
Section II contains a review of the extant literature. A theoretical
model of credit "scoring" with both taste and statistical
discrimination is contained in Section III. The data are discussed in
Section IV, and the empirical models are presented in Section V.
II. LITERATURE
As in Arrow [1973] and Becker [1957], an individual is said to have
a preference, or taste, for discrimination if he is willing to pay a
price, or forgo income, in order to practice discrimination. Given a
distribution of preferences across individuals, one could rank order the
strengths of those preferences by their respective reservation prices.
The higher the reservation price, the stronger the preference for
discrimination. This form of discrimination has little economic survival
value, since some agents will have a zero reservation price for
discrimination and others may have a negative reservation price. [3]
Competition reduces the market impact of preference-based
discrimination, since one lender's greed is sufficient to offset
the bigotry of many other lenders. A measurable credit market impact
from preference-based discrimination may arise if (1) all lenders, and
potential lenders, have positive reservation prices for discrimination;
(2) the industry's employees, and potential employees, have
positive reservation prices for discrimination; or (3) white borrowers
have a taste for discrimination and are willing to boycott lenders who
loan to minorities. If white borrowers do not care or do not know to
whom lenders loan, then free entry, potential lenders with zero
reservation prices for discrimination, and potential employees with zero
reservation prices for discrimination should eliminate the market effect
of preference-based discrimination. Minority default rates and loan
administration costs less than white default rates and loan
administration costs constitute evidence of preference-based
discrimination.
According to Phelps [1972] informational, or statistical,
discrimination is based on measurable differences in behavior by group.
For example, actuarial studies indicate young men, as a group, have more
automobile accidents than do young women of the same age, education, and
income. Consequently, insurance companies charge higher insurance
premiums for young males than for young females. Since the cost of
supplying insurance to young men is higher than it is for young women,
this is not price discrimination. If default rates and loan
administration costs differ between two groups, lenders have an economic
motive for applying different lending criteria to different groups. The
theoretical foundations for informational discrimination are contained
in Stiglitz and Weiss [1981], Williamson [1987], and Jaffee and Stiglitz
[1990].
Munnell et al.'s [1996] [4] Boston real estate discrimination
study refocused interest on this important social issue. Gabriel and
Rosenthal [1991, 371] suggest discrimination in lending is serious and
may be getting worse. This work prompted a vigorous response from some
economists. Most recently, Day and Liebowitz [1998] report serious data
irregularities in the Boston real estate study. When they correct the
data, the evidence of discrimination disappears. Similarly, Harrison
[1998] finds shortcomings in the statistical methods employed in the
Boston study. When Harrison used the appropriate statistical methods, he
found no evidence of discrimination in the data.
Gary Becker [1993] argues that the Munnell et al. study is flawed,
since it does not control for loan performance. Munnell et al. counter
that very little is known about loan performance and what is known does
not suggest that minorities have higher default rates or higher loan
administration costs [1996, 27]. They claim "The dearth of any
evidence that minorities default more frequently, given their economic
fundamentals, makes a conclusion of economically rational discrimination
problematic" [1996, 45]. Further, "The hypothesis of
statistical discrimination begs several questions,...; if such variables
exist, why are the lenders not collecting them? And to use race as a
signal, the lenders need evidence that race is correlated with outcomes,
holding the rest of their information set constant; the default studies
have yet to provide much support for this hypothesis" [1996, 47].
Munnell et al. seem unaware of the serious legal liability any lender
would face if it collected race- or gender-based data. Racial information
in their credit-appraisal process would be a "smoking gun,"
much like a tape recording of senior executives making racial remarks.
[5] Consequently, researchers cannot find detailed data sets to estimate
loan performance and race, since no rational lender would collect that
data. It is very important to note that in the current legal and
political environment, lenders are prohibited by law from collecting the
very data Munnell et al. fault them for not offering in their own
defense. Therefore, it is not at all surprising that the issue of loan
performance and race is still an open question. However, as we see
below, the loan performance [6] question goes directly to the heart of
the discrimination issue.
Tootell [1993, 45] and Munnell et al. [1996, 44] claim default
rate studies add very little to our understanding of discrimination.
They note that discrimination occurs "at the margin," so
average default rate studies are not useful. [7] Tootell [1993] and
Munnell et al. [1996] do not consider the default rate analysis lenders
use to make the accept/reject decision. They refer to aggregated average
default rate studies. For example, one might calculate the average
default frequencies within census tracts and then use these averages as
independent variables in regressions that control for the racial
composition of the census tract, along with other income, wealth, and
demographic data. This is not the experience data that firms use to
compute hazard rates in lending. The marginal probabilities are
estimated from actual lending experience. The marginal default rates
provide the foundation for "credit scoring." The default rates
and loan administration cost estimates provided in this article are all
marginal estimates based on actual loan experience for individual
borrowers. [8] It is exactly the information firms use to make decisions
at the margin.
III. THEORY
Credit Rationing
Stiglitz and Weiss [1981], Williamson [1987], and Jaffee and
Stiglitz [1990] provide theoretical models that suggest equilibrium
credit rationing can arise with asymmetric information. Under these
conditions, prices may have adverse sorting effects. This is
particularly important in credit markets, where rising interest rates
may adversely affect the pool of potential borrowers. Prudent borrowers
will be deterred by higher interest rates, while imprudent or dishonest
borrowers may be undeterred. The adverse sorting causes expected profit
per dollar lent to be concave in interest rates. This leads to a
"bank-optimal" interest rate, beyond which the supply of
credit is negatively sloped and where banks ration credit. Martin and
Smyth provide statistical evidence that mortgage supply functions are
backward bending in interest rates and that the "bank-optimal"
mortgage interest rate is approximately 11% [1991, 1072].
If lenders ration credit exclusively by price, loan approvals could
not be a discrimination issue. Therefore, the fact that loan approvals
are a discrimination issue (rather than, say, loan terms) is evidence
that lenders do use nonprice methods to ration credit. Suppose lenders
used "risk-based pricing" for all loans; then lenders would rely
exclusively on interest rates to ration credit. [9] No loans would be
rejected. Instead, the lender would choose an interest rate for the loan
that reflected the loan's risk. Borrowers would then accept or
reject the loan offer. There are other reasons, beyond adverse price
sorting, that help explain why lenders choose to ration credit via the
loan approval process. If lenders used interest rates to ration credit
and there were a statistical basis for discrimination, this practice
would establish a prima facie statistical case that they were
discriminating against minorities. This follows, since the statistical
basis for discrimination means that default rates and loan
administration costs are positively correlated with race. By using
risk-based pricing, lenders would establish a statistical record that
they charged higher interest rates to minorities. Again, the legal
liabilities would be significant, since information-based discrimination
is illegal. This is also the reason why lenders prefer
"parsimonious" credit-scoring models. [10]
Evaluating Loans
Given nonprice credit rationing, the lender must have an objective
criterion for sorting prospective loans. The technique employed by most
lenders is credit scoring. Following Boyes, Hoffman, and Low [1989,
4-5], the marginal loan decision can be modeled as an expected profit
calculation, where expected profit is net of the rate of return on
government securities and loan administration costs. If $r^o$ is the
rate of return on government securities, $D$ is the default probability, $w$
is the loss per dollar lent in the event of default, $c$ is the loan
administration cost per dollar lent, and $r$ is the rate of return on the
loan, then expected profit per dollar lent is
(1) $\pi = (1 - D)(r - r^o) - Dw - c$.
The lender should accept all loans such that $\pi > 0$
and reject all loans such that $\pi < 0$. The threshold default
rate is $D^t$ and is found where $\pi = 0$. Therefore, $\pi > 0$
for all $D < D^t$ and $\pi < 0$ for all $D > D^t$.
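Setting expected profit in equation (1) to zero and solving for the default probability gives the threshold in closed form: the interest spread net of administration cost, divided by the spread plus the loss per dollar in default. The following is a minimal sketch of that calculation. The function names and the numerical values are illustrative assumptions, not figures from the lender's data.

```python
def expected_profit(D, r, r0, w, c):
    """Expected profit per dollar lent, as in equation (1)."""
    return (1 - D) * (r - r0) - D * w - c


def threshold_default_rate(r, r0, w, c):
    """Default probability at which expected profit is exactly zero."""
    return (r - r0 - c) / (r - r0 + w)


# Hypothetical values: 18% loan rate, 6% government rate,
# 40 cents lost per dollar in default, 2 cents administration cost.
r, r0, w, c = 0.18, 0.06, 0.40, 0.02
D_t = threshold_default_rate(r, r0, w, c)

# A higher administration cost lowers the threshold default rate.
D_t_costly = threshold_default_rate(r, r0, w, c + 0.02)
```

Loans with a default probability below D_t earn positive expected profit under these assumed values, and the second call illustrates why a group with higher administration costs faces a stricter threshold.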
Suppose there are two groups of borrowers, say group a and group b.
Let the group a default rate be $D^a \equiv D^a(s)$,
where $s$ is the borrower's "credit score," $s \equiv s(x)$,
and $x$ is a vector of borrower, loan, and collateral
characteristics. Similarly, let $D^b \equiv D^b(s)$
be the group b default rate. Both default rates are assumed to be
decreasing in the credit score, $s$. Note, $D^b$ may be greater than,
less than, or equal to $D^a$. The default rates represent the
marginal default rates for members in groups a and b. Given the same
age, income, occupation, education, and so forth, and if $D^a > D^b$,
then individuals in group a have a higher
default probability than individuals in group b. Further, the expected
profit function in equation (1) assumes the loan administration cost is
the same for both groups. If one group misses more payments, writes more
bad checks, or requires more extensions than the other, the threshold
default rate such that $\pi = 0$ will be different for that group. The
lender assigns the group with higher administration costs a lower
threshold for D and a higher threshold for s.
If $D^b$ does not equal $D^a$ (as in Figure 1, panel a),
then the threshold credit scoring criterion will be different. In an
unregulated market, the threshold scores for the two groups would be
$s^{at} < s^{bt}$. While the threshold scores are
different, the default rates of the marginal borrowers are the same,
$D^t$, when loan administration costs are the same. Pure taste
discrimination against, say, members of group b occurs when the lender
chooses a threshold score for group b that is higher than $s^{bt}$.
For example, a threshold score for group b such as $s^*$ in Figure 1
panel a would represent pure taste discrimination. Note, however, that
the lender must be willing to forgo making some loans to members of
group b where the expected real profit is positive. The expected profit
from all loans to members of group b with scores between $s^{bt}$ and
$s^*$ is positive. How much higher $s^*$ is than $s^{bt}$
depends on the lender's reservation price for discrimination. Any
other lender with a lower reservation price will lend to all, or some,
of the group b members whose scores are in the $[s^{bt}, s^*]$
interval. Therefore, other things equal, pure taste-based discrimination
will not persist in an unregulated competitive market. Further,
statistical discrimination will persist only if the default rates
between the two groups are in fact different.
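The group-specific threshold scores can be illustrated numerically. The sketch below assumes a logistic default curve that is decreasing in the credit score, with group b shifted so that it defaults more often at every score. Both the functional form and the numbers are hypothetical and are chosen only to show that a common threshold default rate implies a higher threshold score for the higher-default group.

```python
import math


def default_rate(s, shift):
    """Hypothetical default curve, decreasing in the credit score s."""
    return 1.0 / (1.0 + math.exp(s - shift))


def threshold_score(D_t, shift, lo=-20.0, hi=20.0):
    """Bisection for the score at which the default rate equals D_t."""
    for _ in range(200):
        mid = (lo + hi) / 2.0
        if default_rate(mid, shift) > D_t:
            lo = mid  # default rate still too high: move to higher scores
        else:
            hi = mid
    return (lo + hi) / 2.0


D_t = 0.20                              # assumed threshold default rate
s_at = threshold_score(D_t, shift=0.0)  # group a
s_bt = threshold_score(D_t, shift=1.0)  # group b defaults more at every score
```

With these assumed curves, the marginal borrower in each group has the same default rate, D_t, even though the two threshold scores differ.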
Applying the legal definition of discrimination to this market
yields some surprising results. Under the current interpretation of the
law, the lender must treat members of groups a and b the same with
respect to access to credit and to credit terms. Further, any lender
that collects group-based data is subject to costly litigation. To
protect themselves from this litigation, lenders do not collect
group-based data. From the lender's regulated perspective, there is
only one group whose default rates are an average of the two, such as
the dotted line in Figure 1 panel b. Given the same threshold default
rate, $D^t$, for $\pi = 0$, regulation requires a uniform threshold
score for both groups, say, $s^t$. If the two groups' default
rates are different and the lender does not discriminate, the default
rate for the marginal borrower in group b will be higher than the
default rate for group a. Further, the expected profit from loans made
to group b members in the interval from $s^t$ to $s^b$ is
negative, whereas the expected profit from loans not made to group a
members in the interval $s^a$ to $s^t$ is positive. Most
important, the opportunity cost of taste-based discrimination is now
negative--it pays to discriminate based on one's preferences.
Lenders whose reservation price for discrimination is zero find it
profitable to practice discrimination. Lenders who have a positive
reservation price for discrimination have a double incentive to
discriminate. Their expected profit increases if they discriminate, and
discrimination increases utility. Ironically, antidiscrimination
regulation in credit markets subsidizes bigots.
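The regulated case can be sketched in the same way. The example below assumes hypothetical logistic default curves for the two groups and an equal-weight pooled curve, since the lender observes only the average. It solves for the single uniform threshold score and then evaluates each group's default rate at that score. The curves, the equal weighting, and the threshold value are all assumptions made for illustration.

```python
import math


def default_rate(s, shift):
    """Hypothetical default curve, decreasing in the credit score s."""
    return 1.0 / (1.0 + math.exp(s - shift))


def pooled_rate(s):
    """Equal-weight average of the two group curves (assumed population mix)."""
    return 0.5 * (default_rate(s, 0.0) + default_rate(s, 1.0))


def solve_score(curve, D_t, lo=-20.0, hi=20.0):
    """Bisection for the score at which a decreasing curve equals D_t."""
    for _ in range(200):
        mid = (lo + hi) / 2.0
        if curve(mid) > D_t:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2.0


D_t = 0.20                            # assumed threshold default rate
s_t = solve_score(pooled_rate, D_t)   # uniform threshold under regulation
marginal_a = default_rate(s_t, 0.0)   # group a default rate at s_t
marginal_b = default_rate(s_t, 1.0)   # group b default rate at s_t
```

Under these assumptions the marginal group b borrower defaults more often than the threshold rate and the marginal group a borrower less often, which is the source of the profit incentive described above.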
The two polar cases under regulation are (1) lenders practice
neither statistical nor taste discrimination and (2) lenders practice
both statistical and taste discrimination. In case 1, the outcome is
$D^a < D^b$. The data set obtained from the
lender's loan experience will reveal default rates are higher for
group b. The outcome is equally unambiguous in the opposite case, where
$D^a > D^b$. The data set obtained from the fully
discriminating lender's loan experience will reveal default rates
are higher for group a.
There are an infinite number of possible outcomes between the two
extremes. However, regulation creates a perverse incentive to
discriminate. We start with complete compliance with the law, which is
case 1. As the lender deviates from strict compliance and raises the
threshold score for group b and/or lowers the threshold score for group
a, expected profit increases until the threshold for group b rises to
$s^{bt}$ and the threshold for group a falls to $s^{at}$. Any
increase beyond $s^{bt}$ or any reduction below $s^{at}$ decreases
expected profit. Ironically, the most likely effect of increasing
competition under regulation is an increase in discrimination.
Deregulation of credit markets in the early 1980s may have contributed
to an increase in "redlining." [11]
IV. DATA
Although very important, the real estate market is particularly ill
suited for credit market discrimination studies. Real estate is a poor
candidate, because of the nature of the collateral and the length of the
loans. Real estate asset values are significantly influenced by the
positive and negative externalities of the real estate's location.
The marginal value of real estate improvements depends critically on
these location externalities. Hence, "redlining" can represent
either discrimination against an individual borrower or
"discrimination" against a particular location. We should not
be surprised to find a borrower rejected for a home loan at one location
and subsequently approved by the same lender at another location. For
example, a white couple wishing to purchase and renovate a home in a
low-income minority neighborhood may experience considerable difficulty
finding a lender willing to underwrite the project. The lender rejects
the location, not the borrower, in this instance.
The used car credit market does not suffer from the inherent
collateral problem one observes in real estate discrimination studies.
Location externalities have a minimal influence on collateral asset
values. [12] The mobility of the collateral suggests values will more
closely resemble competitive prices, where the price and quality
relationship is more uniform than is the case for real estate. [13] With
less variation in asset values, the lender's decision to either
accept or reject the loan in the automobile market reflects the borrower
risk rather than collateral risk. In contrast to real estate, we would
be quite surprised to find a lender who refuses to underwrite a loan on
a specific car, regardless of who the borrower might be.
Our data come from a single lender who makes loans in the "C
and D" used car market. The loans are said to be "indirect
paper," since the original loan application is generated by the
used car dealer. [14] Most of the loans come from independent used car
lots and the borrowers are, typically, high risk. Therefore, the loans
are "C and D" paper rather than the "A and B" paper
generated by franchised new car dealers. The lender data set contains
more than 45,000 observations covering four years and nine months. There
are nonracial borrower characteristics and multiple indicator variables
for default and loan performance in the data. The loan performance
variables allow us to study loan administration costs as well as default
performance. Some loans may have high administrative costs yet never go
into default.
The lender data set consists of borrower characteristics, loan
characteristics, collateral characteristics, and the complete payment
history of each loan. The original data set contains more than 120,000
loans. It was "aged" to include only loans that have an
opportunity to run their complete natural course during the interval
from January 1990 to September 1994. In fact, many of the sample loans
did not run their entire natural course; they went into default or were
prepaid. Prepaid or defaulted loans whose natural termination date occurs on or before September 1994 are included in the data set. If the
natural termination date follows September 1994, the prepaid or
defaulted loan is assumed to be from a later population of loans. After
aging, there are 45,351 usable observations. This sample contains all
loans from the interval that went successfully to term, were prepaid,
were charged-off, or in which the car was repossessed.
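The aging rule can be expressed as a simple filter on each loan's scheduled termination date. The sketch below is one possible reading of the rule: a loan is kept if it originated on or after January 1990 and its natural termination date, origination plus the contractual term, falls on or before September 1994. The month-level resolution, the function names, and the inputs are assumptions, and the lender's actual procedure may differ.

```python
from datetime import date


def add_months(d, months):
    """First day of the month that is `months` after d (day of month ignored)."""
    total = d.year * 12 + (d.month - 1) + months
    return date(total // 12, total % 12 + 1, 1)


WINDOW_START = date(1990, 1, 1)
WINDOW_END = date(1994, 9, 1)  # September 1994, at month resolution


def is_aged(origination, term_months):
    """Keep a loan whose scheduled (natural) termination falls in the window.

    Whether the loan actually ran to term, was prepaid, or defaulted
    does not enter the test; only the scheduled end date matters.
    """
    natural_end = add_months(origination, term_months)
    return origination >= WINDOW_START and natural_end <= WINDOW_END
```

For example, a 24-month loan originated in September 1992 would be kept, while the same loan originated one month later would be treated as belonging to a later population.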
Race
Lenders cannot collect racial variables. Hence, the original data
set contains no racial variables. The lender avoids any contamination of
the data by racially based variables. This is a serious problem for all
loan performance studies and will remain so until lenders are allowed to
defend themselves against discrimination charges. As a second-best
solution, we merged the original data set with another data set
containing racial variables. [15] The merging was accomplished by
cross-matching zip codes between the lender data set and the zip code denominated data from the U.S. Bureau of the Census. [16]
Among other variables, the Census Bureau data report racial
variables as the number of white, black, Indian, Asian, and other
residents in each zip code. The number of Hispanic residents in each zip
code is an ethnic classification. The Hispanic residents are further
classified by race, as white Hispanic, black Hispanic, and so forth. We
wish to control separately for race and ethnicity. So, the racial
variable is min, which is the proportion of the zip code population that
is nonwhite. The ethnicity control variable is wh, the proportion of the
zip code population that is white Hispanic. For each of the matched
loans, the race and ethnic variables represent the probability the
borrower is either a racial minority or a Hispanic person of white
descent.
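The construction of the two neighborhood variables and the zip code merge can be sketched as follows. The record layout, the field names other than min and wh, and the sample counts are hypothetical. Dropping loans whose zip code has no census match is also an assumption, since the text does not say how unmatched loans were handled.

```python
def race_shares(counts):
    """Return (min, wh) for one zip code from resident counts."""
    total = counts["total"]
    minority = (total - counts["white"]) / total   # nonwhite share
    white_hispanic = counts["white_hispanic"] / total
    return minority, white_hispanic


# Hypothetical census records keyed by zip code.
census = {
    "35004": {"total": 1000, "white": 700, "white_hispanic": 50},
    "35005": {"total": 2000, "white": 400, "white_hispanic": 100},
}
# Hypothetical loan records; the third has no census match.
loans = [
    {"loan_id": 1, "zip": "35004"},
    {"loan_id": 2, "zip": "35005"},
    {"loan_id": 3, "zip": "99999"},
]

merged = []
for loan in loans:
    if loan["zip"] in census:  # assumed: unmatched loans are dropped
        m, wh = race_shares(census[loan["zip"]])
        merged.append({**loan, "min": m, "wh": wh})
```

Each merged loan carries neighborhood proportions rather than the borrower's own race, which is why the variables are interpreted as probabilities.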
Other Borrower Characteristics
The lender data set contains the age of the borrower (bage), the
borrower's combined monthly income (bcmi), a zero/one variable for
home ownership (home), a zero/one variable for a cosigner (cosign), and
a borrower occupation code. The occupation codes are clerical (cler),
skilled labor (sl), unskilled labor (ul), and professional (prof). All
of these variables come from the lender's data set and represent
characteristics of the specific borrower.
The Census Bureau data contain other useful demographic variables.
These additional demographic data offer the opportunity to control for
unobserved differences in borrower wealth. For example, the 1989 median
housing price (medhp) controls for real estate values. Similarly, the
1989 interest, dividend, and rental income per reporting household
(idrinch) and the proportion of households reporting interest, dividend,
and rental income (idrincp) are proxies for wealth. Finally, median
household income (mhi) is included as a control for neighborhood income
characteristics.
Collateral Characteristics
The lender's data set contains little direct information about
the collateral. [17] There are two variables: the model year and a
variable indicating whether the car is new or used. From the model year,
we compute the age of the car at the time of loan origination measured
in years (cage). We also define a zero/one variable for new cars (new).
Loan Terms
The lender's data set contains a variety of loan terms. The
amount borrowed (amt), the interest rate on the loan (apr), the amount
of the monthly payment (pay), the length of the loan measured in months
(term), and the percent down payment (dp) are the primary loan terms. An
estimate of the borrower's commitment to monthly fixed payments
(fixpay) is computed from borrower characteristics, census data, and
loan terms. If the borrower is a homeowner (home = 1), fixpay is equal
to the monthly loan payment (pay) divided by the borrower's combined
monthly income (bcmi) plus the median monthly homeowner cost (mmoc)
divided by the median household income (mhi). If the
borrower is not a homeowner, the variable fixpay is equal to the monthly
loan payment (pay) divided by the borrower's combined monthly
income plus the median monthly gross rent (mgr) divided by the median
household income. The variables mmoc, mhi, and mgr are data from the zip
code data set.
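The definition of fixpay can be written as a short function. This is a sketch of the formula as described in the text. It assumes that mhi is expressed on the same monthly basis as mmoc and mgr, which the text does not state, and the sample values are hypothetical.

```python
def fixpay(pay, bcmi, home, mmoc, mgr, mhi):
    """Borrower's commitment to monthly fixed payments.

    Loan payment share of borrower income, plus the neighborhood
    housing cost share of neighborhood income. Homeowners (home == 1)
    use median homeowner cost (mmoc); others use median gross rent (mgr).
    Assumes mhi is on the same (monthly) basis as mmoc and mgr.
    """
    housing_cost = mmoc if home == 1 else mgr
    return pay / bcmi + housing_cost / mhi


# Hypothetical borrower: $250 payment on $2,000 monthly income,
# in a zip code with $600 owner cost, $400 rent, $2,400 median income.
owner_ratio = fixpay(250, 2000, 1, 600, 400, 2400)
renter_ratio = fixpay(250, 2000, 0, 600, 400, 2400)
```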
Loan Performance
If the car is repossessed or if the loan is charged off, it is said
to be a "bad" loan (bloan). The empirical study of default
rates uses bloan as the criterion variable. Loan administration costs
vary considerably from one loan to the next. Borrowers who always pay on
time are the lowest-cost borrowers. Borrowers who skip payments, make
partial payments, or make payments with bad checks increase collections
cost. The number of months during the payment history in which no
payments were made (nopmt) is one indicator of loan performance. The
number of extensions given (numext), the number of partial payments
(partpmt), and the number of payment reversals (revers) are other
indicators of higher administration cost. Each of the loan
administration cost variables (nopmt, numext, partpmt, and revers) is a
discrete count variable. A complete variable list is contained in the
appendix.
V. LOAN ADMINISTRATION COSTS
The maintained hypothesis in this section is that loan
administration costs are an increasing function of specific borrower
behavior. Consequently, the lender's expected profit from a given
loan decreases when the borrower fails to make a monthly payment
(nopmt), makes partial payments (partpmt), writes bad checks (revers),
or requires an extension of the loan (numext). All of the foregoing
activities at least disrupt the timing of the lender's cash flow.
The delay in the receipt of loan proceeds imposes an opportunity cost on
the lender in the form of earnings forgone from reinvestment. There are
direct costs in addition to the opportunity cost. Each borrower action
requires a response from the lender. The lender response varies from a
telephone call to written notice, collateral repossession, or legal
fees.
The empirical question is what role, if any, race plays in
missed payments, partial payments, bad checks, and loan extensions.
Clearly, we must control for factors other than race that may contribute
to this behavior. Nonracial borrower characteristics such as bage, bcmi,
home, cosign, and the occupation variables sl, ul, prof, and cler may be
important explanatory variables. The collateral characteristics, cage
and new, are also included in the estimation. Further, there may be
adverse incentive effects created by loan terms. Hence, we include apr,
amt, pay, trm, dp, and fixpay as explanatory variables.
Seventy-one percent of the loans are concentrated in three states,
and the remaining 29% are distributed over the rest of the continental
United States. State laws may create different incentives, and
variations in local competitive conditions may affect the quality of the
loan portfolio, so we include zero/one dummy variables for the three
states (st1, st2 and st3). "Compliance management" is an
important part of risk management. State regulations governing rules for
loan amortization, late fees, interest rate calculations, and collateral
recovery vary by state. [18] Relatively small deviations from state
regulations can leave the lender vulnerable to expensive class-action
lawsuits. Compliance management consists of testing the computer-based
loan calculations to determine whether they are in compliance with the
regulations from the state of origin. This is a particularly important
function when dealing with loans originated by third parties. In what
follows, the omitted state dummy variable is the state that generates
the most loans for the lender. To the extent that the lender faces less
competition in the states where it does the most business, the state
dummy variables will control for these effects.
The foregoing independent variables, along with the minority (min)
and ethnic (wh) variables, were used as regressors explaining the four
loan administration cost variables. The models are estimated by Poisson
regression to account for the fact that the cost variables are
"counts." The results are contained in Table I. Additional
wealth variables from the Census Bureau's demographic data were
included in the estimation. The coefficient estimate for the racial
minority proportion is positive and significant at better than the .01
level in each equation. Higher concentrations of minorities are
positively related to higher nopmt, partpmt, revers, and numext. The
cost of administering loans increases for borrowers from neighborhoods
with higher minority concentrations. The results for the white Hispanic
variable, wh, are mixed. The coefficient is positive and significant at
better than the .01 level for nopmt and partpmt. The coefficient is
negative and significant at better than the .01 level for numext. The
coefficient is insignificant in the revers equation.
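To make the estimation method concrete, the sketch below fits a Poisson regression with a single regressor by Newton's method, using only the standard library. It is not the specification reported in Table I, which includes the full set of controls. The data are hypothetical, and the variable names are borrowed from the text purely for illustration.

```python
import math


def fit_poisson(x, y, iterations=25):
    """Newton's method for the model log E[y] = b0 + b1 * x."""
    b0, b1 = math.log(sum(y) / len(y)), 0.0
    for _ in range(iterations):
        mu = [math.exp(b0 + b1 * xi) for xi in x]
        # Score: gradient of the Poisson log-likelihood.
        g0 = sum(yi - mi for yi, mi in zip(y, mu))
        g1 = sum(xi * (yi - mi) for xi, yi, mi in zip(x, y, mu))
        # Information matrix (negative Hessian), 2 x 2.
        h00 = sum(mu)
        h01 = sum(xi * mi for xi, mi in zip(x, mu))
        h11 = sum(xi * xi * mi for xi, mi in zip(x, mu))
        det = h00 * h11 - h01 * h01
        # Newton update: add inverse information times score.
        b0 += (h11 * g0 - h01 * g1) / det
        b1 += (h00 * g1 - h01 * g0) / det
    return b0, b1


def score(b0, b1, x, y):
    """Score vector; both elements are zero at the maximum likelihood fit."""
    mu = [math.exp(b0 + b1 * xi) for xi in x]
    return (sum(yi - mi for yi, mi in zip(y, mu)),
            sum(xi * (yi - mi) for xi, yi, mi in zip(x, y, mu)))


# Hypothetical data: zip code minority share and a count such as nopmt.
min_share = [0.0, 0.2, 0.4, 0.6, 0.8, 1.0]
nopmt = [1, 1, 2, 2, 3, 4]
b0, b1 = fit_poisson(min_share, nopmt)
```

A positive b1 in this toy fit means the expected count rises with the minority share, which is the sign pattern the text reports for the min coefficient.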
A measure of the potential size of the minority effects can be
obtained by sorting the original data set by racial concentrations in
neighborhoods, estimating the administrative cost models separately on
the two data sets, and comparing the results. The "white" data
set consists of all zip codes in which the white proportion is greater
than or equal to 98% of the total population. The "minority"
data set consists of all observations for which the zip code minority
population is greater than or equal to 85% of the total population. [19]
Sorting results in 3,142 "white" observations and 957
"minority" observations.
Table II contains the average values for the total data set, the
white data set, and the minority data set by loan performance variables
and the independent variables. In the white sample, 99.2% of the
population is white. In the minority sample, 91.5% of the population is
minority. The overall average for the total sample is 29.9% minority.
The averages for bloan, nopmt, partpmt, revers, and numext are higher in
the minority sample than in the white sample. Further, simple tests for
differences in means reveal we reject the null hypothesis that the means
are the same in each instance. Additional means tests reveal that
minority borrowers receive lower interest rates, borrow larger amounts,
and have higher monthly automobile payments. They also borrow for longer
periods of time, make lower down payments, maintain a higher
fixed-payments-to-income ratio, have the same combined monthly incomes
as whites, are older than white borrowers, have the same homeowner
representation as whites, have fewer cosigners, have lower valued homes
than white borrowers, receive less income from interest, dividends, and
rents, and tend to buy newer cars.
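The sorting rule and the comparison of sample means described above can be sketched in a few lines of Python. This is a minimal sketch under stated assumptions: the field names white_share and minority_share, the helper names, and the loan records are all made up, and Welch's unequal-variance t-statistic is used only as one plausible reading of "simple tests for differences in means"; the text does not say which test was applied.

```python
import math
from statistics import mean, variance

def split_by_concentration(loans, white_cut=0.98, minority_cut=0.85):
    """Return (white_sample, minority_sample) using the thresholds in the text."""
    white = [ln for ln in loans if ln["white_share"] >= white_cut]
    minority = [ln for ln in loans if ln["minority_share"] >= minority_cut]
    return white, minority

def welch_t(sample_a, sample_b):
    """Welch t-statistic for H0: equal means, allowing unequal variances."""
    se = math.sqrt(variance(sample_a) / len(sample_a)
                   + variance(sample_b) / len(sample_b))
    return (mean(sample_a) - mean(sample_b)) / se

# Made-up loan records, for illustration only (shares are zip code proportions).
loans = [
    {"white_share": 0.99, "minority_share": 0.01, "nopmt": 2},
    {"white_share": 0.98, "minority_share": 0.02, "nopmt": 3},
    {"white_share": 1.00, "minority_share": 0.00, "nopmt": 1},
    {"white_share": 0.60, "minority_share": 0.38, "nopmt": 4},  # in neither sample
    {"white_share": 0.10, "minority_share": 0.90, "nopmt": 5},
    {"white_share": 0.12, "minority_share": 0.86, "nopmt": 3},
    {"white_share": 0.05, "minority_share": 0.95, "nopmt": 6},
]

white, minority = split_by_concentration(loans)
t_stat = welch_t([ln["nopmt"] for ln in minority],
                 [ln["nopmt"] for ln in white])
print(len(white), len(minority), round(t_stat, 2))
```

Loans from zip codes that meet neither threshold drop out of both subsamples, which is why the two sorted samples are far smaller than the full data set.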
Controlling for the obvious differences in independent variables, a
measure of the difference race makes in loan administration cost
variables is obtained by estimating the models separately on first the
white data set and then the minority data set. The "white"
parameters are used in the minority data set and the
"minority" parameters are used in the white data set to
compute the predicted values for nopmt, partpmt, revers, and numext. The
proportional average difference between the actual values and the
predicted values is computed. [20] This analysis reveals that, after
controlling for other variables, nopmt, partpmt, revers, and numext are
proportionately higher for minorities in both comparisons: the minority
parameters predict more than the white sample actually records, and the
white parameters predict less than the minority sample actually
records. When we apply the "minority" coefficients to the
"white" data set, the results reveal the predicted values are
higher than the actual values by the following percentages: nopmt,
107.4%; partpmt, 110.0%; revers, 90.0%; and numext, 40.0%. The minority
coefficients lead to a higher predicted value in each case. In general,
the minority coefficients lead to predicted values that are roughly
twice as high for the number of months with no payments, the number of
partial payments, and the number of payment reversals.
The results when we use the "white" coefficients with the
"minority" data are equally revealing. The predicted values
are lower than the actual sample values as follows: nopmt, -35.7%;
partpmt, -33.3%; revers, -56.0%; and numext, -25.0%. The predicted
values are between 25% and 56% lower when we use the white coefficients
with the minority data.
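The cross-sample comparison can be written out as a short Python sketch of the calculation described in footnote 20: predict each observation with the other sample's coefficients, average the differences between predicted and actual values, and divide by the average actual value. The coefficient values, the variable subset, the observations, and the function names below are hypothetical, and the exponential form of the prediction assumes a log-link Poisson model.

```python
import math

def predict_count(coefs, row):
    """Expected count under a log-link Poisson model: exp(b0 + sum_k b_k * x_k)."""
    index = coefs["intercept"] + sum(b * row[name]
                                     for name, b in coefs.items()
                                     if name != "intercept")
    return math.exp(index)

def proportional_avg_diff(predicted, actual):
    """Mean of (predicted - actual), divided by the mean actual value."""
    n = len(actual)
    avg_diff = sum(p - a for p, a in zip(predicted, actual)) / n
    return avg_diff / (sum(actual) / n)

# Hypothetical coefficients, standing in for estimates from one sample...
other_sample_coefs = {"intercept": 0.5, "term": 0.03}
# ...applied to made-up observations from the opposite sample.
rows = [
    {"term": 24, "nopmt": 2},
    {"term": 36, "nopmt": 3},
    {"term": 30, "nopmt": 4},
]

predicted = [predict_count(other_sample_coefs, r) for r in rows]
actual = [r["nopmt"] for r in rows]
print(round(proportional_avg_diff(predicted, actual), 3))
```

A positive result means the borrowed coefficients overpredict the sample's actual counts, as reported when the "minority" coefficients are applied to the "white" data; a negative result means they underpredict.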
VI. DEFAULT RATES
Lender discrimination requires rejection of loans to qualified
minorities. Hence, direct evidence of discrimination comes from the
lender's loan portfolio as minority default rates that are lower
than white default rates. Statistical discrimination with no taste
discrimination suggests that default rates would be the same for whites
and minorities. Higher minority default rates are consistent with a
lender who neither has a taste for discrimination nor discriminates
statistically. Higher minority default rates are inconsistent with a
lender who practices both taste and statistical discrimination.
The lender's risk of losing all or part of the loan principal
is high in the "C and D" used car credit market. The cars are
older and the borrowers have lower incomes than in the "A and
B" used car credit market. In addition, almost all of the borrowers
have troubled credit histories. Lenders use statistical "credit
scoring" techniques to evaluate the default risk, an example of
which can be found in Campbell and Dietrich [1983]. Boyes, Hoffman and
Low [1989] demonstrated that probit analysis is a more efficient
estimator of default hazard rates than other techniques, such as expert
systems [21] and discriminant analysis.
The probit model is estimated using traditional credit scoring
variables and the same racial variables used in the loan administration
cost model. The traditional credit scoring variables are apr, amt, pay,
term, dp, fixpay, deal, bage, bcmi, home, cosign, ul, sl, prof, cage,
new, st1, st2, and st3. The results are contained in Table III. As
expected, the default probability decreases with the borrower's
age, the borrower's income, if the borrower is a homeowner, if
there is a cosigner on the note, and with a higher down payment. The
default probability increases as the interest rate increases, the size
of the monthly payment increases, the length of the loan increases, and
the fixed payment proportion of the borrower's income increases.
The effect of the minority proportion (min) on the default
probability is positive and significant at better than the .01 level, as
in the administration cost models. The coefficient for white Hispanics
(wh) is not significant. Holding age, income, home ownership, and
occupation constant, minorities appear to have higher default rates than
whites. The lender's data set suggests she is not discriminating
against minorities. Discrimination would be evidenced by lower default
rates for minorities than for whites.
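To give a sense of scale for a probit coefficient, the sketch below shifts a probit index by the min coefficient of 0.13 from Table III and converts the result back to a probability. It is an illustrative calculation rather than the authors' own. It assumes that the baseline is the 14.7% bad-loan share of the total sample in Table II, and that min is scaled so that a one-unit change is a move from an all-white to an all-minority zip code. The function name is hypothetical.

```python
from statistics import NormalDist

STD_NORMAL = NormalDist()  # standard normal distribution, the probit link

def shifted_default_prob(baseline_prob, coef, delta_x):
    """Default probability after shifting the probit index by coef * delta_x."""
    z0 = STD_NORMAL.inv_cdf(baseline_prob)      # index implied by the baseline
    return STD_NORMAL.cdf(z0 + coef * delta_x)  # probability at the shifted index

# Assumed baseline: 14.7% bad loans in the total sample (Table II).
# Assumed scaling: delta_x = 1.0 is a move from 0% to 100% minority.
p_new = shifted_default_prob(0.147, 0.13, 1.0)
print(round(p_new, 3))
```

Because the normal distribution function is nonlinear, the same coefficient implies a different change in probability at a different baseline, so the figure printed here applies only to the assumed starting point.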
Direct measures of borrower wealth are not reported, except for the
borrower's combined monthly income (bcmi). However, it is possible
to construct proxies for borrower wealth from the Census data set. One
potential measure of wealth is the 1989 median housing price (medhp) in
the zip code. Other measures are the amount, per reporting household, of
other 1989 income from interest, dividends, and rentals (idrinch), the
proportion of households reporting such income (idrincp), and median
household income (mhi). The results for the probit default rate model
with the wealth proxy variables are contained in Table IV. The signs and
significance levels for min and wh are qualitatively the same as they
were in Table III.
A measure of the size of the impact on default rates that
differences in race may make can be obtained by an analysis similar to
the impact analysis used on the measures of loan administration cost:
the probit model is estimated separately on the white and minority data
sets, and then the estimated coefficients are used to obtain the
predicted default rate using the opposite sample. When the
"minority" coefficients are used with the "white"
data set, the predicted default rate is 10.1% higher than the actual
default rate for the white sample. When the "white"
coefficients are used with the "minority" data set, the
predicted default rate is 11.1% lower than the actual default rate for
the minority sample.
VII. CONCLUSIONS
The legal definition of discrimination does not distinguish between
preference-based discrimination and statistical discrimination. Although
the motives are different in these two types of discrimination, they are
treated equally under the law. Without discrimination regulation, there
are disincentives associated with preference-based discrimination and
there are economic incentives associated with statistical
discrimination. Discrimination regulation is binding in competitive
markets only if the information foundation for statistical
discrimination exists. The impact of competition is different in
unregulated markets and in regulated markets. Competition reduces
preference-based discrimination in unregulated markets, whereas
competition may increase statistical discrimination in regulated
markets.
Recent empirical studies suggest redlining in mortgage markets is
increasing. This may be the natural consequence of the financial
deregulation of lending institutions that began in the late 1970s. If
redlining increases as competition increases, it is indirect
confirmation of our results. An information basis for statistical
discrimination must exist if competition increases redlining; otherwise,
the entry of new lenders in the 1980s and early 1990s would have reduced
preference-based discrimination.
We consider default rates and loan administration costs in our
empirical models. Our results suggest minority loan default rates are
higher than comparable white default rates and that minority loan
administration costs are higher than comparable white administration
costs. The measured differences are substantial. Regulation of
statistical discrimination creates serious adverse incentive effects. It
adversely selects for borrower behavior that increases default rates and
raises loan administration costs. This adverse selection can result in
"lemon effects" in credit markets and may result in a general
deterioration in credit quality in the long run. Worst of all,
regulation of statistical discrimination directly subsidizes bigoted lenders. These lenders can raise their profits by discriminating, in
addition to realizing a utility gain from discrimination.
The empirical researchers who find evidence of discrimination claim
there is little evidence that loan performance is different across
racial groups. They are also puzzled by why lenders do not provide
evidence of differences in group behavior. We note the rational lender
will not collect racial data, since the legal liability would be
substantial. Hence, data sets that contain both loan performance
characteristics and race variables do not exist. Our results need to be
confirmed or refuted in other data sets. This will require access by
qualified researchers to other lenders' loan data sets.
Lenders will be reluctant to make these data available for study,
since the simple pursuit of profit can establish a statistical record
that suggests they are guilty of discrimination. If the information
basis for discrimination exists, profits rise as the lender pursues
statistical discrimination. Some "hold-harmless" agreement
must be reached between lenders and regulators that allows the data to
be studied.
Martin: Professor, Department of Economics, Centre College,
Danville, Kentucky 40422, Phone 606-238-5260, Fax 606-236-7925, E-mail
bmart@centre.edu
Hill: Department of Economics, Louisiana State University, Baton
Rouge, Louisiana 70803, Phone 225-388-1490, Fax 225-388-3807, E-mail
eohill@lsu.edu
(*.) This study is not part of any existing or pending litigation,
and we received no compensation of any sort from the lender who provided
the data. We are indebted to an anonymous referee, the members of the
University of Kentucky's microeconomics workshop, and David Baumer
for their comments and suggestions. A special debt is owed the referee
for several important contributions. Any remaining errors are our own.
(1.) Tootell [1993, 46] contains a discussion of the legal
definition of discrimination.
(2.) Arrow [1973] and Becker [1957] consider preference- or
taste-based discrimination, whereas Jaffee and Stiglitz [1990], Phelps
[1972], Stiglitz and Weiss [1981], and Williamson [1987] explore
information- or statistics-based discrimination.
(3.) A negative reservation price suggests the lender would be
willing to subsidize minority borrowers.
(4.) The study was originally released in 1992 by the Boston
Federal Reserve Bank.
(5.) "Redlining" is the least detectable solution to the
problem of economically motivated statistical discrimination under
regulation. The lender need not contaminate her records with racial
data. She can cease lending in minority neighborhoods and observe the
impact on portfolio profitability. If minorities default more frequently
and cause higher administrative costs, profitability will rise. If the
hypothesis is incorrect, profits will fall. In any event, the lender
does not need to collect racial data and discrimination is harder to
prove.
(6.) Despite Munnell et al.'s arguments to the contrary
[1996, 44-45], loan performance is the central issue.
(7.) These authors also claim default rate studies suffer from
omitted variable problems that prevent them from being useful. Omitted
variable problems exist in every empirical economic problem, including
"redlining" studies. Perfect variables and perfectly specified
models do not exist. The degree to which omitted variables become a
problem is always a judgment issue.
(8.) Each observation represents an individual loan, not a
Census-tract average.
(9.) The Federal Reserve System recently encouraged lenders to use
more "risk-based pricing," rather than rejecting loans.
(10.) A "parsimonious" credit-scoring model is one with a
limited number of explanatory variables. As the number of explanatory
variables increases, the lender runs the risk of unintentionally
establishing a record of statistical discrimination, if the added
variables are correlated with race. More variables add explanatory power
to the model, and the costs of estimation and data storage are trivial
compared with the benefits of lowering default rates. Again, this is a
regulation effect.
(11.) Jayaratne and Strahan [1998] find that deregulation of
banking in the 1970s and 1980s led to significant improvements in
efficiency and that most of the cost reductions were passed on to
consumers in the form of lower interest rates. Costs fell, and
"nonperforming loans" declined significantly. Easier entry
caused increased competition, leading to lower interest rates. The
pressure on banks' margins caused them to reduce costs and improve
loan portfolio performance. Under pressure from increased competition,
banks would review those parts of the business that generate
unprofitable loans. If there is a statistical basis for discrimination,
groups of unprofitable loans would appear as geographic clusters based
on minority concentrations. Therefore, more "redlining" would
be a natural consequence of the increase in competition.
(12.) One exception may be the effect of salt corrosion in northern
climates. Automobiles located in cold climates, or close to the coast,
may suffer negative location externalities from salt corrosion.
(13.) We are indebted to an anonymous referee for raising this
point.
(14.) This practice varies little from national real estate
lenders, who make home loans without ever meeting the borrower or who
purchase mortgages originated by others. In both cases, the lender could
use secondary data sources to infer the race of the borrower, if the
lender was so motivated. Hence, we cannot rule discrimination out a
priori.
(15.) Clearly, we would prefer to have the racial identity of each
of the 40,000 plus individual borrowers. Since that does not exist in
this data set or any other data set suitable for a loan performance
study, we must use the Census data to augment our study. We take
additional steps in the analysis to overcome this inherent problem by
sorting the data set for those zip codes where it is almost certain the
residents are white or almost certain the residents are minorities. This
practice results in models with significant degrees of freedom, despite
the sorting.
(16.) Obviously, the lender must have the borrower's zip code
in order to administer the loan.
(17.) The absence of collateral data in the lender's
information set reflects the fact that collateral risk is not the
principal issue in the used car credit market.
(18.) Up-to-date details as to how state regulations vary are
available from the Commerce Clearing House publications.
(19.) A lower criterion for the minority proportion was chosen in
order to have sufficient degrees of freedom. Note, the 85% threshold for
minorities results in an average proportion for the minority sample of
over 90%.
(20.) For each observation, the predicted value is computed as
described in the text. The actual value is subtracted from the predicted
value, and the differences are summed and divided by the number of
observations. The resulting average difference is then divided by the
average actual value for the sample.
(21.) An expert system is one where each credit application is
evaluated by an "expert" credit analyst, who renders an accept
or reject decision. It represents a precomputing technology.
REFERENCES
Arrow, Kenneth J. "The Theory of Discrimination." In
Discrimination in Labor Markets, edited by Orley Ashenfelter and Albert
Rees. Princeton, N.J.: Princeton University Press, 1973.
Becker, Gary S. The Economics of Discrimination. Chicago:
University of Chicago Press, 1957.
-----. "The Evidence Against Banks Doesn't Prove
Bias." Business Week, April 19, 1993, 18.
Boyes, William J., Dennis L. Hoffman, and Stuart A. Low. "An
Econometric Analysis of the Bank Credit Scoring Problem." Journal
of Econometrics, 40, 1989, 3-14.
Brimelow, Peter. "Racism at Work?" National Review, April
12, 1993, 42.
Brimelow, Peter, and Leslie Spencer. "The Hidden Clue."
Forbes, January 4, 1993, 48.
Carr, James, and Isaac F. Megbolugbe. "The Federal Reserve
Bank of Boston Study on Mortgage Redlining Revisited." Journal of
Housing Research, 4, 1994, 277-314.
Day, Theodore E., and S. J. Liebowitz. "Mortgage Lending to
Minorities: Where's the Bias?" Economic Inquiry, January 1998,
3-28.
Gabriel, Stuart A., and Stuart S. Rosenthal. "Credit
Rationing, Race, and the Mortgage Market." Journal of Urban
Economics, 29, 1991, 371-79.
Harrison, Glenn W. "Mortgage Lending in Boston: A
Reconsideration of the Evidence." Economic Inquiry, January 1998,
29-38.
Jaffee, Dwight, and Joseph Stiglitz. "Credit Rationing."
In Handbook of Monetary Economics, vol. 2., edited by B. M. Friedman and
F. H. Hahn. New York: Elsevier Science, 1990, 837-88.
Jayaratne, Jith, and Philip Strahan. "Entry Restrictions,
Industry Evolution, and Dynamic Efficiency: Evidence from Commercial
Banking." Journal of Law and Economics, April 1998, 239-73.
Martin, Robert E., and David J. Smyth. "Adverse Selection and
Moral Hazard Effects in the Mortgage Market: An Empirical
Analysis." Southern Economic Journal, April 1991, 1071-84.
Munnell, Alicia, Lynn E. Browne, James McEneaney, and Geoffrey M.
B. Tootell. "Mortgage Lending in Boston: Interpreting HMDA Data."
American Economic Review, March 1996, 25-53.
Phelps, Edmund S. "The Statistical Theory of Racism and
Sexism." American Economic Review, September 1972, 659-61.
Stiglitz, Joseph, and Andrew Weiss. "Credit Rationing in
Markets with Imperfect Information." American Economic Review, June
1981, 393-410.
Tootell, Geoffrey M. B. "Defaults, Denials, and Discrimination
in Mortgage Lending." New England Economic Review,
September/October 1993, 45-51.
Williamson, S. "Costly Monitoring, Loan Contracts, and
Equilibrium Credit Rationing." Quarterly Journal of Economics,
February 1987, 135-45.
TABLE I
Poisson Models: Administrative Cost Variables [a]

                        Number of     Number of     Number of     Number of
                        No-Pay        Partial-Pay   Pay           Extensions
                        Months        Months        Reversals
Independent Variable    (nopmt)       (partpmt)     (revers)      (numext)
Intercept               0.09          -2.06         -7.05         -3.80
                        (2.86)        (23.61)       (20.24)       (28.41)
Interest rate           0.001         -0.004        0.01          0.006
                        (1.95)        (2.33)        (1.92)        (2.38)
Loan amount             -0.00006      0.00001       -0.00003      -0.00004
                        (11.05)       (1.62)        (0.61)        (2.50)
Payment                 0.001         0.003         0.004         0.003
                        (13.24)       (15.21)       (3.94)        (10.94)
Term                    0.03          0.02          0.04          0.04
                        (38.55)       (13.63)       (5.95)        (16.29)
Down payment            -0.002        -0.008        -0.0002       -0.007
                        (11.30)       (11.65)       (0.09)        (6.39)
Fixed payment ratio     0.002         -0.0001       0.001         0.002
                        (18.19)       (0.33)        (1.50)        (4.46)
Buyer's age             -0.05         -0.005        -0.003        -0.001
                        (24.05)       (8.26)        (1.42)        (1.23)
Buyer's income          -0.00002      -0.00006      0.00004       -0.00002
                        (8.18)        (9.22)        (1.86)        (2.77)
Homeowner               -0.15         -0.21         -0.23         -0.02
                        (23.00)       (11.45)       (3.25)        (0.61)
Cosigner                0.005         0.03          -0.28         0.05
                        (0.60)        (1.14)        (2.57)        (1.06)
Unskilled labor         0.14          0.30          0.17          0.24
                        (14.43)       (11.05)       (1.68)        (5.76)
Skilled labor           0.13          0.13          0.02          0.18
                        (13.42)       (4.62)        (0.15)        (4.24)
Professional            0.15          0.03          0.12          -0.17
                        (11.44)       (0.69)        (0.90)        (2.84)
Car age                 -0.01         -0.01         -0.02         -0.06
                        (10.54)       (2.62)        (1.06)        (9.90)
New car                 -0.05         -1.03         -3.44         -1.19
                        (0.20)        (6.35)        (2.27)        (4.23)
State 1                 0.17          -0.03         0.17          0.15
                        (23.27)       (1.65)        (1.94)        (4.73)
State 2                 0.10          -0.05         0.44          -0.05
                        (10.12)       (1.97)        (4.44)        (1.22)
State 3                 0.16          0.19          0.47          0.29
                        (22.60)       (9.77)        (5.85)        (9.59)
Minority                0.36          0.42          1.07          0.38
                        (23.88)       (10.08)       (6.67)        (6.01)
White Hispanic          0.39          0.51          0.32          -0.67
                        (9.93)        (4.70)        (0.79)        (2.81)
Median house price      0.0000005     -0.0000002    0.000003      0.0000002
                        (2.57)        (0.39)        (1.73)        (0.24)
Interest, dividend,     0.000001      -0.000003     0.00001       -0.00001
rent per house          (1.08)        (1.12)        (1.16)        (1.78)
Interest, dividend,     -0.02         -0.07         0.26          -0.26
rent proportion         (0.78)        (0.98)        (1.03)        (2.43)
Median household        0.000007      0.000008      0.00002       0.00001
income                  (10.34)       (4.37)        (2.74)        (5.16)
Log-likelihood          -142,621      -44,908       -5,759        -21,893
Observations            45,361        45,361        45,361        45,361
(a.) t-values in parentheses.
TABLE II
Means: Total Sample, White Sample, and Minority Sample

                                          Total         White         Minority
Variable                                  (n = 45,361)  (n = 3,142)   (n = 957)
Performance variable:
Bad loan (%)                              14.7          13.9          20.8
No payment                                3.2           2.7           4.2
Partial pay                               0.4           0.3           0.6
Reversal                                  0.029         0.02          0.05
Extensions                                0.2           0.1           0.2
Independent variable:
Interest rate (%)                         27.9          28.8          26.9
Amount ($)                                3,949         3,410         4,456
Payment ($)                               200           180           215
Term (mo)                                 25.9          24.5          27.4
Down payment (%)                          15.5          17.9          11.9
Fixed payment ratio                       2.1           1.5           3.2
Buyer's age (yrs)                         36.6          36.0          38.0
Income ($)                                1,719         1,716         1,699
Homeowner (%)                             27.0          26.6          28.3
Cosigner (%)                              10.8          16.2          10.1
Unskilled (%)                             50.9          58.0          50.1
Skilled (%)                               32.3          25.0          33.1
Professional (%)                          6.6           8.0           3.4
Car age (yrs)                             3.3           4.3           2.6
New car (%)                               0.6           0.5           0.4
State 1 (%)                               26.9          2.6           11.9
State 2 (%)                               15.8          14.2          13.4
State 3 (%)                               28.5          47.2          47.8
Minority (%)                              29.9          0.8           91.5
White Hispanic (%)                        1.6           0.6           0.1
Median house price ($)                    60,337        57,848        42,309
Interest, dividend, rent per house ($)    5,735         5,533         3,474
Interest, dividend, rent proportion (%)   41.0          46.7          15.9
Median household income ($)               24,823        23,917        15,600
TABLE III
Probit Default Rate Model: bloan

                                        Default Probability
Independent variable                    Coefficient   p-Value
Intercept                               -1.44         0.0001
Interest rate                           0.005         0.0020
Loan amount                             -0.00005      0.0051
Payment                                 0.17          0.0001
Term                                    0.01          0.0001
Down payment                            -0.006        0.0001
Fixed payment ratio                     0.003         0.0001
Buyer's age                             -0.008        0.0001
Buyer's income                          -0.00004      0.0001
Homeowner                               -0.20         0.0001
Cosigner                                -0.11         0.0001
Unskilled                               -0.004        0.8797
Skilled                                 0.03          0.2200
Professional                            -0.02         0.6075
Car age                                 -0.001        0.6805
New car                                 -0.16         0.1387
State 1                                 0.13          0.0001
State 2                                 0.16          0.0001
State 3                                 0.22          0.0001
Minority                                0.13          0.0002
White Hispanic                          -0.14         0.2380
Log-likelihood                          -18,261.76
Observations                            45,361
TABLE IV
Probit Default Rate Model with Wealth Variables: bloan

                                        Default Probability
Independent variable                    Coefficient   p-Value
Intercept                               -1.44         0.0001
Interest rate                           0.005         0.0017
Loan amount                             -0.00004      0.0055
Payment                                 0.17          0.0001
Term                                    0.01          0.0001
Down payment                            -0.006        0.0001
Fixed payment ratio                     0.0003        0.0001
Buyer's age                             -0.008        0.0001
Buyer's income                          -0.00004      0.0001
Homeowner                               -0.19         0.0001
Cosigner                                -0.11         0.0001
Unskilled                               -0.0009       0.9729
Skilled                                 0.04          0.1971
Professional                            -0.02         0.6333
Car age                                 -0.001        0.6872
New car                                 -0.15         0.1472
State 1                                 0.12          0.0001
State 2                                 0.17          0.0001
State 3                                 0.21          0.0001
Minority                                0.09          0.0284
White Hispanic                          -0.27         0.0293
Median housing price                    0.02          0.0004
Interest, dividends, rent, per house    0.0002        0.5505
Interest, dividends, rent, proportion   -0.15         0.0277
Median household income                 -0.000003     0.1577
Log-likelihood                          -18,253.52
Observations                            45,361
List of Variables
Performance Variables:
bloan - "bad" loan, a loan that has been written off
and/or the car has been repossessed, (1,0).
nopmt - number of months during payment history when no payment was
received.
partpmt - number of months during payment history when a partial
payment was received.
revers - number of months during payment history when payments were
reversed.
numext - number of extensions given.
Independent Variables:
apr - annual percentage interest rate on loan.
amt - amount borrowed.
pay - monthly payment.
term - length of loan measured in months.
dp - percent down payment.
fixpay - ratio of monthly fixed payments divided by monthly income.
bage - borrower's age.
bcmi - borrower's combined monthly income.
home - home ownership, (1,0).
cosign - cosigner on loan, (1,0).
ul - unskilled laborer.
sl - skilled laborer.
prof - professional worker.
cage - car age, measured in years.
new - new car, (1,0).
st1, st2, and st3 - state of residence control variables, (1,0).
min - minority population as percent of zip code.
wh - white Hispanic population as percent of zip code.
medhp - median housing price in zip code.
idrinch - average interest, dividend, and rental income reported
per household in zip code.
idrincp - proportion of households in zip code reporting interest,
dividend, and rental income.
mhi - median household income for zip code.