Evidence of Adverse Selection from Thoroughbred Wagering.
Wimmer, Bradley S.
Brian Chezum [*]
Bradley S. Wimmer [+]
Previous research has shown the thoroughbred sales market to be
affected by adverse selection. In the market, sellers who race as well
as breed thoroughbreds will choose to keep thoroughbreds when their
estimated private values exceed expected sales prices. The presence of
asymmetric information leads these sellers to sell their low-quality
horses and keep their best for racing. We extend the analysis by
examining how bettors use similar information when wagering on
thoroughbred races. We show, using a sample of two-year-old maiden races, that homebreds (those horses kept by their breeders for racing)
are favored over otherwise similar nonhomebreds.
1. Introduction
In his Lemons Market example, Akerlof (1970) shows that some
mutually beneficial trades will not occur when sellers have private
information about the quality of goods. Essentially, a buyer's best
estimate of the quality of any seller's good is the market average,
and sellers of high-quality goods may not enter the market. Although
there is little dispute over the theoretical underpinnings of
Akerlof's Lemons Market, relatively few empirical studies present
evidence illustrating this outcome. Some examples are Greenwald and
Glasspiegel (1983), Gibbons and Katz (1991), Genesove (1993), and Chezum
and Wimmer (1997).
These papers relate data on prices with distinguishing
characteristics of sellers to show that sellers with "adverse"
characteristics receive lower prices. For example, Chezum and Wimmer,
examining the thoroughbred yearling sales market, show that prices
commanded by sellers who also race thoroughbreds are lower than prices
commanded by sellers who sell all of their thoroughbreds. Intuitively,
breeders will take a thoroughbred to market when the expected market
price exceeds the value they would receive from retaining the animal. If
private information is important and costly to transmit, prices for
thoroughbreds sold by racing-intensive breeders will be lower than those
received for similar thoroughbreds sold by breeders who do not race. [1]
This paper extends this intuition by examining how bettors evaluate
two-year-old thoroughbreds when they reach the track. If breeders
adversely select the horses they sell, a horse retained by a breeder should be of a higher average quality. If betting markets are efficient,
a finding that bettors expect homebreds (horses retained by their
breeders) to outperform horses that were sold indicates that adverse
selection is present in the market for thoroughbreds.
Studies of betting markets have found that bettors predict the
outcomes of horse races (and other sporting events) relatively
accurately. [2] These studies show that bettors are able to aggregate
disparate pieces of information efficiently. Use of betting information
therefore provides an opportunity to examine how markets incorporate
information on breeders' decisions to keep or sell thoroughbreds in
settings other than a sale. Such a test of adverse selection is
relatively uncommon in the literature because it does not rely on sales
data to determine whether adverse selection is present in a market. [3]
The use of race data allows us to compare the quality of goods retained
by their producers with that of those offered for sale.
This prediction is clarified in section 2. Section 3 describes the
data used to test our prediction and the empirical strategy. Using data
from a set of races for two-year-old thoroughbreds conducted at the
Keeneland and Saratoga racecourses in the summer and fall of 1995, we
find that homebreds are favored over otherwise similar nonhomebreds as
reflected by the post-time odds of a race.
2. Thoroughbred Sales Markets and Betting Behavior
In the thoroughbred industry, owners obtain thoroughbreds for
racing by purchasing them privately, at auction, or through their own
breeding operations. [4] In turn, breeders may be classified as one of
three types: those who sell all of their thoroughbreds, those who race
all of their thoroughbreds, and those who both sell and race
thoroughbreds. Breeders who keep a portion of their crop are expected to
use private information to determine which thoroughbreds they keep to
race.
In the time before breeders take their thoroughbreds to market,
they observe how their horses respond to other thoroughbreds, have
access to their horses' complete medical histories, and are
generally able to identify their thoroughbreds' temperaments.
Although these factors do not predict future on-track success perfectly,
they do give the seller an informational advantage over buyers. At
thoroughbred auctions, information on a thoroughbred's breeding is
available and buyers are allowed to inspect thoroughbreds prior to the
sale. However, buyers do not have access to the information that the
seller possesses. Because buyers do not have perfect information, sales
prices reflect the average quality of thoroughbreds offered for sale
with similar breeding and visual characteristics.
Sellers who also race are expected to use private information to
determine which thoroughbreds to take to market. As in the standard
lemons model, the presence of bad thoroughbreds forces the average price
down, and the highest quality yearlings are likely to exit the market.
If buyers know this, they expect the average quality to decline and, as
in Akerlof (1970), the market may collapse. Genesove (1993) examines a
model in which buyers and sellers have identical tastes but sellers are
capacity constrained so that some thoroughbreds are sold for reasons
other than adverse selection. The presence of capacity constraints generates an equilibrium with positive prices because buyers do not know
whether a seller is selling a thoroughbred because it is capacity
constrained or because the horse is being adversely selected. In this
model, price reflects the average quality traded, and sellers take their
lowest quality horses to market to ease their capacity constraints. This
may describe the thoroughbred industry because participants who race are
limited in their capacity to maintain a large stable. [5]
When horses reach the racetrack, they race for a portion of a purse that is typically funded through pari-mutuel wagering. [6] Consumers bet
on the order of finish. Simple bets are those for win, place, and show
(first, second, and third, respectively). A bettor receives a payoff
when the horse on which he wagered finishes in the wagered position or
better. For example, a show bet will pay if the horse finishes first,
second, or third. [7] Each wager type has its own pool, which is the sum
of money bet on all horses for each type of wager. Payoffs are based on
the relative proportion of money bet in the appropriate pool on each
horse in the race. These payoffs are indicated by the odds. As the
relative amount bet on a horse increases, the odds and subsequent
payoffs fall.
Bettors analyze available information to predict how the horses
entered in a race will finish. This process is referred to as
handicapping. The object of handicapping is to allocate available funds
in a way that maximizes expected returns. [8] The question of interest
is how bettors use information on whether a horse was sold before it
reached the track when making their wagers. This information, as well as
other relevant handicapping information, is published in The Daily
Racing Form. [9]
If bettors know that breeders may keep some of their yearlings, and
that breeders are more likely to sell horses from the lower end of the
distribution, they should expect thoroughbreds being raced by their
breeder to be of higher average quality and a test for adverse selection
arises. Specifically, bettors should favor thoroughbreds that are kept
by their breeder over otherwise similar thoroughbreds.
More formally, presume that horses in a race are drawn from a
quality distribution $F(q)$ on the interval $(q_L, q_H)$, where $q$
indexes quality ($L$ indicating low quality and $H$ high quality). Higher-quality horses are
more likely to win races. With no other information, the best estimate
of a horse's quality is the mean quality from the known
distribution. If bettors know a horse is a homebred, they also know that
the breeder has chosen to race, rather than sell, the horse. If
thoroughbreds taken to auction are adversely selected, the signal that
the horse is a homebred indicates that it is drawn from the top of the
distribution. That is to say, homebreds are drawn from the interval
$(q^*, q_H)$, where $q^* > q_L$. [10]
Bettors should therefore favor homebreds over otherwise similar nonhomebreds.
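The truncation argument can be illustrated numerically. The sketch below assumes, purely for illustration, that quality is uniformly distributed; the paper does not specify the form of F(q). The function names and the numerical values of the bounds and cutoff are hypothetical.

```python
def uniform_mean(low, high):
    """Mean of a uniform distribution on (low, high)."""
    return (low + high) / 2.0


def truncated_uniform_mean(low, high, cutoff):
    """Mean of a uniform(low, high) variable conditional on exceeding cutoff."""
    if not low <= cutoff < high:
        raise ValueError("cutoff must lie in [low, high)")
    return (cutoff + high) / 2.0


# Illustrative values only: q_L = 0, q_H = 1, q* = 0.4 (all assumed).
q_low, q_high, q_star = 0.0, 1.0, 0.4
unconditional_quality = uniform_mean(q_low, q_high)
homebred_quality = truncated_uniform_mean(q_low, q_high, q_star)
```

Under these assumed values, the expected quality of a horse known to be drawn from above the cutoff exceeds the unconditional mean, which is the sense in which the homebred signal should move bettors.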
It is worth noting that a breeder may race a particular horse for
many reasons. Some breeders will keep a portion of their female horses
because they are inputs in the future breeding process. Also, a breeder
may be "stuck" with a horse because it was ill at the time of
a sale. [11] The weight put on the homebred characteristic may be
discounted if such information is available to bettors.
3. Data and Empirical Strategy
We test our hypothesis by examining data on two-year-old maiden
races conducted during the summer and fall of 1995 at the Saratoga and
Keeneland racecourses. [12] We examine two-year-old maiden races to
highlight the differences that might exist in bettors' perceptions
based on the distinction that a horse is or is not a homebred. We define
a homebred as a horse that has at least one entity listed as both its
breeder and owner at the time of the race. The variable
"Homebred" is set equal to one when this condition is met and
is set equal to zero otherwise.
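A minimal sketch of how the Homebred indicator could be coded from the breeder and owner listings follows. The function name, the name-normalization step, and the example entities are assumptions for illustration; the paper does not describe how names were matched.

```python
def normalize(name):
    """Lowercase a name and collapse whitespace so trivially different spellings match."""
    return " ".join(name.lower().split())


def homebred_indicator(breeders, owners):
    """Return 1 if at least one entity is listed as both breeder and owner, else 0."""
    shared = {normalize(b) for b in breeders} & {normalize(o) for o in owners}
    return 1 if shared else 0


# Hypothetical entries: one partner among the owners is also the breeder.
example_homebred = homebred_indicator(["Farm A"], ["farm a", "Partner B"])
example_sold = homebred_indicator(["Farm A"], ["Stable C"])
```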
A maiden race is a race restricted to thoroughbreds that have not
yet won a race. The majority of horses in two-year-old maiden races have
few or no previous starts because thoroughbreds do not begin racing
until their two-year-old season. Generally, two-year-old maidens are
randomly allocated to races based on a nomination process. Owners
nominate their maiden two-year-olds to races for an upcoming meet. From
this pool of eligible two-year-olds, horses are drawn into particular
races several days prior to the race. In the meets studied, most
nominations are made well in advance of the races. [13]
In maiden races, bettors have limited information about the
on-track performance of horses. For the problem examined here, this is
advantageous because as horses start in more races, information about
their on-track ability is revealed. If a horse's on-track ability
is correlated with being a homebred, the homebred variable may become
less important statistically. [14]
Posttime odds are used to measure the attitudes of bettors toward
each horse in our sample. The most prominent piece of information
displayed at the track prior to each race is the odds. The current win
odds are displayed and updated in one-minute intervals, following the
most recent race until the horses reach the starting gate. The odds are
calculated as follows:
$$\text{odds}_k = \frac{1 - t}{P_k} - 1,$$
where $P_k$ is the proportion of the total win pool bet on
horse $k$, and $t$ is the takeout or pari-mutuel tax. As more dollars are bet
on a particular horse, the odds fall. Lower odds indicate that bettors
find it relatively more likely that a horse will win the race. At the
time the horses reach the starting gate, the betting windows are closed
and the final race odds are calculated. These posttime odds are
published for each race and are the measure of odds we use in this
study.
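The odds calculation described above can be sketched as follows. The pool amounts and the 16% takeout rate are illustrative assumptions, not figures from the paper, and the function name is hypothetical.

```python
def win_odds(win_pool, takeout):
    """Pari-mutuel win odds for each horse: (1 - t) / P_k - 1.

    win_pool maps each horse to the dollars bet on it to win;
    takeout is the fraction t removed from the pool.
    """
    total = sum(win_pool.values())
    odds = {}
    for horse, amount in win_pool.items():
        share = amount / total          # P_k, the horse's share of the win pool
        odds[horse] = (1.0 - takeout) / share - 1.0
    return odds


# Illustrative pool: horse A attracts half of all win money.
pool = {"A": 5000.0, "B": 3000.0, "C": 2000.0}
odds = win_odds(pool, takeout=0.16)
```

In this example the most heavily bet horse has the lowest odds, matching the statement that odds fall as more dollars are bet on a horse.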
We model the speed of a racehorse as being equal to $y_{ir} = x_{ir}\beta + \epsilon_{ir}$,
where $x_{ir}$ is a vector of
covariates, $\beta$ is the corresponding vector of coefficients,
$\epsilon_{ir}$ is the random error term, $i$ indexes the horse, and $r$ indexes
the race. [15] Horse $i$ will win race $r$ if $y_i > y_j$ for each $j \neq i$.
Assuming the error terms are
independent extreme value random variables, the probability that horse $i$
wins a race is given by
$$\frac{e^{x_i\beta}}{\sum_{j=1}^{m} e^{x_j\beta}},$$
where $j$ indexes all the horses included in a
race.
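As a small sketch of this win probability, the function below evaluates the logit formula for one race given each horse's index value (the product of covariates and coefficients). The index values are made up for illustration, and the function name is hypothetical.

```python
import math


def win_probabilities(index_values):
    """Logit win probabilities exp(x_i b) / sum_j exp(x_j b) for one race."""
    shift = max(index_values)  # subtracting the max leaves the ratios unchanged
    weights = [math.exp(v - shift) for v in index_values]
    total = sum(weights)
    return [w / total for w in weights]


# Three horses with assumed index values; a higher index means a likelier winner.
probs = win_probabilities([0.5, 0.0, -0.5])
```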
This model leads naturally to a fixed-effects regression model.
Setting the probability that a horse will win a race equal to $(1 - t)/(1 + \text{odds})$,
the betting public's expected probability that a horse
will win a race, and manipulating, gives the fixed-effects model
$$\ln(\text{odds}_{ir} + 1) = d_r + x_{ir}\beta,$$
where $i$ indexes each horse in race $r$ and $d_r$ is a set of
dummy variables to capture fixed race effects.
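The link between the logit probabilities and the fixed-effects regression can be checked numerically. The sketch below is an illustration under assumed index values and an assumed 16% takeout. It shows that ln(odds + 1) splits into a term common to every horse in the race (the race effect) and the horse's own index. In a literal substitution the index enters with a minus sign, which fits the paper's later remark that the negative of ln(odds + 1) is used as the dependent variable so that signs agree with the rank-ordered logit.

```python
import math


def log_odds_plus_one(index_values, takeout):
    """ln(odds + 1) implied by logit win probabilities and a takeout rate."""
    denom = sum(math.exp(v) for v in index_values)
    values = []
    for v in index_values:
        prob = math.exp(v) / denom              # public's win probability
        odds = (1.0 - takeout) / prob - 1.0     # invert prob = (1 - t) / (1 + odds)
        values.append(math.log(odds + 1.0))
    return values


xb = [0.8, 0.1, -0.4]                            # assumed index values for one race
y = log_odds_plus_one(xb, takeout=0.16)
# Race-level term shared by all horses in the race.
race_effect = math.log(1.0 - 0.16) + math.log(sum(math.exp(v) for v in xb))
```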
We also examine a rank-ordered logit specification. Several studies
have shown that the odds may not reflect the true probability that a
horse will win a race. [16] The basic result is that long shots (horses
with a low true probability of winning a race) are overbet. Bettors
perceive the probability that a long shot will win a race to be higher
than it actually is. Similarly, favorites are underbet: the true probability of a favorite winning a race is higher than bettors predict. This
favorite--long-shot bias suggests that the odds may not properly
aggregate the information on the relative quality of the horses entered
in a particular race, although their predicted rankings are, on average,
accurate. The noise introduced by this bias is reduced in the
rank-ordered logit model.
Beggs, Cardell, and Hausman (1981; BCH) show that if the
probability of choice i being favored over choice j is independent of
the other choices available, the probability of an observed ranking, 1
[greater than] 2 [greater than] ... [greater than] m, is the product of
the conditional probabilities of choices from successively restricted
subsets. This gives the following likelihood function for a set of
observations:
$$L = \prod_{r=1}^{n} \left( \frac{e^{x_1\beta}}{\sum_{i=1}^{m} e^{x_i\beta}} \cdot \frac{e^{x_2\beta}}{\sum_{i=2}^{m} e^{x_i\beta}} \cdots \frac{e^{x_{m-1}\beta}}{\sum_{i=m-1}^{m} e^{x_i\beta}} \right),$$
where r = 1, ..., n indexes each race, and i = 1, ..., m indexes
the horses in each race by rank, where i = 1 indicates the horse is the
favorite in the race. This specification improves over an ordinary logit
specification because it contains information on the entire ranking
rather than estimating only the probability that one horse is favored over
all others. Additionally, this specification accounts for within-race
effects.
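A minimal sketch of one race's contribution to the full-rank log-likelihood is given below. It takes the horses' index values already ordered by the public's ranking, favorite first; the function name and test values are assumptions for illustration and this is not the authors' estimation code.

```python
import math


def rank_ordered_loglik(index_by_rank):
    """Log-likelihood of an observed ranking for one race.

    index_by_rank[0] is the favorite's index value, index_by_rank[1] the
    second choice's, and so on. Each stage is a logit choice among the
    horses not yet ranked.
    """
    loglik = 0.0
    m = len(index_by_rank)
    for k in range(m - 1):  # the last remaining horse is chosen with probability 1
        remaining = index_by_rank[k:]
        log_denom = math.log(sum(math.exp(v) for v in remaining))
        loglik += index_by_rank[k] - log_denom
    return loglik


# With identical index values every ordering of three horses is equally likely.
equal_three = rank_ordered_loglik([0.0, 0.0, 0.0])
```

Summing this quantity over races gives the log of the likelihood function above, which would then be maximized over the coefficients.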
Hausman and Ruud (1987) note that people are likely to use more
care when differentiating between top choices compared to the less
favored alternatives. Intuitively, bettors paint a clearer picture among
the favorites, but random factors become more important for lower-ranked
options. To account for these difficulties, Hausman and Ruud modify the
BCH likelihood function by restricting the model to include only the top
P ranks. This specification is given by
$$L = \prod_{r=1}^{n} \left( \frac{e^{x_1\beta}}{\sum_{i=1}^{m} e^{x_i\beta}} \cdot \frac{e^{x_2\beta}}{\sum_{i=2}^{m} e^{x_i\beta}} \cdots \frac{e^{x_P\beta}}{\sum_{i=P}^{m} e^{x_i\beta}} \right),$$
where P is the number of ranks used to estimate the model. This is
essentially a weighted maximum likelihood technique, where the weight on
the first P ranks is 1, and zero on the remainder. We estimate both
full-rank and partial-rank logit specifications along with the
fixed-effects model discussed above.
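The partial-rank restriction can be sketched by stopping the sequence of logit choices after the top P ranks. As before, this is an illustrative sketch with hypothetical names, not the authors' code; index values are ordered favorite first.

```python
import math


def partial_rank_loglik(index_by_rank, top_p):
    """Log-likelihood for one race using only the top_p ranked horses.

    Each of the first top_p stages is a logit choice among all horses
    not yet ranked; lower ranks receive zero weight.
    """
    if top_p < 1:
        raise ValueError("top_p must be at least 1")
    m = len(index_by_rank)
    stages = min(top_p, m - 1)  # a choice from a single horse adds nothing
    loglik = 0.0
    for k in range(stages):
        remaining = index_by_rank[k:]
        log_denom = math.log(sum(math.exp(v) for v in remaining))
        loglik += index_by_rank[k] - log_denom
    return loglik


# Using only the favorite (P = 1) reduces to an ordinary logit choice.
favorite_only = partial_rank_loglik([0.0, 0.0, 0.0], top_p=1)
```

Setting top_p to the number of horses or more reproduces the full-rank value, so the full-rank specification is the special case with no truncation.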
The data on the control variables and information regarding the
name of the breeder, owner, and trainer of each horse in our sample were
collected from various issues of The Daily Racing Form (henceforth The
Form) and the American Produce Records. For each race, The Form lists
the conditions of the race and the "past performances" of all
horses entered. [17] The past performances include information on the
horse's name, its sire (father) and dam (mother), the names of the
breeder and owner of the horse, the trainer, the horse's career
record (number of starts, record in starts, and money earnings), the
horse's most recent performances, and the results of recent
workouts. We collected information from each of these components to
capture the information available to bettors.
Posttime odds, post position, and jockey were taken from the charts
for each race. The charts for the races in our sample were obtained from
The Lexington Herald-Leader for the Keeneland races and from The Form
for the Saratoga races. Our sample includes 389 horses drawn from 39
horse races. The sample includes all two-year-old maiden races run
during the Saratoga and Keeneland fall meetings that had at least two
homebreds or at least two nonhomebreds entered.
To measure the ability of the jockeys, we include the variable
Jockey Winning Percentage. Jockey Winning Percentage is equal to each
jockey's 1994 winning percentage (number of wins divided by number
of starts). An increase in Jockey Winning Percentage indicates that a
jockey of relatively high skill is riding the horse and should therefore
be looked at more favorably by bettors. Thus, Jockey Winning Percentage
should be inversely related to a horse's odds and positively related to the
probability that the horse will be among the race favorites. The race
favorite has the lowest odds among the horses in the race. In the
rank-ordered logit model, variables that are positively correlated with
the probability that a horse will be among the race favorites will have
a positive coefficient. To keep the analysis consistent, we use the
negative of the natural logarithm of (odds + 1) in the fixed-effect
model.
The Form includes information on recent workouts, which indicates
how a horse is performing. A common workout listing is
* Oct 1 Kee 5f fst :[58.sup.2] B 1/23.
This line shows that the horse worked five furlongs on October 1 at
Keeneland. [18] The track was fast. The workout was completed in a time
of 58 and 2/5 seconds. The B is a comment that the horse was breezing or
moving easily. The fraction indicates that this was the day's
fastest time of the 23 horses that worked this distance at Keeneland on
October 1. The solid black dot at the front of a workout line indicates
that the work was a bullet, or the fastest five-furlong work at
Keeneland that day.
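The fields of a workout line can be pulled apart mechanically. The sketch below assumes a plain-text transcription in which the leading bullet is written as an asterisk and the superscript fifths of a second are written after a caret (for example ":58^2"); that transcription, the pattern, and the function name are assumptions for illustration, not a format defined by The Form. Times of a minute or longer are not handled.

```python
import re

# Assumed plain-text layout: [*] Mon D Track Nf condition :SS^F comment rank/field
WORKOUT_PATTERN = re.compile(
    r"^(?P<bullet>\*\s*)?"
    r"(?P<month>[A-Za-z]{3}) (?P<day>\d{1,2}) "
    r"(?P<track>\w+) (?P<furlongs>\d+)f (?P<condition>\w+) "
    r":(?P<seconds>\d+)\^(?P<fifths>[0-4]) "
    r"(?P<comment>[A-Za-z]+) (?P<rank>\d+)/(?P<field>\d+)\.?$"
)


def parse_workout(line):
    """Split one workout line into its components."""
    match = WORKOUT_PATTERN.match(line.strip())
    if match is None:
        raise ValueError(f"unrecognized workout line: {line!r}")
    return {
        "bullet": match.group("bullet") is not None,
        "date": f"{match.group('month')} {match.group('day')}",
        "track": match.group("track"),
        "furlongs": int(match.group("furlongs")),
        "condition": match.group("condition"),
        # Times are quoted in seconds and fifths of a second.
        "seconds": int(match.group("seconds")) + int(match.group("fifths")) / 5.0,
        "comment": match.group("comment"),
        "rank": int(match.group("rank")),
        "field": int(match.group("field")),
    }


work = parse_workout("* Oct 1 Kee 5f fst :58^2 B 1/23.")
```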
To incorporate information contained in the work line, we include
the number of listed works (Works), the number of bullets in the
previous three works (Bullets), and the relative ranking of a
horse's last work (Workrank). As Works increases, we argue that the
trainer has made greater efforts to prepare the horse for the race. A
bullet workout indicates that a horse is training well. We expect Works
and Bullets to be positively related to the probability that a horse
will be among the race favorites. Finally, a higher Workrank indicates
the last workout was relatively poor, and we expect to see an inverse
relationship between Workrank and the public's view that a horse
will win.
The variable Month is defined to be equal to one if the horse was
born in January, two if February, and so on. The more recently a horse
was born, the larger the value of this variable. We include the variable
Month to capture the effect of a horse's age and expect horses born
more recently to be relatively more immature. We expect a negative
correlation between Month and the probability that a horse will win.
On a regular basis, The Form publishes a list of sires that have
shown a propensity to beget horses that win as two-year-olds. The
variable Outstanding Juvenile Sire is set equal to one if a horse's
sire appears on this list, and zero otherwise. We expect Outstanding
Juvenile Sire to increase the probability that a horse is among the race
favorites. We also include a variable New Sire, which is set to one if
the sire's first crop of two-year-olds is currently racing. This
variable is included to control for the inability of these sires to
appear on the published juvenile sire list and the surrounding uncertainty regarding the sire's potential.
To control for the past racing performance of each horse, we
include the variable Percent in the Money, which is defined as the
ratio of top-three finishes (in the money) to the number of races a
particular horse has started. In maiden races (races for horses that
have not previously won), there is some indication that the horse may
soon win if it has previously been in the money. We expect Percent in
the Money to be positively related to the public's perception that
a horse will win. We also include the variable First Start, defined as
one if the horse is a first-time starter, and zero otherwise. Because
horses making their first start lack experience, we expect the public to
look upon these horses less favorably than experienced horses.
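The two past-performance variables can be sketched as below. The paper does not say how Percent in the Money is coded for a horse with no previous starts; setting it to zero in that case is an assumption made here so that the ratio is defined, with First Start absorbing the difference. The function name is hypothetical.

```python
def past_performance_vars(starts, in_the_money):
    """Return (Percent in the Money, First Start) for one horse.

    starts is the number of previous races; in_the_money is the number
    of those races in which the horse finished in the top three.
    """
    if starts < 0 or not 0 <= in_the_money <= starts:
        raise ValueError("need 0 <= in_the_money <= starts")
    first_start = 1 if starts == 0 else 0
    # Assumption: a first-time starter is assigned a ratio of zero.
    percent_in_money = in_the_money / starts if starts > 0 else 0.0
    return percent_in_money, first_start


experienced = past_performance_vars(starts=4, in_the_money=1)
debut = past_performance_vars(starts=0, in_the_money=0)
```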
We include Trainer Winning Percentage and Trainer Zero to account
for the quality of the horse's trainer. Trainer Winning Percentage
is equal to a trainer's number of wins in 1994 divided by the
number of starters in 1994. An increase in Trainer Winning Percentage
indicates that a conditioner of relatively higher skill trains the horse
and therefore should be looked at more favorably by bettors.
Conditioners who had a zero winning percentage in 1994 train a portion
of the horses in our sample. These are predominantly trainers that
earned less than $50,000 in 1994. We include the variable Trainer Zero
to account for these observations and expect this to decrease the
likelihood that a horse will be among the race favorites.
As noted above, the sample includes races for colts (young male
horses) and races for fillies (young female horses). In the sample,
there are 203 male horses, and among these, 32 have been gelded (or
neutered). Geldings have no residual value in breeding. Breeders and
owners will therefore prefer not to geld a horse unless there is a
compelling reason to do so. Generally, the factors that lead to a horse
being gelded are negative signs regarding a horse's performance.
For example, horses are typically gelded when they are extremely
aggressive and difficult to train. We include the variable Gelding and
expect it to be inversely related to the public's perception that a
horse will win a race.
In the presence of adverse selection, market mechanisms should
evolve to correct market inefficiencies that may arise. Several auction
houses, most notably Keeneland, conduct sales where the horses offered
for sale must qualify by passing an inspection. [19] If the auction
house is able to certify that all of the yearlings offered in a sale
exceed some cutoff, the average quality, and thus price, in the sale
will increase. The distribution of horses sold in a certified sale is
therefore truncated in a fashion similar to homebreds. Information
regarding the sale in which a horse is sold is published widely in
industry journals and in the American Produce Records. Bettors are
therefore likely to know which horses were sold in certified sales. We
include the variable Select to control for this and expect it to be
positively related to the probability that a horse will be among the race
favorites.
Finally, we include a variable Pick, which is set equal to one if
The Form's handicappers have indicated that a horse is one of their
top three choices in a race, and zero otherwise. In every edition of The
Form, four handicappers give their top three choices for each race on
the day's program. The Form uses a simple formula to aggregate the
picks and publishes the consensus view. While the experts will use the
information provided to bettors in The Form, they are also likely to use
information that is observable to bettors but not to the econometrician.
For example, The Form's handicappers may use information on how a
particular horse is behaving in its morning workouts, whether it is
having trouble handling starting gates or if it has a tendency to get
nervous when around other horses. We expect the betting public to use
the consensus picks, or information that is correlated with them, when
handicapping a race and expect this variable to positively relate to the
likelihood that a horse will be among the race favorites. [20]
Horses may be coupled for wagering purposes. This occurs when a
particular trainer trains two or more horses in a race and, in most
jurisdictions, the horses have common ownership. Coupled horses are treated as a
single entry for wagering purposes. When two or more horses are coupled,
a bet on this entry will pay off if any of the horses in the coupling
finishes in the wagered position or better. Our sample contains 13 coupled
horses from six races (approximately 3% of our sample). Dropping these
observations from our sample will not affect the public's ranking
of horses and will, therefore, have little effect on the rank-ordered
logit model. We also drop these observations from the fixed-effects
model. [21]
Summary statistics for the control variables are presented in Table
1. The table presents the results for the full sample, broken down by
Homebred and Nonhomebred and by the Keeneland and Saratoga races. The
sample consists of 389 horses, 165 homebreds and 224 nonhomebreds. In
the Keeneland races, 191 horses that ran in 17 races are included in our
sample. From the table, we see that the mean of Posttime Odds is only
slightly lower for homebreds. The control variables are similar between
the homebred and nonhomebred portions of the sample. We observe that
homebreds are slightly older and are less likely to be from a new sire.
The means for Keeneland and Saratoga show that fields are larger at
Keeneland, which therefore has higher mean odds. Keeneland entries have a
higher Percent in the Money and are less likely to be first-time
starters. This last finding is consistent with the fact that
Keeneland's meet follows Saratoga's.
4. Results
Table 2 presents the results of our empirical analysis. [22] In the
table, column 1 presents the results for the fixed-effects model, and column
2 for the partial-rank-ordered logit model including three ranks. Column
3 presents the partial-rank model using the top four ranks, and column 4
presents the results for the full-rank logit. [23]
Estimated coefficients for the control variables are generally as
expected. In all of the specifications, we see that the coefficients for
Jockey Winning Percentage, Workrank, Percent in the Money,
Trainer Winning Percentage, Pick, and Select are statistically
significant and of the expected signs. In addition, the variables Works
and Bullets are significant in the fixed-effects model, Trainer Zero is
significant in the top three partial-rank logit, First Start is
significant in the top four partial-rank logit, and Works, Bullets, and
Outstanding Juvenile Sire are significant in the full-rank logit.
The results for our variable of interest, Homebred, are as
predicted. In all four specifications, it is statistically significant
(although only at the 10% level in the full-rank model) and has the
predicted sign. In the rank-ordered logit models, the significance of
the Homebred variable is greatest in the partial-rank specifications.
This may indicate that the difference between homebreds and nonhomebreds
is most important when bettors are choosing between top-ranked horses.
[24] These results indicate that the signal provided by Homebred is
important for determining the betting patterns of two-year-old maiden
races. [25]
In general, our results are consistent with the notion that bettors
favor homebreds over otherwise similar nonhomebreds when examining
two-year-old maiden races. The implication of these results is that
bettors recognize that horses racing as homebreds are drawn from a
distribution that is truncated from below. It appears that based on
bettors' perceptions, asymmetric information leads racing breeders
to keep their best thoroughbreds and adversely select the horses they
choose to sell. [26]
5. Conclusions
In this paper, we hypothesize that horses raced as homebreds will
be favored over otherwise similar nonhomebreds. A homebred is defined as
a horse being raced by its breeder. All breeders have the option of
selling horses without racing them. Those breeders who both race and
sell thoroughbreds, because of informational asymmetries, are likely to
sell their low-quality horses. It follows that horses raced as homebreds
are drawn from a quality distribution that is truncated from below and
have an expected quality that is higher than the expected quality of
otherwise similar nonhomebreds. Betting markets, which have been shown
to be relatively efficient, provide a natural setting to examine the
consequences of asymmetric information in this market. If the act of
keeping (or selling) a thoroughbred is based on private information,
bettors should favor homebreds over otherwise similar nonhomebreds.
We test this prediction using a sample of 39 two-year-old maiden
races conducted at the Keeneland and Saratoga racecourses in the summer
and fall of 1995. We find evidence consistent with our predictions.
Fixed-effects regressions of the natural log of post-time odds,
controlling for characteristics of the race and the individual horses,
show that homebreds, on average, have lower odds. Similarly,
rank-ordered logit regressions show that homebreds tend to be favored
over otherwise similar nonhomebreds. The evidence suggests that adverse
selection is present in the market for thoroughbreds but may be less
severe at the top end of the market where independent agents certify the
quality of horses.
This study suggests that further work is needed to understand the
effect asymmetric information has on markets. In particular, it may be
reasonable to conclude that asymmetric information is less
problematic at the top end of markets because information may be more
valuable and the market will provide mechanisms to correct for any
inefficiency that may arise. The gain from providing information or
developing institutional arrangements to alleviate problems of
asymmetric information is likely to be greater as the value of items
being sold increases. Other possible extensions include conducting a
study that examines older horses to see if the "homebred
effect" persists throughout a horse's career. Finally, an
examination of differences in lifetime earnings between homebreds and
nonhomebreds as well as a comparison of horses sold at different sales
would provide more conclusive evidence of the presence of adverse
selection.
(*.) Department of Economics, St. Lawrence University, Canton, NY
13617, USA; E-mail chezum@stlawu.edu, corresponding author.
(+.) Department of Economics, University of Nevada Las Vegas, Las
Vegas, NV 89154-6005, USA; E-mail wimmer@ccmail.nevada.edu.
We would like to thank John Garen, David Richardson, Seungmook
Choi, and an anonymous referee for helpful advice. All mistakes are the
fault of the authors.
Received July 1997; accepted June 1999.
(1.) Chezum and Wimmer (1997) show that breeders who also race
receive, on average, lower prices at thoroughbred yearling sales than
breeders who take all of their thoroughbreds to market.
(2.) While bettors accurately estimate the order of finish, there
is evidence that favorites are underbet and long shots are overbet (see,
e.g., Sauer [1998] for a recent survey).
(3.) Bond (1982) compares the maintenance records of trucks
acquired new with those acquired used and finds no significant
differences. Bond's work is comparable to this study.
(4.) An interesting question, which is beyond the scope of this
paper, is the extent to which racers will vertically integrate into the
breeding end of the business.
(5.) Alternatively, Wilson (1980) examines the case where buyers
and sellers have different preferences for the goods being sold to
generate an adverse-selection result. In the context of the example
presented here, buyers and sellers have some preference for racing
thoroughbreds. Presumably, some buyers have a preference that exceeds
sellers' preferences for racing. Wilson shows that in such a model,
a positive-price equilibrium exists. (Wilson also examines the
possibility of multiple equilibria and different market arrangements.)
In this equilibrium, sellers will keep their highest quality
thoroughbreds, selling from the lower end of the distribution.
(6.) Pari-mutuel wagering is the process where the odds are
determined by the relative amount of money bet on each horse after
accounting for the amount of money taken out of the betting pool. A
portion of this "takeout" is used to fund racetrack operations
and purses, with the remainder going to state and local governments.
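As a rough illustration of the mechanism described in this note, the following Python sketch computes pari-mutuel win odds from the amounts bet. The function name, the example pool, and the 20% takeout rate are hypothetical and chosen only for illustration; actual takeout rates vary by track and wager type, and the rounding that tracks apply to payoffs is ignored.

```python
def parimutuel_odds(bets, takeout):
    """Return odds-to-1 for each horse in a win pool.

    bets    -- dict mapping horse name to total money bet on it
               (hypothetical data; every amount must be positive)
    takeout -- fraction of the pool removed before payoffs (assumed rate)
    """
    pool = sum(bets.values())
    net_pool = pool * (1.0 - takeout)  # money returned to winning bettors
    # Profit per dollar bet if the horse wins: holders of winning tickets
    # share the net pool, and their own stake is part of that payout.
    return {horse: (net_pool - amount) / amount for horse, amount in bets.items()}


example_bets = {"A": 500.0, "B": 300.0, "C": 200.0}  # hypothetical pool
odds = parimutuel_odds(example_bets, takeout=0.20)
for horse, value in odds.items():
    print(f"{horse}: {value:.2f} to 1")
```

The horse with the most money bet on it receives the lowest odds, so ranking horses by odds is equivalent to ranking them by the share of the pool wagered on each.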
(7.) Additional wagers are also available. These "exotic"
wagers include bets where the bettor must pick the top two (an exacta)
or top three finishers (a trifecta) in the correct order to earn a
payoff.
(8.) Maximizing returns should approximate the objectives of
utility maximization if the bettor's utility is a function of the
payoffs and cashing tickets. Golec and Tamarkin (1998) argue that bettor
utility functions depend not only on expected returns but also on the
skewness of returns. They argue that this may explain the observed
long-shot bias.
(9.) The Daily Racing Form is a daily newspaper that publishes
information regarding the horses entered in the races at several
racetracks across the country. It is available at the track on the
afternoon prior to the racing day.
(10.) Nonhomebreds may be one of two types. A nonhomebred may have
been sold by a breeder that does not race, selling all of his
thoroughbreds, or may come from a breeder that both sells and races
thoroughbreds. We assume that bettors may not have access to this
information and implicitly assume that the expected quality is taken
over the entire distribution for both types.
(11.) Several of the homebreds in our sample were actually sold at
auction, but the original breeder retained ownership or a share of
ownership in the horse. This may happen for several reasons. First, the
horse may have failed a veterinarian's exam following the sale and
been returned to the breeder. In this case, one could argue
that the seller is "stuck" with the horse. Alternatively,
breeders may approach buyers following the sale and attempt to buy a
share of the horse.
(12.) The Saratoga meet is run during late July and August, and
Keeneland meets in October.
(13.) For claiming and allowance races, horses are sorted into
quality categories. In general, horses in a claiming race are of lower
quality. Additionally, allowance races have conditions that define which
horses are eligible for the race. Thus, the draw of horses is much less
random in allowance and other races than in two-year-old maiden races.
(14.) The underlying notion is that the importance of the homebred
variable will evaporate as a horse establishes a racing record. Farber
and Gibbons (1996)--examining the relationship between education,
experience, and wages--suggest that this result may not hold.
(15.) We would like to thank an anonymous referee for pointing out
this model and the specifications that follow.
(16.) For studies that find this result, see Griffith (1949; 1961),
Hoerl and Fallin (1974), Ali (1977), and Golec and Tamarkin (1998). A
recent survey of these results can be found in Sauer (1998).
(17.) The conditions define the length of the race, the quality of
the race, and the size of the purse.
(18.) A furlong is equal to one-eighth of a mile.
(19.) For a more complete discussion of certification in this
market, see Wimmer and Chezum (1998).
(20.) In addition to the results reported, we ran several
specifications that did not include the Pick variable. These results
were generally consistent with those reported.
(21.) In other specifications not reported here, the likelihood
function in the rank-ordered logit model was corrected to account for
the joint probability of coupled horses winning the race. In the
fixed-effects model, we ran a specification that included a qualitative
variable to account for coupled horses. Results from both specifications
are qualitatively similar to those reported below.
(22.) The regression presented in column one is based on
proportions data and is therefore heteroskedastic. The results reported
are corrected for this as in Greene (1993, pp. 653-5), using the number
of horses entered in the race to account for differences in the size of
the total purse.
(23.) Because bettors typically only wager on the top three
finishers, use of the top three ranks is a natural specification in this
setting. Evidence from Hoerl and Fallin (1974) indicates that the
majority of money is bet on the top three or four ranked horses.
Additionally, the decrease in the predicted and actual probabilities of
winning as rankings increase (where a ranking of one indicates a horse
is the favorite) is much greater between the first several rankings than
the differences for horses whose odds indicate they are looked upon less
favorably by bettors. This suggests that favored horses are likely to be
ranked more accurately than are lower ranked horses.
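The partial-rank specification can be sketched as follows. In a rank-ordered logit, the probability of an observed ordering is a product of conditional logit probabilities, one for each rank, each taken over the horses not yet ranked; a top-three specification keeps only the first three terms. The function name and the index values below are hypothetical, and the sketch evaluates the log likelihood for a single race rather than reproducing the estimation reported in the paper.

```python
import math


def top_k_loglik(indices, k):
    """Log likelihood of the top-k ranking for one race.

    indices -- linear index (x'beta) for each horse, ordered from the
               favorite (rank one) to the longest shot (hypothetical values)
    k       -- number of leading ranks that contribute to the likelihood
    """
    total = 0.0
    for j in range(min(k, len(indices))):
        remaining = indices[j:]  # horses not yet assigned a rank
        log_denominator = math.log(sum(math.exp(v) for v in remaining))
        total += indices[j] - log_denominator
    return total


race = [1.2, 0.4, 0.1, -0.3, -0.9]  # hypothetical index values
print(top_k_loglik(race, 3))          # top-three specification
print(top_k_loglik(race, len(race)))  # full-rank specification
```

Setting k to the number of horses gives the full-rank likelihood, so the partial-rank models differ only in how many leading terms they keep.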
(24.) Chi-square tests to determine whether there is a
statistically significant difference between the full- and
partial-rank-ordered logits yielded test statistics of 26.88 and 23.58
for the top three and top four specifications, respectively. The
critical value for a chi-square distribution with 15 degrees of freedom
at the 5% level is 24.996, allowing
us to reject the null hypothesis that the full-rank specification yields
the same estimates as the partial-rank specifications in the case of the
three-rank model. This suggests that bettors pay closer attention to the
top ranks, or random factors play a larger role when differentiating
between long shots. Hausman and Ruud (1987) suggest a technique to
control for heteroskedasticity in this model. This specification was
examined and showed no improvement over the partial-rank specification.
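The comparison in this note can be checked directly. The sketch below compares each reported test statistic with the quoted critical value; the variable and function names are hypothetical, and the 5% significance level is the level to which a critical value of 24.996 corresponds for 15 degrees of freedom.

```python
# Critical value quoted in the note: chi-square with 15 degrees of freedom.
# The value 24.996 corresponds to the 5% significance level.
CRITICAL_VALUE = 24.996

# Test statistics reported in the note for the two partial-rank models.
test_statistics = {"top three": 26.88, "top four": 23.58}


def rejects_null(statistic, critical_value=CRITICAL_VALUE):
    """Return True when the statistic exceeds the critical value."""
    return statistic > critical_value


decisions = {name: rejects_null(value) for name, value in test_statistics.items()}
for name, rejected in decisions.items():
    outcome = "reject" if rejected else "fail to reject"
    print(f"{name}: {outcome} equality with the full-rank estimates")
```

Only the top three statistic exceeds the critical value, which matches the statement that the null hypothesis is rejected in the three-rank model alone.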
(25.) In addition to rank-ordered logits based on the ranking of
odds, specifications using the actual position of finish as the
dependent variable were estimated. In general, these regressions
produced weaker results than those found using odds to rank the horses.
Using the same specification as in the log odds regressions, only
Percent in the Money and Trainer Winning Percent were statistically
significant and of the expected sign. Homebred received the expected
sign but was not statistically significant. These mixed results are
likely due to the noise inherent in the running of a horse race. We
expect that a larger sample of races might yield more favorable results.
Several specifications yielded more favorable results but included
covariates not readily available to the public. Discussion of these
factors is beyond the scope of this paper.
(26.) Chezum and Wimmer (1997), using a continuous variable, show
that sellers that race more receive lower prices for otherwise similar
thoroughbred yearlings at auction. We included such variables in several
unreported specifications, but they were not statistically significant,
which suggests that this information is too fine for bettors to perceive.
References
Akerloff, George A. 1970. The market for 'Lemons':
Quality uncertainty and the market mechanism. Quarterly Journal of
Economics 84:488-500.
Ali, Mukhtar M. 1977. Probability and utility estimates for
racetrack bettors. Journal of Political Economy 85:803-15.
Beggs, S., S. Cardell, and J. Hausman. 1981. Assessing the potential
demand for electric cars. Journal of Econometrics 16:1-19.
Bond, Eric W. 1982. A direct test of the "Lemons" model:
The market for used pickup trucks. American Economic Review 72:836-40.
Chezum, Brian, and Bradley S. Wimmer. 1997. Roses or Lemons:
Adverse selection in the market for thoroughbred yearlings. Review of
Economics and Statistics 79:521-6.
Farber, Henry S., and Robert Gibbons. 1996. Learning and wage
dynamics. Quarterly Journal of Economics 111:1007-47.
Genesove, David. 1993. Adverse selection in the wholesale used car
market. Journal of Political Economy 101:644-65.
Gibbons, Robert, and Lawrence F. Katz. 1991. Layoffs and Lemons.
Journal of Labor Economics 9:351-80.
Golec, Joseph, and Maurry Tamarkin. 1998. Bettors love skewness,
not risk at the horse track. Journal of Political Economy 106:205-25.
Greene, William H. 1993. Econometric Analysis. 2nd edition. New
York: Macmillan Publishing Company.
Greenwald, Bruce C., and Robert R. Glasspiegel. 1983. Adverse
selection in the market for slaves: New Orleans, 1830-1860. Quarterly
Journal of Economics 98:479-99.
Griffith, Richard M. 1949. Odds adjustments by American
horse-racing bettors. American Journal of Psychology 62:290-4.
Griffith, Richard M. 1961. A footnote on horse race betting.
Transactions Kentucky Academy of Science 22:78-81.
Hausman, Jerry A., and Paul A. Ruud. 1987. Specifying and testing
econometric models for rank-ordered data. Journal of Econometrics
34:83-107.
Hoerl, Arthur E., and Herbert K. Fallin. 1974. Reliability of
subjective evaluations in a high incentive situation. Journal of the
Royal Statistical Society 137:227-30.
Sauer, Raymond D. 1998. The economics of wagering markets. Journal
of Economic Literature 36:2021-64.
The Daily Racing Form. 1995. Various issues, July-August.
The Lexington Herald Leader. 1995. Various issues, July-August.
Wilson, Charles. 1980. The nature of equilibrium in markets with
adverse selection. Bell Journal of Economics 11:108-30.
Wimmer, Bradley, and Brian Chezum. 1998. The effects of
certification in a lemon's market. Unpublished paper, University of
Nevada-Las Vegas.
Descriptive Statistics (Standard Deviations in Parentheses)

Variable                      Full Sample    Homebreds Nonhomebreds    Keeneland     Saratoga
Posttime odds                     21.950       21.347       22.395       26.444       17.615
                                  (23.11)      (22.39)      (23.66)      (26.58)      (18.22)
Homebred                          0.4242                                 0.4293       0.4192
                                  (0.495)                                (0.496)      (0.495)
Jockey winning percentage         0.1455       0.1430       0.1474       0.1328       0.1579
                                  (0.051)      (0.051)      (0.050)      (0.052)      (0.047)
Works                              6.303        6.406        6.227        6.094        6.505
                                   (2.39)       (2.35)       (2.43)       (2.78)       (1.95)
Bullets                           0.2519       0.2424       0.2589       0.2984       0.2071
                                  (0.511)      (0.496)      (0.523)      (0.552)      (0.465)
Workrank                          0.4729       0.5055       0.4489       0.4604       0.4851
                                  (0.286)      (0.281)      (0.287)      (0.293)      (0.279)
Month                              3.244        3.097        3.353        3.325        3.167
                                   (1.25)       (1.31)       (1.19)       (1.25)       (1.24)
Outstanding juvenile sire         0.1748       0.1697       0.1786       0.1937       0.1566
                                  (0.380)      (0.377)      (0.384)      (0.396)      (0.364)
New sire                          0.0925       0.0424       0.1295       0.1152       0.0707
                                  (0.290)      (0.202)      (0.336)      (0.320)      (0.257)
Percent in the money              0.1733       0.1841       0.1655       0.1812       0.1659
                                  (0.339)      (0.347)      (0.333)      (0.333)      (0.344)
First start                       0.4319       0.4182       0.4420       0.3508       0.5101
                                  (0.496)      (0.495)      (0.498)      (0.478)      (0.501)
Trainer's winning percentage      0.1333       0.1346       0.1324       0.1301       0.1365
                                  (0.068)      (0.073)      (0.064)      (0.074)      (0.061)
Trainer zero                      0.1028       0.1030       0.1027       0.1675       0.0404
                                  (0.304)      (0.305)      (0.304)      (0.374)      (0.197)
Gelding                           0.0823       0.0848       0.0804       0.0733       0.0909
                                  (0.275)      (0.280)      (0.272)      (0.261)      (0.288)
Pick                              0.2725       0.2545       0.2857       0.2565       0.2879
                                  (0.446)      (0.437)      (0.453)      (0.438)      (0.454)
Select                            0.1311       0.0242       0.2098       0.1414       0.1212
                                  (0.338)      (0.154)      (0.408)      (0.349)      (0.327)
Observations                         389          165          224          191          198
Results for Fixed-Effects and Rank-Ordered Logit Models
(t and z Statistics in Parentheses)

Variable                       Fixed Effects    Top Three Ranks  Top Four Ranks   Full Rank
Homebred                       0.1783 [**]      0.6530 [***]     0.4770 [**]      0.2120 [*]
                              (2.39)           (2.60)           (2.210)          (1.62)
Jockey winning percentage      4.7229 [***]     9.3057 [***]     8.5601 [***]     6.8675 [***]
                              (6.14)           (3.66)           (3.86)           (4.42)
Works                          0.0435 [**]      0.0859           0.0701           0.0547 [*]
                              (2.36)           (1.35)           (1.36)           (1.72)
Bullets                        0.1619 [**]      0.2091           0.2460           0.2520 [*]
                              (2.23)           (0.85)           (1.19)           (1.78)
Workrank                      -0.4299 [***]    -0.8434 [*]      -1.1821 [***]    -0.9252 [***]
                              (3.14)           (1.78)           (2.90)           (3.45)
Month                         -0.0046           0.0116           0.0195           0.0150
                              (0.16)           (0.13)           (0.24)           (0.27)
Outstanding juvenile sire      0.1072          -0.0935           0.1909           0.4781 [**]
                              (1.15)           (0.31)           (0.73)           (2.44)
New sire                      -0.0520          -0.4232          -0.0306          -0.0055
                              (0.40)           (0.97)           (0.09)           (0.02)
Percent in the money           1.0067 [***]     2.3760 [***]     2.3581 [***]     2.0112 [***]
                              (8.84)           (6.18)           (6.91)           (7.80)
First start                    0.0895           0.5556           0.5880 [**]      0.1564
                              (0.90)           (1.55)           (2.04)           (0.92)
Trainer's winning percentage   3.0742 [***]     9.3394 [***]     9.3697 [***]     5.2968 [***]
                              (4.23)           (3.72)           (3.91)           (3.62)
Trainer zero                   0.0772           1.3079 [**]      0.8536           0.1183
                              (0.43)           (2.05)           (1.52)           (0.39)
Gelding                       -0.0772          -0.2531          -0.0046          -0.2398
                              (0.53)           (0.52)           (0.01)           (0.89)
Pick                           0.7656 [***]     1.5096 [***]     1.5329 [***]     1.2592 [***]
                              (9.61)           (6.10)           (7.22)           (7.37)
Select                         0.3657 [***]     0.6637 [*]       0.6690 [**]      0.4378 [**]
                              (3.57)           (1.91)           (2.25)           (2.06)
Constant                      -4.3961 [***]
                              (20.480)
Log likelihood                                 -169.073         -227.512         -489.888
R-squared                      0.6566

[*] Significant at the 10% level.
[**] Significant at the 5% level.
[***] Significant at the 1% level.