The impact of atmospheric conditions on the baseball totals market.
Paul, Rodney J.; Weinbach, Andrew P.; Weinbach, Chris
Introduction
Bahill et al. (2009) published a paper in the International Journal
of Sports Science and Engineering entitled "Effects of Altitude and
Atmospheric Conditions on the Flight of a Baseball." This paper
illustrated, through both physics and empirical analysis, that altitude
and weather affect air density, which impacts how far a batted baseball
will travel. Air density depends on four factors: altitude,
temperature, humidity, and barometric pressure. Altitude, temperature,
and humidity each have an inverse relationship with air density, while
barometric pressure is positively related to air density.
The findings of the paper by Bahill et al. (2009) show that
altitude is the most important factor determining air density, followed
by temperature, barometric pressure, and humidity. The distance a
baseball travels is ultimately affected by two countervailing forces,
drag force and Magnus force. As air density gets smaller, drag force
gets smaller which allows the ball to travel farther, but Magnus force
also gets smaller as air density decreases, which decreases how long the
ball will be held aloft, ultimately decreasing the distance the ball
will travel. In simulations shown in the study by Bahill et al. (2009),
drag force dominates Magnus force which results in a baseball traveling
farther when air density decreases. This could have a major impact
on the number of home runs in a baseball game. Through the estimates of
the authors, a 10% decrease in air density (on a typical July afternoon)
will produce a 4% increase in the distance a baseball travels.
The results of the study of Bahill et al. (2009) are likely to have
profound importance for the baseball totals betting market. Since
atmospheric conditions have an impact on the distance a baseball will
travel, which influences the number of home runs and number of runs
scored in a baseball game, air density and other weather-related
variables could play a role in affecting the game outcome in the totals
market. If air density truly impacts scoring in a baseball game, this
variable will have an inverse relationship to runs scored in a game.
Controlling for the posted total in the over/under market, it is
straightforward to test if air density will impact the total number of
runs scored in a game and influence the number of overs vs. unders in
the baseball totals wagering market. If air density actively plays a
role, and is currently unrecognized or inaccurately interpreted by
market forces, simple wagering strategies based on atmospheric
conditions may generate positive returns in the betting market.
In addition to studying air density, given the availability of
other weather condition variables from baseball box scores, we are also
able to investigate the role of wind speed, wind direction, and overall
weather condition (sunny, cloudy, clear, etc.). We include air density,
wind speed and direction, and dummy variables for weather condition in a
series of regression models to determine if these factors impact the
total number of runs scored in a game, the frequency of overs in the
totals market, and the percentage bet on the over to determine the
extent to which these factors influence the totals market. From there,
we construct simple betting strategies and use betting simulation to
determine the returns to these strategies using data from the 2012
baseball season.
Literature Review on Totals Markets and Weather
Baseball totals were first studied by Brown and Abraham (2002), who
found some market inefficiencies based upon streaks of overs and unders
during the 1997 season, but subsequently found that the market was
efficient in other years. Paul and Weinbach (2004) commented on the
paper and noted that the market efficiency studies carried out by Brown
and Abraham (2002) were incomplete, as their data set did not include
the odds adjustment commonly seen in the baseball totals market (where
the total itself is typically accompanied by an adjustment either toward
the over or the under that makes bettors lay more to win $1 on a
particular side of the wager). This elicited a response to the comment
by Brown and Abraham (2004) and an evaluation of the debate by Gandar
and Zuber (2004), which ultimately concluded that the odds adjustments
are necessary for any study of market efficiency involving baseball
totals.
The baseball totals market was also studied by Paul et al. (2013),
where more detailed betting market data obtained from
www.sportsinsights.com was used to investigate the preferences of
gamblers in both the sides and totals market. In relation to totals, the
under bet was shown to be popular when high-quality pitchers, proxied by
Cy Young Award voting, were pitching in a particular game. This result
is counter to baseball games where these elite pitchers are not pitching
and across other sports where the over is shown to be a much more
popular wager than the under.
Studies of the impact of weather on financial markets are not new,
both in and out of sports. Roll (1984) studied the impact of weather on
orange juice prices and further investigated the role of weather
conditions on security prices and transaction volume in the stock market
(Roll, 1988). Hirshleifer and Shumway (2003) investigated the role of
sunshine and other weather conditions on stock returns and found
significant impacts on investor reactions and market prices.
The role of weather as a potential fundamental factor that affects
betting markets fits into the research questions posed by Brown and
Sauer (1993). Weather and track conditions were suggested as important
factors related to betting patterns, returns, and market efficiency in
the study of horse racing betting markets as early as the late 1970s
(Figlewski, 1979). The impact of weather effects on the totals market
has been shown previously in Borghesi (2008). Heat, wind, and rain were
shown to have a negative impact on scoring in the National Football
League. Bettors were shown to not fully account for these factors when
placing totals wagers on the NFL. Statistically significant profits were
shown when weather effects were incorporated into an NFL totals market
betting strategy.
Regression Model of Atmospheric Conditions on Runs Scored
To determine if atmospheric conditions play a significant role in
the number of runs scored in a baseball game, we use a simple regression
model with the number of runs scored in a game as the dependent
variable. Since the number of runs scored is based upon team performance
variables, we use the betting market total (over/under wager) as our
control variable for the strength of the scoring prowess of the teams
playing and the ability of the starting pitchers in the game. If all of
the information about atmospheric conditions is included in the market
price, due to the knowledge of these factors by the sports book and/or
bettors, then air density, wind speed and direction, and weather
condition dummy variables should not have a significant impact on runs
scored beyond the adjustment already included when using the market
price (total) as an independent variable. All data used on game
outcomes, betting market odds, and betting percentages were gathered
from www.sportsinsights.com.
Baseball total market bets are generally not placed at flat odds.
Most totals include an odds adjustment, with more money needed to be
placed on either the over or the under to win one dollar. These odds
adjustments are available within our data set and this information is
included in the regression model by converting the odds adjustment into
the probability of an over for each game. If the odds adjustment is on
the over (laying more on the over to win one dollar), the probability of
an over is greater than one-half, while if the odds adjustment is on the
under (laying more on the under to win one dollar), the probability of
an over is less than one-half. This variable is included to capture the
additional likelihood of more (fewer) runs than the posted total based
on the odds adjustment on the over (under).
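The exact conversion from odds adjustment to over probability is not spelled out here. A minimal sketch of one common approach follows, assuming American-style odds on each side and assuming the bookmaker's margin is removed by normalizing the two implied probabilities so they sum to one. The function names are hypothetical and this is not necessarily the method used in the paper.

```python
def implied_probability(american_odds: float) -> float:
    """Raw implied win probability from American (moneyline) odds."""
    if american_odds < 0:
        # Negative odds: lay |odds| to win 100.
        return -american_odds / (-american_odds + 100.0)
    # Positive odds: lay 100 to win `odds`.
    return 100.0 / (american_odds + 100.0)


def over_probability(over_odds: float, under_odds: float) -> float:
    """Probability of an over after normalizing out the bookmaker margin.

    Assumption: simple proportional normalization of the two implied
    probabilities; other margin-removal methods exist.
    """
    p_over = implied_probability(over_odds)
    p_under = implied_probability(under_odds)
    return p_over / (p_over + p_under)


# Flat odds on both sides give one-half; an adjustment on the over
# (laying more on the over) pushes the probability above one-half.
print(over_probability(-110, -110))
print(over_probability(-120, 100))
```

With the adjustment on the under instead (for example, over at +100 and under at -120), the same function returns a value below one-half, matching the description above.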
The next set of independent variables included in the regression
model account for the different atmospheric conditions seen at baseball
games. First, directly from the study of Bahill, Baldwin, and Ramberg
(2009), we include air density as an independent variable. The equation
to compute air density appears on page 119 of their article and is
reproduced below as equation (1):
\[
\text{Air Density} = 1.045 + 0.01045\,\{-0.0035\,(\text{Altitude} - 2600) - 0.2422\,(\text{Temperature} - 85) - 0.0480\,(\text{Relative Humidity} - 50) + 3.4223\,(\text{Barometric Pressure} - 29.92)\} \tag{1}
\]
As noted in the introduction, air density has a negative
relationship with the distance a ball travels and therefore should have
a negative impact on the number of runs scored in a game. If the posted
total on the game does not fully incorporate the information embedded in
air density (altitude, temperature, humidity, barometric pressure), this
variable should have a negative and significant effect on the total
number of runs scored in a baseball game. Data on the variables used in
the calculation of air density were gathered from historical records
archived for each city in Major League Baseball by the website
www.weatherunderground.com, which actively captures weather data from a
large number of weather stations, providing information on local weather
conditions in very close proximity to the ballparks.
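Equation (1) translates directly into a small function. The sketch below assumes the inputs are in feet, degrees Fahrenheit, percent relative humidity, and inches of mercury, with the result in kg/m^3. These units are inferred from the reference values in the equation (for example, 29.92 is standard sea-level pressure in inches of mercury) rather than stated in the text, and the function and parameter names are hypothetical.

```python
def air_density(altitude_ft: float,
                temperature_f: float,
                relative_humidity_pct: float,
                pressure_inhg: float) -> float:
    """Air density from equation (1) (units are assumed, see lead-in)."""
    return 1.045 + 0.01045 * (
        -0.0035 * (altitude_ft - 2600.0)
        - 0.2422 * (temperature_f - 85.0)
        - 0.0480 * (relative_humidity_pct - 50.0)
        + 3.4223 * (pressure_inhg - 29.92)
    )


# At the reference conditions every bracketed term is zero.
reference = air_density(2600, 85, 50, 29.92)

# Same weather at sea level and at a hypothetical park at 5,200 feet.
sea_level = air_density(0, 85, 50, 29.92)
high_park = air_density(5200, 85, 50, 29.92)

print(round(reference, 6), round(sea_level, 6), round(high_park, 6))
```

Holding the weather fixed, the higher park has the lower air density, which is the direction of the altitude effect described in the introduction.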
Beyond air density, we also include wind speed as an independent
variable. Wind speed will also influence how far a ball will fly, but it
also plays a role in moving the ball while in-air, leading to more
difficult fielding conditions. Windy days can lead to more errors by
defenders and potentially more hits than otherwise, if winds were calm.
The wind speed variable was taken directly from the box score of each
game.
In addition to the box score posting the wind speed, it also notes
the wind direction. Home runs are likely to be more frequent when the
wind is blowing out from home plate toward the outfield fence. The wind
direction could be included in the game total, if the market
incorporates this information into the closing price. The wind direction
variables are constructed as dummies for each of the categories of wind
direction noted in the box score. These include wind in (from center
field (CF), right field (RF), and left field (LF)), wind out (to CF,
RF, and LF), wind right to left, wind left to right, and wind varies
(swirling wind). These are all compared to the baseline where the wind
condition is listed as "none." To account for games played in
dome stadiums, a dome dummy is included, and for stadiums with a
retractable roof, a dummy is included for these games when the roof is
closed.
The overall weather conditions are also included in the box score
of every game, so we included these categories as dummy variables as
well. The weather conditions included are all compared to the omitted
dummy variable category of partly cloudy. The other weather conditions
included as independent variables are sunny, cloudy, clear, overcast,
drizzle, and rain. We also ran the regression omitting these categories,
but it did not change the statistical significance of the other
variables included in the model.
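The dummy-variable construction described above can be sketched as follows. The category labels follow the box score categories listed in Appendix 1; the helper name and the exact label spellings are assumptions for illustration. The omitted baselines ("None" for wind direction, "Partly Cloudy" for weather) are represented by all zeros.

```python
# Categories that receive their own dummy; the baseline is omitted.
WIND_CATEGORIES = [
    "In from CF", "In from RF", "In from LF",
    "Out to CF", "Out to RF", "Out to LF",
    "Right to Left", "Left to Right", "Varies",
]  # baseline: "None"
WEATHER_CATEGORIES = [
    "Sunny", "Cloudy", "Clear", "Overcast", "Drizzle", "Rain",
]  # baseline: "Partly Cloudy"


def encode_dummies(value: str, categories: list) -> dict:
    """Return a 0/1 indicator per category; the baseline maps to all zeros."""
    return {category: int(value == category) for category in categories}


game_wind = encode_dummies("Out to CF", WIND_CATEGORIES)
game_weather = encode_dummies("Partly Cloudy", WEATHER_CATEGORIES)

print(game_wind["Out to CF"], sum(game_wind.values()))
print(sum(game_weather.values()))
```

Leaving one category out of each set avoids perfect collinearity with the intercept, so each coefficient is read relative to the omitted baseline.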
The regression model therefore takes the following form
incorporating all of the independent variable categories described
above:
\[
\text{Total Runs Scored}_i = \alpha_0 + \beta_1(\text{Betting Market Total}_i) + \beta_2(\text{Over Probability Based on Odds Adjustment}) + \beta_3(\text{Air Density}) + \beta_4(\text{Wind Speed}) + \sum_j \beta_j(\text{Wind Direction Dummies}) + \sum_k \beta_k(\text{Weather Condition Dummies}) + \varepsilon_i \tag{2}
\]
Summary statistics of the non-binary variables are shown in Table
1. Frequencies of the binary variables (wind direction and weather
condition) are shown in Appendix 1.
The regression results using ordinary least squares are shown in
Table 2. The coefficient is presented along with the t-statistic in
parentheses accompanying the individual variable. *-notation is shown to
denote statistical significance of the independent variables with ***
noting statistical significance at the 1% level, ** noting statistical
significance at the 5% level, and * noting statistical significance at
the 10% level. Due to the existence of both heteroskedasticity and
autocorrelation in the data, we used Newey-West heteroskedasticity- and
autocorrelation-consistent standard errors and covariances. The results
shown in Table 2 reflect the use of these adjusted standard errors and
covariances.
The regression results for total runs scored in a game show that
the total has a positive and significant effect on total runs scored,
although its coefficient is less than one (0.6940). The
probability of an over based on the odds adjustment to the total was
found to be statistically insignificant. The two key variables of
interest, air density and wind speed, were both found to have
statistically significant results in relation to total runs scored in a
game, both at the 1% level. The R-squared of the regression is low (as
are other similar regressions below) as this reflects the noisy outcomes
of sporting events, which are part of the reason that sports are both
interesting to watch and to study. Similar levels of R-squared were
found in Sauer et al. (1988) and studies that followed which note the
large variability in sporting contest outcomes.
Air density was shown to have a negative effect on total runs
scored, as the model of Bahill, Baldwin, and Ramberg (2009) explained in
their paper on the physics of baseball. The greater the air density
during the game, the fewer runs that are typically scored in that
contest. Wind speed was shown to have a positive, although much smaller,
effect on total runs scored. This is likely due to the difficulties that
high winds pose on the defensive players in a baseball game.
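To give a rough sense of scale, the Table 2 coefficients can be combined with the Table 1 standard deviations. The snippet below is illustrative arithmetic only, and the variable names are hypothetical.

```python
# Coefficients from Table 2 (dependent variable: total runs scored).
AIR_DENSITY_COEF = -10.1084
WIND_SPEED_COEF = 0.0859

# Standard deviations from Table 1.
AIR_DENSITY_SD = 0.0461
WIND_SPEED_SD = 4.9832


def runs_change(coefficient: float, change: float) -> float:
    """Predicted change in total runs for a given change in one regressor,
    holding the other regressors fixed."""
    return coefficient * change


air_effect = runs_change(AIR_DENSITY_COEF, AIR_DENSITY_SD)
wind_effect = runs_change(WIND_SPEED_COEF, WIND_SPEED_SD)

print(f"{air_effect:.3f}")   # -0.466
print(f"{wind_effect:.3f}")  # 0.428
```

A one-standard-deviation increase in air density is associated with about 0.47 fewer runs, and a one-standard-deviation increase in wind speed with about 0.43 more runs. The wind speed coefficient is much smaller per unit, but on this per-standard-deviation basis the two effects are of broadly similar size.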
The variables related to direction of the wind were not found to
have statistically significant results in relation to total runs scored.
This either implies that the total on the game fully captures the
direction of the wind (i.e., days when the wind is blowing out which
likely leads to more home runs) or the wind direction is not an
important determinant of total runs scored in a game. The weather
conditions listed in the box score for the day of the game were not
found to have a statistically significant effect on total runs, except
for the category of clear days. Clear days were shown to have a negative
effect on total runs scored (statistically significant at the 10%
level). Clear days likely give defenders an advantage, particularly in
the outfield, where the lack of clouds may help players more clearly see
the flying baseball, leading to more fly ball outs in a game.
With air density and wind speed being shown to have statistically
significant effects on total runs scored, above and beyond the posted
total on the game, the next step is to determine if betting strategies
based upon these variables have any possibility of earning positive
returns. With this in mind, we next set up a simple logit model with the
game outcome compared to the total as the dependent variable. When the
game is an over, the dependent variable takes a value of one, when the
game is an under, the dependent variable takes a value of zero. All
pushes (cases where the total runs scored equal the total) are removed
from the data set for this regression. The independent variables
included in this regression are the same as the previous regression as
shown in equation (2).
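The logit specification is not written out in the text. Under the standard logit form, and using the same regressors as equation (2), it can be sketched as below; the symbol $z_i$ is introduced here only for illustration.

```latex
\[
\Pr(\text{Over}_i = 1) = \frac{1}{1 + e^{-z_i}},
\]
\[
z_i = \alpha_0 + \beta_1(\text{Betting Market Total}_i)
    + \beta_2(\text{Over Probability Based on Odds Adjustment})
    + \beta_3(\text{Air Density}) + \beta_4(\text{Wind Speed})
    + \sum_j \beta_j(\text{Wind Direction Dummies})
    + \sum_k \beta_k(\text{Weather Condition Dummies})
\]
```

In this form a negative coefficient lowers $z_i$ and therefore lowers the estimated probability of an over, while a positive coefficient raises it.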
The results of the logit model, shown in Table 3, indicate that betting
strategies based upon air density and wind speed in a baseball game could
yield positive returns. Both air density and wind speed were found to have
statistically significant effects on the number of overs in the baseball
betting market with air density being significant at the 1% level and
wind speed being significant at the 5% level. Higher air density was
shown to make the under a more likely result in the baseball totals
betting market, while greater wind speed was shown to make the over a
more likely result. As for other independent variables in the logit
model, the total was shown to have a negative and significant effect (1%
level), meaning that higher totals result in more unders in the betting
market, which has been shown previously in the literature in baseball
(Paul et al., 2013). The dummy variable for clear days was again shown
to have a negative impact on the probability of an over with statistical
significance at the 10% level. All other independent variables, other
than the intercept, were not shown to have a significant impact on the
outcome of the game in the totals betting market.
Before applying the results seen above to betting simulations to
calculate the returns to betting strategies based upon air density and
wind speed, it is useful to attempt to understand why the sports book
may not include these factors in the market price. One rationale could
be that the book makers do not fully understand the impact of the
subtleties of the atmospheric conditions, but another possibility is
that the bettors could misinterpret the effects of certain
weather-related factors in the marketplace. To address this possibility,
we use data on betting percentages available from
www.sportsinsights.com. Their data set includes information on the
percentage of bets on the over and the under in the totals market for
baseball. The use of this data provides the opportunity to investigate
how bettors respond to the atmospheric conditions in a given city on the
day a baseball game is played. The same independent variables are used
as the previous two regressions (from equation (2)), but the dependent
variable is now the percentage bet on the over by bettors in the
wagering market. Regression results using ordinary least squares are
shown in Table 4.
The percentage bet on the over was shown to rise with the total, as
the total has a positive and significant effect on the percentage bet on
the over at the 1% level. This has been shown previously in baseball
(Paul et al., 2013), as bettors prefer wagering on the over compared to
the under, especially when two high-scoring teams play each other, as
betting on more scoring compared to less likely brings more consumption
value to the bettor.
Air density was shown to have a positive and significant effect on
the percentage bet on the over at the 1% level. As air density rises,
more bettors place wagers on the over, despite the results seen in
Bahill et al. (2009) and the previous regression results shown in this
paper which illustrate that air density is inversely related to scoring
in a baseball game. In short, bettors appear to get the direction of
the relationship between air density and scoring wrong.
Bettors may not have access to (or consider
using) detailed weather data, or may misinterpret or simply ignore this
variable. This result could imply that sports books do not fully adjust
the totals market to atmospheric conditions while the bettors ignore or
misinterpret the relationship between these variables.
Wind speed was not shown to have a statistically significant effect
on the percentage bet on the over. The remaining independent variables
were also not shown to have statistically significant effects on the
percentage bet on the over other than the intercept and the dummy
variables for the weather conditions of cloudy and clear (both
significant at the 10% level). Both cloudy and clear days were shown to
increase the percentage bet on the over in the regression model.
Given the results in this section, it is likely that betting
strategies based upon atmospheric conditions may yield positive results
in the baseball totals betting market. To illustrate this possibility we
construct a simple series of betting rules based on these variables in
the next section.
Betting Simulations
Given that air density and wind speed were shown to have
significant impacts on the total number of runs scored in baseball
games, simple betting simulations are generated to illustrate the
returns to wagering strategies based upon these atmospheric conditions.
Air density was shown to have a negative impact on runs scored
(increasing the likelihood of the under) and wind speed was shown to
have a positive impact on runs scored (increasing the likelihood of the
over). Given that these variables are more likely to influence
the betting market when they are considerably above or below their mean
values, we have chosen a simple betting rule based on
a value of a standard deviation above or below the mean of these values
as the trigger point for placing a wager. Although simplistic, we
believe that calculating the returns from this basic strategy will allow
for the determination if it is possible to earn positive returns through
knowledge of the atmospheric conditions for a baseball game.
When the air density is a standard deviation above its
mean value, this would lead to a bet on the under (as air density is
inversely related to runs scored), while if it is a standard deviation
below its mean value it would lead to a bet on the over. The wagering
strategy based on wind speed is similar, except that wind speed was
shown to have a positive relationship with runs scored, so that when
wind speed is a standard deviation above its mean, a bet on the over is
placed, and when it is a standard deviation below its mean, an under
wager is made. Given that clear days were also shown to have a negative
impact on total runs scored and were also statistically significant in
the logit model on the frequency of overs, the results for placing a
wager on the under on clear days are also calculated. The results for
these strategies are shown in Table 5 with the number of bets placed
(N), the actual return to the betting strategy noted, the expected
return of the betting strategy, and the z-test comparing the actual and
expected returns based upon the test established by Gandar et al.
(2002).
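The one-standard-deviation trigger can be sketched as follows, using the means and standard deviations from Table 1. Low air density maps to an over bet, consistent with the returns reported for that strategy. The function names are hypothetical. Two details are assumptions for illustration: treating a value exactly at the threshold as a trigger, and the payout convention in `profit_per_dollar` (a one-dollar stake at American odds). The z-test of Gandar et al. (2002) is not reproduced here.

```python
from typing import Optional

# Means and standard deviations from Table 1.
AIR_DENSITY_MEAN, AIR_DENSITY_SD = 1.1422, 0.0461
WIND_SPEED_MEAN, WIND_SPEED_SD = 7.3681, 4.9832


def air_density_bet(air_density: float) -> Optional[str]:
    """High air density -> under; low air density -> over; else no bet."""
    if air_density >= AIR_DENSITY_MEAN + AIR_DENSITY_SD:
        return "under"
    if air_density <= AIR_DENSITY_MEAN - AIR_DENSITY_SD:
        return "over"
    return None


def wind_speed_bet(wind_speed: float) -> Optional[str]:
    """High wind speed -> over; low wind speed -> under; else no bet."""
    if wind_speed >= WIND_SPEED_MEAN + WIND_SPEED_SD:
        return "over"
    if wind_speed <= WIND_SPEED_MEAN - WIND_SPEED_SD:
        return "under"
    return None


def profit_per_dollar(american_odds: float, won: bool) -> float:
    """Profit on a one-dollar stake (assumed payout convention)."""
    if not won:
        return -1.0
    if american_odds < 0:
        return 100.0 / -american_odds
    return american_odds / 100.0


print(air_density_bet(1.20), air_density_bet(1.05), air_density_bet(1.14))
print(wind_speed_bet(15.0), wind_speed_bet(1.0), wind_speed_bet(7.0))
```

Averaging `profit_per_dollar` over all triggered bets gives a return per dollar bet in the sense used in the results that follow, under the stated payout assumption.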
The simple betting simulations reveal that strategies based upon
air density yielded positive returns during the 2012 baseball season.
When air density is high (a standard deviation above the mean), a
positive return of 0.0609 per dollar bet is shown when wagering on the
under due to the inverse relationship between air density and total runs
scored. In addition, when air density is low (a standard deviation below
the mean), a bet on the over is shown to generate positive returns of
0.0434 per dollar bet. Combining these strategies together yields a
large enough sample that the z-test is able to reject the null
hypothesis of a fair bet at the 5% level. It appears that simple
wagering strategies based upon air density yield positive and
significant results for bettors.
Wind speed did not yield as straightforward results as air density.
When wind speed was high, wagers on the over outperformed wagers on the
under, yielding slight positive (statistically insignificant) returns.
When wind speed was low, however, the bet on the under was outperformed
by the over, as over wagers yielded positive but insignificant returns.
It appears that wind speed and its impact on scoring may be a bit more
nuanced than this simple betting rule captures.
In relation to clear days, where the regression results of the
previous section revealed that fewer runs are scored on days that meet
this weather condition, the betting simulation confirmed that result as
wagers placed on the under on clear days earned returns of 0.0168 per
dollar bet. This result, combined with the sample size, could not reject
the null hypothesis of market efficiency.
Combinations of these atmospheric conditions were tried, but given
the small sample size where these factors overlap, meaningful results
could not be inferred. Perhaps future studies with more years of
detailed data on baseball games will reveal interesting results with
strategies involving a combination of these factors.
Conclusions
Based on the work of Bahill et al. (2009) on the physics of
baseball, the impact of air density on total runs scored in a baseball
game compared to the posted total on the game in the baseball betting
market was examined. Air density, which depends on altitude,
temperature, humidity, and barometric pressure, was shown to have a
negative impact on runs scored beyond that suggested by the posted
betting market total. In addition, air density was shown to have a
negative and significant effect on the frequency of overs in a logit
model related to outcome of wagers placed in the baseball totals market.
In addition to the effects of air density, we added the variables
of wind speed, wind direction, and box score-listed weather conditions
to the analysis of the effects of atmospheric conditions on the baseball
totals market. Wind speed was shown to have a positive and significant
effect on runs scored (beyond the posted total) and was shown to
positively influence the frequency of overs. The wind direction
variables did not reveal significant results and all but one of the
weather condition variables were also statistically insignificant. The
only weather condition variable that was shown to have a significant
result in both the total runs scored regression and in the logit model
of game outcomes was clear days. Clear days were shown to negatively
impact scoring, likely due to clear days offering optimal weather
conditions for fielders to perform defensively.
These factors were then included in simple betting simulations to
determine if they could yield positive returns in the baseball totals
market. For air density and wind speed, a simple betting rule based on
the actual atmospheric conditions being a standard deviation above or
below the mean value of these variables was the foundation for the
betting rule, with returns to wagering on the over and the under
calculated for these situations. For clear days, the returns to the days
where this condition was noted in the box score of the baseball game
were also calculated.
Wagering on the under when air density was high and wagering on the
over when air density was low were both shown to generate positive
returns. When combining both strategies related to air density,
statistically significant results using the z-test of Gandar et al.
(2002) were found at the 5% level. Although wagering on the
over when wind speed was high was shown to outperform wagering on the
under, both strategies yielded negative returns. When the wind speed was
low, however, the opposite of the expected result was found as the over
outperformed the under, but statistically insignificant results were
found. Wagering the under on clear days was also shown to generate
positive returns, but they were not quite large enough to generate
statistically significant results. It should be noted that we use
a single season of Major League Baseball data, so the comments of Osborne
(2001) regarding market efficiency studies based on short
samples apply. Longer time horizons are preferable to shorter time frames when
studying market efficiency and betting market returns. However, a single
season of baseball games does contain 2,430 observations and is
equivalent to over nine NFL seasons when considering the number of games
per year.
Overall, it appears that air density and other atmospheric
conditions play a key role in the amount of scoring in a baseball game.
Given the positive returns to these strategies, it is possible that the
book makers do not fully appreciate the impact of these conditions on
scoring, or they may understand that the bettors do not recognize the
importance of these conditions. A regression using the percentage bet on
the over in the baseball totals market revealed that air density was
shown to have a positive and significant effect on the percentage bet on
the over, which is the exact opposite result of what is seen with game
outcomes. It appears the bettors may actually interpret the impact of
atmospheric conditions in the wrong direction, preferring the over to
the under when air density is high, even though these conditions make
the under the more likely betting market outcome. This result may help
in understanding why the sports book does not appear to fully
incorporate air density into the total as the betting public seemingly
ignores or misinterprets the impact of atmospheric conditions on the
number of runs scored in a baseball game.
Appendix 1: Frequencies of Binary Variables - Wind Direction and Weather Conditions
Sample of 2012 Major League Baseball Games (2430 observations)

Wind Direction    Frequency    Weather Condition    Frequency
In from CF        126          Sunny                259
In from RF        178          Cloudy               420
In from LF        162          Clear                516
Out to CF         317          Overcast             127
Out to RF         284          Partly Cloudy        743
Out to LF         204          Drizzle              16
Right to Left     344          Rain                 11
Left to Right     408
Varies            55
None              271
Dome              81
References
Bahill, T. A., Baldwin, D. G., & Ramberg, J. S. (2009). Effects
of altitude and atmospheric conditions on the flight of a baseball.
International Journal of Sports Science and Engineering, 3(2), 109-128.
Borghesi, R. (2008). Weather biases in the NFL totals market.
Applied Financial Economics, 18(12), 947-953.
Brown, K. H., & Abraham, J. F. (2002). Testing market
efficiency in the Major League Baseball over-under betting market.
Journal of Sports Economics, 3(4), 311-319.
Brown, K. H., & Abraham, J. F. (2004). Response to comment on
testing market efficiency in the Major League Baseball over-under
betting market. Journal of Sports Economics, 5(1), 96-99.
Brown, W., & Sauer, R. (1993). Does the basketball market
believe in the hot hand? American Economic Review, 83(5), 1377-1386.
Figlewski, S. (1979). Subjective information and market efficiency
in a betting market. Journal of Political Economy, 87(1), 75-88.
Gandar, J. M., Zuber, R. A., Johnson, R. S., & Dare, W. (2002).
Re-examining the betting market on Major League Baseball games: Is there
a reverse favorite-longshot bias? Applied Economics, 34, 1309-1317.
Gandar, J. M., & Zuber, R. A. (2004). An evaluation of the
debate over testing market efficiency in the Major League Baseball
over-under betting market. Journal of Sports Economics, 5(1), 100-105.
Hirshleifer, D., & Shumway, T. (2003). Good day sunshine: Stock
returns and the weather. Journal of Finance, 58(3), 1009-1032.
Osborne, E. (2001). Efficient markets? Don't bet on it.
Journal of Sports Economics, 2(1), 50-61.
Paul, R. J., & Weinbach, A. P. (2004). Comment on testing
market efficiency in the Major League Baseball over-under betting
market. Journal of Sports Economics, 5(1), 93-95.
Paul, R. J., Humphreys, B. R., & Weinbach, A. P. (2013). The lure
of the pitcher: How the baseball betting market is influenced by elite
starting pitchers. In L. V. Williams & D. S. Siegel (Eds.), The
Oxford handbook of the economics of gambling. Oxford, UK: Oxford
University Press.
Roll, R. (1984). Orange juice and weather. American Economic
Review, 74(5), 861-880.
Roll, R. (1988). R². Journal of Finance, 43(3), 541-566.
Sauer, R., Brajer, V., Ferris, S. & Marr, M. (1988). Hold your
bets: Another look at the efficiency of the gambling market for National
Football League games. Journal of Political Economy, 96(1), 206-213.
Rodney J. Paul (1), Andrew P. Weinbach (2), and Chris Weinbach (3)
(1) Syracuse University
(2) Coastal Carolina University
(3) STEM Partners, LLC
Rodney J. Paul is a professor of sport management in the David B.
Falk College of Sport and Human Dynamics. His research interests include
studies of market efficiency, prediction markets, behavioral biases, and
the economics and finance of sports.
Andrew P. Weinbach is an associate professor of economics and the
Colonel Lindsey H. Vereen Endowed Business Professor at the E. Craig
Wall Sr. College of Business Administration. His research interests
include the economics and finance of sports, consumer behavior, and the
economics of lotteries and gambling.
Chris Weinbach is a technology consultant for STEM Partners, LLC.
He has a lifetime interest in the game of baseball and baseball
statistics.
Table 1: Summary Statistics, 2012 Major League Baseball
(2430 observations)
                     Total     Combined       Air        Wind
                               Runs Scored    Density    Speed
Mean                 8.1599    8.3687         1.1422     7.3681
Standard Deviation   1.1234    4.5822         0.0461     4.9832
Median               8.0000    8.0000         1.1466     7.0000
Table 2: Regression Model of Total Runs Scored in Baseball Games
based on Atmospheric Conditions
Dependent Variable: Total Runs Scored
Variable                  Coefficient     Variable           Coefficient
                          (t-stat)                           (t-stat)
Intercept                 14.7510 ***     Right to Left      -0.3593
                          (4.0792)                           (-0.5044)
Total                     0.6940 ***      Left to Right      -0.8239
                          (6.7961)                           (-1.2197)
Over Probability Based    -0.3845         Varies             -0.1016
on Odds Adjustment        (-0.2378)       (Swirling Wind)    (-0.1242)
Air Density               -10.1084 ***    Dome               -1.0379
                          (-4.0159)                          (-1.2690)
Wind Speed                0.0859 ***      Roof Closed        0.1976
                          (3.2119)                           (0.2932)
In from CF                -0.8325         Sunny              0.1842
                          (-1.1520)                          (0.5806)
In from RF                -0.7177         Cloudy             -0.0045
                          (-1.0005)                          (-0.0165)
In from LF                -0.3996         Clear              -0.4369 *
                          (-0.5715)                          (-1.6849)
Out to CF                 -0.7016         Overcast           -0.0366
                          (-1.0317)                          (-0.0910)
Out to RF                 -0.6003         Drizzle            -0.4309
                          (-0.8820)                          (-0.3825)
Out to LF                 -0.8452         Rain               1.2937
                          (-1.2266)                          (0.9214)
R-squared                 0.077
Asterisks denote statistical significance of the t-test (* 10% level,
** 5% level, *** 1% level).
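As a rough reading aid for Table 2, the sketch below converts the two significant atmospheric coefficients into run changes for a one-standard-deviation move in each variable, using the standard deviations from Table 1. The function name and the choice of a one-standard-deviation change are illustrative assumptions, not part of the original analysis, and the calculation holds all other regressors fixed.

```python
# Coefficients from Table 2 and standard deviations from Table 1.
AIR_DENSITY_COEF = -10.1084   # change in total runs per unit of air density
WIND_SPEED_COEF = 0.0859      # change in total runs per unit of wind speed
AIR_DENSITY_SD = 0.0461
WIND_SPEED_SD = 4.9832


def implied_run_change(coefficient: float, change: float) -> float:
    """Linear-model change in predicted total runs for a change in one regressor.

    Hypothetical helper for illustration; all other regressors are held fixed.
    """
    return coefficient * change


# One-standard-deviation decrease in air density (assumed scenario).
air_effect = implied_run_change(AIR_DENSITY_COEF, -AIR_DENSITY_SD)
# One-standard-deviation increase in wind speed (assumed scenario).
wind_effect = implied_run_change(WIND_SPEED_COEF, WIND_SPEED_SD)

print(f"Air density -1 SD: {air_effect:+.3f} runs")
print(f"Wind speed  +1 SD: {wind_effect:+.3f} runs")
```

Under these assumptions, each one-standard-deviation move corresponds to a little under half a run per game.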
Table 3: Logit Model of Over Results in the Baseball Totals Betting
Market
Dependent Variable: Game Result is
Over in Totals Market (Binary)
Variable                  Coefficient     Variable           Coefficient
                          (t-stat)                           (t-stat)
Intercept                 5.6177 ***      Right to Left      -0.0650
                          (3.4317)                           (-0.1538)
Total                     -0.1219 ***     Left to Right      -0.0598
                          (-2.6785)                          (-0.1433)
Over Probability Based    0.2517          Varies             -0.0339
on Odds Adjustment        (0.2973)        (Swirling Wind)    (-0.0704)
Air Density               -4.2534 ***     Dome               -0.4690
                          (-3.8207)                          (-0.9975)
Wind Speed                0.0253 **       Roof Closed        0.2202
                          (2.3197)                           (0.5276)
In from CF                -0.1102         Sunny              0.0183
                          (-0.2490)                          (0.1228)
In from RF                -0.2363         Cloudy             -0.0514
                          (-0.5464)                          (-0.4053)
In from LF                -0.0505         Clear              -0.2277 *
                          (-0.1154)                          (-1.9140)
Out to CF                 -0.0378         Overcast           0.1113
                          (-0.0900)                          (0.5623)
Out to RF                 -0.0536         Drizzle            -0.1359
                          (-0.1268)                          (-0.2637)
Out to LF                 -0.0471         Rain               0.2867
                          (-0.1094)                          (0.4670)
McFadden R-Squared        0.010
Asterisks denote statistical significance of the t-test (* 10% level,
** 5% level, *** 1% level).
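Logit coefficients such as those in Table 3 are in log-odds units, so their size is easier to judge after conversion. The sketch below translates the air density coefficient into an odds ratio and a probability shift for a one-standard-deviation decrease in air density. The 50% baseline probability of an over and the helper name are assumptions chosen for illustration; they are not values reported in the paper.

```python
import math

AIR_DENSITY_LOGIT_COEF = -4.2534  # Table 3
AIR_DENSITY_SD = 0.0461           # Table 1
BASELINE_PROB = 0.5               # assumed baseline chance of an over


def shifted_probability(baseline_prob: float, coefficient: float, change: float) -> float:
    """Probability after moving one regressor by `change` in a logit model.

    Hypothetical helper: converts the baseline probability to log-odds,
    adds coefficient * change, and converts back to a probability.
    """
    log_odds = math.log(baseline_prob / (1.0 - baseline_prob))
    log_odds += coefficient * change
    return 1.0 / (1.0 + math.exp(-log_odds))


# Multiplicative change in the odds of an over for a one-SD drop in air density.
odds_ratio = math.exp(AIR_DENSITY_LOGIT_COEF * -AIR_DENSITY_SD)
# Probability of an over after the same drop, starting from the assumed baseline.
new_prob = shifted_probability(BASELINE_PROB, AIR_DENSITY_LOGIT_COEF, -AIR_DENSITY_SD)

print(f"Odds ratio: {odds_ratio:.3f}")
print(f"Probability of over: {BASELINE_PROB:.3f} -> {new_prob:.3f}")
```

Starting from a different baseline probability changes the size of the shift, since the logit response is nonlinear in probability.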
Table 4: Percentage Bet on the Over in the Baseball Totals Market
Dependent Variable: Percentage Bet on the Over
Variable                  Coefficient     Variable           Coefficient
                          (t-stat)                           (t-stat)
Intercept                 -30.4803 ***    Right to Left      2.2730
                          (-1.8378)                          (0.6613)
Total                     2.4442 ***      Left to Right      5.1827
                          (5.3251)                           (1.5118)
Over Probability Based    12.6474         Varies             2.1846
on Odds Adjustment        (1.5987)        (Swirling Wind)    (0.4516)
Air Density               53.6446 ***     Dome               2.1256
                          (4.6128)                           (0.8270)
Wind Speed                0.0015          Roof Closed        -0.0069
                          (0.0168)                           (-0.0019)
In from CF                3.5968          Sunny              -1.0662
                          (1.0041)                           (-0.9336)
In from RF                0.9809          Cloudy             1.6965 *
                          (0.2716)                           (1.8378)
In from LF                2.2092          Clear              1.8431 *
                          (0.6097)                           (1.8923)
Out to CF                 2.8023          Overcast           1.0794
                          (1.0651)                           (0.7503)
Out to RF                 4.5872          Drizzle            -2.5667
                          (1.3047)                           (-0.6169)
Out to LF                 2.7056          Rain               -1.0376
                          (0.7678)                           (-0.2052)
R-squared                 0.051
Asterisks denote statistical significance of the t-test (* 10% level,
** 5% level, *** 1% level).
Table 5: Betting Simulations of Simple Strategies in Baseball
Totals Market
Betting Situation                    N      Actual     Expected    Z
                                            Return     Return
Air Density High - Wager on Under    239    0.0609     -0.0436     1.6248
Air Density Low - Wager on Over      228    0.0434     -0.0441     1.3402
Combined Bets on Air Density         467    0.0523     -0.0438     2.0995 **
Wind Speed High - Wager on Over      372    0.0115     -0.0439     1.0788
Wind Speed Low - Wager on Under      325    -0.0769    -0.0435     -0.6044
Combined Bets on Wind Speed          697    -0.0297    -0.0437     0.3762
Clear Day - Wager on Under           257    0.0168     -0.0437     1.3412
** denotes statistical significance of the z-test at the 5% level.
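The combined rows of Table 5 can be cross-checked with a small calculation. Reading the sample sizes 239 and 228 as the two air density strategies and 467 as their combination (and likewise 372, 325, and 697 for wind speed), the sketch below pools each pair of strategies as a bet-weighted average of the per-bet returns. This pooling rule is inferred from the reported numbers rather than described in the table, and the function name is hypothetical.

```python
def pooled_return(groups):
    """Bet-weighted average return over (n_bets, return_per_bet) pairs.

    Hypothetical helper for illustration; returns (total bets, pooled return).
    """
    total_bets = sum(n for n, _ in groups)
    total_return = sum(n * r for n, r in groups)
    return total_bets, total_return / total_bets


# (N, actual return) for the two air density strategies in Table 5.
air_n, air_ret = pooled_return([(239, 0.0609), (228, 0.0434)])
# (N, actual return) for the two wind speed strategies in Table 5.
wind_n, wind_ret = pooled_return([(372, 0.0115), (325, -0.0769)])

print(f"Air density combined: N={air_n}, return={air_ret:.4f}")
print(f"Wind speed combined:  N={wind_n}, return={wind_ret:.4f}")
```

Under that reading, the pooled figures agree with the reported combined returns of 0.0523 and -0.0297 to within rounding of the inputs.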