Abstract: We analyze complete sequences of successes (hits, walks, and sacrifices)
for a group of players from the American and National Leagues, collected over
4 seasons. The goal is to describe how players' performances vary from season
to season. In particular, we wish to assess and compare the effect of available
occasion-specific covariates over seasons. The data are binary sequences for each
player and each season. We model dependence in the binary sequence by an
autoregressive logistic model. The model includes lagged terms up to a fixed order.
For each player and season we introduce a different set of autologistic regression
coefficients, i.e., the regression coefficients are random effects that are specific to
each season and player. We use a nonparametric approach to define a random
effects distribution. The nonparametric model is defined as a mixture with a
Dirichlet process prior for the mixing measure. The described model is justified by
a representation theorem for order-k exchangeable sequences. Besides the repeated
measurements for each season and player, multiple seasons within a given player
define an additional level of repeated measurements. We introduce dependence at
this level of repeated measurements by relating the season-specific random effects
vectors in an autoregressive fashion. We ultimately conclude that while some
covariates like the ERA of the opposing pitcher are always relevant, others like an
indicator for the game being into the seventh inning may be significant only for
certain seasons, and some others, like the score of the game, can safely be ignored.
Keywords: Dirichlet Process, Partial Exchangeability, Semiparametric Random Effects
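
As a rough illustration of the autoregressive logistic model described in the abstract, the sketch below simulates one binary success sequence for a single player and season. It is not the authors' implementation: the function name `simulate_autologistic`, its parameters, and the convention of treating lags before the start of the sequence as 0 are all assumptions made for this example. The coefficients are held fixed here, whereas in the described model they are random effects drawn from a Dirichlet process mixture and linked across seasons.

```python
import math
import random


def simulate_autologistic(n, intercept, lag_coefs, covariate_coefs, covariates, seed=0):
    """Simulate a binary sequence from an order-k autoregressive logistic model.

    P(y_t = 1 | past) = logistic(intercept
                                 + sum_j lag_coefs[j] * y[t - j - 1]
                                 + sum_m covariate_coefs[m] * covariates[t][m])

    Lags that reach before the start of the sequence contribute 0
    (an illustrative convention, not taken from the paper).
    """
    rng = random.Random(seed)
    k = len(lag_coefs)  # order of the autoregression
    y = []
    for t in range(n):
        eta = intercept
        # Contribution of the k most recent outcomes.
        for j in range(k):
            if t - j - 1 >= 0:
                eta += lag_coefs[j] * y[t - j - 1]
        # Contribution of occasion-specific covariates (e.g. opposing pitcher's ERA).
        eta += sum(b * x for b, x in zip(covariate_coefs, covariates[t]))
        p = 1.0 / (1.0 + math.exp(-eta))
        y.append(1 if rng.random() < p else 0)
    return y


# Hypothetical usage: 20 plate appearances, order-2 dependence, one covariate.
sequence = simulate_autologistic(
    n=20,
    intercept=-0.5,
    lag_coefs=[0.8, 0.3],
    covariate_coefs=[0.1],
    covariates=[[4.0]] * 20,
    seed=1,
)
```

All numeric values in the usage example are made up; they only show the shape of the inputs (one covariate vector per occasion, one coefficient per lag).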