Abstract: Multivariate data summarized over areal units (counties, zip codes,
etc.) are common in the field of public health. Estimation or testing of
geographic boundaries for such data may have varied goals. For example, for data
on multiple disease outcomes, we may be interested in a single set of "composite"
boundaries for all diseases, separate boundaries for each disease, or both. Dif-
ferent areal wombling (boundary analysis) techniques are needed to meet these
different requirements. But in any case, the underlying statistical model needs
to account for correlations across both diseases and locations. Utilizing recent
developments in multivariate conditionally autoregressive (MCAR) distributions
and spatial structural equation modeling, we suggest a variety of Bayesian hi-
erarchical models for multivariate areal boundary analysis, including some that
incorporate random neighborhood structure. Many of our models can be imple-
mented via standard software, namely WinBUGS for posterior sampling and R for
summarization and plotting. We illustrate our methods using Minnesota county-
level esophagus, larynx, and lung cancer data, comparing models that account for
both, only one, or neither of the aforementioned correlations. We identify both
composite and cancer-specific boundaries, selecting the best statistical model using
the deviance information criterion (DIC). Our results indicate primary boundaries in
both the composite and cancer-specific response surfaces separating the mining- and
tourism-oriented northeast counties from the remainder of the state, as well as
secondary (residual) boundaries in the Twin Cities metro area.
Keywords: Areal data; Cancer; Multivariate conditionally autoregressive (MCAR)
model; Surveillance, Epidemiology and End Results (SEER) data.