The Bayesian approach to variable selection in regression is a powerful
tool for tackling many scientific problems. Inference for variable selection models is
usually implemented using Markov chain Monte Carlo (MCMC). Because MCMC
can impose a high computational cost in studies with a large number of variables,
we assess an alternative to MCMC based on a simple variational approximation.
Our aim is to retain useful features of Bayesian variable selection at a reduced cost.
Using simulations designed to mimic genetic association studies, we show that, in
some settings, this simple variational approximation yields posterior inferences that
closely match exact values. In less restrictive (and more realistic) conditions, we
show that posterior probabilities of inclusion for individual variables are often
incorrect, but variational estimates of other useful quantities, including posterior
distributions of the hyperparameters, are remarkably accurate. We illustrate how
these results guide the use of variational inference for a genome-wide association
study with thousands of samples and hundreds of thousands of variables.