There are different ways of specifying and running Bayesian models from within R. Here I will compare three methods: two that rely on an external program and one that relies only on R. I won't go into much detail about the differences in syntax; the idea is more to give a feel for how the different modeling languages look. The model I will be implementing assumes a normal distribution with fairly wide priors:
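Written out (assuming, as the precisions used in the JAGS section suggest, a normal prior with SD 100 on the mean and a log-normal prior with SD 4 on the standard deviation), the model would be:

$$
\begin{aligned}
y_i &\sim \text{Normal}(\mu, \sigma) \\
\mu &\sim \text{Normal}(0, 100) \\
\sigma &\sim \text{LogNormal}(0, 4)
\end{aligned}
$$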
Let’s start by generating some normally distributed data to use as example data in the models.
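A minimal sketch of such data generation (the sample size and the parameter values 10 and 5 are arbitrary choices for this example):

```r
set.seed(1337)                          # for reproducibility
y <- rnorm(n = 20, mean = 10, sd = 5)   # 20 draws from Normal(10, 5)
mean(y)
sd(y)
```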
Method 1: JAGS
JAGS (Just Another Gibbs Sampler) is a program that accepts a model string written in an R-like syntax and that compiles and generates MCMC samples from this model using Gibbs sampling. As in BUGS, the program that inspired JAGS, the exact sampling procedure is chosen by an expert system depending on how your model looks. JAGS is conveniently called from R using the rjags package, and John Myles White has written a nice introduction to using JAGS and rjags. I've seen it recommended to put the JAGS model specification into a separate file, but I find it more convenient to put everything together in one .R file, using the textConnection function to avoid having to save the model string to a separate file. The following R code specifies and runs the model above for 20000 MCMC samples:
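A sketch of how this might look with rjags (the variable names are my own, and the priors follow the precisions discussed below):

```r
library(rjags)

# The model specification, in JAGS's modeling language.
model_string <- "model {
  for(i in 1:length(y)) {
    y[i] ~ dnorm(mu, tau)
  }
  mu ~ dnorm(0, 0.0001)       # precision 1/100^2, i.e. SD 100
  sigma ~ dlnorm(0, 0.0625)   # precision 1/4^2, i.e. SD 4 on the log scale
  tau <- 1 / pow(sigma, 2)    # JAGS wants precision, not SD
}"

# Compile the model, burn in, then draw 20000 samples with coda.samples.
model <- jags.model(textConnection(model_string),
                    data = list(y = y), n.chains = 3)
update(model, 10000)  # burn-in
mcmc_samples <- coda.samples(model, variable.names = c("mu", "sigma"),
                             n.iter = 20000)
```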
One big difference between the JAGS modeling language and R is that JAGS parameterizes many distributions using precision rather than standard deviation, where the precision is $\tau = 1/\sigma^2$. Therefore the prior SDs of 100 and 4 become the precisions $0.0001 = 1/100^2$ and $0.0625 = 1 / 4^2$.
By using the coda.samples function to generate the samples, rather than the jags.samples function, it is possible to use the plotting and summary functions defined by the coda package.
plot shows the trace plots and marginal densities of the two parameters…
… and using summary we get credible intervals and point estimates for the parameters.
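Assuming the coda samples ended up in an object called mcmc_samples, both steps are one-liners:

```r
plot(mcmc_samples)     # trace plots and marginal densities
summary(mcmc_samples)  # point estimates and credible intervals
```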
Method 2: STAN
STAN is a fairly new program that works in a similar way to JAGS and BUGS. You write your model in STAN's modeling language, STAN compiles your model and generates MCMC samples that you can use for further analysis in R. All this can be done from within R using the rstan package. One way that STAN differs from JAGS is that STAN compiles the model down to a C++ program which uses the No-U-Turn sampler to generate MCMC samples from the model. STAN's modeling language is also different, requiring all variables and parameters to be explicitly declared and insisting on every line ending with ;. A good introduction to STAN and rstan can be found on STAN's wiki page. The following R code specifies the model and runs it for 20000 MCMC samples:
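A sketch of what such an rstan call might look like (the names and the exact priors are assumptions, kept consistent with the JAGS section):

```r
library(rstan)

# The model in STAN's language: everything declared, every line ends in ;
model_string <- "
data {
  int<lower=0> N;
  real y[N];
}
parameters {
  real mu;
  real<lower=0> sigma;
}
model {
  mu ~ normal(0, 100);
  sigma ~ lognormal(0, 4);
  y ~ normal(mu, sigma);  // vectorized over all of y
}"

stan_samples <- stan(model_code = model_string,
                     data = list(N = length(y), y = y),
                     chains = 3, iter = 20000, warmup = 10000)
```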
Another difference from JAGS, noticeable in the model definition above, is that STAN (in line with R) has vectorized functions: a likelihood that JAGS requires as an explicit loop over the data can be written as a single sampling statement in STAN.
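Concretely, a sketch of the two styles side by side (tau being the JAGS precision):

```
# JAGS: an explicit loop over every data point
for(i in 1:length(y)) {
  y[i] ~ dnorm(mu, tau)
}

// STAN: a single vectorized sampling statement
y ~ normal(mu, sigma);
```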
After running the model in STAN we can now inspect the trace plots…
… and the parameter distributions.
By printing the variable holding the MCMC results in the terminal we get point estimates and credible intervals.
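Assuming the fit object is called stan_samples, the inspection steps above look like:

```r
traceplot(stan_samples)  # trace plots of the chains
plot(stan_samples)       # parameter distributions
stan_samples             # printing gives point estimates and credible intervals
```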
Method 3: LaplacesDemon
Update: LaplacesDemon is unfortunately gone since mid July 2015; both the web page and the GitHub account have been removed. I've tried to contact the author, but have gotten no response. The last version of LaplacesDemon is now available here, but as the author has "disappeared" the future of the package is uncertain.
LaplacesDemon seems to be a rather unknown R package (I've found very few mentions of it on R-bloggers, for example) which helps you run Bayesian models using only R. LaplacesDemon implements a plethora of different MCMC methods and has great documentation available on www.bayesian-inference.com. As opposed to JAGS and STAN there is no modeling language; instead you have to craft your own likelihood function using R. The downside of this is that your model perhaps won't look as pretty as in JAGS, but the upside is that you can be much more flexible in what constructs and probability distributions you use. Specifying and running the model using the exotic Hit-and-Run Metropolis (HARM) algorithm looks like the following:
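A sketch of what this could look like (the names mu/log_sigma and the exact list fields are assumptions based on the package's documented Model-function interface; the priors follow the earlier sections):

```r
library(LaplacesDemon)

# Data list with the bookkeeping fields LaplacesDemon expects.
data_list <- list(y = y,
                  mon.names = "LP",
                  parm.names = c("mu", "log_sigma"))

# The "model" is just an R function returning the log-posterior.
Model <- function(parm, Data) {
  mu <- parm[1]
  sigma <- exp(parm[2])  # sampled on the log scale, then exponentiated
  log_lik <- sum(dnorm(Data$y, mu, sigma, log = TRUE))
  log_prior <- dnorm(mu, 0, 100, log = TRUE) +
               dlnorm(sigma, 0, 4, log = TRUE)
  LP <- log_lik + log_prior
  list(LP = LP, Dev = -2 * log_lik, Monitor = LP,
       yhat = NA, parm = parm)
}

# Hit-and-Run Metropolis for 20000 iterations.
mcmc_samples <- LaplacesDemon(Model, Data = data_list,
                              Initial.Values = c(0, 0),
                              Iterations = 20000, Algorithm = "HARM")
```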
As far as I understand there is no special mechanism for keeping the parameters inside a bounded range; therefore the SD $\sigma$ is sampled on the log scale and then exponentiated to make sure $\sigma$ is always positive.
The trace plots and distributions of MCMC samples can then be plotted…
… and by Consorting with Laplace's Demon (the package's Consort function) we get a long list of info including point estimates, credible intervals and MCMC diagnostics.
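Assuming the fit is in mcmc_samples as above, the two steps are roughly (the plot arguments are a guess at the plot method's signature, so check the package documentation):

```r
plot(mcmc_samples, BurnIn = 10000, data_list)  # trace plots and densities
Consort(mcmc_samples)  # the long list of estimates and diagnostics
```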
Comparison of the Three Methods
Concluding this little exercise I thought I’d point out what I feel are some pros and cons of the three methods.
JAGS
- + Mature program, lots of documentation and examples (for example, the great BUGS book).
- + Nice modeling language, it is almost possible to enter the likelihood specification verbatim.
- + Takes care of the sampling for you (in the best case).
- - It is difficult to use probability distributions not included in JAGS.
- - You're stuck with Gibbs sampling, for better or worse.
STAN
- + An expressive modeling language similar to JAGS.
- + An active and supportive community (especially the mailing lists).
- + Compiling down to C++ should result in efficient sampling.
- + The No-U-Turn sampler requires no configuration.
- + Can also do optimization (for example, maximum likelihood estimation).
- - The modeling language requires a bit more bookkeeping than JAGS' (for example, why all the `;` ?).
- - Compiling down to C++ takes some time (> 30 s on my Dell Vostro 3700).
- - It is difficult to use probability distributions not included in STAN (but easier than in JAGS).
- - No direct support for discrete parameters.
LaplacesDemon
- + Relies only on R.
- + Easy to use any probability distribution implemented in R.
- + Good documentation.
- + Lots of different MCMC methods to try.
- - No fancy modeling language, the likelihood has to be specified using an R function.
- - No large user community.
- - You have to select and configure the MCMC sampling yourself.
- - As everything is implemented in R, the sampling might be slower than in JAGS and STAN.