Bayesian data analysis is cool, Markov chain Monte Carlo is the cool technique that makes Bayesian data analysis possible, and wouldn’t it be cool if you could do all of this in the browser? That was what I thought, at least, and I’ve now made bayes.js: a small JavaScript library that implements an adaptive MCMC sampler and a couple of probability distributions, and that makes it relatively easy to implement simple Bayesian models in JavaScript.
Here is a motivating example: Say that you have the heights of the last ten American presidents…
// The heights of the last ten American presidents in cm, from Kennedy to Obama
var heights = [183, 192, 182, 183, 177, 185, 188, 188, 182, 185];
… and that you would like to fit a Bayesian model assuming a Normal distribution to this data. Well, you can do that right now by clicking “Start sampling” below! This will run an MCMC sampler in your browser implemented in JavaScript.
If this doesn’t seem to work in your browser, for some reason, then try this version of the demo.
Here is the model you just sampled from…
$$\mu \sim \text{Normal}(0, 1000) \\ \sigma \sim \text{Uniform}(0, 1000) \\ \text{heights}_i \sim \text{Normal}(\mu, \sigma) \quad \text{for } i \text{ in } 1..n$$
… and this is how it is implemented in JavaScript:
/* The code below assumes that you have loaded the two modules of bayes.js:
 * - mcmc.js which implements the sampler and creates the global
 *   object mcmc.
 * - distributions.js which implements a number of log density functions
 *   for common probability distributions and that creates the global
 *   object ld (as in log density).
 */

// The data
var heights = [183, 192, 182, 183, 177, 185, 188, 188, 182, 185];

// Parameter definitions
var params = {
  mu:    {type: "real"},
  sigma: {type: "real", lower: 0}};

// Model definition
var log_post = function(state, heights) {
  var log_post = 0;
  // Priors (here sloppy and vague...)
  log_post += ld.norm(state.mu, 0, 1000);
  log_post += ld.unif(state.sigma, 0, 1000);
  // Likelihood
  for(var i = 0; i < heights.length; i++) {
    log_post += ld.norm(heights[i], state.mu, state.sigma);
  }
  return log_post;
};

// Initializing the sampler, burning some draws to the MCMC gods,
// and generating a sample of size 1000.
var sampler = new mcmc.AmwgSampler(params, log_post, heights);
sampler.burn(1000);
var samples = sampler.sample(1000);
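Once you have the draws you will typically want to summarize them. Here is a small sketch of such a summary helper, assuming that sampler.sample(n) returns one plain array of draws per parameter (so that samples.mu is an array of numbers — check the bayes.js README for the actual format):

```javascript
// Summarize a vector of MCMC draws: posterior mean and a central 95%
// quantile interval. Assumes `draws` is a plain array of numbers,
// e.g. samples.mu (an assumption about the sample format).
function summarize(draws) {
  var sorted = draws.slice().sort(function(a, b) { return a - b; });
  var mean = draws.reduce(function(a, b) { return a + b; }, 0) / draws.length;
  return {
    mean: mean,
    lower: sorted[Math.floor(0.025 * (sorted.length - 1))],
    upper: sorted[Math.floor(0.975 * (sorted.length - 1))]
  };
}
```

For the model above you would then call, for example, summarize(samples.mu).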
I’ve implemented a JavaScript MCMC procedure for fitting a Bayesian model before, but that was just for a specific model (I also implemented an MCMC procedure in BASIC, but don’t ask me why…). The idea with bayes.js is to make it easier for me (and maybe for you) to make demos of Bayesian procedures that are easy to put online. If you would like to know more about bayes.js, just head over to its GitHub page where you will find the code and a README file full of details. You can also check out a couple of interactive demos that I’ve made:
These demos rely on the plotly library and I haven’t tested them extensively on different platforms/browsers. You should be able to change the data and model definition on the fly (but if you change some stuff, like adding multidimensional variables, the plotting might stop working).
The two major files in bayes.js are:

- mcmc.js: Implements the adaptive MCMC sampler AmwgSampler, using the adaptive Metropolis-within-Gibbs (Amwg) algorithm presented by Roberts and Rosenthal (2009). Loading this file in the browser creates the global object mcmc.
- distributions.js: Implements a number of log density functions for common probability distributions. These are named ld.* (for example, ld.norm and ld.pois) and use the same parameters as the d* density functions in R. Loading this file in the browser creates the global object ld.

In addition to this, the whole thing is wrapped in an RStudio project, as I’ve used R and JAGS to write some tests.
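To give a flavor of what the Amwg algorithm does, here is a minimal, self-contained sketch of the idea in plain JavaScript. This is not the bayes.js implementation, just an illustration: each parameter gets its own proposal SD, adapted in batches so that the acceptance rate drifts toward the roughly 0.44 that Roberts and Rosenthal recommend for one-dimensional proposals.

```javascript
// A minimal sketch of adaptive Metropolis-within-Gibbs. `logPost` takes
// an array of parameter values and returns the log posterior density;
// `init` is the starting state; returns an array of sampled states.
function amwgSketch(logPost, init, nSamples) {
  var state = init.slice();
  var logSd = init.map(function() { return 0; });       // log proposal SDs, start at sd = 1
  var acceptCount = init.map(function() { return 0; });
  var samples = [];
  var batchSize = 50;
  for (var i = 1; i <= nSamples; i++) {
    // Metropolis-within-Gibbs: update one parameter at a time
    for (var j = 0; j < state.length; j++) {
      var prop = state.slice();
      prop[j] += randNorm() * Math.exp(logSd[j]);
      if (Math.log(Math.random()) < logPost(prop) - logPost(state)) {
        state = prop;
        acceptCount[j]++;
      }
    }
    samples.push(state.slice());
    // Adapt each proposal SD batch-wise toward a ~0.44 acceptance rate
    if (i % batchSize === 0) {
      var delta = Math.min(0.05, 1 / Math.sqrt(i / batchSize));
      for (var k = 0; k < state.length; k++) {
        logSd[k] += (acceptCount[k] / batchSize > 0.44) ? delta : -delta;
        acceptCount[k] = 0;
      }
    }
  }
  return samples;
}

// Standard normal draw via the Box-Muller transform
function randNorm() {
  return Math.sqrt(-2 * Math.log(1 - Math.random())) *
         Math.cos(2 * Math.PI * Math.random());
}
```

The real AmwgSampler does more (typed parameters, bounds, burn-in handling), but this is the core of the adaptation scheme.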
One performance caveat: in a quick test, sampling using ld.norm as defined in distributions.js was 10x slower on Firefox 37.

Roberts, G. O., & Rosenthal, J. S. (2009). Examples of adaptive MCMC. Journal of Computational and Graphical Statistics, 18(2), 349-367. pdf
On the 21st of February, 2015, my wife had not had her period for 33 days, and as we were trying to conceive, this was good news! An average period cycle is around a month, and if you are a couple trying to go triple, then a missing period is a good sign something is going on. But at 33 days, this was not yet a missing period, just a late one, so how good news was it? Pretty good, really good, or just meh?
To get at this I developed a simple Bayesian model that, given the number of days since your last period and your history of period onsets, calculates the probability that you are going to be pregnant this period cycle. In this post I will describe what data I used, the priors I used, the model assumptions, and how to fit it in R using importance sampling. And finally I show you why the result of the model really didn’t matter in the end. Also I’ll give you a handy script if you want to calculate this for yourself. :)
During the last part of 2014 my wife kept a journal of her period onsets, which was good luck for me, else I would end up with tiny data again. In total we had the dates for eight period onsets, but the data I used was not the onsets but the number of days between the onsets:
period_onset <- as.Date(c("2014-07-02", "2014-08-02", "2014-08-29", "2014-09-25",
                          "2014-10-24", "2014-11-20", "2014-12-22", "2015-01-19"))
days_between_periods <- as.numeric(diff(period_onset))
So the onsets occur pretty regularly, hovering around a cycle of 28 days. The last onset was on the 19th of January, so on the 21st of February there had been 33 days since the last onset.
I was constructing a model covering period cycles, pregnancies and infertility, and as such it was obviously going to make huge simplifications. Here are the specific assumptions I made:
- The number of days between periods (days_between_periods) is assumed to be normally distributed with unknown mean (mean_period) and standard deviation (sd_period).
- The probability of getting pregnant during a period cycle is 0.19 (more about where this number comes from below) if you are fertile as a couple (is_fertile). Unfortunately not all couples are fertile, and if you are not, then the probability of getting pregnant is 0. If fertility is coded as 0-1, this can be compactly written as 0.19 * is_fertile.
- The probability of having gone a number of period cycles (n_non_pregnant_periods) without getting pregnant is then (1 - 0.19 * is_fertile)^n_non_pregnant_periods.
- If you are not pregnant, the number of days from the last period onset to the next one (next_period) is going to be more than the current number of days since the last period (days_since_last_period). That is, the probability of next_period < days_since_last_period is zero. This sounds strange because it is so obvious, but we’re going to need it in the model.

That was basically it! But in order to fit this I was going to need a likelihood function: a function that, given fixed parameters and some data, calculates the probability of the data given those parameters or, more commonly, something proportional to a probability, that is, a likelihood. And as this likelihood can be extremely tiny, I needed to calculate it on the log scale to avoid numerical problems. When crafting a log likelihood function in R, the general pattern is this:
- Start by setting the log likelihood to zero (log_like <- 0.0).
- Using R’s density functions (like dnorm, dbinom and dpois) you calculate the likelihoods of the different parts of the model. You then multiply these likelihoods together; on the log scale this corresponds to adding the log likelihoods to log_like.
- To make the d* functions return log likelihoods, just add the argument log = TRUE. Also remember that a likelihood of 0.0 corresponds to a log likelihood of -Inf.

So, a log likelihood function corresponding to the model above would then be:
calc_log_like <- function(days_since_last_period, days_between_periods,
                          mean_period, sd_period, next_period,
                          is_fertile, is_pregnant) {
  n_non_pregnant_periods <- length(days_between_periods)
  log_like <- 0
  if(n_non_pregnant_periods > 0) {
    log_like <- log_like + sum( dnorm(days_between_periods, mean_period, sd_period, log = TRUE) )
  }
  log_like <- log_like + log( (1 - 0.19 * is_fertile)^n_non_pregnant_periods )
  if(!is_pregnant && next_period < days_since_last_period) {
    log_like <- -Inf
  }
  log_like
}
Here the data is the scalar days_since_last_period
and the vector days_between_periods
, and the rest of the arguments are the parameters to be estimated. Using this function I could now get the log likelihood for any data + parameter combination. However, I still only had half a model, I also needed priors!
To complete this model I needed priors on all the parameters. That is, I had to specify what information the model has before seeing the data. Specifically, I needed priors on mean_period
, sd_period
, is_fertile
, and is_pregnant
(while next_period
is also a parameter, I didn’t need to give it an explicit prior as its distribution is completely specified by mean_period
and sd_period
). I also needed to find a value for the probability of becoming pregnant in a cycle (which I set to 0.19
above). Did I use vague, “objective” priors here? No, I went looking in the fertility literature for something more informative!
For the distribution of the days_between_periods
the parameters were mean_period
and sd_period
. Here I used estimates from the article The normal variabilities of the menstrual cycle
(Cole et al, 2009) which measured the regularity of periods in 184 women aged 18-36 years. The grand mean number of days between periods was here 27.7 days, with the SD of the per participant mean being 2.4. The group SD of the number of days between periods was 1.6. Given these estimates I then decided to put a Normal(27.7, 2.4) distribution over mean_period
and a Half-Normal distribution with mean 1.6 over sd_period
, corresponding to a Half-Normal with a SD of 2.05. Here they are:
For the parameters is_fertile
and is_pregnant
I based the priors on frequencies. The proportion of couples that are fertile is tricky to define, as there are different definitions of infertility. Van Geloven et al. (2013) made a small literature review and found that between 2% and 5% of all couples could be considered infertile. As I’ve seen numbers as high as 10%, I decided to go with the higher end of this range and put a prior probability of 100% - 5% = 95% that a couple is fertile.
is_pregnant
is a binary parameter standing for whether the couple is going to get (or already is) pregnant during the current cycle. The prior I used here was the probability of getting pregnant in a cycle. This probability is obviously 0.0 if the couple is infertile, but how large a proportion of active, fertile couples get pregnant in a period cycle? Unfortunately I didn’t find a source that explicitly stated this, but I found something close. On page 53 in Increased Infertility With Age in Men and Women, Dunson et al. (2004) give the proportion of couples trying to conceive who did not get pregnant within 12 cycles, stratified by the age of the woman:
prop_not_preg_12_cycles <- c("19-26 years" = 0.08,
"27-34 years" = 0.13,
"35-39 years" = 0.18)
Using some back-of-the-R-script calculations I calculated the probability to conceive in a cycle: As these proportions presumably include infertile couples, I started by subtracting 0.05, the proportion of couples that I assumed are infertile. My wife was in the 27-34 years bracket, so the probability of us not conceiving within 12 cycles, given that we are fertile, was then 0.13 - 0.05. If p is the probability of not getting pregnant during one cycle, then $p^{12} = 0.13 - 0.05$ is the probability of not getting pregnant during twelve cycles and, as p is positive, we have that $p = (0.13 - 0.05)^{1/12}$. The probability of getting pregnant in one cycle is then 1 - p, and the probabilities for the three age groups are:
1 - (prop_not_preg_12_cycles - 0.05)^(1/12)
## 19-26 years 27-34 years 35-39 years
## 0.25 0.19 0.16
So that’s where the 19% probability of conceiving came from in the log likelihood function above, and 19% is what I used as a prior for is_pregnant
. Now I had priors for all parameters and I could construct a function that returned samples from the prior:
sample_from_prior <- function(n) {
  prior <- data.frame(mean_period = rnorm(n, 27.7, 2.4),
                      sd_period   = abs(rnorm(n, 0, 2.05)),
                      is_fertile  = rbinom(n, 1, 0.95))
  prior$is_pregnant <- rbinom(n, 1, 0.19 * prior$is_fertile)
  prior$next_period <- rnorm(n, prior$mean_period, prior$sd_period)
  prior$next_period[prior$is_pregnant == 1] <- NA
  prior
}
It takes one argument (n
) and returns a data.frame
with n
rows, each row being a sample from the prior. Let’s try it out:
sample_from_prior(n = 4)
## mean_period sd_period is_fertile is_pregnant next_period
## 1 29 1.24 1 0 30
## 2 29 3.73 1 0 28
## 3 27 1.29 1 1 NA
## 4 27 0.57 0 0 27
Notice that is_pregnant
can only be 1
if is_fertile
is 1
, and that there is no next_period
if the couple is_pregnant
.
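An aside on the 2.05 scale used in sample_from_prior above (my own check, not part of the original analysis): the mean of a Half-Normal distribution with scale σ (the SD of the underlying Normal) is σ·√(2/π), so the scale that gives a desired mean is the mean divided by √(2/π):

```javascript
// Mean of Half-Normal(sigma) is sigma * sqrt(2 / pi), so to pick the
// scale (the SD of the underlying Normal) for a desired mean:
function halfNormalScaleForMean(mean) {
  return mean / Math.sqrt(2 / Math.PI);
}
```

For a mean of 1.6 days this gives a scale of about 2.0, close to the 2.05 used in abs(rnorm(n, 0, 2.05)).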
I had now collected the triforce of Bayesian statistics: the prior, the likelihood and the data. There are many algorithms I could have used to fit this model, but here a particularly convenient method was importance sampling. I’ve written about importance sampling before, but let’s recap: importance sampling is a Monte Carlo method that is very easy to set up and that can work well if (1) the parameter space is small and (2) the prior is not too dissimilar from the posterior. As my parameter space was small, and because I used pretty informative priors, I thought importance sampling would suffice here. The three basic steps in importance sampling are:
1. Generate a large sample from the prior (here done by sample_from_prior).
2. Weight each draw by how likely the data is, given that draw’s parameters, that is, by its likelihood (here calculated on the log scale by calc_log_like).
3. Resample the draws, with replacement, with probabilities proportional to these weights (here done by R’s sample).

(Note that there are some variations to this procedure, but when used to fit a Bayesian model this is a common version of importance sampling.)
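Step 3, the weighted resampling, is the only slightly fiddly part. Here is a sketch of it in JavaScript (my own illustration; it uses the common subtract-the-max trick to avoid underflow when exponentiating log likelihoods):

```javascript
// Given the log likelihood of each prior draw, resample draw indices
// with replacement, with probability proportional to the likelihood.
function resample(logLikes, nDraws) {
  // Subtracting the max log likelihood before exponentiating avoids
  // underflow and doesn't change the relative weights.
  var maxLogLike = Math.max.apply(null, logLikes);
  var weights = logLikes.map(function(ll) { return Math.exp(ll - maxLogLike); });
  var total = weights.reduce(function(a, b) { return a + b; }, 0);
  var indices = [];
  for (var i = 0; i < nDraws; i++) {
    var u = Math.random() * total;
    var cumSum = 0;
    for (var j = 0; j < weights.length; j++) {
      cumSum += weights[j];
      if (u < cumSum) { indices.push(j); break; }
    }
  }
  return indices; // indices into the prior sample; keep those rows
}
```

(The R function defined next does the same thing in one line, via sample() with prob = exp(log_like).)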
The result of using importance sampling is a new sample which, if the importance sampling worked OK, can be treated as a sample from the posterior. That is, it represents what the model knows after having seen the data. Since I already had defined sample_from_prior
and calc_log_like
, defining a function in R doing importance sampling was trivial:
sample_from_posterior <- function(days_since_last_period, days_between_periods, n_samples) {
  prior <- sample_from_prior(n_samples)
  log_like <- sapply(1:n_samples, function(i) {
    calc_log_like(days_since_last_period, days_between_periods,
                  prior$mean_period[i], prior$sd_period[i], prior$next_period[i],
                  prior$is_fertile[i], prior$is_pregnant[i])
  })
  posterior <- prior[ sample(n_samples, replace = TRUE, prob = exp(log_like)), ]
  posterior
}
So, on the 21st of February, 2015, my wife had not had her period for 33 days. Was this good news? Let’s run the model and find out!
post <- sample_from_posterior(33, days_between_periods, n_samples = 100000)
post
is now a long data frame where the distribution of the parameter values represent the posterior information regarding those parameters.
head(post)
## mean_period sd_period is_fertile is_pregnant next_period
## 33231 28 2.8 0 0 37
## 22386 27 2.4 1 1 NA
## 47489 27 2.1 1 1 NA
## 68312 28 2.3 1 1 NA
## 37341 29 2.9 1 1 NA
## 57957 30 2.6 1 0 36
Let’s start by looking at the mean and standard deviation of the number of days between each period:
As expected the posteriors are narrower than the priors and, looking at the posteriors, it’s probable that the mean period cycle is around 29 days with a SD of 2-3 days. Now to the important questions: What’s the probability that we are a fertile couple and what’s the probability that we were pregnant on the 21st of February? To calculate this we can just take post$is_fertile
and post$is_pregnant
and calculate the proportion of 1
s in these vectors. A quick way of doing this is just to take the mean
:
mean(post$is_fertile)
## [1] 0.97
mean(post$is_pregnant)
## [1] 0.84
So it was pretty good news: It’s very probable that we are a fertile couple and the probability that we were pregnant was 84%! Using this model I could also see how the probability of us being pregnant would change if the period onset would stay away a couple of days more:
post <- sample_from_posterior(34, days_between_periods, n_samples = 100000)
mean(post$is_pregnant)
## [1] 0.92
post <- sample_from_posterior(35, days_between_periods, n_samples = 100000)
mean(post$is_pregnant)
## [1] 0.96
Yeah, while we are at it, why not see how the probability of us being fertile and pregnant changed during the months we were trying to conceive:
So, this makes sense. As the time since the last period gets longer, the probability that we are going to be pregnant the current cycle increases, but as soon as there is a period onset that probability falls back to baseline again. We see the same pattern for the probability of being fertile, but for every period cycle we didn’t get pregnant the probability of us being fertile gets slightly lower. Both these graphs are a bit jagged, but this is just due to the variability of the importance sampling algorithm. Also note that, while the graphs above are pretty, there is no real use in looking at the probability over time; the only thing that’s informative and matters is the current probability.
But all of this doesn’t really matter now. And the probabilities I calculated do not matter either. Before-the-fact probabilities reflect uncertainty regarding an outcome or a statement, but after-the-fact there is no uncertainty left. What uncertainty the probabilities represented is gone. I’m certain that my wife and I were pregnant on the 21st of February, and I know this because one week ago, on the 29th of October, we received this little guy:
If probabilities matter to you you can find a script implementing this model here which you can run on your own data, or a friend’s. In this post I left out the code creating the plots, but all of that can be found here. And, as this post was really just an excuse to post baby photos, here are some more of me and my son checking out the statistical literature:
Doing Bayesian Data Analysis makes him a bit sleepy, but that’s OK, he’ll come around! Looking at Fisher’s Statistical Methods for Research Workers on the other hand makes him furious…
Bortot, P., Masarotto, G., & Scarpa, B. (2010). Sequential predictions of menstrual cycle lengths. Biostatistics, 11(4), 741-755. doi: 10.1093/biostatistics/kxq020
Cole, L. A., Ladner, D. G., & Byrn, F. W. (2009). The normal variabilities of the menstrual cycle. Fertility and sterility, 91(2), 522-527. doi: 10.1016/j.fertnstert.2007.11.073
Dunson, D. B., Baird, D. D., & Colombo, B. (2004). Increased infertility with age in men and women. Obstetrics & Gynecology, 103(1), 51-56. doi: 10.1097/01.AOG.0000100153.24061.45
Van Geloven, N., Van der Veen, F., Bossuyt, P. M. M., Hompes, P. G., Zwinderman, A. H., & Mol, B. W. (2013). Can we distinguish between infertility and subfertility when predicting natural conception in couples with an unfulfilled child wish?. Human Reproduction, 28(3), 658-665. doi: 10.1093/humrep/des428
Wilcox, A. J., Dunson, D., & Baird, D. D. (2000). The timing of the “fertile window” in the menstrual cycle: day specific estimates from a prospective study. Bmj, 321(7271), 1259-1262. doi: 10.1136/bmj.321.7271.1259, pdf
Romantic kissing is a cultural universal, right? Nope! At least not if you are to believe Jankowiak et al. (2015), who surveyed a large number of cultures and found that “sexual-romantic kissing” occurred in far from all of them. For some reason the paper didn’t include a world map with these kissers and non-kissers plotted out. So, with the help of my colleague Andrey Anikin I’ve now made such a map using R and the excellent leaflet package. Click on the image below to check it out:
Jankowiak, W.R., Volsche, S.L. & Garcia, J.R., 2015. Is the Romantic-Sexual Kiss a Near Human Universal? American Anthropologist, 117(3), 535-539. doi: 10.1111/aman.12286 pdf
The BASIC programming language was at one point the most widely spread programming language. Many home computers in the 80s came with BASIC (like the Commodore 64 and the Apple II), and in the 90s both DOS and Windows 95 included a copy of the QBasic IDE. QBasic was also the first programming language I encountered (I used it to write a couple of really horrible text adventures). Now I haven’t programmed in BASIC for almost 20 years and I thought I would revisit this (from my current perspective) really weird language. As I spend a lot of time doing Bayesian data analysis, I thought it would be interesting to see what a Bayesian analysis would look like if I only used the tool that I had 20 years ago, that is, BASIC.
This post walks through the implementation of the Metropolis-Hastings algorithm, a standard Markov chain Monte Carlo (MCMC) method that can be used to fit Bayesian models, in BASIC. I then use that to fit a Laplace distribution to the most adorable dataset that I could find: the number of wolf pups per den from a sample of 16 wolf dens. Finally I summarize and plot the result, still using BASIC. So, the target audience of this post is the intersection of people that have programmed in BASIC and are into Bayesian computation. I’m sure you are out there. Let’s go!
There are many, many different versions of BASIC, but I’m going for the one that I grew up with: Microsoft QBasic 1.1. Now, QBasic is a relatively new BASIC dialect that has many advanced features such as user defined types and (gasp!) functions. But I didn’t use any fancy functions back in the 90s, and so I’m going to write old school BASIC using line numbers and GOTO
, which means that the code should be relatively easy to port to, say, Commodore 64 BASIC.
Getting QBasic is easy as it seems to be freeware, and can be downloaded here. Unless you are still using DOS, the next step would be to install the DOSBox emulator. Once you’ve started up QBASIC.EXE
you are greeted with a friendly, bright blue IDE which you can try out by entering this customary hello world script that will clear the screen and print “HELLO WORLD”.
Note that in QBasic the line numbers are not strictly necessary, but in older BASICs (like the one for Commodore 64) they were necessary as the program was executed in the order given by the line numbers.
We are going to implement the Metropolis-Hastings algorithm, a classic Markov chain Monte Carlo (MCMC) algorithm which is one of the first you’ll encounter if you study computational methods for fitting Bayesian models. This post won’t explain the actual algorithm, but the Wikipedia article is an ok introduction.
The Bayesian model we are going to implement is a simple univariate Laplace distribution (just to, once in a while, give the Normal distribution a day off). The Laplace is similar to the Normal distribution in that it is continuous and symmetric, but it is more peaked and has heavier tails. It has two parameters: a location $\mu$, which also defines its mean and median, and a scale $b$ which defines the width of the distribution.
Image source, Credits: IkamusumeFan
Just as the sample mean is the maximum likelihood estimator for the mean of a Normal distribution, the sample median is the maximum likelihood estimator for the location parameter $\mu$ of a Laplace distribution. That’s why I, somewhat sloppily, think of the Normal distribution as the “mean” distribution and of the Laplace distribution as the “median” distribution. To turn this into a fully Bayesian model we need prior distributions over the two parameters. Here I’m just going to be sloppy and use a $\text{Uniform}(-\infty, \infty)$ over both $\mu$ and $\log(b)$, that is, $\text{P}(\mu), \text{P}(\log(b)) \propto 1$. The full model is then
$$x \sim \text{Laplace}(\mu, b) \\ \mu \sim \text{Uniform}(-\infty, \infty) \\ \log(b) \sim \text{Uniform}(-\infty, \infty)$$
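Written out, the log density that the likelihood will sum over is $\log f(x \mid \mu, b) = -\log(2b) - |x - \mu|/b$. As a quick sketch (in JavaScript rather than BASIC, just for readability):

```javascript
// Log density of the Laplace distribution with location mu and scale b:
// log f(x) = -log(2 * b) - |x - mu| / b
function laplaceLogDensity(x, mu, b) {
  return -Math.log(2 * b) - Math.abs(x - mu) / b;
}
```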
The data we are going to use is probably the cutest dataset I’ve worked with so far. It consists of counts of the number of wolf pups in a sample of 16 wolf dens (source):
Image Source, Credits: spacebirdy / CC-BY-SA-3.0
Before delving into BASIC, here is a reference implementation in R of what we hope to achieve using BASIC:
# The wolf pups dataset
x <- c(5, 8, 7, 5, 3, 4, 3, 9, 5, 8, 5, 6, 5, 6, 4, 7)

# The log posterior density of the Laplace distribution model, when assuming
# uniform/flat priors. The Laplace distribution is not part of base R but is
# available in the VGAM package.
model <- function(pars) {
  sum(VGAM::dlaplace(x, pars[1], exp(pars[2]), log = TRUE))
}

# The Metropolis-Hastings algorithm using a Uniform(-0.5, 0.5) proposal distribution
metrop <- function(n_samples, model, inits) {
  samples <- matrix(NA, nrow = n_samples, ncol = length(inits))
  samples[1,] <- inits
  for(i in 2:n_samples) {
    curr_log_dens <- model(samples[i - 1, ])
    proposal <- samples[i - 1, ] + runif(length(inits), -0.5, 0.5)
    proposal_log_dens <- model(proposal)
    if(runif(1) < exp(proposal_log_dens - curr_log_dens)) {
      samples[i, ] <- proposal
    } else {
      samples[i, ] <- samples[i - 1, ]
    }
  }
  samples
}
samples <- metrop(n_samples = 1000, model, inits = c(0,0))
# Plotting a traceplot
plot(samples[,1], type = "l", ylab = expression(Location ~ mu), col = "blue")
# Calculating median posterior and 95% CI discarding the first 250 draws as "burnin".
quantile(samples[250:1000,1], c(0.025, 0.5, 0.975))
## 2.5% 50% 97.5%
## 4.489 5.184 6.144
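As an aside (mine, not part of the original post): before going old school, the same random-walk Metropolis scheme ports almost line for line to JavaScript:

```javascript
// Random-walk Metropolis for the Laplace model with flat priors,
// mirroring the R reference implementation above.
var pups = [5, 8, 7, 5, 3, 4, 3, 9, 5, 8, 5, 6, 5, 6, 4, 7];

// Log posterior: sum of Laplace log densities; the parameter vector is
// [location mu, log scale log(b)].
function logPost(pars) {
  var mu = pars[0];
  var b = Math.exp(pars[1]);
  var total = 0;
  for (var i = 0; i < pups.length; i++) {
    total += -Math.log(2 * b) - Math.abs(pups[i] - mu) / b;
  }
  return total;
}

// Metropolis-Hastings with a Uniform(-0.5, 0.5) proposal per parameter
function metrop(nSamples, logPost, inits) {
  var samples = [inits.slice()];
  for (var i = 1; i < nSamples; i++) {
    var curr = samples[i - 1];
    var prop = curr.map(function(p) { return p + Math.random() - 0.5; });
    if (Math.log(Math.random()) < logPost(prop) - logPost(curr)) {
      samples.push(prop);
    } else {
      samples.push(curr.slice());
    }
  }
  return samples;
}
```

Discarding some initial draws as burn-in and taking the median of the remaining mu draws should land close to the R estimate above (around 5).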
(For the love of Markov, do not actually use this R script as a reference; it is just there to show what the BASIC code will implement. If you want to do MCMC using Metropolis-Hastings in R, check out the MCMCmetrop1R function in the MCMCpack package, or this Metropolis-Hastings script.)
R is sometimes called “a quirky language”, but the script above is a wonder of clarity and brevity compared to how the BASIC code is going to look…
Let’s start by clearing the screen (CLS
) and defining the variables we need. Here DIM
defines arrays and matrices.
Already in this short piece of code there is a lot of fun quirkiness going on:

- Note the line numbers (like the 75 above). As there are no functions in old school BASIC, the line numbers are used to jump around in the program using statements like GOSUB and GOTO.
- What’s with the exclamation mark in SAMPLES!? In QBasic all variables are integers by default, and to change the type you append a symbol to the variable name. So THIS$ is a string while THAT! is a floating point number. Since our parameters are continuous, most of our variables will end with a !.
- The DATA command is a relatively nice way to put data into the program, which can later be retrieved sequentially with the READ function. If you have more than one dataset? Tough luck…

Let’s continue with defining the model. It’s going to be written as a subroutine, which is an isolated piece of code that ends with a RETURN statement. It can then be called with a GOSUB <line> statement, which jumps to <line>, executes the code up to the RETURN statement, and then jumps back to the line after the GOSUB. It is a little bit like a poor man’s function, as you can’t pass in any arguments or return any values. But that’s not a problem, since all variable names are global anyway: your subroutine can still access and set any variables, as long as you are careful not to use the same variable name for two different things. The subroutine below assumes that the vector PARAMS! is set and will overwrite the LOGDENS! variable. The subroutine also uses READ X! to read in a value from the DATA and RESTORE to reset the READ to the first data point.
Here is now the Metropolis-Hastings algorithm. Nothing new here, except that we now use GOSUB 520 to jump into the model subroutine at line 520, and that we use the RND statement to generate uniformly distributed random numbers between 0.0 and 1.0.
Now we could piece it all together by putting the following code directly after the initial declarations:
This will fill up SAMPLES! with 1000 MCMC draws from the posterior distribution of the location ($\mu$) and the log scale ($\log(b)$). But that won’t make us any happier; we still need to summarize and plot the posterior. The code below takes the SAMPLES! matrix and calculates the mean and SD of the posterior, and a 95% credible interval. (I usually prefer the median posterior and a quantile interval, but that would require much more coding, as there is no built-in method for sorting arrays in BASIC.)
Some more BASIC strangeness here is that PRINT "HELLO", "YOU!" would print HELLO YOU! while PRINT "HELLO"; "YOU!" would print HELLOYOU!. That is, whether the separator is , or ; decides whether there will be space between the printed items: a , jumps to the next print zone, while a ; prints the items right next to each other.
Finally, we would like to draw a traceplot to make sure that the Markov chain looks OK. While there is no built-in function for sorting, there is great graphics support! The bulk of the code below is mostly just rescaling the sample coordinates so that they fit the screen resolution (320 x 200). Notice also the fun syntax for drawing lines: LINE (<x1>, <y1>)-(<x2>, <y2>). That many BASIC statements feature “custom” syntax like this might be a symptom of LINE, PRINT, IF, etc. being statements and not functions.
And finally we might want to write the samples to disk:
Now we are finally ready to piece everything together, sampling and plotting!
The last GOTO
just jumps to the last line of the program which makes the program exit. If we set DOSBox to emulate the speed of a 33 MHz 386 PC (according to this page), it takes about a minute to generate the 1000 samples and summarize the result. Here is the output:
As drawing the lines of the traceplot is really slow we get the animation above “for free”. :)
So what about the wolf pups? Well, this exercise wasn’t really about data analysis, but looking at the posterior summary above, a good guess is that the median number of wolf pups in a den is around 4.5 to 6.0. I should also point out that both the BASIC code and the “reference” R implementation result in similar posteriors:
Here is the full program in text format (MCMCDEMO.BAS), or, if you prefer, as a screenshot:
All in all, it seems like I could have done pretty advanced Bayesian statistics using the computational tools I had access to 20 years ago! (Of course, one could easily have implemented a similar program in C or FORTRAN much, much earlier…) If this post has inspired you, and you wish to use BASIC as your main tool for scientific computing (please don’t!), you’ll find a host of QBasic tutorials here. You can also dive directly into the wonderful world of BASIC using this excellent JavaScript emulation of Apple II BASIC.
Some reflections on revisiting BASIC:

- Using GOTO and GOSUB to structure a program just feels very old school. (QBasic actually supports block IF ... END IFs and NEXT Is, so you don’t have to program this way.)
- In old school BASIC only one statement could follow an IF-statement, and there was no ELSE! So if you wanted to execute many commands as the result of an IF statement you would have to fake it with GOTO statements instead.
- The QBasic editor reformats each line as you type it, for example turning print 5*5+ 7 into PRINT 5 * 5 + 7.