Publishable Stuff

Rasmus Bååth's Blog

Hello stranger, and welcome! 👋😊
I'm Rasmus Bååth, data scientist, engineering manager, father, husband, tinkerer, tweaker, coffee brewer, tea steeper, and, occasionally, publisher of stuff I find interesting down below👇

A Fun Gastronomical Dataset: What’s on the Menu?


I just found a fun food-themed dataset that I’d never heard about and that I thought I’d share. It’s from a project called What’s on the menu where the New York Public Library has crowdsourced a digitization of their collection of historical restaurant menus. The collection stretches all the way back to the 19th century and well into the 1990s, and on the home page it is stated that there are “1,332,271 dishes transcribed from 17,545 menus”. Here is one of those menus, from a turn of the (old) century Chinese-American restaurant:

The data is freely available in CSV format (yay!) and here I’ll just show how to get the data into R and use it to plot the popularity of some foods over time.

Read on →

How I made some Pokémon Business Cards


As I’m in the industry now, I figured I needed some business cards, and as it seems the 90s never left us and Japanese monsters are hip again, I decided to make them Pokémon themed.

I think they turned out pretty well, and here I’m just going to give some pointers on how I did them.

Read on →

Bayesian Bootstrap: The Movie + Some Highlights from UseR! 2016


Not surprisingly, this year’s UseR! conference was a great event with heaps of talented researchers and R developers showing off the latest and greatest R packages. (A surprise visit from Donald Knuth didn’t hurt either.) What was extra great this year was that all talks were recorded, including mine. So if you want to know more about how the non-parametric bootstrap is really a Bayesian procedure, and how you can run the Bayesian bootstrap in R using my bayesboot package, just press play. :)

Read on →

How to Cut Your Planks with R


Today I’m extraordinarily pleased because today I solved an actual real-world problem using R. Sure, I’ve solved many esoteric statistical problems with R, but I’m not sure if any of those solutions have escaped the digital world and made some impact ex silico.

It is now summer, and in Sweden that means that many people overhaul and rebuild their wooden decks, as you need somewhere to sit during those precious few weeks of +20 °C (70 °F) weather. And so we also decided to rebuild our algae-ridden, half-rotten deck, and everything went well until we got to the point where we had to construct the last steps leading into the house. As we had been slightly sloppy when buying planks, we only had five left, and when naïvely measuring out the lengths we needed it seemed that the planks were not long enough. Now the problem was this: Was there some way we could saw the planks into the lengths we needed, or did we have to go all the way to the lumber yard to get more planks?

These were the planks we had (in centimeters):

planks_we_have <- c(120, 137, 220, 420, 480)

Read on →

bayesboot: An R package for doing the Bayesian bootstrap


I recently wrapped up a version of my R function for easy Bayesian bootstrappin’ into the package bayesboot. This package implements a function, also named bayesboot, which performs the Bayesian bootstrap introduced by Rubin in 1981. The Bayesian bootstrap can be seen as a smoother version of the classical non-parametric bootstrap, but I prefer seeing the classical bootstrap as an approximation to the Bayesian bootstrap :)

The implementation in bayesboot can handle both summary statistics that work on a weighted version of the data (such as weighted.mean) and those that work on a resampled data set (like median). As bayesboot just got accepted on CRAN you can install it in the usual way:

Read on →

Posterior Update of Bayes@Lund 2016


For the third year running, Ullrika Sahlin and I arranged Bayes@Lund, a mini-conference bringing together researchers interested in or working with Bayesian methods in and around Sweden. This year we were thrilled to have over 70 attendees, both from near and far, perhaps due to our interesting invited speakers Eric-Jan Wagenmakers and Robert Grant, or perhaps due to the promise of fika (a Swedish word referring to a break involving coffee and/or tea with cake and/or cookies and/or pastries, the more the better). Perhaps it was a combination…

Read on →

bayes.js: A Small Library for Doing MCMC in the Browser


Bayesian data analysis is cool, Markov chain Monte Carlo is the cool technique that makes Bayesian data analysis possible, and wouldn’t it be coolness if you could do all of this in the browser? That was what I thought, at least, and I’ve now made bayes.js: A small JavaScript library that implements an adaptive MCMC sampler and a couple of probability distributions, and that makes it relatively easy to implement simple Bayesian models in JavaScript.

Here is a motivating example: Say that you have the heights of the last ten American presidents…

// The heights of the last ten American presidents in cm, from Kennedy to Obama 
var heights = [183, 192, 182, 183, 177, 185, 188, 188, 182, 185];

… and that you would like to fit a Bayesian model assuming a Normal distribution to this data. Well, you can do that right now by clicking “Start sampling” below! This will run an MCMC sampler in your browser implemented in JavaScript.

If this doesn’t seem to work in your browser, for some reason, then try this version of the demo.
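To give a flavour of what an MCMC sampler for this model does, here is a minimal random-walk Metropolis sketch in plain JavaScript. It is not bayes.js and does not use its API; the priors (flat on the mean and on the log of the standard deviation), the step sizes, the burn-in length and all function names are assumptions made just for this illustration.

```javascript
// Heights of the ten presidents from the post, in cm.
var heights = [183, 192, 182, 183, 177, 185, 188, 188, 182, 185];

// Small seeded PRNG (mulberry32), used only to make runs reproducible.
function makeRng(seed) {
  var a = seed | 0;
  return function () {
    a = (a + 0x6D2B79F5) | 0;
    var t = a;
    t = Math.imul(t ^ (t >>> 15), t | 1);
    t ^= t + Math.imul(t ^ (t >>> 7), t | 61);
    return ((t ^ (t >>> 14)) >>> 0) / 4294967296;
  };
}

// Standard normal draw via the Box-Muller transform.
function rnorm(rng) {
  var u = 1 - rng(); // in (0, 1], so Math.log(u) is finite
  var v = rng();
  return Math.sqrt(-2 * Math.log(u)) * Math.cos(2 * Math.PI * v);
}

// Log posterior of (mu, log sigma) up to a constant: Normal likelihood,
// with ASSUMED flat priors on mu and on log(sigma).
function logPosterior(mu, logSigma, data) {
  var sigma = Math.exp(logSigma);
  var ss = 0;
  for (var i = 0; i < data.length; i++) {
    ss += (data[i] - mu) * (data[i] - mu);
  }
  return -data.length * logSigma - ss / (2 * sigma * sigma);
}

// Random-walk Metropolis over (mu, log sigma).
// Keeps nSamples draws after discarding nBurnin warm-up steps.
function metropolis(data, nSamples, nBurnin, seed) {
  var rng = makeRng(seed);
  var mu = data[0];            // crude starting values
  var logSigma = Math.log(10);
  var current = logPosterior(mu, logSigma, data);
  var draws = { mu: [], sigma: [] };
  for (var i = 0; i < nSamples + nBurnin; i++) {
    // Propose a jump; the step sizes 2 and 0.3 are hand-picked guesses.
    var muProp = mu + 2 * rnorm(rng);
    var logSigmaProp = logSigma + 0.3 * rnorm(rng);
    var proposed = logPosterior(muProp, logSigmaProp, data);
    // Accept with probability min(1, posterior ratio).
    if (Math.log(1 - rng()) < proposed - current) {
      mu = muProp;
      logSigma = logSigmaProp;
      current = proposed;
    }
    if (i >= nBurnin) {
      draws.mu.push(mu);
      draws.sigma.push(Math.exp(logSigma));
    }
  }
  return draws;
}

function mean(xs) {
  var s = 0;
  for (var i = 0; i < xs.length; i++) s += xs[i];
  return s / xs.length;
}

var draws = metropolis(heights, 20000, 2000, 42);
console.log("Posterior mean of mu:   ", mean(draws.mu).toFixed(1));
console.log("Posterior mean of sigma:", mean(draws.sigma).toFixed(1));
```

With these flat priors the posterior mean of mu should land close to the sample mean of 184.5 cm. An adaptive sampler, like the one bayes.js is described as implementing, would tune the step sizes during sampling rather than rely on hand-picked values.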

Read on →

Eight Christmas Gift Ideas for the Statistically Interested


Christmas is soon upon us and here are some gift ideas for your statistically inclined friends (or perhaps for you to put on your own wish list). If you have other suggestions please leave a comment! :)

1. Games of probability

A recently released game where probability takes the main role is Pairs, an easygoing press-your-luck game that can be played in 10 minutes. It uses a custom “triangular” deck of cards (1x1, 2x2, 3x3, …, 10x10) and is a lot of fun to play. Highly recommended!

Another good gift would be a pound of assorted dice together with the seminal Dice Games Properly Explained by Reiner Knizia. While perhaps not a game, a cool gift for someone who already has a pound of dice would be a set of non-transitive Grime dice.

Read on →

A Bayesian Model to Calculate Whether My Wife is Pregnant or Not


On the 21st of February, 2015, my wife had not had her period for 33 days, and as we were trying to conceive, this was good news! An average period is around a month, and if you are a couple trying to go triple, then a missing period is a good sign something is going on. But at 33 days, this was not yet a missing period, just a late one, so how good was the news? Pretty good, really good, or just meh?

To get at this I developed a simple Bayesian model that, given the number of days since your last period and your history of period onsets, calculates the probability that you are going to be pregnant this period cycle. In this post I will describe what data I used, the priors I used, the model assumptions, and how to fit it in R using importance sampling. And finally I show you why the result of the model really didn’t matter in the end. Also I’ll give you a handy script if you want to calculate this for yourself. :)

Read on →

The Map of Romantic Kissing with Leaflet and R


Romantic kissing is a cultural universal, right? Nope! At least not if you are to believe Jankowiak et al. (2015), who surveyed a large number of cultures and found that “sexual-romantic kissing” occurred in far from all of them. For some reason the paper didn’t include a world map with these kissers and non-kissers plotted out. So, with the help of my colleague Andrey Anikin, I’ve now made such a map using R and the excellent leaflet package. Click on the image below to check it out:

Read on →