Correlation does not imply causation, right? But, as Edward Tufte writes, “it sure is a hint.” The Pearson product-moment correlation coefficient is perhaps one of the most common ways of looking for such hints, and this post describes the Bayesian First Aid alternative to the classical Pearson correlation test. Besides being based on Bayesian estimation (a good thing in my book), this alternative is more robust to outliers and comes with a pretty nice default plot. :)
A Hack to Create Matrices in R, Matlab style
The Matlab syntax for creating matrices is pretty and convenient. Here is a 2x3 matrix in Matlab syntax where , marks a new column and ; marks a new row:
Here is how to create the corresponding matrix in R:
Functional but not as pretty, plus the default is to specify the values column-wise. A better solution is to use rbind:
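As a rough illustration of the point above, the sketch below builds the same 2x3 matrix three ways. The values 1 to 6 are an assumption chosen for illustration (in Matlab syntax this would be [1, 2, 3; 4, 5, 6]); they are not taken from the original post.

```r
# Illustrative values only: the 2x3 matrix with rows (1, 2, 3) and (4, 5, 6).

# matrix() fills column by column by default, so the values have to be
# given in column order to end up with the intended rows.
m_colwise <- matrix(c(1, 4, 2, 5, 3, 6), nrow = 2, ncol = 3)

# byrow = TRUE lets the values be written in reading order instead.
m_byrow <- matrix(c(1, 2, 3, 4, 5, 6), nrow = 2, ncol = 3, byrow = TRUE)

# rbind() stacks one vector per row, which reads closest to the Matlab layout.
m_rbind <- rbind(c(1, 2, 3),
                 c(4, 5, 6))

print(m_rbind)
```

All three objects hold the same matrix; the rbind version simply keeps each row on its own line of code.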
Oldies but Goldies: Statistical Graphics Books
I just wanted to plug three classic books on statistical graphics that I really enjoyed reading. The books are old (that is, older than me) but still relevant, and together they give a sense of the development of exploratory graphics in general and the graphics system in R specifically, as all three books were written at Bell Labs where the S language was developed. What follows is not a review but just me highlighting some things that I liked about these books. So, without further ado, here they are:
 Exploratory Data Analysis by John W. Tukey (1977)
 Graphical Methods for Data Analysis by John M. Chambers, William S. Cleveland, Beat Kleiner and Paul A. Tukey (1983)
 The Elements of Graphing Data by William S. Cleveland (1985)
Bayesian First Aid: Two Sample t-test
As spring follows winter once more here down in southern Sweden, the two sample t-test follows the one sample t-test. This is a continuation of the Bayesian First Aid alternative to the one sample t-test where I’ll introduce the two sample alternative. It will be quite a short post as the two sample alternative is just more of the one sample alternative, more of using John K. Kruschke’s BEST model, and more of the coffee yield data from the 2002 Nature article The Value of Bees to the Coffee Harvest.
A Significantly Improved Significance Test. Not!
It is my great pleasure to share with you a breakthrough in statistical computing. There are many statistical tests: the t-test, the chi-squared test, the ANOVA, etc. I here present a new test, a test that answers the question researchers are most anxious to figure out, a test of significance, the significance test. While a test like the two sample t-test tests the null hypothesis that the means of two populations are equal, the significance test does not tiptoe around the canoe. It jumps right in, paddle in hand, and directly tests whether a result is significant or not.
The significance test has been implemented in R as signif.test and is ready to be sourced and run. While other statistical procedures bombard you with useless information such as parameter estimates and confidence intervals, signif.test only reports what truly matters, the one value, the p-value.
For your convenience signif.test can be called exactly like t.test and will return the same p-value in order to facilitate p-value comparison with already published studies. Let me show you how signif.test works through a couple of examples using a dataset from the RANDOM.ORG database:
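The original examples are not reproduced here. As a stand-in, here is a minimal sketch of how a function with the described behavior could look: it takes the same arguments as t.test and reports nothing but the p-value. The name signif_test_sketch and the simulated data are assumptions made for illustration; this is not the actual signif.test source.

```r
# Hypothetical stand-in for signif.test, NOT the original implementation.
# It accepts the same arguments as t.test() and keeps only the p-value.
signif_test_sketch <- function(...) {
  result <- stats::t.test(...)
  cat("p-value:", format.pval(result$p.value, digits = 3), "\n")
  invisible(result$p.value)
}

# Stand-in data: simulated values rather than the dataset used in the post.
set.seed(42)
x <- rnorm(20, mean = 0)
y <- rnorm(20, mean = 0.5)

p_sketch <- signif_test_sketch(x, y)  # prints only the p-value
p_ttest <- t.test(x, y)$p.value       # the same number, from the full test
```

Because the sketch just forwards its arguments, the p-value is identical to the one t.test reports; everything else t.test computes is thrown away.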
Bayesian Mugs Galore!
Having no personal mug at the department I recently created a Bayesian themed one with the message “Make the Puppies Happy. Do Bayesian Data Analysis.” This is of course a homage to the cover of John K. Kruschke’s extraordinary book Doing Bayesian Data Analysis. I also ordered some extra copies of the mug and posted them to some Bayesian “heroes” of mine, and yesterday I got a mug back from Christian Robert (!) and an awesome one too! Here they are together with a not so interested cat (no treats in the mugs…)
Bayesian First Aid: One Sample and Paired Samples t-test
Student’s t-test is a staple of statistical analysis. A quick search on Google Scholar for “t-test” results in 170,000 hits in 2013 alone. In comparison, “Bayesian” gives 130,000 hits while “box plot” results in only 12,500 hits. To be honest, if I had to choose I would most of the time prefer a notched boxplot to a t-test. The t-test comes in many flavors: one sample, two sample, paired samples and Welch’s. We’ll start with the two simplest; here follow the Bayesian First Aid alternatives to the one sample t-test and the paired samples t-test.
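For reference, the classical flavors listed above can all be run through R's t.test function. The sketch below uses simulated before/after measurements (made-up data, not from the post) and also shows why the one sample and paired samples tests belong together: a paired samples t-test is a one sample t-test on the pairwise differences.

```r
# Simulated before/after measurements for the same 15 subjects (made-up data).
set.seed(1)
before <- rnorm(15, mean = 10, sd = 2)
after <- before + rnorm(15, mean = 0.8, sd = 1)

# One sample: is the mean of `before` different from 9?
one_sample <- t.test(before, mu = 9)

# Two sample, Welch's version (R's default, no equal-variance assumption).
welch <- t.test(after, before)

# Two sample, classical version assuming equal variances.
pooled <- t.test(after, before, var.equal = TRUE)

# Paired samples: each `after` value is matched with its `before` value.
paired <- t.test(after, before, paired = TRUE)

# The paired test is a one sample test on the differences.
on_diffs <- t.test(after - before, mu = 0)
```

The paired and on_diffs objects carry the same t statistic, degrees of freedom and p-value.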
Announcing pingr: The R Package that Sounds as it is Called
pingr is an R package that contains one function, ping(), with one purpose: To go ping on whatever platform you are on (thanks to the audio package). It is intended to be useful, for example, if you are running a long analysis in the background and want to know when it is ready. It’s also useful if you want to irritate colleagues. You could, for example, use ping() to get notified when your package updates have finished:
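The same pattern works for any long-running task. Below is a minimal sketch; notify_when_done, default_notifier and long_task are hypothetical names made up for this example and are not part of pingr. It calls pingr::ping() only if the package happens to be installed and otherwise falls back to the terminal bell.

```r
# Hypothetical helper, not part of pingr: run a task, then call a notifier.
notify_when_done <- function(task, notifier = default_notifier) {
  result <- task()
  notifier()
  invisible(result)
}

# Use pingr::ping() when the package is installed, else the terminal bell.
default_notifier <- function() {
  if (requireNamespace("pingr", quietly = TRUE)) {
    pingr::ping()
  } else {
    cat("\a")
  }
}

# Stand-in for a long-running analysis.
long_task <- function() {
  Sys.sleep(0.1)
  sum(1:100)
}

total <- notify_when_done(long_task)
```

Passing the notifier as an argument keeps the helper usable on machines without sound, since any function can be swapped in.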
Bayesian First Aid: Binomial Test
The binomial test is arguably the conceptually simplest of all statistical tests: It has only one parameter and an easy to understand distribution for the data. When introducing null hypothesis significance testing it is puzzling that the binomial test is not the first example of a test but sometimes is introduced long after the t-test and the ANOVA (as here) and sometimes is not introduced at all (as here and here). When introducing a new concept, why not start with the simplest example? It is not like there is a problem with students understanding the concept of null hypothesis significance testing too well. I’m not making the same mistake, so here follows the Bayesian First Aid alternative to the binomial test!
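To illustrate how little machinery the classical binomial test needs, the sketch below runs binom.test on made-up data (7 successes in 10 trials, an assumption for illustration) and then reproduces its two-sided p-value by hand from the binomial distribution.

```r
# Made-up data: 7 successes in 10 trials, tested against theta = 0.5.
successes <- 7
trials <- 10
theta_null <- 0.5

classical <- binom.test(successes, trials, p = theta_null)

# Two-sided p-value by hand: add up the null probabilities of every outcome
# that is no more likely than the observed one. The small tolerance guards
# against floating point ties, as binom.test() does internally.
outcome_probs <- dbinom(0:trials, size = trials, prob = theta_null)
observed_prob <- dbinom(successes, size = trials, prob = theta_null)
p_by_hand <- sum(outcome_probs[outcome_probs <= observed_prob * (1 + 1e-7)])
```

The whole test comes down to one parameter (the success probability) and one distribution (the binomial), which is what makes it such a natural first example.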
Bayesian First Aid
So I have a secret project. Come closer. I’m developing an R package that implements Bayesian alternatives to the most commonly used statistical tests. Yes you heard me, soon your t.testing days might be over! The package aims at being as easy as possible to pick up and use, especially if you are already used to the classical .test functions. The main gimmick is that the Bayesian alternatives will have the same calling semantics as the corresponding classical test functions save for the addition of bayes. to the beginning of the function name. Going from a classical test to the Bayesian version will be as easy as going from t.test(x1, x2, paired=T) to bayes.t.test(x1, x2, paired=T).
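A minimal sketch of that switch, using made-up paired data. The classical call runs in any R installation; the Bayesian call is guarded because it assumes the package is installed under the namespace BayesianFirstAid together with JAGS, which is an assumption on my part and not something a stock R setup provides.

```r
# Made-up paired measurements, for illustration only.
set.seed(123)
x1 <- rnorm(12, mean = 5)
x2 <- x1 + rnorm(12, mean = 0.5, sd = 0.5)

# Classical test.
classical_fit <- t.test(x1, x2, paired = TRUE)
print(classical_fit)

# Bayesian alternative: same arguments, "bayes." prefixed to the function name.
# Assumption: only runs if the BayesianFirstAid package (and JAGS) is installed.
if (requireNamespace("BayesianFirstAid", quietly = TRUE)) {
  bayes_fit <- BayesianFirstAid::bayes.t.test(x1, x2, paired = TRUE)
  print(bayes_fit)
}
```

Nothing about the data or the arguments changes between the two calls, which is the whole point of the shared calling semantics.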
The package does not aim at being some general framework for Bayesian inference or a comprehensive collection of Bayesian models. This package should be seen more as a quick fix; a first aid for people who want to try out the Bayesian alternative. That is why I call the package Bayesian First Aid.