Correlation, Causation, Independence
    By Tommaso Dorigo | December 13th 2012 03:24 AM | 9 comments
    About Tommaso

    I am an experimental particle physicist working with the CMS experiment at CERN. In my spare time I play chess, abuse the piano, and aim my dobson...

    This is a post about basics, because I think a point needs to be made that is surprisingly less well known than its elementary nature would suggest.

    Correlation (in its most used version, due to Pearson) is a measure of how closely two quantities are observed to follow a linear dependence on one another. It is a very common quantity to report in the results of scientific studies, particularly but not exclusively in the social sciences. Researchers try to demonstrate the presence of a correlation between two phenomena as a preliminary step to investigating whether one can be the cause of the other.
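
    For reference, the sample version of Pearson's coefficient for n paired observations (x_i, y_i) can be written as below. The symbols r_xy, x-bar and y-bar are notation chosen for this sketch, not taken from the post. The coefficient ranges from -1 to +1 and reaches those extremes only when the points lie exactly on a straight line.

```latex
r_{xy} = \frac{\sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y})}
              {\sqrt{\sum_{i=1}^{n} (x_i - \bar{x})^2}\,
               \sqrt{\sum_{i=1}^{n} (y_i - \bar{y})^2}},
\qquad
\bar{x} = \frac{1}{n}\sum_{i=1}^{n} x_i,
\quad
\bar{y} = \frac{1}{n}\sum_{i=1}^{n} y_i
```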

    There is of course nothing wrong in measuring correlation. The problem arises when interpreting the results. If I see a tight correlation between chocolate consumption in a country and its rate of Nobel prize recipients, should I conclude that eating chocolate makes one smarter? Or should I rather conclude that winning a Nobel prize makes one eat more chocolate?

    Puns aside, the distinction between correlation and causation should be clear to anybody reading this blog. For instance, those arguing that vaccines cause autism on the basis of vague correlation measurements should have a look at the graph above (courtesy of Hank's Facebook account), which would have them conclude that organic foods are rather the cause of autism! (But others might conclude that it is parents of autistic children who buy all the organic foods...)

    So, the point about correlation and causation is clear. But there is another point to make which I think is not always clear to everybody. The absence of correlation between two variables is a much weaker condition than their independence. We often use "uncorrelated" as a synonym and substitute for "independent", but this is completely wrong from a mathematical standpoint! Two uncorrelated variables may in fact be completely dependent on one another!
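
    The two conditions can be put side by side in standard textbook notation (the symbols p, E and Cov are introduced here and do not appear in the post). Independence constrains the whole joint distribution, while zero correlation constrains a single number computed from it. Independence implies zero covariance whenever the moments exist, but the converse does not hold.

```latex
\begin{align*}
\text{independent:}  \quad & p(x, y) = p(x)\,p(y) \quad \text{for all } x, y \\
\text{uncorrelated:} \quad & \operatorname{Cov}(X, Y) = E[XY] - E[X]\,E[Y] = 0
\end{align*}
```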

    Wikipedia has a nice figure to illustrate the point. It is shown below.

    As you can see from the bottom set of graphs, you can have many different interdependence patterns between two variables with a zero correlation coefficient. But what the graph does not show is that you can even have an exact functional dependence between two variables (meaning, in the case of organic foods and autism, that if you told me the sales in $ of organic foods I could tell you exactly how many cases of autism are diagnosed that year) and still get a zero correlation coefficient!

    Such is for instance the case of y=x^2, when x is drawn from a distribution symmetric around zero, e.g. uniformly in [-1,1]. This is a perfect parabolic relationship, and sets of points drawn at random from the curve will have a correlation coefficient compatible with zero (in the ensemble sense that you will find p% of sets compatible with zero correlation at confidence level p%).
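
    The parabola example can be checked numerically. What follows is a minimal sketch, not a definitive implementation. It assumes x is sampled uniformly from [-1,1] (the post only gives the interval), uses a hand-written helper called pearson_r (a hypothetical name, not a library function), and fixes an arbitrary random seed. The analytic reason for the result is that Cov(x, x^2) = E[x^3] - E[x]E[x^2], and both E[x] and E[x^3] vanish for any distribution symmetric about zero. For contrast, the sketch also samples x from [0,1], where the symmetry is lost and the same parabola yields a large coefficient.

```python
import math
import random


def pearson_r(xs, ys):
    """Sample Pearson product-moment correlation coefficient."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


rng = random.Random(2012)  # arbitrary fixed seed, only for reproducibility
n_points = 100_000         # arbitrary sample size (assumption)

# Symmetric case: x uniform in [-1, 1], y an exact function of x.
xs = [rng.uniform(-1.0, 1.0) for _ in range(n_points)]
r_parabola = pearson_r(xs, [x * x for x in xs])

# Sanity check: an exactly linear relation should give r = 1.
r_line = pearson_r(xs, [2.0 * x + 1.0 for x in xs])

# Asymmetric case: x uniform in [0, 1], same parabola.
xs_half = [rng.uniform(0.0, 1.0) for _ in range(n_points)]
r_half = pearson_r(xs_half, [x * x for x in xs_half])

print(f"r(x, x^2),    x in [-1, 1]: {r_parabola:+.4f}")
print(f"r(x, 2x + 1), x in [-1, 1]: {r_line:+.4f}")
print(f"r(x, x^2),    x in [0, 1]:  {r_half:+.4f}")
```

    In every case y is fully determined by x. Only the first coefficient comes out compatible with zero, which is exactly why a vanishing correlation says nothing about independence.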

    This means that, while you must be careful about concluding that there is some cause-effect relationship between two observable quantities based on their correlation, you must be even more careful when attempting to conclude that there is independence of the two from the absence of a significant correlation between them!

    Please remember this often overlooked fact!


    I must admit that after reading this post I feel as ignorant about this issue as before. And I am a professional mathematician (not a statistician, though)! I only felt a little less ignorant after checking on Wikipedia what the "product-moment correlation coefficient" is. You should probably make this point clear if you want to reach more readers than mere physicists, and, in my opinion, also if you would like them to reach an "a-ha!" at the end of the sentence "you can even have exact functional dependence between two variables [...] and still get a zero correlation coefficient!"

    The difference between the correlation of autism with sales of organic foods and its correlation with the injection of children with organic mercury, both dating back to 1988, is that there are over 170 published, peer-reviewed papers listed on PubMed linking mercury-preserved vaccines to autism but none linking organic food to autism.

    Gerhard Adam
    Yet, if that were true, then one would expect to see a decline in autism, since the organic mercury was banned and hasn't been used in childhood/infant vaccines for over a decade.
    Mundus vult decipi
    I'd love to see that list of 170 peer-reviewed studies linking autism and vaccines, since that pretty much overturns 100% of epidemiology and opens the door for a trillion dollars in lawsuits.
    There won’t be any lawsuits. A provision in the 2005 Senate Bill S-3 called the "Protecting America in the War on Terror Act," effectively insulates the pharmaceutical industry from liability for thimerosal poisoning. This replaced a piggyback on the 2002 Homeland Security bill that freed drug companies of liability in lawsuits regarding thimerosal, which was repealed in 2003.

    Organic mercury has been replaced in early-childhood vaccines by aluminum compounds that are also known to be neurotoxic. The first law of toxicology is “The poison is in the dose”. The total dose of aluminum that two-month olds receive from vaccines is around 50 times the recommended maximum dose, just as total mercury dose in vaccines was tens of times the then-recommended maximum exposure for a child, a large number of shots being delivered in rapid succession. These doses are potentially harmful and people are right to be concerned. Thimerosal is still present in flu vaccines. In 2009/10 when pregnant women were given the swine-flu shot on top of the seasonal flu shot there was a coinciding rise in miscarriages that amounts to around a 10-fold increase in risk. Once again, a possible mercury overdose is in the picture.

    Gerhard Adam
    ...aluminum compounds that are also known to be neurotoxic.
    Please provide a link that supports such an assertion [actual research and not opinion].
    ...dose of aluminum that two-month olds receive from vaccines is around 50 times the recommended maximum dose
    Again, provide a link. 
    ...when pregnant women were given the swine-flu shot on top of the seasonal flu shot there was a coinciding rise in miscarriages that amounts to around a 10-fold increase in risk.
    Again, a link.

    You're making claims that are not backed up by the data.  So, if you're going to make such assertions then provide the links to scientific papers that support your view.  If you're only going to link to sites that already support your view [again without evidence], then it's simply propaganda and not worthy of consideration.

    ... and please ... no Mercola links.
    Mundus vult decipi
    On the neurotoxicity of aluminum:

    On aluminum and autism:

    On the safety of aluminum adjuvants and total aluminum exposure from early childhood vaccines:

    And on a possible link between miscarriage and the flu vaccines given in 2009/10:

    You'll notice these aren't mercola links.

    For what it's worth, whilst I'm willing to accept that there may very well be a link between vaccines and autism I don't believe that this is a particularly significant contributor to the increase in autism diagnoses. There are two reasons for this. Firstly, compared to the total number of vaccinations there are relatively few families claiming that their children have been injured by the vaccinations, so there would have to be huge under-reporting for vaccination to be the main contributor (which I think in part proves Tommaso's original point). Secondly, I've spent a fair bit of time teaching over the last decade or so and have witnessed a very significant increase in efforts to get special educational needs diagnosed early and reliably, which seem to be working. Despite this, I would hesitate to get an infant vaccinated as I'm satisfied that the risk is likely real, and stand by my right to do so.

    Did you create the organic-autism graph above? I have been trying to track down the author so I can credit it when I present it. I think it is an excellent example of the folly of throwing around correlations to try to suggest causation. We see it with autism-vaccine claims, as well as health effects of GE crops.

    I first saw it on the Skeptical Libertarian Facebook page but there was no article, just the graph. You might ask them.