Imagine if there were a simple single statistical measure everybody could use with any set of data and it would reliably separate true from false. Oh, the things we would know!

That is unrealistic, of course.

Yet statistical significance is commonly treated as though it were a magic wand. Take a null hypothesis or look for any association between factors in a data set and abracadabra! Get a “p value” over or under 0.05 and you can be 95% certain it’s either a fluke or it isn’t. You can eliminate chance! You can separate the signal from the noise! You can declare that Republicans are anti-science and Democrats have prettier daughters!

Except you can’t. In physics, we call it convergence: the longer you spend on an analysis, the smaller the margin of error gets, but you may be converging on a terrifically precise, completely wrong answer. And if you run the analysis again and get the same answer, that does not make it “replicated” and therefore true, not if you solved the wrong problem.
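To make the precision-versus-accuracy point concrete, here is a minimal sketch (the numbers and the bias are made up purely for illustration, not taken from any study): a simulated measurement with a hypothetical systematic error keeps shrinking its margin of error as the sample grows, while settling on the wrong value.

```python
# Illustrative only: estimate a quantity whose true value is 10.0 with an
# instrument that has an assumed systematic bias of +0.5. More data shrinks
# the margin of error, but the estimate converges on the biased value.
import random

random.seed(0)
TRUE_VALUE = 10.0
BIAS = 0.5  # hypothetical systematic error, for illustration

for n in (10, 100, 10_000):
    measurements = [TRUE_VALUE + BIAS + random.gauss(0, 1.0) for _ in range(n)]
    mean = sum(measurements) / n
    std = (sum((x - mean) ** 2 for x in measurements) / (n - 1)) ** 0.5
    stderr = std / n ** 0.5
    print(f"n={n:>6}  estimate={mean:.3f}  margin of error (1 SE)={stderr:.3f}")
```

The standard error falls steadily with n, yet every run lands near 10.5 rather than 10.0: very precise, still wrong.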

And that can spell doom in science too. Statistical significance only tells you how surprising your result would be if nothing but chance were at work in the same circumstances; it says nothing about whether you measured the right thing or asked the right question.
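A quick simulation shows why the 0.05 threshold does not eliminate chance (this is a sketch assuming NumPy and SciPy, not anything from Bastian’s article): compare two groups drawn from the very same population over and over, and about 5% of the comparisons will still come out “statistically significant.”

```python
# Run many "experiments" in which two groups come from the same population,
# so any observed difference is pure chance. Count how often a two-sample
# t-test still reports p < 0.05.
import numpy as np
from scipy import stats

rng = np.random.default_rng(42)
n_experiments = 10_000
false_positives = 0

for _ in range(n_experiments):
    a = rng.normal(0, 1, size=30)  # group A: no real effect
    b = rng.normal(0, 1, size=30)  # group B: same population as A
    _, p = stats.ttest_ind(a, b)
    if p < 0.05:
        false_positives += 1

print(f"'Significant' results with no real effect: {false_positives / n_experiments:.1%}")
```

Roughly one experiment in twenty clears the bar even though there is nothing to find.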

Writing at Scientific American blogs, Hilda Bastian makes two key points. First, avoid over-precision: take confidence intervals or standard deviations into account, because when you have the data for a confidence interval you get a far better picture than a bare p value can possibly provide. Second, don’t consider the information from one study in isolation; one study on its own is rarely going to provide “the” answer.
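To illustrate the first point, here is a small sketch with made-up summary numbers (these are hypothetical studies, not results from the article): two studies can both be “statistically significant,” yet their 95% confidence intervals tell very different stories about how large and how precisely estimated the effect is.

```python
# Two hypothetical studies with the same effect estimate but different
# precision. Both clear p < 0.05, but the confidence intervals differ a lot.
def ci_95(effect, stderr):
    """95% confidence interval using the normal approximation."""
    half_width = 1.96 * stderr
    return effect - half_width, effect + half_width

studies = {
    "small, noisy study": (5.0, 2.4),    # (effect estimate, standard error), illustrative
    "large, precise study": (5.0, 0.8),
}

for name, (effect, se) in studies.items():
    lo, hi = ci_95(effect, se)
    z = effect / se
    print(f"{name}: effect = {effect}, 95% CI = ({lo:.1f}, {hi:.1f}), z = {z:.1f}")
```

The noisy study’s interval runs from barely above zero to nearly double the estimate; the precise one pins the effect down tightly. A lone p value hides that difference entirely.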

Statistical significance and its part in science downfalls, by Hilda Bastian, Scientific American