On Saturday instead I will be at the aula magna of the Mantova University, where in company with Gian Francesco Giudice (a CERN theorist) I will discuss the Higgs boson discovery and the aftermath. That is a more "serious" event and we will be discussing in front of a paying audience. I hear that the event is already sold out, so it will (should) be interesting!
Friday's event is called "Five Sigma". I have 30 minutes to explain to laypeople what "5 sigma" actually means. I have drafted something and I offer it below, in the hope that you can tell me what should be changed and what should definitely be removed, so that everybody can understand the things I try to explain.
Please bear in mind that this is a very first draft. Indeed, I need your input! Also, it is a quick-and-dirty translation from the Italian text I originally drafted, so excuse me for the occasional lapse of syntax.
Abstract: Five standard deviations: the signal of the Higgs boson had to reach this level in order for a discovery to be announced. But what does this mean?
[In a brief introduction, I will describe the LHC, the experimental facilities, and the physics of proton-proton collisions. That should take me five to ten minutes, oh yeah ;-) ]
As if we were looking for the classic needle in the haystack, in order to search for a new particle we need to concentrate on a particular characteristic capable of distinguishing it from background noise. For a needle, this could be its shiny appearance, or its shape or colour. For an elementary particle, the distinctive feature is its mass. If we are able to measure the mass of the particle from the products of its decay (which are what we can actually see in our detectors), we have a powerful means for our search. Backgrounds have mass values randomly distributed, so if histogrammed they will have a flat or flattish shape, without evident deformations; the sought particle will instead give an excess of "events" always at the same value.
(show a graph with a mass distribution and a peaking structure on the blackboard).
This in short is what has been done to discover the Higgs boson last July: identify, amidst trillions of collisions produced by the accelerator, those few thousands that "look like" what we expected to see if a Higgs boson were produced; compute, with the detected signals, the mass to which each event corresponds; and finally, put these data in a graph.
By doing the above one could end up seeing a characteristic "hump" at some point of the graph. Why a hump and not a spike at a very well-defined value? Because our mass measurement suffers from some imprecision: the experimental resolution is not perfect, so we expect a Gaussian-shaped hump, with some non-zero width.
And now, who says we can conclude we have found a new particle? First of all, we need to construct a model of how the background behaves. If we have done a good job, we will find a model which successfully fits our data points everywhere except where the hump is, like in the graph.
(Pass a smooth curve through the data histogram in the graph).
This curve represents our "null hypothesis": if there is no Higgs boson in the data, the events distribute, on average, according to this shape. Note, however, that the data exhibit statistical fluctuations: if for instance in a single point of the histogram (a "bin") we expect to see 100 events according to our "null hypothesis", this is the average value of our expectation, and we will not necessarily count exactly 100 events there: we might see 95, or 112, or 103. The size of these statistical fluctuations is governed by something we call "Poisson statistics", which in short says that in roughly 68% of cases, if you expect 100 events you will count something between 90 and 110. What Poisson statistics tells us is that the "sigma", the standard deviation of 100 events, is equal to 10, the square root of 100.
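For readers who want to check the numbers, here is a minimal Python sketch of the Poisson statement above. The function names are my own, chosen for illustration. It sums the exact Poisson probabilities over the inclusive window from 90 to 110; the result is close to the Gaussian 68%, though not identical, because counts are whole numbers.

```python
import math

def poisson_pmf(k, mean):
    """Probability of counting exactly k events when `mean` are expected."""
    # Work in log space so that large factorials do not overflow.
    return math.exp(k * math.log(mean) - mean - math.lgamma(k + 1))

def prob_in_window(mean, lo, hi):
    """Probability that the count falls in the inclusive window [lo, hi]."""
    return sum(poisson_pmf(k, mean) for k in range(lo, hi + 1))

expected = 100
sigma = math.sqrt(expected)  # Poisson standard deviation: sqrt(100) = 10
lo, hi = int(expected - sigma), int(expected + sigma)

print(f"sigma = {sigma:.0f}")
print(f"P({lo} <= count <= {hi}) = {prob_in_window(expected, lo, hi):.3f}")
```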
As a side note, 68% is a number which corresponds to the area of a Gaussian distribution taken between minus-one-sigma and plus-one-sigma: sigma in fact is the parameter of a Gauss curve which says how wide this curve is. If I widen the Gaussian, I am increasing its sigma, so that between -1 and +1, in sigma units, there is always 68% of the total area of the curve.
(Graph showing different Gauss curves, driving home the point).
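The same point can be checked numerically. The sketch below (plain Python, with helper names I made up for the purpose) integrates Gauss curves of different widths between minus-one-sigma and plus-one-sigma with a simple midpoint rule; the area should come out the same whatever the sigma.

```python
import math

def gaussian_pdf(x, mean, sigma):
    """Height of a normalised Gauss curve at x."""
    norm = sigma * math.sqrt(2.0 * math.pi)
    return math.exp(-0.5 * ((x - mean) / sigma) ** 2) / norm

def area_within_one_sigma(mean, sigma, steps=10000):
    """Midpoint-rule integral of the curve from mean - sigma to mean + sigma."""
    lo = mean - sigma
    width = 2.0 * sigma / steps
    total = sum(gaussian_pdf(lo + (i + 0.5) * width, mean, sigma)
                for i in range(steps))
    return total * width

# Narrow, medium and wide curves: the fraction inside +-1 sigma does not change.
for sigma in (1.0, 5.0, 20.0):
    print(f"sigma = {sigma:5.1f}  area = {area_within_one_sigma(0.0, sigma):.4f}")
```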
Now, let us imagine for simplicity that we count all together the events in our "hump". In the real analysis we do something much more complicated, an "unbinned likelihood fit", but let's ignore this detail. The signal distributes in four adjacent bins, and our "null hypothesis" predicts that we should count about 400 events there, with a standard deviation equal to the square root of 400, that is 20. What can we then say if we see 400 events? 420? 450?
If we expect 400 events and we see exactly 400, certainly our "null hypothesis" passes the test: we cannot conclude that there might be other processes, like the presence of a particle with a mass corresponding to the interval we picked, contributing an additional amount of events to those four bins.
If we expect 400 events and we see 420, we have to act nonchalant: it happens about 16 times in 100 that one sees an excess at least as large as that one in the data (16% is half of 100% minus 68%). Those 20 extra events are an excess of "one sigma", one standard deviation.
If we see 450, we are instead at the level of two-and-a-half sigma: 50 events of excess, with an uncertainty of ±20.
(show in the graph where we are in the Gaussian tail).
In that case we could start wondering whether those 50 excess events might really be the result of 50 Higgs events in the data, an addition to the 400 of background constituting our "null hypothesis". But, since we are talking about a probability of a bit less than 1% (the area of a Gaussian from +2.5 sigma to infinity is about 0.6%), we can only hope that, by collecting more data, that excess becomes stronger statistically. As it is now, it does not allow us to announce a discovery! In fact, in particle physics a departure from expectations at the 1% level is not enough to claim we have seen something new. I will say more about why we are much more strict at the end of the lesson.
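The three cases above (400, 420 and 450 events observed, with 400 expected) can be worked out in a few lines of Python. This is only a sketch of the simplified counting argument, not of the real analysis: it assumes the Gaussian approximation, takes sigma as the square root of the expected background, and uses function names of my own invention.

```python
import math

def n_sigma(observed, expected_bkg):
    """Excess over the background expectation, in units of sqrt(expected)."""
    return (observed - expected_bkg) / math.sqrt(expected_bkg)

def one_sided_p_value(z):
    """Area of a unit Gaussian from +z sigma to infinity."""
    return 0.5 * math.erfc(z / math.sqrt(2.0))

for observed in (400, 420, 450):
    z = n_sigma(observed, 400)
    p = one_sided_p_value(z)
    print(f"{observed} events: {z:.1f} sigma, probability = {p:.4f}")
```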
Let us now imagine we multiply our data sample by four, by collecting data for four times longer than we did at the beginning. Now the 400 events have become 1600 (our null hypothesis) in those four adjacent bins. What is one standard deviation in that case? It is again the square root, which for 1600 events is 40. Not 80! The error has doubled (from 20 to 40), rather than becoming four times larger. Increasing the statistics by a factor of four has decreased the relative error by a factor of sqrt(4), or 2 (from 20/400=5% to 40/1600=2.5%). So our background expectation, according to the null hypothesis, is 1600±40.
Now if those 50 events of excess that we saw in the first fourth of the data had been due not to a fluctuation of backgrounds (which would be most likely to get washed away by the added data) but to a true contribution of a new particle, we would expect to see four times more of it, that is 200 events sitting on top of 1600, for a total of 1600+200=1800 events.
If we see 1800 events and we expect 1600 from our null hypothesis, and the error on 1600 is the standard deviation, that is sqrt(1600)=40 events, we have an excess that is five times larger than the sigma: 200/40=5. [Note that the actual estimate of the standard deviation might be the square root of 1800, if we base our background prediction on the data actually observed in those bins; or still be the square root of 1600, if this is a prediction coming from a subsidiary dataset. This is a detail and I will not stress it in the presentation].
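A short Python sketch makes the scaling explicit: if both the background and a genuine excess grow in proportion to the amount of data, the significance grows like the square root of that amount. The function name and its default values (400 background events, 50 excess events) are just the toy numbers used in this text, and the factor of 16 is added only to show the trend.

```python
import math

def significance_after_scaling(factor, bkg=400.0, excess=50.0):
    """Significance if background and a real excess both grow by `factor`."""
    scaled_bkg = bkg * factor
    scaled_excess = excess * factor
    # Sigma is the square root of the expected background.
    return scaled_excess / math.sqrt(scaled_bkg)

for factor in (1, 4, 16):
    z = significance_after_scaling(factor)
    print(f"{factor:2d}x the data: {z:.1f} sigma")

# Alternative mentioned in the bracketed note: sigma taken from the 1800 observed.
print(f"with sigma = sqrt(1800): {200.0 / math.sqrt(1800.0):.2f} sigma")
```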
The probability that a statistical fluctuation of those 1600 events exceeds the average by at least five standard deviations is about one in 3.5 million: it is rare enough to allow us to claim a discovery!
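To see how quickly these probabilities shrink as the number of sigmas grows, one can tabulate the one-sided Gaussian tail. The sketch below assumes the convention of counting only upward fluctuations; `one_sided_tail` is a name I picked for the example.

```python
import math

def one_sided_tail(z):
    """Probability of an upward fluctuation of at least z standard deviations."""
    return 0.5 * math.erfc(z / math.sqrt(2.0))

for z in (1, 2, 3, 4, 5):
    p = one_sided_tail(z)
    print(f"{z} sigma: probability = {p:.2e}, about 1 in {1.0 / p:,.0f}")
```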
So, "five sigma" is the minimum requirement to announce a discovery in particle physics. This is what happened at CERN last year: first experimental evidence of Higgs events was seen in the data collected until December 2011, but the excess was not sufficient to announce a discovery: it amounted to about three standard deviations. In July 2012, however, the added data allowed CMS and ATLAS to independently claim they had 5-sigma excesses in their datasets.
Five sigma is a very tough requirement. If you compared it to the levels at which statistical evidence is claimed in medicine, let alone social sciences, you would conclude that physicists are masochists. However, there is a good reason for this. We search for new particles and effects in a number of places all of the time; a two- or even three-sigma excess which is just due to a statistical fluctuation is bound to occur here or there: small probabilities add up, making it virtually certain that some discrepancy will be seen somewhere! Only the strict "five-sigma" criterion protects us from claiming false discoveries!