I am endlessly amazed to observe, time and again, that even experienced colleagues fall into the simplest statistical traps. Mind you, I do not claim to be any better - sorry, let me rephrase: to have been any better in the early days of my career as an experimentalist. But then I started to appreciate that to really understand physics results I needed to get familiar with at least a small set of notions in basic Statistics.

So I insist that my colleagues should pay more attention to a few basic concepts. In this blog I have erratically tried to educate my readers on Statistics topics; but the matter is not as exciting as real particle hunts or discoveries, so I know that I cannot expect a large audience when I get down to formulas and hard math concepts. Because Statistics, see, is indeed tough.

Nevertheless, let me at least try today to explain something you need to understand if you follow particle physics and want to interpret correctly some of the results you are often exposed to. Let us imagine that we are looking for a signal of a new particle, and we actually see one, with a large significance. I have two questions for you.

1) Say we observe an excess of event counts due to our searched signal, and with it we measure the cross section to be 10+-2 nanobarns, where the quoted uncertainty (+-2) is statistical only. What can we say about the significance of the observation on which the measurement is based?

1A) It is equal to 5 standard deviations;
1B) It is equal to or larger than 5 standard deviations;
1C) None of the above is necessarily true.

2) Say we measure the cross section to be 9+-3 nanobarns, where the uncertainty is the combination of statistical and systematic effects. What can we say about the significance of the observation?

2A) It is equal to 3 standard deviations;
2B) It is equal to or larger than 3 standard deviations;
2C) None of the above is necessarily true.

I am sure you are looking at these answers and wondering what the hell my point is. Let us take the first question, then. For sure, 10 is "five-sigma" away from zero, since the statistical uncertainty is 2, and we are assuming Gaussian distributions for the uncertainties. The problem is that significance is a measure of the incompatibility of the observation with the null hypothesis, and the null hypothesis is that the cross section of the tentative new signal is zero. Measuring 10+-2 does not tell us much about the compatibility of the data with the background-only hypothesis, because that depends on the background! Let me give you three examples.

- I expect 9500 events from background sources, with high precision; and I see 10000. This is a five-sigma effect, and indeed, upon subtracting backgrounds, I have 500+-100 events of signal (10000 has a Poisson uncertainty equal to sqrt(10000)=100), so the cross section has (at least) a 20% uncertainty.
- I expect 1 event from background sources, with high precision; and I see 26. This again allows me to get a cross section measurement with 20% statistical uncertainty (25+-5.1 events of signal, since 5.1 is the square root of 26, the Poisson uncertainty on the event count). However, the significance of observing 26 events when I expect 1 is very, very large: the Poisson probability of such a fluctuation corresponds to more than 10 standard deviations!

- I expect 100 events from background sources, with a systematic uncertainty of +-50%. I see 169 events. The excess is 69+-13 (13 is the square root of 169), but this is not a significant observation at all - the systematic uncertainty on the background tells us that the background-only hypothesis is perfectly acceptable, being just 1.4 standard deviations away from the observed counts.
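For those who like to check numbers, here is a minimal sketch of how one could estimate the significance in the three cases above. It is not a full-blown statistical treatment: the function names are mine, the first two cases use the exact Poisson tail probability converted to a one-sided Gaussian Z-value, and the third uses a rough Gaussian approximation in which the Poisson spread of the background and its systematic uncertainty are added in quadrature (that choice is an assumption of this sketch).

```python
import math
from statistics import NormalDist


def poisson_tail(n_obs, mu):
    """P(N >= n_obs) for N ~ Poisson(mu), summed term by term in log space."""
    total = 0.0
    k = n_obs
    while True:
        term = math.exp(k * math.log(mu) - mu - math.lgamma(k + 1))
        total += term
        # Past the mean the terms shrink monotonically: stop once negligible.
        if k > mu and term <= 1e-17 * total:
            return total
        k += 1


def z_from_p(p):
    """One-sided Gaussian significance corresponding to tail probability p."""
    return -NormalDist().inv_cdf(p)


def z_with_background_systematic(n_obs, bkg, bkg_err):
    """Rough estimate (assumption): excess divided by the Poisson and
    systematic uncertainties on the background, added in quadrature."""
    return (n_obs - bkg) / math.sqrt(bkg + bkg_err ** 2)


# Case 1: 9500 expected, 10000 observed, background known precisely.
z1 = z_from_p(poisson_tail(10000, 9500.0))
# Case 2: 1 expected, 26 observed, background known precisely.
z2 = z_from_p(poisson_tail(26, 1.0))
# Case 3: 100 +- 50 expected, 169 observed.
z3 = z_with_background_systematic(169, 100.0, 50.0)

print(f"case 1: Z = {z1:.2f}")
print(f"case 2: Z = {z2:.2f}")
print(f"case 3: Z = {z3:.2f}")
```

The first case should come out close to five sigma, the second above ten, and the third around 1.4, even though the relative statistical uncertainty on the extracted signal is about 20% in all three.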

What we learn from these examples is that we cannot obtain information on the validity of the background-only hypothesis (which is what we refer to when we talk about "significance": it is always meant with respect to the null hypothesis, whether we spell that out or not) by just looking at the fitted signal cross section - especially if the uncertainty we are given is statistical only!

Now, let us examine question 2. Here we have a complication: we are calling in "systematic effects" without qualifying them better. Systematic effects may be due to the uncertainty in the background prediction as in the third case exemplified above, or in the signal acceptance, or in the luminosity of the data... You name it. Some of these will affect our estimate of the significance with respect to the background-only hypothesis, others have nothing to do with it.

For instance, if I measure a cross section of 9+-3 nanobarns, the error (+-3) may be due to my 33% systematic uncertainty in the luminosity of the data: if that is true, then the cross section comes from a measurement of a signal with very high precision, and the uncertainty has nothing whatsoever to do with the size of the signal, but rather with the derivation of the cross section from the signal observed, through the formula N = σ L (N events observed, σ cross section, L luminosity corresponding to the studied data). The significance of the signal, N, may be as large as we want, but we still have "only" a 33% cross section determination.
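To make this concrete, here is a small sketch of the error propagation through N = σ L. The numbers are invented for illustration (9000+-95 signal events, a luminosity of 1000 inverse nanobarns known to 33%), the acceptance is assumed to be 1, and the two relative uncertainties are assumed to be independent, so they add in quadrature.

```python
import math


def cross_section(n_signal, n_signal_err, lumi, lumi_rel_err):
    """Return (sigma, sigma_err) from N = sigma * L, assuming the relative
    uncertainties on N and L are independent and add in quadrature."""
    sigma = n_signal / lumi
    rel_err = math.sqrt((n_signal_err / n_signal) ** 2 + lumi_rel_err ** 2)
    return sigma, sigma * rel_err


# Hypothetical inputs: a signal measured to about 1%, a luminosity known to 33%.
xsec, xsec_err = cross_section(9000.0, 95.0, 1000.0, 0.33)
print(f"sigma = {xsec:.1f} +- {xsec_err:.1f} nb")
```

The signal yield here is known to about 1%, yet the cross section still carries an uncertainty of roughly one third of its value, entirely driven by the luminosity.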

I hope this clarifies that if you see a report of the observation of a new particle, with a cross section measured with 50% accuracy, that does not mean that the observation is on shaky ground. That is a different question you are asking!