Note: I discussed today's topic in one of my best articles here some time ago, and I gave even more technical insight in another piece. I decided to revisit the topic once more under the stimulus of an online HEP magazine, which is going to feature a text of mine soon. They do not mind if I use the same text here too, so you get to read it here first.
A dangerous beast is hiding in today's searches for new physics (or even for "old" physics, such as the Higgs boson) at the Large Hadron Collider. It is called the "Look-Elsewhere Effect", LEE for insiders. What is it, and why should you care?
Imagine you look for a heavy particle decaying to a pair of hadronic jets: a commonplace test case in high-energy physics. You have your background model, which predicts the observable shape of the dijet mass distribution, and you know what kind of bump in that shape a new particle signal would produce. So you search for such a bump in the data but, not knowing where it might appear, you search everywhere.
You have worked all day, and the night is nearing; you prepare yourself a Martini and spin your analysis program. To your amazement, the program finds a significant bump at some particular mass value: is it a real signal?
To claim it is a new signal, the effect must reach or exceed the "five-sigma" significance level, five standard deviations away from the expectation: that's a rather silly but well-established rule. But if yours gives only 3.5 or four sigma, are you allowed to get excited and wake up your boss, or should you sit back and sip your Martini, with an "I know better" grin on your face?
I claim the latter is the better option. You have fallen prey to the LEE: you looked in many places for a possible signal, and found a significant effect somewhere; this happens more often than it would if you had stated beforehand where the signal would be, because of the "probability boost" of looking in many places. A good rule of thumb is the following: if your signal has a width W, and if you examined a spectrum spanning a mass range from M1 to M2, then the "boost factor" due to the LEE is (M2-M1)/W. This may easily amount to a factor of 10 or 100, depending on the details of your search. An effect occurring by chance once in ten thousand cases in a given place of your spectrum may actually be just an unexciting one-in-a-hundred fluctuation!
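The rule of thumb above can be put into a few lines of Python. This is only a sketch: the function names and the numbers of the example search (a 10 GeV wide bump sought between 1000 and 2000 GeV) are made up for illustration, and treating the (M2-M1)/W places as independent trials is an assumption of the rule of thumb itself, not an exact result.

```python
import math


def boost_factor(m_low, m_high, width):
    """Rule-of-thumb count of independent places where a bump of this width fits."""
    return (m_high - m_low) / width


def global_p_value(p_local, n_trials):
    """Probability of at least one fluctuation this rare among n_trials places.

    Assumes the places are statistically independent (an idealization).
    Written with log1p/expm1 so that tiny p_local values do not lose precision.
    """
    return -math.expm1(n_trials * math.log1p(-p_local))


# Hypothetical search: 10 GeV wide signal, spectrum scanned from 1000 to 2000 GeV.
n_places = boost_factor(1000.0, 2000.0, 10.0)
p_local = 1.0e-4  # "once in ten thousand" at one given mass

print("boost factor:", n_places)
print("global p-value:", global_p_value(p_local, n_places))
```

For small local probabilities the result is practically the plain product of p-value and boost factor, which is why the one-in-ten-thousand effect becomes roughly one in a hundred.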
In fact, the "5-sigma" rule I mentioned above was conceived with exactly this effect in mind. Five sigma is a really, really rare occurrence (about three in ten million), and even including the LEE, plus considering non-Gaussian tails in measurement systematics (the other worry that kept the significance bar high when a working point for an "observation" claim was chosen), it is still something to take quite seriously.
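To see what is left of a five-sigma effect once the LEE has taken its toll, one can convert between significances and one-sided Gaussian tail probabilities. The sketch below uses only the Python standard library; the boost factor of 100 is an assumed value taken from the upper end of the range quoted earlier, and the helper names are mine.

```python
import math
from statistics import NormalDist


def sigma_to_p(z):
    """One-sided Gaussian tail probability for a significance of z sigma."""
    return 0.5 * math.erfc(z / math.sqrt(2.0))


def p_to_sigma(p):
    """Significance in sigma corresponding to a one-sided tail probability p."""
    return -NormalDist().inv_cdf(p)


p_five = sigma_to_p(5.0)          # local probability of a five-sigma effect
assumed_boost = 100.0             # assumed LEE boost factor, for illustration
p_corrected = assumed_boost * p_five

print("local p-value at 5 sigma:", p_five)
print("significance after LEE:", p_to_sigma(p_corrected))
```

Under these assumptions a local five-sigma effect still corresponds to about four sigma after the correction, which is the sense in which the rule leaves some margin.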
Nowadays the annoying persistence of the Standard Model has brought us to seek compromises. We cannot grow old waiting for five-sigma signals, so we content ourselves with publishing 3-sigma ones; yet our scientific integrity demands that we account for the LEE. This is less easy to do than just multiplying a probability by (M2-M1)/W as in the example above: in complex searches such as that for the Higgs boson, which combine bump hunts in many channels, it is quite a headache.
A recent paper by Eilam Gross and Ofer Vitells (Eur. Phys. J. C70:525-530,2010) has clarified some of the technical issues. The searches for the Higgs boson by ATLAS and CMS nowadays size up the LEE by studying the p-value of the background-only hypothesis as a function of the Higgs mass: the more the observed p-value distribution wiggles up and down as the signal mass hypothesis changes, the larger the "trials factor", i.e. the required Look-Elsewhere-Effect correction. Another important thing to keep in mind is that the trials factor grows linearly with the observed significance (see figure below), a fact which had been overlooked in the past.
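The growth of the trials factor with significance can be illustrated with the bound commonly quoted from this line of work, in which the global p-value is at most the local one plus the expected number of "upcrossings" of the significance curve above a low reference level Z0, rescaled by exp(-(Z^2 - Z0^2)/2). The sketch below is my own reading of that bound for a one-dimensional mass scan, not code from the paper; the number of upcrossings (4 at Z0 = 0) is invented, whereas in a real analysis it is estimated from data or from pseudo-experiments.

```python
import math


def local_p(z):
    """One-sided Gaussian tail probability at a fixed mass hypothesis."""
    return 0.5 * math.erfc(z / math.sqrt(2.0))


def global_p_bound(z, n_up, z0=0.0):
    """Upper bound on the global p-value from the expected upcrossings n_up at z0."""
    bound = local_p(z) + n_up * math.exp(-(z * z - z0 * z0) / 2.0)
    return min(1.0, bound)


def trials_factor(z, n_up, z0=0.0):
    """Ratio of global to local p-value at an observed local significance z."""
    return global_p_bound(z, n_up, z0) / local_p(z)


# Invented input: on average 4 upcrossings of the Z0 = 0 level across the scan.
for z in (2.0, 3.0, 4.0, 5.0):
    print(z, round(trials_factor(z, n_up=4.0), 1))
```

With these inputs the printed trials factor rises by a nearly constant amount for each extra sigma, which is the linear trend mentioned above: a fixed boost factor underestimates the correction for the most significant effects.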
All the above is stuff for experts, for sure. But outsiders had better be aware that a three-sigma effect should not be blindly dubbed "evidence" for something new in the data. Like travelers in a foreign country whose tax habits are unknown, you had better ask before you buy: "LEE included or not?"
Above: trials factor due to the LEE in an idealized bump search as a function of the observed significance Z_fix of a signal (blue curve). The growth with significance of the trials factor matches at high Z_fix the result of an analytical approximation (red dashed curve); the black curve shows the upper bound of the trials factor. For more details see Eur.Phys.J.C70:525-530,2010.