In a post where he means to bring the wrath of the whole of CMS down on me, using the fact that I teased my most gullible readers with a (wrong) covert give-away of the Higgs mass, he also mentions that, by using a formula I warned should be used with caution (the simple formula for the weighted average of two measurements), he can now quote the best estimate of the Higgs boson mass from the rumored CMS and ATLAS mass measurements (!) and their quoted significances (!!).
Let me tell you straight away that:
- neither CMS nor ATLAS has released information on these quantities;
- neither CMS nor ATLAS has claimed to have observed the Higgs boson;
- neither CMS nor ATLAS is likely to do any of the above on December 13th, because there is not enough information in the data to do so yet. The experiments will probably just quote exclusion limits, and p-value distributions for the no-signal hypothesis.
Now, let's go back to Lubos. He pretends he knows the masses and significances "measured" by the two experiments, and he proceeds to combine them. How? By using a weighted average. He argues that if the masses are m1 and m2, and the signal significances are s1 and s2, then one should just compute M = (m1*s1^2 + m2*s2^2) / (s1^2 + s2^2). This way Lubos is able to quote, hear oh hear, a Higgs mass with two decimal places!
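For readers who prefer to see the formula spelled out, here is a minimal sketch of what it computes. The function name and the input numbers are my own hypothetical choices, made up purely for illustration; they are not results from any experiment.

```python
def significance_weighted_mass(m1, s1, m2, s2):
    """Average two masses using squared significances as weights.

    This reproduces the formula under discussion only to show what it
    computes; it is not a recommended way to combine measurements.
    """
    w1, w2 = s1 ** 2, s2 ** 2
    return (m1 * w1 + m2 * w2) / (w1 + w2)


# Hypothetical inputs, chosen only for illustration (not real results).
example = significance_weighted_mass(100.0, 3.0, 110.0, 1.0)
```

Note how the result is pulled almost entirely toward the first mass: a significance three times larger translates into a weight nine times larger.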
I could not resist teaching Lubos some statistics in the comments thread of his blog. Knowing, however, that he sometimes plays the game unfairly (although he can also act more fairly, as he showed on at least one occasion), I prefer to repeat the argument here, to keep it from getting lost in the hyperspace of unapproved comments.
So my answer to Lubos is the following: taking a weighted average with significances as weights is close to meaningless. For a scientist, of course; for a grocery clerk it may be fair enough (and I know grocery clerks who would beat Lubos at chess blindfolded, for instance).
The explanation is easy, but you need to follow it. The proof is by example.
Imagine you have two experiments, each of which measures a new particle signal from an excess of events over expected backgrounds. Experiment 1 measures 10 events when 0.1 would be expected from backgrounds, while Experiment 2 measures 50 events when 15 would be expected from backgrounds. Note that it is by no means strange to be in such starkly different situations: different experiments apply different strategies to optimize their selections, plus, in the case of the Higgs, one could be mostly observing an excess of H->ZZ decays (rare, but with high S/N ratio) or of H->gamma-gamma decays (frequent, and background-ridden).
So what do we do with the numbers? First of all, note that Experiment 1 has by far the better significance: observing 10 events when expecting 0.1, the significance is a very large number. Observing 50 events when expecting 15 is instead a much weaker observation. So the "weights" that Lubos would blindly use in his weighted average are very different: the larger weight would be given to the mass measurement produced by Experiment 1, in proportion to the square of its significance.
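If you want to check the ordering of the two significances yourself, here is a rough sketch under simplifying assumptions that are mine, not the experiments': a pure counting experiment, a background known exactly, and a one-sided Poisson tail probability converted into a Gaussian "number of sigma". The function names are hypothetical.

```python
import math
from statistics import NormalDist


def poisson_tail(n_obs, mu):
    """P(N >= n_obs) for a Poisson mean mu.

    The upper tail is summed directly, because computing 1 - CDF would
    lose all precision for the tiny probabilities involved here.
    """
    total = 0.0
    k = n_obs
    while True:
        # Poisson probability of exactly k events, evaluated in log space.
        term = math.exp(-mu + k * math.log(mu) - math.lgamma(k + 1))
        total += term
        # Past the mean the terms shrink; stop once they are negligible.
        if k > mu and term <= 1e-17 * total:
            break
        k += 1
    return total


def significance(n_obs, mu):
    """One-sided Gaussian-equivalent significance of the excess."""
    return -NormalDist().inv_cdf(poisson_tail(n_obs, mu))


z1 = significance(10, 0.1)   # Experiment 1: 10 observed, 0.1 expected
z2 = significance(50, 15.0)  # Experiment 2: 50 observed, 15 expected
```

With these assumptions `z1` comes out larger than `z2`, so the squared-significance weights favor Experiment 1.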
On the other hand, ask yourself which of the two experiments has the better measurement of the signal properties. Take the cross section as an example, since it is directly derived from the event counts. The first experiment measures a cross section corresponding to about 10 signal events with an error of about 3, so the relative uncertainty is about 30%, roughly speaking. The second, instead, measures 35 signal events with an error of 7, or a 20% uncertainty. The second experiment can measure the cross section much better! How strange!
Not strange at all: signal significance is not directly proportional to the "number of sigma" that separate the signal-plus-background counts from the background counts alone. Significance asks how unlikely the observed count is if there is only background, so it is driven by how small the expected background is; the precision of the measurement is driven instead by the statistical error on the total observed count.
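The arithmetic behind those two percentages can be sketched as follows. I am assuming, for simplicity, that the background is known exactly and that the error on the observed count is just its square root; the function name is a hypothetical one of mine.

```python
import math


def relative_signal_uncertainty(n_obs, background):
    """Relative statistical uncertainty on the signal yield.

    Assumes the signal is n_obs - background, with the background known
    exactly, and that the error on the count is sqrt(n_obs).
    """
    signal = n_obs - background
    return math.sqrt(n_obs) / signal


r1 = relative_signal_uncertainty(10, 0.1)   # Experiment 1
r2 = relative_signal_uncertainty(50, 15.0)  # Experiment 2
```

Under these assumptions `r2` is smaller than `r1`: the experiment with the lower significance has the more precise cross section.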
Now, imagine you are measuring the mass of the particle. In one case you have a 10-event sample, in the other you have a 50-event sample. Of course you have more background in the second case, but your mass measurement will also have a smaller uncertainty there.
So, taking a weighted average that uses as weights the squared significances rumored to be observed by the two experiments is not only far-fetched (not to mention that it ignores the fact that the experiments have different mass resolutions in the most signal-rich Higgs decay channel), it is simply dumb and incorrect, except in very rare situations.
I hope the above may be useful to the many of us who had never stopped to think of the subtleties behind significances in the regime of small event counts. As for Lubos, he is smart, so I believe he will realize he was wrong, and rather than starting to argue that in the case of the Higgs search at the LHC his formula works, he will move on to some other form of defamation ;-)
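For contrast, the simple weighted average of two measurements that I warned should be used with caution takes as weights the inverse squared uncertainties of the measurements themselves, not the significances of the signals. A minimal sketch, assuming two independent measurements with Gaussian uncertainties; the function name and the numbers are hypothetical and made up for illustration only.

```python
import math


def inverse_variance_average(m1, e1, m2, e2):
    """Combine two independent measurements m1 +- e1 and m2 +- e2.

    Assumes uncorrelated Gaussian uncertainties; returns the combined
    value and its uncertainty.
    """
    w1, w2 = 1.0 / e1 ** 2, 1.0 / e2 ** 2
    mean = (m1 * w1 + m2 * w2) / (w1 + w2)
    error = math.sqrt(1.0 / (w1 + w2))
    return mean, error


# Hypothetical inputs, not real results: the measurement with the
# smaller uncertainty dominates, whatever the significance of its signal.
combined, combined_error = inverse_variance_average(100.0, 2.0, 110.0, 1.0)
```

Here the second measurement carries four times the weight of the first because its uncertainty is half as large, which is the opposite of what a significance-based weighting could give in the two-experiment example above.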