About a month ago I held a three-hour course on "Statistics for Data Analysis in High-Energy Physics" in the nice setting of Engelberg, a mountain location just south of Zurich. Putting together the 130 slides of that seminar was a lot of work, and no little fun; in the process I was able to collect some "simple" explanatory cases of the application of statistical methods and related issues. A couple of examples of this output are given in a post on the fractional charge of quarks and in an article on the weighted average of correlated results. But there is more material to be dug out.
Among the many things I had marked down for subsequent discussion in this blog is a very interesting statistical "paradox", which highlights the opposite conclusions that a researcher might reach if she were to adopt either of the two "schools of thought" of theoretical Statistics. Examples of this kind are important because we use statistical procedures for most of our scientific results, and as scientists we are accustomed to thinking of the data as giving unequivocal answers. Statistics, however, messes things up, since the experimenter's choices have an impact on the final result.
The paradox in question is called "Jeffreys-Lindley" and arises when one tests a hypothesis with a high-statistics dataset. Rather than discuss the generalities, I think it is most profitable to go straight to a practical example, leaving the discussion to the end.
Imagine you analyze data from one of the LEP experiments: these were four detectors (L3, ALEPH, DELPHI, OPAL) that looked at electron-positron collisions at a center-of-mass energy of 91 GeV. Such collisions are very clean: the projectiles are structureless, and their interaction is rare enough that only one collision every few seconds is produced. Further, the final state has zero total electric charge, and the "hermeticity" of detectors for e+e- collisions allows the detection of a large fraction, close to 100%, of the charged particles that get produced. So the researcher might decide to study with high precision the possible charge bias of her detector by simply counting tracks with positive and negative curvature, the curvature being due to the axial magnetic field existing at the center of these detectors.
After counting many tracks, an asymmetry significantly different from zero would clearly indicate that the detector or the software has a different efficiency for reconstructing tracks curving in one or the opposite direction under the action of the axial magnetic field.
Let us say that she collects n_tot=1,000,000 tracks, and gets n_pos=498,800 positive and n_neg=501,200 negative tracks. The hypothesis under test is that the fraction of positive tracks is R=0.5 (i.e., no charge bias), and let us add that she decides to "reject" the hypothesis if, under that hypothesis, the observed result has a probability of less than 5%: this is referred to as a test of "size 0.05".
A Bayesian researcher will need a prior probability density function (PDF) to make a statistical inference: a function describing the pre-experiment degree of belief in the value of R. From a scientific standpoint, adding such a "subjective" input is questionable, and indeed the thread of arguments on the matter is endless; what most can agree on is that a prior PDF containing as little information as possible is the lesser evil, if one is doing things in a Bayesian way.
A "know-nothing" approach might then be to choose a prior PDF by assigning equal weights to the possibility that R=0.5 and to the possibility that R is different from 0.5. Then the calculation goes as follows: the probability to observe a number of positive tracks as small as the one observed can be written, with x=n_pos/n_tot, as N(x,σ), with σ^2=x*(1-x)/n_tot (we are in a regime where the Gaussian approximation holds, and the variance written above is indeed the Gaussian approximation to the variance of a Binomial ratio). N(x,σ) is just a Gaussian distribution of mean x and variance σ^2.
If we have a prior, and the data, we can use Bayes' theorem to obtain the posterior probability of the null hypothesis: in formulas,

P(R=0.5 | x) = 0.5*N(x,σ) / [ 0.5*N(x,σ) + 0.5*∫ N(x',σ) dx' ] ≈ 0.98,

where N(x,σ) is evaluated at x=0.4988 for a Gaussian centered at R=0.5, and the integral averages the likelihood over the uniform part of the prior, R in [0,1].
(If the above does not make sense to you, you may either check Bayes' theorem elsewhere, or take the calculation at face value and try to make sense of the rest of this post; the math is inessential.)
From the above value, much higher than the size α=0.05 and actually very close to 1, a Bayesian researcher concludes that there is no evidence against R=0.5; the obtained data strongly support the null hypothesis.
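The Bayesian bottom line can be reproduced numerically. The sketch below assumes, as in the standard Jeffreys-Lindley setup, equal prior weights of 0.5 on R=0.5 and on the alternative, with the alternative spread uniformly over [0,1]:

```python
import math

n_tot = 1_000_000
x = 498_800 / n_tot                      # observed fraction of positive tracks
sigma = math.sqrt(x * (1 - x) / n_tot)   # about 0.0005

def gauss_pdf(v, mu, s):
    """Gaussian density of mean mu and standard deviation s, evaluated at v."""
    return math.exp(-0.5 * ((v - mu) / s) ** 2) / (s * math.sqrt(2 * math.pi))

def gauss_cdf(v, mu, s):
    """Cumulative of the same Gaussian, via the error function."""
    return 0.5 * (1 + math.erf((v - mu) / (s * math.sqrt(2))))

# likelihood of the observed fraction under H0 (R = 0.5)
like_h0 = gauss_pdf(x, 0.5, sigma)

# marginal likelihood under H1: average of the Gaussian over R uniform in [0,1],
# which equals the integral of the density between 0 and 1 (essentially 1 here)
like_h1 = gauss_cdf(1, x, sigma) - gauss_cdf(0, x, sigma)

# equal prior weights of 0.5 on H0 and H1
posterior_h0 = 0.5 * like_h0 / (0.5 * like_h0 + 0.5 * like_h1)
print(posterior_h0)   # about 0.978: the data strongly support R = 0.5
```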
Frequentists, on the other hand, will not need a prior, and they will just ask themselves how often a result as "extreme" as the one observed arises by chance, if the observed fraction is indeed distributed as N(R,σ), with R=0.5 and σ^2=R*(1-R)/n_tot as before. One then has:

P' = 2 × P(x ≤ 0.4988 | R=0.5) = 2 × P(z ≥ 2.4) ≈ 0.016,

where z = |x-R|/σ is the distance of the observation from the null expectation in units of σ.
(we are multiplying the one-sided probability by two, since if H0 holds we are just as surprised to observe an excess of positive tracks as a deficit!).
From the above expression, the Frequentist researcher concludes that the tracker is indeed biased, and rejects the null hypothesis H0, since there is a less-than-2% probability (P'<α) that a result as extreme as the one observed could arise by chance! A Frequentist thus draws, strongly, the opposite conclusion from the same set of data. How do we solve the riddle?
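The Frequentist side of the ledger is a one-liner once the observation is expressed in units of σ; a sketch, using the complementary error function for the two-sided Gaussian tail:

```python
import math

n_tot = 1_000_000
x = 498_800 / n_tot                      # observed fraction of positive tracks
sigma = math.sqrt(0.5 * 0.5 / n_tot)     # error under H0 (R = 0.5): exactly 0.0005

# distance of the observation from the H0 expectation, in units of sigma
z = abs(x - 0.5) / sigma                 # 2.4

# two-sided tail probability: an excess is as "surprising" as a deficit
p_value = math.erfc(z / math.sqrt(2))
print(p_value)   # about 0.016, below the test size alpha = 0.05
```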
There is no solution: the two statistical methods answer different questions, so if they agreed on the answer it would be by sheer chance. The strong weight the Bayesian puts on the hypothesis of an unbiased tracker is questionable, but not unreasonable: the detector was built with that goal in mind. Notice, however, that it is only the high statistical power of the data that allows the contradiction with the Frequentist result to emerge. One might play with the prior PDF and tune it until the Bayesian and Frequentist answers coincide; but what would we learn from that?