Faster Than Light Neutrinos Succumb To Bayesian Method
    By Sascha Vongehr | November 11th, 2011

    In 2011, science has been confronted with several high-profile, awkward situations of having to explain why standard methods like classical significance analysis are acceptable in, for example, medical studies on the safety of a new vaccine, but not when results put orthodoxy into doubt. The most infamous among them is the 6-sigma significance of the OPERA confirmation of faster-than-light neutrinos previously indicated by MINOS. Second place: evidence for precognition in a work [1] that abides by all the usual scientific methods and passed peer review in a top-tier journal. Strong hints of quantum brain processes [2] (discussed here) perhaps come in third. The latter is another of many phenomena that skeptics out to defend scientism argue to be impossible [3].

    Such novel results are of course doubted, as they should be, with a healthy dose of scientific skepticism. However, they are also bitterly argued against. Scientists who seriously consider possible consequences of the novel results are portrayed as crackpots and made to look ridiculous. This is perpetrated also by scientists who claim to defend the scientific method, and the sad thing is that the majority of scientists visible to the public engage in such unscientific bullying and silencing. The public does not understand the fine details, and the perception that arises is that yet again orthodoxy and bias trump the scientific method; public trust in all things science vanishes accordingly. Knee-jerk skepticism backfires.

    This article will introduce, on a lay level and in somewhat more detail, how biased skeptics actually argue, and where they employ pseudoscience in the name of their “war against pseudoscience”. This includes a hopefully memorable (mnemonic) introduction to Bayesian probability that may be worth reading by itself and that derives Cromwell's rule, which is basically the rule that skeptics exploit.

    Extraordinary Claims require Extraordinary Authority

    A fashionable reply is Marcello Truzzi’s "Extraordinary claims require extraordinary proof". Carl Sagan popularized this as "Extraordinary claims require extraordinary evidence", which may in turn go back to Laplace’s "The weight of evidence for an extraordinary claim must be proportioned to its strangeness." In any case, "Extraordinary claims require extraordinary evidence" sounds nice, which is the most important aspect of it, because this motto serves mainly as a truncheon in the skeptic’s tool bag. It hides the argument from authority behind one more step: who decides what counts as extraordinary rather than expected?

    In order to make the argument from authority appear like a proper scientific argument, they employ Bayesian probability, which is indeed the most advanced way of dealing with uncertainty. We will see below how it is corrupted, but first note that this is done increasingly often; I hope this may help readers spot occurrences of such false arguments. In short: any undesired scientific significance can be diminished by mixing undesired data with a so-called ‘prior’ probability, which is often merely a held belief on false support.

    Bayesian Probability: A verdict in a criminal trial should not be based on the probability of guilt, but rather the probability of the evidence, given that the defendant is innocent.

    Bayesian Updating whenever something is undesired? They wouldn’t, would they?

    You may have heard "Extraordinary claims require extraordinary evidence", but do people actually employ Bayesian probability to hide an argument from authority? I personally had just a vague suspicion, nourished by online comments claiming that one should use Bayesian inference and that it would argue against, for example, faster-than-light neutrinos. Yet I did not think much of it, until I saw this: Wagenmakers et al.: “Why psychologists must change the way they analyze their data: The case of psi.” [4].

    Effectively, undesired evidence is to be multiplied with a prior equal to zero, namely the dogma insisting that faster-than-light travel, influence from the future, or the earth orbiting the sun are impossible, period. Those who did not get caught up in orthodoxy know that relativity looks very much like a merely emergent, low-energy phenomenon [5]; thus faster-than-light phenomena are expected at extremely high energies. It is simply not true [6] that faster-than-light particles necessarily travel back in time and kill your grandpa (they don’t). On closer inspection, an unpopular phenomenon often becomes expected and perhaps fundamentally happens all the time (read: is ordinary) rather than extraordinary. That the world is not flat is “extraordinary” simply because of ordinary orthodoxy.

    People are increasingly aware of the history of science, so just calling something “extraordinary” is of course not sufficient. Scientific significance of empirical evidence is based on statistical measures. Statistics allows for any desired level of sophistication.

    The Bayesian Method

    Many still fight over what probability is at all. In the classical interpretation, probability is the ratio of favorable outcomes to possible outcomes. This is circular, because it assumes that the different possibilities already have some probability assigned, say by symmetry arguments: the two sides of a coin are equivalent as far as tossing, falling, and landing are concerned, leaving heads and tails equiprobable. In the frequentist approach, probability is strictly the counted frequency of the different outcomes over many trials. If you have not actually counted yet, you assume the system under investigation, say a coin, to be similar to one you experimented with before.

    The Bayesian approach (pronounced BAYZ-ee-un) is a mixture of both. Classical probability enters in the beginning and is often expressed in terms of subjective degree of belief in some proposition. Empirical counting then feeds into the “Bayesian updating”. How does this work?

    Well, if you want a PhD in physics, you need to work with P, H and D, namely probabilities P, hypotheses H, and data D. The joint probability of hypothesis and data being true at the same time we write PH&D. It is obviously the probability of the hypothesis given the data, written PHgD, multiplied by the probability of the data, PD:

    PH&D = PHgD PD

    PH&D is the same as PD&H, therefore we may as well write PDgH PH on the right-hand side. The interpretation is now different: PDgH is the likelihood of the data given (assuming the truth of) the hypothesis. This is multiplied by the probability of the hypothesis. Now you know the difference between probability and likelihood.

    The advantage of starting like this is that the easily remembered line

    PH&D = PHgD PD = PDgH PH

    has two not as easily remembered formulas inside of it. The left equation gives you the conditional probability of H given D (whatever H and D stand for), namely PHgD = PH&D / PD. The right equation gives you Bayes' theorem:

    PHgD = PDgH PH / PD

    This one has an interesting interpretation. The first term is again just the conditional probability of H given D. However, it is now called the posterior probability, because it is the new probability that your hypothesis takes on after the data are taken into account! PH, the probability of the hypothesis before new data are taken into account, is called the prior probability, or prior for short, because it came first: “Posterior = Likelihood * Prior / Data”.

    Bayesian updating goes as follows: as you accumulate more data, you update your previous prior probability with this formula to get the improved probability, called the posterior probability. The next time around, you use this posterior as the new prior. This is the most consistent way to calculate probabilities; for example, no tricks like Dutch books are possible against this method. In case a true probability exists (say I rigged a game and let you play), you will, via accumulation of ever more data, eventually see the posterior probability approach the true probability, even if you start from a very wrong prior assumption. This is excellent science, because contrary to the classical approach, starting with a wrong concept will not chain you to that wrong concept. The truth is out there; you find it in the data.
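    This convergence is easy to see in a quick simulation (a sketch with made-up numbers: the 70% rigged coin and the 1% starting belief are illustrative, not from any experiment):

```python
import random

random.seed(0)

# Hidden truth: the coin is rigged to land heads 70% of the time.
true_p = 0.7

# Two hypotheses about the coin:
#   H1: fair coin, P(heads) = 0.5
#   H2: rigged coin, P(heads) = 0.7
p_heads = {"H1": 0.5, "H2": 0.7}

# Start from a badly wrong prior: only 1% belief in the rigged coin.
prior = {"H1": 0.99, "H2": 0.01}

for toss in range(1000):
    heads = random.random() < true_p
    # Likelihood of this single toss under each hypothesis.
    like = {h: (p if heads else 1 - p) for h, p in p_heads.items()}
    # PD = sum over hypotheses of likelihood * prior.
    p_data = sum(like[h] * prior[h] for h in prior)
    # Bayes' theorem: posterior = likelihood * prior / PD,
    # and the posterior becomes the prior for the next toss.
    prior = {h: like[h] * prior[h] / p_data for h in prior}

print(prior)  # belief in the rigged coin now dwarfs the wrong initial 1%
```

    After a thousand tosses the posterior sits almost entirely on H2, despite the heavily biased starting point: the data, not the initial belief, decide.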

    Let me give a simple example and then show how to corrupt it.

    A Simple Black and White Example:

    There are two kinds of urns with ten balls each. The first type of urn has one black ball and nine white ones. The second type has four black balls and six white ones. You are given one urn and draw a ball at random. It turns out to be black. What is the probability that you were given the first type of urn?

    Call the hypothesis that you were given the first or second type of urn “H1” and “H2”, respectively. Since you do not know the chances with which you were given the first rather than the second type of urn, you naturally assume that the prior probabilities of these hypotheses, PH1 and PH2, are equal, namely both are 50% (or PH1 = PH2 = 1/2). The data, here your black (B) ball, will now be used to improve on this prior assumption.

    The posterior probability given the data D = B and asking for hypothesis H1 is given by Bayes' theorem stated above: PHgD = PDgH PH / PD becomes PH1gB = PBgH1 PH1 / PB.

    The likelihood PBgH1 is 1/10, because the first urn has one black ball out of ten in total. The probability of the data, PD, is obtained by adding up all possible ways in which the data can arise, weighting each way by its probability. In general this is PD = PDgH1 PH1 + PDgH2 PH2.

    For the data being just the black ball D = B holds therefore

    PB = PBgH1 PH1+ PBgH2 PH2 = 1/10 * 1/2 + 4/10 * 1/2 = 5/20

    And so we are almost finished. The result is: PH1gB = PBgH1 PH1 / PB = 1/10 *1/2 / (5/20) = 1/5.

    Only one out of five! Naturally, we expected this: because the first type of urn contains so many white balls, the hypothesis that it was the first type of urn is not supported by the drawing of a black ball. You can do the same calculation for the second hypothesis H2:

    PH2gD = PDgH2 PH2/ PD becomes PH2gB = PBgH2 PH2 / PB = 4/10 * 1/2 / (5/20) = 4/5.

    In the beginning, you knew nothing much, so your priors were unbiased (= 1/2), but now your data has given you some insight about what type of urn you were given: H1 has posterior 1/5 while H2 has 4/5; the total is 100%.
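    The whole urn calculation fits into a few lines. Here is a sketch using exact fractions, so the results come out as the 1/5 and 4/5 derived above:

```python
from fractions import Fraction

def posterior_h1(prior_h1, like_b_h1, like_b_h2):
    """Bayes' theorem for two hypotheses: returns PH1gB = PBgH1 PH1 / PB."""
    prior_h2 = 1 - prior_h1
    # PB = PBgH1 PH1 + PBgH2 PH2 (total probability of drawing black)
    p_b = like_b_h1 * prior_h1 + like_b_h2 * prior_h2
    return like_b_h1 * prior_h1 / p_b

# Urn type 1: one black ball out of ten; urn type 2: four out of ten.
p_h1 = posterior_h1(Fraction(1, 2), Fraction(1, 10), Fraction(4, 10))
print(p_h1)      # 1/5
print(1 - p_h1)  # 4/5
```

    The two posteriors sum to 1, as they must for an exhaustive pair of hypotheses.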

    Enter Faster Than Light Neutrinos

    The above is routine in science. For example, the balls could be faster-than-light (FTL) neutrino data; mathematical methods apply generally. We are living in some universe, probably a multiverse. We do not know the particular physics we “are given”, especially not the physics applicable to our problem at hand; that is why we do the experiments in the first place. Hypothesis H1 assumes that FTL neutrinos do not exist, so they pop up only seldom in the data, but sometimes they do, say because of statistical flukes in measurement devices or systematic errors. H2 is the hypothesis that FTL particles should show up at high energies, since such is possible in emergent relativity, and relativity looks very much like an emergent symmetry. It is expected by several cutting-edge models vital to unifying physics, and there are even hints from previous experiments.

    Ha – experiments! Great – we have a good prior, namely the posterior from the previous experiments. Sadly, those experiments not only had low significance; it is also hard to put them into actual numbers. The MINOS neutrino data may indicate either low superluminal velocities or, if taken together with supernova data and OPERA data, are better interpreted as very high initial superluminal velocities over short distances. If it is not even agreed whether the velocity is below, a little above, or very much above that of light, how can we set the probability PH1 to any particular number? Since the previous data are highly controversial, the prior must be unbiased as before (as in unbiased science).

    The results stay the same: H1, the hypothesis that leaves very few opportunities for FTL particles to turn up in the data, turns out unlikely. H2 allows FTL neutrinos and is supported by the FTL data. Serious scientists take the data yet more seriously, and perhaps one can start to use these new, highly significant posteriors as the priors next time.

    Enter Scientists' fixed Beliefs

    A sober scientific analysis is sadly not what usually happens in controversial cases, where a sober mind would be most necessary. Many plainly refuse the hypothesis H2. They dogmatically believe in H1! They argue that FTL is impossible on principle, drawing on whatever arguments are convenient. Indeed, some go on and call everybody who does not agree that PH2 = 0 a crackpot and Einstein denier, period. In this case, Bayesian updating does not care about any new data, because the prior belief is so strong that whatever your data are, the posterior remains the prior belief. This is called Cromwell's rule:

    “The reference is to Oliver Cromwell, who famously wrote to the synod of the Church of Scotland on August 5, 1650 saying

        “I beseech you, in the bowels of Christ, think it possible that you may be mistaken.”

    As Lindley puts it, if a coherent Bayesian attaches a prior probability of zero to the hypothesis that the Moon is made of green cheese, then even whole armies of astronauts coming back bearing green cheese cannot convince him. Setting the prior probability (what is known about a variable in the absence of some evidence) to 0 (or 1), then, by Bayes' theorem, the posterior probability (probability of the variable, given the evidence) is forced to be 0 (or 1) as well.” Source: Cromwell's Rule

    In our example, PH2 = 0 lets the following happen: PD = PDgH1 PH1 + PDgH2 PH2 becomes PD = PDgH1 + 0 and thus equal to PDgH1. Therefore, the posteriors equal the priors:

    PH1gB = PBgH1 PH1 / PB = PBgH1 1 / PBgH1 = 1


    PH2gB = PBgH2 PH2 / PB = PBgH2 0 / PBgH1 = 0

    Scientists do not just put in a prior of zero. That would be too obvious. They argue PH2 to be so small that for all practical purposes it may as well be zero. They massage arguments, entering assumptions and dismissing previous studies, until they have a prior PH2 small enough that PH2gB is much smaller than PH1gB.

    The trick is to make PBgH2 PH2 much smaller than PBgH1 PH1 = PBgH1 (1 – PH2). One can, for example, contrive arguments until PBgH1 and PBgH2 are as desired, but even if those were known, you can still simply argue until PH2 is much smaller than PBgH1 (1 – PH2) / PBgH2, in order to refuse letting the reality of the data change your world view. In our example above, claiming that PH2 must be assumed to be much smaller than 1/5 does the job.
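    Both Cromwell's rule and the "practically zero" trick can be seen with the urn numbers from above. This sketch is just the urn calculation with the prior on H2 left as a free parameter (the 1/1000 value is an arbitrary choice of a "practically zero" prior):

```python
from fractions import Fraction

def posterior_h2(prior_h2):
    """PH2gB for the urn example, with the prior PH2 as a free parameter."""
    like_b_h1, like_b_h2 = Fraction(1, 10), Fraction(4, 10)
    # PB = PBgH1 PH1 + PBgH2 PH2, with PH1 = 1 - PH2
    p_b = like_b_h1 * (1 - prior_h2) + like_b_h2 * prior_h2
    return like_b_h2 * prior_h2 / p_b

print(posterior_h2(Fraction(1, 2)))     # 4/5: unbiased prior, the data speak
print(posterior_h2(Fraction(0)))        # 0: a zero prior ignores all data (Cromwell's rule)
print(posterior_h2(Fraction(1, 1000)))  # 4/1003: a "practically zero" prior keeps it tiny
```

    With the unbiased prior the black ball strongly favors H2; with a zero or near-zero prior, no amount of the same data can move the posterior appreciably.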


    If we allow belief to enter controversial priors, the scientific method will be rendered impotent. If scientists may do so, there is no reason why intelligent design (ID) should not insist on its own prior according to which evolution is impossible. If you are a real skeptic who is interested in science outreach and in reestablishing the trust of the public, you do not force-feed your beliefs via a perversion of the scientific method. You agree to unbiased priors and let the experimental data speak.


    [1] Daryl J. Bem: “Feeling the Future: Experimental Evidence for Anomalous Retroactive Influences on Cognition and Affect” Journal of Personality and Social Psychology, 100,407-425 (2011) [NOTE: Quoting peer-reviewed papers does not imply commitment to the truth of their claims!]

    [2] Erik M. Gauger, Elisabeth Rieper, John J. L. Morton, Simon C. Benjamin, and Vlatko Vedral: “Sustained Quantum Coherence and Entanglement in the Avian Compass.” Phys. Rev. Lett. 106(4), 040503 (2011)

    [3] M. Tegmark: “Importance of quantum decoherence in brain processes.” Phys Rev E 61(4), 4194 (2000)

    [4] Wagenmakers, et al.: “Why psychologists must change the way they analyze their data: The case of psi.” Journal of Personality and Social Psychology, 100, 426-432 (2011)

    [5] Vongehr, S.: “Supporting Abstract Relational Space-Time as fundamental without Doctrinism against Emergence.” arXiv:0912.3069v2 (2009)

    [6] Liberati, S., Sonego, S., Visser, M.: “Faster-than-c Signals, Special Relativity, and Causality.” Annals of Physics 298, 167–185 (2002)




    Well done Sascha!

    Nice article!
    I am very definitely not a scientist or statistician, and you explained, for the first time in a way that made sense to me, what 'Bayesian probability' actually means. In even simpler layperson's terms, I now take it to describe an iterative approach to establishing probability, with each new datum modifying the previously calculated probability ever closer to the 'correct' probability.

    How successfully can one really use this approach as a benchmark to counter the fact that orthodoxy is stubborn in rejecting the FTL hypothesis though?

    If we do some kind of Bayesian analysis on the hypothesis that orthodoxy usually turns out to be correct whenever anomalous results are found, would this not support the stubborn attitude in general?


    Maybe this and several comments below are best answered by pointing out once more explicitly that the given example was not meant to suggest that proper Bayesian priors would give FTL neutrinos a large posterior probability. The point of the example is to make it obvious how very sensitive this method is to fudging around with what is more belief than science.

    If nothing is gained (all stay confirmed in their beliefs) and additionally the scientific method looks increasingly like sophistication in justifying/rationalizing belief, then we may better not use such argumentation at all. An unbiased prior may often lose a battle but win the war. Unfortunately, career science today is all about surviving the next battle.
    Thor Russell
    OK, what about the situation where you are testing different theories to explain an effect (say high-temperature superconductivity) and there have already been 100 theories put forward to explain the effect, but they have all been disproven by evidence. What prior probability should you give the 101st? Because if evidence supports it at a 95% confidence level, then surely you are justified in saying that it is promising, but by no means proven, or even likely?
    I have practical experience of this kind of thing from a different angle, where I really wish that results would be significant but somehow they turn out not to be. For example, I make a minor change to a website hoping for a minor improvement in some metric, e.g. sales of the product it promotes. It runs for a few days, giving a 95% chance of being "real", but then more often than not turns out not to be. It seems in this case I should give the minor change a high prior probability of making no difference at all, in order to make sense of the results and avoid continual disappointment.
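    The commenter's intuition can be made quantitative with the same two-hypothesis machinery. The numbers below are illustrative assumptions, not from any study: an assumed 80% chance that a real effect shows up as significant, and the usual 5% false-positive rate. Even then, a "95% significant" result is far from 95% likely to be real when real effects are rare:

```python
def p_real_given_significant(prior_real, power=0.8, alpha=0.05):
    """Posterior probability that an effect is real, given a significant test.

    power: assumed P(significant | real effect)
    alpha: P(significant | no effect), the 5% false-positive rate
    """
    # P(significant) = power * P(real) + alpha * P(not real)
    p_sig = power * prior_real + alpha * (1 - prior_real)
    return power * prior_real / p_sig

# If only 1 in 10 tweaks has a real effect, a "significant" hit
# is real only about 64% of the time.
print(p_real_given_significant(0.10))
# With a 1-in-100 prior, most significant results are false alarms.
print(p_real_given_significant(0.01))
```

    This is the legitimate use of a skeptical prior: it is grounded in counted experience with many similar tweaks, not in a belief that improvements are impossible in principle.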
    Thor Russell
    I agree that some scientists are too skeptical, but I think triple-checking is good for everybody in science. So, K2K is going to repeat the OPERA experiment.

    Let me remind you of the story of element Z=118, performed at Berkeley Lab 13 years ago. Fraud or mistake, we still do not know. Dubna could repeat the experiment. There was even a lawsuit for data fabrication in Berkeley.

    A year ago, I was participating in a seminar where my colleagues from LLNL, ORNL, and Dubna and I were discussing the discovery of element Z=117. I had to keep the discovery a deep secret for more than 8 months, until all data cross-checking was finished. Even after that, when the Z=117 publication reached Nature, GSI in Germany started a new experiment to repeat the LLNL-ORNL-Dubna experiment exactly. And .... GSI confirmed the discovery with 7 sigma.

    So, I think Tommaso is right. We have to be patient, and we have to wait until K2K repeats the OPERA experiment.



    So we should give contradictory evidence more weight because its probability is so small (to be almost zero)?

    We have to be very skeptical of evidence of the extraordinary unless there is a good model to support it.

    I run into this all the time with ESP and perpetual motion 'evidence' used as an argument against authority: if 'science' says it can't be true, it must really exist. Do they offer a new model of the universe to support their evidence (like Relativity did 100 years ago)? Do they offer a deeper understanding of the interaction of matter and energy (like the wave/particle model of light)?


    Without a new coherent model of the universe to provide a framework where their evidence makes sense all they are offering is literally magic and superstition.

    You not only have to prove yourself right, you have to prove that our current understanding of the universe is wrong, and provide a new different vision of the universe.

    Relativity did just that 106 years ago, extending our Newtonian universe into a relativistic universe at extreme speeds and distances.

    Extraordinary claims require extraordinary evidence plus an extraordinary new model of the universe.

    You forget a few things, the two most important being:
    1) Advances in science start quite often with empirical data that cannot be explained by any available model yet, often for centuries. Such is not "magic and superstition".
    2) There are models that allow FTL particles. The ever-repeated phrase that somehow "our current understanding of the universe is wrong" if there are FTL neutrinos is simply based on people not knowing their relativity theory properly.
    Sascha, you've used far too much passive voice and forms of "be" for this article to read compellingly. Practice the virtues of active voice, and your writing will hit the bullseye.

    I rather like the passive. Often the actual subject is hidden in the machinery, the obvious trigger man just a patsy. But I will remember your hint. Yes, it does not help readability of course.
    I may understand the Bayesian method; however, I think the error the OPERA Collaboration made in their analysis is of a less profound nature.
    In an experiment there is always a stimulus and a response.
    Using a response for which there is no corresponding stimulus is invalid, because there was no experiment.
    Using a stimulus for which there is no corresponding response is invalid as well, for the same reason.
    The latter is the case in the current analysis of the OPERA Collaboration.

    Only a part of the PEW contains start time information of the proton (stimulus) that later resulted in a neutrino detection (response).
    The remaining parts of the PEW contain start time information of protons for which there was no neutrino detection.
    The current analysis allows the remaining parts to determine the shape of the PDF; it cannot be ruled out that this results in bias, because of the irrelevant start time information in the PEWs.

    A number of physicists pointed out that these remaining parts are required for constructing the PDF to enable the maximum likelihood analysis and they dismissed the idea that this was invalid.
    This seems the mainstream view and I am wondering what to think about that.
    It explains why the analysis is taken for granted.

    See also



    The reason why
    The remaining parts of the PEW contain start time information of protons for which there was no neutrino detection.
    The current analysis allows the remaining parts to determine the shape of the PDF; ...
    is due to certain (tacit) assumptions:
    1. Neutrinos are notoriously difficult to detect, so it is fully expected that there will be neutrino generation events for which there is no corresponding detection event.
    2. There is no reason to expect that the neutrino generation events that happened to have no corresponding detection event will be any different than those generation events for which there is a corresponding detection event.  (See #1 for part of the reason for this.)
    Now, you are correct that it is not completely impossible that #2 may be violated.  So, one could restrict oneself to those neutrino generation events for which there are corresponding detection events.  However, it should come as no surprise that this will result in a PDF with far more uncertainty, thus greatly decreasing the significance of the OPERA results.

    However, I would certainly not characterize this as "It explains why the analysis is taken for granted."  I most certainly have seen very little evidence that "the analysis is taken for granted."

    Hi David,

    If there is no reason to expect that the neutrino generation events that happened to have no corresponding detection event will be any different than those generation events for which there is a corresponding detection event, then, why is the sum of the PEWs a PDF?
    Of course, the neutrino generation itself is the same, however, when a neutrino is detected, the probability of detecting an event predicted by this PEW is 1, if the PEW is properly scaled as a probability.
    Exactly as in a lottery. If you buy a lottery ticket, you have a small chance of winning a prize.
    If a prize is won with this ticket, the chance of winning a prize with this ticket is 1.
    With the PEW there is one difference: it is not exactly 1, because, due to quantization noise of the BCT and digitizer, there is an uncertainty in the probability of +- 1/2 because of the Gaussian proton distribution.
    To take this one step further, the 1 ns PEW part with a corresponding event has the value 1 +- 1/2 and for the other parts it is 1/2 +- 1/2, again properly scaled.
    Ironically, there is a very simple way to exclude the PEW parts without corresponding event.
    Currently the PEWs are summed while the start of the PEWs are still aligned with the kicker magnet signal.
    It is also possible to time-shift the PEWs, so that the corresponding events are aligned; in that case the parts of the PEW with a corresponding event are aligned as well.
    If the PEWs are summed now and a scan is done over the obtained sums of the individual PEW parts, it can be shown that a single maximum will indicate the start time.
    If I go too fast, see my website for details.


    Sascha, Lubos's website has the rumour ... the new 2 ns bunch tests still have the neutrinos going superluminal. Official report due in a couple of days. Whatever is going wrong is in the system, or the effect is real; definitely nothing to do with bunching.

    Rumor has it OPERA has reproduced their result taking out most of the statistics. This is getting interesting.

    Let's say you wonder whether golfers of different nationalities, ages, and religions can hit holes in one. So you get 5,000 people who differ across these attributes and have each try for a hole in one on a par-three hole. After 4,000 people have missed, lo and behold, a 42-year-old Philippine Catholic woman hits a hole in one! The remainder of the 5,000 people also miss. Can you conclude that 42-year-old Philippine Catholic women have the extraordinary ability to get holes in one?

    You probably see the flaw in concluding this. 5,000 people hit golf shots and only one made a hole in one. It probably didn't have much if anything to do with the fact that the successful golfer was female, 42, Philippine, or Catholic. And if these 5,000 tried again, common sense tells us this same person would be very unlikely to get another ace.

    If you follow this example and its flaws in reasoning, then you can also follow the flaws in Bem's experiment and reasoning. His experiment was exactly similar. He kept trying until a small group exceeded chance, then attributed their success to their demographic attributes. And he didn't have them try again to see if they could repeat their success. His experiment did not meet the established requirements of statistical methods. But he wrote it up so confusingly that the journal reviewers didn't realize it. After it was published, everyone realized what he had done.

    His experiment was exactly similar. He kept trying until a small group exceeded chance
    You are telling serious untruths, and I do not tolerate such. The issue of the number of pilot studies he did is already discussed at length in the original paper! His statistics are done according to standard methods and passed four peer reviewers knowledgeable about statistics.

    You do nothing here but insult and tell lies in the hope that some of the dirt sticks. Don't try this again or I will delete it. This here is a serious science column, not your PZ minions' comment jerk off bonanza!
    Bert Morrien
    The newest outcome of OPERA's neutrino velocity measurement also included the result of an alternative analysis.
    This result was compatible with the earlier finding, and so was the result of a new experiment with much shorter pulses.
    This means OPERA’s current analysis must be valid.
    It also means that OPERA knew exactly what they were doing.
    Consequently, the PDF obtained by summing the PEWs is valid, despite the lack of PEW parts with a corresponding event.
    This is because, with enough events, the event distribution resembles the shape of the PDF sufficiently to trust the outcome of a maximum likelihood analysis.
    It is regrettable that this point never became clear to me before.

    The lesson learned is that declaring the PDF and Opera’s analysis invalid is a good example of narrow minded reasoning; a humble apology is in order here.

    ??? I think you have to explain yourself a little less cryptically. Who declared the PDF and OPERA's analysis invalid and should apologize? Those who generally put in a prior of basically zero without even doing any calculation, or somebody in particular doing a certain calculation?
    "This is because with enough events, the event distribution ..."
    Not sure whether this is a relevant comment under a post on Bayesian updating over a series of different experiments ("enough events" from the same setup all share the same systematic error).
    Bert Morrien

    OK, fair enough. It was me who learned the lesson. I simply do not know enough about statistics to appreciate the validity of OPERA’s original analysis.
    I was talking about the PEW parts, i.e. the 1 ns elements of the digitized waveform. It struck me that the PDF is mainly composed of PEW parts that have no corresponding event. These parts must be considered as noise, because they cannot have a relation to the start time of a proton that caused an event. I fully appreciate that OPERA dealt with this noise in an appropriate way, i.e. they did not allow this noise to cause a biased result.
    Being a skilled and experienced hardware and software engineer, I envisioned an analysis method that basically relies only on PEW parts with a corresponding event. I forwarded this idea to Alessandro Bertolin on Oct 3, 2011. Although not confirmed, maybe this was the alternative analysis mentioned in the latest version of “Measurement of the neutrino velocity with the OPERA detector in the CNGS beam":

    "An alternative method to extract the value of delta-t consists in building the likelihood function by associating each neutrino interaction to its waveform instead of using the global PDF. This method can in principle lead to smallerstatistical errors given the direct comparison of each event with its related waveform. However, particular care must be taken in filtering the electronic noise, white and coherent, that affects individual waveforms, while it cancels out inthe global PDF."

    My method relies on the fact that each event has a corresponding PEW part. This part describes the proton density at the time the proton was fired that caused the neutrino detection.
    It describes also the neutrino density at the detector TOF later. This value is proportional to the probability of detecting a neutrino.
    If the PEW parts with a corresponding event are summed, the contribution of each PEW part is proportional to the frequency with which PEW parts of the same amplitude contribute to the sum. In other words, the contributions form a quadratic function.
    For PEW parts without a corresponding event, the contributions form a linear function, because these have no relation to any event and are therefore only random PEW samples.
    The quadratic function leads to a higher sum than the linear one. I am aware that this is not true if insufficient events are available, but let us assume that there are enough events.

    Now, let PEW[t] be the value of a PEW element at time t with respect to the kicker magnet signal.
    For all available PEWs with their corresponding event, obtain the sums of PEW[event-x] for a range of x values, which includes the TOF.
    The value of x corresponding to the highest sum must be the TOF.

    Note that this sum contains only a small number of PEW parts without a corresponding event, due to timing uncertainties, and these have no biasing effect.
    Of course, the practical implementation will have to deal with things like how many events are needed and how to handle the strong 200 MHz component of the PEW. That is why this analysis method is incomplete: it lacks a more solid quantitative understanding.
    That does not invalidate my method, because I think all creativity starts with qualitative reasoning.

    I always told my kids that if they want to know something, they must sort it out.
    Maybe you have doubts, but I think this is what being sceptical is about.