Tevatron: Evidence Of The Higgs In B-Bbar Final States

Two days ago the Tevatron experiments jointly posted to the arXiv a paper titled "Evidence for a particle produced in association with weak bosons and decaying to a bottom-antibottom quark pair in the search for the Higgs boson at the Tevatron collider". You can get the paper on the arXiv.

The article is the final chapter of a more than decade-long search for the Higgs boson at the Tevatron. In fact I should say two-decade-long, since the first searches started almost twenty years ago. I have been a member of the CDF collaboration (one of the two collaborations that jointly produced the paper; the other is of course DZERO, the other experiment at the Tevatron collider) since 1992, and I spent over half of my research time there searching for resonances decaying into b-bbar states. So this post is from an insider, if you will. I specify this as an introduction to my personal views on some details of the matter, which I express below.

The facts

First of all let us discuss the facts, i.e. what the paper says. This is a combination of results of Higgs boson searches by CDF and DZERO restricted to the cases when the Higgs boson decays into a bottom-antibottom quark pair: other searches are neglected here. There are various channels that enter this combined result, all having the common trait of assuming that the particle decaying to bbbar is produced together with a W or a Z boson, as the Higgs is predicted to do in one of its production channels in hadron collisions.

So we have ppbar->WH, with W going to electron-neutrino and muon-neutrino pairs; ppbar->ZH, with Z going to electron pairs or muon pairs; and ppbar->ZH, with Z going to neutrinos. These searches have a common trait in the triggers, which rely on the leptons (electron candidates, muon candidates, or missing transverse energy signalling the escape of energetic neutrinos), and in the application of multivariate techniques to reduce backgrounds. Multivariate techniques are also used to improve b-tagging, the identification of b-quarks inside jets. All searches then concentrate on the invariant mass of the pair of b-tagged jets, after a mass-dependent optimization of cuts.
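For readers who have never computed one, here is a minimal sketch of the dijet invariant mass that all these searches look at. It is only the textbook four-vector formula, not the experiments' reconstruction code: the function names and the two example jets are hypothetical, and jets are described by transverse momentum, pseudorapidity, azimuth and (optionally) mass, in GeV.

```python
import math


def four_vector(pt, eta, phi, mass=0.0):
    """Return (E, px, py, pz) in GeV from collider coordinates."""
    px = pt * math.cos(phi)
    py = pt * math.sin(phi)
    pz = pt * math.sinh(eta)
    energy = math.sqrt(px * px + py * py + pz * pz + mass * mass)
    return energy, px, py, pz


def dijet_mass(jet1, jet2):
    """Invariant mass of the sum of two jets, each given as (pt, eta, phi[, mass])."""
    e, px, py, pz = (a + b for a, b in zip(four_vector(*jet1), four_vector(*jet2)))
    m2 = e * e - (px * px + py * py + pz * pz)
    # Guard against tiny negative values caused by floating-point rounding.
    return math.sqrt(max(m2, 0.0))


# Hypothetical example: two massless, central, back-to-back jets of 62.5 GeV each.
example_mass = dijet_mass((62.5, 0.0, 0.0), (62.5, 0.0, math.pi))
```

In real data the jets are neither massless nor perfectly measured, which is why the b-jet energy scale and the dijet mass resolution matter so much for these searches.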

It is impossible to do justice to all these analyses in a post, so the above description is all I can offer as far as the analysis techniques are concerned. The result of the combination of the various channels across the two experiments is offered using both a Bayesian technique and the CLs frequentist criterion; both are well-known technology in this business.

The collaborations observe an excess of events at the highest values of signal discriminant, which makes their upper limit on the signal cross section weaker than expected. They test the null hypothesis (that the data only contains background) and find that the p-value has a dip in the 120-135 GeV region, with a minimum value corresponding to 3.3 standard deviations, at 135 GeV. When corrected for the multiplicity of places where a signal is sought, this becomes a 3.1 sigma effect.
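To get a feeling for these numbers, here is a minimal sketch of the conversion between "sigmas" and p-values, assuming the usual one-sided Gaussian convention. The trials-factor formula below is a simple stand-in for the correction for multiple search places; it is an assumption of this sketch, not necessarily the procedure used in the paper, and all function names are made up for illustration.

```python
import math
from statistics import NormalDist


def sigma_to_p(z):
    """One-sided Gaussian tail probability for a z-sigma excess."""
    return 0.5 * math.erfc(z / math.sqrt(2.0))


def p_to_sigma(p):
    """Inverse of sigma_to_p: significance in sigma for a one-sided p-value."""
    return -NormalDist().inv_cdf(p)


def global_p(local_p, n_trials):
    """Global p-value for n_trials independent search places (simple assumption)."""
    return 1.0 - (1.0 - local_p) ** n_trials


def implied_trials(local_z, global_z):
    """Effective number of independent places implied by a local/global pair."""
    return math.log1p(-sigma_to_p(global_z)) / math.log1p(-sigma_to_p(local_z))


local_p = sigma_to_p(3.3)          # p-value of the 3.3-sigma local excess
quoted_global_p = sigma_to_p(3.1)  # p-value of the quoted 3.1-sigma global effect
trials = implied_trials(3.3, 3.1)  # effective trials factor under the assumption above
```

Under these assumptions the quoted drop from 3.3 to 3.1 sigma corresponds to an effective trials factor of about two, which is plausible for a search over a narrow mass range with a broad dijet mass resolution.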

Perhaps the most intriguing figure in the paper is the one shown on the right, where the mass distribution of the excess of b-tagged events over non-electroweak background processes is shown and compared to the predicted excess due to production of WZ (in red), ZZ (in yellow) and VH (in green) processes. The Tevatron data clearly shows the two WZ and ZZ processes (where one Z boson decayed to b-quark pairs, thereby generating the observed bump in the dijet mass distribution); as for the Higgs, one could believe that the high point at 120-140 GeV is due to it, or could rather question the dips at 20-60 and 180-220 GeV. Your pick.

Diatribe mode on

And here, dear reader, comes the part where I need to express my views. I know that by doing this I may upset some of my CDF colleagues, even ones that have worked very hard for years on the Higgs decay to b-quark pairs (but so did I). Let me apologize to them pre-emptively: I know that having a blog is a privilege, which allows one to speak to an audience, and I usually do not use that privilege to broadcast views in opposition to those of collaborations of which I am a member. This is just being professional, of course. But in this case I will make an exception, because I do not believe I am hurting anybody's interests or damaging CDF (or DZERO) in any way if I express my personal views on this matter. Still, I apologize if this article feels inappropriate to them.

So, the heart of the matter for me is the following: I have been in the CDF collaboration for twenty years, and I do not recall that in the past we ever claimed evidence for a new particle or process with a similarly marginal backing. That is, we did it for one process, single top production, which we were convinced had to be there: but that is just the exception that confirms the rule.

In the past 25 years CDF has been appreciated for its very careful approach to research in HEP, both in the depth and thoroughness of the internal review of all its results and in its conservative stand when publishing measurements.

Since the nineties, CDF has set a very high standard in the field. I have been on both sides of the barricades when heated discussions took place within my collaboration on whether a potential new signal or anomalous effect was worth publishing or should be kept private until more studies could be carried out, and I was often surprised by the overly cautious approach of some of my colleagues, who would e.g. be willing to wait and sit on a controversial result, even for months or years, rather than present it to the outside world "as is". In truth, most of the time I criticized this approach, on the basis that I found it silly that 500 physicists would keep 5000 more oblivious of a potential signal, when this could stimulate new theoretical developments. Why keep it secret? To protect reputations and careers? I found this attitude not quite scientific.

On the other hand, the publication of anomalous effects was always a very painful delivery, whereby all claims originally contained in versions 0.x and 1.x of paper drafts would be toned down and caveats of all sorts would be added, to make it clear to the readers of our papers that we were publishing our results without supporting this or that interpretation, just to make them known to the outside world. We would publish since we considered that we had done all that was possible to verify the correctness of our results, but were not taking a stand on their meaning yet. I recall several discussions on whether a 2.x sigma effect should be quantified in the abstract or left in the middle of page fourteen, to deemphasize it. Words such as "evidence", "anomalous", "discrepancy" were red flags for an active pool of internal censors.

I can also recall at least a dozen cases when controversial situations arose in CDF; the controversy was often first on the correctness of the result, and then, when a result could not be withheld indefinitely, it moved on to the wording in the drafts and the way to interpret it in a publication. The whole thing often entailed heated debates of all kinds, name-calling, slammed doors, and acts of retaliation not always of academic nature. The scientific process is not always smooth sailing, indeed. Yet it can still be healthy and effective even if at times it borders on the inappropriateness of physical confrontation among colleagues!

Despite all the internal digestion pains, to the outside CDF always presented a very professional, consistent "feet on the ground" approach to interpreting its results. And to my memory a 3-sigma effect was never emphasized the way I see it done in the Higgs combination paper.

What makes the difference? What follows is my personal opinion, and as I said I did not participate in the review process of the draft, so please take this as a disclaimer that I am not writing on behalf of anybody but myself. To me, the source of the difference appears to be the fact that the LHC experiments have in the meantime produced 5-sigma evidence of the presence of a Higgs-boson-like particle in their data.

I am saying this because I believe that if the LHC had not seen the Higgs at 125 GeV, and the Tevatron were still planning to collect more data, the Tevatron Higgs combination would not be phrased this way. Claiming evidence for a new particle based on a 3-sigma excess coming from a mix of multivariate analyses would most likely be avoided, for fear that the 3 sigma would later be found to be just a background fluctuation. Careers would be at stake: this would force a more careful wording. No "evidence" in the abstract, none in the conclusions. Maybe not even the number 3 in the abstract either.

So, I am led to believe that the choice of writing that the Tevatron data shows "evidence" for the Higgs decay into b-bbar comes from the clear belief that the Higgs is indeed there. So far so good: indeed, everybody is by now convinced. One is then bound to ask: why not write this in the paper? The paper mentions past searches for the Higgs, and has references for the earlier Higgs searches and detailed descriptions of the upper and lower mass bounds. When it comes to mentioning the 5-sigma observation of July 4th, however, the paper relegates it to the references.

I can hear the objection: "oh, but that is not a published result". Let me answer single-wordedly to that: bullshit. Communication of science these days is made in real time. Heuer announced the discovery to the world on July 4th, and no peer-reviewed journal can subvert that fact.

I would have made a quite different editorial choice, more transparent in this respect: I would have written one line of introduction on past searches for the Higgs in place of the 18 lines of the right column of page 6. It would have sufficed to say: "the LHC experiments have reported observing the Higgs boson at 125 GeV. If this is the SM Higgs, we have good sensitivity to the bb decay channel. This paper therefore focuses on that signature: if we see a signal at 125 GeV, it is probably contributed by Higgs decays, so we can measure stuff with it." Okay, I would find a better wording for "stuff", but you get my point.

Omitting such a statement in the text means that one wants to work independently of that datum. Indeed, the search spans 50 GeV (from 100 to 150), and this is anyway a meaningful thing to do: it allows one to verify one's background estimates away from the putative signal region. But then, why should one say, as is written in the abstract and the conclusions,

"We interpret this result as evidence of the presence of a particle that is produced in association with a W or a Z boson and decays to a bottom-antibottom quark pair."

making no mention of the LHC having established the existence of the Higgs, that is, as if this could be said in a paper independently of the LHC find?

Please note that even such subtle details as writing the words "evidence" and "interpret as", and repeating them twice in a paper, have in the past raised heated debates even when the observed effects were of much larger significance.

So, here it is: I am in disagreement with the emphasis that is given in this paper to the observed effect in the data. The Higgs is certainly there at 125 GeV where the LHC experiments have found it, but the 3.1-sigma 135-GeV excess found by the Tevatron experiments (or if you prefer, the 2.8-sigma excess found at 125 GeV) might well be a background fluctuation if we did not know about the LHC observation. Actually, it could well be a background fluke anyway!

A background fluctuation where the signal sits ??!! Yes, quite possible -actually, we know already that a fluctuation occurred in Tevatron data, since the observed excess is twice as large as the one expected for a SM Higgs; so this may be either a signal or a background fluke, but a fluke it certainly is (unless, of course, we marry improbable hypotheses of Higgs bosons with weird couplings, large for fermions in the US and large for bosons in Europe).

Please look at the figure on the right, taken from Fig. 6 in the paper. There you can see the observed p-value in black; the dashed line is instead the p-value that the experiments predicted they would observe if there were a Higgs particle in their data. At 135 GeV, the point which crosses the magic "3-sigma evidence" line, the Higgs signal would be expected to produce just a one-sigmaish deviation (for 125 GeV, the sensitivity is just 1.5 sigma). If there is a Standard Model Higgs boson in the data, the two additional standard deviations at 135 GeV may well be due to the background fluctuating up, not the signal.
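The size of that upward fluctuation can be put into rough numbers. The sketch below assumes the common large-sample approximation in which, if a Standard Model Higgs is really there, the observed significance is Gaussian-distributed around the expected one with unit width. That approximation, the function names, and the choice of exactly 1.0 sigma for the "one-sigmaish" expectation are assumptions of this sketch, not statements from the paper.

```python
import math


def upper_tail(z):
    """One-sided Gaussian tail probability beyond z standard deviations."""
    return 0.5 * math.erfc(z / math.sqrt(2.0))


def prob_observing_at_least(z_obs, z_exp):
    """Chance of seeing z_obs or more when z_exp is expected (unit-width Gaussian)."""
    return upper_tail(z_obs - z_exp)


# Significances quoted in the text: observed vs. expected with a SM Higgs.
p_135 = prob_observing_at_least(3.3, 1.0)  # 135 GeV: 3.3 observed, about 1 expected
p_125 = prob_observing_at_least(2.8, 1.5)  # 125 GeV: 2.8 observed, 1.5 expected
```

With these inputs, a Standard Model Higgs would deliver an excess as large as the one seen at 135 GeV only about once in a hundred tries, and the one at 125 GeV about once in ten. Neither is outrageous, but both say that a lucky fluctuation contributes to the observed significance.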

In other words, the Tevatron claim of seeing evidence for the Higgs boson is based on observing a fluctuation which is much larger than that expected for a Higgs. This is of course also reflected by the measured signal rate, which is twice the predicted one even for the 125 GeV mass point (which of course would not be emphasized in the paper if the LHC experiments had not found a particle there...).

Should I sign the paper?

Now, the question becomes even more a personal one.

I am a member of the CDF collaboration, but have been unable to follow the review process of this paper -and thus to fight against the discovery tone of the text, or for a more open clarification that the interpretation of the excess is influenced by the LHC find. My fault, of course.

Yet, there still is one possibility left for me: taking my name off the publication. Note, this is allowed in CDF, and it is not infrequently done, for a variety of reasons. Nobody really gets upset anymore if a colleague decides to avoid signing a paper: we all have the right to do so, and in fact new schemes for signing papers with an "opt in" procedure are under study.

So I think that taking my name off the Tevatron Higgs search paper is a serious possibility. Indeed, it could even be deemed advisable to do so, given that I am also signing CMS papers on the same subject, which are in some way in competition with the CDF Higgs search.

As far as the ethical issue above is concerned, I believe I need to consult with other colleagues who are in the same situation; but in the meantime I also ask for your advice in the light of the opinions I have expressed on the tone of the paper. Before I do, I need to provide you with some required input on the matter, though. The input concerns my personal contribution to Tevatron Higgs searches in precisely the final state which has been used to find this 3-sigma effect.

So please consider:

- I have spent over a decade in CDF on the topic of reconstructing b-quark-pair resonances, with the explicit aim of one day seeing the Higgs boson decay in that final state. I started in 1996, when I was the first in CDF (or DZERO, for that matter) to search for the decay of the Z boson to b-quark pairs, a process which is as close as possible to the Higgs decay. Of course, the Z is produced at a rate over 1000 times higher, so the matter could be pursued even with the smaller Run I dataset that CDF had gathered in 1992-96. When I started to search for the Z->bb process I remember very well the reaction of many colleagues. Wei-Ming Yao, for instance, an esteemed colleague from Berkeley and one of the fathers of b-tagging in CDF, said at a meeting "it is impossible, the background is too high" when I asked him if anybody was trying to search for Z->bb (I was still considering whether that would become my PhD thesis topic, which it did). Another esteemed colleague, Jonathan Lewis, when I spoke to him about my search in order to get his advice on muon triggers, had this reaction: "So this is just a check, right? Why else would one want to search for Z->bb?". Those were the reactions in 1997 in CDF to my attempts at finding a resonance decaying into bbbar pairs, which I had already convinced myself was the necessary precondition to search for a light Higgs.

- Despite the bad karma around, I did find a signal, defended a PhD thesis on the topic, and later on contributed to a Tevatron Run II study on finding the Higgs, which had a section showing how one could find bbbar resonances, describing the newfound Z->bb signal. Later, as Run II began, I helped design a trigger that specifically collected the Z->bb decays, which were by then also recognized as an asset for the reduction of the b-jet energy scale uncertainty. In 2005, as convener of the Jet Energy and Resolution working group in CDF, I furthered those studies with a small group of collaborators (most of the work then was done by Julien Donini, now a professor at Grenoble but back then a post-doc in Padova). We produced a larger signal of Z->bb decays, extracted the b-jet energy scale from it, and published a NIM paper on the topic.

- I of course worked in many other ways on the specific topic of Higgs searches in CDF, but I guess it is irrelevant to list them here at this point. Maybe worth quoting, however, is a seminal study I did in 2003 when I participated in the Higgs Sensitivity Working Group, which was to assess the Tevatron chances for a Higgs discovery. Together with Luca Scodellaro, a graduate student in Padova, I demonstrated that the resolution on the Higgs boson mass in the dijet final state could be significantly improved with specialized multi-variable algorithms, such that the signal observability would be increased. By the way, note that the 2003 study aimed at demonstrating the usefulness of an upgrade of the silicon tracker in CDF and DZERO, an upgrade that was unfortunately rejected... That probably made the difference between the actual situation now, with the Tevatron having an expected sensitivity of less than 2 sigma for a Higgs boson at 125 GeV, and the situation we could have been in if the upgrades had been funded.

So the above should be taken to mean that I believe I have indeed contributed sizeably to the paper that appeared two days ago on the arXiv. Should I drop my signature from the paper based on the fact that I disagree with the way the result is reported?

You be the judge.
