The text below is the fifth part of what could have become "Chapter 13" of the book "Anomaly! Collider Physics and the Quest for New Phenomena at Fermilab", which I published in 2016. For part 1 see here; for part 2 see here; for part 3 see here; for part 4 see here.

No superjets in Run 2

It needed to be done, and somebody eventually decided to do it. Were superjet events a lusus naturae, a funny fluctuation in an otherwise perfectly well-behaved sample of data, a sub-subset of events artfully carved to create an anomaly where none existed, or were they the first hint of something new, totally unexplained by current models, a special dataset containing decays of a mysterious new particle? The question was begging to be answered.

During the years of the superjet controversy, several of my CDF colleagues had repeatedly pointed out, in internal as well as in public discussions, that those 13 events, with their oddity and the arguably tiny probability of their appearance in a world with no physics beyond the Standard Model, had to be considered an a posteriori observation. The "W + jets" data had been studied for a long time: first the focus had been on W boson production events accompanied by three or four hadronic jets, wherein decays of top-quark pairs had been sought and finally extracted (see Chapters 4, 5, and 6); then the excess of b-tagged W+2 jet events had prompted speculation about the possible decay of a Higgs-like particle (see Chapter 11); and finally, all the attention had been drawn to the subset of events containing a superjet. As the total W+jets sample was not very large, it was impossible to argue that the anomalous 13 superjet events were the result of an unbiased investigation: it would not be a stretch to say that by the end of the nineties, the regulars of the Top group in CDF knew the kinematic characteristics of each of those events by heart.

There were two obvious consequences of declaring that the 13 events were an a posteriori effect. The first was that no real conclusion could be drawn from the very low probability estimates that Kolmogorov-Smirnov tests or other complicated statistical recipes were producing, if the very particular selection recipe which had produced the superjet sample was potentially tailored to the data. The second was that such a selection recipe constituted a perfect a priori prescription for a new analysis which could be carried out on the copious data that Run 2 would soon start providing! Indeed, such a search could definitely be made perfectly unbiased: everything was frozen, from the leptonic W boson selection to the jet definition, from the jet multiplicity bins (2 or 3 jets) in which to seek an excess to the requirement that at least one of those jets should possess both a secondary vertex tag and a soft lepton within its cone. Should tuned Monte Carlo simulations or data-driven background estimation methods still under-predict the observed number of such events, a statistically significant effect would immediately be a very serious indication of the presence of real new physics. If, on the other hand, data and simulation were found in agreement, then the Run 1 superjets would be destined for the cemetery of spurious particle physics discoveries, along with the dozens of other anomalies and ill-understood measurements that had preceded them. Yet, to play the devil's advocate, one could entertain a third possibility.

As was discussed in detail above, the CDF detector had been modified appreciably from Run 1 to Run 2: in particular, the system responsible for the inner tracking of charged particles (on which not only SECVTX b-quark tagging, but also SLT soft lepton identification were based) had been completely replaced. The muon system had been extended, widening the acceptance of the SLT algorithm; the plug calorimeters were also brand new devices, and they would measure jets much better than the systems they replaced, without leaving a small but annoying crack at the interface with the central detector. Running conditions of the accelerator were also going to be different. The higher number of interactions per bunch crossing, the higher frequency of collisions, and the 10% increase in collision energy were each unlikely to make a large difference taken alone, and yet all those changes meant that the experimental apparatus and its sensitivity to physical processes would not be the same. If they were caused by a new physics process alone, superjet events would certainly be observable in the new detector, which was a better version of the old one; but if they were instead due to some weird combination of new physics and a sneaky instrumental effect that amplified their oddity, their fate was less certain. Such a combination of factors was a really remote possibility; but one could argue that when facing such a weird sample of events, Ockham's razor had to be set aside for a moment.

It was Tony Liss, a professor at the University of Illinois at Urbana-Champaign, who directed his two fresh new post-docs Anyes Taffard and Lucio Cerrito toward a search for superjet events in the very first well-understood and usable dataset collected in Run 2, one merely 40% larger than that where the 13 superjet events had been unearthed in the previous run: it corresponded to a total integrated luminosity of 150 inverse picobarns. Liss and colleagues kept as much as possible of the original search unchanged, in order to remain within the bounds of the a priori prescriptions: the estimation of backgrounds, a tight selection of the W boson signature, and the standard method for counting hadronic jets followed the Run 1 choices closely. The identification of soft leptons inside jet cores, however, differed slightly from that of the Run 1 analysis. SLT, the soft-lepton tagging algorithm, had not yet been fully tested for electrons with the new detector, hence the Illinois group restricted their search to SLT muon tags. This was arguably not a big deal, as muons had always made up the largest part of the SLT tagging power, given the less clean signature of electrons embedded in jet cores: the reduction in acceptance for superjet events would not change the conclusions of the study.

Above, Tony Liss, Lucio Cerrito and Anyes Taffard (left to right) pose in front of a model of the CDF detector. Also present are Greg Veramendi and Ulysses Grundler (source: UIUC physics dept. web site).

One significant methodological difference, however, was indeed introduced. This was motivated by the focus of the authors on specifically checking the interpretation that Giromini had suggested for the Run 1 superjets: namely, the production of a scalar bottom quark with a lifetime comparable to that of the ordinary b-quark, but which decayed to a charged lepton 100% of the time. The Illinois scientists reasoned that such a scalar quark would make its presence felt primarily in the number of SECVTX-tagged jets containing an SLT lepton, so they devised a bullet-proof strategy to extract its signal. Their idea was that after estimating the contribution of all background processes to the 1, 2, and >=3 jet bins of the W+jets sample containing one SECVTX tag (and thus prior to any SLT selection), the backgrounds would be rescaled by the ratio between the observed and predicted numbers of events, such that their total would end up matching perfectly the number of observed events, separately in each bin of jet multiplicity. So, for instance, if the total predicted number of events with a secondary vertex tag in the 2-jet bin amounted to 77 events, and the observed number amounted to 70, all backgrounds would be subjected to a decrease of roughly 9% (a scale factor of 70/77) in their estimate in the same bin of jet multiplicity. Such a procedure was in principle a sub-optimal way of searching for a signal: it could result in unwittingly sweeping under the carpet an excess of observed events over predicted backgrounds. However, that was not the case here: in fact, the standard plot of the number of jets accompanying the leptonic W decay signature had already been studied to obtain a new measurement of the top pair production cross section, and no significant excess had been observed in the two-jet bin when selecting events containing secondary vertex b-tags alone. Alas, this made the Run-2 reload of the superjet analysis not totally a priori either!
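To make the rescaling step concrete, here is a minimal Python sketch of it for a single jet-multiplicity bin. Only the totals (77 predicted, 70 observed) come from the example above; the background names, the way the 77 events are split among them, and the function name `rescale_backgrounds` are hypothetical, chosen purely for illustration, and do not reproduce the actual CDF background estimate.

```python
def rescale_backgrounds(predicted, observed):
    """Scale every background so that their sum matches the observed count.

    predicted: dict mapping a background name to its predicted event count
    observed:  number of events observed in the same jet-multiplicity bin
    Returns the common scale factor and the rescaled predictions.
    """
    total = sum(predicted.values())
    if total <= 0:
        raise ValueError("total predicted background must be positive")
    scale = observed / total
    rescaled = {name: count * scale for name, count in predicted.items()}
    return scale, rescaled


# Hypothetical composition of the 77 predicted SECVTX-tagged W+2 jet events.
predicted_2jet = {
    "W+heavy flavour": 40.0,
    "mistags": 17.0,
    "top": 12.0,
    "other": 8.0,
}

scale, rescaled_2jet = rescale_backgrounds(predicted_2jet, observed=70)
print(f"scale factor: {scale:.3f}")
for name, count in rescaled_2jet.items():
    print(f"{name}: {count:.1f}")
```

The same factor is applied to every background, so the relative composition of the bin is left untouched; only the overall normalization is forced to agree with the data.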

Once each background source had been rescaled by the procedure discussed above, a simple multiplication of its normalization by the predicted fraction of SLT tags yielded a prediction for the number of superjet events: this provided the answer Liss and collaborators were looking for. There was absolutely no excess in the Run 2 dataset. The data contained 5 W+2 jet events with a superjet, whereas 4.5 ± 0.9 were predicted by the rescaled sum of backgrounds. Those five events, studied one by one, did not give any hint of being in some way related, by weird kinematics or other properties, to their Run 1 counterparts.
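For readers who like to see the arithmetic, the sketch below shows how such a prediction can be assembled and how little tension there is between 5 observed events and 4.5 expected. It is an illustration under loud assumptions: the helper names `predict_superjets` and `poisson_tail` are mine, the inputs to `predict_superjets` would be per-background numbers that are not quoted here, and the Poisson tail ignores the ±0.9 uncertainty on the prediction. It is not the statistical treatment used in the actual analysis.

```python
import math


def predict_superjets(rescaled, slt_fraction):
    """Sum over backgrounds of (rescaled yield) x (fraction carrying an SLT tag).

    Both arguments are dicts keyed by background name (hypothetical inputs).
    """
    return sum(rescaled[name] * slt_fraction[name] for name in rescaled)


def poisson_tail(n_obs, mean):
    """Probability of observing n_obs or more events for a Poisson mean."""
    below = sum(
        math.exp(-mean) * mean**k / math.factorial(k) for k in range(n_obs)
    )
    return 1.0 - below


# Numbers quoted in the text: 5 events observed, 4.5 predicted.
# Simplification: the 0.9 uncertainty on the prediction is ignored.
p_value = poisson_tail(5, 4.5)
print(f"P(N >= 5 | mean = 4.5) = {p_value:.2f}")
```

With these inputs the tail probability comes out at roughly 0.47: observing five or more events when 4.5 are expected happens almost half of the time.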

What to conclude from that new result? For sure the American physicists proved they could not see any scalar bottom quark signal in Run 2 data. Under normal circumstances (id est, if the Run 1 excess of superjet events had been only a numerical anomaly) the obvious conclusion would follow as a logical necessity: Run 1 superjet events had been an upward statistical fluctuation. But circumstances were not normal at all: a fluctuation which did not look like any of the processes making up the background appeared no less mysterious than a real signal of new physics, hidden within the 13 events, that disappeared in the new analysis. The new analysis implied that whatever source had generated the Run 1 excess and its weird kinematics, it had been of a systematic nature: most likely, a detector effect no longer present in Run 2. In that sense, one could have argued that the new analysis left things just as mysterious as they had been before.

At least, Liss appeared satisfied: although the result did not completely solve the mystery, it did show that his disagreement with Giromini's conclusions on Run 1 data had been well founded. As for the CDF II collaboration, they were by then strongly focused on those aspects of the scientific program which had motivated the upgrade: precision measurements of top quark properties (in particular, the top quark mass) and B physics results; plus of course the diverse set of searches for the Higgs boson which had been started with enthusiasm. The Illinois group members did continue to study W+jets data with SLT tags for a few more years, focusing on top cross section studies and a measurement of W+charm production; but they did not produce any further publication on superjet events. It would take several more years and the collection of 30 times more data for a new anomaly, albeit an entirely different one, to be spotted in the infamous W+2 jet dataset. That story is the subject of Chapter 17.