Of course, the process of going from subatomic collisions to submitted papers is a long and complex one. The first part of the journey involves triggering, storage, and reconstruction of the acquired datasets, and reduction of the data into an analysis-friendly format. While this might sound like a partly automatic and painless procedure, it involves liters of sweat expended by conscientious collaborators who oversee the data acquisition and processing.
The second "validation" step involves careful comparisons with simulated data (which also underwent a significant amount of babysitting before being ready for use) and previously acquired data, to make sure everything is under control. After that, one is finally ready to construct an analysis path meant to turn the multi-dimensional features observed for each collision event into the point estimation and interval estimation of some physical quantity of interest - the publishable measurement! Here physical intuition and experience play a role - smarter physicists and analysts can make the difference between a routine result and an outstanding one.
The analysis of the data typically involves adapting advanced techniques to the task. For example, machine learning is often used to discriminate the interesting signal from all competing backgrounds; equally crucial is the careful accounting and statistical modeling of all possible sources of systematic uncertainty affecting the measurement.
Once everything is understood, all subtleties have been inspected, all biases removed, and all checks ticked off, one finally produces a measurement. Is that the end? By no means! There starts a third phase, which in some cases may actually be the longest and trickiest one: the painstaking process of documenting every step of the analysis and presenting it at meetings, discussing every choice and detail with group leaders and collaborators.
When the whole thing is considered "mature", the analysis is assigned to a committee of wise men, who perform an internal review, meet with the authors, and ask annoying questions about gory little details of all kinds. This internal "quality control" is very important in a large collaboration, to ensure high standards across the board. Only when these knowledgeable colleagues are convinced that everything is in good order may the analysis be presented for preapproval and then approval within the collaboration.
Is this enough? No - after the approval, a paper needs to be written, at which point every one of your colleagues is allowed to comment and ask for modifications. And they will: they will ask for changes of all kinds and sizes, from inserting Harvard commas to overhauling the employed statistical technique. Finally, a "final reading" is scheduled, and the text is again screened by a set of reviewers from the "publication committee". Only when they give a final green light is the paper submitted to a refereed scientific journal.
... Which of course means that other physicists, external to the collaboration, get their chance to criticize and comment on the text and the procedure. But eventually the paper does get published, and the analyzers can finally open a few bottles of wine to celebrate (the last part is not mandatory, but highly advised).
I think the above gives a good idea of how hard it is to produce a new result! Now it is about time to show you one of the latest achievements by my experiment, CMS. This is a recent new upper limit on the rate at which Higgs bosons decay to charm quark pairs. Why is this interesting?
Well, the Higgs boson can interact with all particles endowed with mass - in fact, it is the presence of the quantum field associated with the Higgs boson that gives mass to all massive elementary particles. Furthermore, the strength of the interaction grows with the mass. So the Higgs boson can indeed decay to a pair of charm quarks at a significant rate, because the charm quark is rather massive (it weighs over 30% more than a whole proton!).
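As a quick sanity check of that mass comparison, here is the arithmetic with approximate standard values for the two masses (these numbers are my own inputs, not quoted from the text above):

```python
# Rough check of the mass comparison: the numerical values below are
# approximate standard figures, used here only for the arithmetic.
M_CHARM_GEV = 1.27    # charm quark mass, approximate
M_PROTON_GEV = 0.938  # proton mass, approximate

excess = M_CHARM_GEV / M_PROTON_GEV - 1.0
print(f"The charm quark is ~{excess:.0%} heavier than the proton")
```

The ratio comes out around 35%, consistent with the "over 30%" quoted above.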
Observing the decay of Higgs bosons to charm quarks is very hard, but not because of the sheer rate of the process. The fraction of Higgs bosons that disintegrate that way is predicted to be in the 2% range (about one in fifty Higgs bosons should decay to charm pairs, according to theoretical calculations), and thus an order of magnitude larger than, e.g., the fraction of Higgs decays into photon pairs, which was observed as far back as 2012. What makes it hard to spot is the fact that charm quarks are not easy to distinguish from lighter quarks and gluons, which are produced by background processes at rates many, many orders of magnitude higher than Higgs production.
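To put numbers on the "order of magnitude" comparison, one can plug in the approximate standard-model branching fractions for a 125 GeV Higgs (the specific values below are my own approximate inputs, not taken from the article):

```python
# Approximate SM branching fractions for a 125 GeV Higgs boson,
# quoted here only to illustrate the comparison made in the text.
BR_H_TO_CC = 0.029           # H -> charm pairs, "in the 2% range"
BR_H_TO_GAMMAGAMMA = 0.0023  # H -> photon pairs

ratio = BR_H_TO_CC / BR_H_TO_GAMMAGAMMA
print(f"H->cc is ~{ratio:.0f}x more frequent than H->gammagamma")
```

With these inputs the charm channel is roughly a dozen times more frequent than the diphoton one - and yet far harder to see, for the reasons explained above.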
Now, CMS has recently put together a very performant new algorithm that can identify charm-quark-originated jets of hadrons with quite good purity. But that is not enough - it would still be hopeless to search for the H->cc decay in events that only contain a pair of charm-quark jets within the LHC collision data that CMS has collected: even with a perfect charm tagger, the Higgs would be a tiny needle in a huge haystack. To increase the signal fraction, CMS searches for Higgs bosons produced in association with a W or a Z boson. In return for an even smaller rate of signal events, the background is very strongly reduced: while Higgs bosons like to be produced with a weak boson, background jet pairs do not do that frequently at all.
Hence the main strategy is the following: select events that show the distinctive signature of a leptonic W or Z boson decay; discard those that do not have two accompanying jets; discard all jet pairs that do not look like they originated from charm quarks. And then, on the selected data, a further multivariate discriminant can be trained to distinguish the signal from its lookalikes!
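The cascade of requirements just described can be sketched as a simple event filter. The event fields, the tagger score, and the thresholds below are hypothetical illustrations of the logic, not the actual CMS criteria:

```python
# A minimal sketch of the selection logic described above.
# All field names and cut values are hypothetical, for illustration only.
from dataclasses import dataclass, field
from typing import List

@dataclass
class Jet:
    pt: float           # transverse momentum in GeV
    charm_score: float  # output of a charm-tagging algorithm, in [0, 1]

@dataclass
class Event:
    n_leptons: int      # electrons + muons passing identification
    jets: List[Jet] = field(default_factory=list)

def passes_selection(ev: Event, tag_cut: float = 0.5) -> bool:
    """Keep events with a leptonic W/Z signature (1 or 2 leptons here;
    the 0-lepton Z->nunu category would instead cut on missing energy),
    at least two jets, and both leading jets charm-tagged."""
    if ev.n_leptons not in (1, 2):
        return False
    if len(ev.jets) < 2:
        return False
    leading = sorted(ev.jets, key=lambda j: j.pt, reverse=True)[:2]
    return all(j.charm_score > tag_cut for j in leading)

# Example: a 1-lepton event with two well-tagged jets is kept.
ev = Event(n_leptons=1, jets=[Jet(60.0, 0.8), Jet(45.0, 0.7)])
print(passes_selection(ev))
```

On the surviving events, a multivariate discriminant would then do the final signal-versus-lookalike separation.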
An alternative is to look for the Higgs decay into a single "fat" jet that contains the products of both charm hadronizations. This targets Higgs bosons that were emitted at very high momentum from the interaction, such that the decay did not give the two quarks enough of a relative kick to form their own independent jets.
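The collimation effect can be estimated with the standard rule of thumb that the opening angle between the two decay products scales as twice the parent mass over its transverse momentum (an approximation, not an exact formula; the example momentum is my own choice):

```python
import math

# Rule-of-thumb angular separation of the two quarks from a boosted
# Higgs decay: Delta R ~ 2 m / pT. This is an approximation.
M_HIGGS_GEV = 125.0

def opening_angle(pt_gev: float) -> float:
    return 2.0 * M_HIGGS_GEV / pt_gev

# For a hypothetical Higgs produced at 300 GeV of transverse momentum:
print(f"Delta R ~ {opening_angle(300.0):.2f}")
# Both quarks then fit inside a typical fat-jet cone of radius ~0.8,
# whereas at low pT they would form two well-separated jets.
```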
Below you can see the distilled result of looking for all these final states, after a complex statistical procedure has combined all independent channels. The W boson is sought in both its electron-neutrino and muon-neutrino decays, and the Z boson likewise in electron-positron as well as muon pairs; in addition, the decay of Z bosons to neutrino pairs is targeted in a "zero lepton" category, where no electron or muon is seen but a significant amount of missing energy betrays the escaping neutrinos.
In the graph, the horizontal scale indicates the value of the "signal strength modifier" μ. If the standard model is correct, μ should equal 1. Since the analysis is not sensitive to the small predicted signal, an upper limit is placed on μ instead. You can see the upper limits separately computed for the three categories of events with zero, one, or two leptons (corresponding to Z->νν, W->lν, and Z->ll decays, respectively), as well as their combination. They are displayed as black dots.
The analysis sensitivity is instead described by the green and yellow bands, which tell you the 1-sigma and 2-sigma ranges of upper limits that the analysis should expect to obtain, given the method used and the amount of data. The fact that the upper limit observed in the data is higher than the expected one indicates that the data are more "signal-like" than they should be; however, this is a very minor departure (at the level of two standard deviations), so it is perfectly likely to occur by statistical fluctuation.
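Where do those "expected limit" bands come from? A toy counting experiment gives the idea: throw background-only pseudo-experiments, compute an upper limit for each, and look at the spread. The sketch below is entirely my own illustration (a naive Poisson counting limit, with an arbitrary background of 10 events), not the far more sophisticated CMS statistical machinery:

```python
import math
import random

# Toy illustration of expected-limit bands. Counting experiment with
# expected background b and observed count n: the 95% CL upper limit on
# the signal count s is the value where P(N <= n | b + s) = 0.05.
# The background value B is an arbitrary choice for illustration.

def poisson_cdf(n: int, mu: float) -> float:
    return sum(math.exp(-mu) * mu**k / math.factorial(k) for k in range(n + 1))

def upper_limit(n_obs: int, b: float, cl: float = 0.95) -> float:
    s = 0.0
    while poisson_cdf(n_obs, b + s) > 1.0 - cl:
        s += 0.01
    return s

def poisson_sample(mu: float) -> int:
    # Knuth's algorithm, adequate for small mu
    limit = math.exp(-mu)
    k, p = 0, 1.0
    while p > limit:
        k += 1
        p *= random.random()
    return k - 1

random.seed(1)
B = 10.0  # hypothetical expected background count
limits = sorted(upper_limit(poisson_sample(B), B) for _ in range(500))
median = limits[len(limits) // 2]
lo, hi = limits[int(0.16 * len(limits))], limits[int(0.84 * len(limits))]
print(f"median expected limit: {median:.1f}, 1-sigma band: [{lo:.1f}, {hi:.1f}]")
```

The median of the toy limits plays the role of the central expected limit, and the 16th-84th percentile spread is the green "1-sigma" band; an observed limit well above that band would signal a signal-like excess in the data.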
By looking closely at a graph like that you can obtain more information. For instance, you see that the highest sensitivity is achieved in the "2L" category, and that the "0L" and "1L" categories are less sensitive. Why is that? While the selection of the signature of the Higgs decay to charm is the same in the three categories, the selection of events with two charged leptons (electron or muon pairs) strongly reduces all backgrounds - the Z decay is very clean. The selection of W decays is a bit less clean; it is in turn cleaner than the selection of Z decays to neutrinos (which populates the "0L" category), which only relies on the missing energy signal being enhanced in the data.
Another thing to note is the actual result: an upper limit, at 95% confidence level, is obtained at 37 times the standard model predicted rate. Does that mean that observing the Higgs decay to charm quarks is a hopeless business at the LHC? Well, it depends. First of all, by throwing in more data you can expect to decrease that limit significantly in the future (the analysis only used 36 inverse femtobarns of collisions, of the order of one hundredth of the data we plan to collect in the next decade, during the high-luminosity LHC phase).
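A back-of-envelope projection of that improvement can be made by assuming the limit scales as the inverse square root of the integrated luminosity (a naive, background-limited assumption of mine; real projections also fold in detector and analysis improvements):

```python
import math

# Naive luminosity scaling of an upper limit: limit ~ 1/sqrt(L).
# This is a back-of-envelope assumption, not an official projection.
current_limit = 37.0   # observed 95% CL limit, in units of the SM rate
current_lumi = 36.0    # fb^-1 used in this analysis
hl_lhc_lumi = 3000.0   # fb^-1, the planned HL-LHC dataset

projected = current_limit * math.sqrt(current_lumi / hl_lhc_lumi)
print(f"projected limit: ~{projected:.0f} x SM")
```

Under this crude scaling alone, the limit would shrink to a few times the standard model rate - and that is before any gains from smarter charm tagging.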
And then, further improvements in the charm tagging capability can be envisioned. I remember seeing upper limits on signal processes, once in the two- or even three-digit range, go down by orders of magnitude as physicists got more cunning and time went by. One example? The upper limit on the B_s decay to muon pairs, which went down by several orders of magnitude over a couple of decades, eventually allowing a precise measurement of the signal rate!
You can get much more detail on this analysis at this site.
Tommaso Dorigo is an experimental particle physicist who works for the INFN at the University of Padova, and collaborates with the CMS experiment at the CERN LHC. He coordinates the European network AMVA4NewPhysics as well as research in accelerator-based physics for INFN-Padova, and is an editor of the journal Reviews in Physics. In 2016 Dorigo published the book “Anomaly! Collider physics and the quest for new phenomena at Fermilab”. You can get a copy of the book on Amazon.