The article, which is 95 pages long, is available on arXiv here. It is a very nice review of important advances in analysis methods produced with contributions from members of the AMVA4NewPhysics ITN, covering advances in b-tagging methods, matrix-element methods, supervised classification with optimized neural networks, and unsupervised learning methods for new physics searches in LHC data. In this post I will just give you a brief summary of its contents. But before that, let me tell you about the network itself, which was a very successful endeavour, at least from my very biased perspective.
AMVA4NewPhysics, the ITN
The ITN was born in 2014, from an observation I made with a few colleagues who would later become principal investigators of the network. Our observation was that the quality of the scientific output of the research performed by LHC experiments was increasingly driven by the strength of the machine learning (ML) tools being employed to extract physics results. Historically, HEP research was rather slow to adopt advanced multivariate analysis methods (such as artificial neural networks). Devoting more attention to the development of new ML tools to improve the performance of data analysis therefore seemed something worth investing in.
Together with a few colleagues I spent three months of my life, at the end of 2014, putting together an attractive research program and finding the right European institutes as partners. We created a very carefully calibrated training trajectory for 10 Ph.D. students, who would pick up skills through advanced training schools and workshops, secondments at academic and non-academic centres, and interaction with other nodes. But most of all, we created a strong team that functioned as a single entity, despite involving researchers from the two big competing LHC experiments, ATLAS and CMS.
We must have done a great job, as we won at our first attempt 2.4 million euros in funding made available by the Marie Skłodowska-Curie Actions of the EC's Horizon 2020 program, and were able to start hiring our Ph.D. "early-stage researchers" in the summer of 2015. From then on, running the network as planned took 100% of my time, but it was well worth the effort. Some of the research results we produced are described in the article now published, and I will describe them below.
AMVA4NewPhysics Research Highlights
The article is structured in nine sections: an introduction, seven sections describing developed tools and research results, and a short concluding section. The seven research sections are thus titled:
- Supervised Classification Methods for the Search of Higgs Boson Decays to Tau Lepton Pairs
- Multi-variate Techniques for Higgs Pair Production Studies
- Jet Flavour Classification
- Improvements and Applications of the Matrix Element Method
- New Statistical Learning Tools for Anomaly Detection
- Similarity Search for the Fast Simulation of the ATLAS Forward Calorimeter
- Toward the Full Optimization of Physics Measurements
The first interesting result was a reanalysis of the supervised classification methods used in the winning solutions of the Higgs ML Challenge, a Kaggle competition that involved 1800 teams of physicists and computer scientists, tasked with finding the most significant discrimination of Higgs boson decays to tau leptons from backgrounds. That challenge demonstrated that ensembles of neural networks subjected to strong cross-validation strategies were the absolute state of the art for this kind of problem. By dissecting the various ingredients that improve performance in the classification task, it was possible to understand which elements of a powerful classifier matter most.
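To give a flavour of the ensembling-plus-cross-validation recipe mentioned above, here is a deliberately minimal numpy sketch: K models (plain logistic regressions standing in for neural networks) are each trained on a different K-1/K slice of a toy dataset, and their predictions are averaged. This is an illustration of the general technique only, not the winning Kaggle solutions or Dr. Strong's implementation; all data and settings are invented.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy "signal vs background" dataset: two overlapping Gaussian blobs in 5D.
n, d = 2000, 5
X = rng.normal(size=(n, d))
y = rng.integers(0, 2, size=n)
X[y == 1] += 0.5  # shift the "signal" events

def train_logistic(X, y, lr=0.1, epochs=200):
    """Gradient-descent logistic regression, standing in for one network."""
    w = np.zeros(X.shape[1]); b = 0.0
    for _ in range(epochs):
        p = 1.0 / (1.0 + np.exp(-(X @ w + b)))
        g = p - y
        w -= lr * (X.T @ g) / len(y)
        b -= lr * g.mean()
    return w, b

# K-fold cross-validation: each ensemble member is trained with a
# different fold held out, so every model sees a different data slice.
K = 5
folds = np.array_split(rng.permutation(n), K)
models = []
for k in range(K):
    train_idx = np.concatenate([folds[j] for j in range(K) if j != k])
    models.append(train_logistic(X[train_idx], y[train_idx]))

# An independent test set; the ensemble prediction is the average score.
X_test = rng.normal(size=(500, d))
y_test = rng.integers(0, 2, size=500)
X_test[y_test == 1] += 0.5
preds = np.mean([1.0 / (1.0 + np.exp(-(X_test @ w + b)))
                 for w, b in models], axis=0)
acc = ((preds > 0.5) == y_test).mean()
print(f"ensemble accuracy: {acc:.3f}")
```

Averaging models trained on different folds reduces the variance of the final score, which is one reason ensembles dominated the challenge leaderboard.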
Below are two tables from the paper. The first shows a comparison of our solution (from the work of Dr. Giles C. Strong, who published an article detailing these studies) with the top three finishers on Kaggle. Of relevance is the large improvement in training time of our solution. The second table shows which ingredients carried the most weight in improving the classification performance.
A second important contribution to LHC physics results, described in detail in the article, came from members of our network working on the development of powerful b-tagging algorithms. B-tagging is the identification of hadronic jets that originate from the fragmentation of a bottom quark emitted in the proton-proton collision. Since jets are the most common collective phenomenon in hadron-hadron collisions, and since b-quarks are relatively rare and flag the production of interesting subnuclear reactions (chief among them the Higgs boson decay to b-quark pairs), b-tagging is one of the crucial ingredients of performant LHC searches and measurements.
Specialized deep neural networks were developed to excel in the identification of b-quark jets, and also to classify the other originators of the observed jets: DeepCSV, DeepJet, and DeepFlavour. These new tools produced quite significant improvements in classification performance, leading CMS to reach observation-level significance for the decay of the Higgs boson to bottom-quark jets in 2018. The figure below (not part of the AMVA4NewPhysics article, but of the Higgs to bb observation paper) shows the reconstructed mass of Higgs boson decays, where the red histogram is the fitted contribution of the signal.
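The core task those taggers solve is multi-class jet flavour assignment. The toy sketch below trains a hand-rolled softmax classifier to separate three invented jet classes (b, c, light) from made-up low-dimensional features; the real DeepCSV/DeepJet taggers are deep networks over far richer track- and vertex-level inputs, so treat this purely as an illustration of the multi-class setup.

```python
import numpy as np

rng = np.random.default_rng(2)
n_per_class, d, C = 400, 3, 3  # jets per class, toy features, classes

# Invented feature means for b, c, and light jets (e.g. secondary-vertex
# mass-like and track-multiplicity-like variables; purely hypothetical).
means = np.array([[2.0, 1.5, 0.0],   # "b jets"
                  [1.0, 0.5, 0.0],   # "c jets"
                  [0.0, 0.0, 0.0]])  # "light jets"
X = np.vstack([rng.normal(m, 1.0, size=(n_per_class, d)) for m in means])
y = np.repeat(np.arange(C), n_per_class)

# Softmax (multinomial logistic) regression trained by gradient descent.
W = np.zeros((d, C)); b = np.zeros(C)
onehot = np.eye(C)[y]
for _ in range(300):
    logits = X @ W + b
    e = np.exp(logits - logits.max(axis=1, keepdims=True))
    p = e / e.sum(axis=1, keepdims=True)   # class probabilities per jet
    g = (p - onehot) / len(y)              # cross-entropy gradient
    W -= 0.5 * X.T @ g
    b -= 0.5 * g.sum(axis=0)

acc = (np.argmax(X @ W + b, axis=1) == y).mean()
print(f"training accuracy: {acc:.3f}")
```

With heavily overlapping classes the accuracy is far from perfect, which mirrors why richer inputs and deeper architectures pay off so much in real flavour tagging.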
The last result I wish to mention here is one dear to me, as it was the main product of my collaboration with its main developer, the Padova ESR Pablo de Castro Manzano. Pablo developed a method to retrofit a neural network used for the classification of small signals in large multivariate collider data with knowledge of the total uncertainty on the parameter of interest, once all systematic uncertainties affecting the inference are taken into account. This trick pays large dividends in the accuracy of measurements that receive significant contributions from systematic sources of uncertainty.
The algorithm developed, called INFERNO, was published in Computer Physics Communications in 2019, and it has since generated growing interest in these advanced techniques. Indeed, my current research interests center on extending that approach to the full optimization of the design of experiments, and I have created a collaboration (MODE) that develops differentiable programming tools for that very complex task.
Below you can see a block diagram of the INFERNO algorithm. Describing it in full would be too much for this post, but you can probably see what I mean when I say that the NN becomes "aware" of the purpose of its dimensionality reduction: it is retrofitted with information on the variance of the measurement of the parameter of interest, once systematics are modeled in the multidimensional data space.
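To make the "aware of the variance" idea concrete, here is a stripped-down numpy sketch of the quantity such an inference-aware loss is built from: the network's softmax outputs act as soft bin assignments of a summary histogram, and the (approximate, Cramér-Rao) variance of the signal-strength estimate follows from the Fisher information of the expected Poisson counts. This is a conceptual illustration under simplifying assumptions (no nuisance parameters, invented yields and logits), not the published INFERNO implementation, which profiles the full likelihood including systematics.

```python
import numpy as np

def softmax(z):
    e = np.exp(z - z.max(axis=1, keepdims=True))
    return e / e.sum(axis=1, keepdims=True)

def poi_variance(soft_sig, soft_bkg, n_sig=50.0, n_bkg=1000.0):
    """Approximate var(mu) from per-bin Fisher information.

    soft_sig / soft_bkg: softmax outputs (soft bin assignments) for
    signal and background events; expected yields per summary bin are
    proportional to their per-event means.
    """
    s = n_sig * soft_sig.mean(axis=0)    # expected signal counts per bin
    b = n_bkg * soft_bkg.mean(axis=0)    # expected background counts per bin
    n = s + b                            # total expectation at mu = 1
    fisher = np.sum(s**2 / np.maximum(n, 1e-9))  # I(mu) for Poisson bins
    return 1.0 / fisher                  # Cramer-Rao bound on var(mu)

rng = np.random.default_rng(1)
# Stand-ins for network outputs: random logits over 3 summary bins,
# with signal and background pushed toward opposite bins.
z_sig = rng.normal(loc=[1.0, 0.0, -1.0], size=(500, 3))
z_bkg = rng.normal(loc=[-1.0, 0.0, 1.0], size=(500, 3))
var_mu = poi_variance(softmax(z_sig), softmax(z_bkg))
print(f"approximate var(mu): {var_mu:.3f}")
```

Because every step above is differentiable in the logits, a training loop can minimize `var_mu` directly, which is precisely the sense in which the network learns a dimensionality reduction optimized for the final measurement rather than for raw classification.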
Tommaso Dorigo (see his personal web page here) is an experimental particle physicist who works for the INFN and the University of Padova, and collaborates with the CMS experiment at the CERN LHC. He coordinates the MODE Collaboration, a group of physicists and computer scientists from eight institutions in Europe and the US who aim to enable end-to-end optimization of detector design with differentiable programming. Dorigo is an editor of the journals Reviews in Physics and Physics Open. In 2016 Dorigo published the book "Anomaly! Collider Physics and the Quest for New Phenomena at Fermilab", an insider view of the sociology of big particle physics experiments. You can get a copy of the book on Amazon, or contact him to get a free pdf copy if you have limited financial means.