So what will I be covering in my six hours (well, 4.5 plus question time, to be precise) of lectures? I plan to provide a broad overview of the topic, starting with a formal definition of the typical problems, in the first lecture; to concentrate on the description of the main tools, particularly BDTs and NNs, in the second; and to give some practical examples of conventional and less conventional ML applications to HEP in the third. If time allows, I will also give a practical hands-on lesson on some of the topics (this is still to be decided with the organizers).
Concerning the third lecture, you might think that it could end up being a boring list of standard classification problems: signal here, background there, train a BDT, find a cut. I beg to differ - in fact, the range of applications of ML techniques to HEP problems has vastly broadened over the past five to ten years. For example:
- binary and multi-class classification tasks: here, of course, there are heaps of examples. Even so, there have recently been continuous improvements and brilliant little new ideas on classical problems such as quark versus gluon discrimination, b-tagging, and boosted jet reconstruction. It is also interesting to note that new developments specifically targeting HEP applications are now available, such as techniques that incorporate nuisance parameters in the loss function of deep neural networks, thereby strongly boosting the precision on the parameter of interest (typically the signal fraction in a selected data sample).
- regression problems: here, too, there are many examples, and we have learned quite a few lessons from applying the available tools to our tasks. One thing I might choose to mention is how to estimate, with a kNN technique, the gradient of the ratio of two densities known only through discrete examples, and how to improve the performance by adapting the definition of distance in the feature space (giving more weight to features along which the gradient is stronger).
- image recognition: convolutional neural networks are now used to improve the measurement of hadronic showers in fine-grained calorimeters. This is a promising new avenue, both for the reconstruction of events in existing experiments and for the design of better-optimized hardware in future ones.
- clustering techniques: these have been shown to be invaluable for the categorization of the complex multi-dimensional parameter spaces of theories extending the Standard Model. By properly defining the "distance" between theories in that space as a function of how different the observable physics is (using a suitable test statistic constructed from the observed densities), one can optimally identify benchmark theory points whose study has the largest impact on the searches.
- anomaly detection methods: here the field seems to be thriving with new ideas, although admittedly agreed-upon applications to new physics searches remain scarce. There are nonetheless a few examples worth discussing.
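To give a flavour of the distance-adaptation idea mentioned for regression, here is a minimal numpy sketch. All function names are mine, and the gradient estimate is deliberately crude (a global linear fit rather than the local kNN-based estimator one would use in practice): features along which the target varies faster get more weight in the metric, so neighbours are picked preferentially along the directions that matter.

```python
import numpy as np

def gradient_based_weights(X, y):
    """Crude per-feature 'gradient' estimate: coefficients of a global
    least-squares linear fit, taken in absolute value and normalised."""
    A = np.column_stack([X, np.ones(len(X))])  # add an intercept column
    beta, *_ = np.linalg.lstsq(A, y, rcond=None)
    g = np.abs(beta[:-1])
    return g / g.sum() if g.sum() > 0 else np.ones(X.shape[1]) / X.shape[1]

def weighted_knn_regress(X_train, y_train, x_query, k=5, feature_weights=None):
    """kNN regression with per-feature weights in the Euclidean metric:
    directions with a stronger gradient count more in the distance."""
    if feature_weights is None:
        feature_weights = np.ones(X_train.shape[1])
    diff = (X_train - x_query) * np.sqrt(feature_weights)
    dist = np.sqrt((diff ** 2).sum(axis=1))
    nearest = np.argsort(dist)[:k]
    return y_train[nearest].mean()
```

If the target depends almost entirely on one feature, the fitted weights collapse the metric onto that axis, and the k nearest neighbours become genuinely "near" in the only direction that influences the prediction.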
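The benchmark-selection idea for theory spaces can also be sketched in a few lines. This is a toy illustration under my own assumptions, not any specific published method: each "theory point" is represented by its predicted histogram of an observable, a simple symmetric chi-square-like statistic stands in for a proper test statistic, and benchmarks are chosen greedily so that every point in the space is close to some benchmark.

```python
import numpy as np

def theory_distance(h1, h2):
    """Symmetric chi2-like separation between two predicted observable
    distributions (histograms are normalised before comparison)."""
    h1 = h1 / h1.sum()
    h2 = h2 / h2.sum()
    denom = h1 + h2
    mask = denom > 0  # skip empty bins to avoid division by zero
    return 0.5 * np.sum((h1[mask] - h2[mask]) ** 2 / denom[mask])

def pick_benchmarks(histograms, n_benchmarks):
    """Greedy farthest-point selection: repeatedly add the theory point
    most distant from the benchmarks chosen so far."""
    n = len(histograms)
    D = np.array([[theory_distance(histograms[i], histograms[j])
                   for j in range(n)] for i in range(n)])
    chosen = [0]  # start from an arbitrary point
    while len(chosen) < n_benchmarks:
        d_min = D[:, chosen].min(axis=1)  # distance to nearest chosen benchmark
        chosen.append(int(np.argmax(d_min)))
    return chosen, D
```

Farthest-point selection is only one possible criterion (k-medoids is another natural choice); the essential ingredient is the distance defined through the observable physics rather than through the raw theory parameters.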
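As for anomaly detection, perhaps the simplest unsupervised recipe - shown here only as an illustration of the general logic, not of any particular technique from the literature - is to learn a compact description of the bulk of the (background-dominated) data and flag events that are poorly reconstructed by it. The linear limit of an autoencoder is just PCA, so reconstruction error after projecting onto the leading principal components already gives a usable anomaly score.

```python
import numpy as np

def fit_pca(X, n_components):
    """Fit a linear 'autoencoder' on background-like data: keep the
    n_components directions of largest variance."""
    mu = X.mean(axis=0)
    _, _, Vt = np.linalg.svd(X - mu, full_matrices=False)
    return mu, Vt[:n_components]

def anomaly_score(X, mu, components):
    """Reconstruction error: events that do not live in the subspace
    spanned by the bulk of the data get a large score."""
    Z = (X - mu) @ components.T   # encode
    X_hat = Z @ components + mu   # decode
    return np.sqrt(((X - X_hat) ** 2).sum(axis=1))
```

An event lying far from the subspace that describes the background gets a score many times larger than typical background events, and the score requires no signal model at all - which is precisely the appeal, and the difficulty, of this class of methods.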
In summary, I do not plan to be boring at all! If you have a chance, come to the school and let's have fun with this incredibly interesting new discipline!
Tommaso Dorigo is an experimental particle physicist who works for the INFN at the University of Padova, and collaborates with the CMS experiment at the CERN LHC. He coordinates the European network AMVA4NewPhysics as well as research in accelerator-based physics for INFN-Padova, and is an editor of the journal Reviews in Physics. In 2016 Dorigo published the book “Anomaly! Collider physics and the quest for new phenomena at Fermilab”. You can get a copy of the book on Amazon.