Accelerating The Search For Dark Matter With Machine Learning

Yes, this is supposedly a particle physics blog, not a machine learning one - and yet, I have been finding myself blogging a lot more about machine learning than particle physics as of late. Why is that? Well, the topic of algorithms that may dramatically improve our statistical inference from collider data is of course dear to my heart, and has been so for at least two decades (my first invention, the "inverse bagging" algorithm, is dated 1992, when nobody even knew what bagging was). But the more incidental reason is that now _everybody_ is interested in the topic, and that includes all of my particle physics and astroparticle physics colleagues.

One way to gauge the community's interest in this topic is the number of gatherings held to discuss advancements in the field and their impact on experimental research in fundamental science. Looking back at just the past few months, I attended a workshop at Fermilab in December last year, another at CERN in January, and a conference in Saas-Fee in March. And this coming week I will be in Trieste, at the Abdus Salam International Centre for Theoretical Physics, for a very exciting event that bears the same name as the title of this article.

What business do machine learning and dark matter searches have together? Well, for starters there are a number of huge datasets, collected by particle physics, astroparticle physics, and astronomy instruments: with colliders, with underground passive experiments, with satellite flights, with arrays of telescopes watching visible light, infrared light, X rays, gamma rays... All of these data contain information on what dark matter could or could not be, and while each collaboration does its best to interpret its own data, there is a need to unify the effort and look at the bigger picture - but doing so entails huge challenges at several levels. And some of them can benefit from machine learning tools, of course.

Deep convolutional neural networks, to pick one tool, have proven crucial for automatically decoding large numbers of images of distant galaxies; they may allow a seamless combination of those data with other datasets that are also pixel-based views of the sky. Adversarial networks can help construct successful generative models of the things we are studying. Recurrent neural networks can be of huge help in decoding time series, e.g. of photons from distant sources. Auto-encoders are also extremely promising as a way to distill the relevant information in those datasets.

And it is not all neural networks: for instance, clustering algorithms are very important when you are not sure what exactly you are searching for, and allow for effective preprocessing of large multi-dimensional datasets. Semi-supervised learning techniques, which are not necessarily based on NN architectures, are also a relatively new area of study which can offer specific advantages in searches for unknown signals.
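To give a flavour of what "clustering when you do not know what you are searching for" means in practice, here is a minimal sketch of plain k-means in pure Python. Everything in it is an illustrative assumption of mine: the `kmeans` helper, the two Gaussian blobs standing in for unlabeled populations, and the choice of k-means itself, which is just one of many clustering algorithms. Nothing here is taken from an actual dark matter analysis.

```python
import random

def kmeans(points, k, n_iter=20, seed=0):
    """Plain k-means on a list of (x, y) tuples; returns (centroids, labels)."""
    rng = random.Random(seed)
    centroids = rng.sample(points, k)  # start from k distinct data points
    labels = [0] * len(points)
    for _ in range(n_iter):
        # Assignment step: each point goes to its nearest centroid.
        labels = [
            min(
                range(k),
                key=lambda j: (p[0] - centroids[j][0]) ** 2
                + (p[1] - centroids[j][1]) ** 2,
            )
            for p in points
        ]
        # Update step: move each centroid to the mean of its members.
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:  # keep the old centroid if a cluster is empty
                centroids[j] = (
                    sum(p[0] for p in members) / len(members),
                    sum(p[1] for p in members) / len(members),
                )
    return centroids, labels

# Toy dataset (hypothetical): two Gaussian blobs, with no labels attached.
rng = random.Random(42)
blob_a = [(rng.gauss(0.0, 0.5), rng.gauss(0.0, 0.5)) for _ in range(100)]
blob_b = [(rng.gauss(5.0, 0.5), rng.gauss(5.0, 0.5)) for _ in range(100)]
points = blob_a + blob_b

centroids, labels = kmeans(points, k=2)
for c in sorted(centroids):
    print(f"cluster centre at ({c[0]:.2f}, {c[1]:.2f})")
```

The algorithm is never told which point came from which population; it only sees coordinates, and the grouping emerges from the data. Real datasets have far more dimensions and much messier structure, but the preprocessing idea is the same.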

All in all, there is an enormous amount of information to exchange and learn from, so spending a full week listening to the many new attempts, use cases, and applications of old tools to new tasks, straight from the experts who are tackling these challenges, is indeed time very, very well spent. I intend to distill some of the resulting information flow for this blog, something that I can do well while I listen to the talks if I only use 70% of my brain on the speaker... But I would like to spend 110% of my brain on these discussions, so there will have to be a compromise. We'll see how I manage, so stay tuned if you are interested in the bleeding edge of ML applications to research on what the universe is made of!

[And for the nosy ones among you, the list of talks can be accessed at this page, where you might notice that I am one of the organizers in the scientific committee]

---

Tommaso Dorigo is an experimental particle physicist who works for the INFN at the University of Padova, and collaborates with the CMS experiment at the CERN LHC. He coordinates the European network AMVA4NewPhysics as well as research in accelerator-based physics for INFN-Padova, and is an editor of the journal Reviews in Physics. In 2016 Dorigo published the book “Anomaly! Collider physics and the quest for new phenomena at Fermilab”. You can get a copy of the book on Amazon.

Tommaso Dorigo

Professor Tommaso Dorigo is an experimental particle physicist, who works for the INFN at the University of Padova, and collaborates with the CMS experiment at the CERN LHC. He is currently a RECAT Guest Professor at Lulea University of Technology, and participates in the EIC-PATHFINDER project "PHINDER". Dorigo is the president of the USERN organization (https://usern.org), and the editor in chief of the journal "Brain, AI and cognition". He is the author of Anomaly! Collider physics and the quest for new phenomena at Fermilab.