Interpreting The Predictions Of Deep Neural Networks

For a couple of years now, CERN has had an inter-experimental working group on Machine Learning. Besides organizing monthly meetings and other activities that foster the dissemination of knowledge and active research on the topic, the group holds a yearly meeting at CERN where, along with interesting presentations on advances and summaries, there are tutorials to teach participants how to use the fast-growing arsenal of tools that any machine-learning enthusiast should master these days.

The 2018 event is taking place from April 9 to 12 at CERN, and I have already collected enough wow moments to blog about something. Here I wish to mention only Wojciech Samek's talk from this afternoon's session. Wojciech is a researcher at the Heinrich Hertz Institute, and these days he is focusing his research on understanding the output of deep neural networks.

The talk discussed why it is important to understand what information the networks use to classify elements correctly. He gave the example of physicians wanting to figure out what makes a neural network classify the image of a skin formation as a cancer or as a benign alteration. He then showed in detail how to reverse-hack the classification of elements, focusing on the use case of images, which is fun and quite revealing.

One of the methods he discussed consists in working out the relevance of the information used by the network in classifying an element by navigating back from the network output, expressed e.g. as a classification score for a given class, to the input features. The relevance can be computed backwards, layer by layer, from the activated nodes and the weights with which they contribute to the output. This can be done with a not-too-hard mathematical calculation that is shown to be "relevance conserving": the total relevance is preserved from one layer to the next, so that one can indeed map it onto the input features and find out which data structures are the most important.
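To make the backward pass a bit more concrete, here is a minimal sketch in Python with NumPy. It is my own toy illustration and not the speaker's code: it assumes a tiny bias-free ReLU network with random, untrained weights, and it uses one common redistribution rule with a small stabiliser `eps`, in the spirit of what is usually called layer-wise relevance propagation. The function names, the layer sizes and the choice of rule are all assumptions made for the example.

```python
import numpy as np

def forward(x, weights):
    """Run a bias-free ReLU network and keep the activations of every layer."""
    activations = [x]
    for i, W in enumerate(weights):
        z = activations[-1] @ W
        # ReLU on the hidden layers, plain linear scores on the output layer
        is_last = i == len(weights) - 1
        activations.append(z if is_last else np.maximum(z, 0.0))
    return activations

def propagate_relevance(activations, weights, target, eps=1e-9):
    """Redistribute the score of class `target` back to the inputs, layer by layer."""
    relevance = np.zeros_like(activations[-1])
    relevance[target] = activations[-1][target]  # start from the chosen class score
    # Walk from the last layer down to the input
    for a, W in zip(activations[-2::-1], weights[::-1]):
        z = a @ W                                   # total input to each upper node
        z = z + eps * np.where(z >= 0, 1.0, -1.0)   # stabiliser, avoids division by zero
        s = relevance / z                           # relevance per unit of input
        relevance = a * (W @ s)                     # share it out by contribution a_j * w_jk
    return relevance

# Toy setup (assumed): 8 input features, 16 hidden nodes, 3 classes, random weights
rng = np.random.default_rng(0)
weights = [rng.normal(size=(8, 16)), rng.normal(size=(16, 3))]
x = rng.random(8)

activations = forward(x, weights)
target = int(np.argmax(activations[-1]))
relevance = propagate_relevance(activations, weights, target)

print("class score:        ", activations[-1][target])
print("sum of input relevance:", relevance.sum())
```

In this bias-free toy the two printed numbers should agree up to the tiny stabiliser, which is the "relevance conserving" property in action. With a real trained network the array `relevance`, reshaped to the image size, is what one would draw as a heat map.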

The speaker showed many examples of the application of these techniques, pointing out how some of them capture more clearly the way the deep neural network "thinks". A very clear visual proof was given by considering the MNIST set of handwritten digits, which is a standard benchmark for algorithms performing multi-class discrimination of images. By considering the relevance of the pixels in these images one can really understand how a trained network decides that a scribbled sign is a three or a nine, or another digit.

There is in fact a super-duper cool web page that allows you to scribble a digit and verify in real time how the network picks up the features you drew and assigns a label to it (you actually get to see the probability of each digit assignment!). Below is a screenshot of my attempt to "fool" the NN by drawing something which could be interpreted in as many ways as possible with equal probability. It took me a while to find a decent one!

As you can see in the heat map, the features that were considered the most informative in deciding to interpret my scribble as an "8" (which won by a very narrow margin over 4, 6, and 2) are the presence of coloured pixels along the horizontal line at the center (where the heat map shows reds), along with the presence of uncoloured pixels not far below that line (the zone in orange).

Really understanding which features make NNs take the correct decisions is a crucial step in designing better ones, as well as a step toward gaining more confidence in how they work.

Another interesting technique discussed in the talk was how one may infer the information content of parts of the input by progressively removing individual bits of it (zeroing the pixels, for instance). These techniques are also used to understand how much the "context" in an image helps the classification process: think e.g. of identifying aeroplane images, where the area around the plane is usually empty, as opposed to identifying furniture, which benefits from the presence of other pieces of furniture nearby.
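The pixel-removal idea can also be sketched in a few lines of Python with NumPy. This is a simplified variant of my own, not the procedure shown in the talk: it zeroes one small patch at a time (rather than removing pixels cumulatively) and records how much the class score drops. The scoring function `toy_score` is a hypothetical stand-in for a trained classifier, and the patch size and image size are arbitrary choices.

```python
import numpy as np

def occlusion_map(image, score_fn, patch=2):
    """Zero one patch at a time and record how much the class score drops."""
    base = score_fn(image)
    drops = np.zeros_like(image, dtype=float)
    for r in range(0, image.shape[0], patch):
        for c in range(0, image.shape[1], patch):
            occluded = image.copy()
            occluded[r:r + patch, c:c + patch] = 0.0   # "remove" this part of the input
            # A large drop means the patch carried information the scorer relied on
            drops[r:r + patch, c:c + patch] = base - score_fn(occluded)
    return drops

def toy_score(img):
    """Hypothetical classifier score: it only looks at the middle row of the image."""
    return float(img[img.shape[0] // 2].sum())

image = np.ones((6, 6))
heat = occlusion_map(image, toy_score)
print(heat)
```

Here only the patches overlapping the middle row produce a non-zero drop, so the map singles out the region the toy scorer depends on. Applied to the area around an object instead of the object itself, the same loop gives a rough handle on how much the context matters.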

I apologize for being unable to report more of the wealth of information and interesting leads provided in today's talks. Of course, if you are interested you can check out the talk slides yourself at this link.

----

Tommaso Dorigo is an experimental particle physicist who works for the INFN at the University of Padova, and collaborates with the CMS experiment at the CERN LHC. He coordinates the European network AMVA4NewPhysics as well as research in accelerator-based physics for INFN-Padova, and is an editor of the journal Reviews in Physics. In 2016 Dorigo published the book “Anomaly! Collider physics and the quest for new phenomena at Fermilab”. You can get a copy of the book on Amazon.
