Yesterday I was in Rome, at a workshop organized by the Italian National Institute for Nuclear Physics (INFN), titled "What Next". The event was meant to discuss the plan for basic research in fundamental physics and astrophysics beyond the next decade or so, given the input we have and the input we might collect in the next few years at accelerators and other facilities.

The workshop was a success, with several hundred colleagues taking part (I believe well over 500). For the first time in many years, INFN openly discussed its future in the light of the present situation. The discussion was generally interesting and sometimes lively, and on a couple of occasions the audience (otherwise reluctant to celebrate itself) underlined its appreciation of the points made by the commenters with spontaneous bouts of applause.

Above, part of the audience in the conference room of the Angelicum in Rome

Such was the case, for instance, with a comment from CERN theorist Gian Giudice. He was countering a few comments that described a depressing situation at the high-energy frontier (Bertolucci spoke of a PhD syndrome, the "Post-Higgs Depression", but it was a joke: he himself did not imply that the sentiment was really justified). Giudice explained that he instead considers the theoretical situation confusing and fertile, and the experimental situation extremely interesting and lively, with HEP for the first time having a chance to study a fundamental scalar particle, and many bright years ahead with a clear plan of investigations.

Another recurring theme of the discussion was "ballistic" physics, a term introduced by INFN president Nando Ferroni to distinguish experiments that are already flying on a well-defined course from projects that are still on the ground waiting for a flight plan. I saw a lot of confusion about what exactly "ballistic" meant: some in the audience appeared to understand the word as a synonym for collider physics, others were even more confused. The point, however, was that INFN should look forward, to plans for the next decade that are not yet even defined, and to endeavours that the researchers have the expertise and means to pursue with great chances of success, if started now.

I have to say I did not hear many proposals that clearly fit the above definition. A generic call to arms to study axions, a reiterated stress on the importance of double beta decay studies, and a general guilty feeling of having no way to participate in cutting-edge investigations of the cosmic microwave background were three of the themes. A fourth was whether the LHC should aim for high luminosity or instead attempt to increase its centre-of-mass energy beyond 14 TeV, by replacing some of the magnets with 11-Tesla ones or by studying new magnet technologies that are not even on the drawing board yet. While everybody wants more energy, one should not forget that the LHC is still in its infancy. Michelangelo Mangano correctly pointed out that the Tevatron discovered the top quark in 1995, but then did a lot more, and almost 20 years later would have had a chance at discovering the Higgs if it had continued running (and if the LHC had not done that first). So these are machines with a long time scale, and we have to be patient. Increasing the luminosity is a safe plan for the near future of the CERN machine; dismantling it to go fishing for heavier particles can be a good idea only after we have harvested enough with the present setup.

One interesting topic, only mentioned toward the end of the workshop, was the situation of INFN. The institute consistently demonstrates its worth by winning grants and ranking at the top when official evaluations are carried out, and yet we are stuck in a situation where we cannot hire more young researchers (for each retirement, only 20% of the freed resources can be used for new hires). Italian governments come and go, and so far none has shown any apparent intent to improve the situation. Ferroni correctly explained that to change the situation all the 1800 INFN employees should quit their jobs, after which it would indeed be possible to hire 500 younger and more proactive scientists, bringing in new energy. But I suppose it was not a real suggestion.

**Diatribe mode on**

Among the sad notes, I was struck when an esteemed colleague nonchalantly talked about a measurement of a fundamental constant (whose value was quoted as 0.110+-0.0027) as a "40 sigma" measurement. He also referred to another quantity measured far from zero as a "25 sigma" result. The speaker made the same conceptual mistake when he referred to the sum of neutrino masses, measured at 0.36+-0.10 eV, as a "3.6 sigma" result.

I soon realized that the above claims raised no eyebrows among most of my colleagues, so I had to speak up and explain that I was disturbed by the loose language of the speaker. The exchange only confirmed that neither the speaker nor many in the audience really had a clue: he asked what was wrong with the statement, I replied I was not interested in lecturing there, and we dropped the issue. So I reckon today that it would be better if I tried to explain here what I am disturbed about.

First of all, let me explain that what concerns me the most is the fact that we, as scientists, invariably at some point also do science popularization, and in that capacity we have to be careful. I also believe that those who speak badly think badly. So we should try to be clear when we "explain" physics results, and not fall into pitfalls such as the ones above; this care should be applied when we talk to colleagues as much as when we talk to outsiders. Why are the above-mentioned claims pitfalls? I explain that below.

The simple facts are the following. When you report a measurement as 0.36+-0.10, you are quoting two things: a central value and a confidence interval. The latter is by convention taken to cover 68% of the probability for the parameter of interest, according to your measurement. Is 0.10 a "sigma"? It can be interpreted as such only if the error distribution is Gaussian, which it usually isn't. Worse than that, in the case of a neutrino mass measurement you KNOW that the error distribution is NOT Gaussian: a Gaussian distribution is defined from minus infinity to plus infinity, while the parameter you are estimating cannot be negative. Note, in passing, that you might be estimating the sum of neutrino masses by measuring something else which MAY have a distribution from minus infinity to plus infinity; but the estimate of the parameter involves a conversion from your measured quantity, and since the parameter itself is positive-definite, its PDF can't be Gaussian.

So it is wrong to call 0.10 a "sigma". Fine, but it is a good approximation, right? Wrong. If you take "0.36+-0.10" to mean, as the speaker did yesterday, that we have a 3.6 sigma measurement of the sum of neutrino masses, just because 0.36 divided by 0.10 is 3.6, you are misrepresenting things quite a bit. You want to say that your central value is incompatible with zero (and that neutrinos are thus measured to be massive) at the 3.6 sigma level, which corresponds to a p-value of a few hundredths of a percent; but you simply cannot say that from the numbers quoted, as you do not know what the tails of the PDF of the estimated parameter look like. Strictly speaking, those two numbers only mean that the compatibility of the data with the "zero mass sum" hypothesis is below 16% (as +-0.10 is a central interval encompassing 68% of the probability, which leaves 16% in each tail), and nothing more; certainly not a hundredth of a percent. The rest is wishful thinking: if you don't know how that PDF is distributed, you are mistaken when you extrapolate.
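To put rough numbers on how much the tails matter, here is a minimal Python sketch. It compares the probability of landing below zero for two error distributions that share the same central value (0.36) and the same 68% central interval (+-0.10): a Gaussian, and a Laplace (double-exponential) distribution. The Laplace shape is purely my illustrative assumption of a fatter-tailed PDF; nothing says the real measurement follows it, and the function names are made up for this example.

```python
import math

def gaussian_lower_tail(n_sigma):
    """One-sided Gaussian probability of falling more than n_sigma below the mean."""
    return 0.5 * math.erfc(n_sigma / math.sqrt(2.0))

def laplace_lower_tail(distance, half_width, coverage):
    """One-sided Laplace probability of falling more than `distance` below the mean.

    The Laplace scale b is fixed by requiring that the central interval
    of +-half_width contains the fraction `coverage` of the probability:
    1 - exp(-half_width / b) = coverage.
    """
    b = -half_width / math.log(1.0 - coverage)
    return 0.5 * math.exp(-distance / b)

central, half_width = 0.36, 0.10
# Coverage of a Gaussian +-1 sigma interval (about 68.27%)
coverage = math.erf(1.0 / math.sqrt(2.0))

p_gauss = gaussian_lower_tail(central / half_width)
p_laplace = laplace_lower_tail(central, half_width, coverage)

print(f"Gaussian tail below zero: {p_gauss:.3e}")
print(f"Laplace tail below zero:  {p_laplace:.3e}")
print(f"ratio Laplace/Gaussian:   {p_laplace / p_gauss:.1f}")
```

Under these assumptions the same "0.36+-0.10" gives tail probabilities that differ by more than an order of magnitude, which is the whole point: the quoted interval alone does not fix the p-value.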

If you understood the above argument, it should come as no surprise that saying that 0.1+-0.0025 (or whatever the exact numbers are) is a 40-sigma result (implying, again, that zero is excluded at 40-sigma level) is an irritating misrepresentation of the meaning of the measurement result. Nobody will ever be capable of knowing the PDF of a parameter that far away in the tails: it implies an impossible control of measurement errors.
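For a sense of scale, the following sketch estimates the one-sided Gaussian tail probability that a literal "40 sigma" would stand for. It assumes a perfectly Gaussian PDF, which is exactly the assumption being questioned here, and uses the standard large-z asymptotic form of the tail, because the direct calculation underflows in double precision. The helper name is hypothetical.

```python
import math

def log10_gaussian_upper_tail(z):
    """log10 of P(Z > z) for a standard Gaussian, valid for large z.

    Uses the asymptotic form P(Z > z) ~ pdf(z) / z * (1 - 1/z**2).
    """
    ln_p = (-0.5 * z * z
            - math.log(z)
            - 0.5 * math.log(2.0 * math.pi)
            + math.log1p(-1.0 / (z * z)))
    return ln_p / math.log(10.0)

# Direct evaluation: the true value is far below the smallest positive
# double (about 5e-324), so the result underflows to zero.
direct = 0.5 * math.erfc(40.0 / math.sqrt(2.0))

log10_p = log10_gaussian_upper_tail(40.0)

print(f"direct evaluation:  {direct}")
print(f"log10 of the tail:  {log10_p:.1f}")
```

The exponent comes out at roughly -349: a probability that no experiment could ever back up with a matching knowledge of its systematic errors.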

You could argue that "sigma" is just a jargon word and that the speaker was just giving the scale of how far from zero the parameter had been measured to be, and that as such it was legitimate to talk of 40 sigma. I strongly disagree with that usage of the word. When we use the word "sigma", as in "the null hypothesis is excluded at the 4-sigma level", we are just using a surrogate for a p-value; counting p-values in sigma units is just a simple way (well, a bit less simple than others) to convert very small numbers into manageable ones, as when we use the micro, nano, and pico prefixes. We are saying that the probability of the data given the null hypothesis is a number which corresponds to the area under the tail of a Gaussian distribution from 4 sigma to infinity. Any other interpretation of what "4-sigma" means is fanciful. If we loosen up the way we use words that have a quantitative interpretation, we harm ourselves (we create confusion) and we deceive everybody else. We have a responsibility to let the outside world (science reporters, interested laymen, funding agencies) understand clearly what we mean when we publish our results, and we should be very careful about preserving the meaning of quantitative statements.
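The conversion between p-values and "sigma" units described above can be sketched in a few lines of Python using the standard library's `statistics.NormalDist`. I assume the one-sided convention (area under the Gaussian tail from N sigma to infinity), which matches the description in the text; the function names `sigma_to_p` and `p_to_sigma` are mine.

```python
import math
from statistics import NormalDist

_STANDARD_NORMAL = NormalDist(mu=0.0, sigma=1.0)

def sigma_to_p(n_sigma):
    """One-sided p-value: area under a unit Gaussian from n_sigma to infinity."""
    return 0.5 * math.erfc(n_sigma / math.sqrt(2.0))

def p_to_sigma(p_value):
    """Number of sigma whose one-sided Gaussian tail area equals p_value."""
    # inv_cdf(p) is the point with probability p below it; by symmetry
    # the point with probability p above it is its negative.
    return -_STANDARD_NORMAL.inv_cdf(p_value)

for n in (3.0, 4.0, 5.0):
    p = sigma_to_p(n)
    print(f"{n:.0f} sigma -> p = {p:.3e} -> back to {p_to_sigma(p):.3f} sigma")
```

Seen this way, "N sigma" is nothing more than a relabelling of a p-value, which is why quoting it for a result whose PDF tails are unknown conveys a precision that is not there.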