    Single Top: tW Production Observed by CMS!
    By Tommaso Dorigo | September 19th 2012 09:02 AM | 15 comments
    About Tommaso

    I am an experimental particle physicist working with the CMS experiment at CERN. In my spare time I play chess, abuse the piano, and aim my dobson...

    In the last few days CMS released four new publications, which you can download from the arXiv. These are exquisite new measurements in the fields of top physics, exotica searches, and Higgs physics, and so I thought I would give you the coordinates here, and comment briefly on the first result in the list: paper 1, paper 2, paper 3, paper 4.

    The result I want to comment on is titled "Evidence for associated production of a single top quark and W boson in pp collisions at 7 TeV". As the title suggests, this is a search for a particular production mode of top quarks in proton-proton collisions, one which is both rare and difficult to extract even in the 7-TeV LHC collisions.

    We are accustomed to top quarks produced in pairs in hadron collisions. The interaction responsible for the creation of the heavy quark is in this case quantum chromodynamics (QCD), which is flavour-blind and cannot materialize one quark at a time. What this means is that since quarks come with a quantum number defining their flavour (e.g., top quarks have T=1), and since the colliding hadrons have zero net value of that quantum number, the strong interaction cannot "connect" the initial state (a proton-proton system with T=0) with a final state with non-null T. T is conserved by QCD!

    What QCD can do, and does at a reasonable rate in 7-TeV proton-proton collisions, is to produce a pair of top-antitop quarks, which have collectively T=0 since T(top)=1, T(antitop)=-1 (the flavour quantum number is additive). Indeed, one 7-TeV proton-proton collision every 500 million produces a top quark pair. A small fraction, one might argue; and yet these events are not hard to isolate, thanks to the very special decay characteristics of top quarks and their heavy mass.
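
    As a rough cross-check of the "one in 500 million" figure, the rate is just the ratio of two cross sections. The two values below are approximate numbers assumed for this sketch, not taken from the post, so only the order of magnitude should be trusted:

```python
# Rough cross sections at 7 TeV. Both values are assumptions made for
# this sketch (approximate, not quoted in the post).
SIGMA_TTBAR_PB = 165.0       # top-pair production, in picobarns
SIGMA_INELASTIC_MB = 70.0    # inelastic proton-proton collisions, in millibarns

PB_PER_MB = 1.0e9            # unit conversion: 1 mb = 1e9 pb


def collisions_per_top_pair(sigma_ttbar_pb, sigma_inelastic_mb):
    """Average number of inelastic collisions per top-pair event."""
    return sigma_inelastic_mb * PB_PER_MB / sigma_ttbar_pb


ratio = collisions_per_top_pair(SIGMA_TTBAR_PB, SIGMA_INELASTIC_MB)
print(f"about one top pair every {ratio:.1e} collisions")
```

    With these inputs the ratio comes out at a few hundred million, the same ballpark as the number quoted above.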

    Single tops instead cannot be produced by QCD processes alone, for the same reason discussed above. Production of single top proceeds via several mechanisms, all involving at some stage the electroweak interaction, which turns a b-quark or a lighter down-type quark into a top, with the intercession of a W boson. W bosons in fact are the only means, in the standard model, by which flavour changes.
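
    The flavour bookkeeping of the last few paragraphs can be sketched in a few lines. This is only an illustration: the particle labels and function names are made up for the example, and conserving T is a necessary condition for a QCD process, not a sufficient one.

```python
# Top-flavour number T of each particle; T is additive.
# The labels are hypothetical names chosen for this sketch.
TOP_NUMBER = {"t": +1, "tbar": -1, "b": 0, "g": 0, "W-": 0}


def total_top_number(particles):
    """Sum the additive quantum number T over a list of particles."""
    return sum(TOP_NUMBER[p] for p in particles)


def conserves_top_number(initial, final):
    """True if T is the same before and after, as QCD requires."""
    return total_top_number(initial) == total_top_number(final)


# gluon gluon -> top antitop: T goes from 0 to 0, fine for QCD.
print(conserves_top_number(["g", "g"], ["t", "tbar"]))   # True
# b gluon -> top W-: T goes from 0 to 1, so a weak vertex is needed.
print(conserves_top_number(["b", "g"], ["t", "W-"]))     # False
```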

    So what is special about the new process isolated by CMS? Well, it is a rare one, because it combines the electroweak mechanism (which occurs less frequently than the QCD one, since the latter is mediated by a larger coupling constant) with a high demand of energy: two heavy objects are created, a top quark and a W boson. Other leading-order single-top production processes have one top and one or two lighter quarks in the final state, so they "turn on" at lower hard-process energies. You should recall that although the LHC center-of-mass energy of the studied collisions is 7 TeV (8 in 2012), what is actually available for the creation of the final state is most often much less than that: it depends on the fraction of the projectiles' energy carried by the quark or gluon which participates in the hard subprocess. So it does matter a lot whether the final state weighs 180 GeV (as in other more frequent single-top production processes) or 260 GeV (as in the one we are discussing).
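
    To put numbers on the threshold argument: if the two colliding partons carry fractions x1 and x2 of their protons' momenta, the energy available to the hard subprocess is sqrt(x1*x2*s), so a final state of mass M needs x1*x2 >= (M / sqrt(s))^2. The sketch below assumes massless, collinear partons and uses the two masses quoted above; the function name is made up for the example.

```python
SQRT_S_GEV = 7000.0  # proton-proton centre-of-mass energy, from the text


def min_momentum_fraction_product(mass_gev, sqrt_s_gev=SQRT_S_GEV):
    """Smallest x1*x2 for which sqrt(x1*x2*s) reaches the final-state mass."""
    return (mass_gev / sqrt_s_gev) ** 2


for label, mass in [("lighter single-top final state", 180.0),
                    ("tW final state", 260.0)]:
    tau = min_momentum_fraction_product(mass)
    # If both partons carried the same fraction, each would need sqrt(tau).
    print(f"{label}: x1*x2 >= {tau:.2e}, equal fractions x >= {tau ** 0.5:.3f}")
```

    The tW final state needs roughly twice the product x1*x2, and since partons carrying larger momentum fractions are scarcer in the proton, the rate drops accordingly.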



    As shown in the Feynman diagrams above, the production requires fishing a b-quark out of the initial state, and the emission of a W boson from the quark line, which turns the b into a top quark. The tW final state then decays into jets and leptons: CMS searched for these processes concentrating on the "dilepton" final state, in which the W directly produced in the subprocess and the other W (originated when the top quark disintegrates) both yield electron-neutrino or muon-neutrino pairs. One thus has a signature consisting of two energetic leptons, large missing transverse energy from the escaping neutrinos, and one single jet from b-quark hadronization.

    You may well imagine that the nastiest background to these events is top pair production. In fact, top pairs can yield the same signature when one b-quark escapes unmeasured, e.g. in the direction of the beam line, or is undetected because the kinematics make it too "soft" to generate an energetic hadronic jet. So CMS defines three event categories: events with two jets, both b-tagged; events with two jets, with only one jet containing a b-tag; and events with only one jet, b-tagged. Of course, the tW signal mostly populates the third category, while top pair production "prefers" the 2-jet, 2-b-tags category. The first and second categories can be used as a nice "control region", where backgrounds must explain most of the observed events.
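
    The three categories amount to a simple lookup on the jet and b-tag counts. A minimal sketch, in which the category labels and the function name are invented for illustration:

```python
def classify_event(n_jets, n_btags):
    """Return the category of a dilepton event, or None if it fits none."""
    if n_jets == 1 and n_btags == 1:
        return "1 jet, 1 b-tag"      # signal-rich category
    if n_jets == 2 and n_btags == 1:
        return "2 jets, 1 b-tag"     # control region
    if n_jets == 2 and n_btags == 2:
        return "2 jets, 2 b-tags"    # control region, mostly top pairs
    return None


# A few (n_jets, n_btags) combinations, made up for the example.
for counts in [(1, 1), (2, 1), (2, 2), (3, 1)]:
    print(counts, "->", classify_event(*counts))
```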

    The graph on the right shows the amount of signal (the empty part of the histogram bars) which is present in the three categories. The signal-rich one is the leftmost bin. Note how a small amount of signal is also present in the control regions, but it is fractionally smaller there.

    To isolate the signal further, CMS uses a multi-variable discriminant based on boosted decision trees, a technology that has become very popular in HEP applications in the course of the last decade. In practice the kinematical variables that are capable of discriminating signal from backgrounds are used to construct "trees" of possible selection strategies. These trees are classified according to their discriminating power, using simulated events; the data can then be discriminated using their collective output, which is a single output variable (generically taking values between zero and 1).
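
    For readers who have never seen boosting in action, here is a toy version. It is a minimal sketch, not the CMS configuration: it uses AdaBoost with the simplest possible trees (a single cut on a single variable), and the two "kinematic variables" are made-up Gaussian numbers standing in for simulated events.

```python
import math
import random


def stump_predict(stump, row):
    """A depth-1 tree: one cut on one variable, returning +1 or -1."""
    feature, threshold, sign = stump
    return sign if row[feature] > threshold else -sign


def best_stump(X, y, w):
    """Scan every variable and cut value for the lowest weighted error."""
    best_err, best = None, None
    for feature in range(len(X[0])):
        for threshold in sorted({row[feature] for row in X}):
            for sign in (+1, -1):
                stump = (feature, threshold, sign)
                err = sum(wi for row, yi, wi in zip(X, y, w)
                          if stump_predict(stump, row) != yi)
                if best_err is None or err < best_err:
                    best_err, best = err, stump
    return best_err, best


def train(X, y, n_rounds=10):
    """AdaBoost: grow a tree, then give misclassified events more weight."""
    w = [1.0 / len(X)] * len(X)
    ensemble = []
    for _ in range(n_rounds):
        err, stump = best_stump(X, y, w)
        err = min(max(err, 1e-9), 1.0 - 1e-9)       # guard the logarithm
        alpha = 0.5 * math.log((1.0 - err) / err)   # weight of this tree
        ensemble.append((alpha, stump))
        w = [wi * math.exp(-alpha * yi * stump_predict(stump, row))
             for row, yi, wi in zip(X, y, w)]
        norm = sum(w)
        w = [wi / norm for wi in w]
    return ensemble


def discriminant(ensemble, row):
    """Weighted vote of all trees, rescaled to lie between 0 and 1."""
    vote = sum(alpha * stump_predict(stump, row) for alpha, stump in ensemble)
    return 0.5 * (1.0 + vote / sum(alpha for alpha, _ in ensemble))


# Toy "simulation": two made-up variables per event.
rng = random.Random(1)
signal = [(rng.gauss(1.0, 1.0), rng.gauss(1.0, 1.0)) for _ in range(60)]
background = [(rng.gauss(-1.0, 1.0), rng.gauss(-1.0, 1.0)) for _ in range(60)]
X = signal + background
y = [+1] * len(signal) + [-1] * len(background)

bdt = train(X, y)
mean_sig = sum(discriminant(bdt, e) for e in signal) / len(signal)
mean_bkg = sum(discriminant(bdt, e) for e in background) / len(background)
print(f"mean output: signal {mean_sig:.2f}, background {mean_bkg:.2f}")
```

    Signal-like events pile up towards 1 and background-like events towards 0, which is the single output variable described above. Real analyses use deeper trees, many more input variables, and independent samples for training and testing.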

    The discriminant does not separate the signal from top-pair production very strongly, because the kinematics of the two processes are extremely similar. However, some additional power is achieved by its use. The final cross section measurement is performed by fitting the data to the output of the kinematic discriminant (shown on the left for events in the "1 jet 1 b-tag" signal bin). The result is a cross section σ(tW) = 16 (+5 -4) pb, in agreement with the standard model prediction.
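
    The idea behind such a fit can be shown with a one-parameter toy. The sketch below scans a signal-strength factor mu that multiplies the signal template and picks the value that best describes the binned data under Poisson statistics. All bin contents are invented for illustration, and the real measurement also treats systematic uncertainties, which this sketch ignores.

```python
import math


def negative_log_likelihood(mu, data, signal, background):
    """Poisson negative log-likelihood over bins, up to a constant."""
    total = 0.0
    for n, s, b in zip(data, signal, background):
        expected = mu * s + b
        total += expected - n * math.log(expected)
    return total


def fit_signal_strength(data, signal, background, lo=0.0, hi=5.0, steps=5000):
    """Brute-force scan of mu on a regular grid; returns the best value."""
    grid = [lo + (hi - lo) * i / steps for i in range(steps + 1)]
    return min(grid,
               key=lambda mu: negative_log_likelihood(mu, data, signal, background))


# Invented templates in four bins of the discriminant output.
signal_template = [2.0, 5.0, 10.0, 20.0]
background_template = [100.0, 60.0, 30.0, 10.0]
# Toy "data" built to match the prediction exactly (mu = 1).
toy_data = [s + b for s, b in zip(signal_template, background_template)]

mu_hat = fit_signal_strength(toy_data, signal_template, background_template)
print(f"best-fit signal strength: {mu_hat:.3f}")
```

    The measured cross section is then mu times whatever cross section was used to normalize the signal template; the asymmetric uncertainty comes from how quickly the likelihood worsens on either side of the minimum.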

    I find it quite nice that these rare processes are now systematically being observed at the LHC. Nobody really doubted their existence, but each of them adds a stroke to the big picture of the standard model and the physics of hadronic collisions. If you were a theorist you could say you did not learn much from this measurement; but as an experimentalist, you can well say it increases your confidence in the power of your data analysis tools and strategies. I give more importance to these "sideline" measurements than to long searches for non-existing signals. As Galileo Galilei famously put it 400 years ago:

    "Io stimo piu' il trovar un vero, benché di cosa liggiera, che il disputare lungamente delle massime questioni, senza conseguir verità nissuna".

    Now ask Google translate to turn that into decent English for you ;-)

    Comments

    Tommaso,

    it seems that the links to the papers point to a CERN website which most readers of your blog will not have access to (I get a login page). Could you link to the arXiv instead?

    dorigo
    That's right Tobias, I'll fix it asap.
    Cheers,
    T.
    Tommaso,
    Thanks for the interesting news!
    Regarding the quote from Galileo, I'm afraid Google Translate does not do an adequate job - it leaves three or four words still in Italian, evidently because it doesn't recognize them. The partial translation does not convey significant meaning.

    "It 's more important to find a light truth that doing a long discussion of deep questions without finding any truth" (G.Galilei)

    Tommaso will correct me :-)

    Letizia

    rholley
    Perhaps "truth than doing"?

    If so, after making the change please feel free to delete this comment.

    Robert H. Olley Quondam Physics Department University of Reading England
    dorigo
    Yes Robert, "than" is correct.
    Cheers,
    T.
    rholley
    A bit of spelling change might help.  This gets close to it, and at least Google does turn it all into something like English.

    "Io stimo piu' il trovar un vero, benché di cosa leggera, che il disputare lungamente delle massime questioni, senza conseguire verità nessuna".


    Robert H. Olley Quondam Physics Department University of Reading England
    Amir D. Aczel
    Tommaso, Two questions (and please excuse my ignorance). First, I always thought that quarks can only be "seen" as jets because of asymptotic freedom (the strong force). Does that apply only to first-generation quarks, or what? And second: All the elements in your Feynman diagram are very close to the Penguin diagrams that John Ellis once drew for me in the CERN Cafeteria. But there is no loop, or "penguin" here--only tree-level? Is there a secondary loop process too? Ciao,
    Amir
    Amir D. Aczel
    dorigo
    Hi Amir,

    the top quark decays faster than QCD has time to exert its influence, because Γ(top)>>Λ_QCD (by about an order of magnitude). So the top quark lives and decays as a free particle.

    There are loop diagrams, at higher order, but the ones drawn are the leading order ones - tree-level, in this case, suffices. Note also that if we were to consider NLO diagrams, we would lose the distinction between production of top-antitop pairs and the production of tW. But that is a rather subtle discussion; if you are curious, though, read the paper.

    Cheers,
    T.
    Amir D. Aczel
    P.S. The Italian is "Dantesque"--modern Google may not do it, but anyone who's read Dante in the original would understand (otherwise, as a previous commentator has suggested, change "i"s to "e"s at places, etc., and then submit to Google; but then, if you know what to change...you don't need Google!)
    Amir D. Aczel
    My intuition, which the original post seems to confirm poetically, but not in the more technical language of statistics, is that discovery of a predicted rare SM process like the tW decay at the expected frequency at LHC, and similar other processes, collectively, ought to place a statistical limit on the maximum level of systemic error that could be present in almost every LHC result. In other words, it ought to be proof that the LHC is more precise than it has claimed that it is.

    Since systemic error makes up a significant share of all of the margin of error in a typical LHC result, a reduction in systemic error ought to boost a lot of not quite notable almost two sigma and almost three or five sigma results, into actual two, three and five sigma results.

    Is my intuition about the role of accurate detection of rare decays like tW in calibrating the experimental apparatus wrong?

    I know that the LHC considers "look elsewhere effects" when estimating the statistical significance of sampling errors in evaluating its searches sometimes, but is anyone doing wholistic, experiment-wide analysis of the entire set of results to calibrate systemic error estimates? Statistical significance enhancing limits on systemic error estimates from constraints derived from the entire universe of experimental results established at LHC ought to (incompletely) mitigate the statistical significance decreasing look elsewhere effects associated with doing very large numbers of experiments and then cherry picking the biggest deviations from SM expectations.

    Also, has anyone been doing whole LHC run comparisons of actual rates of deviations from SM expectations observed at LHC to expected rates of deviations from SM expectations due to sampling errors and other sources of error, presumably using non-parametric statistical tools like Chi-squared? This would be a very powerful critique or confirmation (as the case might be) of the accuracy of the methods that LHC papers have been using to estimate errors in individual results. Since the LHC has a large universe of results published (and more unpublished) already, where the SM expectation to observed data is stated together with explicitly stated error bars, making the comparison shouldn't be too terribly hard if a couple of bright people from the cast of thousands at LHC (or at supporting institutions worldwide) were assigned to the problem.

    If the actual amount of deviation of the results from the SM expectation is lower than the error bars estimate, then overall, systemic error and theoretical expectation error are probably being overestimated (taking it as a given that QM really is perfectly random so sampling error estimates must be right).

    If the actual amount of deviation of the results from the SM expectation is higher than the error bars estimate, then overall, either systemic error and theoretical expectation error are probably being underestimated, or there are BSM physics at work that are having a diffuse and subtle effect but will probably show up sooner or later.

    dorigo
    Interesting comment, OhW. I think such an analysis is bound to be incomplete and biased, because it is impossible to know all results that the collaborations do not publish (deeming them not well understood, and so waiting for more data or a check with a different method, etcetera). Mind you, this is a very small fraction, but it is a sizable fraction of the "deviant" ones you seek to quantify.

    I believe it is more interesting to play a-posteriori games such as the one I proposed a while ago with the top quark mass. It might be time to re-do it, now that the CMS experiment has reached for the first time a per-measurement error below 1 GeV on that parameter (more on that in a forthcoming piece). The thing I have in mind is discussed in this post.

    Cheers,
    T.
    Thank you Tommaso for picking our paper and for the great presentation!
    By the way, I noticed the links for paper 2 and paper 3 (the exotica ones) are identical.

    I guess that the other major CMS result Tommaso wanted to propose, recently appeared on arXiv, concerns searches for t-tbar resonances: http://xxx.lanl.gov/abs/1209.4397 (another "massima questione"...)