Many areas of research and medicine depend on knowing which immune system proteins an individual carries, because these proteins shape the body’s ability to fight disease and its tendency to mistakenly attack its own tissues. However, obtaining this information is costly and difficult.

In a new study, Listgarten et al. demonstrate how statistical modeling can help researchers obtain this information more easily and cost-effectively.

At the core of the human immune response is a train-to-kill mechanism in which specialized immune cells are sensitized to recognize small pieces of foreign pathogens (e.g., HIV). Once sensitized, these cells are activated to kill cells that display the same piece of pathogen. However, for sensitization and killing to occur, the piece of pathogen must be “paired up” with one of the infected person’s specialized immune proteins, an HLA (human leukocyte antigen) protein.

The way in which pathogen peptides interact with these HLA proteins determines whether and how an immune response will be generated. Therefore, knowing which HLA proteins a person has is vital in transplant medicine, in finding immunogenetic risk factors for disease, and in understanding how viruses like HIV mutate inside their host and evade the immune system.

The statistical model learns patterns from a large set of previously measured, high-quality HLA data. Using these patterns, the team from Microsoft Research, the National Cancer Institute, Massachusetts General, and the University of Oxford can take low-quality HLA data and clean it up, yielding data of higher quality than what was originally measured in the laboratory.
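
The general principle of learning patterns from high-quality data and using them to resolve low-quality data can be illustrated with a deliberately simplified sketch. This is not the model from the study, which is not described here in detail. It assumes only that each low-quality measurement narrows a person's HLA type down to a few candidate alleles, and that allele frequencies counted in high-quality data can rank those candidates. The function names are hypothetical, and the allele labels and counts are invented toy values, not real measurements.

```python
from collections import Counter


def fit_allele_frequencies(high_res_alleles):
    """Estimate allele frequencies from unambiguous, high-quality typings."""
    counts = Counter(high_res_alleles)
    total = sum(counts.values())
    return {allele: n / total for allele, n in counts.items()}


def resolve_ambiguous(candidates, frequencies, pseudo=1e-6):
    """Return a probability for each candidate allele of an ambiguous typing.

    `pseudo` is a small constant (an assumption of this sketch) so that
    candidates never seen in the training data keep a nonzero weight.
    """
    weights = {a: frequencies.get(a, 0.0) + pseudo for a in candidates}
    norm = sum(weights.values())
    return {a: w / norm for a, w in weights.items()}


# Invented toy training data: high-quality typings from 10 hypothetical samples.
training = ["A*0201"] * 6 + ["A*0205"] * 1 + ["A*0301"] * 3
freqs = fit_allele_frequencies(training)

# A low-quality measurement that cannot tell two alleles apart.
posterior = resolve_ambiguous(["A*0201", "A*0205"], freqs)
best = max(posterior, key=posterior.get)
print(best, posterior)
```

In this toy setting the more common allele receives most of the probability. A realistic model would presumably use richer patterns than single-allele frequencies, but the output has the same shape: a probability for each possible resolution of an ambiguous measurement.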

With this publication, Listgarten and co-authors have made a public tool available to the research community so that others can improve the quality of their HLA data and thus study individual immune systems more effectively.

Jennifer Listgarten, Zabrina Brumme, Carl Kadie, Gao Xiaojiang, Bruce Walker, Mary Carrington, Philip Goulder, David Heckerman (2008) Statistical Resolution of Ambiguous HLA Typing Data. PLoS Comput Biol 4(2): e1000016. doi:10.1371/journal.pcbi.1000016