Now that the genome sequences of hundreds of bacteria and viruses are known, we can design tests that will rapidly detect the presence of these species based solely on their DNA. These tests can detect a pathogen in a complex mixture of organic material by recognizing short, distinguishing sequences—called DNA signatures—that occur in the pathogen and not in any other species.

Adam Phillippy and colleagues from the University of Maryland, USA, have developed a computer program that can identify these signatures with a higher degree of accuracy than ever before. They describe this new computational system, called Insignia, and its successful application to 46 Vibrio cholerae strains this week in the journal PLoS Computational Biology.

Insignia uses highly efficient algorithms to compare known bacterial and viral genomes against each other and against background genomes, including those of plants, animals, and humans. The results of these comparisons are stored in a database and used to rapidly compute signatures for any particular target species. The program has a wide range of potential applications, from diagnosing infections in humans to detecting harmful microbes in a water supply. To maximize its use by scientists in a variety of disciplines, Insignia is freely available on the authors’ website (http://insignia.cbcb.umd.edu).
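
To make the idea of a DNA signature concrete, the following is a minimal sketch in Python. It is not Insignia's algorithm, which this article does not describe in detail. It is a naive illustration that treats a signature as a fixed-length substring (a "k-mer") found in every target genome and in no background genome. The function names, the toy sequences, and the choice of k are all assumptions made for illustration, and the sketch ignores reverse-complement strands and the efficiency needed for real genomes.

```python
def kmers(seq, k):
    """Return the set of all length-k substrings of seq (upper-cased)."""
    seq = seq.upper()
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def find_signatures(targets, backgrounds, k):
    """Return k-mers present in every target and absent from all backgrounds.

    Hypothetical helper for illustration only; not part of Insignia.
    """
    if not targets:
        return set()
    # k-mers shared by all target sequences
    shared = set.intersection(*(kmers(t, k) for t in targets))
    # every k-mer seen in any background sequence
    background = set()
    for b in backgrounds:
        background |= kmers(b, k)
    return shared - background


if __name__ == "__main__":
    # Toy sequences, invented for this example
    targets = ["ACGTTGCA", "TTACGTTG"]
    backgrounds = ["ACGTAAAA"]
    print(sorted(find_signatures(targets, backgrounds, 5)))
    # prints ['ACGTT', 'CGTTG']
```

Holding every k-mer in memory this way would not scale to hundreds of complete genomes, which is one reason a production system needs more efficient algorithms and a precomputed database.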

Bacterial and viral pathogens have always represented one of the greatest threats to human health, and in recent times this threat has increased due to the possibility of engineered biological agents. Over the past decade, the genome sequencing field has sequenced the complete genomes of hundreds of bacteria and thousands of viruses, which now makes it possible to develop programs like Insignia that can detect any given virus or bacterium in a sample by its DNA pattern.

CITATION: Phillippy AM, Mason JA, Ayanbule K, Sommer DD, Taviani E, et al. (2007) Comprehensive DNA signature discovery and validation. PLoS Comput Biol 3(5): e98. doi:10.1371/journal.pcbi.0030098