A genome sequence is a long string written in a four-letter code: about 3 billion letters in the case of the human genome. Deciphering that code has traditionally been left to professional annotators, who draw on a number of sources (for instance, knowledge about similar genes in other organisms) to work out where a gene starts, where it stops and what it does. Professional annotation is the "gold standard", but it is an exceptionally slow process.

However, a crowdsourcing approach may offer a faster solution. Don't cringe, scientists, but it's Wikipedia.

Andrew Su, Jon Huss III and colleagues have established a 'Gene Wiki', an online repository of information on human genes, within Wikipedia. They envision a network of articles, created by a computer program and enhanced by user contributions, which will describe the relationships and functions of all human genes.

A great deal of information can be attached to any given gene: its name, sequence, position on a chromosome, the protein(s) it encodes, the other gene(s) it interacts with, and so on. Presenting this information is referred to as 'gene annotation.' Because the information may come from many different researchers working independently, it is important that resources exist to bring it together.

Existing annotation resources include gene portals and model organism databases. However, the information stored in these is treated as definitive, so it must be presented formally and kept up to date by designated experts. The new method allows a much more flexible, organic accumulation of scientific knowledge, because every reader can also edit and add to the Gene Wiki pages.

To stimulate the development of this Wikipedia-based resource, Andrew Su and colleagues developed a system that automatically posts information from existing databases as 'stub' articles on Wikipedia. A computer program downloads information from an existing database, formats it according to Wikipedia's markup and the 'stub' template that the authors have designed, and posts it on Wikipedia, but only if a page does not already exist for that gene.
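
The download, format and post steps described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the authors' program: the GeneRecord fields, the "Hypothetical gene infobox" template name and the in-memory dictionary standing in for Wikipedia are all invented for the example.

```python
from dataclasses import dataclass


@dataclass
class GeneRecord:
    """Hypothetical record, as it might arrive from an existing database."""
    symbol: str
    name: str
    chromosome: str


def format_stub(record: GeneRecord) -> str:
    """Render a record as wiki markup using a made-up stub template."""
    lines = [
        "{{Hypothetical gene infobox",  # assumed template name
        f"| symbol = {record.symbol}",
        f"| name = {record.name}",
        f"| chromosome = {record.chromosome}",
        "}}",
        "",
        f"'''{record.name}''' ('''{record.symbol}''') is a human gene.",
    ]
    return "\n".join(lines)


def post_stubs(records, wiki):
    """Create a stub for each gene that has no page yet.

    `wiki` is a plain dict (title -> page text) standing in for Wikipedia.
    Existing pages are never overwritten. Returns the titles created.
    """
    created = []
    for record in records:
        title = record.symbol
        if title in wiki:
            continue  # a page already exists, so leave it untouched
        wiki[title] = format_stub(record)
        created.append(title)
    return created


# Illustrative run: one gene already has a human-written page.
wiki = {"TP53": "An existing, human-written article."}
records = [
    GeneRecord("TP53", "tumor protein p53", "17"),
    GeneRecord("BRCA1", "BRCA1 DNA repair associated", "17"),
]
created = post_stubs(records, wiki)
```

The check for an existing page is the important design choice here: the program seeds pages that are missing and leaves anything a person has already written alone.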

The authors are confident that their stubs will seed the posting of more detailed information from scientists who encounter them on Wikipedia—and they report that, so far, they appear to be succeeding: the absolute number of edits on mammalian gene pages has doubled.

Citation: Huss JW III, Orozco C, Goodale J, Wu C, Batalov S, et al. (2008) A gene wiki for community annotation of gene function. PLoS Biol 6(7): e175. doi:10.1371/journal.pbio.0060175