Over 60 percent of the nearly 5,000 genome projects reported in the Genomes OnLine Database involve microbes. That is no surprise: microbes are important in everything from bioenergy to agriculture and medicine, and they are involved in Earth’s biogeochemical cycles.

Much more could be done with microbial genomics, says DOE JGI Genome Biology head Nikos Kyrpides, writing in Nature Biotechnology, if researchers move beyond the present anthropocentric focus and institute shared standards for genomic data collection and analysis.

According to Kyrpides, nearly 1,000 microbial genomes have been sequenced over the past 15 years, but the resulting data have been compromised by the lack of standards for many critical procedures in the field, from simple data exchange to gene finding, function prediction, and metabolic pathway description. Echoing other researchers, most notably DOE JGI’s Patrick Chain and Miriam Land during the recent “Sequencing, Finishing, Analysis in the Future” Conference, Kyrpides calls for the development of genome annotation standards and their adoption by sequencing centers around the world, which he says is a necessity for meaningful genome comparisons.

Genome Biology head Nikos Kyrpides of the DOE Joint Genome Institute. Credit: JGI

Kyrpides offers numerous suggestions to meet these and other challenges that face genomics research in the decade ahead. For example, the list of microbial genomes considered for sequencing is already limited to the approximately one percent of organisms that can be cultured in the lab, and it has been further biased by a focus on a few groups with particular impact on human health or activities. As a result, vast realms of biodiversity remain unexplored. Kyrpides applauds the effort to coordinate balanced sampling of the Tree of Life recently launched through GEBA, the Genomic Encyclopedia of Bacteria and Archaea. He also sees a way forward in pairing single-cell genomics, a technique now being pursued in earnest by DOE JGI researcher Tanja Woyke and her colleagues, with environmental metagenomics to provide a more holistic understanding of microbial communities and their individual members.

Kyrpides also suggests several innovative approaches for easing the data processing bottleneck that accompanies the exponential increase in genomic data. All-versus-all gene comparisons, previously a common practice, will become infeasible as the number of pairwise comparisons grows roughly with the square of the number of genes. To reduce the size of the datasets, he proposes a proxy approach in which one protein from each protein family, or one species from each genus, represents the group. Taking this one step further, all the genes from all the sequenced strains in a species (the pan-genome for that species) would constitute the genome representing that species in gene comparisons.
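The proxy and pan-genome ideas can be illustrated with a minimal Python sketch. It is not Kyrpides's implementation: the strain names, gene-family labels, protein names, and the rule for choosing a representative (the alphabetically first member) are all hypothetical and stand in for whatever criteria a real pipeline would use.

```python
def build_pan_genome(strain_genes):
    """Return the union of gene families found in any sequenced strain of a species."""
    pan = set()
    for genes in strain_genes.values():
        pan |= set(genes)
    return pan


def pick_representatives(members_by_group):
    """Choose one proxy per group (protein family or genus).

    The alphabetically first member is used here purely for illustration.
    """
    return {group: min(members) for group, members in members_by_group.items()}


def comparison_count(n):
    """Number of unordered pairs in an all-versus-all comparison of n items."""
    return n * (n - 1) // 2


# Hypothetical gene-family content of three strains of one species.
strains = {
    "strain_A": {"famA", "famB", "famC"},
    "strain_B": {"famB", "famC", "famD"},
    "strain_C": {"famA", "famD", "famE"},
}
pan_genome = build_pan_genome(strains)

# Hypothetical protein families, each reduced to a single proxy protein.
families = {
    "fam1": ["protX", "protA", "protM"],
    "fam2": ["protQ"],
}
proxies = pick_representatives(families)

# Pairwise comparisons before and after the reduction to proxies.
all_proteins = sum(len(members) for members in families.values())
pairs_before = comparison_count(all_proteins)
pairs_after = comparison_count(len(proxies))
```

In this toy case the species is represented by the five gene families found across its strains, and reducing four proteins to two proxies cuts the pairwise comparisons from six to one. The saving grows quickly as datasets get larger, since the pair count scales with the square of the number of items compared.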

Sharing his vision for the future of microbial genomics, Kyrpides observes: “The remarkable number of microbes—already estimated to be several orders of magnitude greater than the number of stars in the universe—urgently calls for a transition from random, anecdotal, and small scale surveys towards a systematic and comprehensive exploration of our planet.” With new tools in hand and international initiatives for increased collaboration underway, the field of microbial genomics is poised for a decade of exciting advances. 

The U.S. Department of Energy Joint Genome Institute is committed to advancing genomics in support of DOE missions related to clean energy generation and environmental characterization and cleanup. DOE JGI is headquartered in Walnut Creek, CA.