A preliminary assembly and annotation of the soybean (Glycine max) genome has been made available by the U.S. Department of Energy Joint Genome Institute (DOE JGI) to the greater scientific community to enable bioenergy research.
The announcement was made by Eddy Rubin, DOE JGI Director, during his keynote remarks Jan. 15 at the Plant and Animal Genome XVI Conference in San Diego, CA.
The large-scale shotgun DNA sequencing project began in the middle of 2006 and will be completed in 2008. A total of about 13 million shotgun reads have been produced and deposited in the National Center for Biotechnology Information (NCBI) Trace Archive in accordance with the consortium’s commitment to early access and consistent with the Fort Lauderdale genome data release policy.
The current assembly (representing 7.23x coverage), gene set, and browser are collectively referred to as "Glyma0". Glyma0 is a preliminary release based on a partial dataset, and it is expected to be replaced with an improved, chromosome-scale "Glyma1" version by the end of 2008. Early users of these data are encouraged to track their favorite genes by saving local copies of the DNA sequences of these loci, and not by identifier or sequence coordinate, as identifiers and coordinates will change in future versions.
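As an illustration of tracking loci by sequence rather than by identifier, the following is a minimal Python sketch, not a DOE JGI tool. It assumes the sequences of interest have been downloaded as a FASTA file; the function names, the output header layout, and the use of a SHA-256 digest as a release-independent key are all hypothetical choices made for this example.

```python
import hashlib
from typing import Dict, Iterable, Iterator, Set, Tuple


def parse_fasta(lines: Iterable[str]) -> Iterator[Tuple[str, str]]:
    """Yield (identifier, sequence) pairs from FASTA-formatted lines."""
    name, chunks = None, []
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if name is not None:
                yield name, "".join(chunks)
            header = line[1:].split()
            # The identifier is taken to be the first word of the header.
            name, chunks = (header[0] if header else ""), []
        else:
            chunks.append(line)
    if name is not None:
        yield name, "".join(chunks)


def snapshot_loci(lines: Iterable[str], wanted: Set[str]) -> Dict[str, Tuple[str, str]]:
    """Map a digest of each wanted sequence to (release identifier, sequence).

    The digest depends only on the uppercased sequence, so it stays the same
    even if the identifier or coordinates change in a later release.
    """
    snapshot = {}
    for name, seq in parse_fasta(lines):
        if name in wanted:
            seq = seq.upper()
            digest = hashlib.sha256(seq.encode("ascii")).hexdigest()
            snapshot[digest] = (name, seq)
    return snapshot


def write_fasta(snapshot: Dict[str, Tuple[str, str]], path: str) -> None:
    """Write the snapshot as a local FASTA file, 60 bases per line."""
    with open(path, "w") as handle:
        for digest, (name, seq) in snapshot.items():
            # Hypothetical header layout: short digest plus the old identifier.
            handle.write(f">{digest[:12]} original_id={name}\n")
            for start in range(0, len(seq), 60):
                handle.write(seq[start:start + 60] + "\n")
```

A saved sequence can later be searched against a newer assembly with a sequence-alignment tool of the user's choice to recover the locus under its new identifier and coordinates.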
DOE JGI’s interest in sequencing the soybean stems from the crop's role as a principal source of biodiesel, a renewable fuel with the highest energy content of any alternative fuel.
Detailed knowledge of the soybean genetic code will enable crop improvements for more effective application of this plant for clean bioenergy generation. Knowing which genes control specific traits, researchers are able to change the type, quantity, and/or location of oil produced by the crop. By using the sequence information generated by DOE JGI, it may be possible to develop a customized biomass production platform that combines oil seed production for biodiesel with enhanced vegetative growth for ethanol conversion, doubling the energy output of the crop. In 2004, over 3.1 billion bushels of soybeans were grown on nearly 75 million acres in the US, with an estimated annual value exceeding $17 billion, second only to corn and about twice that of wheat.
Several other individuals, projects, grants, and agencies have made this monumental project possible. These included four major projects: Public Expressed Sequence Tags (ESTs), the original physical map, SoyMap (which includes BAC libraries, modern physical mapping, and clone-based sequencing), and the Genetic Map, with funding from USDA, NSF, United Soybean Board (USB), and the North Central Soybean Research Program (NCSRP).
The Public EST Project, supported by USB and NCSRP, was led by Lila Vodkin of the University of Illinois at Urbana-Champaign; Randy Shoemaker of the USDA-ARS, Ames, Iowa; and P. Steven Keim of Northern Arizona University.
The original physical map development, funded by USB, was conducted by Jan Dvorak, from the University of California, Davis, along with the Washington University Genome Center in St. Louis, Missouri, and David Grant, USDA-ARS, Ames, Iowa.
The NSF SoyMap team, comprising principal investigator Scott Jackson, Gary Stacey and Henry Nguyen, Jeff Doyle of Cornell University, William Beavis of the National Center for Genome Resources (NCGR) in Santa Fe, New Mexico, and Iowa State, Gregory May (NCGR), Will Nelson and Rod Wing of the University of Arizona, with Randy Shoemaker, anchored the map and conducted quality control.
The team devoted to genetic mapping and physical map anchoring, which yielded several thousand sequence-based markers, included USDA-Agricultural Research Service (ARS) investigators Perry Cregan and Dave Hyten of Beltsville, Maryland; Randy Shoemaker, David Grant, and Steven Cannon of USDA-ARS, Ames, Iowa; and James Specht of the University of Nebraska, Lincoln.
The annotation of the soybean genome was carried out by a team of researchers from the DOE JGI and the University of California Berkeley’s Center for Integrative Genomics, with support from the DOE, USDA, NSF, and the Gordon and Betty Moore Foundation.
The U.S. Department of Energy Joint Genome Institute, supported by the DOE Office of Science, unites the expertise of five national laboratories -- Lawrence Berkeley, Lawrence Livermore, Los Alamos, Oak Ridge, and Pacific Northwest -- along with the Stanford Human Genome Center to advance genomics in support of the DOE missions related to clean energy generation and environmental characterization and cleanup. DOE JGI’s Walnut Creek, CA, Production Genomics Facility provides integrated high-throughput sequencing and computational analysis that enable systems-based scientific approaches to these challenges.
The preliminary data can be accessed at http://www.phytozome.net/soybean.