The research team led by Steven Salzberg developed special-purpose software to assemble 35 million DNA sequence fragments into the 30 chromosomes that make up the Bos taurus genome. The algorithms use paired-end sequence information, mapping data, and synteny with the human genome to detect errors, correct inverted segments, and fill in sequence gaps. About 91% of the resulting assembly is anchored onto chromosomes.
The researchers believe their assembly is the best available, thanks to its completeness and the algorithms' ability to correct thousands of errors. Their comparisons show that the new cow genome assembly agrees better with independent genetic maps, and represents cow genes more completely, than alternative assemblies.
The new assembly places some 150 million more nucleotides of DNA sequence (6%) onto chromosomes than the other draft assembly currently available, BosTau4.0 from the Baylor College of Medicine (BCM4). A new, expanded cow-human synteny map increases the number of syntenic breakpoints by approximately 30%. Salzberg's team also pinpointed a portion of the Bos taurus Y chromosome for the first time.
"Until the assembly is truly finished - a state that no mammalian genome, including human, has yet reached - we will continue to incorporate new data to fill in gaps, to correct the mis-oriented regions, and to place more sequences onto chromosomes," says Salzberg. The alpaca and sheep genomes are currently being sequenced, and comparisons among these closely related mammals should provide a rich source of further improvements.
Although sequencing and assembly of mammalian genomes have become commonplace since the human genome was first sequenced seven years ago, assembling large genomes accurately remains a challenge.
The complete assembly has been deposited at GenBank (accession DAAA00000000) and is also available at ftp://ftp.cbcb.umd.edu/pub/data/Bos_taurus
Article: 'A whole-genome assembly of the domestic cow, Bos taurus', Aleksey V Zimin, Arthur L Delcher, Liliana Florea, David R Kelley, Michael C Schatz, Daniela Puiu, Finnian Hanrahan, Geo Pertea, Curtis P Van Tassell, Tad S Sonstegard, Guillaume Marcais, Michael Roberts, Poorani Subramanian, James A Yorke and Steven L Salzberg, Genome Biology (in press)