A group of the world's leading sequencing centers have announced plans to sequence 1000 human genomes
. The cost of the first human genome project was about $3 billion
; by comparison, the next 1000 will be a steal at possibly only $50 million dollars (and that's total cost, not per genome). But that's still a lot of money - why are we investing so much in sequencing genomes?
It may be a lot up front, but the benefits, in terms of both economics and
medical research, easily outweigh the cost of such a large project. By pooling sequencing resources and making large amounts of genome sequence data available up front, we can avoid inefficient and redundant sequencing efforts by groups of independent research groups trying discover gene variants involved in disease. In fact, it would probably be worthwhile to sequence 10,000 human genomes. With 1000 genomes, we're at least making a good start.
The 1000 genomes project
comes at a time when new, genome-wide disease studies have created a need for an extensive catalog of human genetic variation, and new technology has provided a potentially efficient way to fill that need.
Genome-wide association studies have scanned the genomes of thousands of people, sick and healthy, to find regions of DNA that may be involved in diseases like diabetes and heart disease. These studies have highlighted regions of our chromosomes that could be involved in disease, but the genome scans used are often still too low-resolution for researchers to pinpoint that exact genetic variant that might be involved.
In many cases there dozens or more genes inside of a disease-linked region, any of which could be the disease gene of interest. As a result, groups that do these genome-wide studies have to invest considerable time and resources mapping genetic variants at higher resolution - which of course significantly increases the effort required to get anything useful out of genome-wide association studies. What we need is a much better, much more detailed catalog of all the spots where humans vary in their genomes.
Although we're all roughly 99.9% similar in our genomes, that still leaves millions of DNA positions where we vary. And often we don't just vary at single DNA base-pairs (instances where in one position you might have a 'T', while I have a 'C'); small deletions and insertions of stretches of DNA also exist, and can be involved in disease. The 1000 genomes project intends to map both single-base changes and these 'structural variants.'
Many genetic variants important in health show up in just 1%, 0.1%, or even 0.01% of the population. Imagine a medically important genetic variant present in just 0.1% of the U.S. population; 300,000 people will have this variant and will be possibly at risk for a particular disease. Knowing about such variants could help us understand how the disease develops, and possibly design prevention and treatment strategies.
For 300,000 people in the U.S, and millions more around the world, it would therefore be a good thing to know about this variant. But to find those rare variants, we can't just sequence 100 human genomes, and even with 1000, we would miss a lot.
So it's obvious that 1000 genomes is just a start. To go more aggressively after rare variants that are likely to be medically important, the 1000 genomes group is going to also focus in on the small gene-containing fraction of the human genome. This will enable them to put more resources into finding rare variants near genes - places where we expect medically important variants to show up.
And let's not forget the technological benefits: by using next-generation sequencing technology, which is still not quite mature, this consortium hopes to develop innovative ways to effectively and cheaply re-sequence human genomes. In both physical technology and data analysis methods, we'll see benefits that will lead to more widespread use of this technology, hopefully in clinical diagnostics as well as research.
It has been nearly 20 years since the Human Genome Project officially began. We're still waiting for the promised medical benefits to emerge, but in the mean time, this effort has transformed biological research - in my opinion, at least as much as the foundational discoveries of molecular biology in the 50's and 60's. Future medical benefits will be based on this science.