The chemical marks littering the DNA inside our cells have been like trees in front of us: important, but we couldn't see the whole forest because we could study only one gene at a time.
New high-throughput DNA sequencing technology has enabled researchers at the Salk Institute for Biological Studies to map the precise position of individual DNA modifications throughout the genome of the plant Arabidopsis thaliana, and to chart their effect on the activity of any of Arabidopsis’ roughly 26,000 genes.
The Salk study, which appears today in Cell, paints a detailed picture of a dynamic and ever-changing, yet highly controlled, epigenome, the layer of genetic control beyond the regulation inherent in the sequence of the genes themselves.
Being able to study the epigenome in great detail and in its entirety will provide researchers with a better understanding of plant productivity and stress resistance, the dynamics of the human genome, stem cells’ capacity to self-renew and how epigenetic factors contribute to the development of tumors and disease.
“For a long time the prevailing view held that individual modifications are not critical,” says Joseph Ecker, Ph.D., a professor in the Plant Biology laboratory and director of the Salk Institute Genomic Analysis Laboratory. “The genomes of higher eukaryotes are peppered with modifications but unless you can take a detailed look at a large scale there is no way of knowing whether a particular mark is critical or not.”
Discoveries in recent years have made it increasingly clear that there is far more to genetics than the sequence of building blocks that make up our genes. Adding molecules such as methyl groups to the backbone of DNA without altering the letters of the DNA alphabet can change how genes interact with the cell’s transcribing machinery, handing cells an additional tool to fine-tune gene expression.
“The goal of our study was to integrate multiple levels of epigenetic information since we still have a very poor understanding of the genome-wide regulation of methylation and its effect on the transcriptome,” explains postdoctoral researcher and co-first author Ryan Lister, Ph.D.
The transcriptome encompasses all RNA copies, or transcripts, made from DNA. The bulk of transcripts consists of messenger RNAs, or mRNAs, which serve as templates for the manufacture of proteins, but the transcriptome also includes regulatory small RNAs, or smRNAs. The latter wield their power over gene expression by literally cutting short the lives of mRNAs or by tagging specific sequences in the genome for methylation.
But before Lister could start to unravel the multiple layers of epigenetic regulation that control gene expression, he had to pioneer new technologies that allowed him to look at genome-wide methylation at single-base resolution and to sequence the complete transcriptome within a reasonable timeframe.
Collaborating scientists at the ARC Centre of Excellence in Plant Energy Biology at the University of Western Australia in Perth developed a powerful, web-based genome browser, which played a crucial role in unlocking the information hidden in the massive datasets.
Cells employ a whole army of enzymes that add methyl groups at specific sites, maintain established patterns or remove undesirable methyl groups. When Lister and his colleagues compared normal cells with cells lacking different combinations of enzymes, they discovered that cells put a lot of effort into keeping certain areas of the genome methylation-free.
On the flip side, the Salk researchers found that when they knocked out a whole class of methylases, a different type of methylase would step into the breach for the missing ones. This finding is relevant for a new class of cancer drugs that work by changing the methylation pattern in tumor cells.
“You might succeed in removing one type of methylation but end up with increasing a different type,” says Ecker. “But very soon we will be able to look and see what kind of compensatory changes are happening and avoid unintended consequences.”
Previous studies had found that a subset of smRNAs could direct methylation enzymes to the region of genomic DNA to which they aligned. Overlaying genome-wide methylome and smRNA datasets confirmed increased methylation precisely within the stretch of DNA that matched the sequence of the smRNA. Conversely, heavily methylated smRNA loci tended to spawn more smRNAs.
“We looked at a plant genome but our method can be applied to any system, including humans,” says Lister. Although the human genome is about 20 times bigger than the genome of Arabidopsis – plant biologists’ favorite model system not least because of its compact genome – Ecker predicts that within a year or so, sequencing technology will have advanced far enough to put the 3 billion base pairs of the human genome and their methyl buddies within reach.
“This really is just the beginning of unmasking the role of these powerful epigenetic regulatory mechanisms in eukaryotes,” says Ecker.
This work was supported by grants from the National Science Foundation, the Department of Energy, the National Institutes of Health and the Mary K. Chapman Foundation.
Scientists who also contributed to the study include postdoctoral researcher and co-first author Ronan C. O’Malley, Ph.D., and postdoctoral researcher Brian D. Gregory, Ph.D., in Ecker’s lab; graduate student and co-first author Julian Tonti-Filippini and professor A. Harvey Millar, Ph.D., both at the ARC Centre of Excellence in Plant Energy Biology at the University of Western Australia, Perth; and professor Charles C. Berry, Ph.D., in the Department of Family/Preventive Medicine at the University of California, San Diego.