Evolution And Personalized Medicine: How Genes Build Traits
    By Michael White | March 4th 2009 01:06 PM | 6 comments

    Show Me The Science Month Day 22

    How do genes work together to build body traits? This is one of the hottest questions in genetics today, and the answer holds implications not only for our understanding of evolution, but also health, agriculture, and wildlife conservation. A recent paper in Science (by Scientific Blogging's own Redneck Geneticist) takes a look at how genetic variants work together to generate the physical diversity that we see in living organisms.

    In high school biology class, most of us learned very simple genetics, using examples of Mendel's round or wrinkled peas, and red, white, or pink snapdragons. By working with simple cases like these, Mendel discovered some of the most fundamental principles of genetics, such as the fact that genes are inherited as discrete units (genes are digital). These fundamental principles are powerful, but like Newtonian mechanics when confronted with the three-body problem, simple, fundamental principles quickly get very tricky when faced with the complexity we encounter in real-world situations.

    Discrete Traits
    The inherited traits we learn about in high school biology generally fall into discrete categories: peas are either round or wrinkled; snapdragons are red, white, or maybe pink; you have black hair, blonde hair, brown hair, or red hair; you can either roll your tongue or you can't.

    The genetics behind these discrete traits is fairly easy to understand, because discrete traits are generally controlled by different variants of just one gene. For most of us with blue eyes, our blue eyes are caused by a genetic variant of one gene, OCA2 (inherited in two copies, one from each parent). If you have two copies of the 'blue-eyed' variant of OCA2, you have blue eyes, otherwise you don't (with a few exceptions).

    Diseases like cystic fibrosis and hemophilia fall into this single-gene category as well - if your parents are carriers of a cystic fibrosis genetic variant ('carrier' here means that they each have one good copy and one bad copy of the relevant gene, and thus don't have cystic fibrosis themselves), there is a 25% chance that you'll get two bad copies of this gene (one from your mother and one from your father) and thus get cystic fibrosis.
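
    The 25% figure comes from counting the equally likely ways two carrier parents can each pass on one copy. Here is a minimal sketch of that counting; the allele labels "A" (working copy) and "a" (cystic fibrosis variant) and the function name are illustrative assumptions, not standard notation for this gene.

```python
from collections import Counter
from fractions import Fraction
from itertools import product

def offspring_genotypes(mother, father):
    """Return each possible child genotype with its probability.

    Each parent is a 2-character string of alleles, e.g. "Aa".
    Every pairing of one maternal and one paternal allele is equally likely.
    """
    # Sort each pair so that "aA" and "Aa" count as the same genotype.
    counts = Counter("".join(sorted(pair)) for pair in product(mother, father))
    total = sum(counts.values())
    return {genotype: Fraction(n, total) for genotype, n in counts.items()}

# Both parents are carriers: one working copy, one cystic fibrosis variant.
carrier_cross = offspring_genotypes("Aa", "Aa")
for genotype, prob in sorted(carrier_cross.items()):
    print(genotype, prob)
```

    The "aa" entry (two bad copies) comes out to 1/4, the 25% chance described above; half of the children are expected to be carriers like their parents.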

    In other words, for discrete traits, there is nearly a 1:1 correlation between a genetic variant and a physical trait - if you've got the genetic variant, you've got the trait. The genetics is straightforward.

    Quantitative Traits
    Look around and you'll see that most traits are not discrete - people are not either short or tall, obese or skinny, fast runners or slow; instead, such traits fall into a continuous range of values, often following a bell curve pattern. Traits with continuous values are called quantitative traits, and the genetics behind these traits is much more complex than the genetics of discrete traits, because these continuous traits are not produced by variants of just one gene. Quantitative traits are controlled by multiple genes, and thus there is no simple 1:1 correlation between a trait and a particular genetic variant.
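
    To see why several genes acting together give a smooth range of values rather than a few categories, here is a toy calculation. It is only a sketch under strong assumptions that are mine, not the article's: every gene has just two variants, each 'plus' variant adds one unit to the trait, genes are inherited independently, and environment and gene interactions are ignored.

```python
from math import comb

def trait_distribution(n_genes, p=0.5):
    """Probability of each trait score when n_genes genes simply add up.

    Illustrative assumptions: each gene has a 'plus' variant (frequency p)
    that adds 1 to the score and a 'minus' variant that adds 0; everyone
    carries two copies of every gene; genes are inherited independently.
    """
    copies = 2 * n_genes
    # Binomial probability of carrying exactly k 'plus' copies.
    return [comb(copies, k) * p**k * (1 - p) ** (copies - k)
            for k in range(copies + 1)]

for n_genes in (1, 3, 10):
    dist = trait_distribution(n_genes)
    print(f"{n_genes} gene(s): {len(dist)} possible scores")
    for score, prob in enumerate(dist):
        # Crude text histogram: one '#' per 1/60 of the population.
        print(f"  {score:2d} {'#' * round(prob * 60)}")
```

    With one gene there are only three classes, as in the snapdragon example; with ten genes the histogram already looks like a bell curve.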

    Why is it important to understand the genetics behind quantitative traits? Other than the fact that almost all interesting traits are quantitative, scientists are interested in quantitative traits because they are key to our understanding of both evolution and many common human diseases. To understand the evolution of quantitative traits and complex diseases, ones that are affected by variants of multiple genes, we need to answer a common set of questions: How many genes (and which variants of those genes) have an impact on the trait? How large of an impact does each gene have? How does a variant of one gene impact the effect of another gene on the trait? Without knowing the answers to questions like these, we can't accurately predict your genetic risk of getting diabetes, for example, and the growing field of personalized medicine will have little hope of success.

    A key point to keep in mind about quantitative trait genetics is that we are here interested only in genes that come in different variants, because we're interested in what causes differences among individuals. There may be a gene that is critical for height, but which is 100% identical in all humans; that gene then does not account for the differences in human height. When we study quantitative traits, we're interested in differences among individuals.

    Quantitative Traits and Evolution
    How are quantitative traits important for our understanding of evolution? They are fundamental in our quest to understand the connection between genotype and phenotype, how our genetic information produces our physical, physiological makeup, and this genotype-phenotype connection is what makes evolution possible. For instance, if we are interested in understanding how natural selection produces faster African antelopes (running speed is in part a genetically controlled, quantitative trait), we need to know how many genes have an impact on running speed, how much of an impact each gene has, what variants of these genes exist in the antelope population, and how the different variants interact with each other.

    In the case of discrete traits, where there is a strong correlation between a trait and a single gene, natural selection has an easy task: if variants in only one gene determine whether an antelope is fast or slow, selection that favors fast antelopes will cause the 'fast' variant of that single gene to predominate in the antelope gene pool. When it comes to quantitative traits, the effects of natural selection are more complex: maybe one genetic variant produces faster runners only in the presence of a second variant of a different gene; in that case natural selection can favor these two variants only when they are together in the same organism, because only when they are together do these variants make a faster runner.

    And because genetic variants are shuffled around each generation, there is not always a strong correlation between the fitness of the parent and its offspring. One speedy antelope with the right combination of 'fast' genetic variants may not pass on that entire lucky genetic combination to all or any of its offspring. The same thing is true of height in humans - short parents sometimes produce tall kids, and not all tall parents have children who are as tall as they are. By understanding the genetics behind such quantitative traits, we can get a better handle on how these traits are shaped by evolution.

    Quantitative Traits and Your Health
    The genotype-phenotype connection at the heart of quantitative traits is closely tied up with common human diseases. Our risk for genetically complex diseases like heart disease or diabetes is itself a quantitative trait, in a sense: it is affected by variants in multiple genes, and the impact of each variant in many cases depends on interaction with other variants. For example, if you have Variant 1 instead of Variant 2 in gene X, maybe you have a 5% risk of heart attack (all other things being equal). But you may also have Variant 1 in gene Y, which on average decreases your risk of heart attack to 2%. In the presence of Variant 1 in gene X, however, your gene Y variant is doubly effective at reducing your risk, and thus you have only a 1% risk of heart attack. In other words, knowing only one genetic variant will not give you an accurate picture of your genetic risk for heart disease.
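
    The made-up numbers in this example can be laid out as a small lookup table, which makes the point easier to see: the risk estimate changes depending on which variants you happen to know about. The variant labels and the table layout are illustrative only, and the percentages are the hypothetical ones from the paragraph above, not real clinical figures.

```python
# Hypothetical heart attack risks from the example above, keyed by which
# variants are known. "X1" = Variant 1 in gene X, "Y1" = Variant 1 in gene Y.
RISK_BY_KNOWN_VARIANTS = {
    frozenset({"X1"}): 0.05,        # only the gene X variant is known
    frozenset({"Y1"}): 0.02,        # only gene Y is known (average over gene X)
    frozenset({"X1", "Y1"}): 0.01,  # both known: Y1 is doubly effective with X1
}

def estimated_risk(known_variants):
    """Look up the risk estimate for a given set of known variants."""
    return RISK_BY_KNOWN_VARIANTS[frozenset(known_variants)]

for known in (["X1"], ["Y1"], ["X1", "Y1"]):
    print(known, f"{estimated_risk(known):.0%}")
```

    Someone tested only for gene X would be told 5%, five times the 1% estimate they would get once gene Y is taken into account.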

    Clearly, it gets complicated quickly. Understanding the genetic architecture of these complex, quantitative traits has major implications for the potential of personalized medicine. Will we ever be able to accurately predict your risk for certain diseases by looking at your DNA? Will we be able to prescribe drugs based on your genetic profile? All of this depends on our ability to make sense of the genetics of quantitative traits.

    Playing with a Model Quantitative Trait
    These are the types of questions that Justin, our Redneck Geneticist, set out to answer in his recent paper. One way to get at the genetics of a quantitative trait is to take individuals at each extreme value of the trait, have them produce offspring (which will generally exhibit a wide range of values for the trait), and then track down the genes that make the offspring different. Justin could have tried having a very short person and a very tall person hook up and produce several dozen children of widely varying heights. A researcher could then go in and begin to identify genetic variants that affect human height.

    Instead of trying to find volunteers crazy enough to agree to this experiment, Justin chose an experimental system that is a little more manageable, and one that would produce results in a reasonable period of time. He chose to study spore formation in yeast. Like many microorganisms, yeast form spores as a survival strategy, and spore formation is a quantitative trait: some yeast strains form spores very effectively, others hardly form spores at all, and many strains cover the whole range in between. Justin found strains at the two extremes and crossed them: he took a very good sporulator, a yeast strain isolated from an oak tree, and crossed it with a very poor sporulator, a strain scraped out of a wine barrel. From this cross, he obtained over 300 yeast offspring strains, which covered the entire range of sporulation efficiencies, the classic signature of a quantitative trait.

    Yeast cells forming spores - notice the 4-cell cloverleaf pattern. Photo Credit: Justin Gerke.

    Using some fancy genetic techniques, Justin then tracked down the genetic variants involved in controlling sporulation efficiency, and he could then answer some of the basic questions about quantitative traits that geneticists are interested in:

        How many genes? Justin found that five genes accounted for almost 90% of the differences in sporulation efficiency among strains, with three genes of those five accounting for most of the variability in the trait. Some geneticists have speculated that quantitative traits are in general controlled by dozens of genes (and thus hopelessly complex), but in Justin's case, just a few genes can produce a very wide range of variability.

        What kinds of genetic variants are involved? Justin identified four individual DNA changes that account for almost all of the sporulation variability in his strains. Two of these changes occurred in non-coding regulatory regions of the DNA; in other words, these changes didn't alter the protein-coding genes themselves, only how they were regulated. Two other changes were in the protein-coding regions. One gene had a key variant in the non-coding region, and another in the coding region. So this quantitative trait involves changes to both protein structure and gene regulation.

        What kinds of genes? This is a hot question in evolution. How does natural selection change a quantitative trait? Do you change the molecular machinery involved in building that trait, or do you change how that machinery is regulated? In the case of spore formation, all sorts of enzymes and structural proteins are involved in building spore walls, but Justin did not find any important genetic variants in these genes. All of his genetic variants were found in regulatory genes, transcription factors which control the expression of the spore-making machinery. This makes sense: it's better to tweak the software of a trait, rather than risk hopelessly breaking the hardware.

        Do these genes interact? If none of the genetic variants involved in spore formation interacted, you should be able to simply add up the effects of each variant for a final score. If this were true in our heart disease risk example, you would simply add up the risk for each variant you have for your final heart disease risk: at gene 1 your risk is +3%, gene 2 -1%, and gene 3 +2%, so your final genetic risk would be 4%. But this is not how things typically work in nature, as Justin found in his yeast strains. For example, a variant in one gene promoted good sporulation only in the presence of a certain variant of a second gene; otherwise, the first gene had almost no effect. The lesson is that interaction between genetic variants is important in quantitative traits.
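
    The difference between "just add it up" and "it depends on the other gene" can be sketched in a few lines. The additive numbers are the ones from the heart disease example in the last point above; the interaction function is a deliberately crude stand-in, with a placeholder effect size of 1.0 and "almost no effect" rounded down to zero. Neither value is a measurement from the yeast study.

```python
# Per-gene risk contributions from the example above, in percentage points.
ADDITIVE_EFFECTS = {"gene 1": +3, "gene 2": -1, "gene 3": +2}

def additive_score(effects):
    """No-interaction model: the final score is the sum of the parts."""
    return sum(effects.values())

def variant_a_effect(has_a, has_b, effect_with_b=1.0):
    """Interaction model: variant A only matters when variant B is present.

    effect_with_b is a hypothetical placeholder, and 'almost no effect'
    without B is simplified to exactly zero.
    """
    if not has_a:
        return 0.0
    return effect_with_b if has_b else 0.0

print("Additive risk:", additive_score(ADDITIVE_EFFECTS), "%")
for has_a in (False, True):
    for has_b in (False, True):
        print(f"A={has_a!s:5} B={has_b!s:5} effect of A: {variant_a_effect(has_a, has_b)}")
```

    In the additive model each gene can be scored on its own. In the interaction model, the contribution of variant A cannot even be stated without first asking about variant B.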

    The Big Lesson
    These results are great for our understanding of yeast, but what implications does this hold for evolution or personalized medicine? This is just one example of the genetic underpinnings of a quantitative trait; to draw firm general conclusions, we need to study more examples (which Justin is now busy doing). But this work is an outstanding case study, worked out in amazing molecular detail. Getting down to this level of detailed understanding in humans is a daunting task, but Justin's work provides some clues about what we might find in humans. Quantitative traits can be built with variants in only a few genes, each with large effects, instead of dozens with tiny effects, which means that there might be hope yet for personalized medicine. And many of the critical genetic variants will probably be found in regulatory genes, meaning that the physiological diversity in a species is in large part due not to differences in the molecular machinery responsible for physiology; it's due to differences in how that machinery is regulated.

    As you may have noticed, the 30 Days of Evolution Blogging sputtered to a stop temporarily, following the well known (non-quantitative) relationship between work, marriage, kids and blogging: 2 weeks of my work hell + 2 weeks of spouse's work hell + kids = blogging problems. But I'm not giving up! This week I'll finish out our evolution blogging month with some more amazing evolution research published just in the last few weeks. Evolution as a science is alive and well. Each day I will blog about a paper related to evolution published in January or February of 2009.

    Are you a blogger and want to join in? Here's how.

    Front Page Image: Associated Press


    Interesting post, thanks! How does epigenetics fit into quantitative traits?
    In high school biology class, most of us learned very simple genetics, using examples of Mendel's
    I do remember that in high school. As a class, we each had about 5 dominant & 5 recessive genes, & we went around & put those genes into a paper bag, shook it up, & picked out genes randomly in order to determine the traits that our child would have. I guess the only difference here is that genes represent %'s of traits rather than nominal differences? I guess, along these lines, I have to ask a pretty basic question, but the more I think about it, the more it stumps me: Why are there any bad genes whatsoever?
    How does epigenetics fit into quantitative traits? 
    That's a very good question, and the answer isn't very clear at this point. Epigenetics could make things much messier, because we can't measure potential epigenetic changes like DNA methylation as easily as we do DNA sequencing.
    Why are there any bad genes whatsoever?
    Also a very good question. There are several answers: 
    1) Whether a gene is good or bad is context dependent. For example, certain 'bad' genetic variants that put us at risk for diabetes today might have been beneficial 10,000 years ago when we weren't eating the way we eat today. Whether a variant is good or bad depends on the environment and the genetic background of the individual.
    2) If you do the math (using population genetics models), it's clear that mildly harmful genetic variants can predominate in a population just by sheer chance. In other words, natural selection isn't always strong enough to weed out all bad mutations. Also in some cases you get balancing selection - like the sickle-cell gene in sub-Saharan Africa. People with one 'good' and one 'bad' (sickle-cell) copy of a particular hemoglobin gene are more resistant to malaria than people with two 'good' copies of the gene. Thus, the sickle cell variant is beneficial enough to be preserved in the population, even though an unfortunate fraction of the population ends up with two copies of the sickle-cell variant.

    3) New mutations - as researchers start sequencing more and more individual genomes, they're finding more rare, sometimes harmful genetic variants. These are newly arisen mutations. Each particular variant is generally found in only a few people, but all together, every one of us has dozens or hundreds of harmful, recently-arisen mutations. Fortunately, most of them seem to be recessive, as evidenced by the fact that most of us are still alive and walking around. 
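
    The "sheer chance" in point 2 can be illustrated with a toy simulation. This is a simplified Wright-Fisher-style sketch, not a model fitted to any real population: the population size (40 gene copies), the 1% fitness cost, the 50% starting frequency, and the function name are all assumptions chosen to keep the example small.

```python
import random

def harmful_variant_fixes(copies=40, start_freq=0.5, cost=0.01,
                          rng=None, max_generations=10_000):
    """Simulate one small population; True if the harmful variant takes over.

    Each generation, every gene copy is drawn at random from the previous
    generation, with the harmful variant slightly less likely to be picked.
    """
    if rng is None:
        rng = random.Random()
    count = round(copies * start_freq)
    for _ in range(max_generations):
        if count == 0 or count == copies:
            break  # the variant has been lost or has taken over
        freq = count / copies
        # Selection: down-weight the harmful variant before sampling.
        pick_prob = freq * (1 - cost) / (freq * (1 - cost) + (1 - freq))
        # Chance: the next generation is a random sample of the current one.
        count = sum(rng.random() < pick_prob for _ in range(copies))
    return count == copies

rng = random.Random(2009)  # fixed seed so repeated runs give the same result
runs = 200
takeovers = sum(harmful_variant_fixes(rng=rng) for _ in range(runs))
print(f"Harmful variant took over in {takeovers} of {runs} simulated populations")
```

    In a population this small, selection against a mildly harmful variant is weak compared with the randomness of who happens to reproduce. Raising `copies` or `cost` should make takeovers rarer.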

    Re: genes are digital. Yes, but digital does not necessarily imply that only paired combinations are possible. Suppose that, instead of all genes being viewed as dominant-recessive pairs, we view at least some of them as mutually excitatory/inhibitory groups. The possible combinations are no longer restricted to simple pairings. Just an idea, to accept or reject - but then, on such foundations is built the whole edifice of science.
    The possible combinations are no longer restricted to simple pairings.
    You're correct. Simplified high school biology lessons can be misleading on this point, because in those lessons we usually only talk about two variants, a dominant and a recessive one.  
    Genes typically have many different variants. At any one position in the DNA where human beings vary, there are of course 4 possible variants: A, T, C, G, although all 4 are not always found in a population. In any given gene, there may be dozens or hundreds of DNA positions where that gene varies, since a gene is generally thousands of DNA bases long. The possible combinations add up, although again, not every possible combination is found in human populations.

    So it's an oversimplification to think in terms of paired dominant/recessive or good/bad variants. (It makes sense to talk about paired variants in introductory genetics though, because in each individual, there are only two copies of a particular gene - the maternal and paternal copies, no matter how many other possible variants exist in the population.) There are lots of possible combinations, and the effect of one variant can depend on the presence of another, like you say.
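
    To put rough numbers on "the possible combinations add up", here is a small counting sketch. It gives theoretical upper bounds only; as noted above, most of these combinations are never actually found in a population. The function names are made up for this example.

```python
def possible_gene_versions(variable_positions, variants_per_position=4):
    """Upper bound on distinct versions of a gene.

    Assumes every variable DNA position could independently carry any of
    the 4 bases (A, T, C, G).
    """
    return variants_per_position ** variable_positions

def possible_pairs(versions):
    """Number of distinct maternal/paternal pairs, ignoring which parent
    contributed which copy (two identical copies count as one pair)."""
    return versions * (versions + 1) // 2

for positions in (1, 2, 12):
    versions = possible_gene_versions(positions)
    print(f"{positions} variable position(s): {versions} versions, "
          f"{possible_pairs(versions)} possible pairs per individual")
```

    Even a dozen variable positions allow millions of versions of a gene in principle, which is why the two-variant, dominant/recessive picture is a teaching simplification.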
    Gerhard Adam
    In discussing quantitative traits, I'm struck by the fact that no one seems to mention those physical manifestations or characteristics that are environmentally determined and may not be directly expressed.  In other words, it seems that some quantitative traits operate within certain boundary conditions, so that only the "average" result is expressed unless acted upon externally.

    For example, while many humans may share the same genes for muscles, it is clear that the actual strength or speed with which those muscles will be expressed is dependent as much on exercise and nutrition as genetics.  As a result a bodybuilder's muscles are significantly different from the proverbial "couch potato", even though both could have identical genes.

    Therefore it would seem that a concept like "speed" in animals may manifest more significantly when that animal has a greater need to run and therefore those with "better" genetics might be able to develop the appropriate muscles better for running.  Whether it be in energy storage, or nerve response, or recovery time .... any of these could make a huge difference in speed or strength and yet there might be virtually no distinguishable genes involved from one animal to the next.  Without the need to exert those muscles, such a gene expression could literally lie dormant for generations and never be pushed to its "limits".

    I'm equally sure that some genetic traits (such as height) may well be regulated by the nutrition available while growing and potentially many other environmental factors that could "push" the boundary conditions of the gene.  Successive generations would only need to slightly expand the "boundaries" to begin a trend towards a specific trait or expression.  Once again, this wouldn't necessarily indicate a difference in genes between individuals, since it is unlikely that the basic "blueprint" has been altered as much as its regulation.

    One would also have to consider that since such physical expression does have boundary conditions, whether there may be limits placed by other structural developments regulated by other genes too.  In other words, the limit of muscle development may well lie with the expression of bone/joint strength rather than the muscles themselves.  Therefore, it is unlikely that we'd find a genetic difference between individuals on that basis alone.

    I realize that I'm oversimplifying a very complex area, but the main point I wanted to raise was that we tend to consider genetic expressions in absolute terms, rather than the fact that they operate over a range (within the same individual), often based on external environmental factors of stress and usage.
    Mundus vult decipi
    You raise good points. It's true that genetic variants that impact a trait aren't necessarily the obvious ones - as you say, genetic variants with an impact on running speed may not necessarily be in muscle genes. That's actually one justification for genome-wide association studies: researchers want to look at the potential effect of all genetic variants on disease risk or height, not just study the obvious candidate genes that we would guess have an effect.
    And to study quantitative traits, we have to do our best to keep the environment constant, because environment certainly has an effect. This is why it is so much easier to study the genetic architecture of these traits in model organisms, where we can keep the environment constant. In model organisms, as Justin's research shows, we can get at this problem in great detail. If we can get a handle on how the genetics works in model organisms, we'll be much better prepared to understand what's going on in humans, or in wild animals that we can't study in the lab.

    In humans, the environment problem is more difficult, obviously. Right now there are two main approaches to tackle the problem: large studies with thousands or tens of thousands of individuals, where the environment averages out and we have statistical power to detect the effects of individual genetic variants; and studies of differences within families, where heritable, non-environmental differences are more obvious - this approach is closer to the classical Mendelian genetic approach, but it's more difficult with quantitative traits.