Part 1 on The Plausibility of Life

Darwin is famous for convincingly arguing that natural selection can explain why living things have features that are well-matched to the environment they live in. In the popular consciousness, evolution is often thought of as natural selection acting on random mutations to produce the amazing tricks and traits found in the living world. But “random mutation” isn’t quite right - when we describe evolution like this, we pass over a key problem that Darwin was unable to solve, a problem which today is one of the most important questions in biology. This key problem is the issue of variation, which is what biologists really mean when they talk about natural selection acting on random mutations. Variation and mutation are not the same thing, but they are connected. How they are connected is the most important issue covered in Kirschner and Gerhart’s The Plausibility of Life. It is an issue Darwin recognized, but couldn’t solve in those days before genetics really took off as a science.

Natural selection really works on organisms, not directly on mutations: a particular cheetah survives better than other cheetahs because it can run faster, not because it has a DNA base ‘G’ in a particular muscle gene. A domesticated yeast can survive in a wine barrel because of how it metabolizes sugar, not because of the DNA sequence of a metabolism gene. I know what you’re thinking: this is just a semantic game over proximal causes. But this is not just semantics, it is a real scientific problem: what is the causal chain that leads from genotype to phenotype, that is, from an individual organism’s DNA sequence, mutations included, to the actual physical or physiological traits of the complete organism?

If you look around your office or your home, you’re bound to see natural variation in phenotype in the form of your coworkers or even your family. We’re all different, but what accounts for those differences? How much is genetic, and how much is environment? Or how much is due to the environment acting on the genetics?

For evolution to work, certain genes for success have to be preferentially passed on to the next generation, but for most of the history of the field, biologists were not able to look at genes directly. Ingenious biologists probed the properties of genes by looking for mutants: flies with white eyes, or bread mold that could not make certain amino acids. During the 20th century, brilliant geneticists worked out a great theory of heredity, explaining the patterns of genetic inheritance, without really knowing what genes were physically made of, or how mutations physically occurred.

Now, in the era of torrents of cheap DNA sequence data, we can know better than ever what kinds of random mutations or sexual shuffling of genes take place inside cells. Identifying the genotype of an organism is now trivial, but we still don’t really understand how that genotype, the combination of many genes with many mutations, comes together to produce a unique individual.

There are several fields of biology focused on this problem. Two of the most important are quantitative genetics and systems biology.

Quantitative Genetics

Some cheetahs run faster than others, but it’s not just one gene that makes a difference; most likely several different genes are involved in producing different cheetah running speeds. The same is true of human height: we don’t just have tall people and short people; we see a range of heights in the human population. How many genes are involved in this range of phenotypes? What kinds of mutations are involved? Those are the kinds of questions asked in quantitative genetics.

One key idea to keep in mind is that we’re only looking at genes that vary in a population, genes with “mutations” or (more technically) polymorphisms: where I have an ‘A’ in my DNA, you may have a ‘G’ (or some other type of mutation). Some genes do not vary: in cheetahs, there may be an absolutely critical gene involved in running speed, but if it is 100% identical in all cheetahs, it cannot be responsible for the differences in cheetah running speed. Quantitative geneticists are only interested in the genes that can be different in different individuals.

So quantitative geneticists look for variation in nature, such as differences in running speed or height, or the ability to form spores in yeast, and then they use the tools of genetics, statistics, and DNA sequencing to find the genes. They may find, for example, that variants of six different genes in a cheetah population are responsible for almost all of the differences in running speed. Quantitative geneticists are gene finders: they find the genes and mutations involved in producing the physiological differences in individuals.
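To make the gene-finding idea concrete, here is a toy simulation in Python. It is only a sketch under invented assumptions, not a real analysis method: the population size, the ten loci, the effect sizes, and the function names are all made up for illustration. Six loci contribute additively to a trait (think running speed), four have no effect, and a simple correlation between genotype and trait picks out the six that matter.

```python
import random

def simulate_population(n_individuals, effects, noise_sd, rng):
    """Return (genotypes, phenotypes) for a toy additive trait."""
    genotypes, phenotypes = [], []
    for _ in range(n_individuals):
        # Each locus carries 0, 1, or 2 copies of the variant allele.
        g = [rng.randint(0, 1) + rng.randint(0, 1) for _ in effects]
        # Trait = sum of per-locus effects plus non-genetic noise.
        trait = sum(e * x for e, x in zip(effects, g)) + rng.gauss(0.0, noise_sd)
        genotypes.append(g)
        phenotypes.append(trait)
    return genotypes, phenotypes

def correlation(xs, ys):
    """Pearson correlation between two equal-length lists."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / (vx * vy) ** 0.5

def locus_scores(genotypes, phenotypes):
    """Correlate each locus's allele count with the trait."""
    n_loci = len(genotypes[0])
    return [correlation([g[i] for g in genotypes], phenotypes)
            for i in range(n_loci)]

# Hypothetical setup: loci 0-5 affect the trait, loci 6-9 do not.
EFFECTS = [1.0] * 6 + [0.0] * 4
rng = random.Random(0)
genotypes, phenotypes = simulate_population(2000, EFFECTS, 1.0, rng)
scores = locus_scores(genotypes, phenotypes)

for i, score in enumerate(scores):
    label = "causal" if EFFECTS[i] else "neutral"
    print(f"locus {i} ({label}): correlation {score:+.2f}")
```

Real studies are much harder than this: effects are unequal, loci interact, and the environment is not simple random noise. But the basic logic is the same: look for genetic variants whose presence tracks the differences in the trait.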

Systems Biology

Once you have the (currently hypothetical) six different genes responsible for the differences in cheetah running speed, the next problem is to understand how those genes actually work together inside of a cell. This has classically been the work of biochemists and molecular biologists, who studied what the various physical pieces of a cell do. But now we are running into some limitations of this classical approach:

First, many biochemists and molecular biologists have only studied one gene or protein at a time. This is great for understanding how that one protein works, and it is absolutely necessary work. Yet if we have six varying genes working together to make cheetahs run, we want to know how those six genes work in concert, not as individuals.

And second, even though molecular biologists and biochemists have often gone beyond single proteins, and studied chains of interacting proteins in an information processing pathway or a metabolic system, these pathways and systems are often so complex that verbal, intuitive reasoning isn’t enough to understand how they work. We need mathematical models.

This is where systems biology comes in. Recently, a group at the Rockefeller University analyzed how a set of genes works together when a yeast cell commits to copying its DNA. It turns out that a positive feedback loop is involved, which drives the cell forward through the process of cell division, and prevents the cell from sliding back into its previous, non-DNA-copying state. Some aspects of this positive feedback loop can be understood by verbal reasoning, but a deeper understanding comes from the mathematical model. And in this case, the modeling produces new ideas about how the system should work, which researchers can then test.
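As a rough illustration of why a mathematical model helps, here is a minimal Python sketch of a generic positive feedback loop. It is not the Rockefeller group's model; the equation, the parameter values, and the function names are assumptions chosen only to show the qualitative behavior. A single variable x (say, the activity of a gene product) promotes its own production, and a temporary input signal is applied for a while and then removed.

```python
def rate(x, signal, feedback=2.0):
    """Rate of change of x: input + self-activation - decay.

    The self-activation term is a steep (Hill-type) function of x,
    which is what creates two stable states, low and high.
    """
    return signal + feedback * x**4 / (1.0 + x**4) - x

def simulate(pulse_strength, pulse_duration=10.0, t_end=40.0, dt=0.01, x0=0.0):
    """Integrate with simple Euler steps; return x at the end.

    The input signal is on only while t < pulse_duration.
    """
    x = x0
    for step in range(int(t_end / dt)):
        t = step * dt
        signal = pulse_strength if t < pulse_duration else 0.0
        x += dt * rate(x, signal)
    return x

# A weak pulse nudges x up, but x falls back once the signal is gone.
final_after_weak_pulse = simulate(pulse_strength=0.2)
# A strong pulse pushes x past the threshold; feedback then holds it high.
final_after_strong_pulse = simulate(pulse_strength=0.6)

print("after weak pulse:  ", round(final_after_weak_pulse, 3))
print("after strong pulse:", round(final_after_strong_pulse, 3))
```

With these made-up parameters, the system has a low state near 0 and a high state near 1.84, separated by a threshold at x = 1. Once a strong enough signal carries the system over the threshold, it stays in the high state even after the signal disappears, which is the kind of no-sliding-back commitment described above. The threshold and the two stable states are easy to read off the equation but hard to get from verbal reasoning alone.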

Coming back to The Plausibility of Life, we can see that the real issue is how genes (and the random mutations in them) produce the variation in nature that is directly responsible for how well an organism does. How genetic variation produces phenotypic variation in an organism is now one of the central problems in biology, one that we are at last well-equipped to tackle. Fields like quantitative genetics and systems biology rely heavily on technology: genetic technology in the lab, DNA-sequencing technology, and the number-crunching technology that makes desktop computers faster than the supercomputers of several decades ago. Darwin would have been envious.

This is the somewhat delayed first installment of a series of posts on an interesting recent book by the accomplished biologists Marc Kirschner and John Gerhart. In this book, the authors lay out what they see as the most important research agenda for molecular biologists in the 21st century. The next installments are below:

Part 2