Junk DNA And The Onion Test
    By T. Ryan Gregory | June 1st 2008 05:00 AM
    About T. Ryan

    I am an evolutionary biologist specializing in genome size evolution at the University of Guelph in Guelph, Ontario, Canada.



    One copy of the human genome is more than 3 billion nucleotides in length, and weighs in at about 3.5 picograms (pg, or trillionths of a gram). Only about 1.5% of this is composed of our 20,000 or so protein-coding genes, though other data suggest that at least 5% has been conserved by natural selection, suggesting that a notable portion of the non-coding majority is also functional. On the other hand, it is now apparent that much of the genome residing in our cells is made up of sequences like transposable elements -- "parasites" of the genome that can move about within and be copied independently of the "host" genome -- and especially their extinct remnants. One such element, Alu, is present in more than one million copies. Even if some of these prove to be functional, it is not necessary to invoke function at the organism level to explain the existence of all transposable sequences any more than one needs to identify functions for the host to explain the existence of viruses.
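
    As a rough consistency check on the two figures quoted above (mass in picograms and length in nucleotides), the conversion can be sketched in a few lines. This is only an illustration: the factor of roughly 978 million base pairs per picogram is a commonly used approximation that is not stated in the post, and the function name is hypothetical.

```python
# Assumption (not from the post): 1 pg of double-stranded DNA corresponds to
# roughly 978 million base pairs. Treat the result as an approximation.
BP_PER_PG = 0.978e9


def picograms_to_base_pairs(mass_pg: float) -> float:
    """Convert a haploid genome mass in picograms to an approximate length in base pairs."""
    return mass_pg * BP_PER_PG


# 3.5 pg is the human haploid genome size quoted in the post.
human_bp = picograms_to_base_pairs(3.5)
print(f"Human genome: about {human_bp / 1e9:.1f} billion bp")
```

    Under that assumed factor, 3.5 pg works out to a little over 3.4 billion base pairs, which agrees with "more than 3 billion nucleotides".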

    Nevertheless, and though a significant amount is known regarding the constituents of the genome and the mechanisms by which they accumulate, there has long been a tendency to assume that much -- perhaps even most or all -- of the DNA in the genome serves a function. There are two sources of this view: creationism (an intelligent designer would not make something useless) and extreme adaptationism (if it were not functional it would have been removed by natural selection). Proponents of both viewpoints can be seen overstating the importance of new findings regarding function in small portions of the genome -- extrapolating from a discovery that a few percent is conserved or that a particular transposable element serves a function, to arguing that the entire genome is indeed functional.

    It is in response to this line of thinking that I usually bring up the onion.  The onion, Allium cepa, is a diploid (2n = 16) plant with a haploid genome size of about 17pg, about five times larger than our own genome. The reason this is relevant is that while it may be that most of the non-coding DNA in the human genome is necessary for gene regulation, chromosome structure, protecting against mutations, or some other function(s), simply assuming that this is the case runs into the immediate problem of explaining why an onion requires so much more regulatory, structural, protective, or otherwise useful non-coding DNA.

    I have called this the Onion Test, a term that I am pleased to see has been useful in discussions about junk DNA. For example, it has appeared in interesting articles published in New Scientist (Pearson 2007) and Seed (Myers 2008). As I summarized it,

    The onion test is a simple reality check for anyone who thinks they have come up with a universal function for non-coding DNA. Whatever your proposed function, ask yourself this question: Can I explain why an onion needs about five times more non-coding DNA for this function than a human?

    What's so special (or bad) about onions? Nothing. They were simply chosen because they have considerably more DNA than humans and few people would find it easy to assume that onions require much more DNA than humans do. Moreover, they have close relatives with radically different genome sizes but rather similar biological features (Ricroch et al. 2005).

    Left, Allium altyncolicum (7pg); centre, A. cepa (17pg); right, A. ursinum (31.5pg).
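
    The comparison behind the Onion Test is simple arithmetic, and a minimal sketch makes the numbers concrete. The genome sizes are the ones quoted in this post; the dictionary and function names are only illustrative.

```python
# Haploid genome sizes in picograms, as quoted in the post.
GENOME_SIZE_PG = {
    "Homo sapiens": 3.5,
    "Allium altyncolicum": 7.0,
    "Allium cepa": 17.0,
    "Allium ursinum": 31.5,
}


def fold_difference(species: str, reference: str = "Homo sapiens") -> float:
    """Return how many times larger the genome of `species` is than that of `reference`."""
    return GENOME_SIZE_PG[species] / GENOME_SIZE_PG[reference]


# Each species relative to the human genome.
for name in GENOME_SIZE_PG:
    print(f"{name}: {fold_difference(name):.1f}x the human genome")

# Spread within the genus Allium alone.
allium_spread = fold_difference("Allium ursinum", "Allium altyncolicum")
print(f"A. ursinum vs A. altyncolicum: {allium_spread:.1f}x")
```

    With these figures the onion comes out at just under five times the human value, and the two Allium relatives differ from each other by about 4.5-fold. Any proposed universal function for non-coding DNA would have to account for both gaps.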

    The Onion Test is not original; it is just a specific example of the kind of diversity that exists in DNA amount among species. Richard Dawkins gave the example of salamanders in The Selfish Gene (first published in 1976). Similar examples of organisms that would not be expected to possess more DNA than humans, but in fact do, have been listed since the earliest discussions of genome size variation. Comings (1972), for example, wrote:

    Being a little chauvinistic toward our own species, we like to think that man is surely one of the most complicated species on earth and thus needs just about the maximum number of genes. However, the lowly liverwort has 18 times as much DNA as we, and the slimy, dull salamander known as Amphiuma has 26 times our complement of DNA. To further add to the insult, the unicellular Euglena has almost as much DNA as man.

    And, going back to the first major survey of genome sizes in animals, Mirsky and Ris (1951) noted that:

    Comparing the largest and one of the smallest examples among vertebrates, one finds that a cell of amphiuma, a urodele, contains 70 times as much DNA as is found in a cell of the domestic fowl, a far more highly developed animal. It seems most unlikely that amphiuma contains 70 times as many different genes as does the fowl or that a gene of amphiuma contains 70 times as much DNA as does one in the fowl. To make a somewhat different comparison: a cell of amphiuma contains 170 times as much DNA as does a cell of a relatively closely related animal, the trigger fish, whereas a cell of the latter contains only nine times as much DNA as does a cell of a sponge, which is far removed phylogenetically from any vertebrate.

    Amphiuma means, which has around 25 times more DNA than humans.
    Takifugu rubripes, which has only 1/10 as much DNA as humans.

    The search for functional components of the human genome is an important endeavour. However, the results of such research must be taken in context, both in terms of the human genome itself -- namely that evidence for function in one small component is not evidence of function for all -- and in terms of the diversity that exists among species. Even if functions could be attributed to most of the human genome (which requires evidence), this would not answer the question of why onions and salamanders have so much more of it or why a pufferfish can survive just fine with only 1/10 as much.


    Comings, D. E. (1972). The structure and function of chromatin. Advances in Human Genetics 3: 237-431.

    Dawkins, R. (1976). The Selfish Gene. Oxford, Oxford University Press.

    Mirsky, A. E. and H. Ris (1951). The desoxyribonucleic acid content of animal cells and its evolutionary significance. Journal of General Physiology 34: 451-462.

    Myers, P.Z. (2008). Random acts of evolution. Seed June.

    Pearson, A. (2007). Junking the genome. New Scientist 14 July: 42-45.

    Ricroch, A., R. Yockteng, S.C. Brown, and S. Nadot. (2005). Evolution of genome size across some cultivated Allium species. Genome 48: 511-520.




    Becky Jungbauer
    Junk DNA is one of those interesting scientific mysteries that could be explained in a number of ways; you do a nice job here. I've wondered that myself - why certain "lower" organisms have as much or more DNA than those of us perched atop the food chain. While I would tend to be swayed by the argument of extreme adaptationism, aren't there other examples of things in our bodies that don't seem to have a modern purpose (the appendix, for example)?
    I enjoyed this post Ryan, as always. I'm currently working on my PhD in a lab that studies mobile elements, and we're finding out lots of interesting ways that these things have modified our genome and the genomes of other organisms. As you indicated, even though some very neat mechanisms have been forwarded that argue for the functionality of some portion of these elements, this shouldn't be confused with support for the notion that every bit of the genome is functional.

    The C-value paradox, which I believe is related closely to your point, is a stickler indeed. I think perhaps that there may be no paradox here at all. Maybe it only appears that there is a paradox because of the unfounded expectation we have for some relationship between complexity and genome size. If there really is no such relationship, then ... well ... there's just no such relationship, and hence, no foundation for a paradoxical relationship.

    I wonder if the differences in genome sizes, or more specifically, the differences in the numbers of repetitive elements found in different genomes, could simply be the results of chance events? Alu elements, for instance, can be subdivided into different subfamilies that can be shown to have had different degrees of success propagating themselves throughout primate evolutionary history. One subfamily may be found in high copy number while another is only found in low copy number; one may have been active for a long time in the genome, while another may have undergone only a short period of activity. We're still trying to figure out why these differences exist, but it could boil down to simply serendipitous mutations that enhance the retrotranspositional efficiency of a new subfamily, allowing for a relatively rapid expansion. Perhaps a point mutation that enhances the attractiveness of the element to nearby transcriptional machinery? Or, maybe a chance insertion near an already-existing promoter?

    My point is that, perhaps, such serendipitous events may explain the difference in the amounts of junk DNA between onions and humans (or, in fact, any two genomes). It could be that the mobile elements in the onion lineage just happened to hit the retrotranspositional jackpot a few more times than we did.

    Gerhard Adam
    Let me begin by stating that I am not a biologist, but rather a computer specialist. Therefore, I will apologize for using a "high risk" analogy that, nevertheless, seems worthy of consideration. If one were to analyze a computer system in the same fashion as DNA, one would also discover a huge amount of non-functional components, because in fact, they contain data only and would never be seen in any functional capacity. In effect, might it not be possible that the non-coding DNA represents a sort of database that could be exploited by the immune system (as one example), by acting as a template? In a similar fashion to programs, isn't there ultimately a need for some sort of "IF THEN ELSE" logic that could be used within the DNA coding sequences? For example, there seems to be a clear indication that something of the sort must occur when humans go through puberty, since this is a conditional state. If so, then this could also relate directly to the genome size for different species. Since this "data" would provide a mechanism for dynamic reactions to circumstances, then it stands to reason that the more complex an organism is, the fewer novel solutions could be tolerated and still promote survival, so excess "solutions" would have been selected out. In a similar fashion, lower organisms that may be capable of more radical adjustments could easily retain these capabilities and consequently have much larger genomes than their complexity might initially suggest. In studies where non-coding DNA snips have been removed in mice, there wouldn't be any effect on the functional development, since this would be analogous to the act of deleting information from a database; it wouldn't affect program functioning at all, but only the data to which the program could react.
    I'm suggesting that it might be interesting to see if the non-coding portions of DNA would have played a role in the research using mice, if they had been exposed to different environmental factors or diseases to determine if there was a developmental difference between the two sets. In any case, I apologize for the computer analogy, since I'm aware that this isn't accurate, but it seemed a useful model for purposes of the discussion. I also apologize if I'm covering familiar terrain, since as a non-biologist, I'm not very current on the latest information.
    Mundus vult decipi
    Gerhard Adam, I think your thread made the most sense of all. Most people are afraid to think outside the box, and you have some valid observations.

    Gerhard, I personally think your observations on this topic are valid, and are largely ignored by the mainstream. Someone who can think for themselves is very refreshing to know of.

    I appreciate your comments, but current biology research is, weekly, discovering the functions of my non-coding DNA. The most profound one for us humans is the non-coding DNA which allows the embryonic sac to be permanently attached to a mother's uterine wall... basically without that DNA no human person would be born. "Junk DNA" is now an incorrect statement. You might check out very recent research by Dr. Jonathan Wells and his new book The Myth of Junk DNA.

    What if humans, and onions for that matter, do not contain all the information in their DNA needed for their construction? You may think this idea is preposterous but I will explain below why it is obviously true. And since it is true, DNA size doesn't matter that much.

    Think of the human body as being made of parts which are in turn made from smaller parts; and those smaller parts are in turn made by assembling even smaller parts, and on and on. The information for specifying how to make many of these parts is in the coding sections of DNA which specify which amino acids to string together to make proteins. These protein parts are in turn assembled to make more complex parts or tools for making other types of parts.

    The information for making some parts is not contained in human DNA. One example of such a part is Vitamin K. Vitamin K1 is used to help fold some proteins used to make blood into the proper shape and is synthesized by plants. Vitamin K2 is involved in bone metabolism and its various forms are synthesized by bacteria that live in our intestines. A short list of parts made by other organisms (and not humans) which are essential for making humans would be the essential amino acids, the essential fatty acids, and many vitamins. Professor Gregory, I suspect, could list hundreds of such parts and there are likely thousands more which have yet to be discovered.

    Some of the parts made by onions which are required by humans include all but one of the essential amino acids, some essential fatty acids, and vitamins A, B1, B2, B3, B5, B6, C, E and K. The information for making these parts is in onion DNA. Onions also absorb from the environment raw materials such as calcium, copper, iodine, iron, magnesium, manganese, molybdenum, phosphorus, potassium, selenium, sodium and zinc; and we can acquire those minerals by eating them. So you see, some of the information required for making humans and other life forms is contained in onion DNA.

    So what's for dinner tonight? A variety of raw or lightly cooked fruits and vegetables I hope.

    One problem with using the onion, or any plant material, in a discussion about functional vs. non-functional DNA, is that plants are frequently more vigorous after a genome duplication event, whereas animals are usually non-viable. Though there is evidence of several genome duplication events in man's far-distant past, that pales in comparison to the number of duplications throughout the plant world. Dawkins was right to use salamanders for his example which you referenced in your article, instead of onions.

    Example: All types of wheat can be divided into tetraploid varieties (4x duplication) and hexaploid varieties (6x duplication), indicating 2 and 3 species crosses respectively. Strawberries are 10x!