Information processing and entropy management - that's what organisms are about, right? Information and entropy are terms that get people excited, and yet it's extremely difficult to integrate formal ideas about information, free energy, and entropy (much of it from modern statistical mechanics) into a meaningful biological framework. People (including myself) love to toss these terms around, but in most cases I've encountered, efforts to apply them to molecular and cellular biology are hopelessly vague and unhelpful. Once you get beyond the level of individual proteins, it's difficult to apply the traditional concepts of physical chemistry.

There are some exceptions, like flux-balance analysis of metabolic systems and thermodynamic models of transcriptional regulation, but for the most part, systems biologists struggle to think about biology using the beautiful, quantitative, higher-order concepts that scientists use to study other (non-biological) complex systems. It's an area full of perilous shoals, with no shortage of kooky papers.

So I always turn to this field wanting to be hopeful, but steeled for disappointment. Recently, though, I've been reading some papers that have raised my hopes. They're by Eric Smith, at the Santa Fe Institute - a series of three papers in the venerable Journal of Theoretical Biology, a journal that has long been willing to publish cool and crazy ideas.

Smith lays out the problem:

Suppose that we wish to account for the emergence and stability of the biosphere that is observed on earth. Surely this should be one of the foundational problems of biology, establishing the context within which many particularities of form and function must be understood. For a variety of reasons, it is natural to describe the emergence of life physically as a process of self-organization, and to suppose that its robustness and resilience arise from the same forces responsible for its emergence. However, such a claim is difficult to evaluate or even to express within the current paradigms or practices of biology...

Energy and information are not treated within common paradigms within most of biology, indeed the contexts in which they appear have often been mutually exclusive.


My thoughts exactly.

The goal of Smith's first paper in this series (PDF) is to look at the limits placed on biological processes (at any scale) by thermodynamics - limits on the capacity to 'reject entropy', to maintain order, to process information. There are physical limits on computation, limits that come out of thermodynamics. Entropy and computation are tied together in statistical mechanics, and both are clearly relevant to the functioning of biological processes. Smith proposes to integrate our concepts of computation and entropy with biology by applying the known limits on information processing and entropy rejection:

Thus the willingness to study limits rather than models allows us to relate self-organization in the biosphere to computation, not by analogy but by homology of their constraints. The manner in which we can derive limits on the energetic cost of arbitrary computations, without necessarily knowing details of their internal steps and without ruling out innovation, is the manner in which we can limit energy-driven changes in the biosphere, because limits have a path-independent logic for composition which most models do not have.


I know this sounds cool, but before you get too excited, realize that this is just the introduction. Although Smith's writing is clear, nothing here is specific enough (yet) for any calculations or experiments. People tend to get excited by this kind of language and think we've solved the problem or made major progress. It's an exciting topic, but the bar is high if this area is going to have an impact on how the vast majority of working molecular biologists define research problems and carry out their day-to-day research.

OK, now that we've cooled our heels a little, let's see what the argument is.

The advantages of ideal systems

First, Smith compares his project to an analysis of the Carnot cycle. The key here is that the Carnot heat engine is an idealization - impossible to build physically, yet a very fruitful basis for thinking about real, irreversible systems. By analyzing the Carnot cycle, we learn the thermodynamic limits on ideal systems, and we then have a much easier time analyzing non-ideal systems by studying how they deviate from the ideal.
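To make the ideal-vs-real move concrete, here's a toy calculation of my own (not from Smith's paper): the Carnot efficiency sets a hard ceiling on any heat engine, and a real engine is then usefully described by how far it falls short of that ceiling.

```python
def carnot_efficiency(t_hot_k: float, t_cold_k: float) -> float:
    """Maximum possible efficiency of any heat engine operating between
    a hot and a cold reservoir (temperatures in kelvin)."""
    return 1.0 - t_cold_k / t_hot_k

# Hypothetical numbers: an engine running between 800 K and 300 K.
ideal = carnot_efficiency(800.0, 300.0)   # 0.625
measured = 0.40                           # made-up efficiency of a real engine
print(f"Carnot limit: {ideal:.2f}, real engine: {measured:.2f}, "
      f"fraction of the ideal achieved: {measured / ideal:.2f}")
```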

Smith is arguing that we can do the same thing with biological systems. By thinking about the limits on entropy management and computation in ideal systems, we can make some predictions about what we expect to see in biology.

Thermodynamic Limits in Biology

So what kinds of limits are we talking about? The argument gets fairly involved, so I'm not going to fumble my way through it here. At this point, let's just settle for Smith's summary of one finding resulting from the Second Law connection between work, information and entropy:

Living systems have generation times ranging from minutes to hundreds of years, and life cycles ranging in complexity from the indefinitely repeated division of prokaryotic cells, to the highly stereotyped birth, maturation, and death of animals. The remarkable observation reviewed in this section is that the cumulative metabolic energy required for a wide variety of processes is predicted at leading order by the biomass formed, and is approximately independent of other factors. The coefficient of proportionality is different for real systems than for their reversible idealizations, but is again not strongly dependent on which observation is considered, or on timescale or aggregate level of complexity. Thus it appears as if the active constraints on growth, development, and aging take the form of a limiting work required per bit of information gained, with a fixed factor of inefficiency relative to idealized processes.


That's fairly dense, but here is another summary:

The distinguishing feature of the bound developed in this and the next two papers is that the limit on the amount of information that can be put into a biological system is determined by the cumulative work done on the system and by its temperature as an energy scale, but not on the time taken to do the work, on thermal effects of degradation, or on any other "material" aspect of the system or its transformations. This dependence of information solely on work and energy scale is a necessary consequence of any bound derived from reversible transformations.


Hopefully this will become a little clearer in a moment, but for this first paper in the series, this is the primary biological conclusion - a thermodynamic framework suggesting that the metabolic energy required for any process (including the energy required to select for the genes that carry out that process) is proportional only to the biomass generated in that process.

I haven't looked at the energy calculations that go into this in any detail, but it's worth noting that these aren't new calculations done for this paper; this kind of energy/entropy estimation has been done for a long time. That doesn't necessarily make it right, but for the sake of argument we'll take these calculations as roughly correct.

Incidentally, what Smith is talking about is potentially testable in a lab. You can measure the energy content of a bacterial culture, of the growth medium, and of the growth-medium-to-biomass conversion using calorimetry. If bomb calorimetry appeals to you, you might want to consider this line of research.
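Here's a minimal sketch of the bookkeeping such an experiment would involve - all the numbers are placeholders I made up, not measurements: burn samples of the consumed medium and of the harvested biomass, measure the heat released during growth, and check that the books balance.

```python
# Sketch of the energy bookkeeping behind a growth-calorimetry experiment.
# Every number below is a placeholder for illustration, not a measurement.

e_medium_consumed_kj = 50.0   # combustion energy of substrate used up during growth
e_biomass_formed_kj = 20.0    # combustion energy of the new dry biomass
heat_released_kj = 28.0       # heat measured flowing out of the growing culture

# First-law check: energy in should equal energy stored in cells plus heat rejected;
# any residual points to unmeasured products or measurement error.
residual_kj = e_medium_consumed_kj - (e_biomass_formed_kj + heat_released_kj)

dry_mass_formed_g = 1.5       # placeholder dry-weight yield
heat_per_gram_kj = heat_released_kj / dry_mass_formed_g

print(f"Energy balance residual: {residual_kj:.1f} kJ")
print(f"Heat rejected per gram of biomass formed: {heat_per_gram_kj:.1f} kJ/g")
```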


The Energy to Build a Cell

On to the thermodynamics: a highly organized cell is obviously a very low-entropy configuration of molecules. The second law of thermodynamics tells us that there is a cost to building such a low-entropy beast - in the process of building a cell, the whole system has to give off waste heat so that the total entropy of the universe increases. The entropy of the molecules that go into making a cell decreases, while the entropy of the environment increases. In Smith's terminology, the cell has to "reject entropy" as it goes about its business. Rejecting entropy into an environment at a given temperature T costs a certain amount of heat energy; in other words, some of the energy put into the system has to be used to pay the cost of rejecting entropy.
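The bound being invoked here is just the standard second-law statement Q >= T*dS: pushing an entropy decrease dS out into surroundings held at temperature T costs at least T*dS of heat. A minimal sketch (textbook physics, not anything specific to Smith's paper):

```python
def min_heat_to_reject_entropy(delta_s_j_per_k: float, temp_k: float) -> float:
    """Second-law floor: lowering a system's entropy by delta_s while dumping
    the difference into surroundings at temperature T releases at least
    Q = T * delta_s of heat."""
    return temp_k * delta_s_j_per_k

# Hypothetical numbers: an entropy reduction of 1e-13 J/K at roughly body temperature.
print(f"Minimum heat rejected: {min_heat_to_reject_entropy(1e-13, 310.0):.1e} J")
```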

Way back in the 1950s, a biochemist named Harold Morowitz calculated this heat cost for the formation of an entire bacterial cell. (Those of you who have taken chemistry will have done calculations similar in spirit - calculations that involve looking up standard free energies, enthalpies, and entropies in big tables of thermodynamic measurements of various molecules.)

Morowitz calculated that the entropy difference between the highly ordered atomic system of the cell and a random configuration of those same atoms is on the order of Avogadro's "number times the dry weight of the cell, divided by an average gram-molecular weight—about 10 g—of the cell's constituent elements." Achieving that kind of entropy reduction carries a heat cost, and Smith says this calculated heat cost is impressively similar to calorimetric measurements of bacterial growth - within a factor of 1/3. Thus, "bacteria under optimal growth conditions waste little metabolic energy beyond what is required for growth."
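Here's my own back-of-the-envelope rendering of a Morowitz-style estimate (the cell dry mass is a commonly cited ballpark, not a number from the paper): count the atoms via dry weight over a ~10 g/mol average atomic weight, call the entropy reduction roughly one k_B per atom, and multiply by temperature to get the minimum heat cost.

```python
N_A = 6.022e23      # Avogadro's number, 1/mol
K_B = 1.380649e-23  # Boltzmann constant, J/K

def morowitz_heat_estimate(dry_mass_g: float, temp_k: float,
                           mean_atomic_weight_g_per_mol: float = 10.0) -> float:
    """Order-of-magnitude heat cost of assembling a cell from a random
    configuration of its atoms: roughly k_B of entropy per atom, rejected
    into surroundings at temperature T."""
    n_atoms = N_A * dry_mass_g / mean_atomic_weight_g_per_mol
    delta_s_j_per_k = n_atoms * K_B   # entropy reduction, order of magnitude
    return temp_k * delta_s_j_per_k   # minimum heat rejected, joules

# Ballpark dry mass of a single E. coli cell (~0.3 pg); illustrative only.
q_min = morowitz_heat_estimate(dry_mass_g=3e-13, temp_k=310.0)
print(f"Order-of-magnitude minimum heat to assemble one cell: {q_min:.1e} J")
```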

Smith says this is moderately suggestive of a thermodynamic scaling law - the idea that the energy cost of formation scales only with biomass, and not with the complexity of the system. For example, the energy cost to form 10 g dry weight of E. coli should be the same as the energy cost to form 10 g dry weight of the more complex, more highly organized eukaryotic yeast S. cerevisiae - a proposition you can test in the lab with calorimetry.

Even more persuasive, he says, is evidence from multicellular organisms:

A more interesting and diverse test of the scaling of energy cost with information comes from growth and aging of metazoans. For these the apparent allocations of energy to new cell formation and existing cell maintenance vary strongly with body size, as does organism developmental time and lifetime. Yet when the complementary variations in all these characteristics have been combined, it will turn out that metazoans demonstrate the same overall scaling as bacteria, with modestly different constants of proportionality describing their inefficiency.


Once again I'm going to skip over the more technical arguments, which involve calculating the energy cost to generate a certain amount of biomass over the lifetime of an animal. The punchline is that the "characteristic energy per unit mass required to produce a new cell from growth medium" is a number that is independent of the body size and temperature of an animal:

Body mass varies by 14 orders of magnitude, and body temperature by 35 degrees C within the data set, while organizational complexity spans the range from zooplankton to mammals and birds. Across all these ranges, growth appears to be constrained by a fixed energy cost per bit of information gained by the organism’s biomatter.



Conclusion to Part I

That's the conclusion at the end of this first paper in a series of three. The idea is intriguing. You can be skeptical about the measurements that go into these calculations, and about the simplifications made to calculate the energy consumption of an animal over its lifetime (I haven't really looked into them). I don't care so much about that; my primary source of disappointment so far is that the conclusion isn't really that exciting. What do we do with it? It's too coarse-grained to make much of a difference in our understanding of the information-processing abilities of specific biological processes.

When I think of integrating entropy, information, and energy into biological thinking, I want to answer questions like these: How much information does a given signal transduction pathway convey (against a particular environmental background)? What is the energetic cost of maintaining and operating that pathway, and what is the cost to the cell per bit of information it obtains (in a given environment)? Can this energy cost per bit tell us anything about why biological signaling systems are structured the way they are? Is there a trade-off between the energy cost of maintaining a detection system and the quality of the signal it has to work with - say, the signal-to-noise ratio of a food source in a single-celled organism's environment?
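For a sense of scale on those cost-per-bit questions, here's a back-of-the-envelope comparison of my own (not from the papers): the thermodynamic floor is kT ln 2 per bit, while hydrolyzing one ATP under typical cellular conditions releases roughly 50 kJ/mol, so a single ATP could in principle pay for a couple dozen bits at the ideal limit. Real signaling systems presumably operate far above that floor, and the size of that inefficiency factor is exactly the kind of number I'd like to see measured.

```python
import math

K_B = 1.380649e-23  # Boltzmann constant, J/K
N_A = 6.022e23      # Avogadro's number, 1/mol

T = 310.0  # roughly physiological temperature, K
landauer_j_per_bit = K_B * T * math.log(2)   # thermodynamic floor per bit

# Free energy of ATP hydrolysis under typical cellular conditions,
# roughly 50 kJ/mol - a textbook ballpark, not a value from Smith's papers.
atp_j_per_molecule = 50e3 / N_A

bits_per_atp_at_limit = atp_j_per_molecule / landauer_j_per_bit
print(f"Landauer limit at {T:.0f} K: {landauer_j_per_bit:.2e} J/bit")
print(f"One ATP hydrolysis ~ {atp_j_per_molecule:.2e} J, "
      f"enough for ~{bits_per_atp_at_limit:.0f} bits at the ideal limit")
```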

This is only the first paper in the series, so I'll withhold judgment and just say that at this point I'm both impressed and frustrated.
