Low standards breed poor results.

I have been waiting for the biological revolution for quite a while now, and it always seems to be just around the corner.  Synthetic biology, genomics, protein crystallography, and proteomics have all promised to revolutionize the way medical discoveries are made and to change the way we view biology.  It would be foolish not to acknowledge that these fields have changed the way biology is studied, but the change seems to be happening at a somewhat disappointing rate.

This may be because the systems these fields examine are so complex that pulling useful information out of them really is a problem that can only be cracked by the big data approaches now being developed.

But I don't think that is the problem, and I don't think big data will be able to overcome the problems that are contributing to the lack of reproducibility in biological studies, the same problems that have stymied the development of new drugs through classical approaches.

Arguably there have been great advancements in computer-assisted analysis and automation of drug discovery.  The initiative by the NIH Chemical Genomics Center, titled "Bringing Biopharmaceutical Technologies to Academic Chemical Biology and Drug Discovery", is a prime example.  The project focuses on automating high-throughput screening with the goal of "efficiently and comprehensively describe the biological activity of a chemical library" (and yes, those words are bolded in the slides).  However, it ends up illustrating the low standards for compound analysis that are supported by the FDA and other institutions of the U.S. Department of Health and Human Services.  Rather than comprehensively describing the interactions of this library of chemicals, the analysis is reduced to IC50 curves.
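To make concrete what being "reduced to IC50 curves" means in practice, here is a minimal sketch of how an IC50 is typically obtained: percent activity is measured across a dilution series of the compound, a four-parameter logistic (Hill) curve is fit, and the concentration giving half-maximal inhibition is reported as the result.  The concentrations, activities, and starting guesses below are made up purely for illustration.

```python
import numpy as np
from scipy.optimize import curve_fit

def hill_curve(conc, bottom, top, ic50, hill_slope):
    """Four-parameter logistic: response as a function of inhibitor concentration."""
    return bottom + (top - bottom) / (1.0 + (conc / ic50) ** hill_slope)

# Hypothetical dose-response data: inhibitor concentration (uM) vs. % enzyme activity
conc = np.array([0.01, 0.03, 0.1, 0.3, 1.0, 3.0, 10.0, 30.0])
activity = np.array([98.0, 95.0, 88.0, 70.0, 48.0, 25.0, 10.0, 4.0])

# Fit the curve; initial guesses: 0% bottom, 100% top, IC50 ~ 1 uM, slope ~ 1
params, _ = curve_fit(hill_curve, conc, activity, p0=[0.0, 100.0, 1.0, 1.0])
bottom, top, ic50, hill_slope = params
print(f"IC50 = {ic50:.2f} uM (Hill slope = {hill_slope:.2f})")
# The whole dose-response experiment is then summarized by this one number.
```

That single number, whatever the mechanism behind it, is what gets carried forward into publications and screening databases.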

Theories of drug interaction have been under development for around 100 years, and this project has chosen to ignore all of them in favor of a form of analysis that predates those theories and strips away the finer detail that might produce a comprehensive understanding.

That is, the use of IC50s equates every biological process to an on/off switch.  This would be a fantastic way of analyzing things if all biological processes were static logic gates, and I imagine it would integrate into computers quite well.  Unfortunately, biological molecules are dynamic, and results produced at a fixed concentration of one substrate can't be assumed to be representative of the system as a whole; yet IC50s rest on exactly this assumption.  (Is this a source of irreproducibility in biological studies? Who knows? No one is collecting enough data.)
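To see why the fixed-substrate-concentration assumption matters, the classical Cheng-Prusoff relationship for a competitive inhibitor is enough: the measured IC50 scales with the substrate concentration used in the assay, even though the inhibitor's actual affinity (Ki) never changes.  The Ki and Km values below are hypothetical, chosen only to show the size of the effect.

```python
# Cheng-Prusoff relation for a competitive inhibitor:
#   IC50 = Ki * (1 + [S] / Km)
# Ki (the inhibitor's true affinity) is a constant of the molecule,
# but the IC50 you measure depends on how much substrate the assay uses.

Ki = 0.1   # inhibition constant, uM (hypothetical)
Km = 1.0   # Michaelis constant of the substrate, uM (hypothetical)

for substrate_conc in [0.1, 1.0, 10.0, 100.0]:  # uM
    ic50 = Ki * (1.0 + substrate_conc / Km)
    print(f"[S] = {substrate_conc:6.1f} uM  ->  measured IC50 = {ic50:6.2f} uM")

# The same compound appears nearly 100-fold weaker when assayed at 100 uM
# substrate than at 0.1 uM, even though nothing about the molecule changed.
```

And this is the simplest case; other inhibition mechanisms relate the reported IC50 to the underlying binding constants in different ways again.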

However, the use of IC50s in this project isn't really surprising, as submissions to the FDA for drug approval do not require any sort of in-depth analysis beyond IC50s.  The guidances suggest submitting detailed information if you have it, but IC50s can be submitted instead, and with the cost of reagents these days the detailed work is hardly profitable; who has the funding or the time?

With such low standards for research quality, it's not surprising that there is now a booming market for kits to help you get that coveted IC50 value.  What incentive is there to examine things in detail when an IC50 produces results sufficient for publication and, if the IC50 is deemed good enough, for pharmaceutical investment?

The discovery that nitric oxide is a key signaling molecule in the regulation of vascular tone came about when Robert Furchgott realized that the preparation method researchers used to examine arterial walls was stripping away the endothelial cells that produce the nitric oxide.  One can only wonder what will happen when researchers stop stripping away the finer details of molecular interactions with IC50s.

The dearth of drug development suggests that IC50s are not contributing significantly to our understanding of disease mechanisms or providing practical approaches for the treatment of disease.  Unfortunately, this is the level of detail available to the developers of big data analytics, so while big data analysis may be wonderful for finding new risk factors for disease, I think it will run into the same problems already entrenched in the biological sciences.

There is a lot of talk these days about the problems with reproducibility in the biological sciences, and suggestions for how to solve the problem abound, from improved training in statistical analysis to reproduction of high-profile studies to journals uniting to improve quality control.  However, this oversimplification of molecular interactions is conspicuously absent from the conversation.  Since there doesn't seem to be much support for change from within, hopefully pressure from groups interested in using big data to cure aging may provide the catalyst for change.