What percentage of plants in the world are woody?

Good luck finding out, even though scientists have been distinguishing between woody and herbaceous plants for over 2,000 years, ever since Plato's student Theophrastus, the "father of botany," made the distinction in 300 B.C. Researchers already know when the first woody plants came to be, how wood develops and decomposes, and that woody plants like trees and shrubs evolve more slowly than herbs.

DURHAM, N.C. – Growing numbers of researchers are making the data and software underlying their publications freely available online, largely in response to data sharing policies at journals and funding agencies. But in the age of open science, improving access is one thing; repurposing and reproducing research is another. In a study in the Journal of Ecology, a team of researchers experienced this firsthand when they tried to answer a seemingly simple question: what percentage of plants in the world are woody?

They thought the answer would be easy to find. After all, scientists have been distinguishing between woody and herbaceous plants for over 2,000 years, ever since Plato's student Theophrastus -- often considered the "father of botany" -- made the distinction in 300 BC. Yet even in a world of Google and Web of Science, search tools are not much help. Google can hurt science understanding: it puts Wikipedia and whatever else has been linked to the most at the top of its results, so unless you are a researcher you are likely to get the same rehashed content-farm material in numerous places.

Can open science fix that?

Perhaps, perhaps not. Expert opinion didn't help, found Will Cornwell of the University of New South Wales, co-author of a new paper on the issue. An informal survey of nearly 300 researchers from 29 countries revealed little consensus even among trained scientists, with guesstimates ranging from 1% to 90%.

Public data is of limited help, even for a persistent researcher. The largest plant trait database to date, a global woodiness database of nearly 50,000 species, covers fewer than 20% of the more than 300,000 plant species known to science. Simply calculating the fraction of species in the database that are woody gave misleading results, because of missing data and a sampling bias towards economically important or temperate species.

By applying statistical tricks to account for sampling bias, the researchers were able to determine that between 45% and 48%, or just under half, of the world's plants are woody. "[The take home lesson is that] all big databases are biased, but by acknowledging that bias is universal and accounting for it we can make better use of them," said co-author Rich FitzJohn of Macquarie University.
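To see why a raw count from a biased database can mislead, here is a minimal Python sketch. It is not the method used in the paper; it assumes a simple reweighting in which each plant group's observed woody share is scaled by the number of species known to exist in that group. The group names and all of the numbers are hypothetical, chosen only to make the effect visible.

```python
from dataclasses import dataclass


@dataclass
class Group:
    name: str
    total_species: int   # species known to exist in this group (hypothetical)
    sampled: int         # species from this group present in the database
    sampled_woody: int   # sampled species that are recorded as woody


def naive_fraction(groups):
    """Woody share among database records, ignoring how groups were sampled."""
    sampled = sum(g.sampled for g in groups)
    woody = sum(g.sampled_woody for g in groups)
    return woody / sampled


def weighted_fraction(groups):
    """Woody share after weighting each group by its true species count.

    Assumes the sampled species are representative *within* each group,
    which is a simplification made for this illustration.
    """
    total = sum(g.total_species for g in groups)
    estimated_woody = sum(
        g.total_species * g.sampled_woody / g.sampled for g in groups
    )
    return estimated_woody / total


# Hypothetical toy data: a small, well-studied woody group and a large,
# poorly sampled herbaceous group.
groups = [
    Group("well sampled, mostly woody", 1000, 500, 400),
    Group("poorly sampled, mostly herbaceous", 9000, 100, 10),
]

print(f"naive:    {naive_fraction(groups):.3f}")
print(f"weighted: {weighted_fraction(groups):.3f}")
```

In this toy case the naive figure is dominated by the heavily sampled woody group, while the weighted figure is pulled back towards the much larger herbaceous group that the database barely covers.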

The researchers learned another lesson when they published their work. Their goal was to make enough information about their methods available such that other researchers could retrace their steps. Could someone -- using the same data and code, but a different computer -- get the same or similar results?

In an ideal world, reproducing the analyses should be as simple as installing the necessary software, downloading the data and hitting 'run', but we know that isn't realistic. Software changes, and analysis standards evolve. Analyses that run on one machine don't always work on another.

Making a study easily reproducible, they found, requires a significant amount of time and technical skill. They made sure that everything needed to download and manipulate the data, and even to create the figures, was written into the code, and they explained the thinking behind each snippet of code. They also provided links to tools that would enable researchers to compare changes between different versions of software and to restore and run previous versions if need be.
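One small piece of that effort can be sketched in Python. This is an illustration only, not the researchers' actual tooling: it assumes a hypothetical helper that stores a checksum of the input data and the interpreter version alongside the results, so that someone rerunning the analysis on another machine can tell whether they started from the same inputs. The function names and the sample data are made up for the example.

```python
import hashlib
import json
import platform
import sys


def sha256_of_bytes(data: bytes) -> str:
    """Return the SHA-256 hex digest of the raw data."""
    return hashlib.sha256(data).hexdigest()


def provenance_record(data: bytes) -> dict:
    """Describe the inputs and environment used for one analysis run."""
    return {
        "data_sha256": sha256_of_bytes(data),
        "python_version": platform.python_version(),
        "platform": sys.platform,
    }


def data_matches(data: bytes, record: dict) -> bool:
    """Check whether the data is byte-for-byte what the record describes."""
    return sha256_of_bytes(data) == record["data_sha256"]


# Hypothetical stand-in for a downloaded data file.
raw = b"species,woody\nQuercus robur,1\n"
record = provenance_record(raw)
print(json.dumps(record, indent=2))
```

A record like this does not make an analysis reproducible on its own, but it lets a later reader notice when the data or the software environment has drifted from the original run.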

"Nobody denies that researchers should try to make their work reproducible so that others can check their results, but actually making that feasible is easier said than done," FitzJohn said.