A new estimate says that up to 80 percent of scientific data is lost within two decades.
The culprits? Old e-mail addresses and obsolete storage devices.
For the analysis, the scholars attempted to collect original research data from a random set of 516 studies published between 1991 and 2011. They found that while all datasets were available two years after publication, the odds of obtaining the underlying data fell by 17 percent per year after that.
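A drop of 17 percent per year in the odds is not the same as a 17 percent drop in the chance of getting the data, because odds are the ratio of success to failure. The short Python sketch below shows how the two relate. It is only an illustration: the function name and the baseline odds of 9 to 1 (a 90 percent chance at year two) are assumptions made for the example, not figures from the study.

```python
def availability_probability(years_since_publication, baseline_odds,
                             annual_drop=0.17, baseline_year=2):
    """Chance of obtaining a dataset, assuming the odds shrink by a
    fixed fraction each year after the baseline year."""
    # No decay is applied before the baseline year.
    years_of_decay = max(0, years_since_publication - baseline_year)
    # Odds are multiplied by (1 - annual_drop) once per year of decay.
    odds = baseline_odds * (1 - annual_drop) ** years_of_decay
    # Convert odds back to a probability: p = odds / (1 + odds).
    return odds / (1 + odds)


# Hypothetical baseline: odds of 9 to 1 at two years after publication.
# This number is an assumption for illustration only.
for years in (2, 5, 10, 20):
    p = availability_probability(years, baseline_odds=9.0)
    print(f"{years:>2} years after publication: {p:.0%} chance of obtaining the data")
```

Changing `baseline_odds` shifts every figure, so the output should be read as the shape of the decline rather than as an estimate for any particular paper.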
"Publicly funded science generates an extraordinary amount of data each year," says Tim Vines, a visiting scholar at the University of British Columbia. "Much of these data are unique to a time and place, and are thus irreplaceable, and many other datasets are expensive to regenerate. The current system of leaving data with authors means that almost all of it is lost over time, unavailable for validation of the original results or to use for entirely new purposes.
"I don't think anybody expects to easily obtain data from a 50-year-old paper, but to find that almost all the datasets are gone at 20 years was a bit of a surprise."
Vines is calling on scientific journals to require authors to upload data to public archives as a condition for publication, adding that papers with readily accessible data are more valuable to society and thus should get priority for publication.
"Losing data is a waste of research funds and it limits how we can do science," says Vines. "Concerted action is needed to ensure it is saved for future research."