A recent study found that academic institutions are failing to preserve digital materials, “including science paid for with taxpayer money,” Ars Technica reports, underscoring the need for improved archiving standards and accountability in the digital age. From the report: The work was done by Martin Eve, a developer at Crossref. That is the organization that organizes the DOI system, which provides a permanent pointer to digital documents, including almost all scientific publications. If updated properly, a DOI will always resolve to the document, even if the document is moved to a new URL. But there are also ways to handle documents disappearing from their expected location, which can happen if a publisher goes out of business. There are so-called “dark archives” that are not accessible to the public but should contain copies of anything that has been assigned a DOI. If something goes wrong with a DOI, the dark archive is triggered to open access, and the DOI is updated to point to the copy in the dark archive. For that to work, however, copies of everything published have to be in the archives. So Eve decided to check whether that is the case.
Using the Crossref database, Eve retrieved a list of over 7 million DOIs and checked whether the documents could be found in archives. He included well-known archives such as the Internet Archive at archive.org, as well as ones dedicated to academic works, such as LOCKSS (Lots of Copies Keeps Stuff Safe) and CLOCKSS (Controlled Lots of Copies Keeps Stuff Safe). The result… was not very good. When Eve broke down the results by publisher, he found that less than 1 percent of the 204 publishers had put the majority of their content into multiple archives. (The cutoff was 75 percent of their content in three or more archives.) Fewer than 10 percent had at least half of their content stored in at least two archives. And a full third seemed to be doing no systematic archiving at all. At the individual publication level, less than 60 percent appeared to be present in at least one archive, and more than a quarter did not appear to be in any archive at all. (The remaining 14 percent were published too recently to be archived or had incomplete records.)
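To make the per-publisher breakdown concrete, here is a minimal Python sketch of how such a tally could be computed. It is not Eve's actual code. The record layout (publisher, DOI, number of archives holding a copy), the function names, the grade labels, and the sample DOIs are all hypothetical, and treating "no systematic archiving" as "no work found in any archive" is an assumption made only for illustration.

```python
from collections import defaultdict


def coverage_by_publisher(records, min_archives):
    """Per publisher, the fraction of works held in at least `min_archives` archives."""
    total = defaultdict(int)
    covered = defaultdict(int)
    for publisher, _doi, archive_count in records:
        total[publisher] += 1
        if archive_count >= min_archives:
            covered[publisher] += 1
    return {p: covered[p] / total[p] for p in total}


def grade_publishers(records):
    """Assign each publisher a hypothetical grade using the thresholds quoted in the report."""
    in_three = coverage_by_publisher(records, 3)
    in_two = coverage_by_publisher(records, 2)
    in_one = coverage_by_publisher(records, 1)
    grades = {}
    for publisher in in_one:
        if in_three[publisher] >= 0.75:
            # 75 percent of content in three or more archives
            grades[publisher] = "strong"
        elif in_two[publisher] >= 0.5:
            # at least half of content in two or more archives
            grades[publisher] = "moderate"
        elif in_one[publisher] == 0:
            # assumption: nothing found in any archive
            grades[publisher] = "none"
        else:
            grades[publisher] = "partial"
    return grades


# Illustrative records only: (publisher, DOI, archives holding a copy).
sample = [
    ("A", "10.5555/a1", 3),
    ("A", "10.5555/a2", 3),
    ("A", "10.5555/a3", 4),
    ("A", "10.5555/a4", 0),
    ("B", "10.5555/b1", 2),
    ("B", "10.5555/b2", 0),
    ("C", "10.5555/c1", 0),
]

if __name__ == "__main__":
    print(grade_publishers(sample))
```

In a real survey, the archive counts would come from checking each DOI against each archive's holdings, which this sketch leaves out.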
The good news is that major academic publishers seem to be doing a reasonably good job of getting material into archives; most of the archiving gaps come from smaller publishers. Eve acknowledges that the study has limitations, primarily that there may be additional archives he has not checked. There are some prominent dark archives that he did not have access to, as well as things like Sci-Hub, which infringes copyright in order to make material from commercial publishers available to the public. Finally, individual publishers may have their own archiving systems in place that could keep publications from vanishing. The risk here is that access to some academic research may eventually be lost.