The Internet Archive Has a New Tool to Save Research Papers From Vanishing

Following a shocking report that hundreds of journals have been lost online, the Internet Archive Scholar lets users search open-access works and add their own to a catalog.

With the move from physical academic journals to digitally accessible papers, research became available to more people around the world but, at the same time, more precarious to preserve. Decades ago, librarians were the careful custodians of research in the sciences and humanities; today, if an institution stops paying for web hosting or changes servers, the research within could disappear.

Between 2000 and 2019, nearly 200 open-access journals and the research papers they published vanished from the internet, according to a new study published on the arXiv preprint server. Nine hundred more inactive open-access journals are also at high risk of vanishing in the near future, the researchers found.

Of the 176 vanished journals the researchers identified, around one-third disappeared from the web within one year of their last publication, taking their articles and research down with them.

Institutions and publishers usually archive articles using services like Stanford's LOCKSS ("Lots of Copies Keep Stuff Safe") project or the Portico archive, but when they don't, archivists at organizations like the Internet Archive, along with volunteers, have to step in to save the research. Now, with the new Internet Archive Scholar search platform and contributions to its Fatcat catalog, anyone can help save the science at risk of vanishing, and read it, too.

Since 2017, archivists at the Internet Archive have worked to preserve open-access journals permanently. "Of the 14.8 million known open access articles published since 1996, the Internet Archive has archived, identified, and made available through the Wayback Machine 9.1 million of them," Bryan Newbold at the Internet Archive wrote on Tuesday.

To expand those efforts, the Internet Archive launched Fatcat, an editable catalog with an open API that lets anyone contribute open-access scholarly works, as well as a new platform for searching through those archives.

“There shouldn’t really be any decay or loss in scientific publications, particularly those that have been open on the web,” Mikael Laakso, an information scientist at the Hanken School of Economics in Helsinki and a co-author of the study, told Nature.

Recently, the Internet Archive has been battling a lawsuit from five of the world’s largest book publishers; it closed its National Emergency Library on June 16 in response to the publishers' lawsuit, reverting to a traditional controlled digital lending system. It's also endured threats and accusations from a North Carolina senator who seems to hate IA's Great 78 Project—which is focused on the preservation of 3 million 78rpm discs produced between 1898 and 1950—for no reason other than his love of America's draconian copyright laws.