As the Russian invasion of Ukraine accelerates, professional and hobbyist archivists alike are rushing to preserve Ukraine’s online history, cataloging and storing everything from Ukrainian government and university websites to the torrent of news and social media posts related to the conflict.
The Internet Archive has been archiving the broader conflict in Ukraine since 2014. But as Ukrainian government websites face prolonged outages due to sustained cyberattacks, along with the looming risk of defacement or deletion, the organization has taken on another monumental task: backing up the entirety of the Ukrainian Internet.
Using crowdsourced auto-archiving software that runs on a virtual machine dubbed Archive Team Warrior, the organization has enlisted volunteers around the world, many of whom have donated countless terabytes of storage capacity to the project. These volunteers have been steadily backing up the Ukrainian Internet since before the war began.
All told, 68 million items (web pages, documents, and other files) comprising more than 2.5 TB of data have already been hoovered up from various websites across Ukraine’s .ua top-level domain. A second project, dubbed Ukr-net, aims to preserve tens of millions of additional items and terabytes of additional data across the Ukrainian Internet.
Elsewhere, organizations like the Center For Information Resilience have built a crowdsourced map attempting to document every war-related social media post made in the region, ranging from civilian photos of the movement of heavy Russian weaponry to Ukrainian government claims of alleged bombing raids on kindergartens.
Hobbyists over at the Data Hoarder subreddit have also been busy sharing various pet projects, including using the Archive.org Wayback Machine to ensure the websites of Ukraine’s ten largest universities are cataloged and stored, as well as collecting and archiving daily online press reports on the Russian invasion of Ukraine.
Reddit user Detz says they’ve cooked up a beta project to snapshot news websites and store them permanently on the blockchain. The project takes two screen grabs of 35 major websites daily, providing a bird’s-eye view of ongoing news coverage as the snapshots are automatically uploaded to the Filecoin network.
“The plan is to use this framework to allow anyone to record and save history, whether it's websites, documents, roll call votes, etc,” the user noted. “I wish I had started it early as the past couple of years has proved to be a very interesting time in history.”
Historians and archivists face no shortage of headaches in the quest to document modern history, especially when it involves military conflict. Military equipment historians have several times now found their Google Drive accounts locked because the company’s automatic copyright filters inadvertently flagged simple images of tanks and other hardware as terrorism-related.
Separating useful information from gibberish and misinformation can also be an uphill battle, especially when tracking invasion-related news on social media.
Numerous users have shared “Russian invasion” footage that was actually from the video game Arma 3. Social media platforms like TikTok are also awash in grifters seeking donations by claiming to be live streaming the invasion from Ukraine, when in reality they’re just streaming the view from their UK balcony, overdubbed with gunshot and siren sounds.
At the same time, open source intelligence reporters are running face-first into Twitter’s erratic content moderation practices, finding their accounts temporarily banned after their efforts to document the pre-invasion Russian troop buildup were falsely flagged as misinformation.
As the Russian government invades a sovereign neighbor and attempts to supplant the democratically elected Ukrainian government, distorting and dismantling history will be among its clear priorities. Countless “data hoarders” are quietly doing their best to make that task more difficult.