The life of a dark web site can be fleeting.
A drug site can disappear almost as quickly as it emerged, taking with it a record of what items were on offer, the posts on its forums, and plenty of other data useful to journalists and researchers, who are left empty-handed in their attempts to keep tabs on the dark web.
Now that might change, thanks to an independent researcher who has released a colossal archive containing around 52GB of data from over 80 different dark web marketplaces.
Since 2013, Gwern Branwen has collected the item listings, customer feedback, images, forum posts and much more from all English-language dark web markets. Branwen did this on a weekly or sometimes daily basis, according to a write-up on his site.
"This uniquely comprehensive collection is now publicly released as a 52GB (~1.6TB) collection covering 89 [Darknet Markets] & 37+ related forums, representing <4,438 mirrors, and is available for any research," he wrote.
Most of the files in the collection are HTML or CSS, Branwen told Motherboard in a Twitter message. The archive doesn't just contain Branwen's own data sets, but also those from several other people. These include academics such as Nicolas Christin from Carnegie Mellon University, and even dark web personalities such as the vendor known as "El Presidente," who made a backup of drug dealers' contact details and PGP keys in case marketplaces disappeared.
Branwen has contributed plenty of original research on the dark web markets. He's worked out the relative risk of running or participating in a market, judging by the number of known arrests linked to that particular market, and also mapped how long each market existed.
A vibrant academic community exists around these sites too, with some researchers looking at the size of different marketplaces, and others investigating whether drugs on Silk Road were sold mostly to individuals, or to other drug dealers.
Branwen's archive is now open for anyone to use, and he gives some suggestions for research that could springboard from his work. It might be possible to calculate the total number of sales per day on a market, and see whether there are any correlations with the Bitcoin price, he wrote. One could analyse the data for indicators of an "exit-scam," when a vendor or site owner disappears with customers' cash. Or perhaps a researcher could collate all of the posts that discuss drug purity and see if any trends have developed over time.
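To give a rough sense of what the first of those suggestions might look like in practice, here is a minimal Python sketch. It is not Branwen's method and does not reflect the archive's actual layout: it assumes a researcher has already extracted a list of dated customer-feedback entries (used here as a stand-in for sales, which is itself an assumption) and a table of daily Bitcoin prices. The names `daily_sales`, `pearson`, `feedback_dates` and `btc_price` are hypothetical, and the numbers are made up for illustration.

```python
from collections import Counter
from math import sqrt


def daily_sales(feedback_dates):
    """Count entries per day; feedback is only an assumed proxy for sales."""
    return Counter(feedback_dates)


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length, at least 2 items")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / sqrt(var_x * var_y)


# Made-up illustration data, not taken from the archive.
feedback_dates = [
    "2014-01-01", "2014-01-01",
    "2014-01-02",
    "2014-01-03", "2014-01-03", "2014-01-03",
]
btc_price = {"2014-01-01": 770.0, "2014-01-02": 800.0, "2014-01-03": 810.0}

counts = daily_sales(feedback_dates)
# Only compare days for which both a count and a price are available.
days = sorted(set(counts) & set(btc_price))
r = pearson([counts[d] for d in days], [btc_price[d] for d in days])
print(f"Correlation over {len(days)} days: {r:.3f}")
```

A real analysis would need far more than three days of data, and a correlation on its own would say nothing about whether price movements drive sales or the reverse.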
Jamie Bartlett, author of The Dark Net, is already thinking of ways to use the data set, such as "working with some drugs charities to see what benefit there could be for the harm reduction groups and health professionals," he told Motherboard in an email. "It's such an incredible resource."
That's exactly what Branwen had in mind when publishing the massive collection.
"I want to enable all sorts of research and analysis," Branwen told Motherboard.