Reams of apparent names, telephone numbers, email addresses and job titles of law enforcement employees scroll on my screen. I've been provided the supposed details of 20,000 FBI staff and contractors, as well as 9,000 DHS employees by an anonymous hacker, and he plans to release them to the public.
I recently spent a weekend verifying that dump, and concluded that it was largely accurate and very likely non-public data that a third party had obtained.
But checking the authenticity of hacked data can be hard. Sometimes people get it wrong and say that publicly available information was stolen; other times the hacker seriously oversells what they've got hold of.
A data breach may not hit all of the sweet spots: confirmation from the company, independently verifiable data, and interviews with victims who can corroborate information. Often, reporting on hacks is a patchwork of different techniques, in an attempt to provide as accurate a picture as possible.
Here's a sketch of that process, or the anatomy of hack reporting, if you will.
CONTACT THE VICTIMS
The most obvious starting point is what journalists do every day: pick up the phone and start calling people. If the data contains contact information for victims, that means getting in touch with them and asking them to confirm the other details in the breach, such as their name and email address, and, if possible, any information that would not be available from breaches of other websites.
In my experience, victims are often grateful that they've been contacted by a journalist, but, understandably, are also concerned that their data has been exposed by hackers. Explaining to them what technical steps they can take, such as resetting their password, goes a long way to getting the corroboration a journalist needs.
This was the approach for verifying stolen Uber accounts being sold on the dark web for $1. Several sources I called confirmed private details about themselves, including their password and last four digits of their payment card, lending weight to the idea that the stolen accounts were, indeed, legitimate.
Verifying the FBI and DHS data went much the same way. Although only a handful of people answered my calls, it being a weekend and all, many calls went through to the correct voicemail boxes. Other victims did confirm their names. (One FBI intelligence analyst contacted as part of the employee dump nervously confirmed her name, before saying "I have no idea what you're talking about," and hanging up the phone.)
Another tactic is to independently contact people who might be in the dump, and ask them to provide some private details, which you can then check against the data. This is especially useful if there aren't any contact details for victims in the dump itself, for whatever reason.
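That last check, comparing what a source tells you with what the dump says about them, can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the dump is assumed to be a CSV file, and the column names (`email`, `job_title`, `phone`), the `check_claim` helper and the sample records are all invented here, not taken from any real breach.

```python
import csv
import io


def normalise(value):
    """Lower-case and trim a value so trivial formatting differences don't block a match."""
    return (value or "").strip().lower()


def check_claim(dump_rows, email, claimed):
    """Compare details a source gave you with the dump's record for their email.

    Returns None if the email is not in the dump at all; otherwise a dict
    mapping each claimed field to True (matches the dump) or False.
    """
    for row in dump_rows:
        if normalise(row.get("email")) == normalise(email):
            return {
                field: normalise(row.get(field)) == normalise(value)
                for field, value in claimed.items()
            }
    return None


# Invented sample standing in for a dump; a real one would be read from a file.
SAMPLE_DUMP = """email,name,job_title,phone
a.analyst@example.gov,Alex Analyst,Intelligence Analyst,555-0100
b.contractor@example.gov,Bo Contractor,Contractor,555-0101
"""

rows = list(csv.DictReader(io.StringIO(SAMPLE_DUMP)))

# Details the (hypothetical) source told you over the phone.
result = check_claim(
    rows,
    "A.Analyst@example.gov ",
    {"job_title": "intelligence analyst", "phone": "555-0199"},
)
print(result)
```

A match on a private field is a point in the dump's favour. A mismatch is not proof of a fake, since the data may simply be out of date, so it is worth going back to the source before drawing conclusions.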
CHECK THE DATA AGAINST THE AFFECTED SITE
Then there is checking the data against the website itself. If the breach involves customer accounts, for example, a reporter can try to sign up for the service with the hacked email addresses. Typically, the site will say that this operation is not possible because an account with that email is already in use, which is a good sign for the legitimacy of the dump.
A journalist should always sign up to the affected site themselves as well, perhaps with a disposable email address. That way, it'll become clearer whether verification is required to create an account. On extra-marital dating site Ashley Madison, no verification was needed, meaning that anyone could sign up with anyone else's email address, resulting in plenty of dubious 'Barack Obamas' appearing in the leaked data.
If the dump apparently contains passwords, one way of checking whether the site does store them in plaintext is by signing up for the service and then asking for a password reset. Sometimes, companies will automatically send back your full, unencrypted password.
CHECK WHETHER IT'S ALREADY PUBLIC
Pretty often, hackers release data that is already fully or partially in the public domain.
Checking this sometimes just boils down to using Google: searching for seemingly unique phrases in documents, supposedly private email addresses, and filenames can return enlightening results. A quick dig around revealed that much of the data released by pro-Islamic State hackers was just from publicly available spreadsheets. And although a recent hack of the Fraternal Order of Police (FOP) was legitimate, some of the types of files included were already published online.
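If a search does turn up a public spreadsheet or directory, it helps to measure how much of the dump it actually covers. The Python sketch below does that for email addresses. It is a rough illustration only: the `public_overlap` helper and both lists of addresses are made up for the example, and a real comparison would read both sets from files.

```python
def normalise_email(value):
    """Trim and lower-case an address so formatting differences don't hide a match."""
    return value.strip().lower()


def public_overlap(dump_emails, public_emails):
    """Return the dump addresses that also appear in a public source,
    along with the share of the dump they represent (0.0 to 1.0)."""
    dump = {normalise_email(e) for e in dump_emails if e.strip()}
    public = {normalise_email(e) for e in public_emails if e.strip()}
    already_public = dump & public
    share = len(already_public) / len(dump) if dump else 0.0
    return already_public, share


# Invented addresses: one list from the supposed dump, one from a public spreadsheet.
dump_emails = [
    "Jane.Doe@example.gov",
    "j.smith@example.gov",
    "r.roe@example.gov",
    "k.lee@example.gov",
]
public_emails = ["jane.doe@example.gov ", "press@example.gov", "R.Roe@example.gov"]

already_public, share = public_overlap(dump_emails, public_emails)
print(f"{len(already_public)} of {len(dump_emails)} dump addresses are already public ({share:.0%})")
```

A high share suggests the "hack" is mostly repackaged public material. A low share proves nothing on its own, because other public sources may exist that you haven't found.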
GET EXTRA INFORMATION FROM THE SOURCE
If a journalist is in contact with the hacker responsible, they can ask for extra information to verify the breach, such as screenshots of the hacker's access, or details of how the site or system was broken into. Any images sent over would ideally have a timestamp as well as your recent chat with the hacker included within them, to check that they were taken recently.
This helped with a further Department of Justice breach: the hacker provided screenshots of his penetration into the network, and running these through a reverse Google image search returned no relevant results, suggesting the images had not simply been lifted from elsewhere online.
A journalist must obviously never ask or encourage a source to do anything illegal, such as pulling more data out of a target, or hacking a site in the first place.
APPROACH THE TARGET OFFICIALLY
The last stage is contacting the affected company or organisation officially, and asking them to confirm the data. If the previous steps have already corroborated the dump to a fair degree, getting the target's confirmation is just the icing on the cake. Other times, their comment decides whether the story can run at all.
A dump from the DHS/FBI hacker involved forensic reports related to a DEA investigation, and a DEA spokesperson largely confirmed the legitimacy of the data. Targets aren't always so willing to help reporters out, however.
This is the final step because, to be safe, you don't want to tip off a potentially antagonistic company before you've had a chance to complete your confirmation: websites might be changed, or taken offline altogether; or staff might be told to keep their mouths shut.
Then again, when the organisation knows of the breach, and is approached by a member of the media, they are probably more likely to take steps to address the problem. This might come in the form of a forced password reset or statement sent out to customers, and if you've signed up to the site, you should get these too.
COMMUNICATE ALL OF THIS TO THE READER
After all of this work has been done, it's crucial to get the process across to the reader. It's not good enough to simply say that the data is legitimate, appears to be real, or is fake. How do you know that? What steps did you take? Did any of the contact details not work? If so, say that. Is any of the data already public? If so, make that clear. If you've only verified a small set of an apparently much larger data set, clarify that. Including caveats is vital for a reader to gain an informed understanding of the hack at hand, which is especially important if they are victims of the breach themselves.
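One way to keep those caveats honest is to log the outcome of every check as you go, then state the totals plainly in the story. Here is a minimal Python sketch of that bookkeeping. The outcome labels, the `summarise` helper and every number in the example are invented for illustration and do not describe any real verification effort.

```python
from collections import Counter

# Hypothetical outcome labels; use whatever categories fit your own checks.
LABELS = ("confirmed", "no answer", "wrong or dead contact", "already public")


def summarise(outcomes, claimed_total):
    """Turn a list of per-record outcomes into one plain sentence for readers."""
    counts = Counter(outcomes)
    unknown = set(counts) - set(LABELS)
    if unknown:
        raise ValueError(f"Unrecognised outcome labels: {sorted(unknown)}")
    parts = ", ".join(f"{counts[label]} {label}" for label in LABELS)
    return f"Checked {len(outcomes)} of a claimed {claimed_total:,} records: {parts}."


# Invented example: ten records sampled from a dump said to hold 1,000.
outcomes = (
    ["confirmed"] * 3
    + ["no answer"] * 5
    + ["wrong or dead contact"]
    + ["already public"]
)
summary = summarise(outcomes, 1000)
print(summary)
```

The point of the sentence it produces is that the reader sees the denominator: how much was checked, how much was not, and what happened in each case.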
These caveats become even more important if full confirmation cannot be obtained. In those instances, being as explicit and transparent as possible about the limits of your knowledge is crucial.
This matters because hacks vary greatly in quality, depth and importance, and the same can be said of the media coverage of them. Without accurate information, people swept up in data breaches have little chance of truly understanding how and why they're affected.
In many cases, the legitimacy of a data breach can only be reported in degrees. When scoping out the veracity of breaches, be they of company emails, internal documents, or leaky databases, approaches vary depending on what exactly the data is. But the ultimate goal is always the same: to present a transparent case to the reader for how legitimate this data is, and what exactly can be ascertained from it.