The NSA Collects Billions of Interactions a Day to Map Social Networks

Even if the NSA doesn't know who you are, it might know who you talk to.
An analyst at work at the NSA's Threat Operations Center, via the agency

Even if the NSA doesn't know who you are, it might know who you talk to. That's because, according to a new report, the NSA's collection systems are designed to model vast networks of social connections based on digital interactions.

From the start, the NSA and others, including the Department of Justice, have argued that vast data collection schemes are legal because they rest on collecting metadata: anonymous identifiers of digital communication, such as phone numbers and call times, but not things like account info and call content. Regardless of the numerous legal and privacy questions surrounding those claims, there's a pragmatic question, too: How do phone numbers lead to intelligence?

According to a report from the New York Times, the NSA has been using a sophisticated data analysis system that can connect myriad interactions into a map of social connections, and that system has been in place since at least 2010. That's notable because the NSA's vast data collection programs were in place years before that, but a policy shift at the agency in November 2010 allowed collected phone and email logs to be analyzed to track Americans' connections with foreign nationals.

What's more, the NSA's systems are designed to mix in data from public, commercial, and law enforcement sources to try to add more clarity to the agency's social maps. That includes everything from Facebook posts to travel logs and insurance information. Essentially, any data that the NSA can legally get its hands on is mixed in with its massive metadata collection schemes to build portraits of who's talking to who.

An NSA presentation slide, obtained in the Snowden leaks and published by the Times, details the system. Essentially, when an analyst selects a person or group to investigate (known as an "agent"), the system develops a portrait of that person's social network based on nodes developed through data collections. So if the agent calls Person A from a number known to be the agent's, Person A's follow-up connections can also be mapped to see who's talking to who.
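To make the idea concrete, here is a minimal sketch of that kind of contact chaining in Python. It is an illustration only, not the NSA's actual method: the call log, the identifiers, and the function names `build_graph` and `contact_chain` are all made up for this example, and the real system's design has not been published in this detail.

```python
from collections import defaultdict, deque

# Hypothetical call records: (caller, callee) pairs of anonymous identifiers.
CALL_LOG = [
    ("agent", "A"),
    ("A", "B"),
    ("A", "C"),
    ("B", "D"),
    ("E", "F"),  # unrelated pair, never reached from the agent
]


def build_graph(records):
    """Turn call records into an undirected graph: identifier -> set of contacts."""
    graph = defaultdict(set)
    for caller, callee in records:
        graph[caller].add(callee)
        graph[callee].add(caller)
    return graph


def contact_chain(graph, seed, max_hops):
    """Breadth-first search from the seed, stopping at max_hops.

    Returns a dict mapping each reachable identifier to its hop distance.
    """
    hops = {seed: 0}
    queue = deque([seed])
    while queue:
        node = queue.popleft()
        if hops[node] == max_hops:
            continue  # do not expand past the hop limit
        for neighbor in graph.get(node, ()):
            if neighbor not in hops:
                hops[neighbor] = hops[node] + 1
                queue.append(neighbor)
    return hops


network = contact_chain(build_graph(CALL_LOG), "agent", max_hops=2)
print(network)
```

With a two-hop limit, the sketch picks up Person A and A's own contacts, but not contacts a third hop away. Raising `max_hops` is what makes the mapped network grow so quickly.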

On the most basic level, it's a method of getting around the NSA's restrictions on metadata collection. Even if the agency isn't fully sure who Person A or B is, if their phone numbers eventually link the agent to another person of interest, the agency has now established a network worth investigating. It sounds like Six Degrees of Kevin Bacon, but involves massive quantities of data from anywhere they can be collected in the hopes that no social connections between persons of interest can be buried. (The programming and computing capabilities of these systems must be unfathomably immense.)
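The linking step described above, in which unidentified phone numbers end up connecting an agent to another person of interest, amounts to a path search over the same kind of graph. The following is a rough sketch under invented assumptions: the `LINKS` data, the identifier names, and the `find_chain` helper are hypothetical and are not drawn from any NSA document.

```python
from collections import deque

# Hypothetical contact graph: identifier -> set of identifiers it has contacted.
LINKS = {
    "agent": {"A"},
    "A": {"agent", "B"},
    "B": {"A", "target"},
    "target": {"B"},
    "X": set(),  # an identifier with no recorded contacts
}


def find_chain(links, start, goal):
    """Return the shortest chain of contacts from start to goal, or None."""
    previous = {start: None}  # remembers how each identifier was reached
    queue = deque([start])
    while queue:
        node = queue.popleft()
        if node == goal:
            # Walk backwards from the goal to rebuild the chain.
            path = []
            while node is not None:
                path.append(node)
                node = previous[node]
            return path[::-1]
        for neighbor in links.get(node, ()):
            if neighbor not in previous:
                previous[neighbor] = node
                queue.append(neighbor)
    return None


chain = find_chain(LINKS, "agent", "target")
print(chain)
```

Note that the search never needs to know who A or B actually is. The anonymous identifiers alone are enough to surface the chain.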

Yet even if the system is based on legally collected anonymous metadata that's augmented with publicly available data, there's still a major concern: the newly detailed system can involve Americans' data. Allowing Americans to be involved in intelligence operations is nominally illegal for the NSA, whose broad spying powers are supposed to be limited to investigation of foreign countries and individuals.

But, perhaps considering that everyone from employers to police uses the treasure troves of publicly available data we post about ourselves, the NSA decided to change its policies to allow its analysts to take advantage as well. According to the Times:

The policy shift was intended to help the agency “discover and track” connections between intelligence targets overseas and people in the United States, according to an N.S.A. memorandum from January 2011. The agency was authorized to conduct “large-scale graph analysis on very large sets of communications metadata without having to check foreignness” of every e-mail address, phone number or other identifier, the document said. Because of concerns about infringing on the privacy of American citizens, the computer analysis of such data had previously been permitted only for foreigners.

The NSA has said it follows court-prescribed "minimization" techniques, in which privacy concerns are allegedly balanced with intelligence needs by keeping data collection as focused and anonymous as possible. As the Times explains, that's hard to reconcile with the facts: Just one of the programs used in the NSA's social graph building, known as Mainway, was touted in internal NSA documents as collecting over a billion US phone records a day. Overall, budget documents released by Snowden suggest that the NSA's system can assimilate 20 billion data points a day.

The goal is clearly to allow the NSA's algorithms to find as many potential patterns as possible, but it's impossible to argue that collecting every piece of available data falls under minimization guidelines. And yet the secret court that oversees the NSA's secret programs has ruled that metadata collection isn't an illegal privacy breach precisely because such metadata is presumed anonymous.

This is despite said court, known as the Foreign Intelligence Surveillance Court, knowing full well that the NSA has lied about the scope of the collection, and the NSA itself admitting its employees have abused its capabilities to tap directly into phone conversations.

The agency has attempted to develop a system we've seen in nearly every political thriller in the last two decades: type a person's name into a computer and it'll spit out everyone they know and interact with, while tracking down particular points of interest. That the NSA even has such massive data analysis capabilities isn't a surprise anymore. That it has invested in being able to map out a person's social connections based on any points of data, which requires data collection on an unprecedented scale, actually is. And while the NSA may be able to argue its collection is legal, as it has in the past, the question we've yet to have answered is whether or not it's even effective.

@derektmead