How I Discovered the First Big Mobile Privacy Scandal
That smartphones can track our every movement seems obvious today, but in 2011 we weren't quite used to the idea.
Image: Alasdair Allan/Motherboard
Motherboard staff is exploring the cultural, political, and social influence of the iPhone for the 10th anniversary of its release.
On Friday, March 11, 2011, a magnitude nine earthquake occurred near the Japanese island of Honshu, shutting down power generation at the Fukushima Nuclear Plant. Fifty minutes later the tsunami arrived, overtopping the protective wall around the plant and flooding the standby diesel generators in the basement. Unfortunately, with the reactors themselves shut down, it was these generators that were providing power for cooling the reactor cores. The subsequent meltdown at the plant made news around the world, and caused $187 billion worth of damage. It also led Pete Warden and me to discover what later became known as "locationgate," one of the first of the big mobile privacy scandals.
In the process of working on the data, I'd become rather enamoured with Pete's then-new visualization tool OpenHeatMap. Designed to take tabular geographical data and build an interactive map in seconds, it had been an amazing help putting together the visualizations of the data. It also seemed like a perfect fit for visualizing some other data I'd recently become interested in: the data hidden on your cell phone, or what you might refer to as tertiary data. Data that is generated about us, rather than by us.
I'd been sitting in the audience of a panel on posthumans, big data, and new interfaces at the Strata conference in Santa Clara a couple of weeks before the Fukushima incident when Toby Segaran mentioned how we carried distributed big data in our pocket all the time. It was a throwaway remark, but it had started me thinking.
A couple of weeks after Fukushima, and holed up over a long weekend in a hotel in northern California, I'd finished reading the books that I'd brought with me and was getting bored. I decided to start looking for data hidden on my cell phone to throw at Pete's new visualisation tool, which meant poking around inside the backups of my iPhone on the Mac.
I was looking for data that the phone generated, but wasn't exposed in the user interface. Hidden data, in other words. I'd been looking for the positioning cache file—a file full of temporarily stored location data—something I pretty much knew had to be there both to speed up satellite fixes for high accuracy GPS positioning, and to help with lower accuracy WiFi triangulation. Finding what I thought was the right cache file, I naively threw it at Pete's visualisation tool. It crashed. The cache file was large, much larger than I had expected. In the end, I emailed Pete and we worked on it together. In what was a classic case of what's called "data leakage," there was more than a year of location data in the cache file. Sitting there, unencrypted on my laptop.
While I explored the data in an attempt to figure out what was going on, Pete quickly threw together a desktop tool based around his OpenHeatMap code. It was the quickest, dirtiest piece of code he could put together, and he wrote it mostly so I'd stop hammering his servers with visualization requests. We'd later release the tool alongside the news of the data leak and, unknown to us at the time, that decision would prove crucial in the way the story unfolded.
The following week, both Pete and I were attending Where 2.0, a conference on data and location. The reaction to the post announcing the discovery was interesting. I don't think either Pete or I was expecting to see the story strike such a nerve. We certainly didn't anticipate Senator Al Franken's letter to Steve Jobs, the subsequent class-action lawsuits, and the Senate hearings around location privacy that pulled in both Apple and Google. We certainly didn't anticipate getting a mention on South Park.
The hostility from the security community, which at the time I wasn't involved with in the same way I am today, was also unexpected. The presence of the file was known inside the community, but hadn't been widely publicised outside it. To them, our find was neither new, nor a 'discovery,' and it was dismissed by most insiders as irrelevant.
But as Pete put it at the time, "The main reason we went public with this was exactly because it already seemed to be an open secret among people who make their living doing forensic phone analysis, but not among the general public."
Apple's immediate response to the story was also perhaps somewhat disingenuous. "The iPhone is not logging your location," it said. "Rather, it's maintaining a database of Wi-Fi hotspots and cell towers around your current location, some of which may be located more than one hundred miles away from your iPhone, to help your iPhone rapidly and accurately calculate its location when requested." This ignored the fact that if the phone is storing a list of access points and cell towers around your location, then the center of those separate points will be a good approximation of your location. After all, that was the whole point of storing them in the first place. Contrary to Apple's claims in its initial response, the phones continued to store location data even when location services were disabled.
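The point about the center of those points is easy to check for yourself. What follows is only a rough sketch in Python, not Apple's method and not the format of the cache file: the function name `estimate_location` and the coordinates are made up for illustration, and every tower is weighted equally, which a real positioning system would be unlikely to do.

```python
import math


def estimate_location(points):
    """Return the (lat, lon) centroid, in degrees, of a list of (lat, lon) points.

    Each point is converted to a 3D unit vector, the vectors are averaged,
    and the mean vector is converted back to latitude and longitude.
    """
    if not points:
        raise ValueError("need at least one point")
    x = y = z = 0.0
    for lat_deg, lon_deg in points:
        lat, lon = math.radians(lat_deg), math.radians(lon_deg)
        x += math.cos(lat) * math.cos(lon)
        y += math.cos(lat) * math.sin(lon)
        z += math.sin(lat)
    n = len(points)
    x, y, z = x / n, y / n, z / n
    lon = math.atan2(y, x)
    lat = math.atan2(z, math.hypot(x, y))
    return math.degrees(lat), math.degrees(lon)


# Hypothetical towers and hotspots scattered around a phone (invented coordinates).
towers = [(37.36, -121.97), (37.35, -121.94), (37.34, -121.96), (37.37, -121.95)]
lat, lon = estimate_location(towers)
print(f"estimated position: {lat:.3f}, {lon:.3f}")
```

Even with a handful of invented points, the estimate falls inside the area the towers surround, which is why a list of "nearby" towers kept over a year amounts to a record of where the phone has been.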
Apple did, however, acknowledge that the phone was storing so much data because of a bug, and said it thought that the phone needed to store no more than "seven days of data." The iOS 4.3.3 operating system update, released soon afterwards, was more-or-less entirely dedicated to fixing the bug that caused locationgate.
The Rise of Data Journalism
Soon after the release, both Pete and I were wilting under the weight of media requests. We were interviewed live on more news stations than I can now remember, and our inboxes were clogged with emails asking for more: more information, more quotes, more of anything. We were badgered to justify our position, interrogated about our motivations. We were cast as heroes, and as villains. In the end we stopped answering email, stopped taking interview requests, and let the media write what they wanted.
In hindsight it's pretty obvious that the visualisation tool that Pete wrote for us to use, and that we then released, was crucial in how the news of the data leak spread.
The tool we released allowed you to access and visualize your own data, which meant that journalists looking for a story could plug their own phones into their Mac and not only confirm for themselves that the story was real, but track themselves.
It was interesting how quickly our names started to drop out of the stories, how quickly it became "two data scientists" or "two researchers." How the story became about the data, and the stories it told, rather than about us.
It quickly became clear that most journalists led interesting lives. For instance, Alexis Madrigal wrote an amazing piece in The Atlantic which showed data recorded while he was flying around with James Fallows.
We released an application that allowed anyone to visualise their own data, to see their own life through the lens of the story. Suddenly everyone was part of the narrative. It was hailed as the future of journalism. Instead of talking about privacy, we showed people how it affected them; we gave people the opportunity to look at their own data.
"Data, on its own, locked up or muddled with errors, does little good," Alex Howard wrote in 2012. "Cleaned up, structured, analyzed and layered into stories, data can enhance our understanding of the most basic questions about our world, helping journalists to explain who, what, where, how and why changes are happening."
One of the most impressive data projects I saw in the aftermath of our release was called Crowdflow. The brainchild of Michael Kreil, the project collected over a thousand location datasets. People donated their own data mostly because, just like Madrigal, they found the existence of the data "fascinating."
The data was anonymised, sanitised, and then aggregated. Then, from the aggregated data, Michael went ahead and built some amazing visualisations showing the movement of people across Europe and the rest of the world. In the end, he even made the aggregated data available for general download. Access to the original donated data, before anonymization and aggregation, was only available to validated academics for research.
However, he wasn't alone in being obsessed with playing with the data. James Bridle, a London-born artist, made a book out of his own data. Each page a single day, with the book showing his movements throughout the year. I still have a copy sitting on my shelf at home.
The story spread not because Pete and I sought out publicity. Instead, people cared because they were able to see their own data. Everyone was the story.
Looking Back and Looking Forward
By 2013, Apple was still collecting location data. But this time they were exposing it in the user interface and allowing users to manage it. These days, locationgate wouldn't even be a story.
Since then, people have become a lot more comfortable with the idea of sharing location data, while at the same time becoming a lot more nuanced about how that data is shared. A recent privacy scandal, in which Uber updated its app to ask users to share their location all the time, even when the app wasn't running, is illustrative. People are OK with their phone tracking their location, but want control over how it's shared.
Apple's own view on the matter is rather obvious after the recent release of iOS 11 beta to developers in the wake of this year's WWDC. In the past, developers like Uber could effectively force users into granting 'Always' permissions by just not providing the option of 'Only When Using the App.' But the new release of the operating system changes that. Developers can no longer demand all or nothing when asking for a user's location; they have to offer the user a more limited option to share their location only when using the app.
I think the debate that locationgate started was valuable. In fact, I think it was essential. It has taken us 20 years to begin to have a serious debate about privacy on the Internet, and locationgate was one of the things that got us here.