Chris Whong, a self-proclaimed “urbanist, mapmaker, and datajunkie,” recently obtained 2013 taxi trip data from the government through a Freedom of Information Act request, yielding a deluge of information about Yellow Cab rides throughout the five boroughs. Maybe he had one too many of those days where he couldn't catch a ride anywhere and wanted proof that it wasn't his fault. Maybe he wanted to affirm, once and for all, that Uber is worth the cash. Regardless, thanks to his request, Whong received data detailing every single cab ride from last year, including pickup and dropoff times, as well as GPS information about each taxi's route.
Armed with 50 gigabytes of governmental data, Whong partnered with social computing researcher Andrés Monroy to sift through it, and with Eric Fischer to produce a detailed map of the intense onslaught of cabbie info. Now we could possibly infer things such as where it's hardest to hail a cab, or which neighborhoods include residents most likely to hitch a ride to work (we'd guess Tribeca or Soho, for obvious financial reasons).
While the resulting map is visually stunning, it used only some of the data the government provided. To preserve anonymity, the government encrypted the license and taxi medallion numbers of each driver.
However, the government’s attempt at anonymity didn’t stand a chance against the skills of an average tech whiz or hacker. As software engineer Vijay Pandurangan points out: “The personally identifiable information (the driver’s licence number and taxi number) hasn’t been anonymized properly; what’s worse, it’s trivial to undo, and with other publicly available data, one can even figure out which person drove each trip.”
Pandurangan goes on to explain in heavy detail how to figure out each driver’s name, estimated salary, and medallion number from the information made publicly available. So every cabbie in the city was unintentionally doxxed by the government.
Pandurangan concludes his findings with a mildly scolding message to the government: “The cat is already out of the bag in this case, but hopefully in the future, agencies will think carefully about the method they use to anonymize data before releasing it to the public.”
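For a sense of why this kind of obscuring is "trivial to undo," here is a rough Python sketch. It assumes the identifiers were run through an unsalted MD5 hash, as Pandurangan's write-up describes, and it uses a single illustrative ID pattern (digit, letter, digit, digit, such as 5X55) as a stand-in for real medallion numbers. The function names and the sample ID are ours, not taken from the released data. When only a few thousand values are possible, anyone can hash all of them ahead of time and look the answer up.

```python
import hashlib
from itertools import product
from string import ascii_uppercase, digits


def md5_hex(text: str) -> str:
    """Return the hex MD5 digest of an ASCII identifier (assumed hashing scheme)."""
    return hashlib.md5(text.encode("ascii")).hexdigest()


def build_lookup() -> dict:
    """Precompute hash -> identifier for every ID shaped digit-letter-digit-digit.

    The pattern is an illustrative assumption; it yields 10 * 26 * 10 * 10 candidates.
    """
    table = {}
    for d1, letter, d2, d3 in product(digits, ascii_uppercase, digits, digits):
        candidate = f"{d1}{letter}{d2}{d3}"
        table[md5_hex(candidate)] = candidate
    return table


lookup = build_lookup()

# Stand-in for an obscured value as it might appear in a released file.
hidden = md5_hex("5X55")

# Reversing the "anonymization" is a single dictionary lookup.
recovered = lookup[hidden]
print(len(lookup), recovered)  # prints: 26000 5X55
```

The whole table builds in well under a second on a laptop, which is the point: hashing hides nothing when the set of possible inputs is this small. A random per-record identifier with no mathematical link to the original number would not have this weakness.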