In a world of ubiquitous smartphone cameras, drones, and Google Street View cars, there has probably never been a more important time to protect the identities of people unwittingly captured in photos and videos.
But while websites like YouTube have begun offering tools to obscure faces and other objects appearing in digital media, researchers have found that those protections can be defeated with alarming ease thanks to recent advances in artificial intelligence.
In a paper released earlier this month, researchers at UT Austin and Cornell University demonstrate that faces and objects obscured by blurring, pixelation, and a recently proposed privacy system called P3 can be successfully identified by a neural network trained on image datasets—in some cases more reliably than humans can manage.
"We argue that humans may no longer be the 'gold standard' for extracting information from visual data," the researchers write. "Recent advances in machine learning based on artificial neural networks have led to dramatic improvements in the state of the art for automated image recognition. Trained machine learning models now outperform humans on tasks such as object recognition and determining the geographic location of an image."
For the experiment, the researchers trained neural networks on four different image datasets to defeat three types of image obfuscation: pixelation (also called "mosaicing"); blurring (the Gaussian blur you've probably seen applied to faces and signs in Google Street View); and a newer method called Privacy Preserving Photo Sharing, or P3, which splits an image into a "public" part with the sensitive visual information stripped out and a small encrypted "secret" part that retains it.
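To make the first two obfuscation methods concrete, here is a toy grayscale sketch of each: pixelation as block averaging, and a simple box blur standing in for the Gaussian blur (the real tools use proper Gaussian kernels; the image, block size, and kernel here are illustrative, not taken from the paper).

```python
def pixelate(pixels, block=8):
    """Mosaic a grayscale grid: replace each block with its average intensity."""
    h, w = len(pixels), len(pixels[0])
    out = [[0] * w for _ in range(h)]
    for by in range(0, h, block):
        for bx in range(0, w, block):
            cells = [(y, x) for y in range(by, min(by + block, h))
                            for x in range(bx, min(bx + block, w))]
            avg = sum(pixels[y][x] for y, x in cells) // len(cells)
            for y, x in cells:
                out[y][x] = avg
    return out

def box_blur(pixels):
    """Toy stand-in for Gaussian blur: average each pixel's 3x3 neighborhood."""
    h, w = len(pixels), len(pixels[0])
    out = [[0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            nbrs = [pixels[yy][xx]
                    for yy in range(max(0, y - 1), min(h, y + 2))
                    for xx in range(max(0, x - 1), min(w, x + 2))]
            out[y][x] = sum(nbrs) // len(nbrs)
    return out

# A synthetic 8x8 "face crop" for demonstration.
face = [[(x * 31 + y * 17) % 256 for x in range(8)] for y in range(8)]
mosaic = pixelate(face, block=4)   # each 4x4 block becomes a single flat tile
soft = box_blur(face)              # detail smeared but not removed
```

The key point either way: both transforms throw away detail, but neither destroys all of the statistical structure that distinguishes one face from another.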
After being trained on a dataset of 530 individuals, the researchers' neural net identified pixelated faces 57 percent of the time, rising as high as 72 percent when the system was allowed five guesses. The neural net also identified blurred images with over 50 percent accuracy after being trained on black-and-white photos of 40 individuals blurred by YouTube's tool, and managed 40 percent accuracy when tested against images obscured by P3.
"The key reason why our attacks work is that we do not need to specify the relevant features in advance," the researchers explain. "We do not even need to understand what exactly is leaked by a partially encrypted or obfuscated image. Instead, neural networks automatically discover the relevant features and learn to exploit the correlations between hidden and visible information."
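That core idea—train directly on obfuscated images and let the model exploit whatever identifying signal survives—can be illustrated with a deliberately simple stand-in for the paper's neural networks: a nearest-neighbor matcher over pixelated grids. Everything below is synthetic toy data, not the researchers' method or datasets.

```python
import random

def pixelate(pixels, block=2):
    """Mosaic a grayscale grid by averaging each block (self-contained copy)."""
    h, w = len(pixels), len(pixels[0])
    out = [[0] * w for _ in range(h)]
    for by in range(0, h, block):
        for bx in range(0, w, block):
            cells = [(y, x) for y in range(by, min(by + block, h))
                            for x in range(bx, min(bx + block, w))]
            avg = sum(pixels[y][x] for y, x in cells) // len(cells)
            for y, x in cells:
                out[y][x] = avg
    return out

def identify(probe, gallery):
    """'Training' is just storing obfuscated examples per identity;
    prediction picks the nearest neighbor in raw pixel space."""
    def dist(a, b):
        return sum((pa - pb) ** 2 for ra, rb in zip(a, b) for pa, pb in zip(ra, rb))
    return min(gallery, key=lambda name: dist(probe, gallery[name]))

rng = random.Random(0)
faces = {f"person_{i}": [[rng.randrange(256) for _ in range(8)] for _ in range(8)]
         for i in range(5)}
# The attacker never sees the originals, only obfuscated versions.
gallery = {name: pixelate(img) for name, img in faces.items()}
probe = pixelate(faces["person_3"])  # a freshly obfuscated photo of person_3
print(identify(probe, gallery))      # prints "person_3"
```

Because the obfuscation is deterministic and lossy rather than randomizing, the pixelated versions of the same face stay correlated, and even this crude matcher recovers the identity; the paper's contribution is that deep networks discover such correlations automatically, at scale, without anyone specifying the features in advance.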
Going forward, the authors say, designers of privacy systems will need to step up their game: it is not enough to obscure the sensitive parts of a photo or video; they must also prevent the visible data from being used by neural networks to reconstruct or infer the missing information.
"Unfortunately, we show that obfuscated images contain enough information correlated with the obfuscated content to enable accurate reconstruction of the latter," the authors conclude. "Modern image recognition methods based on deep learning are especially powerful in this setting because the adversary does not need to specify the relevant features of obfuscated images in advance or even understand how exactly the remaining information is correlated with the hidden information."