Analyzing Bing Searches Could Predict a Cancer Diagnosis, Study Finds

And potentially warn individual users that they should talk to a doctor.
June 12, 2016, 9:00pm

A recent study of Bing search queries once again revealed both the power of search data analysis and that data's potential for encroaching on personal privacy in unexpected ways.

Working with Microsoft, researchers from Columbia University ran a statistical analysis of "anonymized" searches by millions of users that contained terms possibly related to cancer symptoms. They concluded that search behavior could potentially predict the appearance of a disease as dangerous as pancreatic cancer, one of the leading causes of cancer death in the U.S. and often slow to be diagnosed, before a doctor has diagnosed it.

The study, published in the Journal of Oncology Practice, focused on pancreatic cancer specifically, and noted that one potential application of this predictive analysis would be to warn "individual searchers about the value of seeking attention from health care professionals." The study hints at a future tool that could assemble data points from searches and serve as what amounts to a medical early warning system for users. It could even advise patients on what to say when consulting with doctors.

Researchers began by looking at searches that indicated a user had already been diagnosed with pancreatic adenocarcinoma, and worked backwards from these "landmark queries" to find earlier searches that suggested users were worried about related symptoms. They then analyzed those early search patterns to predict when searches about an actual cancer diagnosis might appear.

They discovered that early search patterns regarding symptoms commonly identified with pancreatic cancer "can predict the future appearance of queries that are highly suggestive of a diagnosis."
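The backward-looking step described above can be sketched in a few lines of Python. This is only an illustration under loud assumptions: the phrase lists, the 180-day lookback window, the function names, and the sample log are all hypothetical and are not taken from the study, which relied on statistical modeling rather than simple keyword matching.

```python
from datetime import datetime, timedelta

# Hypothetical phrase lists, for illustration only (not the study's terms).
LANDMARK_PHRASES = ("i was just diagnosed with pancreatic cancer",)
SYMPTOM_TERMS = ("jaundice", "dark urine", "abdominal pain")


def first_landmark(queries):
    """Return the time of a user's earliest landmark query, or None."""
    times = [ts for ts, text in queries
             if any(p in text.lower() for p in LANDMARK_PHRASES)]
    return min(times) if times else None


def symptom_queries_before(queries, landmark_time, lookback_days=180):
    """Collect symptom-related queries in the window before the landmark.

    The 180-day default is an arbitrary assumption for this sketch.
    """
    start = landmark_time - timedelta(days=lookback_days)
    return sorted(
        (ts, text) for ts, text in queries
        if start <= ts < landmark_time
        and any(term in text.lower() for term in SYMPTOM_TERMS)
    )


def earlier_symptom_searches(log):
    """Map each user who has a landmark query to their earlier symptom searches."""
    result = {}
    for user, queries in log.items():
        landmark = first_landmark(queries)
        if landmark is not None:
            result[user] = symptom_queries_before(queries, landmark)
    return result


# Made-up sample log: user -> list of (timestamp, query text).
sample_log = {
    "user_a": [
        (datetime(2015, 1, 10), "causes of dark urine"),
        (datetime(2015, 2, 3), "jaundice and abdominal pain"),
        (datetime(2015, 4, 20),
         "I was just diagnosed with pancreatic cancer what next"),
    ],
    "user_b": [(datetime(2015, 3, 1), "abdominal pain after eating")],
}

matches = earlier_symptom_searches(sample_log)
```

In this toy log, only the user with a landmark query is kept, and that user's two earlier symptom searches become the kind of early pattern a predictive model could be trained on.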

This is just the latest example of analyzing search behavior to predict health issues. In late 2015, scientists at the University of Illinois concluded they could use Google search data to track the spread of infectious disease in almost real time, finding, as NPR reported at the time, that "a jump in gonorrhea might coincide with more people searching 'painful urination' or other symptoms."

Though researchers are focused on the positive potential of big data for public health, this kind of analysis also raises unsettling questions about user privacy. Similar questions came up in 2012, when Target famously used a "pregnancy prediction model" built on its customers' purchase behavior to figure out that a teen girl was pregnant, and then sent pregnancy-related coupons to her home. The girl's father received the coupons before she'd even told him she was pregnant.

It was a public relations nightmare for the retailer, and a potential sign of things to come. While it's fascinating to see the power of carefully rendered analysis on such important data, it's also unnerving to think of a day when a pop-up might warn you that you should seek out a cancer specialist.