Of the many small humiliations heaped on a young oncologist in his final year of fellowship, perhaps this one carried the oddest bite: A 2-year-old black-and-white cat named Oscar was apparently better than most doctors at predicting when a terminally ill patient was about to die. The story appeared, astonishingly, in The New England Journal of Medicine in the summer of 2007. Adopted as a kitten by the medical staff, Oscar reigned over one floor of the Steere House nursing home in Rhode Island. When the cat would sniff the air, crane his neck and curl up next to a man or woman, it was a sure sign of impending demise. The doctors would call the families to come in for their last visit. Over the course of several years, the cat had curled up next to 50 patients. Every one of them died shortly thereafter.
No one knows how the cat acquired his formidable death-sniffing skills. Perhaps Oscar’s nose learned to detect some unique whiff of death — chemicals released by dying cells, say. Perhaps there were other inscrutable signs. I didn’t quite believe it at first, but Oscar’s acumen was corroborated by other physicians who witnessed the prophetic cat in action. As the author of the article wrote: “No one dies on the third floor unless Oscar pays a visit and stays awhile.”
The story carried a particular resonance for me that summer, for I had been treating S., a 32-year-old plumber with esophageal cancer. He had responded well to chemotherapy and radiation, and we had surgically resected his esophagus, leaving no detectable trace of malignancy in his body. One afternoon, a few weeks after his treatment had been completed, I cautiously broached the topic of end-of-life care. We were going for a cure, of course, I told S., but there was always the small possibility of a relapse. He had a young wife and two children, and a mother who had brought him weekly to the chemo suite. Perhaps, I suggested, he might have a frank conversation with his family about his goals?
But S. demurred. He was regaining strength week by week. The conversation was bound to be “a bummah,” as he put it in his distinct Boston accent. His spirits were up. The cancer was out. Why rain on his celebration? I agreed reluctantly; it was unlikely that the cancer would return.
When the relapse appeared, it was a full-on deluge. Two months after he left the hospital, S. returned to see me with sprays of metastasis in his liver, his lungs and, unusually, in his bones. The pain from these lesions was so terrifying that only the highest doses of painkilling drugs would treat it, and S. spent the last weeks of his life in a state bordering on coma, unable to register the presence of his family around his bed. His mother pleaded with me at first to give him more chemo, then accused me of misleading the family about S.’s prognosis. I held my tongue in shame: Doctors, I knew, have an abysmal track record of predicting which of our patients are going to die. Death is our ultimate black box.
In a survey led by researchers at University College London of over 12,000 prognoses of the life span of terminally ill patients, the hits and misses were wide-ranging. Some doctors predicted deaths accurately. Others underestimated death by nearly three months; yet others overestimated it by an equal magnitude. Even within oncology, there were subcultures of the worst offenders: In one story, likely apocryphal, a leukemia doctor was found instilling chemotherapy into the veins of a man whose I.C.U. monitor said that his heart had long since stopped.
But what if an algorithm could predict death? In late 2016 a graduate student named Anand Avati at Stanford’s computer-science department, along with a small team from the medical school, tried to “teach” an algorithm to identify patients who were very likely to die within a defined time window. “The palliative-care team at the hospital had a challenge,” Avati told me. “How could we find patients who are within three to 12 months of dying?” This window was “the sweet spot of palliative care.” A lead time longer than 12 months can strain limited resources unnecessarily, providing too much, too soon; in contrast, if death came less than three months after the prediction, there would be no real preparatory time for dying — too little, too late. Identifying patients in the narrow, optimal time period, Avati knew, would allow doctors to use medical interventions more appropriately and more humanely. And if the algorithm worked, palliative-care teams would be relieved from having to manually scour charts, hunting for those most likely to benefit.
Avati and his team identified about 200,000 patients who could be studied. The patients had all sorts of illnesses — cancer, neurological diseases, heart and kidney failure. The team’s key insight was to use the hospital’s medical records as a proxy time machine. Say a man died in January 2017. What if you scrolled time back to the “sweet spot of palliative care” — the window between January and October 2016 when care would have been most effective? But to find that spot for a given patient, Avati knew, you’d presumably need to collect and analyze medical information before that window. Could you gather information about this man during this pre-window period that would enable a doctor to predict a demise in that three-to-12-month section of time? And what kinds of inputs might teach such an algorithm to make predictions?
Avati drew on medical information that had already been coded by doctors in the hospital: a patient’s diagnosis, the number of scans ordered, the number of days spent in the hospital, the kinds of procedures done, the medical prescriptions written. The information was admittedly limited — no questionnaires, no conversations, no sniffing of chemicals — but it was objective, and standardized across patients.
These inputs were fed into a so-called deep neural network — a kind of software architecture thus named because it’s thought to loosely mimic the way the brain’s neurons are organized. The task of the algorithm was to adjust the weights and strengths of each piece of information in order to generate a probability score that a given patient would die within three to 12 months.
The “dying algorithm,” as we might call it, digested and absorbed information from nearly 160,000 patients to train itself. Once it had ingested all the data, Avati’s team tested it on the remaining 40,000 patients. The algorithm performed surprisingly well. The false-alarm rate was low: Nine out of 10 patients predicted to die within three to 12 months did die within that window. And 95 percent of patients assigned low probabilities by the program survived longer than 12 months. (The data used by this algorithm can be vastly refined in the future. Lab values, scan results, a doctor’s note or a patient’s own assessment can be added to the mix, enhancing the predictive power.)
So what, exactly, did the algorithm “learn” about the process of dying? And what, in turn, can it teach oncologists? Here is the strange rub of such a deep learning system: It learns, but it cannot tell us why it has learned; it assigns probabilities, but it cannot easily express the reasoning behind the assignment. Like a child who learns to ride a bicycle by trial and error and, asked to articulate the rules that enable bicycle riding, simply shrugs her shoulders and sails away, the algorithm looks vacantly at us when we ask, “Why?” It is, like death, another black box.
Still, when you pry the box open to look at individual cases, you see expected and unexpected patterns. One man assigned a score of 0.946 died within a few months, as predicted. He had had bladder and prostate cancer, had undergone 21 scans, had been hospitalized for 60 days — all of which had been picked up by the algorithm as signs of impending death. But a surprising amount of weight was seemingly put on the fact that scans were made of his spine and that a catheter had been used in his spinal cord — features that I and my colleagues might not have recognized as predictors of dying (an MRI of the spinal cord, I later realized, was most likely signaling cancer in the nervous system — a deadly site for metastasis).
It’s hard for me to read about the “dying algorithm” without thinking about my patient S. If a more sophisticated version of such an algorithm had been available, would I have used it in his case? Absolutely. Might that have enabled the end-of-life conversation S. never had with his family? Yes. But I cannot shake some inherent discomfort with the thought that an algorithm might understand patterns of mortality better than most humans. And why, I kept asking myself, would such a program seem so much more acceptable if it had come wrapped in a black-and-white fur box that, rather than emitting probabilistic outputs, curled up next to us with retracted claws?
Siddhartha Mukherjee is editor-at-large at Tonic. He is the author of The Emperor of All Maladies: A Biography of Cancer, winner of the 2011 Pulitzer Prize in general nonfiction, as well as 2016's The Gene, and The Laws of Medicine. Mukherjee is an assistant professor of medicine at Columbia University and a cancer physician and researcher.
This piece originally appeared in the New York Times magazine.
Sign up for our newsletter to get the best of Tonic delivered to your inbox.