This AI Can Diagnose a Rare Eye Condition as Well as a Human Doctor

In some cases, the convolutional neural network blew human ophthalmologists out of the water.
January 30, 2017, 4:00pm

Diagnosing medical conditions is among the more classic examples of actually useful, achievable real-world machine learning. Machines have data, lots of it, and they have the capacity to process all of that data in ways that humans can't. Crucially, machines should be able to pick up on edge cases, the rarest diseases that may go undiagnosed for simple lack of experience on the part of even the most exceptional doctors. Here, machines are meant to augment humans, rather than replace them.


To this end, a group of Chinese ophthalmologists and computer scientists has demonstrated a machine learning algorithm for identifying congenital cataracts, a rare eye disease that's nonetheless responsible for some 10 percent of all vision loss in children worldwide. The algorithm was able to catch the disease with accuracies exceeding 90 percent, putting it on par with individual human ophthalmologists. The new algorithm is described in the current issue of Nature Biomedical Engineering.

It's oft remarked that medicine is an art as much as it's a science. The interaction between a doctor and a patient is difficult to quantify—it's complex and shaded by intuition. Of particular concern when it comes to applying computational methods to medicine is the prospect of being wrong, which can have a far different meaning when that wrongness is the result of bad calculations rather than human judgement.

A doctor who makes a mistake is a human (possibly a grossly negligent human, but not necessarily), while a machine that makes a mistake is itself a mistake. A bug.

But the case for a machine role is there. "Machines have the advantages of automation, objectivity and precision, but the human ability to communicate and interact effectively is indispensable for medical treatment," study co-author Haotian Lin, a professor of ophthalmology at Sun Yat-sen University, told me.

"For doctors, technology is not sufficient to determine the best course of treatment with 100 percent certainty, and doctors should therefore make good use of the machine's suggestion to identify and prevent the potential misclassification and complement their own judgment," he said. "The results of our comparative analysis showed that both artificial intelligence and human intelligence have strengths and limitations."


Missed or mistaken diagnoses are common among rare-disease patients, and this is especially true among large populations in developing countries, like China. Congenital cataracts are an especially compelling test case because the illness can be reversed given timely intervention and rigorous follow-up care.

The algorithm is based on convolutional neural networks (CNNs), a class of machine learning models that attempts to imitate the neural processing that occurs in the visual cortex of animals. CNNs are widely used for visual recognition tasks but also in other domains, like playing Go, natural language processing, and drug discovery. The basic idea is to feed the network large sets of images, including those of known congenital cataract cases, until eventually it learns an abstract representation that can be used to successfully analyze new images.
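For a rough sense of what "convolutional" means here, the sketch below slides a small filter over a toy image and keeps only the positive responses. It is an illustration of the general operation, not the researchers' code: the 4x4 image and the edge-detecting filter are made up for the example, and a real CNN would learn its filter values from training images rather than have them picked by hand.

```python
def conv2d(image, kernel):
    """Slide the kernel over the image (no padding) and sum the
    element-wise products at each position."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    return [
        [
            sum(
                image[i + di][j + dj] * kernel[di][dj]
                for di in range(kh)
                for dj in range(kw)
            )
            for j in range(out_w)
        ]
        for i in range(out_h)
    ]


def relu(feature_map):
    """Keep positive responses and zero out the rest."""
    return [[max(0, value) for value in row] for row in feature_map]


# Hypothetical toy image: dark on the left, bright on the right.
image = [
    [0, 0, 1, 1],
    [0, 0, 1, 1],
    [0, 0, 1, 1],
    [0, 0, 1, 1],
]

# Hand-picked filter that responds to dark-to-bright vertical edges.
# (Assumption for illustration; a trained network learns these values.)
kernel = [
    [-1, 1],
    [-1, 1],
]

feature_map = relu(conv2d(image, kernel))
for row in feature_map:
    print(row)
```

The filter responds only in the middle column, where dark meets bright. A CNN stacks many such filters in layers, so that later layers respond to progressively more abstract patterns than simple edges.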

Here the researchers came up with three different networks useful for three variations on the cataract recognition task. The first is used for screening patients from the general population; the second is used for "risk stratification" among cataract patients; the third is used in assisting ophthalmologists with treatment decisions. All three are bundled up into a cloud-based platform known as CC-Cruiser.

The real-world test of the platform was based on 50 cases selected by an expert panel and consisting of a wide range of "challenging clinical situations." The performance of CC-Cruiser was stacked up against three categories of doctor: novice, competent, and expert. The 90 percent statistic cited above becomes more impressive when the results are broken down further. In normal cases, CC-Cruiser actually had no missed diagnoses and no false positives. No category of human doctor had that kind of performance. It only started to falter when tasked with making decisions about follow-up care, where the network registered a relatively large number of false positives.
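To make terms like "missed diagnoses" and "false positives" concrete, here is a minimal sketch of how such counts turn into summary figures. The labels below are invented purely for illustration and are not the study's data; the function names are likewise hypothetical.

```python
def confusion_counts(truth, predicted):
    """Count true/false positives and negatives for yes/no diagnoses."""
    tp = fp = tn = fn = 0
    for actual, guess in zip(truth, predicted):
        if actual and guess:
            tp += 1  # disease present, correctly flagged
        elif not actual and guess:
            fp += 1  # false positive: healthy eye flagged
        elif actual and not guess:
            fn += 1  # missed diagnosis
        else:
            tn += 1  # healthy eye, correctly cleared
    return tp, fp, tn, fn


def summarize(truth, predicted):
    """Turn the four counts into the usual summary figures."""
    tp, fp, tn, fn = confusion_counts(truth, predicted)
    total = tp + fp + tn + fn
    return {
        "accuracy": (tp + tn) / total,
        "sensitivity": tp / (tp + fn) if tp + fn else None,
        "specificity": tn / (tn + fp) if tn + fp else None,
        "false_positives": fp,
        "missed_diagnoses": fn,
    }


# Hypothetical labels (True = cataract), NOT the study's data.
truth = [True, True, True, False, False, False, False, True]
predicted = [True, True, False, False, True, False, False, True]

report = summarize(truth, predicted)
print(report)
```

A classifier with no missed diagnoses and no false positives would score 1.0 on all three ratios. A single headline accuracy number can hide which kind of error is being made, which is why the breakdown matters.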

CC-Cruiser is promising but not quite ready for IRL use yet. "Currently, our agent has been used in three non-specialized collaborating hospitals to further validate the feasibility of real-world clinical implementation," Lin said. "However, due to the need to respect and protect human life, medical fields always hold conservative and cautious attitudes when facing cutting-edge technology. Further rigorous clinical trials were still needed before we put the AI into regular clinical practice."