Anatomy of a Globe-Spanning Google Outage

How an Indian ISP made a small mistake that snowballed all over the world and took Google down
Image: Ruslan Absurdov/Shutterstock

On Thursday morning, many people around the world, perhaps millions, from Brazil to France, faced what could best be described as a modern-day nightmare.

Google was down.

Google is #down ? The end of the world … ! #apocalypse [translated from French]

— T0M_LUCA (@T0M_LUCA) March 12, 2015

run for the hills, Google is down. (at least here for me) [translated from Portuguese]

— Guilherme Munnhoz (@guimunhoz) March 7, 2015

As it turns out, the 20-minute outage was caused by a simple mistake made by an Indian internet service provider. The error snowballed around the world and ensnared at least 28 other ISPs, making Google unreachable for "millions" of people, according to Doug Madory, a researcher at internet monitoring firm Dyn.

And, most likely, it was all caused by a human mistake.

"It's a big internet," Madory told Motherboard. "There's a lot of engineers doing a lot of work and it comes down to people typing commands into routers and if they make a mistake, routes leak out and traffic can be misdirected."

In a blog post, Madory explained that the mistake was a "routing leak," which happens when a network provider mistakenly sends its internal routing tables to other peered networks, redirecting internet traffic the wrong way. Think of it as traffic officers incorrectly telling a lot of drivers to go the wrong way, through smaller roads, which ends up creating a gridlock.

In this case, Indian ISP Hathway mistakenly published routes to 300 Google IP addresses to its backbone provider Bharti Airtel, which in turn passed these routes to "the rest of the world," according to Madory. At least 28 ISPs, including major ones like Level 3, Cogent, Orange and Pakistan Telecom, took those routes, in some instances over their direct links to Google, and incorrectly directed traffic through Hathway, creating the outage.
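Why would a network pick a leaked route over its own direct link? One common reason is that routers forward traffic along the most specific matching prefix, so a narrower leaked route can win over a broader legitimate one. This is offered only as an illustration, not as a confirmed account of this incident. The sketch below is a toy model under that assumption: the prefixes come from address space reserved for documentation, and the route table, next-hop labels and `best_route` helper are all hypothetical.

```python
import ipaddress


def best_route(routes, destination):
    """Return the route with the longest prefix that contains destination, or None."""
    addr = ipaddress.ip_address(destination)
    matches = [r for r in routes if addr in ipaddress.ip_network(r["prefix"])]
    if not matches:
        return None
    # Longest-prefix match: the most specific covering prefix wins.
    return max(matches, key=lambda r: ipaddress.ip_network(r["prefix"]).prefixlen)


# Hypothetical routing table; 203.0.113.0/24 is documentation-only space.
table = [
    {"prefix": "203.0.113.0/24", "next_hop": "direct-link-to-destination"},
]
before = best_route(table, "203.0.113.10")

# A leaked, more specific route for part of the same space arrives.
table.append({"prefix": "203.0.113.0/25", "next_hop": "via-leaking-isp"})
after = best_route(table, "203.0.113.10")

print(before["next_hop"])
print(after["next_hop"])
```

In this toy model the destination is reached over the direct link before the leak and through the leaking network afterward, even though the direct route never went away. Real BGP route selection weighs more factors than prefix length, so treat this as a simplification.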

The outage was also detected by BGPmon, another firm that monitors internet traffic. BGPmon explained how the incident happened in its own blog post, illustrating it with a graphic.

It's unclear exactly why Hathway leaked these routes. Hathway did not respond to Motherboard's request for comment and Google hasn't responded yet either. But Madory said that this kind of leak isn't that uncommon, and there have been several cases in the past.

In 2010, China Telecom leaked bogus routes that a US academic backbone provider took as legit, redirecting internet traffic from many US universities through China. At the time, Forbes, tongue in cheek, described the incident as a "cybernuke" test. Last year, Madory added, South Americans who tried to connect to Yandex, Russia's equivalent of Google (which has a campus in California, too), went through Belarus instead of the US.

In many cases, these mistakes don't even get detected unless they knock a giant like Google offline, Madory said.

But when they do, like today, they are a great reminder that yes, the internet is indeed a "series of tubes," and traffic flowing through those tubes sometimes gets all screwed up by the humans directing it.