Barricades in Kiev in January. Image: Sasha Maksymenko/Flickr
With a seemingly endless list of conflicts raging all over the globe—be it the rebel war in Ukraine, Israel’s offensive in Gaza, or the situation in Syria and Iraq—one might start to wonder how all of these wars are emerging.
The Global Database of Events, Language, and Tone (GDELT), an open source online tool with over 250 million worldwide events logged since 1979 and counting, provides an accessible platform to monitor global news at the scale of Big Data. In other words, it makes world conflict (and other geopolitical events) computational, revealing new sets of data that could hold the key to predictive conflict analysis in the future.
“The ultimate goal is to monitor public information around the world” via news media, said Kalev Leetaru, a Georgetown University researcher, Foreign Policy columnist, and creator of GDELT (he also happens to have a background in supercomputing).
Leetaru designed GDELT to use text-analysis and algorithms that scrape news media sites in over 100 languages all over the globe, compiling them in a single database with daily reports. The database breaks events into 59 different categories and geolocates them. By plugging the data into algorithmic forecasts, you could then create some form of predictive model.
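To make the idea of categorized, geolocated events concrete, here is a minimal Python sketch of how such records might be tallied before being fed to a forecast. The records, field names (`date`, `country`, `category`), and category labels are invented for illustration; they are not GDELT's actual schema or data.

```python
from collections import Counter

# Hypothetical, hand-made event records. Real GDELT rows carry many more
# fields and use their own coding scheme; these names are assumptions.
events = [
    {"date": "2014-01-20", "country": "UA", "category": "protest"},
    {"date": "2014-01-20", "country": "UA", "category": "protest"},
    {"date": "2014-01-21", "country": "UA", "category": "violence"},
    {"date": "2014-01-21", "country": "SY", "category": "violence"},
]


def count_by_country_and_category(records):
    """Tally how many events fall into each (country, category) pair."""
    return Counter((r["country"], r["category"]) for r in records)


totals = count_by_country_and_category(events)
for (country, category), n in sorted(totals.items()):
    print(country, category, n)
```

Counts like these, tracked day by day, are the kind of input a simple forecasting model could work from.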
So far, it's had the added benefit of showing biases in news: Leetaru said that while the MH17 incident in Ukraine dominated Western print, the Gaza conflict interested the rest of the world.
Essentially, it’s a dataset, available on Google BigQuery, with plans for a live map where users can intuitively view, in real time, emergent events as they unfold all over the world.
Leetaru admits that news media, with its tendency toward bias and mistakes, is not a perfect window into living history. It does, however, offer insight into the emotional motivations behind news events, and it geographically pinpoints emerging zones of conflict as they unfold.
"The public information that people put out everyday; there's an enormous wealth of indicators that give you intel into how populations are feeling, what's important and what the touch point issues are," said Leetaru. His method is to take that information and intelligently relate it to give a picture of society "as a whole."
“Forecasting is obviously the Holy Grail. There’s a ton of work being done on it right now,” said Leetaru. He said within weeks GDELT will be releasing details of “very simplistic algorithms” used as predictive forecasting models that yielded “very useful data and forecasts.”
According to him, traditional forecasting has always accepted that “black swans” like the Russian standoff in Crimea or the Arab Spring in Egypt are difficult to explain. By relying on more quantifiable data like GDP, infant mortality, or factual things like total numbers of protests, traditional forecasting ignores more qualitative measurements like emotion.
“But if you look at things like the thematic and emotional dimensions, what you start seeing is the huge amount of signals that precede a collapse,” said Leetaru. For example, he says pro-Russian chatter in Crimea started to appear in GDELT during the protests in Kiev, long before the Russian invasion.
“You’re never going to get to a point where [the map] says there’s going to be a riot at 5:05 next Friday… Where we are and what we can do right now is certainly the data says 'hey, you might want to take a look at this,’” said Leetaru.
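A "take a look at this" flag of the kind Leetaru describes can be sketched in a few lines of Python. This is not GDELT's forecasting method, which the article does not detail. It is a generic rolling z-score check on made-up daily counts, and the function name, window, and threshold are all assumptions chosen for illustration.

```python
from statistics import mean, stdev


def flag_unusual_days(daily_counts, window=7, threshold=3.0):
    """Return indices of days whose count sits more than `threshold`
    standard deviations above the mean of the preceding `window` days."""
    flagged = []
    for i in range(window, len(daily_counts)):
        history = daily_counts[i - window:i]
        mu = mean(history)
        sigma = stdev(history)
        if sigma == 0:
            # Perfectly flat history: treat any increase as unusual.
            if daily_counts[i] > mu:
                flagged.append(i)
            continue
        if (daily_counts[i] - mu) / sigma > threshold:
            flagged.append(i)
    return flagged


# Synthetic daily mention counts with one obvious spike at index 9.
counts = [10, 12, 11, 9, 10, 13, 11, 12, 10, 40, 11]
print(flag_unusual_days(counts))  # prints [9]
```

A check like this says nothing about what will happen or when. It only marks a day as worth a closer look, which matches the modest claim in the quote above.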
GDELT isn’t the only database that looks at conflict and news media through a data lens, with an eye toward predictive forecasts.
The Defense Advanced Research Projects Agency (DARPA) has its Worldwide Integrated Crisis Early Warning System (W-ICEWS), developed by Lockheed Martin, which pools from 17 million news stories with “aggregate forecast accuracy of more than 80 percent.”
Having computable and reliable intelligence informing government decision-makers on questions of where to allocate resources represents an enormous strategic advantage in the future, especially if these models become more accurate.
I asked Leetaru if any private security outfits were using his database for their own purposes, seeing as some private intelligence firms produce risk assessments for corporate or governmental clientele. "I can say GDELT is widely used throughout the world, is my answer to that," he said.
Leetaru emphasized that one of the uses for his database he envisions is as an early warning system for any citizen living in a geopolitical hot-zone.
But computers aren’t perfect. Leetaru concedes GDELT makes mistakes computing sarcasm or noticing editorial skew. For example, in the case of the downed MH17 civilian airliner in Ukraine, the finger-pointing from Russian and Ukrainian news sources muddied the story in GDELT.
That being said, other inputs, such as combining the algorithmic forecasting with more scientific data like water availability or climate issues, could potentially improve the accuracy of forecasts about global events.
"That's the future I'm interested in. The scientific community has done a terrific job putting all these satellites in the sky and sensors in the ground that when an earthquake happens within seconds we know where it is and who's going to be impacted. We don't have that for society," said Leetaru.
Ultimately, he envisions a future where the entire world is computationally modeled: a single computer model blending earth data and human information to produce predictive data on conflict. In that world, we'd hopefully see wars coming before they turn into such great humanitarian disasters.
Whether or not larger powers would ever act on predictive forecasts like this is another question. For that reason, an open-source early warning system might just be of more benefit to the average person looking for warning signs of trouble, or the average journalist looking to report on that same situation.