Tech by VICE

A Haiku Bot Mines the New York Times for Poetry

Bots are finding and creating kooky content everywhere you look. Soon we'll all be out of jobs.

by Austin Considine
Apr 1 2013, 5:40pm

One day, we’ll all have

everything we read online

written by robots.

Well, maybe not everything. But a bit of clever coding already goes a long way these days toward finding and creating some pretty interesting reading material. The latest example, as reported this morning by the Nieman Journalism Lab, is a bot developed by New York Times senior software architect Jacob Harris, which takes stories from the Times homepage and turns them into wholly intelligible haikus.

The haikus are posted to a Tumblr page hosted by the Times, where Harris explains a little bit about how the bot works. It’s important to note, first, that the bot does not write the haikus; it finds them. Periodically, the bot scans articles on the Times homepage and matches little chunks to data from an online dictionary that breaks down words by syllable count. When the bot finds a complete sentence that fits the standard 17-syllable format, it snags the sentence, automatically skipping over articles about sensitive subjects and tossing out particularly awkward constructions.
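
To make the finding step concrete, here is a minimal sketch of how a sentence might be checked against the 5-7-5 pattern. It is not Harris's code. The tiny `SYLLABLES` table is a hypothetical stand-in for the online syllable dictionary, and the function name `find_haiku` is invented for illustration.

```python
import re

# Hypothetical stand-in for the online syllable dictionary the article
# mentions; a real bot would look words up in a much larger source.
SYLLABLES = {
    "barking": 2, "and": 1, "howling": 2, "can": 1, "make": 1,
    "life": 1, "miserable": 4, "for": 1, "everybody": 4,
}

def find_haiku(sentence, syllables=SYLLABLES, pattern=(5, 7, 5)):
    """Return three lines if the sentence splits into 5/7/5 syllables
    on word boundaries, otherwise None."""
    words = re.findall(r"[A-Za-z']+", sentence)
    lines, current, count, line_index = [], [], 0, 0
    for word in words:
        n = syllables.get(word.lower())
        # Give up on unknown words or on words left over after line three.
        if n is None or line_index >= len(pattern):
            return None
        current.append(word)
        count += n
        if count > pattern[line_index]:
            return None  # a word straddles the line break
        if count == pattern[line_index]:
            lines.append(" ".join(current))
            current, count = [], 0
            line_index += 1
    # Valid only if all three lines were filled with nothing left over.
    if line_index != len(pattern) or current:
        return None
    return lines

print(find_haiku("Barking and howling can make life miserable for everybody."))
```

Run on the pet-owner sentence quoted later in the article, the sketch splits it into the same three lines. It strips punctuation and does nothing about sensitive subjects or awkward phrasing, which the real bot filters out.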

Still, “the machine has no aesthetic sense,” Harris notes. The algorithm is its muse. “It can’t distinguish between an elegant verse and a plodding one,” he continues. “But, when it does stumble across something beautiful or funny or just a gem of a haiku, human journalists select it and post it on this blog.”

For example, for an article entitled, “What Pet Owners Must Do to Get New York Apartments,” the bot found:

Barking and howling

can make life miserable

for everybody.

And for an installation of the Styles section’s “Modern Love” column, entitled “The Fear of Surrendering Again,” it produced this particularly poetic gem:

He has a mind as

fascinating to me as

the city itself.

As noted in the Nieman article, the Times haiku maker was inspired, in part, by a project called Haikuleaks, a similar bot that mined the massive 2010 Wikileaks documents for haiku. But it’s also just the latest in a burgeoning bunch of bots designed to find, create, or aggregate content. On the website Tiny Subversions, for example, programmer Darius Kazemi offers a number of cool content-generating bots (and their open-source code) like RapBot, which randomly generates 1980s-style raps, or Startup Generator, which, as the name implies, creates random startup ideas like “Reddit for parochial” or “Google Docs for uncomfortables,” poking fun at the absurdity and uselessness of so many web-based startup ideas.
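
The "X for Y" pattern behind those startup ideas is easy to imitate. The sketch below is a guess at the general approach, not Kazemi's open-source code. The two word lists contain only the examples quoted above, and the names `PRODUCTS`, `AUDIENCES`, and `startup_idea` are hypothetical.

```python
import random

# Assumed word lists, seeded only with the examples quoted in the article.
PRODUCTS = ["Reddit", "Google Docs"]
AUDIENCES = ["parochial", "uncomfortables"]

def startup_idea(rng=random):
    """Fill the template '<product> for <audience>' with random picks."""
    return f"{rng.choice(PRODUCTS)} for {rng.choice(AUDIENCES)}"

print(startup_idea())
```

Passing a seeded `random.Random` instance as `rng` makes the output repeatable.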

How far can automatic content creation go? We’ve seen a few hints at the future, and the future could look bleak for some creative types. An obvious and widely reported example came when Netflix used big data aggregated from its millions of users’ preferences and viewing habits to choose its television series House of Cards (David Fincher + Kevin Spacey + adaptation of a successful British drama = success). No need for the creative genius or the hack to waste the producer’s time. The computer does the brainstorming.

More recently, Yahoo paid tens of millions of dollars to 17-year-old developer Nick D’Aloisio, who created Summly, a news summary app that uses bots to find and aggregate news stories, then summarize them into 400-character paragraphs. Such software is still in the early stages of development and far from perfect. But it portends ill for blogger/aggregators who make their livings doing something close to what Summly does without the first-person or the usual snark.

Which is to say it’s bad news for a lot of journalists I know, who are struggling to keep up with the demand for constant content and who rely a bit (or more) upon secondary sourcing (as this article does) to keep things flowing. A potential upside? Faced with robotic competition for straight news, publishers may be forced to shift resources back toward original reporting and in-depth analysis. On its own, all the snark and first-person (often ill-conceived) analysis can’t be worth that much.

Or perhaps we’ll all just be out of jobs. The future is unwritten; here’s hoping bots aren’t the only ones writing it.

Lead image by Mirko Tobias Schäfer via Wikimedia Commons