As a species, we have taught computers to do some amazing and horrible things. But nothing summarizes both of these facets quite like a machine-learning-generated snippet of Kanye West rapping Eminem's "Lose Yourself" with what sounds like a mouthful of stockpiled quarantine Nutella.
This is just one example of the thousands of cursed yet compelling song snippets generated by Jukebox, machine learning software developed by independent research organization OpenAI and released to the world on Thursday. The fine details (which you can read in an accompanying paper) are complicated, but the general idea is that the researchers trained machine learning models capable of parsing music, using audio from more than 1 million songs pulled from the web. From the resulting fuzzy internal picture of what constitutes listenable music, Jukebox generates new songs in various genres and in the style of specific artists. The final product includes AI-generated music and vocals.
OpenAI highlighted a few of the best products in its blog post announcing Jukebox, which include an Elvis Presley-esque song that sounds like a sleep paralysis aural hallucination and a country song in the style of Alan Jackson that is really not all that bad. While even the best examples sound like low-bitrate MP3 rips thrown in a blender with cough syrup, they do pretty much sound like music! It appears the lyrics in the highlighted examples may have had some help, though, since they are credited to both the machine learning program GPT-2 and researchers.
This isn't the first time that someone has tried generating music with AI, or synthesizing celebrity soundalike voices, though earlier efforts have used different approaches. A YouTube channel called Vocal Synthesis recently ran into copyright trouble with Jay-Z for allegedly "impersonating" his voice with machine learning.
One of the cooler aspects of Jukebox is hearing what it picks up on in singers' voices without being too exact. None of the voices in the generated examples are perfect reproductions, but the program has clearly picked up on Celine Dion's characteristic vibrato, for example.
One highlighted song is listed as being "in the style of Rage" (presumably, Against the Machine), yet the generated voice has somehow picked up some distinctive quirks from Metallica's James Hetfield.
Clearly, the whole enterprise of generating music with computers has a ways to go. For one thing, it takes nearly nine hours to generate one minute of audio. But if you don't listen too closely, you can almost hear the sounds of tomorrow's AI SoundCloud hits.
Correction: An earlier version of this article stated that Jukebox generates lyrics, when in fact the lyrics were generated by a separate OpenAI model, GPT-2. Motherboard regrets the error.