If someone gave you an untitled recipe and told you to read the instructions and draw a picture of the finished dish, your success might depend on how much you cook. If that’s a lot, you’d probably get what the food is supposed to look like, even if you’re held back by your drawing skills. If you don’t cook, though, the steps of turning a chunk of cauliflower into the base of a “pizza,” for example, might result in some incoherent scribbles.
As the present draws closer to a Blade Runner-esque future, it shouldn’t surprise you to learn that a computer can do the same task now, too, given that screens can bypass cashiers at McDonald’s and robots in Boston can assemble salads. The results are a little uncanny, but overall, surprisingly legit.
The study was done by researchers at Israel’s Tel Aviv University, who published “GILT: Generating Images from Long Text” last month. If you don’t have a background in computer science, the details of the paper might be overwhelming, but lead researcher Ori Bar El described the project to MUNCHIES in simpler terms.
The researchers fed the algorithm over 50,000 recipes and associated images. Then, the algorithm turned the text and images from each into unique numerical representations. Using that information, the algorithm can make a new image by trying to match the numerical representation of a specific recipe. The project relies on a huge dataset, which the TAU scientists got through similar research from last year that scraped over 1 million recipes and 13 million food pictures from dozens of recipe sites.
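The pipeline described above can be sketched in a toy form: turn recipe text into a fixed-length numerical representation, then have a generator map that representation to pixels. This is NOT the actual GILT model (which uses learned neural networks trained on the recipe dataset); the encoder and generator here are deterministic stand-ins, and all names and sizes are invented for illustration.

```python
# Toy sketch of text-conditioned image generation, assuming:
# - a text encoder that maps a recipe to a fixed-length vector, and
# - a generator that maps that vector to pixel values.
# Both are stand-ins here; GILT learns these mappings from data.
import hashlib

import numpy as np

EMB_DIM = 16   # embedding size (arbitrary for this sketch)
IMG_SIDE = 8   # tiny 8x8 grayscale "image"

def embed_text(recipe: str) -> np.ndarray:
    """Deterministic stand-in for a learned text encoder."""
    digest = hashlib.sha256(recipe.encode("utf-8")).digest()
    # Turn the first EMB_DIM bytes into floats in [0, 1].
    return np.frombuffer(digest[:EMB_DIM], dtype=np.uint8) / 255.0

def generate_image(embedding: np.ndarray, seed: int = 0) -> np.ndarray:
    """Stand-in generator: a fixed random projection from the
    embedding to pixel space, squashed to [0, 1] grayscale."""
    rng = np.random.default_rng(seed)
    projection = rng.standard_normal((IMG_SIDE * IMG_SIDE, EMB_DIM))
    pixels = projection @ embedding
    pixels = 1.0 / (1.0 + np.exp(-pixels))  # sigmoid into [0, 1]
    return pixels.reshape(IMG_SIDE, IMG_SIDE)

recipe = "Brown minced meat, add tomatoes, simmer, toss with pasta."
img = generate_image(embed_text(recipe))
print(img.shape)  # (8, 8)
```

The key property this sketch preserves is that the same recipe text always yields the same image, while different text yields a different one; the real model additionally learns, from 50,000-plus recipe-image pairs, to make that image look like the dish.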
To make the task a genuine test, the researchers withheld each recipe's title from the algorithm. “If you have a recipe that says pasta Bolognese, you know the image is going to be of pasta Bolognese,” Bar El told MUNCHIES. “If you see a recipe that says take minced meat, pasta, you’re not sure if it’s going to be a pasta Bolognese or a lasagna.”
Beyond separating the ingredients and instructions into two sections, the recipes weren’t standardized or simplified, Bar El said; the algorithm was trained on both easy recipes and difficult ones. “The recipes were pretty complex, some of them consisting of thirty lines,” he said.
The results look, for the most part, overwhelmingly edible, even if they don’t make up the prettiest plate. They might not earn the “#foodporn” moniker, but I’ve definitely seen worse, like Martha Stewart’s old school Instagrams or that one person in your feed who still thinks the Amaro filter is a worthy photo treatment.
According to Bar El, the algorithm generates “porridge-like” foods well, but it seems to struggle with foods that have a defined shape, like burgers or chicken. He hopes to continue the recipe training. “It’s like humans: If you show me one recipe or two, it’s harder for me to understand what food is described, but if you show me thousands of images, it’s much easier,” he said.
And though the team trained the algorithm on food for this study, Bar El sees an immediate application in the world of children’s books or posters, since the study, he said, is proof that the algorithm can generate images from any long string of text.
The larger implication, Bar El told MUNCHIES, is that the research proves that computers might be more capable than we thought. “I believe that this is a good test for the capacity of abstraction that computers have and their sense of what we call imagination,” he said. “It’s more of a test of how human a computer can be. This task, as compared to other tasks in artificial intelligence, is hard also for humans.”
So it’s clear: Computers are taking over the world, one blurry food picture at a time.