How a Hacker Used Python to Extract Level Data From ‘Super Mario Bros.’

Hacker Matthew Earl used Python to extract raw visual data from Nintendo’s code.
Image: Matthew Earl

Programmer Matthew Earl needed level data from the original Super Mario Bros. for an upcoming project, so he decided to get that data the hard way. Earl wanted the background imagery for each distinct level—everything except the moving sprites and the HUD elements, such as the life total and coins.

There are many easy and well-worn ways to get that data, and people have been messing with the sprites from Super Mario Bros. for decades. As first spotted by Hackaday, Earl went the long way around. Instead of pulling the rendered assets out of the game, Earl dug into the source code itself and used an emulator in Python to extract the raw assets from the game and render them himself.

Nintendo has never released the official source code for the NES or any of its games, but industrious hackers have reverse engineered the code by pulling it apart and putting it back together on their own. Earl ran this code through a Python library called py65emu, which emulates the 6502 processor at the heart of the NES. From there, he built a program that intercepts the visual data on its way from memory to the picture processing unit and renders it using Python. “And with that, we have managed to extract level imagery from SMB, purely in Python,” Earl wrote.
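To give a rough sense of what "intercepting" that data can look like, here is a minimal sketch. It is not Earl's code and does not use py65emu's API; the class and attribute names are hypothetical, invented for illustration. It assumes only the standard NES register layout, in which the CPU sets a video memory address by writing twice to $2006 and then sends tile bytes through $2007. Real hardware also mirrors these registers and resets the address latch when $2002 is read, which this sketch leaves out.

```python
# Illustrative sketch only: hypothetical names, not Earl's code or py65emu's API.

PPUCTRL, PPUADDR, PPUDATA = 0x2000, 0x2006, 0x2007  # PPU registers as the CPU sees them


class InterceptingBus:
    """CPU-side memory that records every byte headed for the PPU."""

    def __init__(self):
        self.ram = bytearray(0x10000)  # flat 64 KiB CPU address space
        self.vram = {}                 # captured writes: PPU address -> byte
        self.vram_addr = 0
        self.high_byte_next = True     # PPUADDR expects the high byte first
        self.increment = 1

    def write(self, addr, value):
        value &= 0xFF
        if addr == PPUCTRL:
            # Bit 2 selects whether each PPUDATA write advances by 1 or by 32.
            self.increment = 32 if value & 0x04 else 1
        elif addr == PPUADDR:
            if self.high_byte_next:
                self.vram_addr = ((value & 0x3F) << 8) | (self.vram_addr & 0x00FF)
            else:
                self.vram_addr = (self.vram_addr & 0xFF00) | value
            self.high_byte_next = not self.high_byte_next
        elif addr == PPUDATA:
            # This is the interception point: keep a copy of the tile byte.
            self.vram[self.vram_addr] = value
            self.vram_addr = (self.vram_addr + self.increment) & 0x3FFF
        else:
            self.ram[addr] = value


# Stand-in for what emulated game code would do: point at $2000, send three bytes.
bus = InterceptingBus()
bus.write(PPUADDR, 0x20)
bus.write(PPUADDR, 0x00)
for tile in (0x24, 0x25, 0x26):
    bus.write(PPUDATA, tile)
```

In a real setup, the emulated processor would make these writes as it runs the game's code, and the captured bytes would then be matched against the game's tile graphics to draw the level.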

It seems like a labor-intensive process, but Earl’s GitHub post breaks it down, and his work gives us a rare window into the inner workings of one of the most popular video games of all time.