Researchers Hid Malware Inside an AI’s ‘Neurons’ and It Worked Scarily Well

In a proof-of-concept, researchers reported they could embed malware in up to half of an AI model’s nodes and still obtain very high accuracy.
July 22, 2021, 2:42pm
Artist's conception of an AI computer vision model. Image: AerialPerspective Images via Getty Images

Neural networks could be the next frontier for malware campaigns as they become more widely used, according to a new study. 

The study, which was posted to the arXiv preprint server on Monday, found that malware can be embedded directly into the artificial neurons that make up machine-learning models in a way that keeps it from being detected. The neural network would even be able to continue performing its set tasks normally.

“As neural networks become more widely used, this method will be universal in delivering malware in the future,” the authors, from the University of the Chinese Academy of Sciences, write. 

In experiments with real malware samples, the researchers found that replacing up to around 50 percent of the neurons in the AlexNet model, a benchmark-setting classic in the AI field, with malware still kept the model’s accuracy rate above 93.1 percent. The authors concluded that, using a technique called steganography, up to 36.9MB of malware can be embedded into the structure of a 178MB AlexNet model without being detected. Some of the models were tested against 58 common antivirus systems, and the malware was not detected.

Other methods of hacking into businesses or organizations, such as attaching malware to documents or files, often cannot deliver malicious software en masse without being detected. The new research, on the other hand, envisions a future where an organization may bring in an off-the-shelf machine learning model for any given task (say, a chat bot, or image detection) that could be loaded with malware while performing its task well enough not to arouse suspicion. 

According to the study, this is because AlexNet (like many machine learning models) is made up of millions of parameters and many complex layers of neurons, including what are known as fully-connected "hidden" layers. The researchers found that, with the huge hidden layers in AlexNet kept completely intact, changing some other neurons had little effect on performance.

In the paper, the authors lay out a playbook for how a hacker might design a malware-loaded machine learning model and have it spread in the wild:

"First, the attacker needs to design the neural network. To ensure more malware can be embedded, the attacker can introduce more neurons. Then the attacker needs to train the network with the prepared dataset to get a well-performed model. If there are suitable well-trained models, the attacker can choose to use the existing models. After that, the attacker selects the best layer and embeds the malware. After embedding malware, the attacker needs to evaluate the model’s performance to ensure the loss is acceptable. If the loss on the model is beyond an acceptable range, the attacker needs to retrain the model with the dataset to gain higher performance. Once the model is prepared, the attacker can publish it on public repositories or other places using methods like supply chain pollution, etc."

According to the paper, in this approach the malware is "disassembled" when embedded into the network's neurons, and assembled into functioning malware by a malicious receiver program that can also be used to download the poisoned model via an update. The malware can still be stopped if the target device verifies the model before launching it, according to the paper. It can also be detected using “traditional methods” like static and dynamic analysis.
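As a rough illustration of what verifying a model before launching it could look like, here is a minimal Python sketch that compares a model file's SHA-256 digest against a value the publisher is assumed to have shared over a trusted channel. The paper does not spell out a specific verification mechanism; the function names and the idea of a published digest are assumptions made for this example.

```python
import hashlib
import hmac


def sha256_of_file(path, chunk_size=1 << 20):
    """Return the hex SHA-256 digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        # Read in fixed-size chunks so large model files are not
        # loaded into memory all at once.
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_model(path, expected_sha256):
    """Check a model file against a trusted digest (hypothetical workflow).

    Returns True only if the file's digest matches expected_sha256,
    which is assumed to come from the model's publisher.
    """
    actual = sha256_of_file(path)
    expected = expected_sha256.strip().lower()
    # compare_digest avoids leaking where the two strings differ.
    return hmac.compare_digest(actual, expected)
```

A check like this only catches a model that was altered after the trusted digest was published. It would not help if the attacker is the one publishing the model, which is the supply-chain scenario the paper describes.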

“Today it would not be simple to detect it by antivirus software, but this is only because nobody is looking in there,” cybersecurity researcher and consultant Dr. Lukasz Olejnik told Motherboard. 

Olejnik also warned that the malware extraction step could itself risk detection. Once the malware hidden in the model was compiled into, well, malware, it could be picked up. The whole approach also might just be overkill.

"But it's also a problem because custom methods to extract malware from the [deep neural network] model means that the targeted systems may already be under attacker control," he said. "But if the target hosts are already under attacker control, there's a reduced need to hide extra malware."

"While this is legitimate and good research, I do not think that hiding whole malware in the DNN model offers much to the attacker," he added.

The researchers noted in the study that they hoped that this could “provide a referenceable scenario for the defense on neural network-assisted attacks.” They did not return Motherboard’s request for comment.

This isn't the first time that researchers have looked into how neural networks can be exploited by malicious actors, such as with images designed to confuse them or by embedding backdoors that would cause models to misbehave. If neural networks really are the future of hacking, this could become a new threat to large companies as malware campaigns increase. 

“With the popularity of AI, AI-assisted attacks will emerge and bring new challenges for computer security. Network attack and defense are interdependent,” the paper notes. “We hope the proposed scenario will contribute to future protection efforts.”