A Code Glitch May Have Caused Errors In More Than 100 Published Studies

The discovery is a reminder that science is collaborative and ideally self-correcting, but that nothing can be taken for granted.

Scientists in Hawaiʻi have uncovered a glitch in a piece of code that could have yielded incorrect results in over 100 published studies that cited the original paper.

The glitch caused results of a common chemistry computation to vary depending on the operating system used, causing discrepancies among Mac, Windows, and Linux systems. The researchers published the revelation and a debugged version of the script, which amounts to roughly 1,000 lines of code, on Tuesday in the journal Organic Letters.

“This simple glitch in the original script calls into question the conclusions of a significant number of papers on a wide range of topics in a way that cannot be easily resolved from published information because the operating system is rarely mentioned,” the new paper reads. “Authors who used these scripts should certainly double-check their results and any relevant conclusions using the modified scripts in the [supplementary information].”

Yuheng Luo, a graduate student at the University of Hawaiʻi at Mānoa, discovered the glitch this summer when he was verifying the results of research conducted by chemistry professor Philip Williams on cyanobacteria. The aim of the project was to “try to find compounds that are effective against cancer,” Williams said.

Under the supervision of University of Hawaiʻi at Mānoa assistant chemistry professor Rui Sun, Luo used a script written in Python that was published as part of a 2014 paper by Patrick Willoughby, Matthew Jansma, and Thomas Hoye in the journal Nature Protocols. The code computes chemical shift values for NMR, or nuclear magnetic resonance spectroscopy, a common technique used by chemists to determine the molecular make-up of a sample.

Luo’s results did not match up with the NMR values that Williams’ group had previously calculated, and according to Sun, when his students ran the code on their computers, they realized that different operating systems were producing different results. Sun then adjusted the code to fix the glitch, which had to do with how different operating systems sort files.
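To illustrate the kind of problem described above, here is a minimal sketch in Python. It is not the published script. It assumes, for illustration only, that a script gathers its input files with `glob.glob`, which returns paths in whatever order the operating system lists them. The function names and the `*.out` pattern are hypothetical.

```python
import glob
import os


def list_files_unordered(directory, pattern="*.out"):
    # glob.glob returns paths in the order the operating system lists
    # them, which is not guaranteed to be alphabetical and can differ
    # between platforms and file systems.
    return glob.glob(os.path.join(directory, pattern))


def list_files_sorted(directory, pattern="*.out"):
    # Sorting explicitly makes the order the same on every platform,
    # so any later step that depends on file order behaves consistently.
    return sorted(glob.glob(os.path.join(directory, pattern)))
```

If a later step pairs files by their position in the list, only the sorted version gives the same pairing on every machine.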

Willoughby, the first author of the 2014 study who wrote the script, called the new study “a beautiful example of science working to advance the work we reported in 2014.”

“They did a tremendous service to the community in figuring this out,” he said.

Williams said that the original study is an elegantly written paper that is “incredibly useful to a large group of people, and the error is very subtle.” Nevertheless, the researchers believe that the glitch could have produced serious downstream effects.

For example, if the code led Williams to wrongly identify the contents of his sample, chemists trying to recreate the molecule to test as a potential cancer drug would be chasing after the wrong compound, Williams said.

It is unclear how many papers this glitch may have affected. Researchers do not typically disclose the operating system they use for their analyses since it should be irrelevant, Williams said. According to metrics from Nature Protocols, the 2014 paper has been accessed nearly 1,900 times and cited in 158 other studies. However, not every study that cited the paper necessarily used the script.

Rob Keyzers, a chemistry lecturer at Victoria University of Wellington in New Zealand who had cited the protocol in a study published this year, said in an email that he was not aware of the glitch. He added that he was not “unduly worried” about his results since his group did not use the script containing the glitch. “I will certainly check our data carefully to make sure that we are not making any undue claims though,” he said.

In late July, Williams and Sun reached out to the authors of the original paper, alerting them to the glitch. Williams said he hoped that by working together, they could bring the problem to the attention of researchers who had used the code.

“I personally haven’t had the opportunity to go to the original authors [of a study] and announce a bug, so I wasn’t sure what to expect,” Sun said, adding that the authors were “very gracious” and encouraged him and Williams to publish their findings.

“The process of science here worked exactly as it’s supposed to,” Williams said.

Process chemist Lucas Moore, who tweeted about the study, said in an email that “news of a bug like this is something that we in the scientific community tend to spread far and wide–we really want to get it right.”

Willoughby and Hoye plan to update the Nature Protocols study acknowledging the glitch and providing Sun’s fixed code, according to correspondence from Hoye quoted in the new paper.

In a statement, a spokesperson for Nature Protocols said that they are looking into the issues raised in the new study, but for confidentiality reasons cannot comment on individual cases.

The incident is a reminder that science is collaborative and ideally self-correcting, but that nothing can be taken for granted.

“This is a very small and subtle error,” Williams said. “We all kind of assume that a computer program always spits out the correct answer.”