Facebook Pulls Its New ‘AI For Science’ Because It’s Broken and Terrible

The demo of Galactica generated fake research and filtered out searches for ‘racism’ and ‘AIDS.’
Janus Rose
New York, US
A screenshot of the Galactica website with the title "Our Mission: Organize Science"
Image: Meta / Papers with Code

Facebook parent company Meta has pulled the public demo for its “scientific knowledge” AI model after academics showed it was generating fake and misleading information while filtering out entire categories of research.

Meta released Galactica earlier this week, describing it as an AI language model that “can store, combine and reason about scientific knowledge”: summarizing research papers, solving equations, and doing a range of other useful sciencey tasks. But scientists and academics quickly discovered that the system was generating a shocking amount of misinformation, including citing real authors for research papers that don’t exist.


“In all cases, it was wrong or biased but sounded right and authoritative,” Michael Black, the director of the Max Planck Institute for Intelligent Systems, wrote in a thread on Twitter after using the tool. “I think it's dangerous.”

Black’s thread captures a variety of cases where Galactica generated scientific texts that are misleading or just plain wrong. In several examples, the AI generates articles that are authoritative-sounding and believable, but not backed up by actual scientific research. In some cases, the citations even include the names of real authors, but link to non-existent GitHub repositories and research papers.

Others pointed out that Galactica was not returning results for a wide range of research topics, likely because of the AI’s automated filters. Willie Agnew, a computer science researcher at the University of Washington, noted that queries like “queer theory,” “racism,” and “AIDS” all returned no results.

Early Thursday morning, Meta took down the demo for Galactica. When reached for comment, the company directed Motherboard to a statement it had released via Papers with Code, the project responsible for the system.

“We appreciate the feedback we have received so far from the community, and have paused the demo for now,” the company wrote on Twitter. “Our models are available for researchers who want to learn more about the work and reproduce results in the paper.”


Some Meta employees also weighed in, implying the demo was removed in response to the criticism.

“Galactica demo is off line for now,” tweeted Yann LeCun, Meta’s chief AI scientist. “It’s no longer possible to have some fun by casually misusing it. Happy?”

It isn’t the first time Facebook has had to explain itself after releasing a horrifyingly biased AI. In August, the company released a demo for a chatbot called BlenderBot, which made “offensive and untrue” statements as it meandered through weirdly unnatural conversations. The company has also released a large language model called OPT-175B, which researchers admitted had a “high propensity” for racism and bias, much like similar systems such as OpenAI’s GPT-3.

Galactica is also a large language model, which is a type of machine learning model known for generating exceptionally believable text that feels like it was written by humans. While the results of these systems are often impressive, Galactica is another example of how the ability to produce believable human language doesn’t mean the system actually understands its contents. Some researchers have questioned whether large language models should be used to make any decisions at all, pointing out that their mind-numbing complexity makes it virtually impossible for scientists to audit them, or even explain how they work.

This is obviously a massive problem, especially when it comes to scientific research. Scientific papers are grounded in rigorous methodologies that text-generating AI systems clearly can’t comprehend—at least, not yet. Black is understandably worried about the consequences of releasing a system like Galactica, which he says “could usher in an era of deep scientific fakes.”

“It offers authoritative-sounding science that isn't grounded in the scientific method,” Black wrote in the Twitter thread. “It produces pseudo-science based on statistical properties of science *writing*. Grammatical science writing is not the same as doing science. But it will be hard to distinguish.”