The Sexist Trolls Doubting Black Hole Researcher Katie Bouman Need to Learn to Code
The harassment campaign against one of the researchers who helped photograph a black hole isn't just sexist, it's technically simplistic and inaccurate.
Last week, fans of cool astronomical phenomena (read: almost everyone) rejoiced as an international team of scientists released the first ever image of a black hole. For the astrophysicists, software engineers, philosophers, and mathematicians who worked on the Event Horizon Telescope that captured the image, the announcement was an unprecedented milestone.
Their excitement was perhaps best embodied by a photo of one computer scientist on the Event Horizon Telescope team, Katie Bouman, who hid her beaming smile with her hands as she looked at the monumental rendering. Bouman had a lot to smile about: she made crucial contributions to the computer imaging involved in the black hole project, even giving a TED Talk about her team's findings in 2016.
But within a day of the announcement, online harassers created fake Instagram accounts for Bouman, started angry threads on Reddit and Hacker News asserting that she hadn’t done as much to help the project as she was getting credit for, and produced lengthy YouTube tirades, all with the aim of discrediting her contributions to the project.
In computer science, the supposed inferiority of women is unfortunately a fairly common belief, one that has at times been expressed by software engineers at major tech firms. But here, rather than use crackpot psychology, these trolls used bad-faith interpretations of public GitHub repositories, which programmers use to host source code for the purposes of collaboration and contribution, in an effort to legitimize their vitriol.
The main contention uniting these attacks was ostensibly rooted in data: According to one GitHub repository related to the project, astrophysicist Andrew Chael wrote “850,000 of the 900,000 lines of code...in the historic black-hole image algorithm.” But this is a laughably simplistic assessment for a software library that was built with and released under open source licenses, which allow anyone in the world to use and adapt the code without much consideration for the number of lines of code shared along the way.
Microsoft software engineer Safia Abdalla, a maintainer on the open source interactive computing library nteract, told me via Twitter DM that one of the core tenets of open source is a community-focused approach to code.
“A lot of interesting innovations have come out of open source because innovative and interesting technical ideas thrive in open spaces,” she said. “Any open space needs to have a healthy community to foster collaboration and execution on ideas.”
Abdalla noted that open source programming thrives on a variety of contributions that never show up on a GitHub repository or in the code itself, like planning releases, reviewing the work of fellow contributors, and writing documentation. In fact, lack of collaborative participation is one of the major threats to many invaluable open source libraries. She said the “lines of code” critique is reductive: “One person can add 20 lines of code to a project that add relatively little value, but another person might add a single line of code that resolves a major bug. You really have to look at open source with more nuance than lines of code.”
For programmers with even a basic knowledge of software development, it should seem obvious that the number of lines one commits to GitHub hardly correlates with success as an engineer. On principle, this kind of assessment is completely detached from the reality of software development.
Judging someone’s skill by their volume of code violates core principles of software development, like “D.R.Y.,” or “Don’t Repeat Yourself,” a philosophy that values reusable software over redundant code. But beyond technical critiques of “good” and “bad” code, such judgements simplify years of collaborative brainstorming down to a few lines on GitHub. For those most intimately involved with the Event Horizon Telescope, this couldn’t be further from the truth.
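To make the “Don’t Repeat Yourself” point concrete, here is a minimal Python sketch. It is purely illustrative: the function names are hypothetical and nothing here is taken from the eht-imaging codebase. The first pair of functions repeats the same summing loop twice, while the second pair reuses a single helper. Both pairs compute the same results, yet the repetitive version would "win" any contest scored by line count.

```python
# Hypothetical illustration only; none of these names come from eht-imaging.

# Repetitive style: the averaging loop is written out twice.
def mean_verbose(values):
    total = 0.0
    count = 0
    for v in values:
        total = total + v
        count = count + 1
    return total / count


def center_verbose(values):
    total = 0.0
    count = 0
    for v in values:
        total = total + v
        count = count + 1
    average = total / count
    return [v - average for v in values]


# D.R.Y. style: the averaging logic lives in one place and is reused.
def mean(values):
    return sum(values) / len(values)


def center(values):
    average = mean(values)
    return [v - average for v in values]


if __name__ == "__main__":
    data = [1.0, 2.0, 3.0]
    # Both styles give the same answer despite very different line counts.
    print(center_verbose(data) == center(data))
```

The shorter version is also the one that is easier to maintain, since a fix to the averaging logic only has to be made once.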
Andrew Chael, whom those harassing Bouman wanted to herald as the “real” hero of the black hole image, quickly jumped in to defend Bouman on Twitter, reinforcing the level of teamwork that went into the project. “While I wrote much of the code for one of these pipelines, Katie was a huge contributor to the software; it would have never worked without her contributions,” he said. “This was a team effort including contributions from many junior scientists, including many women junior scientists. Together, we all make each other's work better; the number of commits doesn't tell the full story of who was indispensable.”
Astrophysicist Michael D. Johnson, who worked with Chael and Bouman on the Event Horizon Telescope, also found these attempts to assign individualized credit to be woefully removed from the reality of collaboration on the Event Horizon Telescope.
“Certainly, we never gauged each other's contributions by lines of code,” he told me on the phone. “We might spend six hours hashing out some really difficult problem together... and at the end of the day, there might be five lines of code that were changed by one person.”
“All of this work is so tightly interlocked. Trying to assess it based on code here or there is just so profoundly misguided,” he added.
And it was not just brain power that was indelibly interwoven in the development of the Event Horizon Telescope; like many codebases, the “eht-imaging” repository at the center of this debate is indebted to open source technologies, starting with Python, the language for the project. Johnson said working with open source technologies like Git and Python “just opens up so many scientific opportunities that would never be possible in a language like C, where it would take years of development from professional developers to do the same sort of thing.”
If not for Python modules like numpy and matplotlib, community libraries that provide tools for scientific computing and plotting, days of work could have instead taken “months or even years of effort,” said Johnson.
Of course, these scientists used these technologies to build proprietary algorithms to solve dense, complex problems pertaining to black hole computer imaging; using open source libraries certainly does not discredit years of research and development. But seeking to credit one person for the development of an open source technology is not only massively simplistic from a technical perspective (lest we count the millions of lines of code written in the numpy library alone), but antithetical to the open source community’s goals and ambitions.
In scientific fields that thrive on data, sexism can pass as legitimate when couched in the language of cold, unfeeling numbers and percentage points. But anyone with even a basic understanding of modern computer science should quickly realize how dangerous and plainly wrong these trolls are when they weaponize metadata from public GitHub repositories.
Though most in the scientific community were quick to dismiss these critiques and instead acknowledge the Event Horizon Telescope team’s monumental scientific achievements, thousands of people flocked to YouTube, Reddit, and Hacker News to amplify this false narrative. Reducing contributions to individual lines of code threatens to derail the ethos of open source, a community that, at its best, is a symbol of the free exchange of ideas and information the internet promised to achieve.
Clarification: This article has been updated to remove a reference to an algorithm that was not used in the initial imaging of the black hole.