How To Keep Stack Overflow from Turning Into a Landfill

Why the site is in decline and how to save it, according to IT researchers.

Jun 25 2016, 2:00pm

Image: Huguette Roe/Shutterstock

The bare existence of Stack Overflow is something to behold. In defiance of literally every other comment ecosystem, it is intensely productive and on-topic. If trolling exists within this software engineering Q&A Library of Babel, I've yet to encounter it. Instead, there exist dense stacks of usefulness: solutions, explanations, code. So much code. Also: brutal condescension, which is ultimately just the price of entry.

Stack Overflow is considered to be one of the most successful question and answer sites ever. 92 percent of questions posted about expert topics get answered, with a median time-to-answer of 11 minutes. Which is bonkers. It's fast enough to nearly obviate conventional web searching, as evidenced by an ever-increasing abundance of repeated questions. Why search when someone chasing upvotes is assuredly going to provide a brand new answer just for you in about the same amount of time?

As robust as it may seem, the continued success of Stack Overflow isn't guaranteed. Indeed, since 2014, the site has been in decline, as noted in a pre-print (paywalled) paper published this week in IEEE Software. The number of new site users is falling, while question/answer failure rates are increasing. In 2014, for example, the proportion of questions that were either deleted or went unanswered rose to nearly 40 percent. In 2011, it was about 22 percent. In the same period, the proportion of active users (those who posted an average of one question and one answer per month) fell from 15 percent to 5 percent. The site is now overwhelmingly lurkers. (Lurking, it should be noted, is highly reasonable given that SO functions in many cases as de facto technical documentation.)
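To make the failure-rate figure concrete: it is the share of questions that were either deleted or never answered. Here is a minimal Python sketch of that arithmetic. The function name and the per-1,000 counts are made up for illustration (they are not the paper's data) and were chosen only so the results land on the roughly 22 and 40 percent figures quoted above.

```python
def failure_rate(deleted: int, unanswered: int, total: int) -> float:
    """Share of questions that were deleted or went unanswered.

    Assumes the two groups do not overlap (a question is counted as
    deleted or as unanswered, not both).
    """
    if total <= 0:
        raise ValueError("total must be positive")
    return (deleted + unanswered) / total


# Hypothetical counts per 1,000 questions, for illustration only.
rate_2011 = failure_rate(deleted=120, unanswered=100, total=1000)
rate_2014 = failure_rate(deleted=220, unanswered=180, total=1000)

print(f"2011: {rate_2011:.0%}, 2014: {rate_2014:.0%}")
```

How the split between deleted and unanswered questions actually looks is not something the article reports; only the combined share matters for the comparison.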

The IEEE paper is interested in Stack Overflow as a case study in how to ensure sustainability in a community question answering (CQA) site, generally. Stack Overflow is far from doomed, but it may need to change.

"Besides the aggregate numbers, direct feedback from community members in various Internet discussions and blogs points out the emerging problems that threaten Stack Overflow's long-term sustainability," the authors, Ivan Srba and Maria Bielikova of Slovak University of Technology, write.

Image: Bielikova and Srba

In posts appearing within the Meta Stack Overflow portion of the site, where questions are asked regarding the site itself, complaints about content quality have appeared in significant numbers since 2014, according to the paper. Here, site users have identified three primary groups of users that can be singled out as having contributed to the site's overall decline. For anyone who's spent much time on Stack Overflow, these types should be familiar enough. Because they are annoying.

First, there are the "help vampires." When I noted above that in many cases it may be easier to just ask a new question rather than search for an answer given the community's amazingly quick response times, this is who actually does that. The result is tedious, often duplicated content. "Help vampires are interested only in getting answers to their questions; they don't return the help they've received back to the community," Bielikova and Srba write.

Next are the noobs. I know the noobs because I get emails from the noobs every time I write a how-to about programming. Noobs are characterized by asking only the most basic-ass shit and seem to have no willingness or interest in learning anything. Don't teach me how to do something, tell me how to do something. I don't know how people like this get through their day as functioning adults.

Finally, we have the reputation collectors. These are users that go out and try to answer as many questions as possible as quickly as possible in an effort to gain votes and, thus, to increase their site reputations. Given that some employers are now asking potential developer employees for Stack Overflow links, this seems a natural consequence. Reputation collectors can frequently be found reanswering answered questions with almost or even completely identical answers in the hope that someone will see theirs and upvote it.

"Fortunately, the Stack Overflow community also contains caretakers—experts who want to keep the system clean with valuable content," the paper explains. "Caretakers regularly search for interesting questions and provide good answers. Their presence is essential, and motivating them to stay active and devoted to the community is important."

That's the qualitative stuff. The paper next did a quantitative analysis based on Stack Overflow's open dataset, finding that in general good-quality questions and answers (based on upvote numbers) have been in decline. While the total number of good questions has remained stable, the proportion of good questions has gone down owing to an overall increase in total questions. This indicates a stable core, but also that the site's growth consists largely of garbage.
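The arithmetic behind that last point is simple: hold the number of good questions steady, let the total grow, and the good share shrinks. A rough Python sketch follows, with entirely hypothetical yearly totals (not figures from the paper or the Stack Overflow dataset):

```python
# Hypothetical: the count of well-upvoted questions stays flat each year.
GOOD_QUESTIONS_PER_YEAR = 50_000


def good_share(good: int, total: int) -> float:
    """Fraction of all questions in a year that count as good."""
    return good / total


# Hypothetical totals that grow year over year.
totals = {2011: 200_000, 2012: 300_000, 2013: 400_000, 2014: 500_000}

shares = {
    year: good_share(GOOD_QUESTIONS_PER_YEAR, total)
    for year, total in totals.items()
}

for year, share in shares.items():
    print(f"{year}: {share:.1%} of questions are good")
```

In this toy version every question added after the first year is, by construction, not a good one, which is the "growth consists largely of garbage" reading in miniature.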

"This finding confirms our hypothesis as well as the community perception that the system was flooded by content that nobody cared about, while really interesting content was getting rarer," Bielikova and Srba write. Brutal.

As far as fixing things, the paper offers two main solutions. The first is to shift the site's focus from being "asker-oriented" to "answerer-oriented." The basic idea is that instead of routing every question about a particular topic to potential answerers, there should be some diversification or retro-filtering to ensure that variations on the same question or super-easy questions aren't bombarding experts.

The second solution is also a bit vague, suggesting that CQA sites can benefit by involving whole communities rather than just small pools of experts. If top-quality experts are ignoring unappealing questions, leading to higher failure rates, then maybe those questions should somehow be sprayed out at the larger Stack Overflow populace.

In any case, I thought the paper was more interesting for the problems it highlights than the solutions. Stack Overflow's talent for self-regulation is really, truly impressive and, sure, results in some dickishness, but asking bad or lazy questions is its own kind of dickishness. It would be a shame to see it erode further.