Uncovering Fraudulent Research Could Be As Easy As Looking At Language

Flowery language may actually be the best way to tell if a study is complete bullshit.
November 8, 2014, 1:00pm

As someone who writes about new developments in science and technology, I read a ton of studies every day. I can tell you from experience that they're usually pretty dry on the language front. Clever turns of phrase, exclamatory remarks, and brilliant metaphors don't usually find their way into scientific papers, for a good reason: the science needs to speak for itself.

According to a study by researchers at Cornell University, published in PLOS One earlier this year, flowery language may actually be the best way to tell if a study is complete bullshit.

David Markowitz, a PhD candidate at Cornell's Department of Communication, and his colleague Jeffrey Hancock analyzed the linguistic patterns of legit and fraudulent papers first-authored by disgraced Dutch professor of social psychology Diederik Stapel to find out if changes in how he employed language correlated with his deceit.

Stapel authored over 120 papers throughout his career on tantalizing media-bait topics like the social dynamics of selfishness and stereotyping. In 2011, he was found guilty of scientific fraud due to basing his studies on fabricated data. Subsequent investigations found that 55 of his published papers were total bunk—and it appears he left a trail of clues in the language he chose.

Previous research has shown that patterns of language can be good indicators of deceit in fields outside of science. A 2003 study that asked participants to write untruthful statements on topics like abortion legislation found that liars tended to use fewer adjectives while playing up the affective dimension of their argument with positive superlatives.

Other studies have found that liars have trouble approximating the proper amount of genre-specific terms to use in their writing, as was the case in a study analyzing fake hotel reviews, whose authors' underuse of spatial detail flagged their reviews as fraudulent.

Markowitz and Hancock discovered that Stapel's fraudulent papers conformed to these findings in non-scientific contexts.

"Stapel also wrote with more certainty when describing his fake data, using nearly one-third more certainty terms than he did in the genuine articles," the authors wrote. "Words such as 'profoundly,' 'extremely,' and 'considerably' frame the findings as having a substantial and dramatic impact."

Additionally, they discovered that Stapel overused scientific jargon (what the researchers deemed to be genre-specific), indicating that he had trouble estimating the right amount to use in order to make his accounts seem truthful. He also used far fewer adjectives in his fraudulent papers than in his truthful ones.
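The comparison at the heart of this approach is simple word-counting: tally how often terms from a chosen dictionary appear in a text, normalized by its length. Here's a minimal sketch of that idea in Python. The certainty terms are the three examples quoted from the study; the sample sentences and the per-1,000-words normalization are my own illustration, not the researchers' actual method or dictionary.

```python
import re

# Example certainty terms quoted in the study; a real analysis would use a
# much larger dictionary (e.g. a standard linguistic word list).
CERTAINTY_TERMS = {"profoundly", "extremely", "considerably"}

def rate_per_1000(text, terms):
    """Return how many words from `terms` appear per 1,000 words of `text`."""
    words = re.findall(r"[a-z']+", text.lower())
    if not words:
        return 0.0
    hits = sum(1 for w in words if w in terms)
    return 1000.0 * hits / len(words)

# Hypothetical snippets for illustration only.
genuine = "The effect was small and varied across conditions."
fraudulent = "The effect was extremely large and profoundly consistent."

print(rate_per_1000(genuine, CERTAINTY_TERMS))     # 0.0
print(rate_per_1000(fraudulent, CERTAINTY_TERMS))  # much higher
```

The same counting scheme works for any word category — swap in a list of adjectives or of field-specific jargon and you can compare rates across a researcher's papers, which is essentially the pattern the study looked for.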

Analyzing just one researcher's fraudulent writing, no matter how widespread or spectacular the fraud, doesn't constitute a surefire method of detecting it—something Markowitz and Hancock are careful to note in their study.

Even so, their findings could provide a potentially helpful way for nerds and armchair scientists to parse the good from the really fucking terrible when it comes to science papers. If you read a study that seems just a little too sure of itself or packs in enough methodological jargon to make your head spin, you should probably do a little digging of your own to make sure you're not being duped by an academic huckster.