Facebook. Big marketing data capital of the world, frequent haunt for teens, and now stomping grounds for sad college kids. Science confirms as much: a study done through Tufts University's computer science department parsed through some 24,000 Facebook posts on the Tufts Confessions page dating back to late 2013, and organized them into 13 topic groups.
Confessions pages are common across colleges and tend to be anonymous, moderated spaces where students can share their feelings about student life. The messages are sent to an admin, who confirms that the confession is from an actual Tufts student, and then posts it on the page.
The Tufts study found that loneliness was the most common of the 13 topic groups, turning up in 22 percent of the sampled posts.
So those keywords associated with loneliness seem pretty commonplace, right? It might be a stretch to say that any post with "want," "sometimes," or "talk," which are abstract on their own, could be pinned to any single feeling. But once you feed in 24,000 messages, those containing at least several of those keywords under loneliness get a lot more context.
"It's the sum of the entire neighborhood that constructs an interpretable basis when our model converges and gives meaning to a topic," Soubhik Barari, author of the paper and Tufts computer science undergrad, told me in an email. "One comprised of 'lunch,' 'sandwich,' 'ice cream,' may be interpreted as 'Food'; it wouldn't be out of line to infer that 'want,' 'people,' 'care,' are voicing Loneliness in a text."
Perhaps "antisocial" or "yearning" would be fitting catch-alls. After all, posts where people would want or wish people would do something or stop doing something could be just as much sideline criticisms as they are signals of loneliness.
How did he do it? Barari ran the Facebook posts through a natural language processing program he wrote in Python and separated the post contents into topical groups using latent Dirichlet allocation. Facebook uses the same sort of topic modeling in its own research. Through that, he came up with "trigrams," keyword triplets that roughly correspond to a topic label.
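For the curious, here is roughly what that kind of topic modeling can look like in Python. This is a minimal sketch and not Barari's actual program: the six toy posts are invented for illustration, the function names (`fit_lda`, `top_words`, `dominant_topic_shares`) and the settings (`alpha`, `beta`, the iteration count) are assumptions, and the fitting method shown, collapsed Gibbs sampling, is one common way to fit latent Dirichlet allocation that the study may or may not have used.

```python
import random
from collections import Counter


def fit_lda(docs, num_topics, iterations=200, alpha=0.1, beta=0.01, seed=0):
    """Fit LDA to tokenized docs with collapsed Gibbs sampling (illustrative)."""
    rng = random.Random(seed)
    vocab = sorted({word for doc in docs for word in doc})
    word_id = {word: i for i, word in enumerate(vocab)}
    vocab_size = len(vocab)

    doc_topic = [[0] * num_topics for _ in docs]            # words per topic, per post
    topic_word = [[0] * vocab_size for _ in range(num_topics)]  # word counts per topic
    topic_total = [0] * num_topics                          # total words per topic
    assignments = []                                        # topic of every word

    # Start by giving every word a random topic.
    for d, doc in enumerate(docs):
        doc_assignments = []
        for word in doc:
            k = rng.randrange(num_topics)
            doc_assignments.append(k)
            doc_topic[d][k] += 1
            topic_word[k][word_id[word]] += 1
            topic_total[k] += 1
        assignments.append(doc_assignments)

    # Repeatedly resample each word's topic given all the other assignments.
    for _ in range(iterations):
        for d, doc in enumerate(docs):
            for i, word in enumerate(doc):
                v = word_id[word]
                k = assignments[d][i]
                doc_topic[d][k] -= 1
                topic_word[k][v] -= 1
                topic_total[k] -= 1
                weights = [
                    (doc_topic[d][t] + alpha)
                    * (topic_word[t][v] + beta)
                    / (topic_total[t] + vocab_size * beta)
                    for t in range(num_topics)
                ]
                k = rng.choices(range(num_topics), weights=weights)[0]
                assignments[d][i] = k
                doc_topic[d][k] += 1
                topic_word[k][v] += 1
                topic_total[k] += 1

    return vocab, doc_topic, topic_word


def top_words(vocab, topic_word, n=3):
    """Return the n highest-count words for each topic (the keyword triplets)."""
    return [
        [vocab[v] for v in sorted(range(len(vocab)), key=lambda v: -row[v])[:n]]
        for row in topic_word
    ]


def dominant_topic_shares(doc_topic):
    """Fraction of posts whose most frequent topic is each topic."""
    num_topics = len(doc_topic[0])
    winners = Counter(max(range(num_topics), key=row.__getitem__) for row in doc_topic)
    return {k: winners[k] / len(doc_topic) for k in range(num_topics)}


# Invented toy posts, already lowercased and stripped of filler words.
posts = [
    "want people care talk",
    "sometimes want talk people",
    "people care want sometimes",
    "lunch sandwich ice cream",
    "sandwich lunch cream ice",
    "ice cream lunch sandwich",
]
docs = [post.split() for post in posts]

vocab, doc_topic, topic_word = fit_lda(docs, num_topics=2)
triplets = top_words(vocab, topic_word, n=3)
shares = dominant_topic_shares(doc_topic)

for k, words in enumerate(triplets):
    print(f"topic {k}: {words} (dominant in {shares[k]:.0%} of posts)")
```

The model only hands back word clusters and counts. Deciding that one cluster means "Food" and another means "Loneliness" is still a human judgment call, and a figure like "22 percent of posts" depends on how you assign each post to a topic. Here a post simply goes to whichever topic claims most of its words.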
Facebook obviously makes use of its humongous user base to further social computing: in 2014 the site toyed with updates in some 700,000 users' feeds to manipulate their emotions. While that's more ethically questionable than simply parsing user-submitted texts, these studies make it sufficiently clear that Facebook surfaces certain emotions more effectively than other forms of conversation, especially when messages are anonymized.
"What generationally ties us together as common trends in our natural language patterns? These are questions you can only so adequately begin to answer with administrative surveys, community psych studies, etc," Barari told me. "Social networks can give us unprecedented scale and insight into how collegiate culture and psyche mesh (or don't mesh)—the vital question is how we leverage those insight[s] for good and not evil."
Natural language processing could end up being a more genuine campus temperature checker, especially since it works in an environment where a) students don't have to be bothered to give a response, because they've already posted something usable, and b) they're totally anonymous and baring thoughts they usually veil in public. Barari thinks similar studies could be extended to other campus communities to analyze what's critically different between them.
Or we might confirm some timeless assumptions about college life: students are often pretty sad, busy, outspoken and horny.