This story is over 5 years old.


Researchers Are Using Tweets to Identify Transit Snarls

AI and social media could make your commute through the city smoother.

The next time you tweet while on a Vancouver TransLink bus or train, Saeid Allahdadian might be taking notes about that post.

This postdoctoral researcher at the University of British Columbia is using AI technologies and social media data to map major travel routes. The goal is to identify areas in need of better transit service.

Over three weeks this summer, Allahdadian analyzed 30,000 public geotagged tweets posted by 3,440 different individuals around Metro Vancouver and Surrey. The tweets were filtered based on whether they were geotagged with a location, were publicly available, or mentioned @TransLink. He was then able to track users in real time, which helped him identify patterns and population routes, as Allahdadian explained in an interview.
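A filter of this kind might look something like the sketch below. It is not Allahdadian's code: the `Tweet` record and its fields are hypothetical, and the rule it applies (keep public tweets that either carry a geotag or mention @TransLink) is only one possible reading of the criteria described above.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Tweet:
    """Hypothetical tweet record; the field names are assumptions."""
    user_id: str
    text: str
    is_public: bool
    lat: Optional[float] = None
    lon: Optional[float] = None


def is_geotagged(tweet: Tweet) -> bool:
    # A tweet counts as geotagged only if both coordinates are present.
    return tweet.lat is not None and tweet.lon is not None


def mentions_translink(tweet: Tweet) -> bool:
    # Case-insensitive check for the @TransLink handle.
    return "@translink" in tweet.text.lower()


def filter_tweets(tweets: List[Tweet]) -> List[Tweet]:
    # Assumed rule: public tweets that are geotagged or mention @TransLink.
    return [
        t for t in tweets
        if t.is_public and (is_geotagged(t) or mentions_translink(t))
    ]
```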


He then harnessed machine learning programs that could recognize and predict someone’s movement patterns based on their tweet locations throughout the day. In doing so, he found that the most popular transit routes run through downtown Vancouver, followed by downtown to Surrey City Centre, Burnaby to downtown, and the Broadway corridor.
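As a rough illustration of how a ranking of popular routes could fall out of timestamped locations, the sketch below counts each user's moves from one zone to the next. It is a plain tally, not the machine learning approach used in the study, and it does not predict anything. The `(user_id, timestamp, zone)` tuples and the zone names are assumptions made for the example.

```python
from collections import Counter, defaultdict


def od_pair_counts(observations):
    """Count moves between zones.

    observations: iterable of (user_id, timestamp, zone) tuples, where
    timestamps are any sortable values (hypothetical input format).
    """
    by_user = defaultdict(list)
    for user_id, timestamp, zone in observations:
        by_user[user_id].append((timestamp, zone))

    pairs = Counter()
    for visits in by_user.values():
        visits.sort()  # order each user's sightings by time
        for (_, origin), (_, dest) in zip(visits, visits[1:]):
            if origin != dest:  # ignore consecutive tweets from one zone
                pairs[(origin, dest)] += 1
    return pairs
```

Calling `od_pair_counts(...).most_common()` would then list zone-to-zone pairs from most to least frequent.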

TransLink is busier than ever. Ridership surged to an all-time high in 2016, with 384.8 million boardings compared to 2015’s 362.9 million boardings.

While TransLink offers some ridership data sets, Allahdadian noted, they don’t “have the capability to track users and build up the inference to the population behaviour. The user of Compass cards [reloadable fare cards] don’t tap out when they are leaving so there is no data from their full trip available.”

“I also used census data to see that popular routes also matched up with higher population densities,” said Allahdadian.

That may seem obvious to Vancouverites. But he also discovered a few surprises, such as how travel times on the Langley to Surrey route can range up to 80 minutes. This is a popular route running for 11 kilometres, which takes far longer by transit than by car (around 16 minutes with minimal traffic). That finding suggests the need for faster, more frequent public transit options.

“When we use machine learning techniques [for this kind of research] we not only help with transit planning but also traffic planning, safety, service delivery, and land use,” noted Allahdadian.


The AI programs, written in Python, used pattern recognition to identify movement patterns and high-population sites in the social media posts. “To my best knowledge it was not done before as such for this purpose on maps,” he added.

He also used clustering (a method of sorting similar objects into groups) and natural language processing (a way for computers to glean insights from written language) to help him extract the bus and stop numbers that were mentioned frequently in the tweets.
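The article does not say which clustering method was used. As a much simpler stand-in, the sketch below bins geotagged points into a latitude/longitude grid and counts the points per cell, which is enough to surface dense areas. The function names and the `cell_deg` default (0.01 degrees, roughly a kilometre north to south) are assumptions.

```python
import math
from collections import Counter


def grid_cell(lat, lon, cell_deg=0.01):
    # Map a coordinate to the integer index of the grid cell containing it.
    return (math.floor(lat / cell_deg), math.floor(lon / cell_deg))


def densest_cells(points, cell_deg=0.01, top=3):
    """Return the `top` most populated grid cells with their point counts.

    points: iterable of (lat, lon) pairs taken from geotagged tweets.
    """
    counts = Counter(grid_cell(lat, lon, cell_deg) for lat, lon in points)
    return counts.most_common(top)
```

A grid is crude next to a real clustering algorithm, since a hotspot that straddles a cell boundary gets split in two, but it shows the basic idea of grouping nearby points.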

The geotagged tweets that didn’t mention TransLink were only analyzed for their location, Allahdadian said. When @TransLink was mentioned, he said the content was mostly negative, such as posts about late buses, and he was able to extract some details from those tweets, such as bus numbers and bus stop IDs.
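Pulling bus numbers and stop IDs out of tweet text can be approximated with regular expressions, a far simpler technique than full natural language processing. The patterns below are assumptions made for illustration (route numbers of one to three digits after “bus” or “route”, five-digit stop IDs after “stop”), not the rules used in the study, and the sample tweets in the comments are invented.

```python
import re
from collections import Counter

# Assumed formats: "bus 99" or "route #14"; "stop 58613" (five digits).
ROUTE_RE = re.compile(r"\b(?:bus|route)\s*#?\s*(\d{1,3})\b", re.IGNORECASE)
STOP_RE = re.compile(r"\bstop\s*#?\s*(\d{5})\b", re.IGNORECASE)


def extract_mentions(texts):
    """Tally route numbers and stop IDs mentioned across tweet texts."""
    routes, stops = Counter(), Counter()
    for text in texts:
        routes.update(ROUTE_RE.findall(text))
        stops.update(STOP_RE.findall(text))
    return routes, stops


# Invented examples, for illustration only.
sample = [
    "@TransLink bus 99 is late again at stop 58613",
    "Route #14 never showed up @TransLink",
    "Waiting 20 min for bus 99",
]
routes, stops = extract_mentions(sample)
```

Here `routes.most_common()` ranks the routes by how often they were mentioned, which is the kind of frequency count the article describes.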

These methods can also be useful during a major sports event or a natural disaster such as an earthquake, Allahdadian said. “By knowing where people are using transit the most, and where people are mostly located at a given time of day, emergency responders can shift their attention to those densely-populated areas.”

Allahdadian and his research team informed TransLink about the study, but it isn’t clear what the transit system plans to do about the results.

TransLink didn’t offer a representative to be interviewed for this article and instead replied to questions with this statement: “Innovations in social media and AI may one day present new opportunities to improve the system for our customers. We are soon starting the process of reviewing our long-term transportation strategy. This is the time to think big and we are always open to new ideas.”


Vancouver isn’t the only city turning to machine learning programs to help solve its transit problems. In early 2017, a project in Singapore led by the A*STAR Institute of High Performance Computing collected data from the city’s smartcard system (which serves as an all-in-one ID, payment, and transit card) on transit users tapping in and out of bus and subway stations over one week, covering more than 20 million journeys, and used machine learning models to reproduce and predict transit ridership across the city.

Some transit insiders take issue with these layers of data. Jarrett Walker, a long-time transit consultant and creator of the blog Human Transit, disagrees with the idea “that we need more detailed data on what everyone else is doing.” In an interview, he said that “Big Details isn’t so helpful when drawing up giant frequent fixed routes that will carry 30-60 passengers per vehicle an hour, because in designing those we are responding to large patterns of demand that [are] easy to see with a higher altitude.”

As for the privacy implications of studying people’s tweets, Allahdadian said the analyzed tweets are available to the public and the location-based tweets are publicly geotagged. He didn’t report on any findings related to individual tweets.

Allahdadian believes he’s only scratched the surface of what can be extracted from Big Data. As more cities go smart, expect transit planners to analyze ridership data through machine learning programs, he predicted.

“People using AI and machine learning methods can learn from this data and these programs can become trained to not only give us insights for the current transportation system, but also to predict the conditions for the future.”
