Where Did Twitter's Extra Characters Come From?

Twitter is asserting its independence by moving links, media, and usernames from the content of a tweet into the metadata.

Jun 6 2016, 5:55pm

Jack Dorsey in 2008. Photo: Joi Ito/Flickr

Late last month, Twitter announced a slew of changes to the rules it uses to calculate the character limits for tweets, which previously had always been strictly capped at 140 characters. Essentially, the changes amount to exempting certain user behaviors from the character limit in order to encourage them: videos, GIFs, and initial @ mentions that address tweets to other users won't count, for example. Decisions like these shape the way people use online platforms, so users immediately began worrying about the social changes that might result. However, those concerns only scratch the surface of this change, which is about much more than just the message length.

The extra characters come from redefining what is considered valid message content. Links, images, videos, usernames, and so on are now being moved from the content into the metadata that accompanies each tweet whenever it is transmitted through data exchange APIs. For example, @ reply mentions added to the start of a tweet will not count toward the character total. The screen names of those addressees will be stored as metadata, but will continue to be displayed as part of the text when the tweet is actually rendered. Twitter has always performed text analysis of the tweet content to detect these elements and extract them as distinct entities into discrete data fields. Now the separation is fixed and formal: They're auxiliary data points, not part of the message content, and as such they will need to be specifically interpreted by any program, site, or service that integrates with Twitter. This means that tweets are no longer just text; they are turning into something entirely new.
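To make the counting rule concrete, here is a minimal Python sketch of how a client might set aside leading @ mentions before measuring a tweet. The regular expression, the 15-character username cap, and the function names are illustrative assumptions rather than Twitter's actual implementation, whose rules are more involved.

```python
import re

# Assumed pattern: one or more "@name " tokens at the very start of the text.
# This is a simplification for illustration, not Twitter's real parser.
LEADING_MENTIONS = re.compile(r"^(?:@\w{1,15}\s+)+")


def split_leading_mentions(text):
    """Return (mentions, body): the leading @names and the remaining text."""
    match = LEADING_MENTIONS.match(text)
    if not match:
        return [], text
    prefix = match.group(0)
    mentions = re.findall(r"@(\w{1,15})", prefix)
    return mentions, text[len(prefix):]


def counted_length(text):
    """Characters that count toward the limit once leading mentions are set aside."""
    _, body = split_leading_mentions(text)
    return len(body)


if __name__ == "__main__":
    print(split_leading_mentions("@alice @bob hello there"))
    print(counted_length("@alice @bob hello there"))  # 11
    print(counted_length("hello @alice"))             # 12: a mid-text mention still counts
```

Only mentions at the start of the text are set aside here; a username that appears mid-sentence is treated as ordinary content and counted.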

It is a big deal that the character limit is changing at all. Twitter is older than the iPhone, and as a result it was originally conceived as a service that could operate entirely through SMS messages. That's a powerful compatibility layer, which is why past changes have always been handled carefully so as to fit within the character limit of SMS.

In particular, the introduction of built-in link shortening in 2011 dramatically prolonged SMS compatibility. For the next few years it provided a way to pack more into tweets without increasing the character limit. A link can point to anything—a video, an image, a gallery of several images, even a reference to another 140-character tweet, leaving room to spare for additional commentary. These have all since become important parts of the Twitter timeline. In theory, anything Twitter wanted to build could conceivably be captured in a URL which would then fit nicely within the original framework of the service.

The changes announced last month mark the most dramatic expansion of the tweet, both conceptually and in terms of the raw character count: Under the new rules, a tweet no longer fits within an SMS. Twitter grew into a unique service long ago, and it has been many years since anyone could reasonably mistake a tweet for an SMS message circulating inside a new network. The length restriction nonetheless continued to shape the service practically and culturally, even if it wasn't technically central anymore.

The most immediate practical effect is that Twitter becomes a bit less viable in emerging markets where elaborate smartphones aren't as prevalent. In April, the company said 6.4 million monthly active users were accessing tweets through Twitter's "Fast Follow" feature, which allows users to receive notifications over SMS without first creating an account. The new tweets won't work properly for them, since devices that do not have dedicated clients will likely be cut off from some of the features and behaviors enabled by this new character counting logic. The gap between the haves and the have-nots becomes more pronounced.

That's lamentable in a sense, of course, but at some point we probably need to accept that technology sometimes pushes forward faster than the people who use it, and decoupling tweets from SMS has been a viable, sensible decision for years. The 140-character limit may have come to symbolize brevity as a general guiding principle, but for users of the smartphone apps, third-party clients, or web UI, it hasn't been technically meaningful for many years. This is a statement of independence, in a way: Twitter has decided that tweets no longer need to piggyback on another established technology.

Ultimately, though, device support is also a superficial concern given the overall scope of this change. The most important issue is that this redefines what it means for a tweet to be a tweet. Tweets can no longer be expressed purely through text; they're little triptychs of data, because the extra characters come from sandwiching the tweet between new prefix and suffix segments. Tweets have always had elaborate metadata, but now portions of that metadata will become required in order to read or render the tweets at all, or, perhaps most crucially for a social network, to understand who they are addressing.

This means that the message—which is to say, the tweet as we typically conceive of it—will from here forward be impossible to understand until the sections are joined together. Multimedia, in particular video, occupies a prominent and privileged role in the strategies of internet media companies, including VICE, Motherboard's parent company. Now, that media content as it exists on Twitter is moving out of the text content, which is most easily replicated and translated, and into the metadata.

Twitter's technical documentation for this change distinguishes between "classic tweets" and the new "extended tweets." The primary difference is that the latter has "hidden text regions," prefixes and suffixes of metadata which bookend a "display text region." It's a data structure rather than just a string of characters, and the display text region which is gaining the extra characters is just one component. This may already be the case to some degree for those who consume Twitter primarily through tools built on top of the APIs, but pretty soon it will be inescapable, a fundamental part of how we see and consume each and every tweet, no matter where we encounter it.
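The three-part layout described above can be sketched as a small data structure. This is a hedged illustration only: the class name `SegmentedTweet` and its fields are hypothetical and are not taken from Twitter's documentation. It stores the full text once and marks where the display text region begins and ends, so the hidden prefix and suffix are simply whatever falls outside that range.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SegmentedTweet:
    """Hypothetical model of a tweet split into prefix, display text, and suffix."""

    full_text: str
    display_start: int  # index where the display text region begins
    display_end: int    # index one past the end of the display text region

    @property
    def hidden_prefix(self):
        return self.full_text[:self.display_start]

    @property
    def display_text(self):
        return self.full_text[self.display_start:self.display_end]

    @property
    def hidden_suffix(self):
        return self.full_text[self.display_end:]

    def counted_length(self):
        # Only the display text region counts toward the character limit.
        return self.display_end - self.display_start


if __name__ == "__main__":
    prefix = "@alice @bob "                    # addressees, assumed to be metadata
    body = "Lunch was great"                   # the part the author actually typed
    suffix = " https://example.com/photo/1"    # placeholder attachment link
    tweet = SegmentedTweet(prefix + body + suffix,
                           len(prefix), len(prefix) + len(body))
    print(repr(tweet.hidden_prefix))
    print(repr(tweet.display_text))
    print(repr(tweet.hidden_suffix))
    print(tweet.counted_length())
```

A client that reads only the display text would lose both the addressees and the attachment, which is why the segments have to be interpreted together.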

In the near future, tweets will be divided into internal segments. Image via Twitter

In order to avoid breaking existing clients that still expect at most 140 characters of text, "extended" tweets will by default be compressed down to "compatibility mode." This means that each extended tweet will be sent out in truncated form, with a "self-permalink URL" appended which points the older clients toward the newer tweets, using the web as an intermediate interpreter. At that point the tweets being served by the API to third-party clients and integrated services aren't even really tweets anymore; they're self-referential abbreviations. This points toward the magnitude of the change: under the new rules, many existing Twitter clients will be unable to read tweets, and the Twitter API will be unable to send them out. Extended tweets aren't actually presented as a breaking change to the API, but that's only because the API is defaulting to compatibility mode in order to bridge the gap. Fundamentally, the data structures for classic and extended tweets cannot actually be reconciled, because SMS was not designed with extra space for a hidden prefix and suffix.
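A rough Python sketch of what such a compatibility-mode rendering might look like follows. The function name, the ellipsis separator, and the placeholder permalink are assumptions made for illustration; the only details taken from the description above are the 140-character ceiling and the appended self-permalink.

```python
CLASSIC_LIMIT = 140  # ceiling that older clients are assumed to expect


def to_compatibility_mode(full_text, permalink, limit=CLASSIC_LIMIT):
    """Return (text, truncated) where text fits within the classic limit.

    Short tweets pass through unchanged. Longer ones are cut off and given
    a trailing link back to the full tweet.
    """
    if len(full_text) <= limit:
        return full_text, False
    separator = "\u2026 "  # assumed: an ellipsis and a space before the link
    room = limit - len(separator) - len(permalink)
    if room < 0:
        raise ValueError("permalink is too long to fit within the limit")
    return full_text[:room].rstrip() + separator + permalink, True


if __name__ == "__main__":
    link = "https://example.com/status/1"  # placeholder, not a real tweet URL
    text, truncated = to_compatibility_mode("word " * 40, link)
    print(truncated, len(text))
    print(text)
```

The result is the self-referential abbreviation described above: part of the message, followed by a pointer to where the whole message lives.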

As dramatic as this change is, there is actually a precedent even within Twitter's very recent history. Tweet functionality has almost always been extended through embedded links, but the most prominent exception to this has been polls, which are inaccessible over SMS and even through Twitter's own APIs. They are disembodied nuggets floating around inside Twitter, accessible through the user interface but otherwise disconnected from the outside world. It's unlikely that this is the fate awaiting hosted video or images, of course; imagine the outcry! But no matter what tremendous plans Twitter may have for its social interactions and multimedia content, the first step will be to cleanly separate them.

Twitter long ago outgrew terms like "microblogging" and "status update," but this change finally dispenses with them for real. This, too, could be interpreted as a statement of independence, along a slightly different axis: the new extended tweets are structurally idiosyncratic, and in the near future, those idiosyncrasies will become crucial to the experience of using the service. At upwards of 500 million posts per day, tweets are arguably our most prominent modern media format, and now they are finally, formally distinct from everything else. Whatever Twitter eventually becomes, it will be built from this schism.

Correction: An earlier version of this story said Twitter claimed 6.4 million people had signed up for its "Fast Follow" or SMS-only feature in the first quarter of 2016; in fact, Twitter claimed 6.4 million monthly active "Fast Follow" users.