Twitter API: With Great Power Comes Great Responsibility

2009-04-08 20:00:00 -0400


Last night the folks at Twitter released a significant update to their API. Among the updates was a massive change to the way the Twitter API responds to time-sensitive queries. In short, the since parameter and the If-Modified-Since header are no longer supported.

For those unfamiliar with the API, these features allowed searches for Public Tweets or Direct Messages that occurred after a specific point in time. They were intended both to reduce load on Twitter’s service and to allow API clients to selectively retrieve tweets that they hadn’t seen yet, based on a timestamp.

The fact that the API changed is not shocking. The Twitter team had a solid reason behind it: it seems date-based queries weren’t handled efficiently enough for widespread use. There is even an alternate API search, based on message id, that can provide a similar function. These changes could therefore benefit the developer community as a whole by providing increased capacity and performance for the API.

The API change was dangerous

First, and most importantly, the API was modified so that it would simply begin to ignore since parameters as if they weren’t even there. This means that clients querying the API with the expectation of retrieving only the tweets they hadn’t seen yet would, all of a sudden, start seeing a completely unfiltered feed of tweets, including ones from the past.

Consider the hundreds of applications like our own apps Tempo and PingMe, which process Twitter direct messages they haven’t yet seen to perform operations on a user’s data. Applications polling the Twitter API using the since parameter would be flooded by old, duplicate messages that they had already processed, possibly causing rampant duplication and perhaps data corruption.
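As a rough illustration of how a client could protect itself from a replayed feed, here is a minimal sketch in Python. It assumes that message ids only ever increase over time; the DirectMessage and DedupingProcessor names are hypothetical and not part of any Twitter library.

```python
from dataclasses import dataclass, field


@dataclass
class DirectMessage:
    id: int    # assumed to increase with every new message
    text: str


@dataclass
class DedupingProcessor:
    """Handles each message at most once, even if the feed replays old ones."""
    last_seen_id: int = 0
    handled: list = field(default_factory=list)

    def process_batch(self, messages):
        # Sort oldest first so last_seen_id only ever moves forward.
        for msg in sorted(messages, key=lambda m: m.id):
            if msg.id <= self.last_seen_id:
                continue  # already processed; skip the replayed message
            self.handled.append(msg.text)  # stand-in for the real work
            self.last_seen_id = msg.id


processor = DedupingProcessor()
processor.process_batch([DirectMessage(1, "a"), DirectMessage(2, "b")])
# Simulate an unfiltered feed that returns the old messages again.
processor.process_batch(
    [DirectMessage(1, "a"), DirectMessage(2, "b"), DirectMessage(3, "c")]
)
```

With a guard like this, the second batch only triggers work for the message with id 3, regardless of whether the server honored the filter in the request.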

edit: Let me draw an analogy: imagine that, one day before a release, MySQL (or pick your favorite DB) decided to deprecate and ignore the greater-than (>) operator in queries. No errors – it would just pretend that query clause didn’t exist and return all rows for queries that used a >. How would that affect your application?

It is just poor form for a public API to change in such a way that the results of API operations become unpredictable like this. The API release notes claim this approach was used to ensure that existing applications didn’t break, but in fact it would almost guarantee catastrophic failure of applications dependent on the function.

Errors are OK

Don’t get me wrong, it’s absolutely Twitter’s prerogative to change the API, and they can do so whenever they want. But with tens of millions of users and thousands of applications and sites using the Twitter API, they should be more responsible about these kinds of changes.

If the API drops support for certain operations, just return an error message. Sure, it will be annoying to get errors until the application is fixed, but at least you don’t risk corrupting thousands of downstream integrations with improper data.
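To make the idea concrete, here is a small Python sketch of a request handler that rejects a retired parameter instead of silently ignoring it. This is not how Twitter's API is implemented; the handle_request function, the REMOVED_PARAMS set, and the 400 status code are all assumptions chosen for illustration.

```python
# Hypothetical set of parameters this example API has retired.
REMOVED_PARAMS = {"since"}


def handle_request(params):
    """Return (status_code, body), rejecting any retired parameter."""
    retired = sorted(REMOVED_PARAMS.intersection(params))
    if retired:
        # Fail loudly so clients notice the change instead of
        # silently receiving an unfiltered result set.
        message = "Unsupported parameter(s): " + ", ".join(retired)
        return 400, {"error": message}
    return 200, {"results": []}  # placeholder for the real query


status, body = handle_request({"since": "Tue, 07 Apr 2009 12:00:00 GMT"})
```

A client that still sends since gets an explicit failure it can log and alert on, rather than a flood of messages it has already processed.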

24 is only good enough for Jack

To make the situation worse, Twitter gave their developer community one day’s notice about the breaking change. Imagine if you didn’t get that email, or consider a case where an API release occurred without notification.

edit: When you expose a public API it becomes a contract with your developer community. If you make a breaking change to the API you have a fundamental responsibility to effectively communicate the change and ensure that it doesn’t create massive and unpredictable results for clients.

As Twitter grows and becomes more popular in the mainstream, it needs to establish a process for notification of API changes and stick with it. Give the developer community enough time to properly code and test changes to their applications before a release. If that’s not convenient to do on the main API, perhaps they could consider some sort of business service that would provide premium access and a more stable environment for groups and teams like ours that really depend on it.