overwatering.org

A sports broadcasting project I worked on a year ago used XMPP as a performance optimisation and for in-game chat. We used ejabberd (on AWS EC2) for the server, Strophe.js for our web clients and Asmack for the Android client (which didn’t end up getting released). It was an interesting experience, and I’d use that combination of tools again.

Some things we learnt:

  • You can use ejabberd as a stateless router of messages. I’d highly recommend this: it was very useful to be able to tear down our cluster of servers and rebuild it without losing any data during a game. What does this mean in practice? One example: ejabberd will happily provide the history for a chat room, but don’t use that. Instead, associate a resource outside ejabberd with each chat room, and have it hold the history you require. Another: we used Publish/Subscribe, but never assumed the publishing node existed. Instead, we’d check whether it existed and create it if necessary. There will be other things you’ll want to do, specific to your use of XMPP.

  • The Strophe.js client is pretty comprehensive, but poorly documented and with some weird quirks. You need to be really careful about the values you return from callbacks, for example: a stanza handler is removed after it fires unless it returns true. There’s an open source IM client called Candy, implemented using Strophe.js. I highly recommend reading that code closely.

  • We found that although Asmack was difficult to compile, it has a really nice and well-documented interface. We quite liked using it, at least compared to Strophe.

  • Clustering ejabberd is possible, though a bit painful. There are some tricks and manual steps to it.

  • If you’re coming in from a web client you will be using BOSH (like Comet, but for XMPP). Contrary to some claims, it’s not ‘a recipe for crashiness’, but if you choose to cluster behind an ELB you need to watch out for connection leaks. Unfortunately we didn’t have time to go beyond diagnosing the problem. The browser would open a long-lived HTTP connection to the ejabberd server, routed through the ELB. After about two minutes, the ELB would drop its connection to the browser, but the connection between the ELB and the server would be held open. These leaked connections would eventually pile up on the server until they killed it. Pretty serious, but it felt fixable.

  • There is an RFC and even an ejabberd module to run XMPP over WebSockets. To me, this sounds like an improvement over BOSH, but we never got around to trying it out.

  • XMPP is a very extensible protocol. There are a lot of XEPs, and servers like ejabberd and Openfire implement many of them. Think hard about your use case and pick the most appropriate XEP to model what you want to achieve. We ended up using PubSub for our performance optimisation and MUC for our in-game chat.

  • Connections to an XMPP server are long-lived, and they will randomly drop. You need to understand the connection life-cycle and manage it properly. With Strophe you can do something neat: keep a reference to your on-connect callback and simply re-run it after a drop. It took some experimentation, but I was pretty happy with how that turned out.

  • We never did any authentication of users. ejabberd has lots of support, but we wanted to use it just as a router.
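
To make the check-then-create idea from the first point concrete, here is a minimal sketch. It is not the code from the project. The `client` object and its `nodeExists`, `createNode` and `publish` methods are hypothetical stand-ins for whatever PubSub wrapper you have, as is the `condition` property on errors. The in-memory client exists only so the sketch runs on its own.

```javascript
// Make sure a PubSub node exists before publishing to it, so publishers
// keep working after the server cluster has been rebuilt from scratch.
async function ensureNode(client, nodeName) {
  if (await client.nodeExists(nodeName)) {
    return "existing";
  }
  try {
    await client.createNode(nodeName);
    return "created";
  } catch (err) {
    // Another publisher may have created the node between our check and
    // our create. Treat that race as success (assumed error shape).
    if (err && err.condition === "conflict") {
      return "existing";
    }
    throw err;
  }
}

// Never publish without ensuring the node first.
async function publishTo(client, nodeName, payload) {
  await ensureNode(client, nodeName);
  return client.publish(nodeName, payload);
}

// Hypothetical in-memory stand-in for a real PubSub client.
function makeInMemoryClient() {
  const nodes = new Map();
  const fail = (condition) => Object.assign(new Error(condition), { condition });
  return {
    async nodeExists(name) {
      return nodes.has(name);
    },
    async createNode(name) {
      if (nodes.has(name)) throw fail("conflict");
      nodes.set(name, []);
    },
    async publish(name, payload) {
      if (!nodes.has(name)) throw fail("item-not-found");
      nodes.get(name).push(payload);
      return nodes.get(name).length; // number of items now on the node
    },
    // Simulates tearing down the cluster: all server-side state is lost.
    wipe() {
      nodes.clear();
    },
  };
}
```

With this shape, a publisher calls `publishTo(client, "scores", update)` every time and never cares whether the node survived a rebuild. The cost is an extra round trip per publish, which you could avoid by caching the result of `ensureNode` and clearing the cache on reconnect.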

I’ll try to dig out some example code showing how we worked with Strophe.js, in separate posts.
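
In the meantime, here is a minimal sketch of the reconnect idea from the list above: remember the on-connect callback and re-run it every time the connection comes back. It is not the code from the project. The `keepConnected` helper and its options are made up for illustration. It assumes a connection object with a Strophe-style `connect(jid, password, callback)` method, and a `Status` object with `CONNECTED` and `DISCONNECTED` values, which with Strophe.js would be `Strophe.Status`.

```javascript
// Keep a connection alive by reconnecting after every drop and re-running
// the same setup callback each time the connection is established.
function keepConnected(conn, Status, credentials, onConnected, opts = {}) {
  const delayMs = opts.delayMs ?? 1000; // wait before each reconnect attempt
  const schedule = opts.schedule ?? setTimeout; // injectable for testing
  let stopped = false;
  let attempts = 0;

  function onStatus(status) {
    if (status === Status.CONNECTED) {
      // Re-run all per-connection setup (handlers, presence, room joins,
      // subscriptions), assuming none of it survived the drop.
      onConnected(conn);
    } else if (status === Status.DISCONNECTED && !stopped) {
      schedule(start, delayMs);
    }
  }

  function start() {
    if (stopped) return;
    attempts += 1;
    conn.connect(credentials.jid, credentials.password, onStatus);
  }

  start();

  return {
    // Call this before a deliberate disconnect so we do not reconnect.
    stop() {
      stopped = true;
    },
    attempts() {
      return attempts;
    },
  };
}
```

Usage would look something like `keepConnected(new Strophe.Connection(boshUrl), Strophe.Status, { jid, password }, setUpHandlers)`, where `boshUrl`, `jid`, `password` and `setUpHandlers` are yours to supply. A fixed delay keeps the sketch short. In practice you would probably want a backoff so that a struggling server is not hammered by every client at once.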