overwatering.org

ejabberd is a mature, powerful, high-performance, open-source XMPP server. If you want to use XMPP, I highly recommend it.

There are some quirks, however. For example, while it does support clustering, setting it up is not straightforward. The documentation describes everything that is possible, but doesn’t spell out the steps. For a recent project, I had to figure out how to cluster it on AWS.

Setup

Create an elastic load balancer and include listeners for HTTP on port 5280, TCP port 5222 and TCP port 5269. Create your main ejabberd node, add it to the ELB and then point the domain name for your ejabberd server at the ELB, typically using a CNAME record.
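
If you prefer the command line to the AWS console, the ELB setup above can be sketched with the AWS CLI's Classic Load Balancer commands. This is only a rough sketch: the load balancer name, the availability zone and the instance ID are placeholders I've assumed, and the helper function names are made up for illustration.

```shell
#!/bin/sh
# Sketch only: ELB_NAME, the availability zone and the instance ID are assumed placeholders.
ELB_NAME="${ELB_NAME:-ejabberd-elb}"

# Build one Classic ELB listener spec that forwards a port to the same port on the instance.
listener() {
    # $1 = protocol (HTTP or TCP), $2 = port
    printf 'Protocol=%s,LoadBalancerPort=%s,InstanceProtocol=%s,InstancePort=%s' "$1" "$2" "$1" "$2"
}

# Create the load balancer with the three listeners described above.
create_elb() {
    # $1 = availability zone, e.g. us-east-1a
    aws elb create-load-balancer \
        --load-balancer-name "$ELB_NAME" \
        --listeners "$(listener HTTP 5280)" "$(listener TCP 5222)" "$(listener TCP 5269)" \
        --availability-zones "$1"
}

# Add an ejabberd node to the load balancer.
register_node() {
    # $1 = EC2 instance ID of the node
    aws elb register-instances-with-load-balancer \
        --load-balancer-name "$ELB_NAME" \
        --instances "$1"
}
```

With the functions loaded, something like `create_elb us-east-1a && register_node i-0123456789abcdef0` would create the ELB and attach the main node. The CNAME record still has to be set up with whoever hosts your DNS.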

There’s nothing special about creating the main ejabberd node; a standard installation will do the job.

Adding a Slave to the Cluster

Adding a slave node to the cluster is where the special effort comes in. Here are some manual steps derived from the ejabberd manual. Obviously, there would be some benefit to automating these.

  1. ssh to the existing main node.
  2. Grab the erlang cookie: sudo more /var/lib/ejabberd/.erlang.cookie
  3. Copy the output, and disconnect from the node.
  4. Create a new ejabberd node, however you normally provision one.
  5. ssh to the new slave.
  6. Stop ejabberd: sudo /etc/init.d/ejabberd stop
  7. Delete the default DCD database files: sudo rm -f /var/lib/ejabberd/*.DCD
  8. Delete the default DCL database files: sudo rm -f /var/lib/ejabberd/*.DCL
  9. Delete the default DAT database files: sudo rm -f /var/lib/ejabberd/*.DAT
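  10. Replace the erlang cookie on the slave with the value from the master: echo '<copied cookie>' | sudo tee /var/lib/ejabberd/.erlang.cookie > /dev/null The pipe through sudo tee matters: with sudo echo ... > file, the redirect is performed by your own unprivileged shell, not by sudo.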
  11. Start running as the ejabberd user: sudo -u ejabberd bash
  12. cd /var/lib/ejabberd
  13. Verify that the new cookie is here: more .erlang.cookie
  14. Start an erlang interpreter, using a name for the master node that is visible from the slave node over the private interface: erl -setcookie '<cookie value>' -sname ejabberd -mnesia dir '"/var/lib/ejabberd/"' -mnesia extra_db_nodes "['ejabberd@<name-of-master-node>']" -s mnesia
  15. At the erlang prompt, run mnesia:info(). Don’t forget the full stop. The output should include a "running db nodes" list containing both the main node and the new slave node, plus any other existing slaves in the cluster.
  16. Allocate local disc space for the database schema: mnesia:change_table_copy_type(schema, node(), disc_copies).
  17. Now, copy the required tables to the local slave. The tables to copy will depend on the XMPP features you are using. We were using Publish/Subscribe and Multi-User Chat.

    mnesia:add_table_copy(pubsub_subscription, node(), disc_copies).
    mnesia:add_table_copy(pubsub_node, node(), disc_copies).
    mnesia:add_table_copy(pubsub_index, node(), disc_copies).
    mnesia:add_table_copy(pubsub_state, node(), disc_copies).
    mnesia:add_table_copy(muc_registered, node(), disc_copies).
    mnesia:add_table_copy(muc_room, node(), disc_copies).
    mnesia:add_table_copy(pubsub_last_item, node(), ram_copies).
    mnesia:add_table_copy(anonymous, node(), ram_copies).
    mnesia:add_table_copy(muc_online_room, node(), ram_copies).
    
  18. Quit the erlang prompt: init:stop().
  19. Stop running as the ejabberd user: exit
  20. Start ejabberd again: sudo /etc/init.d/ejabberd start
  21. Add the new slave node to the ELB.
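
As a starting point for automating the steps above, here is a rough shell sketch. It covers the file-level preparation (steps 6 to 10) and prints the Mnesia calls from steps 16 and 17 so they can be pasted at the erlang prompt. The function names are hypothetical, the table lists are the ones we used, and I'm assuming the same paths, init script and ejabberd user as in the steps.

```shell
#!/bin/sh
# Sketch only: function names are hypothetical; paths and tables come from the steps above.
EJABBERD_DIR="${EJABBERD_DIR:-/var/lib/ejabberd}"
DISC_TABLES="pubsub_subscription pubsub_node pubsub_index pubsub_state muc_registered muc_room"
RAM_TABLES="pubsub_last_item anonymous muc_online_room"

# Steps 6-10: stop ejabberd, drop the default Mnesia files, install the master's cookie.
prepare_slave() {
    # $1 = cookie value copied from the master
    sudo /etc/init.d/ejabberd stop
    # Run rm inside a root shell so the globs are expanded with root's permissions.
    sudo sh -c "rm -f '$EJABBERD_DIR'/*.DCD '$EJABBERD_DIR'/*.DCL '$EJABBERD_DIR'/*.DAT"
    # tee runs under sudo, so the write itself has root privileges.
    printf '%s\n' "$1" | sudo tee "$EJABBERD_DIR/.erlang.cookie" > /dev/null
    # Assumption: the service runs as user "ejabberd"; Erlang wants an owner-only cookie file.
    sudo chown ejabberd "$EJABBERD_DIR/.erlang.cookie"
    sudo chmod 400 "$EJABBERD_DIR/.erlang.cookie"
}

# Steps 16-17: print the Mnesia calls, one per line, ready to paste at the erlang prompt.
mnesia_statements() {
    echo 'mnesia:change_table_copy_type(schema, node(), disc_copies).'
    for t in $DISC_TABLES; do
        echo "mnesia:add_table_copy($t, node(), disc_copies)."
    done
    for t in $RAM_TABLES; do
        echo "mnesia:add_table_copy($t, node(), ram_copies)."
    done
}
```

Adjust DISC_TABLES and RAM_TABLES to match the XMPP features you use. Steps 14 and 15 are deliberately left manual here, since checking the mnesia:info() output by eye is the easiest way to confirm the new node has actually joined the cluster.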

To test that everything works, remove all but the new slave node from the ELB and verify that the XMPP features you use still work.
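
One way to run this test from the command line, assuming a Classic Load Balancer and the AWS CLI, is sketched below. The load balancer name and the instance IDs are placeholders, and the function names are only for illustration.

```shell
#!/bin/sh
# Sketch only: ELB_NAME and all instance IDs are assumed placeholders.
ELB_NAME="${ELB_NAME:-ejabberd-elb}"

# Print every instance ID in the argument list except the first argument.
others() {
    keep="$1"; shift
    for id in "$@"; do
        [ "$id" = "$keep" ] || echo "$id"
    done
}

# Deregister every node except the one being tested.
isolate_node() {
    # $1 = instance ID of the new slave; remaining args = all instance IDs behind the ELB
    for id in $(others "$@"); do
        aws elb deregister-instances-from-load-balancer \
            --load-balancer-name "$ELB_NAME" \
            --instances "$id"
    done
}
```

For example, `isolate_node i-new i-main i-new` would leave only i-new behind the ELB. Once you are happy with the result, put the other nodes back with `aws elb register-instances-with-load-balancer`.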