Straight-forward Clustering of ejabberd
gga
#2014-03-26
ejabberd is a mature, powerful, high-performance, open-source XMPP server. If you want to use XMPP, I highly recommend it.
There are some quirks, however. For example, while it does support clustering, setting it up is not straight-forward. The documentation describes everything that is possible, but doesn’t really spell out the steps. For a recent project, I had to figure out how to cluster it using AWS.
Setup
Create an elastic load balancer and include listeners for HTTP on port 5280, TCP port 5222 and TCP port 5269. Create your main ejabberd node, add it to the ELB and then point the domain name for your ejabberd server at the ELB, typically using a CNAME record.
There’s nothing special in creating the main ejabberd node; a standard installation will do the job.
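As a rough illustration of the load balancer setup, here is a minimal dry-run sketch. It assumes a classic ELB managed through the AWS CLI (`aws elb ...`), which may not match how you provision things. The load balancer name, availability zone and instance ID are hypothetical placeholders. The script only prints the commands, so you can review them before running anything.

```shell
#!/bin/sh
# Dry-run sketch: prints AWS CLI commands instead of executing them.
set -eu

# Hypothetical placeholders, not taken from the original setup.
ELB_NAME="${ELB_NAME:-ejabberd-elb}"
ZONE="${ZONE:-us-east-1a}"
INSTANCE_ID="${INSTANCE_ID:-i-0123456789abcdef0}"

# Format one listener that forwards a port to the same port on the instance.
# $1 is the protocol, $2 is the port.
listener() {
  printf 'Protocol=%s,LoadBalancerPort=%s,InstanceProtocol=%s,InstancePort=%s' \
    "$1" "$2" "$1" "$2"
}

# Print the command that would create the ELB with the three listeners.
create_elb_cmd() {
  printf 'aws elb create-load-balancer --load-balancer-name %s --availability-zones %s --listeners "%s" "%s" "%s"\n' \
    "$ELB_NAME" "$ZONE" \
    "$(listener HTTP 5280)" "$(listener TCP 5222)" "$(listener TCP 5269)"
}

# Print the command that would add the main ejabberd node to the ELB.
register_cmd() {
  printf 'aws elb register-instances-with-load-balancer --load-balancer-name %s --instances %s\n' \
    "$ELB_NAME" "$INSTANCE_ID"
}

create_elb_cmd
register_cmd
```

Pointing the domain name at the ELB is left out, since that depends on where your DNS is hosted.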
Adding a Slave to the Cluster
Adding a slave node to the cluster is where the special effort is needed. Here are some manual steps derived from the ejabberd manual; obviously, there would be some benefit to automating these.
- ssh to the existing main node.
- Grab the erlang cookie:
sudo more /var/lib/ejabberd/.erlang.cookie
- Copy the output, and disconnect from the node.
- Create a new ejabberd node — however you do this.
- ssh to the new slave.
- Stop ejabberd:
sudo /etc/init.d/ejabberd stop
- Delete the default DCD database files:
sudo rm -f /var/lib/ejabberd/*.DCD
- Delete the default DCL database files:
sudo rm -f /var/lib/ejabberd/*.DCL
- Delete the default DAT database files:
sudo rm -f /var/lib/ejabberd/*.DAT
- Replace the erlang cookie on the slave with the value from the master:
echo '<copied cookie>' | sudo tee /var/lib/ejabberd/.erlang.cookie
- Start running as the ejabberd user:
sudo -u ejabberd bash
cd /var/lib/ejabberd
- Verify that the new cookie is here:
more .erlang.cookie
- Start an erlang interpreter:
erl -setcookie '<cookie value>' -sname ejabberd -mnesia dir '"/var/lib/ejabberd/"' -mnesia extra_db_nodes "['ejabberd@<name-of-master-node>']" -s mnesia
Make sure you use a name for the master node that is visible from the slave node, over the private interface.
- At the erlang prompt, run
mnesia:info().
Don’t forget the full stop. The output should include a running db nodes = list including both the main and new slave nodes, and any other existing slaves in the cluster.
- Allocate local space for the database schema:
mnesia:change_table_copy_type(schema, node(), disc_copies).
- Now, copy the required tables to the local slave. The tables to copy will depend on the XMPP features you are using. We were using Publish/Subscribe and Multi-User Chat.
mnesia:add_table_copy(pubsub_subscription, node(), disc_copies).
mnesia:add_table_copy(pubsub_node, node(), disc_copies).
mnesia:add_table_copy(pubsub_index, node(), disc_copies).
mnesia:add_table_copy(pubsub_state, node(), disc_copies).
mnesia:add_table_copy(muc_registered, node(), disc_copies).
mnesia:add_table_copy(muc_room, node(), disc_copies).
mnesia:add_table_copy(pubsub_last_item, node(), ram_copies).
mnesia:add_table_copy(anonymous, node(), ram_copies).
mnesia:add_table_copy(muc_online_room, node(), ram_copies).
- Quit the erlang prompt:
init:stop().
- Stop running as the ejabberd user:
exit
- Start ejabberd again:
sudo /etc/init.d/ejabberd start
- Add the new slave node to the ELB.
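The table-copy step in the list above is the most repetitive part, and the set of tables changes with the XMPP features in use. As a small step towards automating it, the sketch below generates the `mnesia:add_table_copy` calls from a list of table names. The helper name `table_copy_exprs` is made up for this example; the tables shown are the ones from the steps above.

```shell
#!/bin/sh
set -eu

# Print one mnesia:add_table_copy/3 call per table.
# $1 is the storage type (disc_copies or ram_copies),
# the remaining arguments are table names.
table_copy_exprs() {
  storage="$1"
  shift
  for table in "$@"; do
    printf 'mnesia:add_table_copy(%s, node(), %s).\n' "$table" "$storage"
  done
}

# Tables for Publish/Subscribe and Multi-User Chat, as used in this post.
table_copy_exprs disc_copies \
  pubsub_subscription pubsub_node pubsub_index pubsub_state \
  muc_registered muc_room
table_copy_exprs ram_copies pubsub_last_item anonymous muc_online_room
```

The output can be pasted at the erlang prompt in place of typing each call by hand. Adjust the two table lists to match the features you have enabled.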
To test that everything works, remove all but the new slave node from the ELB cluster and verify that the XMPP features you use still work.
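For that test, the other nodes have to be taken out of the ELB temporarily and put back afterwards. Here is a dry-run sketch, again assuming a classic ELB and the AWS CLI. The load balancer name and instance IDs are hypothetical, and the script only prints the commands rather than running them.

```shell
#!/bin/sh
set -eu

# Hypothetical load balancer name, not taken from the original setup.
ELB_NAME="${ELB_NAME:-ejabberd-elb}"

# Print the command that would remove the given instances from the ELB.
deregister_cmd() {
  printf 'aws elb deregister-instances-from-load-balancer --load-balancer-name %s --instances %s\n' \
    "$ELB_NAME" "$*"
}

# Print the command that would add the given instances back to the ELB.
reregister_cmd() {
  printf 'aws elb register-instances-with-load-balancer --load-balancer-name %s --instances %s\n' \
    "$ELB_NAME" "$*"
}

# Hypothetical IDs for the main node and an existing slave.
deregister_cmd i-0aaaaaaaaaaaaaaa1 i-0bbbbbbbbbbbbbbb2
reregister_cmd i-0aaaaaaaaaaaaaaa1 i-0bbbbbbbbbbbbbbb2
```

Run the first printed command, exercise your XMPP features against the new slave alone, then run the second to restore the cluster.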