Highly available MQTT service

[MQTT](http://mqtt.org/) is a pubsub protocol targeting small devices and lossy networks, with a centralized server. Mosquitto is a clean, simple and stable MQTT server.

Performance
14 Jan 2014
Mathieu Lecarme

MQTT is a pubsub protocol targeting small devices and lossy networks, with a centralized server. Mosquitto is a clean, simple and stable MQTT server. For now, high availability is not part of the MQTT protocol, and Mosquitto doesn’t handle it. But all the parts needed to build a resilient, highly available service are there.

I want to build a ping system: agents ping servers in order to monitor them. Agents sit on different internet providers or in different data centers, and ping a dynamic list of servers. Agents are usually behind a NAT, and should remain polite.

Let’s bring in the classic pubsub pattern. Nodes publish ping statistics on a channel, ping/{domain}, publish their will (the MQTT last-will message, which the server sends on behalf of a client that dies) on rip/agent/{hostname}, and subscribe to watch, which provides the list of targets. The indexer subscribes to ping/+ and rip/# (+ matches exactly one level, # any number of levels).
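To make the two wildcards concrete, here is a minimal sketch of topic matching in Python, the language of the swarming project. The function name `topic_matches` is hypothetical; this is not Mosquitto's implementation, and it ignores special cases such as topics starting with `$`.

```python
def topic_matches(pattern, topic):
    """Return True if `topic` matches the MQTT subscription `pattern`."""
    pattern_levels = pattern.split("/")
    topic_levels = topic.split("/")
    for i, level in enumerate(pattern_levels):
        if level == "#":
            # '#' is only valid as the last level; it matches the parent
            # topic and everything below it.
            return i == len(pattern_levels) - 1
        if i >= len(topic_levels):
            return False  # the pattern is longer than the topic
        if level != "+" and level != topic_levels[i]:
            return False  # literal level that differs
    # No '#' seen: both sides must have the same number of levels.
    return len(pattern_levels) == len(topic_levels)
```

With this sketch, `ping/+` accepts `ping/example.com` but not `ping/example.com/extra`, while `rip/#` accepts `rip/agent/host1` whatever its depth.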

The high availability model is naive. Pingers have a list of MQTT servers, shuffle it, and try to connect to one of them. If nothing happens on the socket, the pinger moves on to the next server, in a round robin fashion. Each pinger talks to ONE server, randomly chosen. Indexers connect to ALL the servers, and send messages to all the servers. Indexers say little but listen a lot. Data is pushed to Elasticsearch, à la Logstash, for drilling down in Kibana.
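The pinger side of this model can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the code of swarming.py: the names `parse_server` and `connect_any` are hypothetical, the `host:port` syntax is inferred from the demo command line, and the sketch only opens a TCP socket where a real agent would open an MQTT session.

```python
import random
import socket


def parse_server(spec, default_port=1883):
    """Turn 'host' or 'host:port' into a (host, port) tuple."""
    host, _, port = spec.partition(":")
    return host, int(port) if port else default_port


def connect_any(servers, connect=socket.create_connection, timeout=5.0, rounds=3):
    """Shuffle the server list, then try each server in turn until one answers."""
    candidates = [parse_server(s) for s in servers]
    random.shuffle(candidates)  # spread the agents across the servers
    for _ in range(rounds):
        for address in candidates:
            try:
                return address, connect(address, timeout)
            except OSError:
                continue  # refused or timed out: try the next server
    raise ConnectionError("no MQTT server reachable")
```

A call such as `connect_any(["localhost", "localhost:1884"])` returns the address that answered along with the open socket. The `connect` parameter exists so the selection logic can be exercised without a network.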

Supporting high availability on the server side, over a lossy connection, can be painful. I don’t care if messages are duplicated or even lost (it’s just pings, not financial transactions). Mosquitto is a nice MQTT server, though it’s not hackable enough to implement replication, and doing so would be a bad idea anyway. Mosquitto handles low level contracts (passing messages, handling mailboxes and dead people), not business logic. I could use RabbitMQ as an MQTT server, with its clustering features, but no, I want something dead simple.

With this pattern, I can use an unconfigured mosquitto server (apt-get install mosquitto-server) and vanilla clients. I can stop a server for a short period of time, deploy a new version, and restart it, without breaking the service.

Later, I will add authentication (with SSL certificates), client side trouble detection (is it just me, or the target?), threshold levels (is it slow or broken?), and speed measurements.

Demo time

A simple demo, all on localhost, on Debian Linux. It works with any Linux or OS X, just translate the instructions.

Get the source of the swarming project :

 git clone https://github.com/athoune/swarming.git
 cd swarming

Virtualenv dance :

 virtualenv .
 ./bin/pip install -r requirements.txt

Install and launch mosquitto :

 sudo apt-get install mosquitto-server
 sudo /etc/init.d/mosquitto start

Launch another mosquitto instance, on another port :

 mosquitto -p 1884

Launch two snitches in two terminals :

 mosquitto_sub -t "ping/+" -t "rip/#" -v
 mosquitto_sub -t "ping/+" -t "rip/#" -v -p 1884

Launch one agent in another terminal :

 ./bin/python swarming.py localhost localhost:1884

Install elasticsearch :

 wget https://download.elasticsearch.org/elasticsearch/elasticsearch/elasticsearch-0.90.10.deb
 sudo dpkg -i elasticsearch-0.90.10.deb
 sudo /etc/init.d/elasticsearch start

Launch the indexer :

 ./bin/python indexer.py localhost localhost:1884

You can now stop, wait, and restart one or the other Mosquitto server, or add more swarm agents.

Install Kibana. I’m cheating: I’m using the ugly one-liner Python HTTP server.

 wget https://download.elasticsearch.org/kibana/kibana/kibana-3.0.0milestone4.tar.gz
 tar -xvzf kibana-3.0.0milestone4.tar.gz
 cd kibana-3.0.0milestone4
 python -m SimpleHTTPServer

Open the default web page : http://localhost:8000/index.html#/dashboard/file/logstash.json
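The logstash.json dashboard looks for Logstash-style data: daily indices named logstash-YYYY.MM.DD and documents carrying an @timestamp field. As a rough sketch of what pushing data "à la Logstash" could mean for the indexer, the function below shapes one MQTT message into such a document. The function name `to_logstash_doc` and the `topic` and `message` field names are assumptions for illustration; the real indexer.py may store things differently.

```python
from datetime import datetime, timezone


def to_logstash_doc(topic, payload, when=None):
    """Build (index_name, document) for one MQTT message."""
    when = when or datetime.now(timezone.utc)
    # Logstash convention: one index per day.
    index = when.strftime("logstash-%Y.%m.%d")
    doc = {
        "@timestamp": when.isoformat(),  # time field used by the dashboard
        "topic": topic,                  # hypothetical field name
        "message": payload,              # hypothetical field name, raw payload
    }
    return index, doc
```

The returned pair can then be handed to whatever Elasticsearch client the indexer uses.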