MQTT is a pub/sub protocol targeting small devices and lossy networks, with a centralized server. Mosquitto is a clean, simple and stable MQTT server. For now, high availability is not part of the MQTT protocol and Mosquitto doesn’t handle it. But all the parts needed to build a resilient, highly available service are there.
I want to build a ping system: agents ping servers in order to monitor them. Agents use different Internet providers or data centers, and ping a dynamic list of servers. Agents are usually behind a NAT, and should remain polite.
Let’s bring in the classic pub/sub pattern. Nodes publish ping statistics on a channel, ping/{domain}, publish their will (the MQTT last-will message) on rip/agent/{hostname}, and subscribe to watch, which provides the list of targets. The indexer subscribes to ping/+ and rip/# (+ matches exactly one topic level, # matches any number of levels).
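To make the two wildcards concrete, here is a minimal sketch of MQTT topic matching in plain Python. The function name topic_matches is mine, not part of the swarming project or of any MQTT library, and the sketch ignores special cases such as topics starting with $.

```python
def topic_matches(pattern, topic):
    """Return True if an MQTT subscription pattern matches a topic name."""
    pattern_levels = pattern.split("/")
    topic_levels = topic.split("/")
    for i, level in enumerate(pattern_levels):
        if level == "#":
            # Multi-level wildcard: matches everything that remains,
            # including nothing at all ("rip/#" also matches "rip").
            return True
        if i >= len(topic_levels):
            # The pattern is longer than the topic.
            return False
        if level != "+" and level != topic_levels[i]:
            # Neither a single-level wildcard nor an exact match.
            return False
    # Without "#", pattern and topic must have the same depth.
    return len(pattern_levels) == len(topic_levels)


if __name__ == "__main__":
    for topic in ("ping/example.com", "ping/a/b", "rip/agent/pinger1"):
        print(topic, topic_matches("ping/+", topic), topic_matches("rip/#", topic))
```

With these rules, ping/+ receives one message stream per domain, while rip/# catches every will published below rip/, whatever its depth.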
The high availability model is naive. Pingers have a list of MQTT servers, shuffle it, and try to connect to one of them. If nothing happens on the socket, the pinger moves on to the next server, in a round-robin fashion. Pingers connect to ONE server, randomly chosen. Indexers connect to ALL the servers, and send messages to all the servers. Indexers say a few things but listen a lot. Data is pushed to Elasticsearch, à la Logstash, for drilling down in Kibana.
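The pinger side of this model can be sketched in a few lines of Python. This is an illustration, not the code of swarming.py: the function name connect_to_any, the max_rounds limit and the injectable connect parameter are assumptions of mine, and the sketch only covers the connection step, not the detection of a socket that goes silent later.

```python
import random
import socket


def connect_to_any(servers, connect=socket.create_connection,
                   timeout=5.0, max_rounds=3):
    """Shuffle the server list, then try each server in round-robin order.

    `servers` is a list of (host, port) tuples. Returns (server, connection)
    for the first server that accepts the connection.
    """
    candidates = list(servers)
    random.shuffle(candidates)  # each pinger picks its own random order
    for _ in range(max_rounds):
        for server in candidates:
            try:
                return server, connect(server, timeout)
            except OSError:
                # Refused, unreachable or timed out: try the next server.
                continue
    raise ConnectionError("no MQTT server reachable")
```

Because every pinger shuffles the list on its own, the load spreads across the servers without any coordination between them.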
Supporting high availability on the server side, over a lossy connection, can be painful. I don’t care if messages are duplicated or even lost (it’s just pings, not financial transactions). Mosquitto is a nice MQTT server, though it’s not hackable enough to implement replication, and doing so would be a bad idea anyway. Mosquitto handles low-level contracts (passing messages, handling mailboxes and dead peers), not business logic. I could use RabbitMQ as an MQTT server, with its clustering features, but no, I want something dead simple.
With this pattern, I can use an unconfigured mosquitto server (apt-get install mosquitto-server) and vanilla clients. I can stop a server for a short period of time, deploy a new version, and restart it, without breaking the service.
Later, I will add authentication (with SSL certificates), client-side trouble detection (is it just me, or the target?), threshold levels (is it slow or broken?), and speed measurements.
Demo time
A simple demo, fully on localhost, on Debian Linux. It works on any Linux or OS X, just translate the instructions.
Get the source of the swarming project:
git clone https://github.com/athoune/swarming.git
cd swarming
Virtualenv dance:
virtualenv .
./bin/pip install -r requirements.txt
Install and launch Mosquitto:
sudo apt-get install mosquitto-server
sudo /etc/init.d/mosquitto start
Launch another Mosquitto instance, on another port:
mosquitto -p 1884
Launch two snitches in two terminals:
mosquitto_sub -t "ping/+" -t "rip/#" -v
mosquitto_sub -t "ping/+" -t "rip/#" -v -p 1884
Launch one agent in another terminal:
./bin/python swarming.py localhost localhost:1884
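The agent takes its MQTT servers as host or host:port arguments. As a rough idea of how such arguments can be turned into addresses, here is a small sketch. The helper parse_server is hypothetical and not taken from swarming.py; it assumes the standard MQTT port 1883 when no port is given, and it does not handle IPv6 literals.

```python
DEFAULT_MQTT_PORT = 1883  # standard port for unencrypted MQTT


def parse_server(arg, default_port=DEFAULT_MQTT_PORT):
    """Split a "host" or "host:port" argument into a (host, port) tuple."""
    host, sep, port = arg.partition(":")
    # No colon means no explicit port: fall back to the default.
    return host, int(port) if sep else default_port


if __name__ == "__main__":
    print([parse_server(arg) for arg in ("localhost", "localhost:1884")])
```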
Install Elasticsearch:
wget https://download.elasticsearch.org/elasticsearch/elasticsearch/elasticsearch-0.90.10.deb
sudo dpkg -i elasticsearch-0.90.10.deb
sudo /etc/init.d/elasticsearch start
Launch the indexer:
./bin/python indexer.py localhost localhost:1884
You can now stop, wait, and restart one or the other Mosquitto server, or add more swarm agents.
Install Kibana. I’m cheating: I’m using the ugly one-liner Python HTTP server.
wget https://download.elasticsearch.org/kibana/kibana/kibana-3.0.0milestone4.tar.gz
tar -xvzf kibana-3.0.0milestone4.tar.gz
cd kibana-3.0.0milestone4
python -m SimpleHTTPServer
Open the default web page: http://localhost:8000/index.html#/dashboard/file/logstash.json
Let’s explore Kibana. Compare target performance, compare agent performance, and find something useful and meaningful in all these points.