Dabbling around RabbitMQ persistence, durability & message routing
RabbitMQ needs no introduction, as it’s one of the most widely used messaging systems in the world. It’s a fairly traditional message-oriented system that believes in the smart broker & dumb consumer concept. RabbitMQ implements several protocols, including AMQP, MQTT & STOMP. When RabbitMQ was born, it was the first of its kind to implement the standard AMQP protocol, which is intended to make messaging and systems integration work across polyglot systems. RabbitMQ has very good documentation, client libraries for many different languages, an excellent management console, and it’s quite easy to set up.
In this post, we will discuss 2 different things about RabbitMQ: persistence & durability, and scalable message routing that can handle thousands of messages per second.
1. Persistence & Durability: RabbitMQ has several kinds of entities: exchanges, topics (the bindings that tie a queue to an exchange), queues & messages. Durability & persistence may look like the same thing, but they are not. Durability is a property of exchanges, queues & topics. Persistence is a property of messages. By default, none of these entities preserve their state across a server failure or restart, because doing so has performance implications. Persistence means that if the broker suddenly stops for some reason, our messages can be recovered on the next restart. The gotcha is that, in order to persist messages, RabbitMQ has to write them to disk before they are processed, so throughput is bounded by disk I/O performance. RabbitMQ persists messages in a special file, and that file is garbage collected frequently. To support durability, developers have to pass the ‘durable’ flag as true in the code when creating an exchange or a queue. The following is sample Python code for exchange creation:
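A minimal sketch of what such a declaration can look like, assuming the pika client library. The function name and the exchange name `orders` are made up for illustration; the part that matters is `durable=True`:

```python
def declare_durable_exchange(channel, exchange_name="orders"):
    """Declare a topic exchange whose definition survives a broker restart.

    `channel` is assumed to be a pika channel; "orders" is a hypothetical name.
    """
    channel.exchange_declare(
        exchange=exchange_name,
        exchange_type="topic",
        durable=True,  # keep the exchange definition across restarts
    )
    return exchange_name
```

With a broker running locally, a channel would typically come from `pika.BlockingConnection(pika.ConnectionParameters("localhost")).channel()`.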
Similarly, the durable flag has to be passed when declaring a queue. All client libraries support this feature. If the queue is durable but the exchange is non-durable, or vice versa, then on the next broker restart the non-durable entity disappears along with the binding between them, so the queue is effectively lost to your publishers until everything is redeclared. The topic that binds the exchange & the queue is automatically durable when both the exchange and the queue are durable.
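The queue side can be sketched the same way, again assuming the pika client and assuming that a durable exchange already exists. The names `order_events`, `orders` and the binding key `order.#` are placeholders, not anything prescribed by RabbitMQ:

```python
def declare_durable_queue(channel, queue_name="order_events",
                          exchange_name="orders", binding_key="order.#"):
    """Declare a durable queue and bind it to an existing exchange.

    `channel` is assumed to be a pika channel; all names are hypothetical.
    """
    # durable=True keeps the queue definition across broker restarts
    channel.queue_declare(queue=queue_name, durable=True)
    # the binding between a durable exchange and a durable queue is kept too
    channel.queue_bind(queue=queue_name, exchange=exchange_name,
                       routing_key=binding_key)
    return queue_name
```

Redeclaring an existing queue with a different `durable` value is rejected by the broker, so it is worth settling on durability before the first deployment.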
But a durable exchange & queue do not mean that the messages in the queue are also durable. In order to make the messages survive a restart, you have to mark them as persistent. A message has a property called ‘delivery_mode’, and delivery_mode=2 means persistent. The following is a sample Python example:
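channel.basic_publish(exchange=exchange_name, routing_key=routing_key, body=message,  # placeholder names
                      properties=pika.BasicProperties(delivery_mode=2))  # 2 = persistent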
If the message is persistent but the queue or the exchange is not durable, that message will not survive a broker restart. So to guarantee proper message persistence, you have to declare both the exchange & the queue as durable and set the message delivery mode to persistent.
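The three conditions can be written down as a small checklist. This is only an illustrative helper, not part of any RabbitMQ client, and the function name is made up:

```python
PERSISTENT = 2  # delivery_mode value that marks a message as persistent


def durability_gaps(exchange_durable, queue_durable, delivery_mode):
    """List which of the three durability conditions are not met."""
    gaps = []
    if not exchange_durable:
        gaps.append("exchange is not durable")
    if not queue_durable:
        gaps.append("queue is not durable")
    if delivery_mode != PERSISTENT:
        gaps.append("message is not persistent")
    return gaps
```

An empty list means all three conditions hold; for example, `durability_gaps(True, False, 2)` reports only the non-durable queue.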
2. Scalable Message Routing: The AMQP standard uses topics to bind queues to exchanges. A topic can be expressed as a straightforward string literal, as an empty string, or as a wildcard pattern, which behaves like a restricted regular expression. Now imagine RabbitMQ deployed in a very high-scale environment with, say, 20,000 messages per second. If the message routing algorithm is slow, it will hurt performance. Up to the 2.4.0 release, RabbitMQ seemed to be using a very naive pattern-matching approach for incoming messages. From 2.4.0 onward, it uses a trie data structure to find all matching routing paths. They could have used one-to-one matching of the message routing key to actual routing paths, but that would mean caching a lot of unnecessary data and precomputing many unnecessary routing paths from the patterns declared in the topics. A trie usually takes some extra space, but it is very fast in terms of runtime complexity. Because topics can contain wildcards, matching has to backtrack through the trie in order to find all matching combinations.
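To see why a naive approach costs performance, here is a rough sketch of per-binding matching. It is not RabbitMQ's old implementation, just an illustration with made-up function names. It uses RabbitMQ's wildcard convention that words are separated by dots, `*` matches exactly one word and `#` matches zero or more words, and it assumes an empty key contains no words:

```python
def split_words(key):
    # Assumption of this sketch: an empty key has no words.
    return key.split(".") if key else []


def pattern_matches(pattern_words, key_words):
    """Return True if the wildcard pattern matches the whole routing key."""
    if not pattern_words:
        return not key_words
    head, rest = pattern_words[0], pattern_words[1:]
    if head == "#":
        # '#' may absorb zero or more words, so try every split point
        return any(pattern_matches(rest, key_words[i:])
                   for i in range(len(key_words) + 1))
    if not key_words:
        return False
    if head == "*" or head == key_words[0]:
        return pattern_matches(rest, key_words[1:])
    return False


def route_naive(bindings, routing_key):
    """Check every (pattern, queue) binding against the routing key."""
    key_words = split_words(routing_key)
    return {queue for pattern, queue in bindings
            if pattern_matches(split_words(pattern), key_words)}
```

Every published message is compared against every binding, so the work per message grows with the number of bindings even when most of them share a common prefix.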
Following is a direct example from RabbitMQ’s own research:
RabbitMQ uses the following conventions: * matches exactly one word, and # matches zero or more words, where words are separated by dots. A trie gives a good balance in the space & time tradeoff, and a good worst-case performance on the order of the maximum possible length of the actual routing key. The RabbitMQ team also published their own benchmarks for the different approaches they tried while deciding which data structure to use to match routing keys; please check the references for more details. This is one of the real-life implementations of a trie in a scalable environment.
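The idea can be sketched in a few lines of Python. This is a simplified illustration and not RabbitMQ's actual implementation; the class and method names are invented, and an empty key is assumed to contain no words. Binding patterns are stored word by word in a trie, and routing walks the trie, backtracking wherever a `*` or `#` branch exists:

```python
def split_words(key):
    # Assumption of this sketch: an empty key has no words.
    return key.split(".") if key else []


class TrieNode:
    def __init__(self):
        self.children = {}   # word (or '*' / '#') -> TrieNode
        self.queues = set()  # queues whose binding pattern ends here


class TopicTrie:
    def __init__(self):
        self.root = TrieNode()

    def bind(self, pattern, queue):
        """Insert a binding pattern into the trie, one word per level."""
        node = self.root
        for word in split_words(pattern):
            node = node.children.setdefault(word, TrieNode())
        node.queues.add(queue)

    def route(self, routing_key):
        """Return every queue whose pattern matches the routing key."""
        found = set()
        self._walk(self.root, split_words(routing_key), found)
        return found

    def _walk(self, node, words, found):
        hash_child = node.children.get("#")
        if hash_child is not None:
            # '#' may consume zero or more of the remaining words
            for i in range(len(words) + 1):
                self._walk(hash_child, words[i:], found)
        if not words:
            found.update(node.queues)
            return
        # follow the exact word and the single-word wildcard, if present
        for label in (words[0], "*"):
            child = node.children.get(label)
            if child is not None:
                self._walk(child, words[1:], found)
```

For example, after binding `stock.*.nyse`, `stock.#` and `#` to three different queues, `route("stock.usd.nyse")` returns all three, while `route("stock")` returns only the queues bound with `stock.#` and `#`. Patterns that share a prefix share trie nodes, so the common part is compared only once per message.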