RabbitMQ Single Point of Failure

A single point of failure is a part of a system that, if it fails, stops the entire system from working. Single points of failure are undesirable in any system that aims for high availability or reliability. In this post we will go through the steps required to ensure that RabbitMQ won’t be a single point of failure in your system.

Step One: Start Multiple Docker Instances of RabbitMQ

This one is self-explanatory. If you want to remove RabbitMQ as a single point of failure, you need more than one RabbitMQ instance running. If one of them fails, the other one will keep messages flowing. To keep the example simple, we will start two Docker containers instead of installing RabbitMQ on two different servers.

  • First of all, Docker needs to be installed on your system. To do so you will need to register on hub.docker.com and download the Docker version compatible with your operating system. If Docker was installed successfully, you should see the Docker version when you run ‘docker -v’ in a terminal.
codespace:~ lab$ docker -v
 Docker version 19.03.1, build 74b1e89
  • Install Docker Compose. A guide on how to do so can be found here: Docker-Compose. If the process was successful, you should be able to check the docker-compose version by running ‘docker-compose -v’ in a terminal.
codespace:~ lab$ docker-compose -v
docker-compose version 1.24.1, build 4667896b
  • Create a Docker network for our RabbitMQ cluster by running ‘docker network create rabbitmq-cluster’ in a terminal.
codespace:~ lab$ docker network create rabbitmq-cluster

69362a49ac7f62a217c74f84318e4809065c71a943ae83a26e47e796740059f1
  • Create a new directory named rabbitmq and a file named docker-compose.yml in it. Content of docker-compose.yml:
version: '3.6'
 
networks:
  default:
    external:
      name: rabbitmq-cluster
 
services:
  rabbitmq-01:
    image: rabbitmq:3.7.17-management
    hostname: rabbitmq-01
    environment:
      - RABBITMQ_DEFAULT_USER=admin
      - RABBITMQ_DEFAULT_PASS=guest
      - RABBITMQ_ERLANG_COOKIE="MY-SECRET-KEY"
    volumes:
      - ./definitions.json:/etc/rabbitmq/definitions.json
    ports:
      - '5672:5672'
      - '15672:15672'
 
  rabbitmq-02:
    image: rabbitmq:3.7.17-management
    hostname: rabbitmq-02
    environment:
      - RABBITMQ_DEFAULT_USER=admin
      - RABBITMQ_DEFAULT_PASS=guest
      - RABBITMQ_ERLANG_COOKIE="MY-SECRET-KEY"
    volumes:
      - ./definitions.json:/etc/rabbitmq/definitions.json
    ports:
      - '5673:5672'
      - '15673:15672'
  • Create a definitions.json file in the rabbitmq directory (the same directory as docker-compose.yml). Content of the definitions.json file:
{
  "rabbit_version": "3.7.17",
  "users": [
    {
      "name": "admin",
      "password_hash": "fd0GyzAf6C6hmgCJ5VU+TSyzUNlzypPlGb7VDKkqUvJqVxyd",
      "hashing_algorithm": "rabbit_password_hashing_sha256",
      "tags": "administrator"
    }
  ],
  "vhosts": [
    {
      "name": "/"
    }
  ],
  "permissions": [
    {
      "user": "admin",
      "vhost": "/",
      "configure": ".*",
      "write": ".*",
      "read": ".*"
    }
  ],
  "parameters": [],
  "policies": [
    {
      "vhost": "/",
      "name": "ha",
      "pattern": "",
      "definition": {
        "ha-mode": "exactly",
        "ha-params": 2,
        "ha-sync-mode": "automatic",
        "ha-sync-batch-size": 5
      }
    }
  ],
  "queues": [
    {
      "name": "q.user.created",
      "vhost": "/",
      "durable": true,
      "auto_delete": true,
      "arguments": {}
    }
  ],
  "exchanges": [
    {
      "name": "e.user.created",
      "vhost": "/",
      "type": "topic",
      "durable": true,
      "auto_delete": false,
      "internal": false,
      "arguments": {}
    }
  ],
  "bindings": [
    {
      "source": "e.user.created",
      "vhost": "/",
      "destination": "q.user.created",
      "destination_type": "queue",
      "routing_key": "user.created",
      "arguments": {}
    }
  ]
}
  • Navigate to the rabbitmq directory in a terminal and run: ‘docker-compose up -d’
codespace:~ lab$ cd Documents/docker/rabbitmq/
codespace:rabbitmq lab$ docker-compose up -d
Creating rabbitmq_rabbitmq-02_1 ... done
Creating rabbitmq_rabbitmq-01_1 ... done
  • Run ‘docker ps’ to confirm that both RabbitMQ nodes started successfully. We should have two containers running.
codespace:rabbitmq lab$ docker ps
CONTAINER ID        IMAGE                        COMMAND                  CREATED              STATUS              PORTS                                                                                        NAMES
41969230b1c7        rabbitmq:3.7.17-management   "docker-entrypoint.s…"   About a minute ago   Up About a minute   4369/tcp, 5671/tcp, 0.0.0.0:5672->5672/tcp, 15671/tcp, 25672/tcp, 0.0.0.0:15672->15672/tcp   rabbitmq_rabbitmq-01_1
db0a13ceb09c        rabbitmq:3.7.17-management   "docker-entrypoint.s…"   About a minute ago   Up About a minute   4369/tcp, 5671/tcp, 15671/tcp, 25672/tcp, 0.0.0.0:5673->5672/tcp, 0.0.0.0:15673->15672/tcp   rabbitmq_rabbitmq-02_1
  • If you want to start more than two nodes, simply edit docker-compose.yml and add configuration for as many nodes as you need. (Make sure you update the ports section if you will be running multiple nodes on one box.) To add a third node we could just stop the running containers, paste this configuration block under services in docker-compose.yml and execute ‘docker-compose up -d’ again.
  rabbitmq-03:
    image: rabbitmq:3.7.17-management
    hostname: rabbitmq-03
    environment:
      - RABBITMQ_DEFAULT_USER=admin
      - RABBITMQ_DEFAULT_PASS=guest
      - RABBITMQ_ERLANG_COOKIE="MY-SECRET-KEY"
    volumes:
      - ./definitions.json:/etc/rabbitmq/definitions.json
    ports:
      - '5674:5672'
      - '15674:15672'
  • More details about the RabbitMQ Docker image and its latest version can be found here: RabbitMQ
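
The three service blocks above differ only in the node number and the host ports. As a rough sketch of that pattern, the shell function below prints a service block for any node number. The function name node_service and the port formula (5671 + n for AMQP, 15671 + n for the management console) are assumptions inferred from the examples in this post, not something the RabbitMQ image requires. Note that every node gets the same Erlang cookie, because nodes with different cookies cannot join the same cluster.

```shell
#!/bin/sh
# node_service N: print a docker-compose service block for RabbitMQ node N.
# Hypothetical helper; host ports follow the pattern used in this post:
# node 1 -> 5672/15672, node 2 -> 5673/15673, node 3 -> 5674/15674, ...
node_service() {
  n="$1"
  name=$(printf 'rabbitmq-%02d' "$n")
  amqp_port=$((5671 + n))
  mgmt_port=$((15671 + n))
  # The cookie value is the same for every node on purpose.
  cat <<EOF
  ${name}:
    image: rabbitmq:3.7.17-management
    hostname: ${name}
    environment:
      - RABBITMQ_DEFAULT_USER=admin
      - RABBITMQ_DEFAULT_PASS=guest
      - RABBITMQ_ERLANG_COOKIE="MY-SECRET-KEY"
    volumes:
      - ./definitions.json:/etc/rabbitmq/definitions.json
    ports:
      - '${amqp_port}:5672'
      - '${mgmt_port}:15672'
EOF
}
```

For example, running ‘node_service 4’ prints a block for rabbitmq-04 with host ports 5675 and 15675, which could then be pasted under services in docker-compose.yml.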

Step Two: Run RabbitMQ HA mode

High-availability clusters (also known as HA clusters or fail-over clusters) are groups of servers that support server applications that can be reliably utilized with a minimum amount of downtime. To run a RabbitMQ cluster in HA mode, we need to let the instances we started in the previous step know about each other. To do that:

  • We get the container name of our first instance by running: docker ps
codespace:rabbitmq lab$ docker ps
CONTAINER ID        IMAGE                        COMMAND                  CREATED              STATUS              PORTS                                                                                        NAMES
41969230b1c7        rabbitmq:3.7.17-management   "docker-entrypoint.s…"   About a minute ago   Up About a minute   4369/tcp, 5671/tcp, 0.0.0.0:5672->5672/tcp, 15671/tcp, 25672/tcp, 0.0.0.0:15672->15672/tcp   rabbitmq_rabbitmq-01_1
db0a13ceb09c        rabbitmq:3.7.17-management   "docker-entrypoint.s…"   About a minute ago   Up About a minute   4369/tcp, 5671/tcp, 15671/tcp, 25672/tcp, 0.0.0.0:5673->5672/tcp, 0.0.0.0:15673->15672/tcp   rabbitmq_rabbitmq-02_1
  • Then we open a shell in that Docker container: docker exec -it rabbitmq_rabbitmq-01_1 /bin/bash
codespace:rabbitmq lab$ docker exec -it rabbitmq_rabbitmq-01_1 /bin/bash
root@rabbitmq-01:/#
  • Execute the following commands to join this instance to a cluster with the second one:
rabbitmqctl stop_app
rabbitmqctl reset
rabbitmqctl join_cluster rabbit@rabbitmq-02
rabbitmqctl start_app
  • If the process was successful, a cluster of two nodes should be visible in the management console. In case of a failure, make sure that the ‘rabbitmqctl join_cluster rabbit@rabbitmq-02’ command didn’t throw any errors. Common errors: a misspelt node name, or nodes that are not on the same Docker network.
[Image: RabbitMQ management console]
  • Make sure that ‘1’ is displayed in the Info section of the management console next to each service instance. It means that the cluster is working in HA mode.
  • To add more nodes to the cluster, repeat the same steps for each extra instance.
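
The join steps above can also be run from the host without opening a shell inside the container. The function below is a minimal sketch of that idea: it runs the same four rabbitmqctl commands through ‘docker exec’ and finishes with ‘rabbitmqctl cluster_status’ so that the result is printed. The function name join_cluster and the DOCKER variable are hypothetical. DOCKER exists only so that the commands can be previewed (DOCKER=echo) before running them for real.

```shell
#!/bin/sh
# join_cluster CONTAINER TARGET_NODE
#   CONTAINER   - Docker container of the node that should join
#   TARGET_NODE - Erlang node name of an existing member, e.g. rabbit@rabbitmq-02
# Hypothetical helper. Set DOCKER=echo to print the commands instead of running them.
join_cluster() {
  container="$1"
  target="$2"
  docker_cmd="${DOCKER:-docker}"
  # Each step runs only if the previous one succeeded.
  "$docker_cmd" exec "$container" rabbitmqctl stop_app &&
  "$docker_cmd" exec "$container" rabbitmqctl reset &&
  "$docker_cmd" exec "$container" rabbitmqctl join_cluster "$target" &&
  "$docker_cmd" exec "$container" rabbitmqctl start_app &&
  "$docker_cmd" exec "$container" rabbitmqctl cluster_status
}
```

With the containers from this post, the call would be ‘join_cluster rabbitmq_rabbitmq-01_1 rabbit@rabbitmq-02’. For a third node, the same call with that node’s container name should work, assuming it uses the same Erlang cookie and Docker network.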

Step Three: Mirrored Queues

Our RabbitMQ cluster now has two nodes talking to each other. But what happens if one of them fails? By default, the contents of a queue within a RabbitMQ cluster are located on a single node. If that node fails, the queues created on it, along with all the messages in them, would be lost. To prevent that, a feature called Mirrored Queues can be used.

From RabbitMQ documentation: Each mirrored queue consists of one master and one or more mirrors. The master is hosted on one node commonly referred as the master node. Each queue has its own master node. All operations for a given queue are first applied on the queue’s master node and then propagated to mirrors.

Mirroring parameters are configured using policies. A policy matches one or more queues by name and contains a definition that is added to the total set of properties of the matching queues. Let’s add a mirroring policy to our cluster:

  • We get the container name of our first instance by running: docker ps
codespace:rabbitmq lab$ docker ps
CONTAINER ID        IMAGE                        COMMAND                  CREATED              STATUS              PORTS                                                                                        NAMES
41969230b1c7        rabbitmq:3.7.17-management   "docker-entrypoint.s…"   About a minute ago   Up About a minute   4369/tcp, 5671/tcp, 0.0.0.0:5672->5672/tcp, 15671/tcp, 25672/tcp, 0.0.0.0:15672->15672/tcp   rabbitmq_rabbitmq-01_1
db0a13ceb09c        rabbitmq:3.7.17-management   "docker-entrypoint.s…"   About a minute ago   Up About a minute   4369/tcp, 5671/tcp, 15671/tcp, 25672/tcp, 0.0.0.0:5673->5672/tcp, 0.0.0.0:15673->15672/tcp   rabbitmq_rabbitmq-02_1
  • Then we open a shell in that Docker container with: docker exec -it rabbitmq_rabbitmq-01_1 /bin/bash
codespace:rabbitmq lab$ docker exec -it rabbitmq_rabbitmq-01_1 /bin/bash
root@rabbitmq-01:/#
  • And set the policy: rabbitmqctl set_policy ha-all "" '{"ha-sync-mode": "automatic", "ha-mode": "all", "ha-sync-batch-size": 5}'
root@rabbitmq-01:/# rabbitmqctl set_policy ha-all "" '{"ha-sync-mode": "automatic", "ha-mode": "all", "ha-sync-batch-size": 5}'

Setting policy "ha-all" for pattern "" to "{"ha-sync-mode": "automatic", "ha-mode": "all", "ha-sync-batch-size": 5}" with priority "0" for vhost "/" ...
root@rabbitmq-01:/#
  • Open the Queues section in the management console to verify. The important part here is the “+1” next to the node name. It shows how many times the queue is mirrored.
[Image: RabbitMQ HA mirrored queues in the management console]
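
Besides the management console, mirroring can be checked from the command line by simulating a failure. The sketch below lists the queues with their policy and mirrors, stops one node, lists the queues again on the surviving node, and then starts the stopped node again. The function name failover_check and the DOCKER variable are hypothetical, and DOCKER=echo only previews the commands. The sketch assumes the two containers from this post and the ‘ha-all’ policy set above.

```shell
#!/bin/sh
# failover_check SURVIVOR VICTIM
#   SURVIVOR - container that stays up and answers the queries
#   VICTIM   - container that is stopped to simulate a node failure
# Hypothetical helper. Set DOCKER=echo to print the commands instead of running them.
failover_check() {
  survivor="$1"
  victim="$2"
  docker_cmd="${DOCKER:-docker}"
  # Before the failure: each queue with its policy and the processes mirroring it.
  "$docker_cmd" exec "$survivor" rabbitmqctl list_queues name policy slave_pids &&
  # Simulate the failure of one node.
  "$docker_cmd" stop "$victim" &&
  # After the failure: mirrored queues should still be listed on the survivor.
  "$docker_cmd" exec "$survivor" rabbitmqctl list_queues name messages &&
  # Bring the stopped node back.
  "$docker_cmd" start "$victim"
}
```

An example call is ‘failover_check rabbitmq_rabbitmq-01_1 rabbitmq_rabbitmq-02_1’. If a queue disappears from the second listing, the policy most likely did not match it.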

That is it. Our RabbitMQ cluster is no longer a single point of failure in the system and is ready to be used in production. How to connect your Spring Boot application to it is explained here.
