CloudBolt High Availability (HA) Guide

Overview

For High Availability purposes, CloudBolt can be thought of as three services:

  • Web server
  • Job Engine
  • Database

By default, all of these services run on a single machine. Each service can also run on its own machine, or on several machines. For example, seven separate machines could host two web servers, three Job Engines, and a master database that replicates to a minion database.

Recommendations

We recommend customers start with our appliance as we deliver it. If you require a geo-diverse or high-availability cluster, consider the following setup.

Start with a 3-node cluster that contains:
  • 1 web server
  • 1 Job Engine
  • An external database

You can then provision more CloudBolt web servers or Job Engines as your performance needs require.

Note

Multiple web servers will need to be provisioned behind a load balancer if you wish to have a single domain name for your CloudBolt instance.

Increasing performance

Try the following if you’re experiencing performance slowdown.

  • Increase the number of CPUs on the CloudBolt appliance.
  • Increase the amount of memory on the CloudBolt appliance.
  • Increase the number of running Job Engine processes, as described in Number of processes.

Shared Directories

The CloudBolt web server and Job Engine both need access to the following three directories. Administrators must take this into account when planning to run CloudBolt in a High Availability configuration.

/var/opt/cloudbolt/: Stores a variety of instance-specific files, such as secret keys and downloaded pricing data.

/var/www/html/cloudbolt/static/uploads/: Stores a variety of user-uploaded data, such as code for user-created Actions and branding images.

/var/log/cloudbolt/jobs/: Stores the logs for jobs run through the CloudBolt Job Engine. The web server needs access to these so that users can view job logs through the web interface.
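
Before setting up any sharing mechanism, it can help to confirm that all three directories are present and readable on each web server and Job Engine machine. The following is a minimal sketch, not part of CloudBolt itself: the function name check_shared_dirs and its optional root-prefix argument are hypothetical and exist only to make the check easy to try against a scratch directory.

```shell
#!/bin/sh
# Hypothetical helper: report whether the three shared CloudBolt
# directories exist and are readable on this machine.
# $1 is an optional root prefix (an assumption, handy for dry runs);
# with no argument the real absolute paths are checked.
check_shared_dirs() {
    root="${1:-}"
    missing=0
    for dir in /var/opt/cloudbolt \
               /var/www/html/cloudbolt/static/uploads \
               /var/log/cloudbolt/jobs; do
        if [ -d "$root$dir" ] && [ -r "$root$dir" ]; then
            echo "ok: $dir"
        else
            echo "missing: $dir"
            missing=1
        fi
    done
    # Return 0 only when every directory was found.
    return "$missing"
}
```

Calling check_shared_dirs with no argument checks the real paths. The function returns a non-zero status if any directory is missing, so it can gate other setup steps.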

Sharing Files/Directories

There are various options for sharing files/directories across multiple machines:

rsync

One option is to use a recurring rsync task that transfers missing files and directories across machines on a regular basis. However, every machine that is running a CloudBolt Job Engine or web server will need to rsync with every other one, so this approach quickly becomes unsustainable. Beyond two machines, we recommend using shared directories on a Network File System (see below).
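
To see why the full-mesh rsync approach scales poorly, consider how many one-way transfers are needed when each of N machines pushes to every other machine: N * (N - 1) per directory. The sketch below simply does that arithmetic. The function name rsync_links is illustrative only.

```shell
#!/bin/sh
# Count the one-way rsync transfers needed for a full mesh of N machines,
# where every machine pushes to every other machine: N * (N - 1).
rsync_links() {
    n=$1
    echo $(( n * (n - 1) ))
}

# Each figure is per shared directory; multiply by 3 for all of them.
for n in 2 3 5; do
    echo "$n machines: $(rsync_links "$n") transfers per directory"
done
```

With two machines there are only 2 transfers per directory, but five machines already need 20, each of which must be scheduled and monitored.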

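To push these three directories from this machine to a second machine (cb-ha2 in this example) every 15 minutes, add the following crontab entries. Each entry includes a user field (root), so they belong in a system crontab such as /etc/crontab or a file under /etc/cron.d/ rather than in a per-user crontab. Replace <log_location> with the path of the log file to append to.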

0,15,30,45 * * * * root rsync -av /var/opt/cloudbolt/ root@cb-ha2:/var/opt/cloudbolt/ >> <log_location> 2>&1
0,15,30,45 * * * * root rsync -av /var/www/html/cloudbolt/static/uploads/ root@cb-ha2:/var/www/html/cloudbolt/static/uploads/ >> <log_location> 2>&1
0,15,30,45 * * * * root rsync -av /var/log/cloudbolt/jobs/ root@cb-ha2:/var/log/cloudbolt/jobs/ >> <log_location> 2>&1
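
As an alternative to three near-identical entries, the same transfers could be wrapped in one script and scheduled with a single cron line. This is a sketch under stated assumptions: the PEER and DRY_RUN variables and the sync_cmd function are hypothetical names, and the peer hostname cb-ha2 is taken from the example above. By default the script only prints the commands it would run.

```shell
#!/bin/sh
# Hypothetical wrapper: push the three shared CloudBolt directories to one peer.
PEER="${PEER:-cb-ha2}"   # peer hostname from the crontab example
DIRS="/var/opt/cloudbolt/ /var/www/html/cloudbolt/static/uploads/ /var/log/cloudbolt/jobs/"

# Print the rsync command for one directory ($1) and one peer ($2).
# The trailing slash on the source makes rsync copy the directory contents.
sync_cmd() {
    printf 'rsync -av %s root@%s:%s\n' "$1" "$2" "$1"
}

for dir in $DIRS; do
    if [ "${DRY_RUN:-1}" = "1" ]; then
        # Default: show what would be run without touching the network.
        sync_cmd "$dir" "$PEER"
    else
        rsync -av "$dir" "root@$PEER:$dir"
    fi
done
```

Setting DRY_RUN=0 runs the transfers for real. Like the crontab entries, this only pushes in one direction, so each machine needs its own copy pointed at its peer.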

The recurring rsync can also be run under supervisor, which is what CloudBolt uses to run the Job Engine.

Network file system

Setting up a Network File System (NFS) on CentOS is more involved than scheduling a recurring rsync command, but it scales better when sharing files across multiple machines.
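
As a rough illustration of what the client side of an NFS setup might look like, the sketch below prints one /etc/fstab line per shared directory. Everything specific here is an assumption: the server name nfs.example.com, the /export prefix, and the function name nfs_fstab_entries are hypothetical, and the mount options shown are the generic defaults rather than a CloudBolt recommendation. The script only prints text and does not modify any files.

```shell
#!/bin/sh
# Hypothetical helper: print /etc/fstab entries that mount the three
# shared CloudBolt directories from an NFS server.
#   $1 = NFS server hostname (assumption)
#   $2 = export prefix on that server (assumption)
nfs_fstab_entries() {
    server=$1
    export_root=$2
    for dir in /var/opt/cloudbolt \
               /var/www/html/cloudbolt/static/uploads \
               /var/log/cloudbolt/jobs; do
        # fstab fields: device, mount point, type, options, dump, pass
        printf '%s:%s%s %s nfs defaults 0 0\n' \
            "$server" "$export_root" "$dir" "$dir"
    done
}

nfs_fstab_entries nfs.example.com /export
```

Review the printed lines before adding them to /etc/fstab on each web server and Job Engine machine, and adjust the options to suit your environment.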

Other options

Any setup that allows multiple machines to access/maintain identical copies of directories will work for a CloudBolt High Availability arrangement.