Setting up High Availability in CloudBolt

Follow the steps below to set up a High Availability arrangement for your CloudBolt instance(s).

Before You Begin

Follow these one-time steps for all existing servers:

  1. Secret key: Copy the secret key from the initial CloudBolt appliance to any CloudBolt web servers or job engines that are provisioned later. This key is used to encrypt and decrypt sensitive fields on the CloudBolt database.

    scp /var/opt/cloudbolt/secrets/secret-key-for-apache.bin root@cb-ha2:/var/opt/cloudbolt/secrets/
    
  2. Unique token: Copy the UNIQUE_TOKEN section of /var/opt/cloudbolt/proserv/customer_settings.py from the initial CloudBolt appliance to the customer_settings.py file on any CloudBolt web servers or job engines that are provisioned later.
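One way to confirm that the secret key was copied intact is to compare checksums of the file on both machines. The following is a minimal sketch, not a CloudBolt tool: `same_checksum` is a hypothetical helper, and it is demonstrated on temporary files so that it can be run anywhere. On real servers you would instead run `sha256sum /var/opt/cloudbolt/secrets/secret-key-for-apache.bin` on each host and compare the two digests.

```shell
#!/usr/bin/env bash
# Hypothetical helper: succeed only if two files have identical SHA-256 digests.
same_checksum() {
    local a b
    a=$(sha256sum "$1" | awk '{print $1}')
    b=$(sha256sum "$2" | awk '{print $1}')
    [ "$a" = "$b" ]
}

# Throwaway files standing in for the key on two hosts (names are illustrative).
workdir=$(mktemp -d)
printf 'example-key-material' > "$workdir/key-on-cb-ha1.bin"
cp "$workdir/key-on-cb-ha1.bin" "$workdir/key-on-cb-ha2.bin"

if same_checksum "$workdir/key-on-cb-ha1.bin" "$workdir/key-on-cb-ha2.bin"; then
    key_status="match"
else
    key_status="MISMATCH"
fi
echo "secret key copies: $key_status"
```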

Web Server

CloudBolt uses the Apache HTTP server, which runs as the httpd service on the CloudBolt appliance.

Load balancer

If you are using multiple web servers, we recommend putting them behind a load balancer so that all of them are reachable through a single IP address. One option is the free load balancer software HAProxy.
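As an illustration of the load balancer arrangement, the sketch below writes an example HAProxy configuration to a scratch file. Everything specific in it is an assumption: the host names cb-ha1 and cb-ha2, the 10.0.0.x addresses, the TCP pass-through mode, and the timeout values are placeholders rather than CloudBolt recommendations.

```shell
#!/usr/bin/env bash
# Sketch only: write an example HAProxy configuration to a scratch file.
# cb-ha1/cb-ha2 and the 10.0.0.x addresses are placeholders, not CloudBolt defaults.
haproxy_cfg=$(mktemp)
cat > "$haproxy_cfg" <<'EOF'
defaults
    mode tcp
    timeout connect 5s
    timeout client  60s
    timeout server  60s

frontend cloudbolt_https
    bind *:443
    default_backend cloudbolt_web

backend cloudbolt_web
    balance roundrobin
    server cb-ha1 10.0.0.11:443 check
    server cb-ha2 10.0.0.12:443 check
EOF

# Count the backend entries as a quick sanity check of the generated file.
backend_count=$(grep -c '^ *server ' "$haproxy_cfg")
echo "wrote $haproxy_cfg with $backend_count backend servers"
```

On a machine with HAProxy installed, `haproxy -c -f <file>` checks a configuration file for syntax errors before it is put into service.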

Configuring web servers

To create a web server, clone an existing CloudBolt web server or run the CloudBolt installer on a new VM. Each CloudBolt web server needs the source code in /opt/cloudbolt/, as well as the Shared Directories described in the CloudBolt High Availability (HA) Guide.

Note

If this machine is going to be a CloudBolt web server, it will not need to run either a job engine or the mysqld service.

Job engine

In the job engine supervisord configuration file at /etc/supervisord.d/jobengine.conf:

  • Change the line that begins with autostart to autostart=false.
  • Change the line that begins with autorestart to autorestart=false.

Use the command supervisorctl reload to tell supervisor to reload its configuration files.
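The two edits above can also be scripted. The sketch below applies them to a scratch copy so that it is safe to run anywhere; the sample file content, including the `[program:jobengine]` section name, is illustrative and a real jobengine.conf will contain more settings.

```shell
#!/usr/bin/env bash
# Sketch: apply the two edits to a scratch file rather than the live configuration.
# The sample content below is illustrative only.
conf=$(mktemp)
cat > "$conf" <<'EOF'
[program:jobengine]
autostart=true
autorestart=true
EOF

# Rewrite any line that begins with autostart or autorestart.
sed -e 's/^autostart=.*/autostart=false/' \
    -e 's/^autorestart=.*/autorestart=false/' "$conf" > "$conf.new"
mv "$conf.new" "$conf"

cat "$conf"
# On the real server, after editing /etc/supervisord.d/jobengine.conf:
#   supervisorctl reload
```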

mysqld

This is the daemon that runs the MySQL database and is not necessary on a CloudBolt web server that is connecting to an external database.

  • systemctl stop mysqld: This command turns off mysqld.
  • systemctl disable mysqld: This command stops mysqld from starting when this CloudBolt appliance reboots.

Database connection

If this machine is connecting to an external database, you must update the database settings to point to the external database:

  1. Copy the DATABASES section from /opt/cloudbolt/settings_local.py to /var/opt/cloudbolt/proserv/customer_settings.py
  2. Update the HOST key in the DATABASES section to be the IP address or Fully-Qualified Domain Name (FQDN) of your database server. For example: 'HOST': '<FQDN-of-DB>'.
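The HOST update in step 2 can be sketched as a one-line substitution. The example runs against a scratch file that mimics a DATABASES section; the sample keys and values, and the host name db.example.com, are assumptions for illustration and will differ from your actual settings.

```shell
#!/usr/bin/env bash
# Sketch: update the HOST value in a scratch file that mimics a DATABASES section.
# db.example.com and the sample settings are placeholders.
settings=$(mktemp)
cat > "$settings" <<'EOF'
DATABASES = {
    'default': {
        'ENGINE': 'django.db.backends.mysql',
        'NAME': 'cloudbolt',
        'HOST': '127.0.0.1',
    }
}
EOF

db_host='db.example.com'
# Replace whatever value HOST currently has with the external database host.
sed "s/'HOST': *'[^']*'/'HOST': '$db_host'/" "$settings" > "$settings.new"
mv "$settings.new" "$settings"

grep "'HOST'" "$settings"
```

Editing customer_settings.py by hand works equally well; the substitution is mainly useful when the same change has to be repeated on several servers.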

Job Engine

CloudBolt runs its job engine through supervisor, which can be controlled through supervisorctl commands. To stop the job engine, first ensure there are no active jobs running, then use supervisorctl stop jobengine:* to stop all job engine processes. The job engine process(es) can be configured by editing the file at /etc/supervisord.d/jobengine.conf.
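After stopping the job engine, it can be useful to confirm that no process is still running. The sketch below filters `supervisorctl status` output for entries in the RUNNING state. `running_procs` is a hypothetical helper, and the sample text and process names stand in for real output. Note that this checks supervisor processes, not whether CloudBolt jobs are active.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print the names of supervisor processes still in the RUNNING state.
# Reads `supervisorctl status` output on stdin (name, state, description columns).
running_procs() {
    awk '$2 == "RUNNING" { print $1 }'
}

# Sample text standing in for real output; process names are illustrative.
sample_status='jobengine:jobengine_0   STOPPED   Jun 01 10:00 AM
jobengine:jobengine_1   RUNNING   pid 2345, uptime 1:02:03'

still_running=$(printf '%s\n' "$sample_status" | running_procs)
echo "still running: ${still_running:-none}"
```

On a real server the equivalent check would be `supervisorctl status | running_procs`.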

Configuring Job Engine servers

To create a Job Engine server, clone an existing CloudBolt Job Engine server or run the CloudBolt installer on a new VM. Each CloudBolt Job Engine server needs the source code in /opt/cloudbolt/, as well as the Shared Directories described in the CloudBolt High Availability (HA) Guide.

Note

If this machine is going to be a CloudBolt Job Engine server, it will not need to run either a web server or the mysqld service.

Web server

Turn off the CloudBolt web server by stopping and disabling httpd using the following commands:

  • systemctl stop httpd: Stop the web server.
  • systemctl disable httpd: Disable the web server so that it does not start when this CloudBolt appliance boots.

mysqld

This is the daemon that runs the MySQL database and is not necessary on a CloudBolt Job Engine server that is connecting to an external database.

  • systemctl stop mysqld: This command turns off mysqld.
  • systemctl disable mysqld: This command stops mysqld from starting when this CloudBolt appliance reboots.

Database connection

If this machine is connecting to an external database, you must update the database settings to point to the external database:

  1. Copy the DATABASES section from /opt/cloudbolt/settings_local.py to /var/opt/cloudbolt/proserv/customer_settings.py
  2. Update the HOST key in the DATABASES section to be the IP address or Fully-Qualified Domain Name (FQDN) of your database server. For example: 'HOST': '<FQDN-of-DB>'.

Including/excluding specific job types

See the CloudBolt Job Engine documentation.

Database

Each CloudBolt web server and job engine must be configured to connect to an external MySQL database if one is in use.

Database servers do not need access to the CloudBolt source code found in /opt/cloudbolt/ or any of the other required directories for CloudBolt web servers and job engines.

Replication

MySQL database servers can be configured to support one-way replication or configured as a redundant cluster. See the MySQL documentation for more information on configuring replication and on configuring an InnoDB Cluster.

Troubleshooting

Error: Too Many Open Files (version 8.8)

Solution: Upgrade to a newer version. Alternatively, log in to your CloudBolt instance, make sure no jobs are currently running, then open a terminal window and run the following commands:

  1. file_max=$(cat /proc/sys/fs/file-max)
  2. minfds=$((file_max - 5000))
  3. sed -i "/minfds=1024.*/c\minfds=$minfds" /etc/supervisord.conf
  4. service supervisord restart

Note

Restarting supervisord also restarts the job engine, so any jobs that are running at that time will hang.
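Before changing the live file, the minfds edit can be rehearsed on a scratch copy. The sketch below follows the same logic as the commands above but writes to a temporary file and uses a plain substitution. The sample file content and the fallback value of 100000 are assumptions made only so that the example runs on any machine.

```shell
#!/usr/bin/env bash
# Sketch: rehearse the minfds change on a scratch file instead of /etc/supervisord.conf.
conf=$(mktemp)
printf '[supervisord]\nminfds=1024\n' > "$conf"

# Fall back to an arbitrary value if /proc is unavailable (assumption for the demo only).
file_max=$(cat /proc/sys/fs/file-max 2>/dev/null || echo 100000)
minfds=$((file_max - 5000))

# Replace the default minfds line with the computed value.
sed "s/^minfds=1024.*/minfds=$minfds/" "$conf" > "$conf.new"
mv "$conf.new" "$conf"

cat "$conf"
```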

Error: Last_IO_Error: Fatal error: The slave I/O thread stops because master and slave have equal MySQL server UUIDs; these UUIDs must be different for replication to work.

Solution: Delete /var/lib/mysql/auto.cnf from each non-master database instance, then restart mysqld on that instance so that it generates a new server UUID.