High Availability in PubSub+ Cloud

Event broker services can be deployed in high-availability (HA) redundancy groups. HA redundancy pairs event broker services 1:1 to provide fault tolerance and increase overall service availability. If one of the event broker services fails or is taken out of service, the other event broker service automatically takes over and serves the clients that were previously connected to the failed one. A brief interruption of less than one minute occurs during an HA failover. In comparison, outages of 15-30 minutes occur for Developer and standalone event broker services because they do not have HA redundancy.

To learn more about HA redundancy, see High Availability for Software Event Brokers.

HA Concepts

PubSub+ Cloud implements HA using an Active/Standby model with an arbiter node (Monitoring Node) for split-brain detection. This setup requires three event broker instances:

  • Primary messaging node
  • Backup messaging node
  • Monitoring node

Active-Standby HA Model

The primary and backup messaging nodes both run under the messaging node role, while the monitoring node runs under the monitoring node role. Each of these roles is fixed by the configuration and never changes. The HA group is fronted by a network load balancer that routes traffic to and from the active event broker service in the HA group (either the primary or backup).

When in operation, each messaging node assumes one of the Active/Standby roles: Active or Standby. At any one time, one messaging node is active and the other is standby.

With this model, a primary messaging node provides messaging services to clients, while a backup messaging node waits in standby mode and only provides service if the primary fails. A third event broker, the monitoring node, acts as a tie-breaker to prevent split-brain scenarios that would otherwise cause both the primary and backup messaging nodes to become active simultaneously.
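The tie-breaker idea can be illustrated with a small sketch. This is a generic arbiter pattern, not the actual decision logic used by PubSub+ event brokers; the function names and the reachability flags are hypothetical and exist only for illustration. The assumption is that the standby node takes over only when the monitoring node confirms that the active node is unreachable, and that an active node cut off from both other nodes gives up activity.

```python
def standby_should_take_over(standby_sees_active: bool,
                             standby_sees_monitor: bool,
                             monitor_sees_active: bool) -> bool:
    """Hypothetical rule: the standby node takes over only if it has lost the
    active node AND the monitoring node agrees the active node is unreachable."""
    return (not standby_sees_active) and standby_sees_monitor and (not monitor_sees_active)


def active_should_step_down(active_sees_standby: bool,
                            active_sees_monitor: bool) -> bool:
    """Hypothetical rule: an active node isolated from both other nodes
    gives up activity so that it cannot become a second active node."""
    return (not active_sees_standby) and (not active_sees_monitor)


# Case 1: only the link between the two messaging nodes is cut.
# The monitoring node still sees the active node, so the standby stays put.
print(standby_should_take_over(False, True, True))

# Case 2: the active node is isolated from everyone.
# It steps down, and the standby takes over with the monitor's confirmation.
print(active_should_step_down(False, False))
print(standby_should_take_over(False, True, False))
```

Under these rules, no combination of link failures leaves both messaging nodes active, which is the property the monitoring node exists to protect.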

Upon a failover, connections to the event broker service are automatically switched over from the primary to the backup messaging node.

The failover then proceeds in the following sequence:

  1. The backup event broker service takes over messaging activity.
  2. Once the failed primary event broker service comes back online, it resynchronizes to match the currently active backup event broker service.
  3. The primary messaging node takes on the “Standby” role.
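The sequence above can be modeled as a small state machine. This is a minimal sketch for illustration only: the `MessagingNode` and `HAGroup` classes and their method names are hypothetical, and resynchronization is represented by a comment rather than modeled.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class MessagingNode:
    configured_role: str  # fixed by configuration: "primary" or "backup"
    activity: str         # changes at runtime: "active", "standby", or "down"


class HAGroup:
    """Toy model of the two messaging nodes in an HA group."""

    def __init__(self) -> None:
        self.primary = MessagingNode("primary", "active")
        self.backup = MessagingNode("backup", "standby")

    def active_node(self) -> Optional[MessagingNode]:
        for node in (self.primary, self.backup):
            if node.activity == "active":
                return node
        return None

    def fail_primary(self) -> None:
        # Step 1: the backup takes over messaging activity.
        self.primary.activity = "down"
        self.backup.activity = "active"

    def recover_primary(self) -> None:
        # Step 2: the primary resynchronizes with the active backup (not modeled).
        # Step 3: the primary takes on the standby role.
        self.primary.activity = "standby"


group = HAGroup()
group.fail_primary()
group.recover_primary()
# Configured roles are unchanged; only the activity has swapped.
print(group.primary, group.backup)
```

Note that the configured roles never change in this model; only the `activity` field moves between nodes, which mirrors the distinction between fixed roles and Active/Standby state.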

HA in Public and Private Clouds

To ensure that a high-availability group is adequately provisioned, its pods run on different worker nodes. Additionally, the pods can be spread over multiple availability zones (AZs) when available. The following diagram shows a Kubernetes cluster that has worker nodes in three availability zones. For each HA service, the Cloud-Agent schedules the primary pod in one AZ, the backup pod in a second AZ, and the monitoring pod in a third AZ. This guarantees that pods for the same HA service do not run on the same hardware.
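The placement rules described above can be expressed as a simple check. This is a sketch only: it does not reflect how the Cloud-Agent or the Kubernetes scheduler is implemented, and the function name, the input shape, and the worker node and zone names are hypothetical.

```python
from typing import Dict, List, Tuple


def placement_problems(pods: Dict[str, Tuple[str, str]],
                       spread_across_zones: bool = True) -> List[str]:
    """Check a proposed placement for one HA service.

    `pods` maps a pod name to a (worker_node, availability_zone) pair.
    Returns a list of human-readable problems; an empty list means OK.
    """
    problems = []
    nodes = [node for node, _zone in pods.values()]
    if len(set(nodes)) != len(nodes):
        problems.append("pods share a worker node")
    if spread_across_zones:
        zones = [zone for _node, zone in pods.values()]
        if len(set(zones)) != len(zones):
            problems.append("pods share an availability zone")
    return problems


# Hypothetical placement across three availability zones.
placement = {
    "primary": ("worker-1", "zone-a"),
    "backup": ("worker-2", "zone-b"),
    "monitoring": ("worker-3", "zone-c"),
}
print(placement_problems(placement))
```

Setting `spread_across_zones=False` reduces the check to the worker-node rule, which is the only one that applies when multiple availability zones are not available.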

Similarly, when deploying an HA group in virtual private clouds, such as those in AWS, there are two network topologies available.

  1. For regions with three or more AZs:

  2. For regions with two AZs:

Connecting to a Cloud HA Group

Typically, event broker services are fronted by load balancers in deployments of PubSub+ Cloud. For HA configurations, the load balancer hides the switchover between primary and backup in the event of a failure. For this reason, client applications can connect to an event broker service using a single DNS entry, whether they use PubSub+ Messaging APIs or third-party messaging APIs, such as MQTT.

If a load balancer is not used, a host-list is required. Host lists are a feature supported by Solace PubSub+ APIs. Third-party APIs do not natively have host-lists, though you can choose to implement this functionality.
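For third-party APIs without native host lists, the behavior can be approximated in application code. The following is a minimal sketch, not the Solace API implementation: `connect_with_host_list`, its parameters, and the host names are hypothetical, and `connect` stands in for whatever connection call your client library provides. It assumes that a failed connection attempt raises `OSError` or one of its subclasses.

```python
import time
from typing import Callable, Optional, Sequence, TypeVar

T = TypeVar("T")


def connect_with_host_list(hosts: Sequence[str],
                           connect: Callable[[str], T],
                           rounds: int = 3,
                           delay_s: float = 1.0,
                           sleep: Callable[[float], None] = time.sleep) -> T:
    """Try each host in order; repeat the whole list up to `rounds` times."""
    last_error: Optional[OSError] = None
    for attempt in range(rounds):
        for host in hosts:
            try:
                return connect(host)  # first host that accepts wins
            except OSError as err:
                last_error = err      # remember the failure, try the next host
        if attempt < rounds - 1:
            sleep(delay_s)            # pause before retrying the whole list
    raise ConnectionError(f"no host reachable in {list(hosts)}") from last_error


# Usage with a stand-in connect function: the first host refuses connections.
def fake_connect(host: str) -> str:
    if host == "primary.example.com:55555":
        raise ConnectionRefusedError(host)
    return f"session to {host}"


print(connect_with_host_list(
    ["primary.example.com:55555", "backup.example.com:55555"],
    fake_connect,
    sleep=lambda _seconds: None,  # skip real waiting in this demo
))
```

Listing the primary first keeps the common case fast, while the retry rounds cover the short window during a failover when neither node is accepting connections yet.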

HA and Service Types

The following service types deploy an HA redundancy group by default:

  • Professional (Standard account)
  • Enterprise (Enterprise account)

PubSub+ Cloud automates all of the configuration and setup when you create your event broker service. Once the event broker service is created, applications can use the DNS name entry provided in the connectivity tab in the console.

HA-Link Security

When a new enterprise event broker service is created, the communication between the primary and backup messaging nodes is encrypted by default, including the HA mate link and Config-Sync. You can override the default HA mate-link encryption to plain text through the advanced options when you create a service (see Configuring High-Availability Mate-Link Encryption). Overriding the default to plain text may be useful if you require maximum performance and are willing to trust the security restrictions of the VPC in the cloud provider or on-premises environment. Config-Sync always remains encrypted.

If you have an existing event broker service without encryption, you can encrypt it, including its HA mate link and Config-Sync link, through the console or the REST API. In the console, you can easily tell encrypted services apart from unencrypted ones: when mate-link encryption is disabled, a warning icon is displayed on the event broker service's status screen. For more information, see Configuring High-Availability Mate-Link Encryption.

Viewing the Mate-link Encryption Status

The status of the mate-link encryption is available in Cluster Manager and shown on the Status tab for the selected event broker service.

Modifying the HA Mate-Link Encryption Status

To modify the HA mate-link encryption status for an existing event broker service, perform these steps:

  1. Log in to the PubSub+ Cloud Console if you have not done so yet. The URL to access the Cloud Console differs based on your authentication scheme. For more information, see Logging In to the PubSub+ Cloud Console.

  2. Select Cluster Manager from the navigation bar.

  3. Select the event broker service with the HA mate-link encryption status you want to modify. If the event broker service is not listed, make sure you have the right environment selected. For more information, see Selecting Environments.

  4. On the service page, select the Manage tab.
  5. On the Manage tab, click Advanced Options.

  6. In the Mate-Link Encryption pane, select Disable or Enable to modify the encryption.