Replication provides a data center redundancy and disaster recovery solution for Solace PubSub+ event brokers.
Replication uses corresponding Message VPNs with active and standby replication states at separate replication sites to ensure that Guaranteed Messaging clients can continue to have service through a specified Message VPN should one data center become unavailable. When replication is enabled, Guaranteed messages received by durable endpoints in a Message VPN with an active replication state at one replication site are automatically propagated to corresponding durable endpoints in a duplicate Message VPN with a standby replication state at the other replication site. In addition, local and XA transactions that publish or consume replicated Guaranteed messages are automatically propagated to the standby replication site. If a service fail-over to one replication site occurs, clients can reconnect to the same Message VPN at a different replication site to continue to receive service, and any messages that were received, but not consumed, before the service interruption can be delivered to them.
This section discusses design considerations to address before implementing data center replication.
The mixing of message types on an endpoint is not recommended. In particular, the mixing of the following types of messages on a single endpoint can create complications:
- replicated and non-replicated messages
- transacted and non-transacted messages
If replicated and non-replicated messages are mixed, the endpoint will have different sets of messages on the active and standby sites. In the event of a replication fail-over to the standby site, clients will have different messages to consume in the endpoint, which will likely be difficult for the client application to handle. Additionally, the non-replicated messages that are in the endpoint on the previously active, now standby site are not guaranteed to be preserved (since they were never intended to be replicated) and will eventually be cleaned from the endpoint on the standby event broker as newer replicated messages are consumed from that endpoint on the newly active site. This means that on a fail-back to the originally active site, these messages may or may not be present in the endpoint.
If an endpoint is subject to a mixture of transacted and non-transacted operations, delivery delays can occur, especially when using synchronous transactions and when the replication service becomes degraded. The issue is worse if the reject-msg-when-sync-ineligible option is enabled.
To prevent the complexities that occur when mixing message types on endpoints, use a topic structure that classifies messages by type. The following is a full message classification that provides many benefits:
- Guaranteed messages that will:
  - Not be replicated
  - Be synchronously replicated
  - Be asynchronously replicated
- Guaranteed messages that have a Store and Forward forwarding mode
- Direct messages
- Messages from a particular paired replication site (for example, the paired New York and New Jersey replication sites)
- Messages from different publishers
For example, a topic hierarchy that accomplishes all of these things would have the following topic prefixes:
For deployments where short topics are preferred, the topics could be made less verbose. For example,
Creating such a hierarchy provides the following benefits:
- It simplifies the configuration of replicated topic subscriptions. Only two subscriptions need to be added per replicated Message VPN to replicate all messages:
  solace(configure/message-vpn/replication)# create replicated-topic */*/MODE_GM_SF/REPL_ASYNC/>
  solace(configure/message-vpn/replication)# create replicated-topic */*/MODE_GM_SF/REPL_SYNC/>
  solace(...sage-vpn/replication/replicated-topic)# replication-mode sync
- It prevents the unintended promotion or demotion of a message because of topic matches (for example, a non-persistent message that is converted to a Direct message). If Direct message consumers only subscribe to */*/MODE_DIRECT topics, and Guaranteed message consumers only subscribe to */*/MODE_GM*, promotion and demotion are avoided.
- The paired replication site prefix allows for creation of bridged networks without forwarding loops.
- It is easy for publishers to make use of last value queues (LVQs) to determine their last published message by setting an LVQ’s subscription to
The last three points offer benefits to all Solace-based solutions, not just those using replication. For non-replicated solutions, the paired replication site prefix would become a prefix specific to the virtual router.
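The way such a hierarchy keeps Direct and Guaranteed subscriptions apart can be illustrated with a small matcher. This is a minimal sketch, not the event broker's matching code: it assumes a simplified subset of the wildcard rules (a level ending in * is a prefix match within that level, and > as the last level matches one or more remaining levels). The site prefix NY_NJ, the publisher name pub1, the REPL_NONE level, and the trailing /> on the subscriptions are hypothetical values chosen for illustration.

```python
def matches(subscription: str, topic: str) -> bool:
    """Simplified topic match (assumed rules, a subset of the real syntax):
    - levels are separated by '/'
    - a level ending in '*' matches any level starting with the text before it
    - '>' as the final level matches one or more remaining topic levels
    """
    sub_levels = subscription.split("/")
    topic_levels = topic.split("/")
    for i, sub in enumerate(sub_levels):
        if sub == ">" and i == len(sub_levels) - 1:
            # '>' needs at least one topic level at this position.
            return len(topic_levels) > i
        if i >= len(topic_levels):
            return False
        level = topic_levels[i]
        if sub.endswith("*"):
            if not level.startswith(sub[:-1]):
                return False
        elif sub != level:
            return False
    return len(sub_levels) == len(topic_levels)


# Hypothetical topics following a <site-pair>/<publisher>/<mode>/... layout.
gm_topic = "NY_NJ/pub1/MODE_GM_SF/REPL_SYNC/orders/new"
direct_topic = "NY_NJ/pub1/MODE_DIRECT/quotes"

# A Direct-only subscription never attracts a Guaranteed message, and vice versa.
direct_sub_sees_gm = matches("*/*/MODE_DIRECT/>", gm_topic)
gm_sub_sees_direct = matches("*/*/MODE_GM*/>", direct_topic)
sync_sub_sees_gm = matches("*/*/MODE_GM_SF/REPL_SYNC/>", gm_topic)
```

Because the mode level is fixed in every topic, a subscription on one mode cannot match a message published under another, which is what avoids the unintended promotion or demotion described above.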
For applications that publish directly to queues rather than publishing to topics that are mapped to queues, the published messages can be replicated by configuring the queue’s special topic that is unique to the queue. (The special topic for a queue is
When deploying replication, there must be sufficient network bandwidth to accommodate the published message rate for all replicated topics. Some additional overhead is needed if using transactions. The replication queue can absorb message bursts above the available bandwidth, but it is important that the network link between the active and standby sites be fast enough to keep up with the replication data. If necessary, compression can be enabled on the replication bridge connection.
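As a rough sizing aid, the relationship between publish rate, link speed, and burst absorption can be worked through with simple arithmetic. The sketch below is a back-of-the-envelope estimate only: the function names, the overhead factor, and all of the numbers are hypothetical, and it ignores compression, protocol framing, and transaction overhead unless they are folded into the overhead factor.

```python
def required_bandwidth_mbps(msgs_per_sec: float, avg_msg_bytes: float,
                            overhead_factor: float = 1.0) -> float:
    """Sustained link bandwidth (megabits/s) needed for the replicated publish rate."""
    return msgs_per_sec * avg_msg_bytes * 8 * overhead_factor / 1_000_000


def seconds_until_queue_full(burst_mbps: float, link_mbps: float,
                             queue_quota_mb: float) -> float:
    """How long a burst above link speed can last before the queue quota is used up."""
    if burst_mbps <= link_mbps:
        return float("inf")  # the link keeps up, so no backlog accumulates
    backlog_growth_mb_per_sec = (burst_mbps - link_mbps) / 8  # megabits -> megabytes
    return queue_quota_mb / backlog_growth_mb_per_sec


# Hypothetical workload: 20,000 msgs/s of 1,000-byte replicated messages.
steady_mbps = required_bandwidth_mbps(20_000, 1_000)

# Hypothetical burst: 400 Mbps offered over a 200 Mbps link with a 5,000 MB quota.
burst_seconds = seconds_until_queue_full(400, 200, 5_000)
```

With these assumed figures the steady-state load needs 160 Mbps, and a burst at twice the link speed fills the assumed quota in 200 seconds, which shows why the queue can only smooth short bursts and cannot make up for an undersized link.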
If security is a concern between the replication sites, SSL encryption can be enabled on the replication bridge.
The replication facility consumes some system resources when it's enabled on an event broker because the event broker automatically creates the following objects:
- one Message VPN bridge for the replication facility, plus one Message VPN bridge for each replicated Message VPN
- one queue for each replicated Message VPN
- one queue topic subscription for each replicated Message VPN
These system-created objects all have names that begin with the # character (for example, the replication bridge is #MSGVPN_REPLICATION_BRIDGE). Because these objects are required for the successful operation of the replication facility, you can't delete or directly edit them.
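For capacity planning, the counts in the list above can be tallied per event broker. This is a minimal sketch that simply restates that list as arithmetic; the function name and dictionary keys are hypothetical, and it does not include objects created by Config-Sync or by replicated transactions.

```python
def replication_system_objects(replicated_vpn_count: int) -> dict:
    """Count the system-created objects for a given number of replicated Message VPNs."""
    if replicated_vpn_count < 0:
        raise ValueError("replicated_vpn_count must be non-negative")
    return {
        # one bridge for the replication facility, plus one per replicated Message VPN
        "bridges": 1 + replicated_vpn_count,
        # one queue per replicated Message VPN
        "queues": replicated_vpn_count,
        # one queue topic subscription per replicated Message VPN
        "queue_topic_subscriptions": replicated_vpn_count,
    }


# Hypothetical deployment with 10 replicated Message VPNs.
objects_for_ten_vpns = replication_system_objects(10)
```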
When the Config-Sync facility is enabled, an event broker also automatically creates some objects that consume system resources. For information, refer to System Resources Used by Config-Sync.
If local or XA transactions are being replicated, additional transaction resources are used on both the active and standby sites to replicate the transactions.
When using replication, you must ensure that the Message VPN-level and system-level resources used by one event broker do not exceed those that can be supported by the event broker at the other replication site. Consider, for example, a scenario where the appliance used at the primary site has a higher client connection capacity than the appliance at the backup site. In the event of a fail-over, some clients may not be able to connect to the backup site.
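A comparison like this can be run as a pre-deployment check. The sketch below assumes the resource figures have already been collected from each event broker by some other means; the function name, the resource names, and the numbers are all hypothetical.

```python
def capacity_shortfalls(required: dict, peer_limits: dict) -> dict:
    """Return each resource whose usage at one site exceeds the other site's limit.

    A resource missing from peer_limits is treated as having a limit of 0.
    """
    shortfalls = {}
    for name, needed in required.items():
        limit = peer_limits.get(name, 0)
        if needed > limit:
            shortfalls[name] = needed - limit
    return shortfalls


# Hypothetical figures: usage at the primary site vs. limits at the backup site.
primary_usage = {"client_connections": 9_000, "queues": 500}
backup_limits = {"client_connections": 6_000, "queues": 1_000}

shortfalls = capacity_shortfalls(primary_usage, backup_limits)
```

With these assumed values, the backup site falls 3,000 client connections short, matching the fail-over scenario described above. Running the check in both directions also covers a fail-back.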