Message Replay Configuration

Message replay allows published Guaranteed messages, and Direct messages promoted to Guaranteed, that were not rejected to the publisher to be stored in a replay log. The messages are stored for an indefinite period after they are received in case they need to be replayed at some later date.

You can initiate replay log playback for an endpoint either from the event broker or from clients that have at least consume privileges for the endpoint. The latter causes a filtered subset of the logged messages to be sent to those clients.

For instructions to set up and use message replay in Broker Manager, see Configuring Message Replay.

Deployment Notes

Supported PubSub+ Products

Message replay is supported on the following Solace PubSub+ products:

  • Solace PubSub+ 3530 and 3560 appliances
  • PubSub+ software event brokers
  • PubSub+ Cloud (Controlled Availability)

Advisories and Considerations When Using Message Replay

You should be aware of the following message replay characteristics:

  • Message replay consumes event broker processing resources because it needs to maintain the replay log, handle retrieval and replay of messages, and trim the replay log whenever its spool usage reaches 90% of its configured quota for max-spool-usage.
  • Slow disks negatively impact message replay performance.
  • Replication with message replay isn't supported.

Terminology

The following table presents definitions of terms that are commonly used in message replay discussions.

Term Description

live message

A published message that has not been sent to all consuming clients. It exists somewhere in the datapath; that is, it was either received on ingress, sent via egress, or exists in one or more endpoints.

replayable message

A live message that is received on a Message VPN that has message replay enabled.

logged message

A replayable message that is available in the replay log for later replay.

replaying endpoint

An endpoint that is currently receiving logged messages from the replay log.

replayed message

A logged message that is available in the replay log and matches a replaying endpoint's subscriptions and has been added to the endpoint.

replay to endpoint

The process of adding a logged message to a replaying endpoint, thereby creating a replayed message instance on the replaying endpoint.

replay log trimming

The process of automatically deleting the oldest messages from the replay log to make room for new live data.

Types of Replay Requests

A message replay request can be initiated for an endpoint by either a client application using a PubSub+ Messaging API when it binds to a Guaranteed endpoint, or from Broker Manager, Solace CLI or SEMP. In either case, message replay supports the following types of playback requests:

Start replay from the beginning of the log
The requester specifies that the playback is to start from the oldest message in the replay log. All messages in the replay log that match the endpoint’s subscriptions are delivered to the endpoint.
Start replay from a specific date
The requester specifies the date and time from which the playback is to start. Any messages in the replay log equal to, or newer than, the specified date and time that match the endpoint’s subscriptions are delivered to the endpoint.
Start replay from replication group message ID
The requester specifies a replication group message ID after which the playback is to start. Any messages in the replay log received after the specified replication group message ID that match the endpoint's subscriptions are delivered to the endpoint.

In all cases, once the replay has caught up to the live data stream, message delivery switches to live message delivery from the endpoint.
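
As a rough illustration of how the three start locations narrow the set of logged messages, here is a minimal Python sketch. It is a toy model, not a PubSub+ API: the LoggedMessage class, the select_from_log function, the string IDs standing in for replication group message IDs, and the errors raised are all assumptions made for this example. Matching against the endpoint's subscriptions is left out.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import List, Optional


@dataclass(frozen=True)
class LoggedMessage:
    rgmid: str          # hypothetical stand-in for a replication group message ID
    received: datetime  # timezone-aware time the broker received the message
    topic: str


def select_from_log(log: List[LoggedMessage],
                    from_date: Optional[datetime] = None,
                    after_rgmid: Optional[str] = None) -> List[LoggedMessage]:
    """Return the logged messages a replay would consider, oldest first.

    `log` is assumed to be ordered oldest to newest. With no start
    location, the whole log is returned (replay from the beginning).
    """
    if from_date is not None and after_rgmid is not None:
        raise ValueError("give at most one start location")
    if from_date is not None:
        if from_date > datetime.now(timezone.utc):
            raise ValueError("start date must not be in the future")
        # Messages equal to, or newer than, the requested date and time.
        return [m for m in log if m.received >= from_date]
    if after_rgmid is not None:
        ids = [m.rgmid for m in log]
        if after_rgmid not in ids:
            # Toy behavior only; how a real broker reacts is not modeled here.
            raise LookupError("message ID not found in the replay log")
        # Only messages received after the identified message.
        return log[ids.index(after_rgmid) + 1:]
    return list(log)
```

With a three-message log, a date-based request starting at the second message's receive time selects the second and third messages, while a request after the second message's ID selects only the third.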

Lifecycle of a Replayable Message

Let's walk through the lifecycle of a replayable message. Two points to keep in mind: the lifecycle is the same for all Types of Replay Requests, and the terms used in each of the following lifecycle phases (live message, replayable message, logged message, replay to endpoint, replayed message) are defined in the Terminology section.

  1. A live message is received on a Message VPN that has message replay enabled. This message is considered to be a replayable message.
  2. The message is successfully spooled to all its non-replaying destinations, gets added to the replay log, and is now considered to be a logged message.
  3. At some time later the event broker receives a replay request with a start time that includes the logged message. Once the replay gets to the logged message, and successfully performs replay to endpoint, it's considered a replayed message.
  4. Finally, the replayed message is delivered to a consumer and is acknowledged, which removes it from the replaying endpoint; however, it's still in the replay log until it gets trimmed.
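
The four phases above can be sketched with a small toy model. This is only an illustration under simplifying assumptions: ToyBroker and its methods are hypothetical names, there is a single endpoint, and subscriptions, spooling to other destinations, and trimming are ignored.

```python
class ToyBroker:
    """Toy model of the replayable-message lifecycle (not a real broker)."""

    def __init__(self, replay_enabled=True):
        self.replay_enabled = replay_enabled
        self.replay_log = []  # logged messages, oldest first
        self.endpoint = []    # replayed messages awaiting acknowledgment

    def receive(self, msg):
        """Steps 1 and 2: a live message is replayable, then becomes logged."""
        if self.replay_enabled:
            self.replay_log.append(msg)

    def replay(self):
        """Step 3: replay to endpoint creates replayed message instances."""
        self.endpoint.extend(self.replay_log)

    def acknowledge(self, msg):
        """Step 4: the acknowledgment removes the message from the endpoint only."""
        self.endpoint.remove(msg)


broker = ToyBroker()
broker.receive("order-1")
broker.replay()
broker.acknowledge("order-1")
# The endpoint is now empty, but the message is still in the replay log.
```

The point of the sketch is the last step: acknowledging a replayed message empties the endpoint but leaves the logged copy in place, so the same message can be replayed again until it is trimmed.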

Replaying to an Endpoint

Replaying logged messages occurs on a per-endpoint basis, and can be initiated using PubSub+ Broker Manager, Solace CLI, and SEMP commands, or by a client application using a PubSub+ Messaging API when it binds to an endpoint. As described in Types of Replay Requests, a replay request can initiate the playback of all the logged messages, all messages after a given replication group message ID, or all messages from a given date and time. If a start date and time are provided, they must not be in the future. In all cases, all clients bound to the replaying endpoint are disconnected, except the client requesting the replay if the request is made through a bind request.

Topic Matching

All logged messages from the playback starting point are processed, and the event broker attempts to match their topics with the replaying endpoint subscriptions. When there's a match, the event broker replays the logged message to the replaying endpoint, and it sends the original message as it was received, without modification. It sends the original topic, content, and message ID. The event broker doesn't mark the message as redelivered.
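
To make the matching step concrete, the following Python sketch filters logged messages by an endpoint's subscriptions. It assumes a simplified form of topic wildcard matching (a trailing * matches the rest of one level, and > as the last level matches one or more remaining levels); the function names are hypothetical and the sketch does not cover every rule the event broker applies.

```python
def topic_matches(subscription, topic):
    """Simplified wildcard match between a subscription and a topic."""
    sub_levels = subscription.split("/")
    topic_levels = topic.split("/")
    for i, sub in enumerate(sub_levels):
        if sub == ">" and i == len(sub_levels) - 1:
            # '>' must cover at least one remaining topic level.
            return len(topic_levels) > i
        if i >= len(topic_levels):
            return False
        level = topic_levels[i]
        if sub.endswith("*"):
            # Prefix match within a single level.
            if not level.startswith(sub[:-1]):
                return False
        elif sub != level:
            return False
    return len(sub_levels) == len(topic_levels)


def replay_matching(logged, subscriptions):
    """Return the (topic, payload) pairs that match any subscription.

    Matching messages are returned unchanged, mirroring the fact that the
    original topic and content are replayed without modification.
    """
    return [msg for msg in logged
            if any(topic_matches(s, msg[0]) for s in subscriptions)]
```

For example, an endpoint subscribed to orders/> would receive a logged message published to orders/eu/created, but not one published to invoices/eu.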

Windowing

Windowing is used while replaying messages to the replaying endpoint. This means that only a certain number of messages are replayed to the endpoint before the event broker waits for a replayed message to be consumed from the replaying endpoint.
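
The effect of windowing can be illustrated with a short simulation. The window size used by the event broker is not stated here, so the window parameter below is a hypothetical value, and the consumer is assumed to take exactly one message each time the window is full.

```python
from collections import deque


def windowed_replay(logged, window):
    """Simulate replay with at most `window` replayed messages outstanding."""
    if window < 1:
        raise ValueError("window must be at least 1")
    pending = deque(logged)  # logged messages not yet replayed
    endpoint = deque()       # replayed messages awaiting consumption
    events = []
    while pending or endpoint:
        # Replay until the window is full or the log is exhausted.
        while pending and len(endpoint) < window:
            msg = pending.popleft()
            endpoint.append(msg)
            events.append(("replay", msg))
        # The broker now waits; consuming one message frees one slot.
        events.append(("consume", endpoint.popleft()))
    return events
```

With three logged messages and a window of two, the simulated broker replays two messages, waits for the first to be consumed, and only then replays the third.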

Initiating Replay on Endpoints Undergoing Replay

Initiating a replay on an endpoint already under replay cancels the previous replay and starts a new one.

Receiving Live Messages on an Endpoint Undergoing Replay

Live messages received with a replaying endpoint as a destination undergo all the usual checks against the endpoint, as they would if it wasn't replaying. Once all the checks have passed, the message isn't spooled to the endpoint, but is spooled to the replay log. It will be replayed later, once the replay catches up to the message.

Rejected Messages

A message that gets rejected (NACK'ed) to the publisher doesn't get logged to the replay log.

Replaying to Exclusive Queues

Replaying to any exclusive queue whose active consumer is using an egress selector to filter messages is not recommended. If an egress selector is used, and it causes replayed messages to stay indefinitely on a replaying endpoint, the replay won't complete.

Replaying to Temporary Endpoints

In event broker version 10.0.0 and later, you can replay to a temporary topic endpoint or temporary queue provided that you configure the event broker appropriately. For more information, see Enable Replay on Temporary Queues.

Replay States

Endpoints display one of the following replay states:

State Description

N/A

Message replay has never been requested for the endpoint.

Complete

The last requested replay has finished and no replay is in progress.

Initializing

A message replay has been requested and will begin after all live messages have been removed from the endpoint.

Active

Message replay is in progress. The endpoint is currently receiving messages from the replay log.

Pending Complete

Message replay has reached the end of the replay log but there are still unacknowledged replayed messages on the endpoint. New live messages are being delivered to the endpoint. However, replay can still fail, in which case the unacknowledged replayed messages would be deleted from the endpoint.

Failed

A replay has failed and the endpoint is waiting for acknowledgment of the unbind request it sent to the event broker as a failure indication.
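
If you process these states in a monitoring script, it can help to model them explicitly. The following Python enum is one possible sketch; the state strings come from the table above, while the ReplayState name and the replay_in_progress helper are assumptions for this example.

```python
from enum import Enum


class ReplayState(Enum):
    NOT_APPLICABLE = "N/A"
    COMPLETE = "Complete"
    INITIALIZING = "Initializing"
    ACTIVE = "Active"
    PENDING_COMPLETE = "Pending Complete"
    FAILED = "Failed"


def replay_in_progress(state):
    """True while a requested replay has neither finished nor failed.

    Pending Complete counts as in progress because, per the table,
    the replay can still fail in that state.
    """
    return state in (
        ReplayState.INITIALIZING,
        ReplayState.ACTIVE,
        ReplayState.PENDING_COMPLETE,
    )


# Look up a state from its display string.
state = ReplayState("Pending Complete")
```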

Trimming the Replay Log

As the replay log grows larger and reaches the configured capacity, older logged messages are deleted to make room for new ones. This process is called replay log trimming. The event broker automatically performs trimming when the replay log exceeds 90% of its configured quota for max-spool-usage.
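
The following Python sketch shows the basic idea of trimming the oldest messages once usage passes 90% of the quota. It is a deliberate simplification: the function name and the (id, size) representation are hypothetical, and the sketch trims one message at a time, whereas the event broker trims in batches.

```python
from collections import deque


def trim_replay_log(log, quota_mb, threshold_pct=90):
    """Remove the oldest entries until usage is at or below the threshold.

    `log` is a deque of (message_id, size_mb) pairs, oldest first, and is
    modified in place. Returns the IDs of the trimmed messages.
    """
    limit_mb = quota_mb * threshold_pct / 100
    usage_mb = sum(size for _, size in log)
    trimmed = []
    while log and usage_mb > limit_mb:
        msg_id, size = log.popleft()  # the oldest message goes first
        usage_mb -= size
        trimmed.append(msg_id)
    return trimmed
```

For a 100 MB quota and a log holding 40 MB, 30 MB, and 25 MB messages (95 MB in total), only the oldest 40 MB message is trimmed, which brings usage back under the 90 MB threshold.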

It is possible for an endpoint to contain and successfully deliver replayed messages that were trimmed from the replay log. This is because if at least one other endpoint contains the message, the message is still considered spooled.

You can also choose to manually trim the replay log using the trim-logged-messages command in addition to automatic trimming as part of your trimming strategy.

Because trimming occurs in batches and is triggered based on factors, such as the size of the replay log (configured quota), and the number and size of the messages, trimming may not immediately occur when the replay log is over 90% of its configured quota for max-spool-usage. For example, if the configured quota for the replay log is very small (such as less than 1000 MB), the target of 90% may not be perfectly maintained.

Impacts of Slow Trimming

If an event broker receives messages at a faster rate than it can trim the replay log, the spool usage for the replay log can increase to over 90% of its configured quota for max-spool-usage. If the replay log's spool usage reaches 100% of its configured quota, the event broker back-pressures publishing clients to ensure the replay log size does not exceed its configured quota.
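
The two thresholds described above can be expressed as a small classification helper, for example in a monitoring script. This is a sketch only: the function name and the returned labels are assumptions, and how you obtain the usage and quota values is up to your monitoring setup.

```python
def replay_log_pressure(usage_mb, quota_mb):
    """Classify replay log usage against the 90% and 100% thresholds."""
    if quota_mb <= 0:
        raise ValueError("quota_mb must be positive")
    # Integer-friendly comparisons avoid floating-point rounding at the edges.
    if usage_mb >= quota_mb:
        return "back-pressure"  # publishers are slowed down
    if usage_mb * 10 >= quota_mb * 9:
        return "trimming"       # automatic trimming is expected
    return "normal"
```

A replay log that repeatedly reports "trimming" is a sign that you may need to trim manually or log fewer messages.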

Solace recommends that you monitor the replay log usage and choose a replay log trimming strategy. For example, if you notice that a replay log consistently exceeds 90% of its configured quota for max-spool-usage, you may need to manually trim the replay log, or be more selective with the messages that are stored to the message replay log. For example, you can use topic filters to reduce the number of messages written to the replay log. For more information, see Determining a Log Trimming Strategy.

Determining a Log Trimming Strategy

When you configure message replay, Solace recommends that you decide on a strategy to handle replay log trimming. You can simply choose to use automatic trimming, which requires no additional configuration, but you should be aware that trimming the replay log can reduce the rate at which the event broker is able to process received messages.

You can choose to manually trim the replay log before the 90% threshold is reached. This strategy can reduce the performance implications of automatic trimming at unexpected times. Though manual trimming has the same performance impact on the event broker, you can choose to run it at designated times when the event broker isn't sensitive to performance variations. A use case that is well suited to manual trimming, and that avoids automatic trimming, has these characteristics:

  • The traffic pattern of the event broker follows a predictable cycle, in which there is a quiet time period where trimming the replay log does not cause performance impacts.

  • The replay log can be sized large enough to hold all messages logged for the minimum replay log retention that you require, plus the messages logged within the duration of a traffic cycle.

  • It's possible to deploy an automated application that performs the trimming on an appropriate schedule.

The following example shows how you could implement a manual trimming strategy. Consider a deployment where:

  • A system writes up to 125,000 MB of messages to the replay log each day.
  • The event broker has low traffic at night, when manual replay log trimming can be performed without impacting the event broker performance.
  • The required log retention is seven days, plus one additional day for the traffic cycle.

With the parameters mentioned above, you can use the following configuration:

  1. The replay log would require 1,000,000 MB without automatic trimming occurring. This sample calculation is as follows:

    • 125,000 MB/day x (7 days of retention plus one additional cycle day) = 125,000 MB/day x 8 days = 1,000,000 MB.

    • Because trimming begins at 90% of the configured quota, you would set max-spool-usage to 25% more than the size of the data to retain (1,250,000 MB), as recommended. Automatic trimming wouldn't begin until the replay log held approximately 1,125,000 MB of data (90% of 1,250,000 MB). Because the expected spool usage won't exceed 1,000,000 MB, the replay log is sufficiently sized. See Configuring the Replay Log Size Using max-spool-usage for more information.

      This design uses 80% of the replay log's configured quota for max-spool-usage. Because trimming doesn't occur until 90%, the design makes automatic trimming unlikely and gives you the flexibility to change the system behavior without losing important data. To ensure this design operates within the design parameters, monitoring is required as described in step 3.

  2. In your deployment, your automated management application runs nightly to manually trim the replay log during your quiet period using one of the following mechanisms:

    • Running the enable admin message-spool message-vpn <vpn-name> replay-log trim-logged-messages <older-than-date> command (see Message Replay Configuration).
    • Using the SEMPv2 command, PUT /msgVpns/{msgVpnName}/replayLogs/{replayLogName}/trimLoggedMsgs (see the SEMP reference for more information).
  3. As you monitor the replay log usage, ensure the replay log usage never exceeds 80% as described in step 1. If it does, the assumptions behind your strategy are no longer valid and you must re-characterize your system, adjust the replay log's configured quota for max-spool-usage, and then update the system.
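
The sizing arithmetic in step 1 can be reproduced with a few lines of Python. The function and parameter names are hypothetical; the 25% headroom and 90% trim threshold are the values used in the example above.

```python
def size_replay_log(daily_mb, retention_days, cycle_days=1,
                    headroom_pct=25, trim_threshold_pct=90):
    """Work out replay log sizing figures for a manual trimming strategy."""
    retained_mb = daily_mb * (retention_days + cycle_days)
    quota_mb = retained_mb * (100 + headroom_pct) // 100
    return {
        "retained_mb": retained_mb,  # data the log must hold
        "max_spool_usage_mb": quota_mb,  # value to configure as the quota
        "auto_trim_at_mb": quota_mb * trim_threshold_pct // 100,
        "expected_usage_pct": retained_mb * 100 / quota_mb,
    }


sizing = size_replay_log(daily_mb=125_000, retention_days=7)
# retained_mb: 1,000,000; max_spool_usage_mb: 1,250,000
# auto_trim_at_mb: 1,125,000; expected_usage_pct: 80.0
```

If your daily volume or retention requirement changes, rerunning the calculation shows whether the expected usage still stays below the 90% point at which automatic trimming begins.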

Direct Messages and Message Replay

Message replay is intended to be used for Guaranteed messages. Direct messages that aren't promoted to Guaranteed are not logged. Therefore, if there's a possibility that your messages might need to be replayed, they should be published as Guaranteed messages.

Be careful when promoting Direct messages: promotion can dramatically affect the event broker's performance, and if the event broker's Guaranteed message budget is exceeded, the Direct messages are not logged. If you have Direct messages that you want to record in the replay log, Solace recommends changing the publisher to use Guaranteed messages. For more information about message promotion and the associated risks, see Topic Matching and Message Delivery Modes.