JCSMP Best Practices

General Best Practices

Tuning Guidelines for Guaranteed Messaging

Reductions in the rate at which clients receive messages can occur when a high volume of Guaranteed messages (particularly large messages) is received over many Flows. In this situation, the number of Flows used and the Guaranteed window size used for each Flow affect the buffer usage of the per-client priority queues that the event broker uses for Guaranteed messages. These queues, called the G-1 queues, hold Guaranteed messages that are waiting for delivery out of the event broker, or that have been sent but are awaiting acknowledgment from the clients.

Each G-1 queue is allocated a maximum depth buffer. This maximum depth is measured in work units, whereby a work unit represents 2,048 bytes of a message. (By default, each G-1 queue is given a maximum depth of 20,000 work units.)

To address slow Guaranteed message delivery rates caused by high demands on the buffer allocated by G-1 queues, you should reduce the Guaranteed message window size used for each Flow and, when possible, reduce the number of Flows used.

If it's not possible to reduce the Guaranteed message window size or the number of Flows, you can also effectively increase the G-1 queue size by adjusting the min-msg-burst size used by the event broker.

Reapply Subscriptions

If enabled, the API maintains a local cache of subscriptions and reapplies them when the subscriber connection is reestablished. Reapply Subscriptions will only apply direct topic subscriptions upon a Session reconnect. It will not reapply topic subscriptions on durable and non-durable endpoints.

Number of Flows and Guaranteed Message Window Size

The amount of buffer space used by a client for receiving Guaranteed messages is primarily determined by the number of Flows used per Session multiplied by the Guaranteed Message window size of each Flow. To limit a client’s maximum buffer use, you can reduce the number of Flows used and/or reduce the Guaranteed Message window size of each Flow. (The Guaranteed Message window size for each Flow is set through the Flow properties; refer to Important Flow (Message Consumer) Properties.)

Consider, for example, a client using Flows with a window size of 255 to bind to 10 Queues, and the Guaranteed messages from those Queues have an average size of 20kB. In this scenario, the Flow configuration for the client is not appropriately sized, as the client’s maximum buffer usage (approximately 24,902 work units) exceeds that offered by the event broker (20,000 work units). However, if the Flows are reconfigured with a window size of 25, then the client’s maximum buffer usage will fall within an acceptable range (approximately 2,441 work units).

Work units are fixed size buffers on the event broker that are used to process messages according to set queue depths. A work unit represents 2,048 bytes of a message.
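The sizing arithmetic from the Queue example above can be sketched in plain Java; the class and method names here are illustrative, not part of JCSMP:

```java
// Approximate G-1 buffer usage when every Flow's window is full.
public class GmBufferSizing {
    static final int WORK_UNIT_BYTES = 2048;   // one work unit
    static final int DEFAULT_G1_DEPTH = 20000; // default max depth, in work units

    static double maxWorkUnits(int flows, int windowSize, int avgMsgBytes) {
        return flows * windowSize * ((double) avgMsgBytes / WORK_UNIT_BYTES);
    }

    public static void main(String[] args) {
        // 10 Queues, window 255, 20 kB messages -> ~24,902 work units (> 20,000)
        System.out.println(Math.round(maxWorkUnits(10, 255, 20000)));
        // Window reduced to 25 -> ~2,441 work units (fits within the default depth)
        System.out.println(Math.round(maxWorkUnits(10, 25, 20000)));
    }
}
```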

If you are using JCSMP, you also need to tune the size of the Java consumer notification dispatcher queue so that it is large enough to buffer the maximum number of notifications that can be generated by all consumer flows (Guaranteed message flows as well as direct consumers) contained in all Sessions in a Context.

[Figure: formula for sizing the consumer notification dispatcher queue, based on the Flows and Direct message consumers in a Context.]

Where:

GDFlows is all of the Guaranteed message consumer Flows in a Context.

FCL is the default consumer Flow congestion limit.

Nconsumers is the number of Direct message consumers in a Context.

Minimum Message Burst Size

If you can't reduce the number of Flows, or the Guaranteed Message window size, you can adjust the size of the G-1 queue. The simplest way to increase the queue is to adjust the min-msg-burst size. The min-msg-burst size specifies the number of messages that are always allowed entry into the queue. The min‑msg‑burst size is set on a per-client basis through client profiles.

Under normal operating conditions it's not necessary to change the default min‑msg-burst value for the G-1 queue. However, in situations where a client is consuming messages from multiple endpoints, it's important that the min‑msg‑burst size for the G-1 queue is at least equal to the sum of all of the Guaranteed message window sizes used for the Flows that the client consumes messages from. For example, if the client connects to 1,000 endpoints, and the Flows have a window size of 50, then the min-msg-burst size should be set to 50,000 or more.
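The rule of thumb above (min-msg-burst at least the sum of all Flow window sizes) can be expressed as a small helper; the names are illustrative, not part of JCSMP:

```java
public class MinMsgBurstSizing {
    // Minimum min-msg-burst: the sum of the Guaranteed message window sizes
    // of every Flow the client consumes from.
    static int requiredMinMsgBurst(int[] flowWindowSizes) {
        int sum = 0;
        for (int w : flowWindowSizes) sum += w;
        return sum;
    }

    public static void main(String[] args) {
        int[] windows = new int[1000];       // 1,000 endpoints...
        java.util.Arrays.fill(windows, 50);  // ...each Flow with a window size of 50
        System.out.println(requiredMinMsgBurst(windows)); // 50000
    }
}
```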

Tuning the min-msg-burst size in this manner ensures that the NAB holds enough messages to fill the client’s combined Guaranteed message window size when it comes online. If not enough messages are held, messages that aren't delivered to the client can be discarded, and another delivery attempt is then required. This process of discarding and resending messages results in a slow recovery for a slow subscriber (that is, a client that doesn't consume messages at a quick rate).

For information on how to set the min-msg-burst size, refer to Configuring Egress Per-Client Priority Queues.

Basic Rules

When programming using JCSMP, it's useful to remember the following basic rules:

  • durable and non-temporary objects (such as durable endpoints) are created at the Factory level
  • non‑durable and temporary objects are created at the Session level
  • flows are created at the Session level

Threading

API Threading

Recommendation

  • Consider dispatching messages directly from the I/O thread if optimizing latency is most important. This model is the default behavior for C, .NET, and Java RTO APIs.

The APIs use a Context to organize communication between a client application and a Solace PubSub+ event broker. A Context is a container for one or more Sessions. The Context is responsible for handling Session related events and encapsulates threads that drive network I/O and message delivery notification for the Sessions.

Solace’s JCSMP is implemented to be inherently blocking, that is, calls available to the application could block. To address this, the API runs a separate dispatcher thread so that the application can make blocking calls within callbacks while still allowing the network I/O thread to continue reading the messages off the network. As the messages continue to be accepted into the API, a storing mechanism called the Notification Queue is used to keep track of the messages before they are dispatched to the client application by the dispatcher thread.
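The I/O-thread/dispatcher-thread split described above can be modeled with a plain blocking queue. This is a toy illustration of the mechanism, not JCSMP code: the "I/O thread" keeps enqueuing notifications while a separate dispatcher thread delivers them to the (possibly blocking) application callback.

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.atomic.AtomicInteger;

public class NotificationQueueModel {
    public static int deliver(int messageCount) {
        final BlockingQueue<String> notificationQueue = new ArrayBlockingQueue<>(64);
        final AtomicInteger delivered = new AtomicInteger();

        // Dispatcher thread: takes notifications off the queue and runs the
        // application callback; it may block without stalling the I/O side.
        Thread dispatcher = new Thread(() -> {
            try {
                for (int i = 0; i < messageCount; i++) {
                    notificationQueue.take();
                    delivered.incrementAndGet(); // "application callback" runs here
                }
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });
        dispatcher.start();

        try {
            // "I/O thread" side: keep reading messages and enqueuing notifications.
            for (int i = 0; i < messageCount; i++) {
                notificationQueue.put("msg-" + i);
            }
            dispatcher.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return delivered.get();
    }

    public static void main(String[] args) {
        System.out.println(deliver(100)); // 100
    }
}
```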

Unlike JCSMP, the C, .NET, and Java RTO APIs are implemented to be inherently non-blocking. As such, client applications are not allowed to make blocking calls within callbacks when using these APIs, so these implementations do not require separate dispatch and I/O threads. A Context thus has a single processing thread, which is used both to read messages off the socket and to perform notification and message dispatching.

Dispatching directly from the I/O thread has the benefit of optimizing latency because messages are not queued up in the Notification Queue, which can potentially inject latency. Therefore, the C, .NET, and Java RTO APIs should be considered for latency-sensitive applications. JCSMP can also be configured to dispatch directly from its I/O thread (called the "reactor" thread) by enabling the MESSAGE_CALLBACK_ON_REACTOR Session property. In this case, however, the application must ensure that it does not block in the callbacks; otherwise, a deadlock can occur if the API is waiting for a response from the event broker while the thread that would read that response is blocked.

Although dispatching directly from the I/O thread reduces message latency, it also decreases the potential maximum message throughput because messages are processed individually instead of in batches. This trade-off should also be weighed when deciding which API threading model to implement. Because an application's typical message size is a major determining factor here, a performance evaluation prior to model selection is recommended.

For further details on threading interactions, refer to API Threading.

Context and Session Threading Model Considerations

Recommendation

  • Use the 'One Session, One Context' threading model whenever possible. The 'Multiple Session, One Context' and 'Multiple Sessions, Multiple Contexts' models can potentially increase message processing throughput, but at the expense of additional processing.

There are three different Threading Models to consider when designing an application:

  1. One Session One Context. A single Session is used with a single Context.
  2. Multiple Sessions One Context. Multiple Sessions are serviced using one Context.
  3. Multiple Sessions Multiple Contexts. Application provides or uses a number of threads, each containing a single Context and each Context contains one or more Sessions.

For the majority of cases, the 'One Session, One Context' model is sufficient for publisher and consumer application design.

An application designer may want to move to 'Multiple Sessions, One Context' if there is a need to prioritize messages, where higher-value messages may be sent or received across different Sessions, for example, through different TCP connections. This approach can potentially increase throughput as well. It may then be necessary to forward received messages to downstream application-internal queues so that messages are processed by additional application message-processing threads. All received messages can be processed by the same message and event callback functions, or by Session-specific ones created as additional callbacks.

With 'Multiple Sessions, Multiple Contexts', a designer can reduce the Context thread processing burden of the 'Multiple Sessions, One Context' model, where all Sessions must wait in the select loop before being processed. In this model, each Session can be separated into its own Context thread, enhancing processing performance through the multi-threading that the OS provides. Due to the increased number of threads, however, this approach incurs expensive thread context switching, placing more burden on the CPU and making it more resource intensive.

Always Cleanup

JCSMPSession, XMLMessageProducer, and XMLMessageConsumer all hold system resources. Close them properly when they are no longer used or if an error occurs.

Closing JCSMPSession closes any Producer and Consumer associated with it.

Increase Buffer Sizes When Publishing Large Messages

On some Java Virtual Machines (JVMs), some users have reported it necessary to increase the socket send and receive buffer sizes when publishing large messages. The default socket buffer size used by Java is 64 KB.

To modify the send socket buffer size, call JCSMPChannelProperties#setSendBuffer(int so_sendbuf). To modify the receive socket buffer size, call JCSMPChannelProperties#setReceiveBuffer(int so_rcvbuf).
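In JCSMP, these setters live on the channel properties attached to the session properties. The following is a minimal sketch assuming the standard JCSMPProperties/JCSMPChannelProperties API and a hypothetical broker host; the buffer values are illustrative and should be scaled to your message sizes:

```java
// Sketch only; not verified against a broker.
JCSMPProperties properties = new JCSMPProperties();
properties.setProperty(JCSMPProperties.HOST, "tcps://broker.example.com:55443"); // hypothetical host

JCSMPChannelProperties channelProps = (JCSMPChannelProperties)
        properties.getProperty(JCSMPProperties.CLIENT_CHANNEL_PROPERTIES);
channelProps.setSendBuffer(16 * 1024 * 1024);    // so_sendbuf, in bytes
channelProps.setReceiveBuffer(16 * 1024 * 1024); // so_rcvbuf, in bytes
```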

If your application is running on a Linux system, you must also change its TCP buffer size settings. Add the variables rmem_max, wmem_max, tcp_rmem, and tcp_wmem, as shown in the following snippet, to the /etc/sysctl.conf file.

# increase TCP maximum buffer size
net.core.rmem_max = 16777216
net.core.wmem_max = 16777216
 
# increase Linux autotuning TCP buffer limits
# min, default, and maximum number of bytes to use
net.ipv4.tcp_rmem = 4096 87380 16777216 
net.ipv4.tcp_wmem = 4096 65536 16777216

Scale the buffer sizes appropriately to handle the maximum message size and the network links to be encountered, then run sysctl -p to apply the values immediately. Values in /etc/sysctl.conf are also applied when the system is booted.

TCP Send and Receive Buffer Size

Recommendation

  • Adjust the TCP send and receive buffer sizes to optimize TCP performance, particularly when publishing large messages or for WAN performance optimization.

For TCP, the bandwidth-delay product refers to the product of a data link’s capacity and its round-trip delay time. The result expresses the maximum amount of data that can be on the network at any given time. A large bandwidth-delay product is expected for a WAN environment due to the intrinsic long round-trip delay, and as such TCP can only achieve optimum throughput if a sender sends a sufficiently large quantity of data to fill the maximum amount of data that the network can accept. This means that the TCP send and receive buffer size needs to be adjusted.

On the Windows platform specifically, the receive socket buffer size must be much larger than the send socket buffer size to prevent data loss when sending and receiving messages. The recommended ratio is 3 parts send buffer to 5 parts receive buffer.

TCP’s socket send and receive buffer sizes can be configured through the API’s session properties setting. The session property parameters and default values are shown below. If the value of zero is used for setting these properties, the operating system’s default size is used.

  • JCSMPChannelProperties.setSendBuffer(int); 64,000 bytes
  • JCSMPChannelProperties.setReceiveBuffer(int); 64,000 bytes

Do Not Cache XMLMessages

If using messages from the producer message pool, always call XMLMessageProducer.create<TYPE>XMLMessage() to acquire a new XMLMessage instance for publishing. The application should not cache or reuse messages because JCSMP may automatically recycle them.

Ultra-Low Latency

For ultra-low latency applications, you can enable the MESSAGE_CALLBACK_ON_REACTOR Session property to reduce message latency. When this Session property is enabled, messages delivered asynchronously to an XMLMessageListener are delivered directly from the I/O thread instead of from the consumer notification thread.

Although enabling this Session property reduces message latency, it also decreases the maximum message throughput.
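As a sketch, enabling this property before creating the Session looks like the following; the property name follows the standard JCSMP API, but this fragment is not verified against a broker:

```java
JCSMPProperties properties = new JCSMPProperties();
// Messages are now dispatched to the XMLMessageListener from the I/O
// ("reactor") thread, so the listener callbacks must never block.
properties.setProperty(JCSMPProperties.MESSAGE_CALLBACK_ON_REACTOR, true);
```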

Session Establishment

Host Lists

Recommendation

As a best practice, use host lists (see note below). Host lists are applicable when you use replication for failover support and the software event broker's hostlist High Availability (HA) support.

Host Lists should not be used in Active/Active Replication deployments.

For replication failover support, client applications must be configured with a host list of two addresses, one for each of the Guaranteed Messaging enabled virtual routers at each site. If the connection to one host fails, the client then tries to connect to the to-be-active replication host before retrying the same host. For that reason, it's recommended to set the reconnect retries per host to 1.

Host lists must not be used in an active/active replication deployment where client applications are consuming messages from endpoints on the replication active message VPN on both sites.

Similarly, for software event broker HA failover support, if the switchover-mechanism is set to hostlist instead of IP address-takeover, the client application must provide a host list of two addresses.

For more details on hostlist configuration, see HA Group Configuration.

Client API Keep-alive

Recommendation

  • The Client Keep-alive interval should be set to the same order of magnitude as the TCP Keep-alive setting on the client profile.

There are two types of keep-alive mechanisms between the client application and the event broker.

There is the TCP Keep-alive that operates at the TCP level and is sent by the event broker to the client application. This is the TCP Keep-alive mechanism described in RFC 1122. The client application’s TCP stack responds to the event broker’s TCP Keep-alive probes. By default, the event broker sends a keep-alive probe after it detects that a connection has been idle for 3 seconds. It then sends 5 probes at an interval of 1 probe per second. Hence, the event broker flags a client as having failed TCP keep-alive if it receives no response after 8 seconds.

There is also the Client API Keep-alive that occurs concurrently to the TCP Keep-alive. This is the API’s built-in keep-alive mechanism, and operates on top of TCP at the API level. This is sent from the API to the event broker. By default, a Client Keep-alive is sent at the interval of once every 3 seconds, and up to 3 keep-alive responses can be missed before the API declares that the event broker is unreachable; that is, after 9 seconds.

These keep-alive mechanisms exist to advise the application or the event broker that its peer has died before the peer is able to notify the corresponding party. The keep-alive mechanism is also used to prevent disconnection due to network inactivity. However, if either mechanism is set much more aggressively than the other, that is, with a shorter detection time, the connection can be prematurely disconnected. For example, if the Client API Keep-alive is set to a 500 ms interval with 3 keep-alive responses while the TCP Keep-alive remains at the default, the Client API Keep-alive will trigger aggressive disconnection.
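The detection-time arithmetic from the defaults above (8 seconds for the broker's TCP Keep-alive, 9 seconds for the Client API Keep-alive) works out as follows; the names are illustrative:

```java
public class KeepAliveTiming {
    // Broker-side TCP keep-alive: idle time before the first probe,
    // plus the probes themselves at one per second.
    static int tcpDetectSeconds(int idleSeconds, int probes, int probeIntervalSeconds) {
        return idleSeconds + probes * probeIntervalSeconds;
    }

    // API-side client keep-alive: interval multiplied by the number of
    // responses that may be missed before the broker is declared unreachable.
    static int apiDetectSeconds(int intervalSeconds, int missedResponses) {
        return intervalSeconds * missedResponses;
    }

    public static void main(String[] args) {
        System.out.println(tcpDetectSeconds(3, 5, 1)); // 8
        System.out.println(apiDetectSeconds(3, 3));    // 9
    }
}
```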

High Availability Failover and Reconnect Retries

Recommendation

  • The reconnect duration should be set to last for at least 300 seconds when designing applications for High Availability (HA) support.

When using a High Availability (HA) appliance setup, a failover from one appliance to its mate will typically occur within 30 seconds. However, applications should attempt to reconnect for at least 5 minutes. Below is an example of setting the reconnect duration to 5 minutes using the following session property values:

  • connect retries: 1
  • reconnect retries: 20
  • reconnect retry wait: 3,000 ms
  • connect retries per host: 5

Refer to Configuring Connection Time-Outs and Retries for instructions on setting the connect retries, reconnect retries, reconnect retry wait, and connect retries per host parameters.
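The example values above multiply out to the recommended 5-minute budget. The helper below is illustrative and ignores the per-attempt connect timeouts themselves, so real durations are somewhat longer:

```java
public class ReconnectBudget {
    // Lower bound on total reconnect duration, in milliseconds.
    static long reconnectDurationMs(int reconnectRetries, int connectRetriesPerHost, long retryWaitMs) {
        return (long) reconnectRetries * connectRetriesPerHost * retryWaitMs;
    }

    public static void main(String[] args) {
        // 20 reconnect retries x 5 connect retries per host x 3,000 ms wait
        System.out.println(reconnectDurationMs(20, 5, 3000) / 1000); // 300 seconds
    }
}
```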

Replication Failover and Reconnect Retries

Recommendation

  • The number of reconnect retries should be set to -1 so that the API will retry indefinitely during a replication failover.

In general, the duration of a replication failover is non-deterministic, as it may require operational intervention for the switch, which can take tens of minutes, or even hours. Hence, it's recommended to set the number of reconnect retries to -1 so that a replication-aware client application's API will retry the connection indefinitely.

Refer to Reconnect Retries for instructions on how to set the reconnect retries parameter.

Replication Failover and Session Re-Establishment

Recommendation

  • API versions higher than 7.1.2 are replication aware, and automatically handle session re-establishment when a replication failover occurs. Client applications running lower API versions must re-establish a session upon reconnect.

Prior to 7.1.2, sessions needed to be re-established after a replication failover when a client was publishing Guaranteed messages in a session that had been disconnected. Although the reconnect succeeds, the flow must be re-established because the newly connected event broker at the replication site doesn't have any flow state information, unlike in an HA failover, where this information is synchronized. The recommendation is to catch the unknown_flow_name event and establish a new session so that the flow is recreated. From version 7.1.2 onwards, the API is replication aware and transparently handles session re-establishment.

File Descriptor Limitation

Recommendation

  • The number of Solace sessions created by an application shouldn't exceed the number of file descriptors supported per process by the underlying operating system. For Unix variants, this number is 1024, and for Windows it's 63.

File descriptor limits on Unix platforms restrict the number of files that can be managed per process; by default, this limit is 1024. Hence, an application shouldn't create more than 1023 Sessions per Context. A Session represents a TCP/IP connection, and each such connection occupies one file descriptor. A file descriptor is an element, usually a number, that identifies, in this case, a stream of data from the socket. Opening a file to read information from disk also occupies one file descriptor.

File descriptors are so named because initially they identified only files; today they can identify files on disk, sockets, pipes, and so on.

Similarly, on Windows platforms, a single Context can only manage at most 63 Sessions.

Subscription Management

The following best practices can be used for managing subscriptions:

  • If you are adding or removing a large number of subscriptions, set the Wait for Confirm flag (JCSMPSession.WAIT_FOR_CONFIRM) on the final subscription to ensure that all subscriptions have been processed by the event broker. To increase performance, do not set Wait for Confirm on any of the other subscriptions.
  • In the event of a Session disconnect, you can have the API reapply subscriptions that were initially added by the application when the Session is reconnected. To reapply subscriptions on reconnect, enable the Reapply Subscriptions Session property (JCSMPProperties.REAPPLY_SUBSCRIPTIONS). Using this setting is recommended.
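Both practices can be sketched together as follows, assuming a connected JCSMPSession `session` and a hypothetical list `topics` of subscriptions to add; method names follow the standard JCSMP API but this fragment is not verified against a broker:

```java
// Reapply direct topic subscriptions automatically after a reconnect.
properties.setProperty(JCSMPProperties.REAPPLY_SUBSCRIPTIONS, true);
// ... create and connect the session with these properties ...

// Add a large batch of subscriptions; only the last one waits for
// confirmation, which also confirms all of the preceding ones.
for (int i = 0; i < topics.size() - 1; i++) {
    session.addSubscription(topics.get(i));
}
session.addSubscription(topics.get(topics.size() - 1), JCSMPSession.WAIT_FOR_CONFIRM);
```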

Sending Messages

When sending messages, you should consider the following best practices for the message ownership model that you are using—a session-independent message model or a session-dependent message model.

For information on session-independent and session-dependent message ownership models, refer to Message Ownership for Direct messages and Message Ownership for Guaranteed messages.

Sending Session-Independent Messages

Summary

Always use the session-independent message model when sending messages. The session-dependent message model is supported for backwards compatibility.

Practice Details

There is no performance penalty for using the session-independent message ownership model if the messages are preallocated and reused whenever possible; otherwise, messages in the session-independent model are allocated on demand.

With the session-dependent model, the application must call send immediately after the creation of the message to avoid possible memory resource exhaustion. Furthermore, messages from the session dependent message pool may be automatically recycled after the initial use, hence preventing the created message from being reused.

Refer to the Products for information on how to send session-independent messages.

Sending Session-Dependent Messages

When using the session-dependent message model, in which messages are taken from the Producer message pool, you must publish using send() after the creation of XMLMessage to avoid exhaustion of resources. For example, the following code sample incorrectly allows execution to continue without the message being sent after acquiring it:

while (keepPublishing) {
    TextMessage message = producer.createTextMessage();
    if (messageToPublish) {
        message.setText("<xml>hello!</xml>");
        producer.send(message);
    }
}

To ensure that resources are not exhausted, call the send() method immediately after creating XMLMessage. For example:

while (keepPublishing) {
    TextMessage message = producer.createTextMessage();
    message.setText("<xml>hello!</xml>");
    producer.send(message);
}

The session-dependent message ownership model is primarily maintained for backwards compatibility with existing applications that use JCSMP. It is recommended that new Java applications use Session‑independent messages. There is no performance penalty for using the Session‑independent message ownership model if messages are pre-allocated and reused, whenever possible.

Batch Send

Recommendation

  • Use the batch sending facility to optimize send performance. This is particularly useful for performance benchmarking a client application.

A group of up to 50 messages can be sent through a single API call. This allows messages to be sent in a batch. The messages can be either Direct or Guaranteed. When batch-sending messages through the send-multiple API, the same Delivery mode, that is Direct or Persistent mode, should be set for all messages in the batch. Messages in a batch can be set to different destinations.

In addition to using the batch-sending API, messages should be pre-allocated and reused for batch-sending whenever possible. Specifically, don't reallocate new messages for each call to the batch-sending API.

The batch-sending API call is XMLMessageProducer.sendMultiple().
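A sketch of batch sending follows, assuming the JCSMPSendMultipleEntry pattern from the JCSMP API and hypothetical `producer` and `destination` objects; the entry-creation call and exact signatures should be checked against your API version:

```java
// Build up to 50 entries; all messages in the batch use the same delivery mode.
JCSMPSendMultipleEntry[] entries = new JCSMPSendMultipleEntry[50];
for (int i = 0; i < entries.length; i++) {
    TextMessage msg = JCSMPFactory.onlyInstance().createMessage(TextMessage.class);
    msg.setText("payload-" + i);
    msg.setDeliveryMode(DeliveryMode.PERSISTENT);
    entries[i] = JCSMPFactory.onlyInstance().createSendMultipleEntry(msg, destination);
}
producer.sendMultiple(entries, 0, entries.length, 0); // entries, offset, length, flags
```

Reuse the entry array and its preallocated messages across calls rather than rebuilding it each time.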

High-Performance Batch Sending

When batch sending messages, only use session-independent messages; do not use session-dependent messages. For information on session-independent and session-dependent message ownership models, refer to Message Ownership for Direct messages and Message Ownership for Guaranteed messages.

To optimize performance, it is also recommended that you preallocate messages and reuse them for batch sends whenever possible. (That is, avoid reallocating new messages for each call to sendMultiple().)

For more information on batch sending for Direct messages, refer to Sending Direct Messages. For more information on batch sending for Guaranteed messages, refer to Sending Guaranteed Messages.

Time-to-Live Messages

Recommendation

  • Set the TTL attribute on published guaranteed messages to reduce the risk of unconsumed messages unintentionally piling up in the queue if the use-case allows for discarding old or stale messages.

Publishing applications should consider utilizing the TTL feature available for Guaranteed Messaging. Publishers can set the TTL attribute on each message prior to sending to the event broker. Once the message has been spooled, the message will be automatically discarded (or moved to the queue’s configured Dead Message Queue, if available) should the message not be consumed within the specified TTL. This common practice reduces the risk of unconsumed messages unintentionally piling up.

Alternatively, queues have a max-ttl setting, and this can be used instead of publishers setting the TTL on each message sent. See Configuring Max Message TTLs for instructions on setting max-ttl for a queue.
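A per-message TTL can be sketched as follows, assuming a session-independent message and hypothetical `producer` and `queueDestination` objects; the setter names follow the standard JCSMP API but this fragment is not verified against a broker:

```java
TextMessage msg = JCSMPFactory.onlyInstance().createMessage(TextMessage.class);
msg.setDeliveryMode(DeliveryMode.PERSISTENT);
msg.setTimeToLive(60000L); // ms: discard if not consumed within 60 seconds
msg.setDMQEligible(true);  // let the queue's Dead Message Queue capture it instead
producer.send(msg, queueDestination);
```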

Configuring respect-ttl

Queues should be configured with respect-ttl enabled because, by default, this feature is disabled on all queues. Refer to Enforcing Whether to Respect TTLs for instructions on how to set up respect-ttl.

Receiving Messages

Handling Duplicate Message Publication

Recommendation

  • Duplicate message publication can be avoided if, upon restarting, the client application uses a Last Value Queue (LVQ) to determine the last message successfully spooled by the event broker.

When a client application is unexpectedly restarted, it's possible for it to become out-of-sync with respect to the message publishing sequence. There should be a mechanism by which it can determine the last message that was successfully published to, and received by, the event broker in order to correctly resume publishing without injecting duplicate messages.

One approach is for the publishing application to maintain a database that correlates between the published message identifier and the acknowledgment it receives from the event broker. This approach is completely self-contained on the client application side, but can introduce processing latencies if not well managed.

Another approach is to make use of the Last Value Queue (LVQ) feature, where the LVQ stores the last message spooled on the queue. A publishing client application can then browse the LVQ to determine the last message spooled by the event broker. This allows the publisher to resume publishing without introducing duplicate messages.

Refer to Configuring Max Spool Usage Values for instructions on setting up LVQ.

Handling Redelivered Messages

Recommendation

  • When consuming from endpoints, a client application should appropriately handle redelivered messages.

When a client application restarts, unexpectedly or not, and rebinds to a queue, it may receive messages that it has already processed and acknowledged. This can happen because an acknowledgment can be lost en route to the event broker due to network issues. The redelivered messages are marked with the redelivered flag.

A client application that binds to a non-exclusive queue may also receive messages with the redelivered flag set, even though it is receiving those messages for the first time. This occurs when another client connected to the same non-exclusive queue disconnects without acknowledging the messages it received; those messages are then redelivered to the other client applications bound to the queue.

The consuming application should include a message-processing mechanism that handles the scenarios described above.

Dealing with Unexpected Message Formats

Recommendation

  • Client applications should be able to handle unexpected message formats. In the case of consuming from endpoints, a client application should acknowledge received messages even if those messages are unexpectedly formatted.

Client applications should be able to contend with unexpected message formats. No assumptions should be made about a message's payload; for example, a payload may contain an empty attachment. Applications should be coded to avoid crashing, to log the message contents, and, if using Guaranteed Messaging, to send an acknowledgment back to the event broker. If client applications crash without sending acknowledgments, the same messages will be redelivered when they reconnect, causing the applications to fail again.

Client Acknowledgment

Recommendation

  • Client Applications should acknowledge received messages as soon as they have completed processing those messages when client acknowledgment mode is used.

Once an application has completed processing a message, it should acknowledge the receipt of the message to the event broker. Only when the event broker receives an acknowledgment for a Guaranteed Message will the message be permanently removed from its message spool. If the client disconnects without sending acknowledgments for some received messages, then those messages will be redelivered. For the case of an exclusive queue, those messages will be delivered to the next connecting client. For the case of a non-exclusive queue, those messages will be redelivered to the other clients that are bound to the queue.

There are two kinds of acknowledgments:

  • API (also known as Transport) Acknowledgment. This is an internal acknowledgment between the API and the event broker and isn't exposed to the application. The Assured Delivery (AD) window size, acknowledgment timer, and the acknowledgment threshold settings control API Acknowledgment. A message that isn't transport acknowledged will be automatically redelivered by the event broker.
  • Application Acknowledgment. This acknowledgment mechanism is on top of the API Acknowledgment. Its primary purpose is to confirm that message processing has been completed, and that the corresponding messages can be permanently removed from the event broker. There are two application acknowledgment modes: auto-acknowledgment and client acknowledgment. When auto-acknowledgment mode is used, the API automatically generates application-level acknowledgments on behalf of the application. When client acknowledgment mode is used, the client application must explicitly send the acknowledgment for the message ID of each message received.

Refer to the Receiving Guaranteed Messages for a more detailed discussion on the different acknowledgment modes.
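A client-acknowledgment sketch follows, assuming a connected JCSMPSession `session`, a hypothetical queue name, and a hypothetical `process` method; names follow the standard JCSMP API but the fragment is not verified against a broker:

```java
ConsumerFlowProperties flowProps = new ConsumerFlowProperties();
flowProps.setEndpoint(JCSMPFactory.onlyInstance().createQueue("orders/q")); // hypothetical queue
flowProps.setAckMode(JCSMPProperties.SUPPORTED_MESSAGE_ACK_CLIENT);

FlowReceiver flow = session.createFlow(new XMLMessageListener() {
    @Override public void onReceive(BytesXMLMessage msg) {
        process(msg);     // application work first...
        msg.ackMessage(); // ...then ack, so the broker can remove the message
    }
    @Override public void onException(JCSMPException e) {
        // log and decide whether to rebind
    }
}, flowProps);
flow.start();
```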

Consume Messages and Return From Callbacks As Soon As Possible

To ensure the highest possible message throughput, received messages should be consumed as soon as possible after receipt.

When using an XMLMessageConsumer in synchronous operating mode, the application should call receive() as often as possible to retrieve the messages the consumer has received. If too many messages accumulate in the consumer's internal queue, the XMLMessageConsumer is deemed congested and message delivery is suspended.

When using XMLMessageConsumer in asynchronous operating mode, the application should ensure that the callback methods defined in XMLMessageListener return promptly, so that the calling thread is not blocked from processing subsequent messages.

The application should similarly return from API event callbacks as quickly as possible.
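One common way to keep onReceive() short is to hand each message off to an application-owned executor and return immediately. A sketch follows; the executor and the handle() method are application-side assumptions, not part of JCSMP:

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import com.solacesystems.jcsmp.*;

ExecutorService workers = Executors.newFixedThreadPool(4);

XMLMessageListener listener = new XMLMessageListener() {
    @Override
    public void onReceive(BytesXMLMessage msg) {
        // Return quickly: queue the work instead of processing inline.
        workers.submit(() -> handle(msg)); // handle() is application code
    }

    @Override
    public void onException(JCSMPException e) {
        // keep error callbacks short as well
    }
};
```

Note that in auto-acknowledgment mode the API acknowledges a message when onReceive() returns, so a hand-off like this is usually paired with client acknowledgment mode, with ackMessage() called at the end of handle().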

Memory Management When Receiving Messages

The API dynamically allocates memory for each message received. If an application uses AUTO ACK mode, the API keeps a reference to the message until the message callback returns. The API then uses the reference to acknowledge the message, after which the message memory is eligible to be freed by the garbage collector. If the application is receiving Direct messages or is using CLIENT ACK mode, the API does not keep a reference to the message after calling the message callback. Provided the application does not retain a new reference, the message memory is eligible for garbage collection when the callback returns and any references to the message go out of scope.

Queues and Flows

Receiving One Message at a Time

Recommendation

  • Set max-delivered-unacked-msgs-per-flow to 1 and the AD window size to 1 to ensure messages are delivered from the event broker to the client application one at a time and in a time-consistent manner.

The API only sends a transport acknowledgment when one of the following occurs, whichever comes first:

  1. It has received the configured acknowledgment threshold's worth of the configured Assured Delivery (AD) window of messages (by default, 60%).
  2. A message has been received and the configured AD acknowledgment time has passed since the last acknowledgment was sent (by default, 1 second).

The application acknowledgment sent from the client application to the event broker piggybacks on the transport acknowledgment, and the event broker only releases further messages once it receives that acknowledgment.

Therefore, while setting max-delivered-unacked-msgs-per-flow to 1 ensures that messages are delivered to the client application one at a time, if the AD window size is not also 1, condition 1 is not immediately fulfilled. The API then only sends the acknowledgment once condition 2 is fulfilled, which introduces a variable delay in message reception and is inconsistent with the expected end-to-end delivery delay.

Refer to Configuring Max Permitted Number of Delivered Unacked Messages for instructions on how to configure max-delivered-unacked-msgs-per-flow on queues.
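On the API side, the per-Flow AD window size is set through the Flow properties; max-delivered-unacked-msgs-per-flow is configured on the queue at the event broker. A sketch follows, assuming an existing session and listener, and a hypothetical queue name:

```java
import com.solacesystems.jcsmp.*;

ConsumerFlowProperties flowProps = new ConsumerFlowProperties();
flowProps.setEndpoint(JCSMPFactory.onlyInstance().createQueue("q/serialized"));
flowProps.setTransportWindowSize(1); // AD window of 1: one message in transit at a time

FlowReceiver flow = session.createFlow(listener, flowProps);
flow.start();
```

Combined with max-delivered-unacked-msgs-per-flow set to 1 on the queue, both conditions above are satisfied immediately after each message, so delivery proceeds one message at a time without the acknowledgment-timer delay.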

Setting Temporary Endpoint Spool Size

Recommendation

  • Exercise caution if a client application frequently creates temporary endpoints to ensure that the sum of all temporary endpoint spool sizes does not exceed the total spool size provisioned for the Message VPN.

By default, the message spool quota of a Message VPN and endpoint is based on an over-subscription model. For instance, it's possible to set the message spool quota of multiple endpoints to the same quota as that of an entire Message VPN. Temporary endpoints created by a client application default to 4000 MB for the Solace application and 1500 MB for the software event broker. When temporary endpoints are used extensively by a client application, the message spool over-subscription model can quickly get out-of-control when temporary endpoints are being created on-demand. Therefore, it's recommended that a client application overwrite an endpoint’s default message spool size to a value that is inline with expected usage, especially if temporary endpoints are heavily used.

AD Window Size and max-delivered-unacked-msgs-per-flow

Recommendation

  • The AD window size configured on the API should not be greater than the max-delivered-unacked-msgs-per-flow value that is set for a queue on the event broker.

max-delivered-unacked-msgs-per-flow controls how many messages the event broker can deliver to the client application without receiving back an acknowledgment. The Assured Delivery (AD) window size controls how many messages can be in transit between the event broker and the client application. So, if the AD window size is greater than max-delivered-unacked-msgs-per-flow, then the API may not be able to acknowledge the messages it receives in a timely manner. Effectively, the AD window size is bounded by the value set for max-delivered-unacked-msgs-per-flow. For instance, if the AD window size is set to 10, and max-delivered-unacked-msgs-per-flow is set to 5, then the event broker will effectively be limited to send out 5 messages at a time regardless of the client application’s AD window size setting of 10.

Refer to Configuring Max Permitted Number of Delivered Unacked Messages for instructions on how to set up max-delivered-unacked-msgs-per-flow on queues.

Number of Flows and AD Window Size

Recommendation

  • Size the expected number of flows per session, and the associated AD window size, to fit within the available memory of the client application host and within the default work units allocated to each per-client egress queue on the event broker.

The API buffers received Guaranteed messages and, in general, also owns the messages and is responsible for freeing them. The amount of buffer a client uses is primarily determined by multiplying the Assured Delivery (AD) window size by the number of flows used per session. For example, if a receiving client application uses flows with an AD window size of 255 to bind to 10 different queues on an event broker, then the maximum buffer usage, given an average message size of 1 MB, is 255 × 10 × 1 MB = 2550 MB. If 10 such clients run on the same host, 25.5 GB of memory is required.

Similarly, the event broker dedicates a per-client egress queue to buffer messages awaiting transmission to the client application. By default, this queue is 20,000 work units, or roughly 40 MB of buffer, as each work unit is 2048 bytes. For a per-client egress queue to support 2550 MB of buffering, the number of work units for that client would need to be increased to 1,305,600 (2550 MB ÷ 2048 bytes). Hence, depending on application usage, it's recommended that you dimension the AD window size in relation to the number of expected flows per session so that the total stays within the default 20,000 work units of buffer per client connection.
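The arithmetic above can be sketched directly; the constants (AD window 255, 10 flows, 1 MB average message, 2048-byte work units) come from the example in this section:

```java
public class AdBufferSizing {
    // Work unit size used by the event broker's per-client egress queues.
    static final long WORK_UNIT_BYTES = 2048;

    /** Maximum client-side buffer use: AD window size x flows x average message size. */
    static long maxBufferBytes(int adWindowSize, int flows, long avgMsgBytes) {
        return (long) adWindowSize * flows * avgMsgBytes;
    }

    /** Work units needed on the broker side to hold the same volume (rounded up). */
    static long workUnits(long bytes) {
        return (bytes + WORK_UNIT_BYTES - 1) / WORK_UNIT_BYTES;
    }

    public static void main(String[] args) {
        long oneMB = 1024L * 1024;
        long buffer = maxBufferBytes(255, 10, oneMB);
        System.out.println(buffer / oneMB);     // 2550 (MB)
        System.out.println(workUnits(buffer));  // 1305600
    }
}
```

At 1,305,600 work units against a 20,000-unit default, this configuration exceeds the per-client egress queue budget by a factor of about 65, which is why the window size and flow count should be dimensioned together.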

Error Handling and Logging

Logging and Log Level

Recommendation

  • Client Application Debug level logging should not be enabled in production environments.

Client Application Event logging can have a significant impact on performance, and so, in a production environment, it's not recommended to enable debug level logging.

Handling Session Events / Errors

Recommendation

  • Client Applications should register an implementation of the Session Event handler interface / delegate / callback when creating a Session to receive Session events.

Client applications should register an implementation of the Session Event Handler interface / delegate / callback when creating a Session to receive Session events. A complete list of Session Events is provided in the table below. Session events should then be handled appropriately based on client application usage.

A number of these Session Events are fed back to the client application through exception handling. Refer to JCSMPException and its subclasses, JCSMPTransportException, JCSMPStateException, and JCSMPOperationException, for detailed information. For instance, a JCSMPTransportException is raised after several failed (re)connection retries.
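The handler is supplied when the Session is created. A sketch follows, assuming properties is an already-populated JCSMPProperties instance:

```java
import com.solacesystems.jcsmp.*;

JCSMPSession session = JCSMPFactory.onlyInstance().createSession(
        properties, null /* default context */,
        new SessionEventHandler() {
            @Override
            public void handleEvent(SessionEventArgs event) {
                switch (event.getEvent()) {
                    case RECONNECTING:
                        // connection lost; automatic reconnect in progress
                        break;
                    case RECONNECTED:
                        // session re-established
                        break;
                    default:
                        // log remaining events for diagnosis
                        break;
                }
            }
        });
```

How each event is handled (pausing publication, alerting, failover logic) depends on the application.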

Session Events

Java (SessionEvent Enum) Description

DOWN_ERROR

The Session was established and then went down.

RECONNECTED

The automatic reconnect of the Session was successful, and the Session was established again.

RECONNECTING

The Session has gone down, and an automatic reconnect attempt is in progress.

SUBSCRIPTION_ERROR

The event broker rejected a subscription (add or remove).

VIRTUAL_ROUTER_NAME_CHANGED

The appliance’s Virtual Router Name changed during a reconnect operation.

UNKNOWN_TRANSACTED_SESSION_NAME

An attempt to re-establish a transacted session failed.

INCOMPLETE_LARGE_MESSAGE_RECEIVED

An incomplete large message was received by the consumer because not all of its segments arrived in time.

Handling Flow Events / Errors

Recommendation

  • Client applications should register an implementation of the Flow Event handler interface / delegate / callback when creating a Flow to receive Flow events.

Client applications should register an implementation of the Flow Event Handler interface / delegate / callback when creating a Flow to receive Flow events. Flow error / events should be handled appropriately based on client application usage.

An error event is indicated by a JCSMPFlowTransportException thrown on access; in the case of a Flow in asynchronous mode, the error condition is delivered to the XMLMessageListener's onException().
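A FlowEventHandler can be passed to createFlow() alongside the listener. A sketch follows, assuming an existing session, listener, flowProps, and endpointProps:

```java
import com.solacesystems.jcsmp.*;

// FlowEventHandler has a single handleEvent(Object, FlowEventArgs) method,
// so a lambda can be used.
FlowEventHandler flowEvents = (source, event) -> {
    if (event.getEvent() == FlowEvent.FLOW_RECONNECTING) {
        // rebind in progress; pause dependent work if appropriate
    }
};

FlowReceiver flow = session.createFlow(listener, flowProps, endpointProps, flowEvents);
flow.start();
```

The FlowEventArgs carries the event itself along with any associated information or exception details for logging.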

Flow Events

Java (FlowEvent Enum) Description

FLOW_ACTIVE

The Flow has become active.

FLOW_INACTIVE

The Flow has become inactive.

FLOW_RECONNECTING

When Flow Reconnect is enabled, instead of an INACTIVE event, the API generates this event and attempts to rebind the Flow.

If a Flow rebind attempt fails, the API examines the bind failure and terminates the reconnect attempts with a FLOW_INACTIVE event unless the failure reason is one of the following:

  • Queue Shutdown
  • Topic Endpoint Shutdown
  • Service Unavailable

For more information about Flow Reconnect, refer to Flow Reconnect.

FLOW_RECONNECTED

The Flow has been successfully reconnected.

Custom Handling of Reconnect Events

Recommendation

  • A client application that uses the Session’s message consumer should also register a JCSMPReconnectEventHandler instance when acquiring the consumer to hook into the API’s reconnect logic.

A client application that uses the Session’s message consumer - acquired through JCSMPSession.getMessageConsumer(JCSMPReconnectEventHandler, XMLMessageListener) - should also register a JCSMPReconnectEventHandler instance when acquiring that consumer. This handler receives callbacks before (preReconnect()) and after (postReconnect()) the client’s TCP connection is reconnected following a connection failure. Through preReconnect() and postReconnect(), the client application can execute a list of actions before and after JCSMP attempts to re-establish the connection, for instance, to notify other parties within the system of the connection failure.
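A sketch of registering the handler follows, assuming an existing session and an application XMLMessageListener named listener:

```java
import com.solacesystems.jcsmp.*;

XMLMessageConsumer consumer = session.getMessageConsumer(
        new JCSMPReconnectEventHandler() {
            @Override
            public boolean preReconnect() throws JCSMPException {
                // called before JCSMP attempts to re-establish the connection;
                // return true to allow the reconnect to proceed
                return true;
            }

            @Override
            public void postReconnect() throws JCSMPException {
                // called after the connection has been re-established,
                // e.g. notify other parts of the system
            }
        },
        listener);
consumer.start();
```

Returning false from preReconnect() stops the API's reconnect attempt, so most applications return true and use the callbacks purely for notification.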

Notification of Transport Events in Publish-Only Applications

Recommendation

  • For publish-only applications, it's necessary to create an empty XMLMessageConsumer in order to receive events from JCSMPReconnectEventHandler such that transport layer events can be captured.

The JCSMPReconnectEventHandler is only exposed through XMLMessageConsumer. For publish-only applications that acquire an XMLMessageProducer, it's necessary to create an empty XMLMessageConsumer in order to receive events from the JCSMPReconnectEventHandler. The consumer can be acquired through session.getMessageConsumer(); it need not be started by calling consumer.start(). This way, transport layer events can be captured for publish-only applications.
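A sketch of the empty consumer follows, assuming an existing session; the no-op listener is an illustrative assumption standing in for a listener that will never receive messages:

```java
import com.solacesystems.jcsmp.*;

// No-op listener: the consumer exists only to surface transport events.
XMLMessageListener noOp = new XMLMessageListener() {
    @Override public void onReceive(BytesXMLMessage msg) { /* publish-only: never used */ }
    @Override public void onException(JCSMPException e) { /* log if desired */ }
};

XMLMessageConsumer consumer = session.getMessageConsumer(
        new JCSMPReconnectEventHandler() {
            @Override public boolean preReconnect() throws JCSMPException { return true; }
            @Override public void postReconnect() throws JCSMPException { /* notify others */ }
        }, noOp);
// consumer.start() is not required for the reconnect callbacks to fire.
```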

Event Broker Configuration that Influences Client Application Behavior

Max Redelivery

Recommendation

  • By default, messages are redelivered indefinitely from endpoints to clients. When appropriate, set the maximum redelivery option on endpoints at the event broker to limit the number of redelivery attempts per message.

The maximum redelivery option can be set on an endpoint to control the number of redeliveries per message on that endpoint. After the maximum number of redeliveries is exceeded, messages are either discarded or moved to the Dead Message Queue (DMQ), if one is configured and the messages are marked DMQ-eligible.

There are benefits for client applications when the number of redeliveries on an endpoint is not unlimited (by default, messages are redelivered forever). For instance, if a client application is unable to handle an unexpected poison message, the message will eventually be discarded or moved to the DMQ, where further examination can take place.

Reject Message to Sender on Discard

Recommendation

  • reject-msg-to-sender-on-discard on an endpoint should be enabled unless there are good reasons not to.

When publishing guaranteed messages to an event broker, messages can be discarded for reasons such as message-spool full, maximum message size exceeded, endpoint shutdown, and so on. If the message discard option on the endpoint, that is, reject-msg-to-sender-on-discard, is enabled, then the client application will be able to detect that discarding is happening and take corrective action such as pausing publication. There is no explicit support at the API to pause publication; this should be carried out by the client application logic.
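With reject-msg-to-sender-on-discard enabled, a publisher sees discards as negative acknowledgments through its publish event handler; whether to pause and retry is then up to application logic. A sketch follows, assuming an existing session:

```java
import com.solacesystems.jcsmp.*;

XMLMessageProducer producer = session.getMessageProducer(
        new JCSMPStreamingPublishEventHandler() {
            @Override
            public void responseReceived(String messageID) {
                // message accepted by the event broker
            }

            @Override
            public void handleError(String messageID, JCSMPException cause, long timestamp) {
                // message rejected (e.g. message-spool full, endpoint shutdown);
                // application logic decides whether to pause publication and retry
            }
        });
```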

One reason to consider disabling reject-msg-to-sender-on-discard is the situation where there are multiple queues subscribing to the same topic that the Guaranteed messages are published to, and the intent is for the other queues to continue receiving messages even if one of the queues is unable to accept them.