Performance Monitoring

The event broker can provide metrics on ingress and egress message and byte rates for the following:

  • Endpoint (Queue or Topic Endpoint)
  • Client
  • VPN aggregated endpoint rates
  • VPN aggregated client rates
  • Global aggregated endpoint rates
  • Global aggregated client rates

The above metrics can all be retrieved using CLI or SEMP. Also, the Client, Endpoint, and Global aggregated client rates can be obtained via SolAdmin.

The Solace API packages contain samples that demonstrate how to create such SEMP-based applications. In addition, SolAdmin can be used with a read-only administrative user to view statistical data.

For more information on SEMP request/reply formats, and performing SEMP commands over the message bus, refer to SEMP.

In this section we'll take a holistic view of the messaging paths, so that every point where message loss could occur is correctly monitored and can be reported on.

Guaranteed Messaging Data Path

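The Guaranteed Messaging data path can be divided into three distinct phases: ingress message processing, message replication to the mate and spooling, and egress message processing.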

Ingress Message Processing

It's assumed that the reason for monitoring message loss, or high-latency message flow through the event broker, is to be notified of potential issues so that you can take appropriate action to limit the impact on your business operations. The tightest feedback loop is therefore for the event broker to send a negative acknowledgment back to the application in as many failure situations as possible. To facilitate this, configure queues and topic endpoints with reject-msg-to-sender-on-discard, as described in Configuring Message Discard Handling. Take special care with this feature when publishing to topics: if a single topic endpoint or queue rejects a message, then all other endpoints will also reject the message.

The corollary to this recommendation is that all applications must process their positive and negative acknowledgments and take appropriate action when a message is rejected.
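As a minimal sketch of what processing both kinds of acknowledgment can look like, the following tracks each published message until the event broker accepts or rejects it. The class and method names (PublishTracker, on_publish, on_ack, on_nack) are hypothetical and are not part of any Solace API; a real application would call them from the acknowledgment callbacks of whichever messaging API it uses.

```python
from dataclasses import dataclass, field


@dataclass
class PublishTracker:
    """Tracks published messages until they are acknowledged or rejected.

    Hypothetical helper for illustration; not part of any Solace API.
    """
    pending: dict = field(default_factory=dict)    # correlation key -> payload
    rejected: list = field(default_factory=list)   # (key, payload, reason)

    def on_publish(self, key, payload):
        # Remember the message until the broker responds.
        self.pending[key] = payload

    def on_ack(self, key):
        # Positive acknowledgment: the broker accepted the message.
        self.pending.pop(key, None)

    def on_nack(self, key, reason):
        # Negative acknowledgment: keep the payload so the application
        # can retry, reroute, or raise an alert.
        payload = self.pending.pop(key, None)
        if payload is not None:
            self.rejected.append((key, payload, reason))
```

Anything left in pending for longer than expected, and anything that lands in rejected, is a candidate for retry or escalation.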

This single recommendation gives applications immediate notification of ingress discards in most cases, but not advance warning that discards are imminent. The case in which the application is not notified is when there is no valid destination for a persistent message published to a topic. To detect this case, poll the command show log no-subscription-match, which lists the 1000 most recent topics that were published to without a valid endpoint or subscribing client. The command output includes the client, client-username, VPN, and the actual topic published.
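Once the output of show log no-subscription-match has been collected, it can help to group the entries by publisher. The sketch below assumes the entries have already been parsed into dictionaries; the key names (client, client_username, vpn, topic) and the sample records are invented for illustration, and parsing the actual command output is not shown.

```python
from collections import Counter


def summarize_unmatched(entries):
    """Count unmatched publishes per (vpn, client) pair.

    `entries` is an iterable of dicts with the hypothetical keys
    'client', 'client_username', 'vpn', and 'topic'.
    """
    return Counter((entry["vpn"], entry["client"]) for entry in entries)


# Made-up records; real values would come from the command output.
sample = [
    {"client": "pub-1", "client_username": "app-a", "vpn": "default", "topic": "orders/new"},
    {"client": "pub-1", "client_username": "app-a", "vpn": "default", "topic": "orders/old"},
    {"client": "pub-2", "client_username": "app-b", "vpn": "default", "topic": "audit/log"},
]

for (vpn, client), count in summarize_unmatched(sample).most_common():
    print(f"{vpn} {client}: {count} unmatched topic(s)")
```

Publishers at the top of the list are the first ones to check for a misconfigured topic or a missing subscription.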

Polling other message-spool ingress discard statistics more often than once every few minutes, if at all, is not recommended: although these statistics tell you that the condition is happening, working out why from them takes significant effort. By that time, the application will already have received a negative acknowledgment indicating the cause, and event.log will contain a detailed entry explaining why discards are occurring.

When the maximum number of ingress flows for a Message VPN, or the event broker-wide limit, is reached, no more messages can be published to the event broker. To set this value, use the following CLI command:

solace (config-message-spool)# max-ingress-flows

The corresponding set and clear Syslog events generated are as follows:

SYSTEM_AD_MAX_INGRESS_FLOWS_EXCEEDED

SYSTEM_AD_INGRESS_FLOWS_HIGH

SYSTEM_AD_INGRESS_FLOWS_HIGH_CLEAR

Message Replication to Mate and Spooling

This phase of the data path can be monitored by configuring the event broker to raise notifications when it is approaching limits that will cause ingress discards, or when ingress discards are occurring. It's important to set appropriate event thresholds and to monitor the resulting events.

A number of events can be monitored, starting from the lowest level (disk health), through general system assured-delivery health and Message VPN health, to endpoint health.

These settable events monitor total disk utilization:

solace(config-disk-utilization-trap) # thresholds
clear-value set-value

The corresponding set and clear Syslog events generated are as follows:

SYSTEM_CHASSIS_DISK_UTILIZATION_HIGH

SYSTEM_CHASSIS_DISK_UTILIZATION_HIGH_CLEAR

SYSTEM_CHASSIS_DISK_UTILIZATION_MAX
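One way to reason about a clear-value and set-value pair is as hysteresis: a HIGH event is raised when the monitored value climbs to the set value, and the matching HIGH_CLEAR event is raised only once the value drops back to the clear value. The following sketch illustrates that behavior under stated assumptions; the function name is hypothetical, and the exact comparisons (>= and <=) are assumptions rather than documented event broker behavior.

```python
def threshold_events(samples, set_value, clear_value):
    """Return (index, event) transitions for a series of samples.

    Assumes HIGH is raised at or above set_value and HIGH_CLEAR at or
    below clear_value; the exact comparisons are an assumption.
    """
    high = False
    events = []
    for index, value in enumerate(samples):
        if not high and value >= set_value:
            high = True
            events.append((index, "HIGH"))
        elif high and value <= clear_value:
            high = False
            events.append((index, "HIGH_CLEAR"))
    return events


# Utilization dips to 78 without clearing, because 78 is still above
# the clear value of 60.
print(threshold_events([50, 82, 78, 85, 59, 70], set_value=80, clear_value=60))
```

Keeping the clear value well below the set value avoids a stream of alternating set and clear events when utilization hovers around a single threshold.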

These events are generated if the ADB blade goes down. Note that no BLADE_UP event is generated on reboot:

SYSTEM_CHASSIS_BLADE_DOWN

SYSTEM_CHASSIS_BLADE_UP

These events indicate whether the ADB mate links are down or up, and therefore whether messages are able to replicate to the mate:

SYSTEM_LINK_ADB_LINK_DOWN

SYSTEM_LINK_ADB_LINK_UP

These events indicate that at least one multipath link to the SAN is down or up:

SYSTEM_LINK_PATH_TO_DISK_ARRAY_DOWN

SYSTEM_LINK_PATH_TO_DISK_ARRAY_UP

Events for monitoring Guaranteed Messaging systems

These settable events monitor disk utilization by Guaranteed Messaging. When exceeded, clients will not be able to send messages to Guaranteed Messaging:

solace(config-hardware-message-spool-event) # disk-usage thresholds
clear-percentage set-percentage

The corresponding set and clear Syslog events generated are as follows:

SYSTEM_AD_DISK_USAGE_EXCEEDED

SYSTEM_AD_DISK_USAGE_HIGH

SYSTEM_AD_DISK_USAGE_HIGH_CLEAR

These settable events monitor ingress flows. When exceeded, clients will not be able to connect and send messages to the Guaranteed Messaging system:

solace(config-hardware-message-spool-event) # ingress-flows thresholds
clear-percentage set-percentage

The corresponding set and clear Syslog events generated are as follows:

SYSTEM_AD_INGRESS_FLOWS_HIGH

SYSTEM_AD_INGRESS_FLOWS_HIGH_CLEAR

SYSTEM_AD_MAX_INGRESS_FLOWS_EXCEEDED

These settable events monitor the total trackable number of messages on the event broker, and when exceeded, clients will not be able to send messages to Guaranteed Messaging:

solace(config-hardware-message-spool-event) # message-count thresholds
clear-percentage set-percentage

The corresponding set and clear Syslog events generated are as follows:

SYSTEM_AD_MSG_COUNT_UTILIZATION_EXCEED

SYSTEM_AD_MSG_COUNT_UTILIZATION_HIGH

SYSTEM_AD_MSG_COUNT_UTILIZATION_HIGH_CLEAR

These settable events monitor the total trackable files in the spool directories on the event broker, and when exceeded, clients will not be able to send messages to Guaranteed Messaging:

solace(config-hardware-message-spool-event) # spool-files thresholds
clear-percentage set-percentage

The corresponding set and clear Syslog events generated are as follows:

SYSTEM_AD_SPOOL_FILES_EXCEEDED

SYSTEM_AD_SPOOL_FILES_HIGH

SYSTEM_AD_SPOOL_FILES_HIGH_CLEAR

These settable events monitor the spool against the set quota, and when exceeded, clients will not be able to send messages to Guaranteed Messaging:

solace(config-hardware-message-spool-event) # spool-usage thresholds
clear-percentage set-percentage

The corresponding set and clear Syslog events generated are as follows:

SYSTEM_AD_MSG_SPOOL_HIGH

SYSTEM_AD_MSG_SPOOL_HIGH_CLEAR

SYSTEM_AD_MSG_SPOOL_QUOTA_EXCEED

These settable events are ADB3 specific, and when exceeded, messaging performance will drop to ADB2 levels of performance:

solace(config-hardware-message-spool-event) # cache-usage thresholds
clear-percentage set-percentage

The corresponding set and clear Syslog events generated are as follows:

SYSTEM_AD_MAX_MSG_CACHE_USAGE_EXCEEDED

SYSTEM_AD_MSG_CACHE_USAGE_HIGH

SYSTEM_AD_MSG_CACHE_USAGE_HIGH_CLEAR

At the VPN level, ingress flow and spool-usage events are configurable as well as the VPN-level spool quota:

solace(config-message-spool-event) #
ingress-flows spool-usage

The corresponding set and clear Syslog events generated are as follows:

VPN_AD_INGRESS_FLOWS_HIGH

VPN_AD_INGRESS_FLOWS_HIGH_CLEAR

VPN_AD_MAX_INGRESS_FLOWS_EXCEEDED

VPN_AD_MSG_SPOOL_HIGH

VPN_AD_MSG_SPOOL_HIGH_CLEAR

VPN_AD_MSG_SPOOL_QUOTA_EXCEED

At the individual endpoint level, there is a configurable spool-usage as well as reject-low-priority-msg-limit:

solace(config-message-spool-queue-event) #
reject-low-priority-msg-limit spool-usage

The corresponding set and clear Syslog events generated are as follows:

VPN_AD_MSG_SPOOL_HIGH

VPN_AD_MSG_SPOOL_HIGH_CLEAR

VPN_AD_MSG_SPOOL_QUOTA_EXCEED

VPN_AD_MSG_SPOOL_REJECT_LOW_PRIORITY_MSG_LIMIT_EXCEED

VPN_AD_MSG_SPOOL_REJECT_LOW_PRIORITY_MSG_LIMIT_HIGH

VPN_AD_MSG_SPOOL_REJECT_LOW_PRIORITY_MSG_LIMIT_HIGH_CLEAR
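In the event lists above, each event name ending in _HIGH has a counterpart ending in _HIGH_CLEAR. A monitoring tool can use that pairing to keep a running set of conditions that are currently raised. The sketch below assumes the event names have already been extracted from the Syslog lines; the function name apply_event is hypothetical, and events that end in neither suffix (for example, the EXCEED and EXCEEDED events) are deliberately left to separate handling.

```python
def apply_event(active, event_name):
    """Update the set of currently raised HIGH conditions for one event."""
    if event_name.endswith("_HIGH_CLEAR"):
        # Clear event: drop the matching _HIGH condition if it is raised.
        active.discard(event_name[: -len("_CLEAR")])
    elif event_name.endswith("_HIGH"):
        # Set event: record the condition as raised.
        active.add(event_name)
    # Other events (such as *_EXCEED or *_EXCEEDED) are not tracked here.
    return active


active = set()
for name in [
    "SYSTEM_AD_DISK_USAGE_HIGH",
    "VPN_AD_MSG_SPOOL_HIGH",
    "SYSTEM_AD_DISK_USAGE_HIGH_CLEAR",
    "SYSTEM_AD_SPOOL_FILES_EXCEEDED",
]:
    apply_event(active, name)

print(sorted(active))
```

A condition that stays in the set for a long time indicates a threshold that was crossed and never cleared, which is usually worth investigating before the corresponding limit is reached.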

Egress Message Processing

On the egress side there are no failure-type discards; however, messages can be removed by TTL expiry (if configured) or by administrative deletion. There are situations that will prevent endpoints from being created or bound, for which corresponding events are generated, as well as global situations that will prevent messages from being delivered.

When the system-wide total number of messages that have been delivered, but not yet acknowledged by the consuming clients, exceeds the maximum, no more messages can be delivered to the clients. The event thresholds for this condition can be set using the following CLI command:

solace(config-hardware-message-spool-event-delivered-unacked)# thresholds
clear-percentage set-percentage

The corresponding set and clear Syslog events generated are as follows:

SYSTEM_AD_DELIVERED_UNACKED_MSGS_EXCEED

SYSTEM_AD_DELIVERED_UNACKED_MSGS_HIGH

SYSTEM_AD_DELIVERED_UNACKED_MSGS_HIGH_CLEAR

When the maximum number of egress flows for a Message VPN, or the event broker-wide limit, has been reached, no clients can bind to queues to receive messages. To set this limit for the event broker, use the following CLI command:

solace (config-message-spool)# max-egress-flows

The corresponding set and clear Syslog events generated are as follows:

SYSTEM_AD_MAX_EGRESS_FLOWS_EXCEEDED

SYSTEM_AD_EGRESS_FLOWS_HIGH

SYSTEM_AD_EGRESS_FLOWS_HIGH_CLEAR

When the maximum number of endpoints for a Message VPN, or the event broker-wide limit, has been reached, no more endpoints can be created on the event broker for sending or receiving Guaranteed Messages. To set this limit for the event broker, use the following CLI command:

solace (config-message-spool)# max-endpoints

The corresponding set and clear Syslog events generated are as follows:

SYSTEM_AD_MAX_ENDPOINTS_EXCEEDED

SYSTEM_AD_ENDPOINTS_HIGH

SYSTEM_AD_ENDPOINTS_HIGH_CLEAR

Direct Delivery Data Path

For direct messages, there are no positive or negative acknowledgments sent back to the publisher.

Setting event thresholds for expected message rates will help identify situations where message patterns are greater than expected.

Use the following commands to set event thresholds for client egress and ingress message rates:

solace(config-egress-msg-rate-trap)# thresholds
clear-value set-value
solace(config-ingress-msg-rate-trap)# thresholds
clear-value set-value

The corresponding set and clear Syslog events generated are as follows:

SYSTEM_CLIENT_EG_MSG_RATE_HIGH

SYSTEM_CLIENT_EG_MSG_RATE_HIGH_CLEAR

SYSTEM_CLIENT_ING_MSG_RATE_HIGH

SYSTEM_CLIENT_ING_MSG_RATE_HIGH_CLEAR

Use the following commands to set event thresholds for the egress and ingress message rates on a VPN:

solace (config-msg-vpn-event)# egress-message-rate thresholds
clear-value set-value
solace (config-msg-vpn-event)# ingress-message-rate thresholds
clear-value set-value

The corresponding set and clear Syslog events generated are as follows:

VPN_VPN_EG_MSG_RATE_HIGH

VPN_VPN_EG_MSG_RATE_HIGH_CLEAR

VPN_VPN_ING_MSG_RATE_HIGH

VPN_VPN_ING_MSG_RATE_HIGH_CLEAR

Finally, an egress discard of direct messages occurs when consumers are unable to keep up. This results in the following event being written to the Syslog:

CLIENT_CLIENT_EGRESS_MSG_DISCARD

In order to monitor the size of a client’s NAB egress buffers, use the following command:

show client <name> stats queues

Both the size of these buffers and the minimum burst size can be increased, depending on the network throughput and the rate at which the client processes these messages. For more information, refer to Data Buffer Management.