Monitoring & Managing Disk Consumption

If your appliance uses a SAN, or your software event broker's message spool resides on a dedicated volume, Solace recommends as a best practice that you keep disk space usage below 30%.

Using SYSTEM_CHASSIS_DISK_UTILIZATION_HIGH

To keep track of disk space usage, monitor for the limit being exceeded by using the SYSTEM_CHASSIS_DISK_UTILIZATION_HIGH event, with the alert configured to trigger when the disk is 30% full. For configuration instructions, refer to Configuring System Event Thresholds.

Receiving SYSTEM_CHASSIS_DISK_UTILIZATION_HIGH

If a SYSTEM_CHASSIS_DISK_UTILIZATION_HIGH event is received, the next step is to determine why.

The most common reason for receiving a SYSTEM_CHASSIS_DISK_UTILIZATION_HIGH event is that clients aren't acknowledging old messages, so a large amount of disk space is consumed storing them. The section Diagnosing the Sparse Message Spool File Condition shows you how to identify this situation.

Diagnosing the Sparse Message Spool File Condition

Disk usage is always greater than persistent store usage. However, if disk space usage is several multiples of persistent store usage, there are likely many message spool files on the disk that each contain only a few messages. This is referred to as a sparse message spool file condition.

To help you determine if this condition is present, consider the following rule:

  • if disk space usage is > 30%
  • and disk space usage is >= 3 times persistent store usage
  • then you likely have sparse message spool files unnecessarily using up disk space
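The rule above can be sketched as a small check. This is only an illustration: the function name and parameters are hypothetical, and the values are assumed to have been read manually from the show message-spool detail output rather than obtained through any broker API.

```python
def sparse_spool_likely(disk_use_pct: float,
                        disk_used_gb: float,
                        persistent_store_gb: float) -> bool:
    """Apply the rule of thumb for the sparse message spool file condition.

    All three inputs are hypothetical parameters supplied by the operator:
      disk_use_pct        -- Use% of the active disk partition
      disk_used_gb        -- disk space used on the active partition, in GB
      persistent_store_gb -- current persistent store usage, in GB
    """
    over_threshold = disk_use_pct > 30                      # disk space usage is > 30%
    high_ratio = disk_used_gb >= 3 * persistent_store_gb    # >= 3 times persistent store usage
    return over_threshold and high_ratio


if __name__ == "__main__":
    # Example: 75% full, 150 GB on disk, 10 GB of persistent store usage.
    print(sparse_spool_likely(75, 150, 10))
```

A result of True means only that the condition is likely; it does not replace inspecting the spool for old, unacknowledged messages.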

You can use the output of the CLI command show message-spool detail to determine whether the ratio of disk space usage to persistent store usage is too high. For example, shown below are two snippets from the output of the show message-spool detail command:

                                              ADB            Disk              Total
Current Persistent Store Usage (MB)         0.000     10 000.0000         10 000.000
Number of Messages Currently Spooled            0         100 000            100 000

Notice that in the above snippet, the current persistent store usage is 10 000.0000 MB, or 10 GB. In the second snippet, shown below, there are 150.00 million 1 KB disk blocks in use on the active event broker, which means that 150 GB of disk space is being consumed, or 75% of the partition. Disk usage is therefore above 30%, and 150 GB is 15 times the 10 GB of persistent store usage, well over the 3 times threshold. In accordance with the rule, this suggests that the sparse message spool file condition exists on the disk.

Disk Partition    1K-blocks         Used        Available    Use%    Mounted on
Active             200.0 Mi    150.00 Mi          50.0 Mi     75%    /usr/sw/externalSpool/p1
Standby            200.0 Mi       0.0 Mi         200.0 Mi      0%    /usr/sw/externalSpool/p2
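The arithmetic behind this example can be reproduced as follows. This is a minimal sketch that assumes decimal units, as the text does (one million 1 KB blocks is treated as 1 GB, and 1 000 MB as 1 GB). The variable and function names are illustrative, and the numbers are copied by hand from the two snippets.

```python
def blocks_1k_to_gb(blocks: float) -> float:
    """Convert a count of 1 KB disk blocks to GB (decimal: 1 GB = 1 000 000 KB)."""
    return blocks / 1_000_000


def mb_to_gb(megabytes: float) -> float:
    """Convert MB to GB (decimal: 1 GB = 1 000 MB)."""
    return megabytes / 1_000


# Values taken from the example output above.
used_blocks = 150_000_000        # "Used" column, Active partition: 150.00 Mi 1K-blocks
total_blocks = 200_000_000       # "1K-blocks" column, Active partition: 200.0 Mi
persistent_store_mb = 10_000.0   # Current Persistent Store Usage (MB), Total column

disk_used_gb = blocks_1k_to_gb(used_blocks)
persistent_store_gb = mb_to_gb(persistent_store_mb)
use_pct = 100 * used_blocks / total_blocks
ratio = disk_used_gb / persistent_store_gb

if __name__ == "__main__":
    print(f"disk used: {disk_used_gb} GB, persistent store: {persistent_store_gb} GB")
    print(f"use: {use_pct}%, ratio: {ratio}x")
```

Both conditions of the rule hold here: usage is above 30%, and the ratio is well above 3.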

Sparse Message Spool Files

Whether the disk usage is caused by sparse spool files or simply by a large number of messages spooled for slow or offline consumers, you must identify old messages and then consume or delete them. This allows the message spool files to be deleted, which frees up disk space.

Alternatively, you can defragment and consolidate message spool files to optimize the broker's use of disk storage. Note that this feature is only available in version 9.3.1.5 and later. For more information, refer to Defragmenting the Guaranteed Messaging Spool.

If the disk reaches capacity, the event broker won't accept new guaranteed messages until older messages are consumed or deleted.