Partition Handoff

Each time the broker performs Partition Rebalancing, there is a point in the process where one or more partitions must be reassigned to different consumers (flows). This is called a partition handoff. The broker attempts to perform a partition handoff with as few disruptions as possible—that is, a "graceful" partition handoff. Graceful is not necessarily perfect; there may be a disruption to message flow depending on the behavior of consumer applications. Guidelines for consumer applications are discussed in Best Practices for Consuming Applications, below.

The graceful partition handoff process attempts to achieve the following goals (listed in order from most to least desirable):

  • Goal 1: Prevent disruptions altogether. This is ideal, but requires certain consumer behaviors. See Best Practices for Consuming Applications, below.
  • Goal 2: Restrict disruptions to the partition whose consumer is changing, such that other partitions are not affected.
  • Goal 3: Limit disruptions to a subset of the key sequences mapped to the partition. The term "key sequence" refers to a series of messages with the same key that must be processed in sequence and to completion. The start and stop (that is, the length) of the key sequence is unknown to the event broker.

Let's consider an example that illustrates the factors involved in partition handoff. Suppose we have an application consisting of a number of microservices (publishers and consumers). As shown in the diagram below, the application initially has one publisher and one consumer. Messages are represented by the letters A and B (representing the partition key; for example, customer numbers) and the numbers 1 through 5 (representing the sequence within the key; for example, invoice numbers). There is a single partitioned queue with two partitions, assigned to Consumer 1. Suppose a new consumer (Consumer 2) becomes available, so that Partition 1 needs to be handed off from Consumer 1 to Consumer 2. In this example, Partition 0 is not affected in any way (satisfying Goal 2); only messages relating to key B are impacted.

A diagram that illustrates the example described by the surrounding text.

The state of messages B1 through B5 from the perspective of the event broker is as follows:

  • B1 has been delivered to the flow but has not been acknowledged, and so remains in the partition for potential redelivery. (The Consumer 1 client has received B1 and is in the process of handling it.)
  • B2 has been delivered to the flow but has not been acknowledged, and so remains in the partition for potential redelivery. (The consumer has not yet received B2.)
  • B3 has been received but has not been delivered to any flow.
  • B4 is unknown to the broker because it is inflight from the publisher.
  • B5 is unknown to the broker because the publisher has not yet sent it.

A graceful partition handoff consists of the following steps:

  1. The event broker pauses new message delivery from the affected partition to the current flow (they remain in the queue).
    • In the example, message B3 in Partition 1 is held in the queue and not sent to the flow for Consumer 1. After the queue receives them, messages B4 and B5 are also held and not sent to the flow.
  2. If required, the event broker waits for the configured rebalance max-handoff-time (so that outstanding messages can be acknowledged by the consumers).
    • In the example, Consumer 1 acknowledges messages B1 and B2.
  3. The event broker updates the partition-to-flow mapping, so that the new flow is assigned to the affected partition.
    • In the example, the flow for Consumer 2 is now assigned to Partition 1.
  4. The event broker resumes delivery, now including the new flow.
    • Here, B3, B4, and B5 are delivered to Consumer 2.

The result of the handoff (assuming all messages have now been acknowledged) is shown in the diagram below:

A diagram that illustrates the example described by the surrounding text.

Our example application may expect that all messages B1 through B5 would be delivered to the same consumer, so that state from earlier messages can be used when handling subsequent messages. This is the normal behavior of the partitioned queue, but due to the handoff, B1 and B2 are sent to Consumer 1, and B3 through B5 are sent to Consumer 2. That is, the key sequence for B has been disrupted at some point in its middle.

It is important to note that this disruption of the key sequence occurs only if the key sequence has undelivered messages. The particular sequence in the example (B3-B5) was disrupted, but might be only one of many independent key sequences passing through the partition as it was being reassigned. The same publisher, or any other publisher, might be generating full key sequences whose messages are inflight towards the consumer, or not yet been seen by any consumer. After the delivery pause, these messages are handled without disruption to their key sequences (satisfying Goal 3).

Best Practices for Consuming Applications

Ideally, partition handoffs are transparent to the consuming application. Although handoffs are designed to be as unobtrusive as possible to the consumer, they may cause application-level errors as the flow of messages belonging to one partition is moved from one consumer to another.

It is possible to design your consuming applications so that they are resilient to any message flow disruption that results from partition handoff (Goal 1). Some possible ways to achieve this are:

  • Use a shared database to maintain state. The diagrams for the preceding example show a database that all consumers can read from and write to. If each instance of the application writes to the database any state needed for future processing of a given key before acknowledging the corresponding message back to the broker, then any other instance of the application could handle a message with a key it had not previously encountered by first retrieving state for this key from the shared database.
  • Acknowledge only after receiving the last message in a sequence. If the application knows the start and end values of a key sequence, it can delay sending acknowledgments for all messages in the sequence until the final message has been received and processed. That way, if a sequence is disturbed, the event broker will resend the entire sequence, in order, to a new consumer. This approach doesn't eliminate the problem, but it reduces the size of the window during which the application is exposed (a handoff could still occur during the time it takes to acknowledge all messages back-to-back).

It might not be possible or practical to design applications that are so sophisticated. In particular, some design decisions can result in disruption to the partition key sequence. For example:

  • The consumer might hold its per-key state solely in RAM and automatically acknowledge each message so it can be as simple and performant as possible.
  • The consumer might not acknowledge messages immediately, and instead be more selective based on the message sequence, thus exceeding the rebalance timer or max-handoff timer for the queue. The consumer may not be able to acknowledge messages within the configured time limits. This can cause message duplication. You can adjust the max-delivered-unacked-msgs-per-flow setting (similar to how you would for other non-exclusive queue scenarios where there are different processing times for different message workloads) to prevent this.
  • The administrator might set the timeout values low, perhaps even to 0, to cause new key sequences to be handled in the new consumer in preference to trying to gracefully finish handling old key sequences in the old consumer.
  • The consumer could be designed to fail and restart a transaction or process.

The recovery methods in these cases are application specific. You must decide how your applications behave when the key sequence of their received messages never completes, or when they are given messages where the keys don't start at the beginning of the key sequence.