Chapter 10. Detecting duplicate messages
Brokers can automatically detect and filter duplicate messages, eliminating the need for custom detection logic. Duplicates result from clients resending messages after unexpected connection failures because the client cannot determine if the broker received the initial message.
The following is an example of a scenario where a duplicate message might be sent to the broker. Suppose that a client sends a message to the broker. If the broker or connection fails before the message is received and processed by the broker, the message never arrives at its address. The client does not receive a response from the broker due to the failure. If the broker or connection fails after the message is received and processed by the broker, the message is routed correctly, but the client still does not receive a response.
In addition, using a transaction to determine success does not necessarily help in these cases. If the broker or connection fails while the transaction commit is being processed, the client is still unable to determine whether it successfully sent the message.
In these situations, to correct the assumed failure, the client resends the most recent message. The result might be a duplicate message that negatively impacts your system. For example, if you are using the broker in an order-fulfilment system, a duplicate message might mean that a purchase order is processed twice.
The following procedures show how to configure duplicate message detection to protect against these types of situations.
10.1. Configuring the duplicate ID cache
To enable duplicate detection, producers must assign a unique ID to the _AMQ_DUPL_ID property in each message. The broker caches these IDs for each address and checks the ID of each incoming message against the cache. If the ID is already in the cache, the broker treats the message as a duplicate. You can customize the size of the cache and whether it is persisted.
Each address has its own cache. Each cache is circular and fixed in size. Once a cache is full, each new entry replaces the oldest one.
Procedure
- Open the <broker_instance_dir>/etc/broker.xml configuration file.
- Within the core element, add the id-cache-size and persist-id-cache properties and specify values.

id-cache-size
Maximum size of the ID cache, specified as the number of individual entries in the cache. The default value is 20,000 entries. For example, a value of 5000 limits the cache to 5,000 entries.

Note: After the cache reaches its maximum size, the broker might process duplicate messages. For example, suppose that you set the size of the cache to 3000. If a previous message arrived more than 3,000 messages before the arrival of a new message with the same value of _AMQ_DUPL_ID, the ID of the earlier message is no longer in the cache and the broker cannot detect the duplicate. As a result, the broker processes both messages.

persist-id-cache
When the value of this property is set to true, the broker persists IDs to disk as they are received. The default value is true. To disable persistence, set the value to false.
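The settings described in this procedure might look like the following fragment of broker.xml. This is a minimal sketch: the enclosing configuration root element is an assumption about the file layout, and all other content of the core element is omitted. The values match the ones discussed above, a 5,000-entry cache with persistence disabled.

```xml
<!-- Sketch only: the <configuration> wrapper is assumed; other <core> settings are omitted. -->
<configuration>
  <core>
    <!-- Keep at most 5,000 duplicate IDs per address (default: 20,000) -->
    <id-cache-size>5000</id-cache-size>
    <!-- Do not persist duplicate IDs to disk (default: true) -->
    <persist-id-cache>false</persist-id-cache>
  </core>
</configuration>
```

In an existing broker.xml file, add only the two property elements to the core element that is already present rather than replacing the element.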
10.2. Configuring duplicate detection for cluster connections
You can configure cluster connections to insert a unique duplicate ID header in each message that moves across the cluster. If the connection is interrupted or the target server crashes, this unique ID allows the target server to detect whether a resent message was already received.
Prerequisites
- You configured a broker cluster. For more information, see Section 14.2, “Creating a broker cluster”.
Procedure
- Open the <broker_instance_dir>/etc/broker.xml configuration file.
- Within the core element, for a given cluster connection, add the use-duplicate-detection property and specify a value.

use-duplicate-detection
When the value of this property is set to true, the cluster connection inserts a duplicate ID header for each message that it handles.
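The cluster connection setting described in this procedure might look like the following fragment of broker.xml. This is a sketch under assumptions: the configuration, cluster-connections, and cluster-connection wrapper elements reflect an assumed file layout, the connection name my-cluster is hypothetical, and the other settings that a cluster connection requires are omitted.

```xml
<!-- Sketch only: the wrapper elements and the connection name "my-cluster" are assumptions. -->
<configuration>
  <core>
    <cluster-connections>
      <cluster-connection name="my-cluster">
        <!-- Insert a duplicate ID header in each message that this connection handles -->
        <use-duplicate-detection>true</use-duplicate-detection>
        <!-- Other cluster connection settings omitted -->
      </cluster-connection>
    </cluster-connections>
  </core>
</configuration>
```

In an existing configuration, add the use-duplicate-detection element to the cluster connection that is already defined rather than creating a new one.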