Chapter 20. Solving common replication problems
Multi-supplier replication uses an eventual-consistency replication model. This means that the same entries can be changed on different servers. When replication occurs between these servers, Directory Server needs to resolve the conflicting changes. In most cases, resolution occurs automatically, based on the timestamp associated with the change on each server: the most recent change has priority. However, some conflicts require manual intervention to reach a resolution.
20.1. Identifying and solving naming conflicts
When several supplier servers receive a request to create an entry with the same distinguished name (DN), each server creates the entry with this DN and a different entry unique identifier (entry ID). The entry ID is stored in the nsuniqueid operational attribute.
For example, Server A and Server B receive a request to create the uid=user_name,ou=people,dc=example,dc=com user entry. As a result, each server has its own entry:
On Server A, the entry has:
- uid=user_name,ou=people,dc=example,dc=com
- nsuniqueid=a7f1758b-512211ec-b115e2e9-7dc2d46b
On Server B, the entry has:
- uid=user_name,ou=people,dc=example,dc=com
- nsuniqueid=643a461e-b61311e1-b23be826-4afeed5f
During replication, Server A replicates the newly created entry uid=user_name,ou=people,dc=example,dc=com to Server B, and Server B replicates its newly created entry to Server A, so a naming conflict occurs on each server. By comparing change sequence numbers (CSN), each server determines which entry was created earlier. For example, the entry on Server B was created earlier.
The automatic conflict resolution procedure changes the last entry created (the entry on Server A) in the following way:
- Adds the nsuniqueid value to the non-unique DN.
- Adds the nsds5replconflict attribute with a description of the operation that caused the conflict.
- Adds the ldapsubentry object class.
Now the following entries exist on both servers:
The valid entry with:
- uid=user_name,ou=people,dc=example,dc=com
- nsuniqueid=643a461e-b61311e1-b23be826-4afeed5f
The conflict entry with:
- nsuniqueid=a7f1758b-512211ec-b115e2e9-7dc2d46b+uid=user_name,ou=people,dc=example,dc=com
- nsuniqueid=a7f1758b-512211ec-b115e2e9-7dc2d46b
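The renaming rule can be sketched as a small shell function. This is an illustration only, not part of Directory Server: the function name conflict_dn is hypothetical, and the sketch assumes that the conflict DN is always formed as in the example above, by prefixing nsuniqueid=<value>+ to the original DN.

```shell
# Hypothetical helper: build the DN that a conflict entry receives,
# following the pattern shown in the example above.
# $1 = nsuniqueid value of the conflict entry
# $2 = original (non-unique) DN
conflict_dn() {
    printf 'nsuniqueid=%s+%s\n' "$1" "$2"
}

# Prints the conflict DN from the example.
conflict_dn "a7f1758b-512211ec-b115e2e9-7dc2d46b" \
    "uid=user_name,ou=people,dc=example,dc=com"
```

Such a helper can be useful when you know the nsuniqueid of the losing entry and want to predict the DN to pass to the repl-conflict commands.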
To solve the naming conflict manually, use the following procedure on each server.
Procedure
List the conflict entries:
# dsconf -D "cn=Directory Manager" ldap://server.example.com repl-conflict list dc=example,dc=com
dn: nsuniqueid=a7f1758b-512211ec-b115e2e9-7dc2d46b+uid=user_name,ou=people,dc=example,dc=com
cn: user_name
displayName: user
gidNumber: 99998
homeDirectory: /var/empty
legalName: user name
loginShell: /bin/false
nsds5replconflict: namingConflict (ADD) uid=user_name,ou=people,dc=example,dc=com
objectClass: top
objectClass: nsPerson
objectClass: nsAccount
objectClass: nsOrgPerson
objectClass: posixAccount
objectClass: ldapsubentry
uid: user_name
uidNumber: 99998
If conflict entries exist, decide how to proceed:
To keep only the valid entry (uid=user_name,ou=people,dc=example,dc=com) and delete the conflict entry, enter:
# dsconf -D "cn=Directory Manager" ldap://server.example.com repl-conflict delete nsuniqueid=a7f1758b-512211ec-b115e2e9-7dc2d46b+uid=user_name,ou=People,dc=example,dc=com
To keep only the conflict entry (nsuniqueid=a7f1758b-512211ec-b115e2e9-7dc2d46b+uid=user_name,ou=People,dc=example,dc=com) and delete the valid entry, enter:
# dsconf -D "cn=Directory Manager" ldap://server.example.com repl-conflict swap nsuniqueid=a7f1758b-512211ec-b115e2e9-7dc2d46b+uid=user_name,ou=People,dc=example,dc=com
To keep both entries, specify a new relative distinguished name (RDN) to rename the conflict entry:
# dsconf -D "cn=Directory Manager" ldap://server.example.com repl-conflict convert --new-rdn=uid=user_name_NEW nsuniqueid=a7f1758b-512211ec-b115e2e9-7dc2d46b+uid=user_name,ou=people,dc=example,dc=com
This command renames the conflict entry to uid=user_name_NEW,ou=people,dc=example,dc=com.
Directory Server replicates LDAP operations performed on a conflict entry. Usually, replicated operations target the entry by using the nsuniqueid of the original operation entry rather than by using the operation dn. However, in cases with conflict entries, the behavior might differ.
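When a suffix contains many conflict entries, it can help to reduce the listing from the procedure above to just the conflict DNs. The following is a minimal sketch, not an official tool: the function name conflict_dns is hypothetical, and the filter assumes that each entry starts with a single-line "dn: " field, as in the sample listing (it does not handle folded LDIF lines).

```shell
# Hypothetical filter: print only the DNs of conflict entries found in
# LDIF-style text on standard input. Only "dn:" lines whose value starts
# with "nsuniqueid=" are kept.
conflict_dns() {
    sed -n 's/^dn: \(nsuniqueid=.*\)$/\1/p'
}

# Sample input copied from the listing in the procedure (shortened).
conflict_dns <<'EOF'
dn: nsuniqueid=a7f1758b-512211ec-b115e2e9-7dc2d46b+uid=user_name,ou=people,dc=example,dc=com
cn: user_name
nsds5replconflict: namingConflict (ADD) uid=user_name,ou=people,dc=example,dc=com
objectClass: ldapsubentry
uid: user_name
EOF
```

In practice, you would pipe the output of the repl-conflict list command from the procedure into conflict_dns and review each DN before deciding whether to delete, swap, or convert it.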
20.2. Identifying and solving orphan entry conflicts
When Directory Server replicates a delete operation and the consumer server finds that the entry to be deleted has child entries, the conflict resolution procedure creates a glue entry to avoid having orphaned entries in the directory.
In the same way, when Directory Server replicates an add operation and the consumer server cannot find the parent entry, the conflict resolution procedure creates a glue entry for the parent.
Glue entries are temporary entries that include the object classes glue and extensibleObject. Glue entries can be created in several ways:
- If the conflict resolution procedure finds a deleted entry with a matching unique identifier, the glue entry has the same attributes as the deleted entry, but with the added glue object class and the nsds5ReplConflict attribute.
  In such cases, either modify the glue entry to remove the glue object class and the nsds5ReplConflict attribute to keep the entry as a normal entry, or delete the glue entry and its child entries.
- The server creates an entry with the glue and extensibleObject object classes.
Procedure
List the orphan entry conflicts:
# dsconf -D "cn=Directory Manager" ldap://server.example.com repl-conflict list-glue suffix
dn: ou=parent,dc=example,dc=com
objectClass: top
objectClass: organizationalunit
objectClass: glue
objectClass: extensibleobject
ou: parent
If orphan entry conflicts exist, decide how to proceed:
To delete a glue entry and its child entries, enter:
# dsconf -D "cn=Directory Manager" ldap://server.example.com repl-conflict delete-glue "ou=parent,dc=example,dc=com"
dn: ou=parent,dc=example,dc=com
objectClass: top
objectClass: organizationalunit
objectClass: extensibleobject
ou: parent
To convert a glue entry into a regular entry, enter:
# dsconf -D "cn=Directory Manager" ldap://server.example.com repl-conflict convert-glue "ou=parent,dc=example,dc=com"
20.3. Identifying and solving errors about obsolete or missing suppliers
Directory Server stores information about the replication topology, such as all suppliers that send updates to other replicas, in a set of metadata called the replica update vector (RUV). An RUV contains information about the supplier, such as its ID and URL, the last change sequence number (CSN) on the local server, and the CSN of the first change. Both suppliers and consumers store RUV information, and they use it to control replication updates.
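To make the CSN values in an RUV easier to read, the following sketch splits a CSN into its fields. It is an illustration under an assumption, not a supported utility: the function name decode_csn is hypothetical, and the layout (8 hexadecimal digits for the timestamp, then 4 each for the sequence number, replica ID, and sub-sequence number) is assumed here because it is consistent with the example values used in this section.

```shell
# Hypothetical helper: split a 20-character CSN into its fields.
# Assumed layout (all hexadecimal):
#   chars  1-8   timestamp (seconds since the epoch)
#   chars  9-12  sequence number
#   chars 13-16  replica ID
#   chars 17-20  sub-sequence number
decode_csn() {
    csn=$1
    ts=$(printf '%s' "$csn" | cut -c1-8)
    sq=$(printf '%s' "$csn" | cut -c9-12)
    rid=$(printf '%s' "$csn" | cut -c13-16)
    sub=$(printf '%s' "$csn" | cut -c17-20)
    # printf converts each hexadecimal field to decimal.
    printf 'epoch=%d seq=%d rid=%d subseq=%d\n' \
        "0x$ts" "0x$sq" "0x$rid" "0x$sub"
}

decode_csn 61a4d8f8000100010000
# prints: epoch=1638193400 seq=1 rid=1 subseq=0
```

The epoch value 1638193400 corresponds to 2021-11-29 13:43:20 UTC, which matches the Min CSN time that the get-ruv example in this section shows for replica 1.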
When you remove a supplier from the replication topology, information about it can remain in another replica’s RUV. You can use a cleanallruv task to remove the RUV entry from all suppliers in the topology.
Prerequisites
- Replication is enabled.
Procedure
Monitor the /var/log/dirsrv/slapd-instance_name/errors log file and search for entries similar to the following:
[22/Jan/2021:17:16:01 -0500] NSMMReplicationPlugin - ruv_compare_ruv: RUV [changelog max RUV] does not contain element [{replica 8 ldap://server2.example.com:389} 4aac3e59000000080000 4c6f2a02000000080000] which is present in RUV [database RUV]
...
[22/Jan/2021:17:16:01 -0500] NSMMReplicationPlugin - replica_check_for_data_reload: Warning: for replica dc=example,dc=com there were some differences between the changelog max RUV and the database RUV. If there are obsolete elements in the database RUV, you should remove them using the CLEANALLRUV task. If they are not obsolete, you should check their status to see why there are no changes from those servers in the changelog.
In this case, the replica ID 8 causes this error.
Display all RUV records and replica IDs, both valid and invalid:
# dsconf -D "cn=Directory Manager" ldap://server1.example.com replication get-ruv --suffix "dc=example,dc=com"
RUV: {replica 1 ldap://server1.example.com} 61a4d8f8000100010000 61a4f5b8000000010000
Replica ID: 1
LDAP URL: ldap://server1.example.com
Min CSN: 2021-11-29 13:43:20 1 0 (61a4d8f8000100010000)
Max CSN: 2021-11-29 15:46:00 (61a4f5b8000000010000)
RUV: {replica 2 ldap://server2.example.com} 61a4d8fb000100020000 61a4f550000000020000
Replica ID: 2
LDAP URL: ldap://server2.example.com
Min CSN: 2021-11-29 13:43:23 1 0 (61a4d8fb000100020000)
Max CSN: 2021-11-29 15:44:16 (61a4f550000000020000)
RUV: {replica 8 ldap://server3.example.com} 61a4d903000100080000 61a4d908000000080000
Replica ID: 8
LDAP URL: ldap://server3.example.com
Min CSN: 2021-11-29 13:43:31 1 0 (61a4d903000100080000)
Max CSN: 2021-11-29 13:43:36 (61a4d908000000080000)
Note the list of returned replica IDs: 1, 2, and 8.
Run the cleanup task for replica ID 8:
# dsconf -D "cn=Directory Manager" ldap://server1.example.com repl-tasks cleanallruv --suffix="dc=example,dc=com" --replica-id=8
Note that Directory Server replicates RUV cleanup tasks. Therefore, you need to start the tasks on only one supplier.
If one of the replicas cannot be reached, for example because it is down, you can use the --force-cleaning option to clean up the RUV immediately.
Verification
Display the RUV records and replica IDs:
# dsconf -D "cn=Directory Manager" ldap://server1.example.com replication get-ruv --suffix "dc=example,dc=com"
RUV: {replica 1 ldap://server1.example.com} 61a4d8f8000100010000 61a4f5b8000000010000
Replica ID: 1
LDAP URL: ldap://server1.example.com
Min CSN: 2021-11-29 13:43:20 1 0 (61a4d8f8000100010000)
Max CSN: 2021-11-29 15:46:00 (61a4f5b8000000010000)
RUV: {replica 2 ldap://server2.example.com} 61a4d8fb000100020000 61a4f550000000020000
Replica ID: 2
LDAP URL: ldap://server2.example.com
Min CSN: 2021-11-29 13:43:23 1 0 (61a4d8fb000100020000)
Max CSN: 2021-11-29 15:44:16 (61a4f550000000020000)
The command no longer returns an RUV entry for replica ID 8.
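The verification step can also be scripted. The following sketch is only an illustration: the function names ruv_replica_ids and ruv_id_absent are hypothetical, and the filter assumes that each RUV record contains a "Replica ID: N" line, as in the example output in this section.

```shell
# Hypothetical filter: print the replica IDs found in "Replica ID:" lines
# of get-ruv style text read from standard input.
ruv_replica_ids() {
    sed -n 's/^ *Replica ID: *\([0-9][0-9]*\) *$/\1/p'
}

# Hypothetical check: succeed only if replica ID $1 does not appear in the
# get-ruv style text on standard input.
ruv_id_absent() {
    ! ruv_replica_ids | grep -qx "$1"
}

# Sample input based on the verification output (shortened).
if ruv_id_absent 8 <<'EOF'
RUV: {replica 1 ldap://server1.example.com} 61a4d8f8000100010000 61a4f5b8000000010000
Replica ID: 1
LDAP URL: ldap://server1.example.com
RUV: {replica 2 ldap://server2.example.com} 61a4d8fb000100020000 61a4f550000000020000
Replica ID: 2
LDAP URL: ldap://server2.example.com
EOF
then
    echo "replica ID 8 is no longer in the RUV"
fi
```

To use it against a live server, pipe the output of the get-ruv command from the verification step into ruv_id_absent with the replica ID that you cleaned.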
20.4. Stopping the cleanallruv task on a supplier
For performance or maintenance purposes, you can stop the cleanallruv task if it runs for a long time. Use the dsconf utility to stop the task.
Prerequisites
- Replication is enabled.
Procedure
Display all cleanallruv tasks on a supplier:
# dsconf <instance_name> repl-tasks list-cleanruv-tasks
dn: cn=cleanallruv_2025-04-15T09:15:18.535868,cn=cleanallruv,cn=tasks,cn=config
cn: cleanallruv_2025-04-15T09:15:18.535868
nsTaskCreated: 20250415131518Z
...
nsTaskStatus: Not all replicas online, retrying in 20 seconds...
nsTaskTotalItems: 1
nsTaskWarning: 0
objectClass: top
objectClass: extensibleObject
replica-base-dn: dc=example,dc=com
replica-id: 2
The example shows that the cleanallruv task cannot complete because a replica is unresponsive. In some cases, this can negatively impact server performance.
Stop the cleanallruv task:
# dsconf <instance_name> repl-tasks abort-cleanallruv --suffix "dc=example,dc=com" --replica-id 2
Additionally, you can use the --certify option to force Directory Server to stop the cleanallruv task on all replicas.
Verification
Display all cleanallruv tasks on the supplier:
# dsconf <instance_name> repl-tasks list-cleanruv-tasks
dn: cn=cleanallruv_2025-04-15T09:15:18.535868,cn=cleanallruv,cn=tasks,cn=config
cn: cleanallruv_2025-04-15T09:15:18.535868
nsTaskCreated: 20250415131518Z
...
nsTaskStatus: Task aborted for rid(2).
nsTaskTotalItems: 1
nsTaskWarning: 0
objectClass: top
objectClass: extensibleObject
replica-base-dn: dc=example,dc=com
replica-id: 2
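If you check task state repeatedly, a small filter can reduce the task listing to the status line. This is a sketch only: the function names cleanruv_task_status and cleanruv_task_aborted are hypothetical, and the status text "Task aborted for rid(N)" is taken from the example output in this procedure, so other versions might word it differently.

```shell
# Hypothetical filter: print the value of every nsTaskStatus line from
# task-listing text on standard input.
cleanruv_task_status() {
    sed -n 's/^nsTaskStatus: *//p'
}

# Hypothetical check: succeed if the listing on standard input reports
# that the task for replica ID $1 was aborted.
cleanruv_task_aborted() {
    grep -q "^nsTaskStatus: Task aborted for rid($1)"
}

# Sample input copied from the verification output (shortened).
cleanruv_task_status <<'EOF'
cn: cleanallruv_2025-04-15T09:15:18.535868
nsTaskStatus: Task aborted for rid(2).
replica-id: 2
EOF
```

With a live server, you would pipe the output of the list-cleanruv-tasks command from the verification step into either function.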
Additional resources