Chapter 11. Recover from an out-of-sync passive site
This chapter describes the procedures required to synchronize the secondary site with the primary site in a setup as outlined in Concepts for active-passive deployments together with the blueprints outlined in Building blocks active-passive deployments.
11.1. When to use procedure
Use this procedure after a temporary disconnection between sites during which Data Grid was disconnected and the contents of the caches became out of sync.
At the end of the procedure, the session contents on the secondary site have been discarded and replaced by the session contents of the primary site. All caches in the secondary site have been cleared to prevent invalid cached contents.
See the Multi-site deployments chapter for different operational procedures.
11.2. Procedures
11.2.1. Data Grid Cluster
For the context of this chapter, Site-A is the primary site and is active, and Site-B is the secondary site and is passive.
A network partition between the sites stops the replication between the Data Grid clusters. These procedures bring both sites back in sync.
Transferring the full state may impact the Data Grid cluster performance by increasing the response time and/or resource usage.
The first procedure is to delete the stale data from the secondary site.
- Log in to your secondary site.
Shut down Red Hat build of Keycloak. This clears all Red Hat build of Keycloak caches and prevents the state of Red Hat build of Keycloak from being out of sync with Data Grid.
When deploying Red Hat build of Keycloak using the Red Hat build of Keycloak Operator, change the number of Red Hat build of Keycloak instances in the Red Hat build of Keycloak Custom Resource to 0.
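For Operator-based deployments, the instance count can also be changed from the command line. The following is a minimal sketch rather than a definitive procedure. It assumes that the Custom Resource exposes the count as spec.instances. The helper name keycloak_instances_patch and the CR name keycloak in the usage comment are hypothetical.

```shell
# Build a JSON merge patch that sets the instance count of a Keycloak CR.
# Assumption: the CR stores the count under spec.instances.
keycloak_instances_patch() {
  local count="$1"
  # Reject anything that is not a non-negative integer.
  case "$count" in
    ''|*[!0-9]*)
      echo "instance count must be a non-negative integer" >&2
      return 1
      ;;
  esac
  printf '{"spec":{"instances":%s}}\n' "$count"
}

# Hypothetical usage (namespace as in this chapter, CR name is an assumption):
# oc -n keycloak patch keycloaks.k8s.keycloak.org/keycloak --type merge \
#   -p "$(keycloak_instances_patch 0)"
```

Note the original instance count before scaling down, because the last step of this procedure restores it.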
Connect to the Data Grid cluster using the Data Grid CLI tool:
Command:
oc -n keycloak exec -it pods/infinispan-0 -- ./bin/cli.sh --trustall --connect https://127.0.0.1:11222
The command prompts for the username and password of the Data Grid cluster. These credentials are the ones set in the configuring credentials section of the Deploy Data Grid for HA with the Data Grid Operator chapter.
Output:
Username: developer
Password:
[infinispan-0-29897@ISPN//containers/default]>
Note: The pod name depends on the cluster name defined in the Data Grid CR. The connection can be done with any pod in the Data Grid cluster.
Disable the replication from the secondary site to the primary site by running the following command. This prevents the clear request from reaching the primary site and deleting all the correct cached data there.
Command:
site take-offline --all-caches --site=site-a
Output:
Check that the replication status is offline.
Command:
site status --all-caches --site=site-a
Output:
{ "status" : "offline" }
If the status is not offline, repeat the previous step.
Warning: Make sure the replication is offline; otherwise, clearing the data will clear both sites.
Clear all the cached data in the secondary site using the following commands:
Command:
These commands do not print any output.
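When several caches have to be cleared, the list of CLI lines can be generated instead of typed by hand. The following is a minimal sketch under stated assumptions: it assumes the Data Grid CLI clears a single cache with clearcache followed by the cache name. The helper name clear_commands and the cache names in the usage comment are placeholders and not a list taken from this chapter.

```shell
# Print one CLI clear command per cache name passed as an argument.
# Assumption: the Data Grid CLI clears a cache with "clearcache <name>".
clear_commands() {
  local cache
  for cache in "$@"; do
    printf 'clearcache %s\n' "$cache"
  done
}

# Hypothetical usage; replace the placeholder names with the caches
# that exist in your deployment, then paste the lines into the CLI session:
# clear_commands sessions work
```

Run the generated lines only after the replication status has been confirmed as offline.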
Re-enable the cross-site replication from the secondary site to the primary site.
Command:
site bring-online --all-caches --site=site-a
Output:
Check that the replication status is online.
Command:
site status --all-caches --site=site-a
Output:
{ "status" : "online" }
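The status checks in this procedure can be turned into a small guard when the steps are scripted. The sketch below only inspects text in the format shown in the outputs above, { "status" : "..." }. The function name site_status_is is hypothetical, and how the CLI output reaches the function (pasted, piped, or captured from a batch run) is left open.

```shell
# Succeed only if the CLI output on stdin reports the expected status.
# The expected format, { "status" : "offline" }, follows the outputs in this chapter.
site_status_is() {
  local expected="$1"
  grep -Eq "\"status\"[[:space:]]*:[[:space:]]*\"${expected}\""
}

# Hypothetical usage: refuse to continue unless replication is offline.
# echo '{ "status" : "offline" }' | site_status_is offline || echo "replication is not offline, stop"
```

Any output that does not contain the expected status, including an empty one, makes the function fail, which is the safer default before clearing caches.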
The state can now be transferred from the primary site to the secondary site.
- Log in to your primary site.
Connect to the Data Grid cluster using the Data Grid CLI tool:
Command:
oc -n keycloak exec -it pods/infinispan-0 -- ./bin/cli.sh --trustall --connect https://127.0.0.1:11222
The command prompts for the username and password of the Data Grid cluster. These credentials are the ones set in the configuring credentials section of the Deploy Data Grid for HA with the Data Grid Operator chapter.
Output:
Username: developer
Password:
[infinispan-0-29897@ISPN//containers/default]>
Note: The pod name depends on the cluster name defined in the Data Grid CR. The connection can be done with any pod in the Data Grid cluster.
Trigger the state transfer from the primary site to the secondary site.
Command:
site push-site-state --all-caches --site=site-b
Output:
Check that the replication status is online for all caches.
Command:
site status --all-caches --site=site-b
Output:
{ "status" : "online" }
Wait for the state transfer to complete by checking the output of the push-site-status command for all caches.
Command:
site push-site-status --all-caches --site=site-b
Output:
See the table in the Cross-Site Documentation for the possible status values.
If an error is reported, repeat the state transfer for that specific cache.
Command:
site push-site-state --cache=<cache-name> --site=site-b
Clear or reset the state transfer status with the following command:
Command:
Output:
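If several caches report an error, the per-cache retry command shown above can be generated for each of them. The following is a small sketch. The helper name push_retry_commands is hypothetical, and the cache names in the usage comment are placeholders for whichever caches reported an error.

```shell
# Print one state-transfer retry command per failed cache.
# First argument: target site name; remaining arguments: cache names.
push_retry_commands() {
  local site="$1"
  shift
  local cache
  for cache in "$@"; do
    printf 'site push-site-state --cache=%s --site=%s\n' "$cache" "$site"
  done
}

# Hypothetical usage with placeholder cache names:
# push_retry_commands site-b sessions work
```

The printed lines can be pasted into the CLI session on the primary site.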
Now that the state is available in the secondary site, Red Hat build of Keycloak can be started again:
- Log in to your secondary site.
Start Red Hat build of Keycloak.
When deploying Red Hat build of Keycloak using the Red Hat build of Keycloak Operator, change the number of Red Hat build of Keycloak instances in the Red Hat build of Keycloak Custom Resource to the original value.
11.2.2. AWS Aurora Database
No action required.
11.2.3. Route53
No action required.
11.3. Further reading
See Concepts to automate Data Grid CLI commands to learn how to automate Infinispan CLI commands.