Performing disaster recovery with Identity Management
Recovering IdM after a server or data loss
概要
Providing feedback on Red Hat documentation リンクのコピーリンクがクリップボードにコピーされました!
We appreciate your feedback on our documentation. Let us know how we can improve it.
Submitting feedback through Jira (account required)
- Log in to the Jira website.
- Click Create in the top navigation bar
- Enter a descriptive title in the Summary field.
- Enter your suggestion for improvement in the Description field. Include links to the relevant parts of the documentation.
- Click Create at the bottom of the dialogue.
第1章 Disaster scenarios in IdM リンクのコピーリンクがクリップボードにコピーされました!
Prepare and respond to various disaster scenarios in Identity Management (IdM) systems that affect servers, data, or entire infrastructures.
| Disaster type | Example causes | How to prepare | How to respond |
|---|---|---|---|
| Server loss: The IdM deployment loses one or several servers. |
| ||
| Data loss: IdM data is unexpectedly modified on a server, and the change is propagated to other servers. |
| ||
| Total infrastructure loss: All IdM servers or Certificate Authority (CA) replicas are lost with no VM snapshots or data backups available. |
| This situation is a total loss. |
A total loss scenario occurs when all Certificate Authority (CA) replicas or all IdM servers are lost, and no virtual machine (VM) snapshots or backups are available for recovery. Without CA replicas, the IdM environment cannot deploy additional replicas or rebuild itself, making recovery impossible. To avoid such scenarios, ensure backups are stored off-site, maintain multiple geographically redundant CA replicas, and connect each replica to at least two others.
第2章 Recovering a single server with replication リンクのコピーリンクがクリップボードにコピーされました!
If a single server is severely disrupted or lost, having multiple replicas ensures you can create a replacement replica and quickly restore the former level of redundancy.
If your IdM topology contains an integrated Certificate Authority (CA), the steps for removing and replacing a damaged replica differ for the CA renewal server and other replicas.
2.1. Recovering from losing the CA renewal server リンクのコピーリンクがクリップボードにコピーされました!
If the Certificate Authority (CA) renewal server is lost, you must first promote another CA replica to fulfill the CA renewal server role, and then deploy a replacement CA replica.
Prerequisites
- Your deployment uses IdM’s internal Certificate Authority (CA).
- Another Replica in the environment has CA services installed.
An IdM deployment is unrecoverable if:
- The CA renewal server has been lost.
- No other server has a CA installed.
No backup of a replica with the CA role exists.
It is critical to make backups from a replica with the CA role so certificate data is protected. For more information about creating and restoring from backups, see Backing up and restoring IdM.
Procedure
- From another replica in your environment, promote another CA replica in the environment to act as the new CA renewal server. See Changing and resetting IdM CA renewal server.
- From another replica in your environment, remove replication agreements to the lost CA renewal server. See Removing server from topology using the CLI.
- Install a new CA Replica to replace the lost CA replica. See Installing an IdM replica with a CA.
- Update DNS to reflect changes in the replica topology. If IdM DNS is used, DNS service records are updated automatically.
- Verify IdM clients can reach IdM servers. See Adjusting IdM clients during recovery.
Verification
Test the Kerberos server on the new replica by successfully retrieving a Kerberos Ticket-Granting-Ticket as an IdM user.
[root@server ~]# kinit admin Password for admin@EXAMPLE.COM: [root@server ~]# klist Ticket cache: KCM:0 Default principal: admin@EXAMPLE.COM Valid starting Expires Service principal 10/31/2019 15:51:37 11/01/2019 15:51:02 HTTP/server.example.com@EXAMPLE.COM 10/31/2019 15:51:08 11/01/2019 15:51:02 krbtgt/EXAMPLE.COM@EXAMPLE.COMTest the Directory Server and SSSD configuration by retrieving user information.
[root@server ~]# ipa user-show admin User login: admin Last name: Administrator Home directory: /home/admin Login shell: /bin/bash Principal alias: admin@EXAMPLE.COM UID: 1965200000 GID: 1965200000 Account disabled: False Password: True Member of groups: admins, trust admins Kerberos keys available: TrueTest the CA configuration with the
ipa cert-showcommand.[root@server ~]# ipa cert-show 1 Issuing CA: ipa Certificate: MIIEgjCCAuqgAwIBAgIjoSIP... Subject: CN=Certificate Authority,O=EXAMPLE.COM Issuer: CN=Certificate Authority,O=EXAMPLE.COM Not Before: Thu Oct 31 19:43:29 2019 UTC Not After: Mon Oct 31 19:43:29 2039 UTC Serial number: 1 Serial number (hex): 0x1 Revoked: False
2.2. Recovering from losing a regular replica リンクのコピーリンクがクリップボードにコピーされました!
To replace a replica that is not the Certificate Authority (CA) renewal server, remove the lost replica from the topology and install a new replica in its place.
Prerequisites
- The CA renewal server is operating properly. If the CA renewal server has been lost, see Recovering from losing the CA renewal server.
Procedure
- Remove replication agreements to the lost server. See Uninstalling an IdM server.
- Deploy a new replica with the corresponding services (CA, KRA, DNS). See Installing an IdM replica.
- Update DNS to reflect changes in the replica topology. If IdM DNS is used, DNS service records are updated automatically.
- Verify IdM clients can reach IdM servers. See Adjusting IdM clients during recovery.
Verification
Test the Kerberos server on the new replica by successfully retrieving a Kerberos Ticket-Granting-Ticket as an IdM user.
[root@newreplica ~]# kinit admin Password for admin@EXAMPLE.COM: [root@newreplica ~]# klist Ticket cache: KCM:0 Default principal: admin@EXAMPLE.COM Valid starting Expires Service principal 10/31/2019 15:51:37 11/01/2019 15:51:02 HTTP/server.example.com@EXAMPLE.COM 10/31/2019 15:51:08 11/01/2019 15:51:02 krbtgt/EXAMPLE.COM@EXAMPLE.COMTest the Directory Server and SSSD configuration on the new replica by retrieving user information.
[root@newreplica ~]# ipa user-show admin User login: admin Last name: Administrator Home directory: /home/admin Login shell: /bin/bash Principal alias: admin@EXAMPLE.COM UID: 1965200000 GID: 1965200000 Account disabled: False Password: True Member of groups: admins, trust admins Kerberos keys available: True
第3章 Recovering multiple servers with replication リンクのコピーリンクがクリップボードにコピーされました!
If multiple servers are lost at the same time, determine if the environment can be rebuilt by seeing which one of the following five scenarios applies to your situation.
3.1. Recovering from losing multiple servers in a CA-less deployment リンクのコピーリンクがクリップボードにコピーされました!
Servers in a CA-less deployment are all considered equal, so you can rebuild the environment by removing and replacing lost replicas in any order.
Prerequisites
- Your deployment uses an external Certificate Authority (CA).
Procedure
3.2. Recovering from losing multiple servers when the CA renewal server is unharmed リンクのコピーリンクがクリップボードにコピーされました!
If the CA renewal server is intact, you can replace other servers in any order.
Prerequisites
- Your deployment uses the IdM internal Certificate Authority (CA).
Procedure
3.3. Recovering from losing the CA renewal server and other servers リンクのコピーリンクがクリップボードにコピーされました!
If you lose the CA renewal server and other servers, promote another CA server to the CA renewal server role before replacing other replicas.
Prerequisites
- Your deployment uses the IdM internal Certificate Authority (CA).
- At least one CA replica is unharmed.
Procedure
- Promote another CA replica to fulfill the CA renewal server role. See Recovering from losing the CA renewal server.
- Replace all other lost replicas. See Recovering from losing a regular replica.
第4章 Recovering from data loss with VM snapshots リンクのコピーリンクがクリップボードにコピーされました!
If a data loss event occurs, you can restore a Virtual Machine (VM) snapshot of a Certificate Authority (CA) replica to repair the lost data, or deploy a new environment from it.
4.1. Recovering from only a VM snapshot リンクのコピーリンクがクリップボードにコピーされました!
If a disaster affects all IdM servers, and only a snapshot of an IdM CA replica virtual machine (VM) is left, you can recreate your deployment by removing all references to the lost servers and installing new replicas.
Prerequisites
- You have prepared a VM snapshot of a CA replica VM. See Preparing for data loss with VM snapshots.
Procedure
- Boot the desired snapshot of the CA replica VM.
Remove replication agreements to any lost replicas.
[root@server ~]# ipa server-del lost-server1.example.com [root@server ~]# ipa server-del lost-server2.example.com ...- Install a second CA replica. See Installing an IdM replica.
- The VM CA replica is now the CA renewal server. Red Hat recommends promoting another CA replica in the environment to act as the CA renewal server. See Changing and resetting IdM CA renewal server.
- Recreate the desired replica topology by deploying additional replicas with the desired services (CA, DNS). See Installing an IdM replica
- Update DNS to reflect the new replica topology. If IdM DNS is used, DNS service records are updated automatically.
- Verify that IdM clients can reach the IdM servers. See Adjusting IdM Clients during recovery.
Verification
Test the Kerberos server on every replica by successfully retrieving a Kerberos ticket-granting ticket as an IdM user.
[root@server ~]# kinit admin Password for admin@EXAMPLE.COM: [root@server ~]# klist Ticket cache: KCM:0 Default principal: admin@EXAMPLE.COM Valid starting Expires Service principal 10/31/2019 15:51:37 11/01/2019 15:51:02 HTTP/server.example.com@EXAMPLE.COM 10/31/2019 15:51:08 11/01/2019 15:51:02 krbtgt/EXAMPLE.COM@EXAMPLE.COMTest the Directory Server and SSSD configuration on every replica by retrieving user information.
[root@server ~]# ipa user-show admin User login: admin Last name: Administrator Home directory: /home/admin Login shell: /bin/bash Principal alias: admin@EXAMPLE.COM UID: 1965200000 GID: 1965200000 Account disabled: False Password: True Member of groups: admins, trust admins Kerberos keys available: TrueTest the CA server on every CA replica with the
ipa cert-showcommand.[root@server ~]# ipa cert-show 1 Issuing CA: ipa Certificate: MIIEgjCCAuqgAwIBAgIjoSIP... Subject: CN=Certificate Authority,O=EXAMPLE.COM Issuer: CN=Certificate Authority,O=EXAMPLE.COM Not Before: Thu Oct 31 19:43:29 2019 UTC Not After: Mon Oct 31 19:43:29 2039 UTC Serial number: 1 Serial number (hex): 0x1 Revoked: False
4.2. Recovering from a VM snapshot among a partially-working environment リンクのコピーリンクがクリップボードにコピーされました!
If a disaster affects some IdM servers while others are still operating properly, you may want to restore the deployment to the state captured in a Virtual Machine (VM) snapshot. For example, if all Certificate Authority (CA) Replicas are lost while other replicas are still in production, you will need to bring a CA Replica back into the environment.
In this scenario, remove references to the lost replicas, restore the CA replica from the snapshot, verify replication, and deploy new replicas.
Prerequisites
- You have prepared a VM snapshot of a CA replica VM. See Preparing for data loss with VM snapshots.
Procedure
- Remove all replication agreements to the lost servers. See Uninstalling an IdM server.
- Boot the desired snapshot of the CA replica VM.
Remove any replication agreements between the restored server and any lost servers.
[root@restored-CA-replica ~]# ipa server-del lost-server1.example.com [root@restored-CA-replica ~]# ipa server-del lost-server2.example.com ...If the restored server does not have replication agreements to any of the servers still in production, connect the restored server with one of the other servers to update the restored server.
[root@restored-CA-replica ~]# ipa topologysegment-add Suffix name: domain Left node: restored-CA-replica.example.com Right node: server3.example.com Segment name [restored-CA-replica.com-to-server3.example.com]: new_segment --------------------------- Added segment "new_segment" --------------------------- Segment name: new_segment Left node: restored-CA-replica.example.com Right node: server3.example.com Connectivity: both-
Review Directory Server error logs at
/var/log/dirsrv/slapd-YOUR-INSTANCE/errorsto see if the CA replica from the snapshot correctly synchronizes with the remaining IdM servers. If replication on the restored server fails because its database is too outdated, reinitialize the restored server.
[root@restored-CA-replica ~]# ipa-replica-manage re-initialize --from server2.example.com- If the database on the restored server is correctly synchronized, continue by deploying additional replicas with the desired services (CA, DNS) according to Installing an IdM replica.
Verification
Test the Kerberos server on every replica by successfully retrieving a Kerberos ticket-granting ticket as an IdM user.
[root@server ~]# kinit admin Password for admin@EXAMPLE.COM: [root@server ~]# klist Ticket cache: KCM:0 Default principal: admin@EXAMPLE.COM Valid starting Expires Service principal 10/31/2019 15:51:37 11/01/2019 15:51:02 HTTP/server.example.com@EXAMPLE.COM 10/31/2019 15:51:08 11/01/2019 15:51:02 krbtgt/EXAMPLE.COM@EXAMPLE.COMTest the Directory Server and SSSD configuration on every replica by retrieving user information.
[root@server ~]# ipa user-show admin User login: admin Last name: Administrator Home directory: /home/admin Login shell: /bin/bash Principal alias: admin@EXAMPLE.COM UID: 1965200000 GID: 1965200000 Account disabled: False Password: True Member of groups: admins, trust admins Kerberos keys available: TrueTest the CA server on every CA replica with the
ipa cert-showcommand.[root@server ~]# ipa cert-show 1 Issuing CA: ipa Certificate: MIIEgjCCAuqgAwIBAgIjoSIP... Subject: CN=Certificate Authority,O=EXAMPLE.COM Issuer: CN=Certificate Authority,O=EXAMPLE.COM Not Before: Thu Oct 31 19:43:29 2019 UTC Not After: Mon Oct 31 19:43:29 2039 UTC Serial number: 1 Serial number (hex): 0x1 Revoked: False
4.3. Recovering from a VM snapshot to establish a new IdM environment リンクのコピーリンクがクリップボードにコピーされました!
If the Certificate Authority (CA) replica from a restored Virtual Machine (VM) snapshot is unable to replicate with other servers, create a new IdM environment from the VM snapshot.
To establish a new IdM environment, isolate the VM server, create additional replicas from it, and switch IdM clients to the new environment.
Prerequisites
- You have prepared a VM snapshot of a CA replica VM. See Preparing for data loss with VM snapshots.
Procedure
- Boot the desired snapshot of the CA replica VM.
Isolate the restored server from the rest of the current deployment by removing all of its replication topology segments.
First, display all
domainreplication topology segments.[root@restored-CA-replica ~]# ipa topologysegment-find Suffix name: domain ------------------ 8 segments matched ------------------ Segment name: new_segment Left node: restored-CA-replica.example.com Right node: server2.example.com Connectivity: both ... ---------------------------- Number of entries returned 8 ----------------------------Next, delete every
domaintopology segment involving the restored server.[root@restored-CA-replica ~]# ipa topologysegment-del Suffix name: domain Segment name: new_segment ----------------------------- Deleted segment "new_segment" -----------------------------Finally, perform the same actions with any
catopology segments.[root@restored-CA-replica ~]# ipa topologysegment-find Suffix name: ca ------------------ 1 segments matched ------------------ Segment name: ca_segment Left node: restored-CA-replica.example.com Right node: server4.example.com Connectivity: both ---------------------------- Number of entries returned 1 ---------------------------- [root@restored-CA-replica ~]# ipa topologysegment-del Suffix name: ca Segment name: ca_segment ----------------------------- Deleted segment "ca_segment" -----------------------------
- Install a sufficient number of IdM replicas from the restored server to handle the deployment load. There are now two disconnected IdM deployments running in parallel.
- Switch the IdM clients to use the new deployment by hard-coding references to the new IdM replicas. See Adjusting IdM clients during recovery.
- Stop and uninstall IdM servers from the previous deployment. See Uninstalling an IdM server.
Verification
Test the Kerberos server on every new replica by successfully retrieving a Kerberos ticket-granting ticket as an IdM user.
[root@server ~]# kinit admin Password for admin@EXAMPLE.COM: [root@server ~]# klist Ticket cache: KCM:0 Default principal: admin@EXAMPLE.COM Valid starting Expires Service principal 10/31/2019 15:51:37 11/01/2019 15:51:02 HTTP/server.example.com@EXAMPLE.COM 10/31/2019 15:51:08 11/01/2019 15:51:02 krbtgt/EXAMPLE.COM@EXAMPLE.COMTest the Directory Server and SSSD configuration on every new replica by retrieving user information.
[root@server ~]# ipa user-show admin User login: admin Last name: Administrator Home directory: /home/admin Login shell: /bin/bash Principal alias: admin@EXAMPLE.COM UID: 1965200000 GID: 1965200000 Account disabled: False Password: True Member of groups: admins, trust admins Kerberos keys available: TrueTest the CA server on every new CA replica with the
ipa cert-showcommand.[root@server ~]# ipa cert-show 1 Issuing CA: ipa Certificate: MIIEgjCCAuqgAwIBAgIjoSIP... Subject: CN=Certificate Authority,O=EXAMPLE.COM Issuer: CN=Certificate Authority,O=EXAMPLE.COM Not Before: Thu Oct 31 19:43:29 2019 UTC Not After: Mon Oct 31 19:43:29 2039 UTC Serial number: 1 Serial number (hex): 0x1 Revoked: False
第5章 Recovering from data loss with IdM backups リンクのコピーリンクがクリップボードにコピーされました!
You can use the ipa-restore utility or Ansible playbooks to restore an IdM server to a previous state captured in an IdM backup.
For details how to perform IdM backups, see the following chapters in the Identity Management documentation:
第6章 Managing data loss リンクのコピーリンクがクリップボードにコピーされました!
The proper response to a data loss event will depend on the number of replicas that have been affected and the type of lost data.
6.1. Responding to isolated data loss リンクのコピーリンクがクリップボードにコピーされました!
When a data loss event occurs, minimize replicating the data loss by immediately isolating the affected servers. Then create replacement replicas from the unaffected remainder of the environment.
Prerequisites
- A robust IdM replication topology with multiple replicas. See Preparing for server loss with replication.
Procedure
To limit replicating the data loss, disconnect all affected replicas from the rest of the topology by removing their replication topology segments.
Display all
domainreplication topology segments in the deployment.[root@server ~]# ipa topologysegment-find Suffix name: domain ------------------ 8 segments matched ------------------ Segment name: segment1 Left node: server.example.com Right node: server2.example.com Connectivity: both ... ---------------------------- Number of entries returned 8 ----------------------------Delete all
domaintopology segments involving the affected servers.[root@server ~]# ipa topologysegment-del Suffix name: domain Segment name: segment1 ----------------------------- Deleted segment "segment1" -----------------------------Perform the same actions with any
catopology segments involving any affected servers.[root@server ~]# ipa topologysegment-find Suffix name: ca ------------------ 1 segments matched ------------------ Segment name: ca_segment Left node: server.example.com Right node: server2.example.com Connectivity: both ---------------------------- Number of entries returned 1 ---------------------------- [root@server ~]# ipa topologysegment-del Suffix name: ca Segment name: ca_segment ----------------------------- Deleted segment "ca_segment" -----------------------------
- The servers affected by the data loss must be abandoned. To create replacement replicas, see Recovering multiple servers with replication.
6.2. Responding to limited data loss among all servers リンクのコピーリンクがクリップボードにコピーされました!
A data loss event can affect all replicas in the environment, such as replication carrying out an accidental deletion among all servers. If data loss is known and limited, manually re-add lost data.
Prerequisites
- A Virtual Machine (VM) snapshot or IdM backup of an IdM server that contains the lost data.
Procedure
- If you need to review any lost data, restore the VM snapshot or backup to an isolated server on a separate network.
-
Add the missing information to the database using
ipaorldapaddcommands.
6.3. Responding to undefined data loss among all servers リンクのコピーリンクがクリップボードにコピーされました!
If data loss is severe or undefined, deploy a new environment from a Virtual Machine (VM) snapshot of a server.
Prerequisites
- A Virtual Machine (VM) snapshot contains the lost data.
Procedure
- Restore an IdM Certificate Authority (CA) Replica from a VM snapshot to a known good state, and deploy a new IdM environment from it. See Recovering from only a VM snapshot.
-
Add any data created after the snapshot was taken using
ipaorldapaddcommands.
第7章 Adjusting IdM clients during recovery リンクのコピーリンクがクリップボードにコピーされました!
While IdM servers are being restored, you may need to adjust IdM clients to reflect changes in the replica topology.
Procedure
Adjusting DNS configuration:
-
If
/etc/hostscontains any references to IdM servers, ensure that hard-coded IP-to-hostname mappings are valid. -
If IdM clients are using IdM DNS for name resolution, ensure that the
nameserverentries in/etc/resolv.confpoint to working IdM replicas providing DNS services.
-
If
Adjusting Kerberos configuration:
By default, IdM clients look to DNS Service records for Kerberos servers, and will adjust to changes in the replica topology:
[root@client ~]# grep dns_lookup_kdc /etc/krb5.conf dns_lookup_kdc = trueIf IdM clients have been hard-coded to use specific IdM servers in
/etc/krb5.conf:[root@client ~]# grep dns_lookup_kdc /etc/krb5.conf dns_lookup_kdc = falsemake sure
kdc,master_kdcandadmin_serverentries in/etc/krb5.confare pointing to IdM servers that work properly:[realms] EXAMPLE.COM = { kdc = functional-server.example.com:88 master_kdc = functional-server.example.com:88 admin_server = functional-server.example.com:749 default_domain = example.com pkinit_anchors = FILE:/var/lib/ipa-client/pki/kdc-ca-bundle.pem pkinit_pool = FILE:/var/lib/ipa-client/pki/ca-bundle.pem }
Adjusting SSSD configuration:
By default, IdM clients look to DNS Service records for LDAP servers and adjust to changes in the replica topology:
[root@client ~]# grep ipa_server /etc/sssd/sssd.conf ipa_server = _srv_, functional-server.example.comIf IdM clients have been hard-coded to use specific IdM servers in
/etc/sssd/sssd.conf, make sure theipa_serverentry points to IdM servers that are working properly:[root@client ~]# grep ipa_server /etc/sssd/sssd.conf ipa_server = functional-server.example.com
Clearing SSSD’s cached information:
The SSSD cache may contain outdated information pertaining to lost servers. If users experience inconsistent authentication problems, purge the SSSD cache :
[root@client ~]# sss_cache -E
Verification
Verify the Kerberos configuration by retrieving a Kerberos Ticket-Granting-Ticket as an IdM user.
[root@client ~]# kinit admin Password for admin@EXAMPLE.COM: [root@client ~]# klist Ticket cache: KCM:0 Default principal: admin@EXAMPLE.COM Valid starting Expires Service principal 10/31/2019 18:44:58 11/25/2019 18:44:55 krbtgt/EXAMPLE.COM@EXAMPLE.COMVerify the SSSD configuration by retrieving IdM user information.
[root@client ~]# id admin uid=1965200000(admin) gid=1965200000(admins) groups=1965200000(admins)