2.10.3. Recovering Failed Node Hosts


Important

This section presumes you have backed up the /var/lib/openshift directory. See Section 2.10.2, “Backing Up Node Host Files” for more information.
A failed node host can be recovered if its /var/lib/openshift gear directory was fault tolerant and can be restored. SELinux contexts must be preserved along with the gear directory for recovery to succeed. This scenario rarely occurs, especially when node hosts are virtual machines in a fault-tolerant infrastructure rather than physical machines. Scaled applications cannot be recovered onto a node host whose IP address differs from that of the original node host.
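
Because recovery depends on the replacement host presenting the same host name and IP address, it can help to check that identity before starting. The following is a minimal sketch and not part of the OpenShift tooling: the function names lookup_ipv4 and check_node_identity are hypothetical, and it assumes the getent utility is available on the host.

```shell
#!/bin/sh
# Hypothetical pre-flight check: does the node's host name resolve to the
# IP address that the failed node host used?

# Resolve a host name to its first IPv4 address via the system resolver.
# This function can be overridden (for example in a test) to avoid a real lookup.
lookup_ipv4() {
    getent ahostsv4 "$1" | awk 'NR == 1 { print $1 }'
}

# Usage: check_node_identity HOSTNAME EXPECTED_IP
# Prints a one-line verdict; returns 0 on a match and 1 otherwise.
check_node_identity() {
    name=$1
    expected=$2
    actual=$(lookup_ipv4 "$name")
    if [ -z "$actual" ]; then
        echo "UNRESOLVED: $name"
        return 1
    fi
    if [ "$actual" = "$expected" ]; then
        echo "OK: $name -> $actual"
        return 0
    fi
    echo "MISMATCH: $name -> $actual (expected $expected)"
    return 1
}
```

For example, check_node_identity node1.example.com 192.0.2.10 (both values are placeholders) reports whether the A record already points at the address the failed node host used.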

Procedure 2.7. To Recover a Failed Node Host:

  1. Create a node host with the same host name and IP address as the one that failed.
    1. The host name DNS A record can be adjusted if the IP address must be different. However, note that the application CNAME and database records all point to the host name and cannot be easily changed.
    2. Ensure the ruby193-mcollective service is not running on the new node host:
      # service ruby193-mcollective stop
    3. Copy all the configuration files in the /etc/openshift directory from the failed node host to the new node host and ensure that the gear profile is the same.
  2. Attach and mount the backup to /var/lib/openshift, ensuring the usrquota mount option is used:
    # echo "/dev/path/to/backup/partition /var/lib/openshift/ ext4 defaults,usrquota 0 0" >> /etc/fstab
    # mount -a
  3. Reinstate quotas on the /var/lib/openshift directory:
    # quotacheck -cmug /var/lib/openshift
    # restorecon /var/lib/openshift/aquota.user
    # quotaon /var/lib/openshift
  4. Run the oo-admin-regenerate-gear-metadata tool, available starting in OpenShift Enterprise 2.1.6, on the new node host to replace and recover the failed gear data. The tool browses each existing gear on the gear data volume, checks that it has the correct entries in certain files, and performs any necessary fixes:
    # oo-admin-regenerate-gear-metadata
    
    This script attempts to regenerate gear entries for:
      *  /etc/passwd
      *  /etc/shadow
      *  /etc/group
      *  /etc/cgrules.conf
      *  /etc/cgconfig.conf
      *  /etc/security/limits.d
    
    Proceed? [yes/NO]: yes
    The oo-admin-regenerate-gear-metadata tool makes changes only if it finds missing entries. Note that this tool can be added to a node host deployment script.
    Alternatively, if you are using OpenShift Enterprise 2.1.5 or earlier, replace the /etc/passwd file on the new node host with the content from the original, failed node host. If this backup file was lost, see Section 2.10.4, “Recreating /etc/passwd Entries” for instructions on recreating the /etc/passwd file.
  5. When the oo-admin-regenerate-gear-metadata tool completes, it runs the oo-accept-node command and reports the output:
    Running oo-accept-node to check node consistency...
    ...
    FAIL: user 54fe156faf1c09b9a900006f does not have quotas imposed. This can be addressed by running: oo-devel-node set-quota --with-container-uuid 54fe156faf1c09b9a900006f --blocks 2097152 --inodes 80000
    If there are any quota errors, run the suggested quota command, then run the oo-accept-node command again to ensure the problem has been resolved:
    # oo-devel-node set-quota --with-container-uuid 54fe156faf1c09b9a900006f --blocks 2097152 --inodes 80000
    # oo-accept-node
  6. Reboot the new node host to activate all changes, start the gears, and allow MCollective and other services to run.
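
Two parts of the procedure above lend themselves to small helper functions when it is scripted. The following is a sketch under stated assumptions, not an official tool: the names add_fstab_entry and extract_quota_fixes are hypothetical. The first appends the fstab entry from step 2 only if the mount point is not already listed, so rerunning a recovery script does not duplicate the line. The second extracts the suggested set-quota commands from oo-accept-node output, assuming the FAIL lines have the same wording as the sample in step 5.

```shell
#!/bin/sh
# Hypothetical helpers for scripting parts of the recovery procedure.

# Append an ext4 entry with the usrquota option to an fstab file, unless an
# uncommented entry for the same mount point already exists.
# Usage: add_fstab_entry DEVICE MOUNTPOINT [FSTAB_FILE]
# The mount point is compared literally, so use the same spelling
# (with or without a trailing slash) on every run.
add_fstab_entry() {
    device=$1
    mountpoint=$2
    fstab=${3:-/etc/fstab}
    if awk -v mp="$mountpoint" \
        '$1 !~ /^#/ && $2 == mp { found = 1 } END { exit !found }' \
        "$fstab" 2>/dev/null; then
        echo "already present: $mountpoint"
        return 0
    fi
    echo "$device $mountpoint ext4 defaults,usrquota 0 0" >> "$fstab"
}

# Read an oo-accept-node report on standard input and print each suggested
# "oo-devel-node set-quota ..." command on its own line.
# Assumes FAIL lines end with "This can be addressed by running: <command>".
extract_quota_fixes() {
    sed -n 's/^[[:space:]]*FAIL: .* This can be addressed by running: \(oo-devel-node set-quota .*\)$/\1/p'
}
```

A possible use is oo-accept-node 2>&1 | extract_quota_fixes, which prints the suggested commands so they can be reviewed before being run by hand. Printing rather than executing is a deliberate choice here: the suggested block and inode limits should be checked against the gear profile first.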