OpenShift Container Platform
4.17
Hosted control planes

Chapter 14. Troubleshooting hosted control planes

If you encounter issues with hosted control planes, see the following information to guide you through troubleshooting.

14.1. Gathering information to troubleshoot hosted control planes

When you need to troubleshoot an issue with hosted clusters, you can gather information by running the must-gather command. The command generates output for the management cluster and the hosted cluster.

The output for the management cluster contains the following content:

- Cluster-scoped resources: These resources are node definitions of the management cluster.
- The hypershift-dump compressed file: This file is useful if you need to share the content with other people.
- Namespaced resources: These resources include all of the objects from the relevant namespaces, such as config maps, services, events, and logs.
- Network logs: These logs include the OVN northbound and southbound databases and the status for each one.
- Hosted clusters: This level of output involves all of the resources inside of the hosted cluster.

The output for the hosted cluster contains the following content:

- Cluster-scoped resources: These resources include all of the cluster-wide objects, such as nodes and CRDs.
- Namespaced resources: These resources include all of the objects from the relevant namespaces, such as config maps, services, events, and logs.

Although the output does not contain any secret objects from the cluster, it can contain references to the names of secrets.

Prerequisites

- You must have cluster-admin access to the management cluster.
- You need the name value for the HostedCluster resource and the namespace where the CR is deployed.
- You must have the hcp command-line interface installed. For more information, see "Installing the hosted control planes command-line interface".
- You must have the OpenShift CLI (oc) installed.
- You must ensure that the kubeconfig file is loaded and is pointing to the management cluster.

Procedure

To gather the output for troubleshooting, enter the following command:
```
$ oc adm must-gather \
  --image=registry.redhat.io/multicluster-engine/must-gather-rhel9:v<mce_version> \
  /usr/bin/gather hosted-cluster-namespace=HOSTEDCLUSTERNAMESPACE \
  hosted-cluster-name=HOSTEDCLUSTERNAME \
  --dest-dir=NAME ; tar -cvzf NAME.tgz NAME
```
where:
- You replace <mce_version> with the version of multicluster engine Operator that you are using; for example, 2.6.
- The hosted-cluster-namespace=HOSTEDCLUSTERNAMESPACE parameter is optional. If you do not include it, the command runs as though the hosted cluster is in the default namespace, which is clusters.
- If you want to save the results of the command to a compressed file, specify the --dest-dir=NAME parameter and replace NAME with the name of the directory where you want to save the results.
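
The parameters above can be wrapped in a small shell function so that the documented default namespace (clusters) applies when none is given. This is only a sketch: the function name gather_hosted_cluster, the MCE_VERSION variable, and the default output directory are hypothetical names chosen for illustration, not part of the product.

```shell
#!/bin/sh
# Sketch: wrap the documented must-gather invocation.
# gather_hosted_cluster, MCE_VERSION, and the default directory are
# hypothetical names; only the oc arguments come from the procedure above.
gather_hosted_cluster() (
  name=$1                        # HostedCluster name (required)
  namespace=${2:-clusters}       # falls back to the documented default
  dest=${3:-./hcp-must-gather}   # assumed output directory
  version=${MCE_VERSION:?set MCE_VERSION, for example 2.6}

  oc adm must-gather \
    --image="registry.redhat.io/multicluster-engine/must-gather-rhel9:v${version}" \
    /usr/bin/gather "hosted-cluster-namespace=${namespace}" \
    "hosted-cluster-name=${name}" \
    --dest-dir="${dest}"
)
```

For example, after setting MCE_VERSION=2.6, calling gather_hosted_cluster my-cluster gathers data for a hosted cluster named my-cluster in the clusters namespace. You can then compress the output directory with tar as shown in the procedure.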

14.2. Gathering OpenShift Container Platform data for a hosted cluster

You can gather OpenShift Container Platform debugging information for a hosted cluster by using the multicluster engine Operator web console or by using the CLI.

14.2.1. Gathering data for a hosted cluster by using the CLI

You can gather OpenShift Container Platform debugging information for a hosted cluster by using the CLI.

Prerequisites

- You must have cluster-admin access to the management cluster.
- You need the name value for the HostedCluster resource and the namespace where the CR is deployed.
- You must have the hcp command-line interface installed. For more information, see "Installing the hosted control planes command-line interface".
- You must have the OpenShift CLI (oc) installed.
- You must ensure that the kubeconfig file is loaded and is pointing to the management cluster.

Procedure

Generate the kubeconfig file by entering the following command:

```
$ hcp create kubeconfig --namespace <hosted_cluster_namespace> \
  --name <hosted_cluster_name> > <hosted_cluster_name>.kubeconfig
```
After you save the kubeconfig file, you can access the hosted cluster by entering the following example command:
```
$ oc --kubeconfig <hosted_cluster_name>.kubeconfig get nodes
```
Collect the must-gather information by entering the following command:
```
$ oc adm must-gather
```
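
The three steps of this procedure can be chained in one function. The following is a minimal sketch, assuming that you want must-gather to run against the hosted cluster by passing the generated kubeconfig file explicitly. The function name and the output directory name are hypothetical.

```shell
#!/bin/sh
# Sketch: generate the hosted cluster kubeconfig, check access, then gather.
# gather_from_hosted_cluster and the "<name>-must-gather" directory are
# hypothetical; passing --kubeconfig explicitly is an assumption.
gather_from_hosted_cluster() (
  namespace=$1
  name=$2
  kubeconfig="${name}.kubeconfig"

  # Step 1: write the kubeconfig file for the hosted cluster.
  hcp create kubeconfig --namespace "$namespace" --name "$name" \
    > "$kubeconfig" || exit 1

  # Step 2: confirm that the hosted cluster is reachable.
  oc --kubeconfig "$kubeconfig" get nodes || exit 1

  # Step 3: collect the data into a named directory.
  oc --kubeconfig "$kubeconfig" adm must-gather \
    --dest-dir "${name}-must-gather"
)
```

For example, gather_from_hosted_cluster clusters my-cluster writes my-cluster.kubeconfig to the current directory and stops early if the hosted cluster API cannot be reached.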

14.2.2. Gathering data for a hosted cluster by using the web console

You can gather OpenShift Container Platform debugging information for a hosted cluster by using the multicluster engine Operator web console.

Prerequisites

- You must have cluster-admin access to the management cluster.
- You need the name value for the HostedCluster resource and the namespace where the CR is deployed.
- You must have the hcp command-line interface installed. For more information, see "Installing the hosted control planes command-line interface".
- You must have the OpenShift CLI (oc) installed.
- You must ensure that the kubeconfig file is loaded and is pointing to the management cluster.

Procedure

In the web console, select All Clusters and select the cluster you want to troubleshoot.
In the upper-right corner, select Download kubeconfig.
Export the downloaded kubeconfig file.
Collect the must-gather information by entering the following command:
```
$ oc adm must-gather
```

14.3. Entering the must-gather command in a disconnected environment

Complete the following steps to run the must-gather command in a disconnected environment.

Procedure

In a disconnected environment, mirror the Red Hat operator catalog images into your mirror registry. For more information, see "Install on disconnected networks".

Run the following commands to extract logs. The commands reference the must-gather image from your mirror registry:
```
$ REGISTRY=registry.example.com:5000
$ IMAGE=$REGISTRY/multicluster-engine/must-gather-rhel8@sha256:ff9f37eb400dc1f7d07a9b6f2da9064992934b69847d17f59e385783c071b9d8
$ oc adm must-gather \
  --image=$IMAGE /usr/bin/gather \
  hosted-cluster-namespace=HOSTEDCLUSTERNAMESPACE \
  hosted-cluster-name=HOSTEDCLUSTERNAME \
  --dest-dir=./data
```

14.4. Troubleshooting hosted clusters on OpenShift Virtualization

When you troubleshoot a hosted cluster on OpenShift Virtualization, start with the top-level HostedCluster and NodePool resources and then work down the stack until you find the root cause. The following steps can help you discover the root cause of common issues.

14.4.1. HostedCluster resource is stuck in a partial state

If a hosted control plane is not coming fully online because a HostedCluster resource is pending, identify the problem by checking prerequisites, resource conditions, and node and Operator status.

Procedure

Ensure that you meet all of the prerequisites for a hosted cluster on OpenShift Virtualization.
View the conditions on the HostedCluster and NodePool resources for validation errors that prevent progress.
By using the kubeconfig file of the hosted cluster, inspect the status of the hosted cluster:
- View the output of the oc get clusteroperators command to see which cluster Operators are pending.
- View the output of the oc get nodes command to ensure that worker nodes are ready.
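
To view the conditions on the HostedCluster and NodePool resources in a compact form, you can print one condition per line. The following is a sketch only: the function name show_conditions is hypothetical, and the JSONPath template assumes the usual status.conditions layout with type, status, and message fields.

```shell
#!/bin/sh
# Sketch: print "type, status, message" for each condition of a resource.
# show_conditions is a hypothetical helper name.
show_conditions() (
  kind=$1        # for example: hostedcluster or nodepool
  namespace=$2   # namespace where the resource is deployed
  name=$3        # resource name

  oc get "$kind" -n "$namespace" "$name" \
    -o jsonpath='{range .status.conditions[*]}{.type}{"\t"}{.status}{"\t"}{.message}{"\n"}{end}'
)
```

For example, run show_conditions hostedcluster clusters my-cluster, followed by show_conditions nodepool clusters my-nodepool, against the management cluster and look for conditions whose message reports a validation error.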

14.4.2. No worker nodes are registered

If a hosted control plane is not coming fully online because the hosted control plane has no worker nodes registered, identify the problem by checking the status of various parts of the hosted control plane.

Procedure

View the HostedCluster and NodePool conditions for failures that indicate what the problem might be.
Enter the following command to view the KubeVirt worker node virtual machine (VM) status for the NodePool resource:
```
$ oc get vm -n <namespace>
```
If the VMs are stuck in the provisioning state, enter the following command to view the CDI import pods within the VM namespace for clues about why the importer pods have not completed:
```
$ oc get pods -n <namespace> | grep "import"
```
If the VMs are stuck in the starting state, enter the following command to view the status of the virt-launcher pods:
```
$ oc get pods -n <namespace> -l kubevirt.io=virt-launcher
```
If the virt-launcher pods are in a pending state, investigate why the pods are not being scheduled. For example, not enough resources might exist to run the virt-launcher pods.
If the VMs are running but they are not registered as worker nodes, use the web console to gain VNC access to one of the affected VMs. The VNC output indicates whether the ignition configuration was applied. If a VM cannot access the hosted control plane ignition server on startup, the VM cannot be provisioned correctly.
If the ignition configuration was applied but the VM is still not registering as a node, see "Identifying the problem: Access the VM console logs" to learn how to access the VM console logs during startup.
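
The decision flow in this procedure can be summarized as a lookup from the VM status to the next check. This is an illustrative sketch: the function name is hypothetical, the status strings assume the values that oc get vm prints in its STATUS column, and the fallback oc describe command is an addition that is not part of the procedure.

```shell
#!/bin/sh
# Sketch: map a KubeVirt VM status to the next troubleshooting step.
# next_check_for_vm is a hypothetical helper; it only prints a suggestion.
next_check_for_vm() (
  status=$1      # value from the STATUS column of "oc get vm"
  namespace=$2   # namespace that contains the VMs

  case "$status" in
    Provisioning)
      printf '%s\n' "oc get pods -n $namespace | grep import" ;;
    Starting)
      printf '%s\n' "oc get pods -n $namespace -l kubevirt.io=virt-launcher" ;;
    Running)
      printf '%s\n' "Open the VNC console of the VM and check whether the ignition configuration was applied" ;;
    *)
      # Assumed fallback for any other status.
      printf '%s\n' "oc describe vm -n $namespace" ;;
  esac
)
```

For example, next_check_for_vm Starting clusters-my-cluster prints the command that lists the virt-launcher pods in that namespace.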

14.4.3. Worker nodes are stuck in the NotReady state

During cluster creation, nodes enter the NotReady state temporarily while the networking stack is rolled out. This part of the process is normal. However, if this part of the process takes longer than 15 minutes, identify the problem by investigating the node object and pods.

Procedure

Enter the following command to view the conditions on the node object and determine why the node is not ready:
```
$ oc get nodes -o yaml
```

Enter the following command to look for failing pods within the cluster:

```
$ oc get pods -A --field-selector=status.phase!=Running,status.phase!=Succeeded
```
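
Before you read the full YAML output, it can help to list only the nodes that are not ready. The following filter is a sketch that assumes the default column layout of oc get nodes, where the second column is the node status. The function name is hypothetical.

```shell
#!/bin/sh
# Sketch: print the names of nodes whose STATUS does not start with "Ready".
# Reads the output of "oc get nodes --no-headers" on standard input.
# A cordoned but healthy node ("Ready,SchedulingDisabled") is not reported.
not_ready_nodes() {
  awk '$2 !~ /^Ready/ { print $1 }'
}
```

For example, oc get nodes --no-headers | not_ready_nodes prints one node name per line. You can then inspect a single node with oc get node <node_name> -o yaml.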

14.4.4. Ingress and console cluster operators are not coming online

If a hosted control plane is not coming fully online because the Ingress and console cluster Operators are not online, check the wildcard DNS routes and load balancer.

Procedure

If the cluster uses the default Ingress behavior, enter the following command to ensure that wildcard DNS routes are enabled on the OpenShift Container Platform cluster that the virtual machines (VMs) are hosted on:

```
$ oc patch ingresscontroller -n openshift-ingress-operator \
  default --type=json -p \
  '[{ "op": "add", "path": "/spec/routeAdmission", "value": {"wildcardPolicy": "WildcardsAllowed"}}]'
```

If you use a custom base domain for the hosted control plane, complete the following steps:
- Ensure that the load balancer is targeting the VM pods correctly.
- Ensure that the wildcard DNS entry is targeting the load balancer IP address.

14.4.5. Load balancer services for the hosted cluster are not available

If a hosted control plane is not coming fully online because the load balancer services are not becoming available, check events, details, and the KubeVirt Cloud Controller Manager (KCCM) pod.

Procedure

Look for events and details that are associated with the load balancer service within the hosted cluster.
By default, load balancers for the hosted cluster are handled by the kubevirt-cloud-controller-manager within the hosted control plane namespace. Ensure that the KCCM pod is online and view its logs for errors or warnings. To identify the KCCM pod in the hosted control plane namespace, enter the following command:
```
$ oc get pods -n <hosted_control_plane_namespace> \
  -l app=cloud-controller-manager
```

14.4.6. Hosted cluster PVCs are not available

If a hosted control plane is not coming fully online because the persistent volume claims (PVCs) for a hosted cluster are not available, check the PVC events and details, and component logs.

Procedure

Look for events and details that are associated with the PVC to understand which errors are occurring.
If a PVC is failing to attach to a pod, view the logs for the kubevirt-csi-node daemonset component within the hosted cluster to further investigate the problem. To identify the kubevirt-csi-node pods for each node, enter the following command:
```
$ oc get pods -n openshift-cluster-csi-drivers -o wide \
  -l app=kubevirt-csi-driver
```
If a PVC cannot bind to a persistent volume (PV), view the logs of the kubevirt-csi-controller component within the hosted control plane namespace. To identify the kubevirt-csi-controller pod within the hosted control plane namespace, enter the following command:
```
$ oc get pods -n <hosted_control_plane_namespace> -l app=kubevirt-csi-driver
```

14.4.7. VM nodes are not correctly joining the cluster

If a hosted control plane is not coming fully online because the VM nodes are not correctly joining the cluster, access the VM console logs.

Procedure

To access the VM console logs, complete the steps in "How to get serial console logs for VMs part of OpenShift Virtualization Hosted Control Plane clusters".

14.4.8. RHCOS image mirroring fails

For hosted control planes on OpenShift Virtualization in a disconnected environment, oc-mirror fails to automatically mirror the Red Hat Enterprise Linux CoreOS (RHCOS) image to the internal registry. When you create your first hosted cluster, the KubeVirt virtual machine does not boot because the boot image is not available in the internal registry.

To resolve this issue, manually mirror the RHCOS image to the internal registry.

Procedure

Get the internal registry name by running the following command:

```
$ oc get imagecontentsourcepolicy -o json \
  | jq -r '.items[].spec.repositoryDigestMirrors[0].mirrors[0]'
```

Get a payload image by running the following command:

```
$ oc get clusterversion version -ojsonpath='{.status.desired.image}'
```

Extract the 0000_50_installer_coreos-bootimages.yaml file that contains boot images from your payload image on the hosted cluster. Replace <payload_image> with the name of your payload image. Run the following command:
```
$ oc image extract \
  --file /release-manifests/0000_50_installer_coreos-bootimages.yaml \
  <payload_image> --confirm
```

Get the RHCOS image by running the following command:

```
$ cat 0000_50_installer_coreos-bootimages.yaml | yq -r .data.stream \
  | jq -r '.architectures.x86_64.images.kubevirt."digest-ref"'
```

Mirror the RHCOS image to your internal registry. Replace <rhcos_image> with your RHCOS image; for example, quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:d9643ead36b1c026be664c9c65c11433c6cdf71bfd93ba229141d134a4a6dd94. Replace <internal_registry> with the name of your internal registry; for example, virthost.ostest.test.metalkube.org:5000/localimages/ocp-v4.0-art-dev. Run the following command:
```
$ oc image mirror <rhcos_image> <internal_registry>
```

Create a YAML file named rhcos-boot-kubevirt.yaml that defines the ImageDigestMirrorSet object. See the following example configuration:

```
apiVersion: config.openshift.io/v1
kind: ImageDigestMirrorSet
metadata:
  name: rhcos-boot-kubevirt
spec:
  imageDigestMirrors:
    - mirrors:
        - virthost.ostest.test.metalkube.org:5000/localimages/ocp-v4.0-art-dev
      source: quay.io/openshift-release-dev/ocp-v4.0-art-dev
```

- mirrors: Specify the name of your internal registry, for example, virthost.ostest.test.metalkube.org:5000/localimages/ocp-v4.0-art-dev.
- source: Specify your RHCOS image without its digest, for example, quay.io/openshift-release-dev/ocp-v4.0-art-dev.

Apply the rhcos-boot-kubevirt.yaml file to create the ImageDigestMirrorSet object by running the following command:
```
$ oc apply -f rhcos-boot-kubevirt.yaml
```
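
The source value in the ImageDigestMirrorSet object is the RHCOS image reference without its digest. If you script this procedure, you can derive that value from the digest-ref string with shell parameter expansion. The helper name below is hypothetical.

```shell
#!/bin/sh
# Sketch: strip the "@sha256:..." suffix from an image reference.
# repository_of is a hypothetical helper name.
repository_of() {
  # ${1%@*} removes the shortest suffix that starts with "@".
  printf '%s\n' "${1%@*}"
}
```

For example, repository_of quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:d9643ead36b1c026be664c9c65c11433c6cdf71bfd93ba229141d134a4a6dd94 prints quay.io/openshift-release-dev/ocp-v4.0-art-dev, which is the value to use for the source field.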

14.4.9. Return non-bare-metal clusters to the late binding pool

If you use late binding managed clusters without BareMetalHost resources, removing the cluster information does not automatically return all nodes to the Discovery ISO. You must complete additional manual steps to delete a late binding cluster and return its nodes to the Discovery ISO.

Procedure

To unbind the non-bare-metal nodes with late binding, complete the following steps:

Remove the cluster information. For more information, see Removing a cluster from management.
Clean the root disks.
Reboot manually with the Discovery ISO.

14.5. Troubleshooting hosted clusters on bare metal

The following information applies to troubleshooting hosted control planes on bare metal.

14.5.1. Nodes fail to be added to hosted control planes on bare metal

When you scale up a hosted control planes cluster with nodes that were provisioned by using Assisted Installer, the host fails to pull the ignition with a URL that contains port 22642. That URL is invalid for hosted control planes and indicates that an issue exists with the cluster.

Procedure

To determine the issue, review the assisted-service logs:
```
$ oc logs -n multicluster-engine <assisted_service_pod_name>
```
Replace <assisted_service_pod_name> with the Assisted Service pod name.

In the logs, find errors that resemble these examples:

```
error="failed to get pull secret for update: invalid pull secret data in secret pull-secret"
```
```
pull secret must contain auth for \"registry.redhat.io\"
```

To fix this issue, see "Add the pull secret to the namespace" in the multicluster engine for Kubernetes Operator documentation.
Note
To use hosted control planes, you must have multicluster engine Operator installed, either as a standalone operator or as part of Red Hat Advanced Cluster Management. Because the operator has a close association with Red Hat Advanced Cluster Management, the documentation for the operator is published within that product’s documentation. Even if you do not use Red Hat Advanced Cluster Management, the parts of its documentation that cover multicluster engine Operator are relevant to hosted control planes.

14.6. Restarting hosted control plane components

If you are an administrator for hosted control planes, you can use the hypershift.openshift.io/restart-date annotation to restart all control plane components for a particular HostedCluster resource. For example, you might need to restart control plane components for certificate rotation.

Procedure

To restart a control plane, annotate the HostedCluster resource by entering the following command:
```
$ oc annotate hostedcluster \
  -n <hosted_cluster_namespace> \
  <hosted_cluster_name> \
  hypershift.openshift.io/restart-date=$(date --iso-8601=seconds)
```
The control plane is restarted whenever the value of the annotation changes. The date command serves as the source of a unique string. The annotation is treated as a string, not a timestamp.
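
If the HostedCluster resource already carries the restart-date annotation from an earlier restart, oc annotate refuses to change the existing value unless you pass the --overwrite flag. The following sketch wraps the command for repeated use. The function name is hypothetical, and it uses a portable date format string in place of the GNU-specific --iso-8601 option, which is acceptable because the annotation is treated as an opaque string.

```shell
#!/bin/sh
# Sketch: trigger a control plane restart, even if the annotation exists.
# restart_control_plane is a hypothetical helper name.
restart_control_plane() (
  namespace=$1   # namespace of the HostedCluster resource
  name=$2        # name of the HostedCluster resource

  # Any value that differs from the previous one triggers the restart;
  # a UTC timestamp is a convenient source of unique strings.
  oc annotate hostedcluster -n "$namespace" "$name" --overwrite \
    "hypershift.openshift.io/restart-date=$(date -u +%Y-%m-%dT%H:%M:%SZ)"
)
```

For example, restart_control_plane clusters my-cluster can be run more than once against the same hosted cluster.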

Verification

After you restart a control plane, the following hosted control planes components are typically restarted:

Note

You might see some additional components restarting as a side effect of changes implemented by the other components.

catalog-operator
certified-operators-catalog
cluster-api
cluster-autoscaler
cluster-policy-controller
cluster-version-operator
community-operators-catalog
control-plane-operator
hosted-cluster-config-operator
ignition-server
ingress-operator
konnectivity-agent
konnectivity-server
kube-apiserver
kube-controller-manager
kube-scheduler
machine-approver
oauth-openshift
olm-operator
openshift-apiserver
openshift-controller-manager
openshift-oauth-apiserver
packageserver
redhat-marketplace-catalog
redhat-operators-catalog

14.7. Pausing the reconciliation of a hosted cluster and hosted control plane

If you are a cluster instance administrator, you can pause the reconciliation of a hosted cluster and hosted control plane. You might want to pause reconciliation when you back up and restore an etcd database or when you need to debug problems with a hosted cluster or hosted control plane.

Procedure

To pause reconciliation for a hosted cluster and hosted control plane, populate the pausedUntil field of the HostedCluster resource.
- To pause the reconciliation until a specific time, enter the following command:
  ```
  $ oc patch -n <hosted_cluster_namespace> \
    hostedclusters/<hosted_cluster_name> \
    -p '{"spec":{"pausedUntil":"<timestamp>"}}' \
    --type=merge
  ```
  Replace <timestamp> with a timestamp in the RFC 3339 format, for example, 2024-03-03T03:28:48Z. The reconciliation is paused until the specified time passes.
- To pause the reconciliation indefinitely, enter the following command:
  ```
  $ oc patch -n <hosted_cluster_namespace> \
    hostedclusters/<hosted_cluster_name> \
    -p '{"spec":{"pausedUntil":"true"}}' \
    --type=merge
  ```
  The reconciliation is paused until you remove the field from the HostedCluster resource.

When the pausedUntil field is populated for the HostedCluster resource, the field is automatically added to the associated HostedControlPlane resource.

To remove the pausedUntil field, enter the following patch command:

```
$ oc patch -n <hosted_cluster_namespace> \
  hostedclusters/<hosted_cluster_name> \
  -p '{"spec":{"pausedUntil":null}}' \
  --type=merge
```
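
When you pause reconciliation from a script, it can be easier to build the merge patch in one place. The following is a sketch with hypothetical function names. It accepts either an RFC 3339 timestamp or the string true, matching the two forms of the pausedUntil field described in this section.

```shell
#!/bin/sh
# Sketch: build and apply the pausedUntil merge patch.
# paused_until_patch and pause_reconciliation are hypothetical helper names.

# Print the JSON merge patch for the given pausedUntil value.
paused_until_patch() {
  printf '{"spec":{"pausedUntil":"%s"}}\n' "$1"
}

# Apply the patch to a HostedCluster resource.
pause_reconciliation() (
  namespace=$1   # namespace of the HostedCluster resource
  name=$2        # name of the HostedCluster resource
  value=$3       # RFC 3339 timestamp, or "true" to pause indefinitely

  oc patch -n "$namespace" "hostedclusters/$name" \
    -p "$(paused_until_patch "$value")" --type=merge
)
```

For example, pause_reconciliation clusters my-cluster 2024-03-03T03:28:48Z pauses reconciliation until that time, and pause_reconciliation clusters my-cluster true pauses it until you remove the field.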

14.8. Scaling down the data plane to zero

If you are not using the hosted control plane, you can scale down the data plane to zero to save resources and cost.

Note

Ensure that you are prepared to scale down the data plane to zero, because the workloads on the worker nodes disappear after scaling down.

Procedure

Set the kubeconfig file to access the hosted cluster by running the following command:
```
$ export KUBECONFIG=<install_directory>/auth/kubeconfig
```
Get the name of the NodePool resource associated with your hosted cluster by running the following command:
```
$ oc get nodepool --namespace <hosted_cluster_namespace>
```

Optional: To prevent the pods from draining, add the nodeDrainTimeout field in the NodePool resource by running the following command:

```
$ oc edit nodepool <nodepool_name> --namespace <hosted_cluster_namespace>
```

Example output

```
apiVersion: hypershift.openshift.io/v1alpha1
kind: NodePool
metadata:
# ...
  name: nodepool-1
  namespace: clusters
# ...
spec:
  arch: amd64
  clusterName: clustername
  management:
    autoRepair: false
    replace:
      rollingUpdate:
        maxSurge: 1
        maxUnavailable: 0
      strategy: RollingUpdate
    upgradeType: Replace
  nodeDrainTimeout: 0s
# ...
```

- clusterName: Defines the name of your hosted cluster.
- nodeDrainTimeout: Specifies the total amount of time that the controller spends to drain a node. By default, the nodeDrainTimeout: 0s setting blocks the node draining process.

Note

To allow the node draining process to continue for a certain period of time, you can set the value of the nodeDrainTimeout field accordingly, for example, nodeDrainTimeout: 1m.

Scale down the NodePool resource associated with your hosted cluster by running the following command:
```
$ oc scale nodepool/<nodepool_name> --namespace <hosted_cluster_namespace> \
  --replicas=0
```
Note
After scaling down the data plane to zero, some pods in the control plane stay in the Pending status and the hosted control plane stays up and running. If necessary, you can scale up the NodePool resource.
Optional: Scale up the NodePool resource associated with your hosted cluster by running the following command:
```
$ oc scale nodepool/<nodepool_name> --namespace <hosted_cluster_namespace> --replicas=1
```
After rescaling the NodePool resource, wait a couple of minutes for the NodePool resource to become available in a Ready state.

Verification

Verify that the value for the nodeDrainTimeout field is greater than 0s by running the following command:

```
$ oc get nodepool -n <hosted_cluster_namespace> <nodepool_name> -ojsonpath='{.spec.nodeDrainTimeout}'
```
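
After you scale a NodePool resource up or down, you might want to wait until the change takes effect. The following polling loop is a sketch under stated assumptions: it assumes that the NodePool resource reports its current node count in the status.replicas field and that an empty value means zero. The function name, the number of attempts, and the delay are hypothetical defaults.

```shell
#!/bin/sh
# Sketch: poll a NodePool until it reports the wanted number of replicas.
# wait_for_replicas is a hypothetical helper; .status.replicas is assumed
# to hold the current node count, and an empty value is treated as 0.
wait_for_replicas() (
  namespace=$1         # namespace of the NodePool resource
  nodepool=$2          # name of the NodePool resource
  want=$3              # wanted replica count, for example 0
  attempts=${4:-30}    # assumed default: 30 polls
  delay=${5:-10}       # assumed default: 10 seconds between polls

  i=0
  while [ "$i" -lt "$attempts" ]; do
    have=$(oc get nodepool -n "$namespace" "$nodepool" \
      -o jsonpath='{.status.replicas}') || exit 1
    [ "${have:-0}" = "$want" ] && exit 0
    i=$((i + 1))
    sleep "$delay"
  done
  exit 1   # the wanted count was not reached in time
)
```

For example, wait_for_replicas clusters my-nodepool 0 returns successfully once the NodePool resource reports no remaining nodes, or fails after about five minutes with the default settings.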
