Dieser Inhalt ist in der von Ihnen ausgewählten Sprache nicht verfügbar.

Chapter 9. Troubleshooting hosted control planes


If you encounter issues with hosted control planes, see the following information to guide you through troubleshooting.

9.1. Gathering information to troubleshoot hosted control planes

When you need to troubleshoot an issue with hosted control plane clusters, you can gather information by running the

must-gather
command. The command generates output for the management cluster and the hosted cluster.

The output for the management cluster contains the following content:

  • Cluster-scoped resources: These resources are node definitions of the management cluster.
  • The hypershift-dump compressed file: This file is useful if you need to share the content with other people.
  • Namespaced resources: These resources include all of the objects from the relevant namespaces, such as config maps, services, events, and logs.
  • Network logs: These logs include the OVN northbound and southbound databases and the status for each one.
  • Hosted clusters: This level of output involves all of the resources inside of the hosted cluster.

The output for the hosted cluster contains the following content:

  • Cluster-scoped resources: These resources include all of the cluster-wide objects, such as nodes and CRDs.
  • Namespaced resources: These resources include all of the objects from the relevant namespaces, such as config maps, services, events, and logs.

Although the output does not contain any secret objects from the cluster, it can contain references to the names of secrets.

Prerequisites

  • You must have
    cluster-admin
    access to the management cluster.
  • You need the
    name
    value for the
    HostedCluster
    resource and the namespace where the CR is deployed.
  • You must have the
    hcp
    command line interface installed. For more information, see Installing the hosted control planes command line interface.
  • You must have the OpenShift CLI (
    oc
    ) installed.
  • You must ensure that the
    kubeconfig
    file is loaded and is pointing to the management cluster.

Procedure

  • To gather the output for troubleshooting, enter the following command:

    $ oc adm must-gather --image=registry.redhat.io/multicluster-engine/must-gather-rhel9:v<mce_version> \
      /usr/bin/gather hosted-cluster-namespace=HOSTEDCLUSTERNAMESPACE hosted-cluster-name=HOSTEDCLUSTERNAME \
      --dest-dir=NAME ; tar -cvzf NAME.tgz NAME

    where:

    • You replace
      <mce_version>
      with the version of multicluster engine Operator that you are using; for example,
      2.4
      .
    • The
      hosted-cluster-namespace=HOSTEDCLUSTERNAMESPACE
      parameter is optional. If you do not include it, the command runs as though the hosted cluster is in the default namespace, which is
      clusters
      .
    • The
      --dest-dir=NAME
      parameter is optional. Specify that parameter if you want to save the results of the command to a compressed file, replacing
      NAME
      with the name of the directory where you want to save the results.

If you are a cluster instance administrator, you can pause the reconciliation of a hosted cluster and hosted control plane. You might want to pause reconciliation when you back up and restore an etcd database or when you need to debug problems with a hosted cluster or hosted control plane.

Procedure

  1. To pause reconciliation for a hosted cluster and hosted control plane, populate the

    pausedUntil
    field of the
    HostedCluster
    resource.

    • To pause the reconciliation until a specific time, enter the following command:

      $ oc patch -n <hosted_cluster_namespace> hostedclusters/<hosted_cluster_name> -p '{"spec":{"pausedUntil":"<timestamp>"}}' --type=merge 
      1
      1
      Specify a timestamp in the RFC339 format, for example, 2024-03-03T03:28:48Z. The reconciliation is paused until the specified time is passed.
    • To pause the reconciliation indefinitely, enter the following command:

      $ oc patch -n <hosted_cluster_namespace> hostedclusters/<hosted_cluster_name> -p '{"spec":{"pausedUntil":"true"}}' --type=merge

      The reconciliation is paused until you remove the field from the

      HostedCluster
      resource.

      When the pause reconciliation field is populated for the

      HostedCluster
      resource, the field is automatically added to the associated
      HostedControlPlane
      resource.

  2. To remove the

    pausedUntil
    field, enter the following patch command:

    $ oc patch -n <hosted_cluster_namespace> hostedclusters/<hosted_cluster_name> -p '{"spec":{"pausedUntil":null}}' --type=merge

9.3. Scaling down the data plane to zero

If you are not using the hosted control plane, to save the resources and cost you can scale down a data plane to zero.

Note

Ensure you are prepared to scale down the data plane to zero. Because the workload from the worker nodes disappears after scaling down.

Procedure

  1. Set the

    kubeconfig
    file to access the hosted cluster by running the following command:

    $ export KUBECONFIG=<install_directory>/auth/kubeconfig
  2. Get the name of the

    NodePool
    resource associated to your hosted cluster by running the following command:

    $ oc get nodepool --namespace <HOSTED_CLUSTER_NAMESPACE>
  3. Optional: To prevent the pods from draining, add the

    nodeDrainTimeout
    field in the
    NodePool
    resource by running the following command:

    $ oc edit nodepool <nodepool_name>  --namespace <hosted_cluster_namespace>

    Example output

    apiVersion: hypershift.openshift.io/v1alpha1
    kind: NodePool
    metadata:
    # ...
      name: nodepool-1
      namespace: clusters
    # ...
    spec:
      arch: amd64
      clusterName: clustername 
    1
    
      management:
        autoRepair: false
        replace:
          rollingUpdate:
            maxSurge: 1
            maxUnavailable: 0
          strategy: RollingUpdate
        upgradeType: Replace
      nodeDrainTimeout: 0s 
    2
    
    # ...

    1
    Defines the name of your hosted cluster.
    2
    Specifies the total amount of time that the controller spends to drain a node. By default, the nodeDrainTimeout: 0s setting blocks the node draining process.
    Note

    To allow the node draining process to continue for a certain period of time, you can set the value of the

    nodeDrainTimeout
    field accordingly, for example,
    nodeDrainTimeout: 1m
    .

  4. Scale down the

    NodePool
    resource associated to your hosted cluster by running the following command:

    $ oc scale nodepool/<NODEPOOL_NAME> --namespace <HOSTED_CLUSTER_NAMESPACE> --replicas=0
    Note

    After scaling down the data plan to zero, some pods in the control plane stay in the

    Pending
    status and the hosted control plane stays up and running. If necessary, you can scale up the
    NodePool
    resource.

  5. Optional: Scale up the

    NodePool
    resource associated to your hosted cluster by running the following command:

    $ oc scale nodepool/<NODEPOOL_NAME> --namespace <HOSTED_CLUSTER_NAMESPACE> --replicas=1

    After rescaling the

    NodePool
    resource, wait for couple of minutes for the
    NodePool
    resource to become available in a
    Ready
    state.

Verification

  • Verify that the value for the

    nodeDrainTimeout
    field is greater than
    0s
    by running the following command:

    $ oc get nodepool -n <hosted_cluster_namespace> <nodepool_name> -ojsonpath='{.spec.nodeDrainTimeout}'
Red Hat logoGithubredditYoutubeTwitter

Lernen

Testen, kaufen und verkaufen

Communitys

Über Red Hat Dokumentation

Wir helfen Red Hat Benutzern, mit unseren Produkten und Diensten innovativ zu sein und ihre Ziele zu erreichen – mit Inhalten, denen sie vertrauen können. Entdecken Sie unsere neuesten Updates.

Mehr Inklusion in Open Source

Red Hat hat sich verpflichtet, problematische Sprache in unserem Code, unserer Dokumentation und unseren Web-Eigenschaften zu ersetzen. Weitere Einzelheiten finden Sie in Red Hat Blog.

Über Red Hat

Wir liefern gehärtete Lösungen, die es Unternehmen leichter machen, plattform- und umgebungsübergreifend zu arbeiten, vom zentralen Rechenzentrum bis zum Netzwerkrand.

Theme

© 2026 Red Hat
Nach oben