5.5. 对初始化集群失败进行故障排除


安装程序使用 Cluster Version Operator 创建 OpenShift Container Platform 集群的所有组件。当安装程序无法初始化集群时,您可以从 ClusterVersionClusterOperator 对象检索最重要的信息。

流程

  1. 运行以下命令检查 ClusterVersion 对象:

    $ oc --kubeconfig=${INSTALL_DIR}/auth/kubeconfig get clusterversion -o yaml
    Copy to Clipboard Toggle word wrap

    输出示例

    apiVersion: config.openshift.io/v1
    kind: ClusterVersion
    metadata:
      creationTimestamp: 2019-02-27T22:24:21Z
      generation: 1
      name: version
      resourceVersion: "19927"
      selfLink: /apis/config.openshift.io/v1/clusterversions/version
      uid: 6e0f4cf8-3ade-11e9-9034-0a923b47ded4
    spec:
      channel: stable-4.1
      clusterID: 5ec312f9-f729-429d-a454-61d4906896ca
    status:
      availableUpdates: null
      conditions:
      - lastTransitionTime: 2019-02-27T22:50:30Z
        message: Done applying 4.1.1
        status: "True"
        type: Available
      - lastTransitionTime: 2019-02-27T22:50:30Z
        status: "False"
        type: Failing
      - lastTransitionTime: 2019-02-27T22:50:30Z
        message: Cluster version is 4.1.1
        status: "False"
        type: Progressing
      - lastTransitionTime: 2019-02-27T22:24:31Z
        message: 'Unable to retrieve available updates: unknown version 4.1.1
        reason: RemoteFailed
        status: "False"
        type: RetrievedUpdates
      desired:
        image: registry.svc.ci.openshift.org/openshift/origin-release@sha256:91e6f754975963e7db1a9958075eb609ad226968623939d262d1cf45e9dbc39a
        version: 4.1.1
      history:
      - completionTime: 2019-02-27T22:50:30Z
        image: registry.svc.ci.openshift.org/openshift/origin-release@sha256:91e6f754975963e7db1a9958075eb609ad226968623939d262d1cf45e9dbc39a
        startedTime: 2019-02-27T22:24:31Z
        state: Completed
        version: 4.1.1
      observedGeneration: 1
      versionHash: Wa7as_ik1qE=
    Copy to Clipboard Toggle word wrap

  2. 运行以下命令来查看条件:

    $ oc --kubeconfig=${INSTALL_DIR}/auth/kubeconfig get clusterversion version \
         -o=jsonpath='{range .status.conditions[*]}{.type}{" "}{.status}{" "}{.message}{"\n"}{end}'
    Copy to Clipboard Toggle word wrap

    某些最重要的条件包括 FailingAvailableProgressing

    输出示例

    Available True Done applying 4.1.1
    Failing False
    Progressing False Cluster version is 4.0.0-0.alpha-2019-02-26-194020
    RetrievedUpdates False Unable to retrieve available updates: unknown version 4.1.1
    Copy to Clipboard Toggle word wrap

  3. 运行以下命令检查 ClusterOperator 对象:

    $ oc --kubeconfig=${INSTALL_DIR}/auth/kubeconfig get clusteroperator
    Copy to Clipboard Toggle word wrap

    该命令返回集群 Operator 的状态。

    输出示例

    NAME                                  VERSION   AVAILABLE   PROGRESSING   FAILING   SINCE
    cluster-baremetal-operator                      True        False         False     17m
    cluster-autoscaler                              True        False         False     17m
    cluster-storage-operator                        True        False         False     10m
    console                                         True        False         False     7m21s
    dns                                             True        False         False     31m
    image-registry                                  True        False         False     9m58s
    ingress                                         True        False         False     10m
    kube-apiserver                                  True        False         False     28m
    kube-controller-manager                         True        False         False     21m
    kube-scheduler                                  True        False         False     25m
    machine-api                                     True        False         False     17m
    machine-config                                  True        False         False     17m
    marketplace-operator                            True        False         False     10m
    monitoring                                      True        False         False     8m23s
    network                                         True        False         False     13m
    node-tuning                                     True        False         False     11m
    openshift-apiserver                             True        False         False     15m
    openshift-authentication                        True        False         False     20m
    openshift-cloud-credential-operator             True        False         False     18m
    openshift-controller-manager                    True        False         False     10m
    openshift-samples                               True        False         False     8m42s
    operator-lifecycle-manager                      True        False         False     17m
    service-ca                                      True        False         False     30m
    Copy to Clipboard Toggle word wrap

  4. 运行以下命令检查单个集群 Operator:

    $ oc --kubeconfig=${INSTALL_DIR}/auth/kubeconfig get clusteroperator <operator> -oyaml 
    1
    Copy to Clipboard Toggle word wrap
    1
    <operator> 替换为集群 Operator 的名称。此命令可用于识别集群 Operator 没有达到 Available 状态的原因或处于 Failed 状态。

    输出示例

    apiVersion: config.openshift.io/v1
    kind: ClusterOperator
    metadata:
      creationTimestamp: 2019-02-27T22:47:04Z
      generation: 1
      name: monitoring
      resourceVersion: "24677"
      selfLink: /apis/config.openshift.io/v1/clusteroperators/monitoring
      uid: 9a6a5ef9-3ae1-11e9-bad4-0a97b6ba9358
    spec: {}
    status:
      conditions:
      - lastTransitionTime: 2019-02-27T22:49:10Z
        message: Successfully rolled out the stack.
        status: "True"
        type: Available
      - lastTransitionTime: 2019-02-27T22:49:10Z
        status: "False"
        type: Progressing
      - lastTransitionTime: 2019-02-27T22:49:10Z
        status: "False"
        type: Failing
      extension: null
      relatedObjects: null
      version: ""
    Copy to Clipboard Toggle word wrap

  5. 要获取集群 Operator 的状态条件,请运行以下命令:

    $ oc --kubeconfig=${INSTALL_DIR}/auth/kubeconfig get clusteroperator <operator> \
         -o=jsonpath='{range .status.conditions[*]}{.type}{" "}{.status}{" "}{.message}{"\n"}{end}'
    Copy to Clipboard Toggle word wrap

    <operator> 替换为以上其中一个 Operator 的名称。

    输出示例

    Available True Successfully rolled out the stack
    Progressing False
    Failing False
    Copy to Clipboard Toggle word wrap

  6. 要检索集群 Operator 拥有的对象列表,请执行以下命令:

    oc --kubeconfig=${INSTALL_DIR}/auth/kubeconfig get clusteroperator kube-apiserver \
       -o=jsonpath='{.status.relatedObjects}'
    Copy to Clipboard Toggle word wrap

    输出示例

    [map[resource:kubeapiservers group:operator.openshift.io name:cluster] map[group: name:openshift-config resource:namespaces] map[group: name:openshift-config-managed resource:namespaces] map[group: name:openshift-kube-apiserver-operator resource:namespaces] map[group: name:openshift-kube-apiserver resource:namespaces]]
    Copy to Clipboard Toggle word wrap

返回顶部
Red Hat logoGithubredditYoutubeTwitter

学习

尝试、购买和销售

社区

关于红帽文档

通过我们的产品和服务,以及可以信赖的内容,帮助红帽用户创新并实现他们的目标。 了解我们当前的更新.

让开源更具包容性

红帽致力于替换我们的代码、文档和 Web 属性中存在问题的语言。欲了解更多详情,请参阅红帽博客.

關於紅帽

我们提供强化的解决方案,使企业能够更轻松地跨平台和环境(从核心数据中心到网络边缘)工作。

Theme

© 2025 Red Hat