1.28. Troubleshooting metrics-collector
When the observability-client-ca-certificate secret is not refreshed in the managed cluster, you might receive an internal server error.
1.28.1. Symptom: metrics-collector cannot verify observability-client-ca-certificate
Metrics from a managed cluster might be unavailable. If this is the case, you might receive the following error from the metrics-collector deployment:
error: response status code is 500 Internal Server Error, response body is x509: certificate signed by unknown authority (possibly because of "crypto/rsa: verification error" while trying to verify candidate authority certificate "observability-client-ca-certificate")
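The symptom can be checked from the command line. The following is a minimal sketch rather than an official procedure: `has_ca_error` is a hypothetical helper, and the deployment name `metrics-collector-deployment` in the commented usage is an assumption that you should confirm against your managed cluster.

```shell
#!/bin/sh
# has_ca_error (hypothetical helper): reads log text on stdin and succeeds
# when a line contains both the x509 "unknown authority" error and the
# observability-client-ca-certificate name.
has_ca_error() {
  grep -q 'x509: certificate signed by unknown authority.*observability-client-ca-certificate'
}

# Example use on the managed cluster. The deployment name below is an
# assumption; list the deployments in the namespace first if unsure:
#   oc get deployments -n open-cluster-management-addon-observability
#   oc logs deployment/metrics-collector-deployment \
#     -n open-cluster-management-addon-observability | has_ca_error \
#     && echo "CA verification error found"
```

The helper only inspects text, so it can also be run against a saved log file by redirecting the file to its standard input.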
1.28.2. Resolving the problem: metrics-collector cannot verify observability-client-ca-certificate
If you have this problem, complete the following steps:
- Log in to the managed cluster.
- Delete the secret named observability-controller-open-cluster-management.io-observability-signer-client-cert, which is in the open-cluster-management-addon-observability namespace. Run the following command:
oc delete secret observability-controller-open-cluster-management.io-observability-signer-client-cert -n open-cluster-management-addon-observability
Note: The observability-controller-open-cluster-management.io-observability-signer-client-cert secret is automatically recreated with new certificates.
The metrics-collector deployment is recreated and the observability-controller-open-cluster-management.io-observability-signer-client-cert secret is updated.
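To confirm that the secret has come back after deletion, a short polling loop can help. This is a sketch under assumptions: `wait_for_secret` is a hypothetical helper, the one-second polling interval and default of 30 attempts are arbitrary, and `oc` is assumed to be logged in to the managed cluster.

```shell
#!/bin/sh
# Namespace and secret name as used in the procedure above.
NS=open-cluster-management-addon-observability
SECRET=observability-controller-open-cluster-management.io-observability-signer-client-cert

# wait_for_secret (hypothetical helper): poll until the secret exists again,
# or give up after the given number of attempts (default 30, one second apart).
wait_for_secret() {
  attempts=${1:-30}
  while [ "$attempts" -gt 0 ]; do
    if oc get secret "$SECRET" -n "$NS" >/dev/null 2>&1; then
      return 0
    fi
    attempts=$((attempts - 1))
    sleep 1
  done
  return 1
}

# Example use after deleting the secret:
#   wait_for_secret 60 && oc get pods -n "$NS"
```

Once the helper returns successfully, listing the pods in the namespace shows whether the metrics-collector pods have restarted.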