1.21. Troubleshooting standalone subscription memory
The multicluster-operators-standalone-subscription pod restarts regularly because of a memory issue.
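1.21.1. Symptom: Standalone subscription memory
When Operator Lifecycle Manager (OLM) deploys all operators, and not only the multicluster-subscription-operator, the multicluster-operators-standalone-subscription pod restarts because not enough memory is allocated to the standalone subscription container.
The memory limit of the multicluster-operators-standalone-subscription pod was increased to 2 GB in the multicluster subscription community operator CSV, but Operator Lifecycle Manager ignores this resource limit setting.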
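One way to confirm that the restarts are memory-related is to inspect why the pod's containers last terminated. The following is a minimal sketch and not part of the original procedure: the helper name `last_termination_reason` is hypothetical, and the pod name must be replaced with the one from your cluster. A value of `OOMKilled` indicates that a container exceeded its memory limit.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print the reason the containers of a pod last terminated.
# Usage: last_termination_reason <namespace> <pod-name>
last_termination_reason() {
  local namespace="$1" pod="$2"
  # Reads the last terminated state recorded for each container in the pod status.
  oc get pod "$pod" -n "$namespace" \
    -o jsonpath='{.status.containerStatuses[*].lastState.terminated.reason}'
}

# Example call (the pod name suffix differs per cluster):
# last_termination_reason open-cluster-management <standalone-subscription-pod-name>
```

The output is empty if no container in the pod has been terminated yet.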
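1.21.2. Resolving the problem: Standalone subscription memory
After installation, find the operator subscription CR that subscribes to the multicluster subscription community operator. Run the following command: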
% oc get sub -n open-cluster-management acm-operator-subscription
Edit the operator subscription custom resource and add the spec.config.resources section to the .yaml file to define the resource limits.

Note: Do not create a new operator subscription custom resource that subscribes to the same multicluster subscription community operator. Because both operator subscriptions would be linked to one operator, the operator pod is "killed" and restarted by the two operator subscription custom resources.

See the following updated .yaml file example:

apiVersion: operators.coreos.com/v1alpha1
kind: Subscription
metadata:
  name: multicluster-operators-subscription-alpha-community-operators-openshift-marketplace
  namespace: open-cluster-management
spec:
  channel: release-2.2
  config:
    resources:
      limits:
        cpu: 750m
        memory: 2Gi
      requests:
        cpu: 150m
        memory: 128Mi
  installPlanApproval: Automatic
  name: multicluster-operators-subscription
  source: community-operators
  sourceNamespace: openshift-marketplace
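As an alternative to editing the resource interactively, the same spec.config.resources values could be applied with a merge patch. This is a sketch under stated assumptions rather than the documented procedure: both function names are hypothetical, and the subscription name is passed as an argument because it depends on your installation.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print a JSON merge patch that carries the resource
# values from the example .yaml file (limits 750m / 2Gi, requests 150m / 128Mi).
resources_patch() {
  cat <<'EOF'
{"spec":{"config":{"resources":{"limits":{"cpu":"750m","memory":"2Gi"},"requests":{"cpu":"150m","memory":"128Mi"}}}}}
EOF
}

# Hypothetical helper: apply the patch to an existing operator subscription.
# Usage: patch_subscription_resources <subscription-name> [namespace]
patch_subscription_resources() {
  local name="$1" namespace="${2:-open-cluster-management}"
  # The fully qualified resource name selects the Operator Lifecycle Manager
  # Subscription kind (operators.coreos.com), matching the apiVersion in the example.
  oc patch subscriptions.operators.coreos.com "$name" -n "$namespace" \
    --type merge -p "$(resources_patch)"
}
```

Patching the existing custom resource, rather than creating a second one, keeps a single operator subscription linked to the operator, in line with the note above.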
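After the resource is saved, make sure that the standalone subscription pod is restarted with a memory limit of 2Gi. Run the following command: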
% oc get pods -n open-cluster-management multicluster-operators-standalone-subscription-7c8cbf885f-c94kz -o yaml
apiVersion: v1
kind: Pod
...
spec:
  containers:
  - image: quay.io/open-cluster-management/multicluster-operators-subscription:community-2.2
    ...
    resources:
      limits:
        cpu: 750m
        memory: 2Gi
      requests:
        cpu: 150m
        memory: 128Mi
...
status:
  qosClass: Burstable
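Instead of reading the full pod definition, the memory limit could be extracted and compared directly. The following sketch assumes that the pod runs a single container, so that exactly one value is returned; the function names `pod_memory_limits` and `check_memory_limit` are hypothetical.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print the memory limit of each container in a pod,
# separated by spaces.
pod_memory_limits() {
  local namespace="$1" pod="$2"
  oc get pod "$pod" -n "$namespace" \
    -o jsonpath='{.spec.containers[*].resources.limits.memory}'
}

# Hypothetical check: succeed only if the reported limit equals the expected
# value (default 2Gi). Assumes a single-container pod.
# Usage: check_memory_limit <namespace> <pod-name> [expected-limit]
check_memory_limit() {
  local namespace="$1" pod="$2" expected="${3:-2Gi}"
  local actual
  actual="$(pod_memory_limits "$namespace" "$pod")"
  if [ "$actual" = "$expected" ]; then
    echo "OK: memory limit is $actual"
  else
    echo "MISMATCH: expected $expected, got '$actual'" >&2
    return 1
  fi
}
```

A mismatch suggests that the pod has not yet been restarted with the new settings, or that the subscription change was not applied.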