13.2. 使用污点和容限来控制日志记录 pod 放置
通过污点和容限,节点可以控制哪些 pod 应该(或不应该)调度到节点上。
13.2.1. 了解污点和容限
通过使用污点(taint),节点可以拒绝调度 pod,除非 pod 具有匹配的容限(toleration)。
您可以通过节点规格(NodeSpec
)将污点应用到节点
,并通过 Pod
规格(PodSpec
)将容限应用到 pod。当您应用污点时,调度程序无法将 pod 放置到该节点上,除非 pod 可以容限该污点。
节点规格中的污点示例
apiVersion: v1 kind: Node metadata: name: my-node #... spec: taints: - effect: NoExecute key: key1 value: value1 #...
Pod
规格中的容限示例
apiVersion: v1 kind: Pod metadata: name: my-pod #... spec: tolerations: - key: "key1" operator: "Equal" value: "value1" effect: "NoExecute" tolerationSeconds: 3600 #...
污点与容限由 key、value 和 effect 组成。
参数 | 描述 | ||||||
---|---|---|---|---|---|---|---|
|
| ||||||
|
| ||||||
| effect 的值包括:
| ||||||
|
|
如果向 control plane 节点添加了一个
NoSchedule
污点,节点必须具有node-role.kubernetes.io/master=:NoSchedule
污点,这默认会添加。例如:
apiVersion: v1 kind: Node metadata: annotations: machine.openshift.io/machine: openshift-machine-api/ci-ln-62s7gtb-f76d1-v8jxv-master-0 machineconfiguration.openshift.io/currentConfig: rendered-master-cdc1ab7da414629332cc4c3926e6e59c name: my-node #... spec: taints: - effect: NoSchedule key: node-role.kubernetes.io/master #...
容限与污点匹配:
如果
operator
参数设为Equal
:-
key
参数相同; -
value
参数相同; -
effect
参数相同。
-
如果
operator
参数设为Exists
:-
key
参数相同; -
effect
参数相同。
-
OpenShift Container Platform 中内置了以下污点:
-
node.kubernetes.io/not-ready
:节点未就绪。这与节点状况Ready=False
对应。 -
node.kubernetes.io/unreachable
:节点无法从节点控制器访问。这与节点状况Ready=Unknown
对应。 -
node.kubernetes.io/memory-pressure
:节点存在内存压力问题。这与节点状况MemoryPressure=True
对应。 -
node.kubernetes.io/disk-pressure
:节点存在磁盘压力问题。这与节点状况DiskPressure=True
对应。 -
node.kubernetes.io/network-unavailable
:节点网络不可用。 -
node.kubernetes.io/unschedulable
:节点不可调度。 -
node.cloudprovider.kubernetes.io/uninitialized
:当节点控制器通过外部云提供商启动时,在节点上设置这个污点来将其标记为不可用。在云控制器管理器中的某个控制器初始化这个节点后,kubelet 会移除此污点。 node.kubernetes.io/pid-pressure
:节点具有 pid 压力。这与节点状况PIDPressure=True
对应。重要OpenShift Container Platform 不设置默认的 pid.available
evictionHard
。
13.2.2. Loki pod 放置
您可以通过在 pod 上使用容忍度或节点选择器来控制 Loki pod 在哪些节点上运行,并防止其他工作负载使用这些节点。
您可以使用 LokiStack 自定义资源 (CR) 将容限应用到日志存储 pod,并将污点应用到具有节点规格的节点。节点上的污点是一个 key:value
对,它指示节点排斥所有不允许污点的 pod。通过使用不在其他 pod 上的特定 key:value
对,可确保只有日志存储 pod 能够在该节点上运行。
带有节点选择器的 LokiStack 示例
apiVersion: loki.grafana.com/v1 kind: LokiStack metadata: name: logging-loki namespace: openshift-logging spec: # ... template: compactor: 1 nodeSelector: node-role.kubernetes.io/infra: "" 2 distributor: nodeSelector: node-role.kubernetes.io/infra: "" gateway: nodeSelector: node-role.kubernetes.io/infra: "" indexGateway: nodeSelector: node-role.kubernetes.io/infra: "" ingester: nodeSelector: node-role.kubernetes.io/infra: "" querier: nodeSelector: node-role.kubernetes.io/infra: "" queryFrontend: nodeSelector: node-role.kubernetes.io/infra: "" ruler: nodeSelector: node-role.kubernetes.io/infra: "" # ...
在上例配置中,所有 Loki pod 都移到包含 node-role.kubernetes.io/infra: ""
标签的节点。
带有节点选择器和容限的 LokiStack CR 示例
apiVersion: loki.grafana.com/v1 kind: LokiStack metadata: name: logging-loki namespace: openshift-logging spec: # ... template: compactor: nodeSelector: node-role.kubernetes.io/infra: "" tolerations: - effect: NoSchedule key: node-role.kubernetes.io/infra value: reserved - effect: NoExecute key: node-role.kubernetes.io/infra value: reserved distributor: nodeSelector: node-role.kubernetes.io/infra: "" tolerations: - effect: NoSchedule key: node-role.kubernetes.io/infra value: reserved - effect: NoExecute key: node-role.kubernetes.io/infra value: reserved nodeSelector: node-role.kubernetes.io/infra: "" tolerations: - effect: NoSchedule key: node-role.kubernetes.io/infra value: reserved - effect: NoExecute key: node-role.kubernetes.io/infra value: reserved indexGateway: nodeSelector: node-role.kubernetes.io/infra: "" tolerations: - effect: NoSchedule key: node-role.kubernetes.io/infra value: reserved - effect: NoExecute key: node-role.kubernetes.io/infra value: reserved ingester: nodeSelector: node-role.kubernetes.io/infra: "" tolerations: - effect: NoSchedule key: node-role.kubernetes.io/infra value: reserved - effect: NoExecute key: node-role.kubernetes.io/infra value: reserved querier: nodeSelector: node-role.kubernetes.io/infra: "" tolerations: - effect: NoSchedule key: node-role.kubernetes.io/infra value: reserved - effect: NoExecute key: node-role.kubernetes.io/infra value: reserved queryFrontend: nodeSelector: node-role.kubernetes.io/infra: "" tolerations: - effect: NoSchedule key: node-role.kubernetes.io/infra value: reserved - effect: NoExecute key: node-role.kubernetes.io/infra value: reserved ruler: nodeSelector: node-role.kubernetes.io/infra: "" tolerations: - effect: NoSchedule key: node-role.kubernetes.io/infra value: reserved - effect: NoExecute key: node-role.kubernetes.io/infra value: reserved gateway: nodeSelector: node-role.kubernetes.io/infra: "" tolerations: - effect: NoSchedule key: node-role.kubernetes.io/infra value: reserved - effect: NoExecute key: node-role.kubernetes.io/infra value: reserved # ...
要配置 LokiStack (CR) 的 nodeSelector
和 tolerations
字段,您可以使用 oc explain
命令查看特定资源的描述和字段:
$ oc explain lokistack.spec.template
输出示例
KIND: LokiStack VERSION: loki.grafana.com/v1 RESOURCE: template <Object> DESCRIPTION: Template defines the resource/limits/tolerations/nodeselectors per component FIELDS: compactor <Object> Compactor defines the compaction component spec. distributor <Object> Distributor defines the distributor component spec. ...
如需更多信息,您可以添加一个特定字段:
$ oc explain lokistack.spec.template.compactor
输出示例
KIND: LokiStack VERSION: loki.grafana.com/v1 RESOURCE: compactor <Object> DESCRIPTION: Compactor defines the compaction component spec. FIELDS: nodeSelector <map[string]string> NodeSelector defines the labels required by a node to schedule the component onto it. ...
13.2.3. 使用容忍度来控制日志收集器 pod 放置
默认情况下,日志收集器 pod 具有以下 tolerations
配置:
apiVersion: v1 kind: Pod metadata: name: collector-example namespace: openshift-logging spec: # ... collection: type: vector tolerations: - effect: NoSchedule key: node-role.kubernetes.io/master operator: Exists - effect: NoSchedule key: node.kubernetes.io/disk-pressure operator: Exists - effect: NoExecute key: node.kubernetes.io/not-ready operator: Exists - effect: NoExecute key: node.kubernetes.io/unreachable operator: Exists - effect: NoSchedule key: node.kubernetes.io/memory-pressure operator: Exists - effect: NoSchedule key: node.kubernetes.io/pid-pressure operator: Exists - effect: NoSchedule key: node.kubernetes.io/unschedulable operator: Exists # ...
先决条件
-
已安装 Red Hat OpenShift Logging Operator 和 OpenShift CLI (
oc
)。
流程
运行以下命令,将污点添加到要在其上调度日志记录收集器 pod 的节点:
$ oc adm taint nodes <node_name> <key>=<value>:<effect>
示例命令
$ oc adm taint nodes node1 collector=node:NoExecute
本例在
node1
上放置一个键为collector
且值为node
的污点,污点效果是NoExecute
。您必须使用NoExecute
污点设置。NoExecute
仅调度与污点匹配的 pod,并删除不匹配的现有 pod。编辑
ClusterLogging
自定义资源(CR)的collection
小节,以配置日志记录收集器 Pod 的容忍度:apiVersion: logging.openshift.io/v1 kind: ClusterLogging metadata: # ... spec: # ... collection: type: vector tolerations: - key: collector 1 operator: Exists 2 effect: NoExecute 3 tolerationSeconds: 6000 4 resources: limits: memory: 2Gi requests: cpu: 100m memory: 1Gi # ...
此容忍度与 oc adm taint
命令创建的污点匹配。具有此容忍度的 pod 可以调度到 node1
上。
13.2.4. 配置日志记录收集器的资源和调度
管理员可以通过创建位于同一命名空间中的 ClusterLogging
自定义资源(CR)来修改收集器的资源或调度,其名称与它支持的 ClusterLogForwarder
CR 的名称相同。
在部署中使用多个日志转发器时,ClusterClusterLogging
CR 的适用小节是 managementState
和 collection
。所有其他小节将被忽略。
先决条件
- 有管理员权限。
- 已安装 Red Hat OpenShift Logging Operator 版本 5.8 或更新版本。
-
您已创建了
ClusterLogForwarder
CR。
流程
创建支持现有
ClusterLogForwarder
CR 的ClusterLogging
CR:ClusterLogging
CR YAML 示例apiVersion: logging.openshift.io/v1 kind: ClusterLogging metadata: name: <name> 1 namespace: <namespace> 2 spec: managementState: "Managed" collection: type: "vector" tolerations: - key: "logging" operator: "Exists" effect: "NoExecute" tolerationSeconds: 6000 resources: limits: memory: 1Gi requests: cpu: 100m memory: 1Gi nodeSelector: collector: needed # ...
运行以下命令来应用
ClusterLogging
CR:$ oc apply -f <filename>.yaml
13.2.5. 查看日志记录收集器 Pod
您可以查看日志记录收集器 Pod 及其运行的对应节点。
流程
在项目中运行以下命令查看日志记录收集器 Pod 及其详情:
$ oc get pods --selector component=collector -o wide -n <project_name>
输出示例
NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES collector-8d69v 1/1 Running 0 134m 10.130.2.30 master1.example.com <none> <none> collector-bd225 1/1 Running 0 134m 10.131.1.11 master2.example.com <none> <none> collector-cvrzs 1/1 Running 0 134m 10.130.0.21 master3.example.com <none> <none> collector-gpqg2 1/1 Running 0 134m 10.128.2.27 worker1.example.com <none> <none> collector-l9j7j 1/1 Running 0 134m 10.129.2.31 worker2.example.com <none> <none>