Chapter 4. Using CPU Manager

CPU Manager manages groups of CPUs and constrains workloads to specific CPUs.

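CPU Manager is useful for workloads that have some of these attributes:

Require as much CPU time as possible.
Are sensitive to processor cache misses.
Are low-latency network applications.
Coordinate with other processes and benefit from sharing a single processor cache.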

4.1. Setting up CPU Manager

Procedure

Optional: Label a node:

# oc label node perf-node.example.com cpumanager=true

Edit the MachineConfigPool of the nodes where CPU Manager should be enabled. In this example, all workers have CPU Manager enabled:
```
# oc edit machineconfigpool worker
```

Add a label to the worker MachineConfigPool:

metadata:
  creationTimestamp: 2019-xx-xxx
  generation: 3
  labels:
    custom-kubelet: cpumanager-enabled

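Create a KubeletConfig custom resource (CR), cpumanager-kubeletconfig.yaml. In its machineConfigPoolSelector section, reference the label created in the previous step so that the new KubeletConfig is applied to the correct nodes: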

apiVersion: machineconfiguration.openshift.io/v1
kind: KubeletConfig
metadata:
  name: cpumanager-enabled
spec:
  machineConfigPoolSelector:
    matchLabels:
      custom-kubelet: cpumanager-enabled
  kubeletConfig:
    cpuManagerPolicy: static
    cpuManagerReconcilePeriod: 5s

Create the dynamic KubeletConfig:
```
# oc create -f cpumanager-kubeletconfig.yaml
```
This adds the CPU Manager feature to the KubeletConfig. If needed, the Machine Config Operator (MCO) reboots the node. Enabling CPU Manager does not by itself require a reboot.

Check for the merged KubeletConfig:

# oc get machineconfig 99-worker-XXXXXX-XXXXX-XXXX-XXXXX-kubelet -o json | grep ownerReference -A7

       "ownerReferences": [
            {
                "apiVersion": "machineconfiguration.openshift.io/v1",
                "kind": "KubeletConfig",
                "name": "cpumanager-enabled",
                "uid": "7ed5616d-6b72-11e9-aae1-021e1ce18878"
            }
        ],

Check the worker for the updated kubelet.conf:

# oc debug node/perf-node.example.com
sh-4.4# cat /host/etc/kubernetes/kubelet.conf | grep cpuManager
cpuManagerPolicy: static
cpuManagerReconcilePeriod: 5s

Both settings were defined when you created the KubeletConfig CR.

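Create a Pod that requests one or more cores. Both limits and requests must have their CPU value set to a whole integer. That value is the number of cores dedicated to this Pod: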

# cat cpumanager-pod.yaml
apiVersion: v1
kind: Pod
metadata:
  generateName: cpumanager-
spec:
  containers:
  - name: cpumanager
    image: gcr.io/google_containers/pause-amd64:3.0
    resources:
      requests:
        cpu: 1
        memory: "1G"
      limits:
        cpu: 1
        memory: "1G"
  nodeSelector:
    cpumanager: "true"
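
The whole-integer rule above can be checked before a Pod is submitted. The following is a minimal sketch, not part of oc or the kubelet: is_whole_core_request is a hypothetical helper that takes a container's CPU request and limit as plain strings. As a simplification, it accepts only bare integers such as 1 or 2 and rejects millicore notation, including values like 1000m that Kubernetes would treat as a whole core.

```shell
#!/bin/sh
# Hypothetical helper (an assumption for illustration, not an oc feature):
# succeeds when the CPU request and limit are the same positive whole number.
is_whole_core_request() {
    request="$1"
    limit="$2"
    case "$request" in
        # Reject empty values and anything containing a non-digit,
        # such as "0.5" or "500m".
        ''|*[!0-9]*) return 1 ;;
    esac
    [ "$request" -gt 0 ] && [ "$request" = "$limit" ]
}

# Try a few request/limit pairs, including the values from cpumanager-pod.yaml.
for pair in "1 1" "2 2" "500m 500m" "1 2"; do
    set -- $pair
    if is_whole_core_request "$1" "$2"; then
        echo "requests=$1 limits=$2: whole cores, eligible for dedicated CPUs"
    else
        echo "requests=$1 limits=$2: not eligible"
    fi
done
```

Only the first two pairs pass the check; the fractional pair and the mismatched pair are rejected.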

Create the Pod:
```
# oc create -f cpumanager-pod.yaml
```

Verify that the Pod is scheduled to the node that you labeled:

# oc describe pod cpumanager
Name:               cpumanager-6cqz7
Namespace:          default
Priority:           0
PriorityClassName:  <none>
Node:  perf-node.example.com/xxx.xx.xx.xxx
...
    Limits:
      cpu:     1
      memory:  1G
    Requests:
      cpu:        1
      memory:     1G
...
QoS Class:       Guaranteed
Node-Selectors:  cpumanager=true

Verify that the cgroups are set up correctly. Get the process ID (PID) of the pause process:

# ├─init.scope
│ └─1 /usr/lib/systemd/systemd --switched-root --system --deserialize 17
└─kubepods.slice
  ├─kubepods-pod69c01f8e_6b74_11e9_ac0f_0a2b62178a22.slice
  │ ├─crio-b5437308f1a574c542bdf08563b865c0345c8f8c0b0a655612c.scope
  │ └─32706 /pause

Pods of quality of service (QoS) tier Guaranteed are placed within kubepods.slice. Pods of other QoS tiers end up in child cgroups of kubepods:

# cd /sys/fs/cgroup/cpuset/kubepods.slice/kubepods-pod69c01f8e_6b74_11e9_ac0f_0a2b62178a22.slice/crio-b5437308f1ad1a7db0574c542bdf08563b865c0345c86e9585f8c0b0a655612c.scope
# for i in `ls cpuset.cpus tasks` ; do echo -n "$i "; cat $i ; done
cpuset.cpus 1
tasks 32706

Check the allowed CPU list for the task:

# grep ^Cpus_allowed_list /proc/32706/status
 Cpus_allowed_list:    1
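
The comparison between cpuset.cpus and Cpus_allowed_list can also be scripted. This is a rough sketch under stated assumptions: matches_cpuset is a hypothetical helper that reads the text of a /proc/<pid>/status file on standard input and compares the field as a plain string, so equivalent lists written differently (for example 0-1 and 0,1) would not match. The sample text stands in for the real file.

```shell
#!/bin/sh
# Hypothetical helper: succeeds when the Cpus_allowed_list field read from
# stdin equals the expected cpuset string given as the first argument.
matches_cpuset() {
    expected="$1"
    actual=$(awk '$1 == "Cpus_allowed_list:" { print $2 }')
    [ "$actual" = "$expected" ]
}

# Sample status text modeled on the output above.
sample_status=$(printf 'Name:\tpause\nCpus_allowed_list:\t1\n')

if printf '%s\n' "$sample_status" | matches_cpuset "1"; then
    echo "task is restricted to CPU 1"
else
    echo "task is not restricted to CPU 1"
fi
```

On a node, the same function could be fed the real file, for example `matches_cpuset 1 < /proc/32706/status`.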

Verify that another pod on the system (in this example, a pod in the besteffort QoS tier) cannot run on the core allocated to the Guaranteed pod:

# cat /sys/fs/cgroup/cpuset/kubepods.slice/kubepods-besteffort.slice/kubepods-besteffort-podc494a073_6b77_11e9_98c0_06bba5c387ea.slice/crio-c56982f57b75a2420947f0afc6cafe7534c5734efc34157525fa9abbf99e3849.scope/cpuset.cpus

0
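
Isolation holds when the two cpuset.cpus values share no CPU. The sketch below illustrates that check for the general case where the values are lists or ranges such as 0-2,5. Both function names are hypothetical, and the input is assumed to be well-formed cpuset syntax.

```shell
#!/bin/sh
# Expand a cpuset list such as "0-2,5" into "0 1 2 5".
expand_cpu_list() {
    out=""
    for part in $(printf '%s' "$1" | tr ',' ' '); do
        case "$part" in
            *-*)
                # A range: walk from the first CPU to the last one.
                i="${part%-*}"
                end="${part#*-}"
                while [ "$i" -le "$end" ]; do
                    out="$out $i"
                    i=$((i + 1))
                done
                ;;
            *)
                out="$out $part"
                ;;
        esac
    done
    printf '%s\n' "${out# }"
}

# Succeed (exit status 0) when the two lists share at least one CPU.
cpu_lists_overlap() {
    for a in $(expand_cpu_list "$1"); do
        for b in $(expand_cpu_list "$2"); do
            [ "$a" -eq "$b" ] && return 0
        done
    done
    return 1
}

guaranteed_cpus="1"   # cpuset.cpus of the Guaranteed pod in this example
other_cpus="0"        # cpuset.cpus of the other pod in this example

if cpu_lists_overlap "$guaranteed_cpus" "$other_cpus"; then
    echo "overlap: the other pod can run on a dedicated core"
else
    echo "no overlap: the dedicated core is isolated"
fi
```

With the values from this example, the script reports no overlap.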

# oc describe node perf-node.example.com
...
Capacity:
 attachable-volumes-aws-ebs:  39
 cpu:                         2
 ephemeral-storage:           124768236Ki
 hugepages-1Gi:               0
 hugepages-2Mi:               0
 memory:                      8162900Ki
 pods:                        250
Allocatable:
 attachable-volumes-aws-ebs:  39
 cpu:                         1500m
 ephemeral-storage:           124768236Ki
 hugepages-1Gi:               0
 hugepages-2Mi:               0
 memory:                      7548500Ki
 pods:                        250
-------                               ----                           ------------  ----------  ---------------  -------------  ---
  default                                 cpumanager-6cqz7               1 (66%)       1 (66%)     1G (12%)         1G (12%)       29m

Allocated resources:
  (Total limits may be over 100 percent, i.e., overcommitted.)
  Resource                    Requests          Limits
  --------                    --------          ------
  cpu                         1440m (96%)       1 (66%)

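This VM has two CPU cores. kube-reserved is set to 500 millicores, meaning that half of one core is subtracted from the total capacity of the node to arrive at the Node Allocatable amount. You can see that Allocatable CPU is 1500 millicores. This means that you can run one CPU Manager pod, because each pod takes one whole core, and a whole core is equivalent to 1000 millicores. If you try to schedule a second pod, the system accepts the pod, but it is never scheduled: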

NAME                    READY   STATUS    RESTARTS   AGE
cpumanager-6cqz7        1/1     Running   0          33m
cpumanager-7qc2t        0/1     Pending   0          11s
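
The arithmetic behind this result can be reproduced in a short sketch. The variable names are illustrative, the values come from the oc describe node output above, and the 500 millicore reservation is assumed to be the only amount subtracted from capacity.

```shell
#!/bin/sh
capacity_millicores=2000   # "Capacity: cpu: 2" means two whole cores
reserved_millicores=500    # kube-reserved in this example

# Node Allocatable is the capacity minus the reserved amount.
allocatable_millicores=$((capacity_millicores - reserved_millicores))

# Each CPU Manager pod takes one whole core (1000 millicores), so integer
# division gives the number of such pods that fit.
whole_core_pods=$((allocatable_millicores / 1000))

# Share of Allocatable used by one whole-core request, rounded down.
request_percent=$((1000 * 100 / allocatable_millicores))

echo "allocatable: ${allocatable_millicores}m"
echo "whole-core pods that fit: ${whole_core_pods}"
echo "one core is ${request_percent}% of allocatable"
```

The result is 1500m allocatable and room for a single whole-core pod, which is why cpumanager-7qc2t stays Pending. The 66 percent figure matches the 1 (66%) shown for cpumanager-6cqz7.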
