Chapter 8. Using CPU Manager and Topology Manager
CPU Manager manages groups of CPUs and constrains workloads to specific CPUs.
CPU Manager is useful for workloads that have some of these attributes:
- Require as much CPU time as possible.
- Are sensitive to processor cache misses.
- Are low-latency network applications.
- Coordinate with other processes and benefit from sharing a single processor cache.
Topology Manager collects hints from the CPU Manager, Device Manager, and other Hint Providers to align pod resources, such as CPU, SR-IOV VFs, and other device resources, for all Quality of Service (QoS) classes on the same non-uniform memory access (NUMA) node.
Topology Manager uses topology information from the collected hints to decide if a pod can be accepted or rejected on a node, based on the configured Topology Manager policy and pod resources requested.
Topology Manager is useful for workloads that use hardware accelerators to support latency-critical execution and high throughput parallel computation.
To use Topology Manager, you must configure CPU Manager with the static policy.
8.1. Setting up CPU Manager
To configure CPU manager, create a KubeletConfig custom resource (CR) and apply it to the desired set of nodes.
Procedure
Label a node by running the following command:

# oc label node perf-node.example.com cpumanager=true

To enable CPU Manager for all compute nodes, edit the machine config pool CR by running the following command:

# oc edit machineconfigpool worker

Add the custom-kubelet: cpumanager-enabled label to the metadata.labels section:

metadata:
  creationTimestamp: 2020-xx-xxx
  generation: 3
  labels:
    custom-kubelet: cpumanager-enabled
Create a KubeletConfig custom resource (CR), cpumanager-kubeletconfig.yaml. Refer to the label created in the previous step so that the correct nodes are updated with the new kubelet config. See the machineConfigPoolSelector section and the sketch after this list:

- Specify a policy with cpuManagerPolicy:
  - none. This policy explicitly enables the existing default CPU affinity scheme, providing no affinity beyond what the scheduler does automatically. This is the default policy.
  - static. This policy allows containers in guaranteed pods with integer CPU requests. It also limits access to exclusive CPUs on the node. If you specify static, you must use a lowercase s.
- Optional. Specify the CPU Manager reconcile frequency with cpuManagerReconcilePeriod. The default is 5s.
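The following KubeletConfig is a minimal sketch: the resource name and selector assume the custom-kubelet: cpumanager-enabled label added in the previous step, and the field values are illustrative.

apiVersion: machineconfiguration.openshift.io/v1
kind: KubeletConfig
metadata:
  name: cpumanager-enabled
spec:
  machineConfigPoolSelector:
    matchLabels:
      custom-kubelet: cpumanager-enabled   # matches the label added to the worker machine config pool
  kubeletConfig:
    cpuManagerPolicy: static               # must be lowercase
    cpuManagerReconcilePeriod: 5s          # optional; the default is 5s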
Create the dynamic kubelet config by running the following command:

# oc create -f cpumanager-kubeletconfig.yaml

This adds the CPU Manager feature to the kubelet config. The Machine Config Operator (MCO) reboots the node only if needed; enabling CPU Manager by itself does not require a reboot.
Check for the merged kubelet config by running the following command:

# oc get machineconfig 99-worker-XXXXXX-XXXXX-XXXX-XXXXX-kubelet -o json | grep ownerReference -A7

Check the compute node for the updated kubelet.conf file by running the following commands:

# oc debug node/perf-node.example.com
sh-4.2# cat /host/etc/kubernetes/kubelet.conf | grep cpuManager

Example output
cpuManagerPolicy: static
cpuManagerReconcilePeriod: 5s

Create a project by running the following command:
$ oc new-project <project_name>

Create a pod that requests one or more cores. Both limits and requests must have their CPU value set to a whole integer; that integer is the number of cores dedicated to this pod:

# cat cpumanager-pod.yaml
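A minimal sketch of such a pod spec; the generateName, image, and memory values are illustrative, and the nodeSelector assumes the cpumanager=true label applied earlier:

apiVersion: v1
kind: Pod
metadata:
  generateName: cpumanager-
spec:
  containers:
  - name: cpumanager
    image: gcr.io/google_containers/pause:3.2   # illustrative; any long-running image works
    resources:
      requests:
        cpu: 1          # whole integer: the number of dedicated cores
        memory: "1G"
      limits:
        cpu: 1          # must equal the request for Guaranteed QoS
        memory: "1G"
  nodeSelector:
    cpumanager: "true"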
Create the pod by running the following command:

# oc create -f cpumanager-pod.yaml
Verification
Verify that the pod is scheduled to the node that you labeled by running the following command:
# oc describe pod cpumanager

Verify that a CPU has been exclusively assigned to the pod by running the following command:
# oc describe node --selector='cpumanager=true' | grep -i cpumanager- -B2

Example output

NAMESPACE   NAME               CPU Requests  CPU Limits  Memory Requests  Memory Limits  Age
cpuman      cpumanager-mlrrz   1 (28%)       1 (28%)     1G (13%)         1G (13%)       27m
Verify that the cgroups are set up correctly. Get the process ID (PID) of the pause process by running the following commands:

# oc debug node/perf-node.example.com
sh-4.2# systemctl status | grep -B5 pause

Note: If the output returns multiple pause process entries, you must identify the correct pause process.
Verify that pods of quality of service (QoS) tier Guaranteed are placed within the kubepods.slice subdirectory by running the following commands:

# cd /sys/fs/cgroup/kubepods.slice/kubepods-pod69c01f8e_6b74_11e9_ac0f_0a2b62178a22.slice/crio-b5437308f1ad1a7db0574c542bdf08563b865c0345c86e9585f8c0b0a655612c.scope
# for i in `ls cpuset.cpus cgroup.procs` ; do echo -n "$i "; cat $i ; done
Note: Pods of other QoS tiers end up in child cgroups of the parent kubepods.

Example output

cpuset.cpus 1
tasks 32706

Check the allowed CPU list for the task by running the following command:
# grep ^Cpus_allowed_list /proc/32706/status

Example output

Cpus_allowed_list:    1
Verify that another pod on the system cannot run on the core allocated for the Guaranteed pod. For example, to verify the pod in the besteffort QoS tier, run the following commands:

# cat /sys/fs/cgroup/kubepods.slice/kubepods-besteffort.slice/kubepods-besteffort-podc494a073_6b77_11e9_98c0_06bba5c387ea.slice/crio-c56982f57b75a2420947f0afc6cafe7534c5734efc34157525fa9abbf99e3849.scope/cpuset.cpus
# oc describe node perf-node.example.com
This VM has two CPU cores. The system-reserved setting reserves 500 millicores, meaning that half of one core is subtracted from the total capacity of the node to arrive at the Node Allocatable amount. You can see that Allocatable CPU is 1500 millicores. This means you can run one of the CPU Manager pods, because each one takes one whole core, and a whole core is equivalent to 1000 millicores. If you try to schedule a second pod, the system accepts the pod, but it is never scheduled:

NAME               READY   STATUS    RESTARTS   AGE
cpumanager-6cqz7   1/1     Running   0          33m
cpumanager-7qc2t   0/1     Pending   0          11s
8.2. Topology Manager policies
Topology Manager aligns Pod resources of all Quality of Service (QoS) classes by collecting topology hints from Hint Providers, such as CPU Manager and Device Manager, and using the collected hints to align the Pod resources.

Topology Manager supports four allocation policies, which you assign in the KubeletConfig custom resource (CR) named cpumanager-enabled:
none policy
- This is the default policy and does not perform any topology alignment.

best-effort policy
- For each container in a pod with the best-effort topology management policy, kubelet tries to align all the required resources on a NUMA node according to the preferred NUMA node affinity for that container. Even if the allocation is not possible due to insufficient resources, the Topology Manager still admits the pod, but the allocation is shared with other NUMA nodes.

restricted policy
- For each container in a pod with the restricted topology management policy, kubelet determines the theoretical minimum number of NUMA nodes that can fulfill the request. If the actual allocation requires more than that number of NUMA nodes, the Topology Manager rejects the admission, placing the pod in a Terminated state. If that number of NUMA nodes can fulfill the request, the Topology Manager admits the pod and the pod starts running.

single-numa-node policy
- For each container in a pod with the single-numa-node topology management policy, kubelet admits the pod if all the resources required by the pod can be allocated on the same NUMA node. If a single NUMA node affinity is not possible, the Topology Manager rejects the pod from the node. This results in a pod in a Terminated state with a pod admission failure.
8.3. Setting up Topology Manager
To use Topology Manager, you must configure an allocation policy in the KubeletConfig custom resource (CR) named cpumanager-enabled. This CR might already exist if you have set up CPU Manager. If the CR does not exist, you can create it.
Prerequisites
- Configure the CPU Manager policy to be static.
Procedure
To activate Topology Manager, configure the Topology Manager allocation policy in the custom resource:

$ oc edit KubeletConfig cpumanager-enabled
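A sketch of the edited CR is shown below. The topologyManagerPolicy value can be any of the policies described in Section 8.2; single-numa-node here is illustrative.

apiVersion: machineconfiguration.openshift.io/v1
kind: KubeletConfig
metadata:
  name: cpumanager-enabled
spec:
  machineConfigPoolSelector:
    matchLabels:
      custom-kubelet: cpumanager-enabled
  kubeletConfig:
    cpuManagerPolicy: static                 # Topology Manager requires the static CPU Manager policy
    cpuManagerReconcilePeriod: 5s
    topologyManagerPolicy: single-numa-node  # one of: none, best-effort, restricted, single-numa-node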
8.4. Pod interactions with Topology Manager policies
The example Pod specs illustrate pod interactions with Topology Manager.
The following pod runs in the BestEffort QoS class because no resource requests or limits are specified.
spec:
containers:
- name: nginx
image: nginx
The next pod runs in the Burstable QoS class because requests are less than limits.
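A minimal sketch of such a spec, with illustrative memory values; the request is lower than the limit, so the pod is Burstable:

spec:
  containers:
  - name: nginx
    image: nginx
    resources:
      limits:
        memory: "200Mi"
      requests:
        memory: "100Mi"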
If the selected policy is anything other than none, Topology Manager processes all the pods, but it enforces resource alignment only for Guaranteed QoS Pod specifications. When the Topology Manager policy is set to none, the relevant containers are pinned to any available CPU without considering NUMA affinity. This is the default behavior and it does not optimize for performance-sensitive workloads. Other values enable the use of topology awareness information from device plugins and core resources, such as CPU and memory. The Topology Manager attempts to align the CPU, memory, and device allocations according to the topology of the node when the policy is set to a value other than none. For more information about the available values, see Topology Manager policies.
The following example pod runs in the Guaranteed QoS class because requests are equal to limits.
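A minimal sketch with illustrative values: requests equal limits, the CPU count is an integer so the static CPU Manager policy can assign exclusive CPUs, and example.com/device stands in for a hypothetical extended resource exposed by a device plugin:

spec:
  containers:
  - name: nginx
    image: nginx
    resources:
      limits:
        memory: "200Mi"
        cpu: "2"
        example.com/device: "1"   # hypothetical device plugin resource
      requests:
        memory: "200Mi"
        cpu: "2"
        example.com/device: "1"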
Topology Manager considers this pod. It consults the Hint Providers, which are CPU Manager, Device Manager, and Memory Manager, to get topology hints for the pod. Topology Manager uses this information to store the best topology for this container. For this pod, CPU Manager and Device Manager then use the stored information at the resource allocation stage.