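6.3. Scheduling NUMA-aware workloads

Clusters running latency-sensitive workloads typically feature performance profiles that help to minimize workload latency and optimize performance. The NUMA-aware scheduler deploys workloads based on available node NUMA resources and respects any performance profile settings applied to the node. The combination of NUMA-aware deployments and the performance profile of the workload ensures that workloads are scheduled in a way that maximizes performance.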

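6.3.1. Creating the NUMAResourcesOperator custom resource

After you install the NUMA Resources Operator, create the NUMAResourcesOperator custom resource (CR). The CR instructs the NUMA Resources Operator to install all the cluster infrastructure needed to support the NUMA-aware scheduler, including daemon sets and APIs.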

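Prerequisites

Install the OpenShift CLI (oc).
Log in as a user with cluster-admin privileges.
Install the NUMA Resources Operator.

Procedure

Create the MachineConfigPool custom resource that enables custom kubelet configurations for worker nodes:

Save the following YAML in the nro-machineconfig.yaml file: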

apiVersion: machineconfiguration.openshift.io/v1
kind: MachineConfigPool
metadata:
  labels:
    cnf-worker-tuning: enabled
    machineconfiguration.openshift.io/mco-built-in: ""
    pools.operator.machineconfiguration.openshift.io/worker: ""
  name: worker
spec:
  machineConfigSelector:
    matchLabels:
      machineconfiguration.openshift.io/role: worker
  nodeSelector:
    matchLabels:
      node-role.kubernetes.io/worker: ""

Create the MachineConfigPool CR by running the following command:
```
$ oc create -f nro-machineconfig.yaml
```

Create the NUMAResourcesOperator custom resource:

Save the following YAML in the nrop.yaml file:

apiVersion: nodetopology.openshift.io/v1alpha1
kind: NUMAResourcesOperator
metadata:
  name: numaresourcesoperator
spec:
  nodeGroups:
  - machineConfigPoolSelector:
      matchLabels:
        pools.operator.machineconfiguration.openshift.io/worker: ""

The label under machineConfigPoolSelector.matchLabels should match the label applied to worker nodes in the related MachineConfigPool CR.

Create the NUMAResourcesOperator CR by running the following command:
```
$ oc create -f nrop.yaml
```
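
The requirement that the machineConfigPoolSelector label matches the MachineConfigPool labels can be illustrated with a small sketch of matchLabels semantics: a selector matches when every key/value pair it lists is present on the object. The helper name match_labels is hypothetical and is not part of any OpenShift tooling; the label values are copied from the two CRs in this procedure.

```python
def match_labels(selector: dict[str, str], labels: dict[str, str]) -> bool:
    """Return True when every key/value pair in the selector is present in labels."""
    return all(labels.get(key) == value for key, value in selector.items())


# Labels from the MachineConfigPool CR saved in nro-machineconfig.yaml.
worker_pool_labels = {
    "cnf-worker-tuning": "enabled",
    "machineconfiguration.openshift.io/mco-built-in": "",
    "pools.operator.machineconfiguration.openshift.io/worker": "",
}

# machineConfigPoolSelector.matchLabels from the NUMAResourcesOperator CR in nrop.yaml.
nro_selector = {"pools.operator.machineconfiguration.openshift.io/worker": ""}

# The selector from nrop.yaml selects the worker pool.
print(match_labels(nro_selector, worker_pool_labels))

# A selector with a different value (made up for illustration) does not.
print(match_labels({"cnf-worker-tuning": "disabled"}, worker_pool_labels))
```

An empty-string label value still has to be present on the pool to match, which is why a missing key is treated as a mismatch in this sketch.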

Verification

Verify that the NUMA Resources Operator deployed successfully by running the following command:

$ oc get numaresourcesoperators.nodetopology.openshift.io

Example output

NAME                    AGE
numaresourcesoperator   10m

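6.3.2. Deploying the NUMA-aware secondary pod scheduler

After you install the NUMA Resources Operator, do the following to deploy the NUMA-aware secondary pod scheduler:

Configure the pod admittance policy for the required machine profile
Create the required machine config pool
Deploy the NUMA-aware secondary scheduler

Prerequisites

Install the OpenShift CLI (oc).
Log in as a user with cluster-admin privileges.
Install the NUMA Resources Operator.

Procedure

Create the KubeletConfig custom resource that configures the pod admittance policy for the machine profile:

Save the following YAML in the nro-kubeletconfig.yaml file: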

apiVersion: machineconfiguration.openshift.io/v1
kind: KubeletConfig
metadata:
  name: cnf-worker-tuning
spec:
  machineConfigPoolSelector:
    matchLabels:
      cnf-worker-tuning: enabled
  kubeletConfig:
    cpuManagerPolicy: "static" 
    cpuManagerReconcilePeriod: "5s"
    reservedSystemCPUs: "0,1"
    memoryManagerPolicy: "Static" 
    evictionHard:
      memory.available: "100Mi"
    kubeReserved:
      memory: "512Mi"
    reservedMemory:
      - numaNode: 0
        limits:
          memory: "1124Mi"
    systemReserved:
      memory: "512Mi"
    topologyManagerPolicy: "single-numa-node" 
    topologyManagerScope: "pod"

For cpuManagerPolicy, static must use a lowercase s.
For memoryManagerPolicy, Static must use an uppercase S.
topologyManagerPolicy must be set to single-numa-node.

Create the KubeletConfig custom resource (CR) by running the following command:
```
$ oc create -f nro-kubeletconfig.yaml
```
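
The three case-sensitive policy values called out for the KubeletConfig CR are easy to mistype, so a quick check can help. The following is a minimal sketch that compares a kubeletConfig mapping against those documented values. The function name check_kubelet_policies and the plain-dict input are assumptions made for illustration; the sketch does not read the YAML file or contact the cluster.

```python
# Values required by the notes for the KubeletConfig CR in this procedure.
REQUIRED_POLICIES = {
    "cpuManagerPolicy": "static",            # lowercase s
    "memoryManagerPolicy": "Static",         # uppercase S
    "topologyManagerPolicy": "single-numa-node",
}


def check_kubelet_policies(kubelet_config: dict) -> list[str]:
    """Return one message per field whose value differs from the documented one."""
    problems = []
    for field, expected in REQUIRED_POLICIES.items():
        actual = kubelet_config.get(field)
        if actual != expected:
            problems.append(f"{field}: expected {expected!r}, found {actual!r}")
    return problems


# The relevant fields of spec.kubeletConfig from nro-kubeletconfig.yaml.
good = {
    "cpuManagerPolicy": "static",
    "memoryManagerPolicy": "Static",
    "topologyManagerPolicy": "single-numa-node",
    "topologyManagerScope": "pod",
}

# A hypothetical copy with the letter case of two values swapped.
bad = dict(good, cpuManagerPolicy="Static", memoryManagerPolicy="static")

print(check_kubelet_policies(good))
for message in check_kubelet_policies(bad):
    print(message)
```

The comparison is an exact string match on purpose, because the notes above distinguish the two values only by letter case.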

Create the NUMAResourcesScheduler custom resource that deploys the NUMA-aware custom pod scheduler:

Save the following YAML in the nro-scheduler.yaml file:

apiVersion: nodetopology.openshift.io/v1alpha1
kind: NUMAResourcesScheduler
metadata:
  name: numaresourcesscheduler
spec:
  imageSpec: "registry.redhat.io/openshift4/noderesourcetopology-scheduler-container-rhel8:v4.12"

Create the NUMAResourcesScheduler CR by running the following command:
```
$ oc create -f nro-scheduler.yaml
```

Verification

Verify that the required resources deployed successfully by running the following command:

$ oc get all -n openshift-numaresources

Example output

NAME                                                    READY   STATUS    RESTARTS   AGE
pod/numaresources-controller-manager-7575848485-bns4s   1/1     Running   0          13m
pod/numaresourcesoperator-worker-dvj4n                  2/2     Running   0          16m
pod/numaresourcesoperator-worker-lcg4t                  2/2     Running   0          16m
pod/secondary-scheduler-56994cf6cf-7qf4q                1/1     Running   0          16m
NAME                                          DESIRED   CURRENT   READY   UP-TO-DATE   AVAILABLE   NODE SELECTOR                     AGE
daemonset.apps/numaresourcesoperator-worker   2         2         2       2            2           node-role.kubernetes.io/worker=   16m
NAME                                               READY   UP-TO-DATE   AVAILABLE   AGE
deployment.apps/numaresources-controller-manager   1/1     1            1           13m
deployment.apps/secondary-scheduler                1/1     1            1           16m
NAME                                                          DESIRED   CURRENT   READY   AGE
replicaset.apps/numaresources-controller-manager-7575848485   1         1         1       13m
replicaset.apps/secondary-scheduler-56994cf6cf                1         1         1       16m

Additional resources

About the Performance Profile Creator

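6.3.3. Scheduling workloads with the NUMA-aware scheduler

You can schedule workloads with the NUMA-aware scheduler by using Deployment CRs that specify the minimum required resources to process the workload.

The following example deployment uses NUMA-aware scheduling for a sample workload.

Prerequisites

Install the OpenShift CLI (oc).
Log in as a user with cluster-admin privileges.
Install the NUMA Resources Operator and deploy the NUMA-aware secondary scheduler.

Procedure

Get the name of the NUMA-aware scheduler that is deployed in the cluster by running the following command: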

$ oc get numaresourcesschedulers.nodetopology.openshift.io numaresourcesscheduler -o json | jq '.status.schedulerName'

Example output

topo-aware-scheduler

Create a Deployment CR that uses the scheduler named topo-aware-scheduler, for example:

Save the following YAML in the nro-deployment.yaml file:

apiVersion: apps/v1
kind: Deployment
metadata:
  name: numa-deployment-1
  namespace: openshift-numaresources
spec:
  replicas: 1
  selector:
    matchLabels:
      app: test
  template:
    metadata:
      labels:
        app: test
    spec:
      schedulerName: topo-aware-scheduler 
      containers:
      - name: ctnr
        image: quay.io/openshifttest/hello-openshift:openshift
        imagePullPolicy: IfNotPresent
        resources:
          limits:
            memory: "100Mi"
            cpu: "10"
          requests:
            memory: "100Mi"
            cpu: "10"
      - name: ctnr2
        image: registry.access.redhat.com/rhel:latest
        imagePullPolicy: IfNotPresent
        command: ["/bin/sh", "-c"]
        args: [ "while true; do sleep 1h; done;" ]
        resources:
          limits:
            memory: "100Mi"
            cpu: "8"
          requests:
            memory: "100Mi"
            cpu: "8"

schedulerName must match the name of the NUMA-aware scheduler that is deployed in your cluster, for example topo-aware-scheduler.

Create the Deployment CR by running the following command:
```
$ oc create -f nro-deployment.yaml
```

Verification

Verify that the deployment was successful:

$ oc get pods -n openshift-numaresources

Example output

NAME                                                READY   STATUS    RESTARTS   AGE
numa-deployment-1-56954b7b46-pfgw8                  2/2     Running   0          129m
numaresources-controller-manager-7575848485-bns4s   1/1     Running   0          15h
numaresourcesoperator-worker-dvj4n                  2/2     Running   0          18h
numaresourcesoperator-worker-lcg4t                  2/2     Running   0          16h
secondary-scheduler-56994cf6cf-7qf4q                1/1     Running   0          18h

Verify that topo-aware-scheduler is scheduling the deployed pod by running the following command:

$ oc describe pod numa-deployment-1-56954b7b46-pfgw8 -n openshift-numaresources

Example output

Events:
  Type    Reason          Age   From                  Message
  ----    ------          ----  ----                  -------
  Normal  Scheduled       130m  topo-aware-scheduler  Successfully assigned openshift-numaresources/numa-deployment-1-56954b7b46-pfgw8 to compute-0.example.com

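Note

Deployments that request more resources than are available for scheduling fail with a MinimumReplicasUnavailable error. The deployment succeeds when the required resources become available. Pods remain in the Pending state until the required resources are available.

Verify that the expected allocated resources are listed for the node by running the following command:

$ oc describe noderesourcetopologies.topology.node.k8s.io

Example output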

...

Zones:
  Costs:
    Name:   node-0
    Value:  10
    Name:   node-1
    Value:  21
  Name:     node-0
  Resources:
    Allocatable:  39
    Available:    21 
    Capacity:     40
    Name:         cpu
    Allocatable:  6442450944
    Available:    6442450944
    Capacity:     6442450944
    Name:         hugepages-1Gi
    Allocatable:  134217728
    Available:    134217728
    Capacity:     134217728
    Name:         hugepages-2Mi
    Allocatable:  262415904768
    Available:    262206189568
    Capacity:     270146007040
    Name:         memory
  Type:           Node

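The Available value for cpu is reduced because of the resources that have been allocated to a guaranteed pod.

Resources consumed by guaranteed pods are subtracted from the available node resources listed under noderesourcetopologies.topology.node.k8s.io.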
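
As a worked check of that subtraction, the cpu figures in the example output line up with the example Deployment: zone node-0 reports 39 allocatable CPUs, and the two containers of numa-deployment-1 request 10 and 8 CPUs. The sketch below assumes, which the output does not state, that this pod is the only guaranteed pod placed on zone node-0. The function name is hypothetical.

```python
def available_after_guaranteed(allocatable: int, guaranteed_requests: list[int]) -> int:
    """Subtract the CPU requests of guaranteed containers from the allocatable CPUs."""
    return allocatable - sum(guaranteed_requests)


# Allocatable cpu for zone node-0, taken from the example output.
allocatable_cpus = 39

# CPU requests of ctnr and ctnr2 in the numa-deployment-1 Deployment.
container_cpu_requests = [10, 8]

# 39 - (10 + 8) = 21, which matches the Available value in the example output.
print(available_after_guaranteed(allocatable_cpus, container_cpu_requests))
```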

Resource allocations for pods with a BestEffort or Burstable quality of service (qosClass) are not reflected in the NUMA node resources under noderesourcetopologies.topology.node.k8s.io. If the resources consumed by a pod are not reflected in the node resource calculation, verify that the pod has a qosClass of Guaranteed by running the following command:
```
$ oc get pod <pod_name> -n <pod_namespace> -o jsonpath="{ .status.qosClass }"
```
Example output
```
Guaranteed
```
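
To see why a pod ends up in a particular qosClass, the following simplified sketch applies the general Kubernetes rules to a list of container specs. A pod is Guaranteed when every container sets cpu and memory limits and the requests equal the limits, BestEffort when no container sets any cpu or memory request or limit, and Burstable otherwise. The function qos_class is hypothetical: it compares quantities as plain strings and ignores init containers, so treat it as an approximation of what the cluster reports in .status.qosClass rather than a replacement for it.

```python
def qos_class(containers: list[dict]) -> str:
    """Approximate the pod QoS class from container cpu/memory requests and limits."""
    any_set = False
    all_guaranteed = True
    for container in containers:
        resources = container.get("resources", {})
        limits = resources.get("limits", {})
        requests = resources.get("requests", {})
        for name in ("cpu", "memory"):
            limit = limits.get(name)
            # Kubernetes defaults a missing request to the limit when a limit is set.
            request = requests.get(name, limit)
            if limit is not None or request is not None:
                any_set = True
            if limit is None or request != limit:
                all_guaranteed = False
    if not any_set:
        return "BestEffort"
    return "Guaranteed" if all_guaranteed else "Burstable"


# Container resources from the numa-deployment-1 example: requests equal limits.
deployment_containers = [
    {"name": "ctnr", "resources": {"limits": {"cpu": "10", "memory": "100Mi"},
                                   "requests": {"cpu": "10", "memory": "100Mi"}}},
    {"name": "ctnr2", "resources": {"limits": {"cpu": "8", "memory": "100Mi"},
                                    "requests": {"cpu": "8", "memory": "100Mi"}}},
]

# Hypothetical variants for comparison: requests only, and no resources at all.
requests_only = [{"name": "ctnr", "resources": {"requests": {"cpu": "2"}}}]
no_resources = [{"name": "ctnr"}]

print(qos_class(deployment_containers))
print(qos_class(requests_only))
print(qos_class(no_resources))
```

Only the first variant would have its allocation reflected in the NUMA node resources described above.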
