Chapter 15. Performance Addon Operator for low latency nodes
15.1. Understanding low latency
The emergence of Edge computing in the area of Telco / 5G plays a key role in reducing latency and congestion problems and improving application performance.
Simply put, latency determines how fast data (packets) moves from the sender to the receiver and returns to the sender after processing by the receiver. Maintaining a network architecture with the lowest possible latency is key to meeting the network performance requirements of 5G. Compared to 4G technology, with an average latency of 50 ms, 5G targets latency of 1 ms or less. This reduction in latency boosts wireless throughput by a factor of 10.
Many of the applications deployed in the telco space require low latency and can only tolerate zero packet loss. Tuning for zero packet loss helps mitigate the inherent issues that degrade network performance. For more information, see Tuning for Zero Packet Loss in Red Hat OpenStack Platform (RHOSP).
The Edge computing initiative also comes into play in reducing latency rates. Think of it as literally being on the edge of the cloud and closer to the user. This greatly reduces the distance between the user and distant data centers, resulting in reduced application response times and performance latency.
Administrators must be able to manage their many Edge sites and local services in a centralized way so that all of the deployments can run at the lowest possible management cost. They also need an easy way to deploy and configure certain nodes of their cluster for real-time low latency and high-performance purposes. Low latency nodes are useful for applications such as Cloud-native Network Functions (CNF) and Data Plane Development Kit (DPDK).
OpenShift Container Platform currently provides mechanisms to tune software on an OpenShift Container Platform cluster for real-time running and low latency (around <20 microseconds reaction time). This includes tuning the kernel and OpenShift Container Platform set values, installing a kernel, and reconfiguring the machine. But this method requires setting up four different Operators and performing many configurations that, when done manually, are complex and prone to mistakes.
OpenShift Container Platform provides the Performance Addon Operator to implement automatic tuning in order to achieve low latency performance for OpenShift applications. The cluster administrator uses a performance profile configuration that makes these changes easier to apply in a more reliable way. The administrator can specify whether to update the kernel to kernel-rt, the CPUs that will be reserved for housekeeping, and the CPUs that will be used for running the workloads.
15.2. Installing the Performance Addon Operator
Performance Addon Operator provides the ability to enable advanced node performance tunings on a set of nodes. As a cluster administrator, you can install Performance Addon Operator using the OpenShift Container Platform CLI or the web console.
15.2.1. Installing the Operator using the CLI
As a cluster administrator, you can install the Operator using the CLI.
Prerequisites
- A cluster installed on bare-metal hardware.
- Install the OpenShift CLI (oc).
- Log in as a user with cluster-admin privileges.
Procedure
Create a namespace for the Performance Addon Operator by completing the following actions:
Create the following Namespace Custom Resource (CR) that defines the openshift-performance-addon-operator namespace, and then save the YAML in the pao-namespace.yaml file:

apiVersion: v1
kind: Namespace
metadata:
  name: openshift-performance-addon-operator

Create the namespace by running the following command:

$ oc create -f pao-namespace.yaml
Install the Performance Addon Operator in the namespace you created in the previous step by creating the following objects:
Create the following OperatorGroup CR and save the YAML in the pao-operatorgroup.yaml file:
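A minimal OperatorGroup sketch for this step, assuming the group simply targets the namespace created above (verify the exact fields against your environment), is:

apiVersion: operators.coreos.com/v1
kind: OperatorGroup
metadata:
  name: openshift-performance-addon-operator
  namespace: openshift-performance-addon-operator
spec:
  targetNamespaces:
  - openshift-performance-addon-operator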
Create the OperatorGroup CR by running the following command:

$ oc create -f pao-operatorgroup.yaml
Run the following command to get the channel value required for the next step:

$ oc get packagemanifest performance-addon-operator -n openshift-marketplace -o jsonpath='{.status.defaultChannel}'

Example output

4.6
Create the following Subscription CR and save the YAML in the pao-sub.yaml file:
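Example Subscription sketch, assuming the channel value returned by the previous command and the redhat-operators catalog source (adjust both to match your cluster):

apiVersion: operators.coreos.com/v1alpha1
kind: Subscription
metadata:
  name: openshift-performance-addon-operator-subscription
  namespace: openshift-performance-addon-operator
spec:
  channel: "4.6"
  name: performance-addon-operator
  source: redhat-operators
  sourceNamespace: openshift-marketplace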
Create the Subscription object by running the following command:

$ oc create -f pao-sub.yaml
Change to the openshift-performance-addon-operator project:

$ oc project openshift-performance-addon-operator
15.2.2. Installing the Performance Addon Operator using the web console

As a cluster administrator, you can install the Performance Addon Operator using the web console.

You must create the Namespace CR and OperatorGroup CR as mentioned in the previous section.
Procedure
Install the Performance Addon Operator using the OpenShift Container Platform web console:
- In the OpenShift Container Platform web console, click Operators → OperatorHub.
- Choose Performance Addon Operator from the list of available Operators, and then click Install.
- On the Install Operator page, under A specific namespace on the cluster select openshift-performance-addon-operator. Then, click Install.
Optional: Verify that the performance-addon-operator installed successfully:

- Switch to the Operators → Installed Operators page. Ensure that Performance Addon Operator is listed in the openshift-operators project with a Status of Succeeded.
Note: During installation an Operator might display a Failed status. If the installation later succeeds with a Succeeded message, you can ignore the Failed message.
If the Operator does not appear as installed, you can troubleshoot further:
- Go to the Operators → Installed Operators page and inspect the Operator Subscriptions and Install Plans tabs for any failure or errors under Status.
- Go to the Workloads → Pods page and check the logs for pods in the openshift-operators project.
15.3. Upgrading Performance Addon Operator
You can manually upgrade to the next minor version of Performance Addon Operator and monitor the status of an update by using the web console.
15.3.1. About upgrading Performance Addon Operator
- You can upgrade to the next minor version of Performance Addon Operator by using the OpenShift web console to change the channel of your Operator subscription.
- You can enable automatic z-stream updates during Performance Addon Operator installation.
- Updates are delivered via the Marketplace Operator, which is deployed during OpenShift Container Platform installation. The Marketplace Operator makes external Operators available to your cluster.
- The amount of time an update takes to complete depends on your network connection. Most automatic updates complete within fifteen minutes.
15.3.1.1. How Performance Addon Operator upgrades affect your cluster
- Neither the low latency tuning nor huge pages are affected.
- Updating the Operator should not cause any unexpected reboots.
15.3.1.2. Upgrading Performance Addon Operator to the next minor version
You can manually upgrade Performance Addon Operator to the next minor version by using the OpenShift Container Platform web console to change the channel of your Operator subscription.
Prerequisites
- Access to the cluster as a user with the cluster-admin role.
Procedure
- Access the OpenShift web console and navigate to Operators → Installed Operators.
- Click Performance Addon Operator to open the Operator Details page.
- Click the Subscription tab to open the Subscription Overview page.
- In the Channel pane, click the pencil icon on the right side of the version number to open the Change Subscription Update Channel window.
- Select the next minor version. For example, if you want to upgrade to Performance Addon Operator 4.6, select 4.6.
- Click Save.
Check the status of the upgrade by navigating to Operators → Installed Operators. You can also check the status by running the following oc command:

$ oc get csv -n openshift-performance-addon-operator
15.3.2. Monitoring upgrade status

The best way to monitor Performance Addon Operator upgrade status is to watch the ClusterServiceVersion (CSV) PHASE. You can also monitor the CSV conditions in the web console or by running the oc get csv command.

The PHASE and conditions values are approximations that are based on available information.
Prerequisites
- Access to the cluster as a user with the cluster-admin role.
- Install the OpenShift CLI (oc).
Procedure
Run the following command:
$ oc get csv

Review the output, checking the PHASE field. For example:

VERSION   REPLACES                            PHASE
4.6.0     performance-addon-operator.v4.5.0   Installing
4.5.0                                         Replacing

Run get csv again to verify the output:

$ oc get csv

Example output

NAME                                DISPLAY                      VERSION   REPLACES                            PHASE
performance-addon-operator.v4.5.0   Performance Addon Operator   4.6.0     performance-addon-operator.v4.5.0   Succeeded
15.4. Provisioning real-time and low latency workloads

Many industries and organizations need extremely high performance computing and might require low and predictable latency, especially in the financial and telecommunications industries. For these industries, with their unique requirements, OpenShift Container Platform provides a Performance Addon Operator to implement automatic tuning to achieve low latency performance and consistent response time for OpenShift Container Platform applications.

The cluster administrator uses a performance profile configuration that makes these changes easier to apply in a more reliable way. The administrator can specify whether to update the kernel to kernel-rt (real-time), the CPUs that will be reserved for housekeeping, and the CPUs that are used for running the workloads.
The usage of execution probes in conjunction with applications that require guaranteed CPUs can cause latency spikes. It is recommended to use other probes, such as a properly configured set of network probes, as an alternative.
15.4.1. Known limitations for real-time
The RT kernel is only supported on worker nodes.
To fully utilize the real-time mode, the containers must run with elevated privileges. See Set capabilities for a Container for information on granting privileges.
OpenShift Container Platform restricts the allowed capabilities, so you might need to create a SecurityContext as well.
This procedure is fully supported with bare metal installations using Red Hat Enterprise Linux CoreOS (RHCOS) systems.
Establish the right performance expectations: the real-time kernel is not a panacea. Its objective is consistent, low-latency determinism offering predictable response times. There is some additional kernel overhead associated with the real-time kernel, due primarily to handling hardware interrupts in separately scheduled threads. The increased overhead in some workloads results in some degradation in overall throughput. The exact amount of degradation is very workload dependent, ranging from 0% to 30%. However, it is the cost of determinism.
15.4.2. Provisioning a worker with real-time capabilities
- Install Performance Addon Operator to the cluster.
- Optional: Add a node to the OpenShift Container Platform cluster. See Setting BIOS parameters.
- Add the label worker-rt to the worker nodes that require the real-time capability by using the oc command.
- Create a new machine config pool for real-time nodes:
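A sketch of the machine config pool, assuming the standard MachineConfigPool schema and the worker-rt role label used in this procedure, is:

apiVersion: machineconfiguration.openshift.io/v1
kind: MachineConfigPool
metadata:
  name: worker-rt
  labels:
    machineconfiguration.openshift.io/role: worker-rt
spec:
  machineConfigSelector:
    matchExpressions:
      - key: machineconfiguration.openshift.io/role
        operator: In
        values: [worker, worker-rt]
  paused: false
  nodeSelector:
    matchLabels:
      node-role.kubernetes.io/worker-rt: ""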
Note that a machine config pool worker-rt is created for the group of nodes that have the worker-rt label.

- Add the node to the proper machine config pool by using node role labels.
Note: You must decide which nodes are configured with real-time workloads. You could configure all of the nodes in the cluster, or a subset of the nodes. The Performance Addon Operator expects all of the nodes to be part of a dedicated machine config pool. If you use all of the nodes, you must point the Performance Addon Operator to the worker node role label. If you use a subset, you must group the nodes into a new machine config pool.
- Create the PerformanceProfile with the proper set of housekeeping cores and realTimeKernel: enabled: true.
- You must set machineConfigPoolSelector in PerformanceProfile:
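A sketch of such a profile, assuming the worker-rt pool created above (the CPU ranges are placeholders for your hardware), is:

apiVersion: performance.openshift.io/v1
kind: PerformanceProfile
metadata:
  name: example-performanceprofile
spec:
  cpu:
    reserved: "0-1"
    isolated: "2-7"
  realTimeKernel:
    enabled: true
  nodeSelector:
    node-role.kubernetes.io/worker-rt: ""
  machineConfigPoolSelector:
    machineconfiguration.openshift.io/role: worker-rt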
Verify that a matching machine config pool exists with a label:

$ oc describe mcp/worker-rt

Example output
Name:         worker-rt
Namespace:
Labels:       machineconfiguration.openshift.io/role=worker-rt

- OpenShift Container Platform will start configuring the nodes, which might involve multiple reboots. Wait for the nodes to settle. This can take a long time depending on the specific hardware you use, but 20 minutes per node is expected.
- Verify everything is working as expected.
15.4.3. Verifying the real-time kernel installation

Use this command to verify that the real-time kernel is installed:

$ oc get node -o wide

Note the worker with the role worker-rt whose kernel version contains the string 4.18.0-211.rt5.23.el8.x86_64.
15.4.4. Creating a workload that works in real-time

Use the following procedures for preparing a workload that will use real-time capabilities.

Procedure

- Create a pod with a QoS class of Guaranteed.
- Optional: Disable CPU load balancing for DPDK.
- Assign a proper node selector.
When writing your applications, follow the general recommendations described in Application tuning and deployment.
15.4.5. Creating a pod with a QoS class of Guaranteed

Keep the following in mind when you create a pod that is given a QoS class of Guaranteed:
- Every container in the pod must have a memory limit and a memory request, and they must be the same.
- Every container in the pod must have a CPU limit and a CPU request, and they must be the same.
The following example shows the configuration file for a pod that has one container. The container has a memory limit and a memory request, both equal to 200 MiB. The container has a CPU limit and a CPU request, both equal to 1 CPU.
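A pod sketch that matches this description and the qos-demo / qos-example names used in the commands below (the container image is a placeholder):

apiVersion: v1
kind: Pod
metadata:
  name: qos-demo
  namespace: qos-example
spec:
  containers:
  - name: qos-demo-ctr
    image: <image>
    resources:
      limits:
        memory: "200Mi"
        cpu: "1"
      requests:
        memory: "200Mi"
        cpu: "1"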
Create the pod:
$ oc apply -f qos-pod.yaml --namespace=qos-example

View detailed information about the pod:

$ oc get pod qos-demo --namespace=qos-example --output=yaml

Example output

spec:
  containers:
    ...
status:
  qosClass: Guaranteed

Note: If a container specifies its own memory limit, but does not specify a memory request, OpenShift Container Platform automatically assigns a memory request that matches the limit. Similarly, if a container specifies its own CPU limit, but does not specify a CPU request, OpenShift Container Platform automatically assigns a CPU request that matches the limit.
15.4.6. Optional: Disabling CPU load balancing for DPDK

Functionality to disable or enable CPU load balancing is implemented at the CRI-O level. The code under CRI-O disables or enables CPU load balancing only when the following requirements are met:

- The pod must use the performance-<profile-name> runtime class. You can get the proper name by looking at the status of the performance profile, as shown in the sketch after this list.
- The pod must have the cpu-load-balancing.crio.io: true annotation.
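A sketch of the relevant status field, assuming a profile named manual and showing only the field of interest, is:

apiVersion: performance.openshift.io/v1
kind: PerformanceProfile
...
status:
  ...
  runtimeClass: performance-manual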
The Performance Addon Operator is responsible for the creation of the high-performance runtime handler config snippet on relevant nodes and for the creation of the high-performance runtime class in the cluster. It has the same content as the default runtime handler, except that it enables the CPU load balancing configuration functionality.
To disable the CPU load balancing for the pod, the Pod specification must include the following fields:
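A sketch showing only the relevant fields, assuming the annotation and runtime class described above (replace <profile_name> with the name from your performance profile status):

apiVersion: v1
kind: Pod
metadata:
  ...
  annotations:
    cpu-load-balancing.crio.io: "true"
spec:
  ...
  runtimeClassName: performance-<profile_name>
  ...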
Only disable CPU load balancing when the CPU manager static policy is enabled and for pods with guaranteed QoS that use whole CPUs. Otherwise, disabling CPU load balancing can affect the performance of other containers in the cluster.
15.4.7. Assigning a proper node selector
The preferred way to assign a pod to nodes is to use the same node selector the performance profile used, as shown here:
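A sketch, assuming the node role label used by the performance profiles in this chapter (the pod name and image are placeholders):

apiVersion: v1
kind: Pod
metadata:
  name: example-pod
spec:
  nodeSelector:
    node-role.kubernetes.io/worker-rt: ""
  containers:
  - name: example
    image: <image>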
For more information, see Placing pods on specific nodes using node selectors.
15.4.8. Scheduling a workload onto a worker with real-time capabilities
Use label selectors that match the nodes attached to the machine config pool that was configured for low latency by the Performance Addon Operator. For more information, see Assigning pods to nodes.
15.5. Configuring huge pages
Nodes must pre-allocate huge pages used in an OpenShift Container Platform cluster. Use the Performance Addon Operator to allocate huge pages on a specific node.
OpenShift Container Platform provides a method for creating and allocating huge pages. Performance Addon Operator provides an easier method for doing this using the performance profile.
For example, in the hugepages pages section of the performance profile, you can specify multiple blocks of size, count, and, optionally, node:
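A sketch of the hugepages section (sizes and counts are placeholders); callout 1 corresponds to the note that follows:

spec:
  hugepages:
    defaultHugepagesSize: "1G"
    pages:
    - size: "1G"
      count: 4
      node: 0  # 1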
1. node is the NUMA node in which the huge pages are allocated. If you omit node, the pages are evenly spread across all NUMA nodes.
Wait for the relevant machine config pool status that indicates the update is finished.
These are the only configuration steps you need to do to allocate huge pages.
Verification
To verify the configuration, see the /proc/meminfo file on the node:

$ oc debug node/ip-10-0-141-105.ec2.internal

# grep -i huge /proc/meminfo

Example output
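For illustration, the huge page fields in /proc/meminfo look similar to the following (the values shown are placeholders):

AnonHugePages:         0 kB
ShmemHugePages:        0 kB
HugePages_Total:       4
HugePages_Free:        4
HugePages_Rsvd:        0
HugePages_Surp:        0
Hugepagesize:    1048576 kB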
Use oc describe to report the new size:

$ oc describe node worker-0.ocp4poc.example.com | grep -i huge

Example output

hugepages-1g=true
hugepages-###: ###
hugepages-###: ###
15.6. Allocating multiple huge page sizes

You can request huge pages with different sizes under the same container. This allows you to define more complicated pods consisting of containers with different huge page size needs.

For example, you can define sizes 1G and 2M and the Performance Addon Operator will configure both sizes on the node, as shown here:
15.7. Restricting CPUs for infra and application containers
Generic housekeeping and workload tasks use CPUs in a way that may impact latency-sensitive processes. By default, the container runtime uses all online CPUs to run all containers together, which can result in context switches and spikes in latency. Partitioning the CPUs prevents noisy processes from interfering with latency-sensitive processes by separating them from each other. The following table describes how processes run on a CPU after you have tuned the node using the Performance Addon Operator:
| Process type | Details |
| --- | --- |
| Burstable and BestEffort pods | Runs on any CPU except where low latency workload is running |
| Infrastructure pods | Runs on any CPU except where low latency workload is running |
| Interrupts | Redirects to reserved CPUs (optional in OpenShift Container Platform 4.6 and later) |
| Kernel processes | Pins to reserved CPUs |
| Latency-sensitive workload pods | Pins to a specific set of exclusive CPUs from the isolated pool |
| OS processes/systemd services | Pins to reserved CPUs |
The allocatable capacity of cores on a node for pods of all QoS process types, Burstable, BestEffort, or Guaranteed, is equal to the capacity of the isolated pool. The capacity of the reserved pool is removed from the node’s total core capacity for use by the cluster and operating system housekeeping duties.
Example 1

A node features a capacity of 100 cores. Using a performance profile, the cluster administrator allocates 50 cores to the isolated pool and 50 cores to the reserved pool. The cluster administrator assigns 25 cores to QoS Guaranteed pods and 25 cores for BestEffort or Burstable pods. This matches the capacity of the isolated pool.
Example 2

A node features a capacity of 100 cores. Using a performance profile, the cluster administrator allocates 50 cores to the isolated pool and 50 cores to the reserved pool. The cluster administrator assigns 50 cores to QoS Guaranteed pods and one core for BestEffort or Burstable pods. This exceeds the capacity of the isolated pool by one core. Pod scheduling fails because of insufficient CPU capacity.
The exact partitioning pattern to use depends on many factors like hardware, workload characteristics and the expected system load. Some sample use cases are as follows:
- If the latency-sensitive workload uses specific hardware, such as a network interface card (NIC), ensure that the CPUs in the isolated pool are as close as possible to this hardware. At a minimum, you should place the workload in the same Non-Uniform Memory Access (NUMA) node.
- The reserved pool is used for handling all interrupts. If your workload depends on system networking, allocate a sufficiently-sized reserved pool to handle all the incoming packet interrupts. In 4.6 and later versions, workloads can optionally be labeled as sensitive. The decision regarding which specific CPUs should be used for reserved and isolated partitions requires detailed analysis and measurements. Factors like NUMA affinity of devices and memory play a role. The selection also depends on the workload architecture and the specific use case.
The reserved and isolated CPU pools must not overlap and together must span all available cores in the worker node.
To ensure that housekeeping tasks and workloads do not interfere with each other, specify two groups of CPUs in the spec section of the performance profile.
isolated - Specifies the CPUs for the application container workloads. Workloads running on these CPUs experience the lowest latency as well as zero interruptions and can, for example, reach high zero packet loss bandwidth.

reserved - Specifies the CPUs for the cluster and operating system housekeeping duties. Threads in the reserved group are often busy. Do not run latency-sensitive applications in the reserved group. Latency-sensitive applications run in the isolated group.

Procedure

- Create a performance profile appropriate for the environment’s hardware and topology.
- Add the reserved and isolated parameters with the CPUs you want reserved and isolated for the infra and application containers:
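A sketch of the cpu section (the profile name and CPU ranges are placeholders for your topology):

apiVersion: performance.openshift.io/v1
kind: PerformanceProfile
metadata:
  name: infra-cpus
spec:
  cpu:
    reserved: "0-4,9"   # CPUs for infra containers and housekeeping
    isolated: "5-8"     # CPUs for application containers
  nodeSelector:
    node-role.kubernetes.io/worker: ""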
15.8. Tuning nodes for low latency with the performance profile

The performance profile lets you control latency tuning aspects of nodes that belong to a certain machine config pool. After you specify your settings, the PerformanceProfile object is compiled into multiple objects that perform the actual node level tuning:

- A MachineConfig file that manipulates the nodes.
- A KubeletConfig file that configures the Topology Manager, the CPU Manager, and the OpenShift Container Platform nodes.
- The Tuned profile that configures the Node Tuning Operator.
Procedure
- Prepare a cluster.
- Create a machine config pool.
- Install the Performance Addon Operator.
Create a performance profile that is appropriate for your hardware and topology. In the performance profile, you can specify whether to update the kernel to kernel-rt, allocation of huge pages, the CPUs that will be reserved for operating system housekeeping processes and CPUs that will be used for running the workloads.
This is a typical performance profile:
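The following is a sketch of a typical profile; the CPU ranges, huge page counts, and node selector are placeholders that you must adapt to your hardware. Callouts 1 and 2 correspond to the notes that follow.

apiVersion: performance.openshift.io/v1
kind: PerformanceProfile
metadata:
  name: performance
spec:
  cpu:
    isolated: "1-3"
    reserved: "0"
  hugepages:
    defaultHugepagesSize: 1G
    pages:
    - count: 16
      node: 0
      size: 1G
  realTimeKernel:
    enabled: true  # 1
  numa:
    topologyPolicy: "best-effort"  # 2
  nodeSelector:
    node-role.kubernetes.io/worker-cnf: ""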
1. Valid values are true or false. Setting the true value installs the real-time kernel on the node.
2. Use this field to configure the topology manager policy. Valid values are none (default), best-effort, restricted, and single-numa-node. For more information, see Topology Manager Policies.
15.9. Performing end-to-end tests for platform verification
The Cloud-native Network Functions (CNF) tests image is a containerized test suite that validates features required to run CNF payloads. You can use this image to validate a CNF-enabled OpenShift cluster where all the components required for running CNF workloads are installed.
The tests run by the image are split into three different phases:
- Simple cluster validation
- Setup
- End to end tests
The validation phase checks that all the features required to be tested are deployed correctly on the cluster.
Validations include:
- Targeting a machine config pool to which the machines to be tested belong
- Enabling SCTP on the nodes
- Having the Performance Addon Operator installed
- Having the SR-IOV Operator installed
- Having the PTP Operator installed
- Using OVN kubernetes as the SDN
The tests need to perform an environment configuration every time they are executed. This involves items such as creating SR-IOV node policies, performance profiles, or PTP profiles. Allowing the tests to configure an already configured cluster might affect the functionality of the cluster. Also, changes to configuration items such as SR-IOV node policy might result in the environment being temporarily unavailable until the configuration change is processed.
15.9.1. Prerequisites

- The test entrypoint is /usr/bin/test-run.sh. It runs both a setup test set and the real conformance test suite. The minimum requirement is to provide it with a kubeconfig file and its related $KUBECONFIG environment variable, mounted through a volume.
- The tests assume that a given feature is already available on the cluster in the form of an Operator, flags enabled on the cluster, or machine configs.
Some tests require a pre-existing machine config pool to append their changes to. This must be created on the cluster before running the tests.
The default worker pool is worker-cnf and can be created with the following manifest:
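A sketch of such a manifest, assuming the standard MachineConfigPool schema and the worker-cnf role used by the tests, is:

apiVersion: machineconfiguration.openshift.io/v1
kind: MachineConfigPool
metadata:
  name: worker-cnf
  labels:
    machineconfiguration.openshift.io/role: worker-cnf
spec:
  machineConfigSelector:
    matchExpressions:
      - key: machineconfiguration.openshift.io/role
        operator: In
        values: [worker, worker-cnf]
  paused: false
  nodeSelector:
    matchLabels:
      node-role.kubernetes.io/worker-cnf: ""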
You can use the ROLE_WORKER_CNF variable to override the worker pool name:

$ docker run -v $(pwd)/:/kubeconfig -e KUBECONFIG=/kubeconfig/kubeconfig -e ROLE_WORKER_CNF=custom-worker-pool registry.redhat.io/openshift4/cnf-tests-rhel8:v4.6 /usr/bin/test-run.sh

Note: Currently, not all tests run selectively on the nodes belonging to the pool.
15.9.2. Running the tests

Assuming the kubeconfig file is in the current folder, the command for running the test suite is:

$ docker run -v $(pwd)/:/kubeconfig -e KUBECONFIG=/kubeconfig/kubeconfig registry.redhat.io/openshift4/cnf-tests-rhel8:v4.6 /usr/bin/test-run.sh
This allows your kubeconfig file to be consumed from inside the running container.
15.9.3. Image parameters
Depending on the requirements, the tests can use different images. There are two images used by the tests that can be changed using the following environment variables:
- CNF_TESTS_IMAGE
- DPDK_TESTS_IMAGE

For example, to change the CNF_TESTS_IMAGE with a custom registry, run the following command:
$ docker run -v $(pwd)/:/kubeconfig -e KUBECONFIG=/kubeconfig/kubeconfig -e CNF_TESTS_IMAGE="custom-cnf-tests-image:latests" registry.redhat.io/openshift4/cnf-tests-rhel8:v4.6 /usr/bin/test-run.sh
15.9.3.1. Ginkgo parameters
The test suite is built upon the ginkgo BDD framework. This means that it accepts parameters for filtering or skipping tests.
You can use the -ginkgo.focus parameter to filter a set of tests:
$ docker run -v $(pwd)/:/kubeconfig -e KUBECONFIG=/kubeconfig/kubeconfig registry.redhat.io/openshift4/cnf-tests-rhel8:v4.6 /usr/bin/test-run.sh -ginkgo.focus="performance|sctp"
There is a particular test that requires both SR-IOV and SCTP. Given the selective nature of the focus parameter, this test is triggered by only placing the sriov matcher. If the tests are executed against a cluster where SR-IOV is installed but SCTP is not, adding the -ginkgo.skip=SCTP parameter causes the tests to skip SCTP testing.
15.9.3.2. Available features

The set of available features to filter is:

- performance
- sriov
- ptp
- sctp
- dpdk
15.9.4. Dry run

Use this command to run in dry-run mode. This is useful for checking what is in the test suite and provides output for all of the tests the image would run.
$ docker run -v $(pwd)/:/kubeconfig -e KUBECONFIG=/kubeconfig/kubeconfig registry.redhat.io/openshift4/cnf-tests-rhel8:v4.6 /usr/bin/test-run.sh -ginkgo.dryRun -ginkgo.v
15.9.5. Disconnected mode

The CNF tests image supports running tests in a disconnected cluster, meaning a cluster that is not able to reach outer registries. This is done in two steps:
- Performing the mirroring.
- Instructing the tests to consume the images from a custom registry.
15.9.5.1. Mirroring the images to a custom registry accessible from the cluster

A mirror executable is shipped in the image to provide the input required by oc to mirror the images needed to run the tests to a local registry.
Run this command from an intermediate machine that has access both to the cluster and to registry.redhat.io over the Internet:
$ docker run -v $(pwd)/:/kubeconfig -e KUBECONFIG=/kubeconfig/kubeconfig registry.redhat.io/openshift4/cnf-tests-rhel8:v4.6 /usr/bin/mirror -registry my.local.registry:5000/ | oc image mirror -f -
Then, follow the instructions in the following section about overriding the registry used to fetch the images.
15.9.5.2. Instruct the tests to consume those images from a custom registry

This is done by setting the IMAGE_REGISTRY environment variable:
$ docker run -v $(pwd)/:/kubeconfig -e KUBECONFIG=/kubeconfig/kubeconfig -e IMAGE_REGISTRY="my.local.registry:5000/" -e CNF_TESTS_IMAGE="custom-cnf-tests-image:latests" registry.redhat.io/openshift4/cnf-tests-rhel8:v4.6 /usr/bin/test-run.sh
15.9.5.3. Mirroring to the cluster internal registry
OpenShift Container Platform provides a built-in container image registry, which runs as a standard workload on the cluster.
Procedure
Gain external access to the registry by exposing it with a route:
$ oc patch configs.imageregistry.operator.openshift.io/cluster --patch '{"spec":{"defaultRoute":true}}' --type=merge

Fetch the registry endpoint:
REGISTRY=$(oc get route default-route -n openshift-image-registry --template='{{ .spec.host }}')

Create a namespace for exposing the images:
$ oc create ns cnftests

Make that image stream available to all the namespaces used for tests. This is required to allow the test namespaces to fetch the images from the cnftests image stream.
$ oc policy add-role-to-user system:image-puller system:serviceaccount:sctptest:default --namespace=cnftests
$ oc policy add-role-to-user system:image-puller system:serviceaccount:cnf-features-testing:default --namespace=cnftests
$ oc policy add-role-to-user system:image-puller system:serviceaccount:performance-addon-operators-testing:default --namespace=cnftests
$ oc policy add-role-to-user system:image-puller system:serviceaccount:dpdk-testing:default --namespace=cnftests
$ oc policy add-role-to-user system:image-puller system:serviceaccount:sriov-conformance-testing:default --namespace=cnftests
Retrieve the docker secret name and auth token:
SECRET=$(oc -n cnftests get secret | grep builder-docker | awk {'print $1'})
TOKEN=$(oc -n cnftests get secret $SECRET -o jsonpath="{.data['\.dockercfg']}" | base64 --decode | jq '.["image-registry.openshift-image-registry.svc:5000"].auth')
Write a dockerauth.json similar to this:
echo "{\"auths\": { \"$REGISTRY\": { \"auth\": $TOKEN } }}" > dockerauth.json
Do the mirroring:
$ docker run -v $(pwd)/:/kubeconfig -e KUBECONFIG=/kubeconfig/kubeconfig registry.redhat.io/openshift4/cnf-tests-rhel8:v4.6 /usr/bin/mirror -registry $REGISTRY/cnftests | oc image mirror --insecure=true -a=$(pwd)/dockerauth.json -f -
Run the tests:
$ docker run -v $(pwd)/:/kubeconfig -e KUBECONFIG=/kubeconfig/kubeconfig -e IMAGE_REGISTRY=image-registry.openshift-image-registry.svc:5000/cnftests cnf-tests-local:latest /usr/bin/test-run.sh
15.9.5.4. Mirroring a different set of images

Procedure

The mirror command tries to mirror the u/s images by default. This can be overridden by passing a file with the following format to the image:
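A sketch of the file, assuming a JSON list of registry/image pairs (the registry and image names are placeholders):

[
    {
        "registry": "public.registry.io:5000",
        "image": "imageforcnftests:4.6"
    },
    {
        "registry": "public.registry.io:5000",
        "image": "imagefordpdk:4.6"
    }
]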
Pass it to the mirror command, for example saving it locally as images.json. With the following command, the local path is mounted in /kubeconfig inside the container and that can be passed to the mirror command.
$ docker run -v $(pwd)/:/kubeconfig -e KUBECONFIG=/kubeconfig/kubeconfig registry.redhat.io/openshift4/cnf-tests-rhel8:v4.6 /usr/bin/mirror --registry "my.local.registry:5000/" --images "/kubeconfig/images.json" | oc image mirror -f -
15.9.6. Discovery mode
Discovery mode allows you to validate the functionality of a cluster without altering its configuration. Existing environment configurations are used for the tests. The tests attempt to find the configuration items needed and use those items to execute the tests. If resources needed to run a specific test are not found, the test is skipped, providing an appropriate message to the user. After the tests are finished, no cleanup of the pre-configured configuration items is done, and the test environment can be immediately used for another test run.
Some configuration items are still created by the tests. These are specific items needed for a test to run; for example, a SR-IOV Network. These configuration items are created in custom namespaces and are cleaned up after the tests are executed.
An additional bonus is a reduction in test run times. As the configuration items are already there, no time is needed for environment configuration and stabilization.
To enable discovery mode, the tests must be instructed by setting the DISCOVERY_MODE environment variable as follows:

$ docker run -v $(pwd)/:/kubeconfig:Z -e KUBECONFIG=/kubeconfig/kubeconfig -e DISCOVERY_MODE=true registry.redhat.io/openshift-kni/cnf-tests /usr/bin/test-run.sh
15.9.6.1. Required environment configuration prerequisites

SR-IOV tests

Most SR-IOV tests require the following resources:

- SriovNetworkNodePolicy.
- At least one node with the resource specified by SriovNetworkNodePolicy being allocatable; a resource count of at least 5 is considered sufficient.
Some tests have additional requirements:
- An unused device on the node with available policy resource, with link state DOWN and not a bridge slave.
- A SriovNetworkNodePolicy with a MTU value of 9000.
DPDK tests
The DPDK related tests require:
- A performance profile.
- A SR-IOV policy.
- A node with resources available for the SR-IOV policy and available with the PerformanceProfile node selector.
PTP tests
- A slave PtpConfig (ptp4lOpts="-s", phc2sysOpts="-a -r").
- A node with a label matching the slave PtpConfig.
SCTP tests
- SriovNetworkNodePolicy.
- A node matching both the SriovNetworkNodePolicy and a MachineConfig that enables SCTP.
Performance Operator tests
Various tests have different requirements. Some of them are:
- A performance profile.
- A performance profile having profile.Spec.CPU.Isolated = 1.
- A performance profile having profile.Spec.RealTimeKernel.Enabled == true.
- A node with no huge pages usage.
15.9.6.2. Limiting the nodes used during tests

The nodes on which the tests are executed can be limited by specifying a NODES_SELECTOR environment variable. Any resources created by the test are then limited to the specified nodes.

$ docker run -v $(pwd)/:/kubeconfig:Z -e KUBECONFIG=/kubeconfig/kubeconfig -e NODES_SELECTOR=node-role.kubernetes.io/worker-cnf registry.redhat.io/openshift-kni/cnf-tests /usr/bin/test-run.sh
15.9.6.3. Using a single performance profile
The resources needed by the DPDK tests are higher than those required by the performance test suite. To make the execution faster, the performance profile used by tests can be overridden using one that also serves the DPDK test suite.
To do this, a profile like the following one can be mounted inside the container, and the performance tests can be instructed to deploy it.
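A sketch of a profile sized to serve both suites; the CPU ranges, huge page counts, and node selector are placeholders that must match your hardware:

apiVersion: performance.openshift.io/v1
kind: PerformanceProfile
metadata:
  name: performance
spec:
  cpu:
    isolated: "4-15"
    reserved: "0-3"
  hugepages:
    defaultHugepagesSize: "1G"
    pages:
    - size: "1G"
      count: 16
      node: 0
  realTimeKernel:
    enabled: true
  nodeSelector:
    node-role.kubernetes.io/worker-cnf: ""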
To override the performance profile used, the manifest must be mounted inside the container and the tests must be instructed by setting the PERFORMANCE_PROFILE_MANIFEST_OVERRIDE parameter as follows:

$ docker run -v $(pwd)/:/kubeconfig:Z -e KUBECONFIG=/kubeconfig/kubeconfig -e PERFORMANCE_PROFILE_MANIFEST_OVERRIDE=/kubeconfig/manifest.yaml registry.redhat.io/openshift-kni/cnf-tests /usr/bin/test-run.sh
15.9.6.4. Disabling the performance profile cleanup
When not running in discovery mode, the suite cleans up all the created artifacts and configurations. This includes the performance profile.
When deleting the performance profile, the machine config pool is modified and nodes are rebooted. After a new iteration, a new profile is created. This causes long test cycles between runs.
To speed up this process, set CLEAN_PERFORMANCE_PROFILE="false" to instruct the tests not to clean the performance profile. In this way, the next iteration will not need to create it and wait for it to be applied.

$ docker run -v $(pwd)/:/kubeconfig:Z -e KUBECONFIG=/kubeconfig/kubeconfig -e CLEAN_PERFORMANCE_PROFILE="false" registry.redhat.io/openshift-kni/cnf-tests /usr/bin/test-run.sh
15.9.7. Troubleshooting

The cluster must be reachable from within the container. You can verify this by running:

$ docker run -v $(pwd)/:/kubeconfig -e KUBECONFIG=/kubeconfig/kubeconfig registry.redhat.io/openshift-kni/cnf-tests oc get nodes

If this does not work, the cause could be DNS, MTU size, or firewall issues.
15.9.8. Test reports
CNF end-to-end tests produce two outputs: a JUnit test output and a test failure report.
15.9.8.1. JUnit test output

A JUnit-compliant XML is produced by passing the --junit parameter together with the path where the report is dumped:
$ docker run -v $(pwd)/:/kubeconfig -v $(pwd)/junitdest:/path/to/junit -e KUBECONFIG=/kubeconfig/kubeconfig registry.redhat.io/openshift4/cnf-tests-rhel8:v4.6 /usr/bin/test-run.sh --junit /path/to/junit
15.9.8.2. Test failure report

A report with information about the cluster state and resources for troubleshooting can be produced by passing the --report parameter with the path where the report is dumped:
$ docker run -v $(pwd)/:/kubeconfig -v $(pwd)/reportdest:/path/to/report -e KUBECONFIG=/kubeconfig/kubeconfig registry.redhat.io/openshift4/cnf-tests-rhel8:v4.6 /usr/bin/test-run.sh --report /path/to/report
15.9.8.3. A note on podman

When executing podman as non-root and non-privileged, mounting paths can fail with "permission denied" errors. To make it work, append :Z to the volumes creation; for example, -v $(pwd)/:/kubeconfig:Z to allow podman to do the proper SELinux relabeling.
15.9.8.4. Running on OpenShift Container Platform 4.4
With the exception of the following, the CNF end-to-end tests are compatible with OpenShift Container Platform 4.4:
[test_id:28466][crit:high][vendor:cnf-qe@redhat.com][level:acceptance] Should contain configuration injected through openshift-node-performance profile
[test_id:28467][crit:high][vendor:cnf-qe@redhat.com][level:acceptance] Should contain configuration injected through the openshift-node-performance profile
You can skip these tests by adding the -ginkgo.skip "28466|28467" parameter.
15.9.8.5. Using a single performance profile
The DPDK tests require more resources than what is required by the performance test suite. To make the execution faster, you can override the performance profile used by the tests using a profile that also serves the DPDK test suite.
To do this, use a profile like the one shown earlier in "Using a single performance profile", mount it inside the container, and instruct the performance tests to deploy it.
To override the performance profile, the manifest must be mounted inside the container and the tests must be instructed by setting the PERFORMANCE_PROFILE_MANIFEST_OVERRIDE parameter:
$ docker run -v $(pwd)/:/kubeconfig:Z -e KUBECONFIG=/kubeconfig/kubeconfig -e PERFORMANCE_PROFILE_MANIFEST_OVERRIDE=/kubeconfig/manifest.yaml registry.redhat.io/openshift4/cnf-tests-rhel8:v4.6 /usr/bin/test-run.sh
15.9.9. Impacts on the cluster
Depending on the feature, running the test suite could cause different impacts on the cluster. In general, only the SCTP tests do not change the cluster configuration. All of the other features have various impacts on the configuration.
15.9.9.1. SCTP
SCTP tests just run different pods on different nodes to check connectivity. The impacts on the cluster are related to running simple pods on two nodes.
15.9.9.2. SR-IOV
SR-IOV tests require changes in the SR-IOV network configuration, where the tests create and destroy different types of configuration.
This might have an impact if existing SR-IOV network configurations are already installed on the cluster, because there may be conflicts depending on the priority of such configurations.
At the same time, the result of the tests might be affected by existing configurations.
15.9.9.3. PTP
PTP tests apply a PTP configuration to a set of nodes of the cluster. As with SR-IOV, this might conflict with any existing PTP configuration already in place, with unpredictable results.
15.9.9.4. Performance
Performance tests apply a performance profile to the cluster. The effect of this is changes in the node configuration, reserving CPUs, allocating memory huge pages, and setting the kernel packages to be realtime. If an existing profile named performance
is already available on the cluster, the tests do not deploy it.
15.9.9.5. DPDK

DPDK relies on both the performance and SR-IOV features, so the test suite configures both a performance profile and SR-IOV networks. The impacts are therefore the same as those described for SR-IOV and performance testing.
15.9.9.6. Cleaning up
After running the test suite, all the dangling resources are cleaned up.
15.10. Debugging low latency CNF tuning status

The PerformanceProfile custom resource (CR) contains status fields for reporting tuning status and debugging latency degradation issues. These fields report on conditions that describe the state of the operator’s reconciliation functionality.

A typical issue can arise when the status of machine config pools that are attached to the performance profile are in a degraded state, causing the PerformanceProfile status to degrade. In this case, the machine config pool issues a failure message.

The Performance Addon Operator contains the performanceProfile.spec.status.Conditions status field:
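For illustration, the Conditions list might look similar to the following sketch (the timestamps are placeholders):

Status:
  Conditions:
    Last Heartbeat Time:   2020-06-02T10:01:24Z
    Last Transition Time:  2020-06-02T10:01:24Z
    Status:                True
    Type:                  Available
    Last Heartbeat Time:   2020-06-02T10:01:24Z
    Last Transition Time:  2020-06-02T10:01:24Z
    Status:                True
    Type:                  Upgradeable
    Last Heartbeat Time:   2020-06-02T10:01:24Z
    Last Transition Time:  2020-06-02T10:01:24Z
    Status:                False
    Type:                  Progressing
    Last Heartbeat Time:   2020-06-02T10:01:24Z
    Last Transition Time:  2020-06-02T10:01:24Z
    Status:                False
    Type:                  Degraded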
The Status field contains Conditions that specify Type values that indicate the status of the performance profile:
Available
- All machine configs and Tuned profiles have been created successfully and are available for the cluster components that are responsible for processing them (NTO, MCO, Kubelet).
Upgradeable
- Indicates whether the resources maintained by the Operator are in a state that is safe to upgrade.
Progressing
- Indicates that the deployment process from the performance profile has started.
Degraded
Indicates an error if:
- Validation of the performance profile has failed.
- Creation of all relevant components did not complete successfully.
Each of these types contains the following fields:

Status - The state for the specific type (true or false).

Timestamp - The transaction timestamp.

Reason string - The machine readable reason.

Message string - The human readable reason describing the state and error details, if any.
15.10.1. Machine config pools
A performance profile and its created products are applied to a node according to an associated machine config pool (MCP). The MCP holds valuable information about the progress of applying the machine configurations created by performance addons that encompass kernel args, kube config, huge pages allocation, and deployment of rt-kernel. The performance addons controller monitors changes in the MCP and updates the performance profile status accordingly.
The only condition returned by the MCP to the performance profile status is when the MCP is Degraded, which leads to performanceProfile.status.condition.Degraded = true.
Example

The following example is for a performance profile with an associated machine config pool (worker-cnf) that was created for it:
The associated machine config pool is in a degraded state:
# oc get mcp

Example output
NAME         CONFIG                                                 UPDATED   UPDATING   DEGRADED   MACHINECOUNT   READYMACHINECOUNT   UPDATEDMACHINECOUNT   DEGRADEDMACHINECOUNT   AGE
master       rendered-master-2ee57a93fa6c9181b546ca46e1571d2d       True      False      False      3              3                   3                     0                      2d21h
worker       rendered-worker-d6b2bdc07d9f5a59a6b68950acf25e5f       True      False      False      2              2                   2                     0                      2d21h
worker-cnf   rendered-worker-cnf-6c838641b8a08fff08dbd8b02fb63f7c   False     True       True       2              1                   1                     1                      2d20h
The describe section of the MCP shows the reason:

# oc describe mcp worker-cnf

Example output
Message: Node node-worker-cnf is reporting: "prepping update: machineconfig.machineconfiguration.openshift.io \"rendered-worker-cnf-40b9996919c08e335f3ff230ce1d170\" not found"
Reason: 1 nodes are reporting degraded status on sync
The degraded state should also appear under the performance profile status field marked as degraded = true:

# oc describe performanceprofiles performance

Example output
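For illustration, the relevant part of a degraded profile status might look similar to the following sketch:

Status:
  Conditions:
    ...
    Message:  Machine config pool worker-cnf Degraded Reason: 1 nodes are reporting degraded status on sync.
    Reason:   MCPDegraded
    Status:   True
    Type:     Degraded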
15.11. Collecting low latency tuning debugging data for Red Hat Support
When opening a support case, it is helpful to provide debugging information about your cluster to Red Hat Support.
The must-gather
tool enables you to collect diagnostic information about your OpenShift Container Platform cluster, including node tuning, NUMA topology, and other information needed to debug issues with low latency setup.
For prompt support, supply diagnostic information for both OpenShift Container Platform and low latency tuning.
15.11.1. About the must-gather tool

The oc adm must-gather CLI command collects the information from your cluster that is most likely needed for debugging issues, such as:
- Resource definitions
- Audit logs
- Service logs
You can specify one or more images when you run the command by including the --image argument. When you specify an image, the tool collects data related to that feature or product. When you run oc adm must-gather, a new pod is created on the cluster. The data is collected on that pod and saved in a new directory that starts with must-gather.local. This directory is created in your current working directory.
15.11.2. About collecting low latency tuning data

Use the oc adm must-gather CLI command to collect information about your cluster, including features and objects associated with low latency tuning:
- The Performance Addon Operator namespaces and child objects.
- MachineConfigPool and associated MachineConfig objects.
- Linux Kernel command line options.
- CPU and NUMA topology
- Basic PCI device information and NUMA locality.
To collect Performance Addon Operator debugging information with must-gather, you must specify the Performance Addon Operator must-gather image:

--image=registry.redhat.io/openshift4/performance-addon-operator-must-gather-rhel8:v4.6
15.11.3. Gathering data about specific features

You can gather debugging information about specific features by using the oc adm must-gather CLI command with the --image or --image-stream argument. The must-gather tool supports multiple images, so you can gather data about more than one feature by running a single command.

To collect the default must-gather data in addition to specific feature data, add the --image-stream=openshift/must-gather argument.
Prerequisites
- Access to the cluster as a user with the cluster-admin role.
- The OpenShift Container Platform CLI (oc) installed.
Procedure
- Navigate to the directory where you want to store the must-gather data.
- Run the oc adm must-gather command with one or more --image or --image-stream arguments. For example, the following command gathers both the default cluster data and information specific to the Performance Addon Operator:

$ oc adm must-gather \
 --image-stream=openshift/must-gather \ 1
 --image=registry.redhat.io/openshift4/performance-addon-operator-must-gather-rhel8:v4.6 2

1. The default OpenShift Container Platform must-gather image.
2. The Performance Addon Operator must-gather image.

- Create a compressed file from the must-gather directory that was created in your working directory. For example, on a computer that uses a Linux operating system, run the following command:

$ tar cvaf must-gather.tar.gz must-gather.local.5421342344627712289/ 1

1. Replace must-gather-local.5421342344627712289/ with the actual directory name.
- Attach the compressed file to your support case on the Red Hat Customer Portal.