Specialized hardware and driver enablement
Learn about hardware enablement on OpenShift Container Platform
Abstract
Chapter 1. About specialized hardware and driver enablement
Many applications require specialized hardware or software that depends on kernel modules or drivers. You can use driver containers to load out-of-tree kernel modules on Red Hat Enterprise Linux CoreOS (RHCOS) nodes. To deploy out-of-tree drivers during cluster installation, use the kmods-via-containers framework. To load drivers and kernel modules on an existing OpenShift Container Platform cluster, the following tools are available:
- The Driver Toolkit is a container image that is a part of every OpenShift Container Platform release. It contains the kernel packages and other common dependencies that are needed to build a driver or kernel module. The Driver Toolkit can be used as a base image for driver container image builds on OpenShift Container Platform.
- The Special Resource Operator (SRO) orchestrates the building and management of driver containers to load kernel modules and drivers on an existing OpenShift or Kubernetes cluster.
- The Node Feature Discovery (NFD) Operator adds node labels for CPU capabilities, kernel version, PCIe device vendor IDs, and more.
Chapter 2. Driver Toolkit
Learn about the Driver Toolkit and how you can use it as a base image for driver containers for enabling special software and hardware devices on Kubernetes.
The Driver Toolkit is a Technology Preview feature only. Technology Preview features are not supported with Red Hat production service level agreements (SLAs) and might not be functionally complete. Red Hat does not recommend using them in production. These features provide early access to upcoming product features, enabling customers to test functionality and provide feedback during the development process.
For more information about the support scope of Red Hat Technology Preview features, see Technology Preview Features Support Scope.
2.1. About the Driver Toolkit
Background
The Driver Toolkit is a container image in the OpenShift Container Platform payload used as a base image on which you can build driver containers. The Driver Toolkit image contains the kernel packages commonly required as dependencies to build or install kernel modules, as well as a few tools needed in driver containers. The version of these packages will match the kernel version running on the Red Hat Enterprise Linux CoreOS (RHCOS) nodes in the corresponding OpenShift Container Platform release.
Driver containers are container images used for building and deploying out-of-tree kernel modules and drivers on container operating systems like RHCOS. Kernel modules and drivers are software libraries running with a high level of privilege in the operating system kernel. They extend the kernel functionalities or provide the hardware-specific code required to control new devices. Examples include hardware devices like Field Programmable Gate Arrays (FPGA) or GPUs, and software-defined storage (SDS) solutions, such as Lustre parallel file systems, which require kernel modules on client machines. Driver containers are the first layer of the software stack used to enable these technologies on Kubernetes.
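As an illustration of how the Driver Toolkit serves as the first layer of that stack, the following is a minimal Dockerfile sketch for a driver container. It is not part of the official example: the base image digest, the source repository, and the make targets are placeholders, and the exact build steps depend on the module's own build system. The rpm query for the kernel version mirrors the pattern used in the simple-kmod example later in this chapter.

# Illustrative only: substitute the Driver Toolkit image that matches your cluster release.
FROM quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:<driver_toolkit_sha>
WORKDIR /build
# <your_module_repo> is a placeholder for your kernel module source repository.
RUN git clone https://example.com/<your_module_repo>.git module
WORKDIR /build/module
# Build against the kernel headers shipped in the Driver Toolkit, which match the
# RHCOS kernel of the corresponding OpenShift Container Platform release.
RUN make all KVER=$(rpm -q --qf "%{VERSION}-%{RELEASE}.%{ARCH}" kernel-core) \
 && make install KVER=$(rpm -q --qf "%{VERSION}-%{RELEASE}.%{ARCH}" kernel-core)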
The list of kernel packages in the Driver Toolkit includes the following and their dependencies:
- kernel-core
- kernel-devel
- kernel-headers
- kernel-modules
- kernel-modules-extra
In addition, the Driver Toolkit also includes the corresponding real-time kernel packages:
- kernel-rt-core
- kernel-rt-devel
- kernel-rt-modules
- kernel-rt-modules-extra
The Driver Toolkit also has several tools which are commonly needed to build and install kernel modules, including:
- elfutils-libelf-devel
- kmod
- binutils
- kabi-dw
- kernel-abi-whitelists
- dependencies for the above
Purpose
Prior to the Driver Toolkit’s existence, you could install kernel packages in a pod or build config on OpenShift Container Platform by using entitled builds or by installing from the kernel RPMs in the hosts machine-os-content. The Driver Toolkit simplifies this process by removing the entitlement step, and avoids the privileged operation of accessing the machine-os-content in a pod.
The Driver Toolkit is also used by the Special Resource Operator (SRO), which is currently available as a community Operator on OperatorHub. SRO supports out-of-tree and third-party kernel drivers and the support software for the underlying operating system. Users can create recipes for SRO to build and deploy a driver container, as well as support software like a device plugin, or metrics. Recipes can include a build config to build a driver container based on the Driver Toolkit, or SRO can deploy a prebuilt driver container.
2.2. Pulling the Driver Toolkit container image
The driver-toolkit image is available from the Red Hat Ecosystem Catalog and in the OpenShift Container Platform release payload. The image that corresponds to the most recent minor release of OpenShift Container Platform is tagged with the version number in the catalog. The image URL for a specific release can be found by using the oc adm command.
2.2.1. Pulling the Driver Toolkit container image from registry.redhat.io
Instructions for pulling the driver-toolkit image from registry.redhat.io can be found in the Red Hat Ecosystem Catalog. The driver-toolkit image for the latest minor release is tagged with the minor release version on registry.redhat.io, for example registry.redhat.io/openshift4/driver-toolkit-rhel8:v4.11.
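For example, assuming you have authenticated to registry.redhat.io (for example with podman login registry.redhat.io), you can pull this tag directly:

$ podman pull registry.redhat.io/openshift4/driver-toolkit-rhel8:v4.11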
2.2.2. Finding the Driver Toolkit image URL in the payload
Prerequisites
- You obtained the image pull secret from the Red Hat OpenShift Cluster Manager.
- You installed the OpenShift CLI (oc).
Procedure
The image URL of the driver-toolkit image that corresponds to a certain release can be extracted from the release image by using the oc adm command.

For an x86 image, run the following command:

$ oc adm release info quay.io/openshift-release-dev/ocp-release:{product-version}.z-x86_64 --image-for=driver-toolkit

For an ARM image, run the following command:

$ oc adm release info quay.io/openshift-release-dev/ocp-release:{product-version}.z-aarch64 --image-for=driver-toolkit

Example output

The output for the ocp-release:4.11.0-x86_64 image is as follows:

quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:514e256367e8707e2c30f18a4f06fbd6c821ab9776602d2488e861f577a357d0

This image can be pulled by using a valid pull secret, such as the pull secret required to install OpenShift Container Platform:
$ podman pull --authfile=path/to/pullsecret.json quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:<SHA>
ARM images are built for 64-bit Arm (AArch64) only, and are currently a Technology Preview feature. For more information about OpenShift Container Platform on ARM, see Release Notes. For more information about the support scope of Red Hat Technology Preview features, see Technology Preview Features Support Scope.
2.3. Using the Driver Toolkit
As an example, the Driver Toolkit can be used as the base image for building a very simple kernel module called simple-kmod.
The Driver Toolkit contains the necessary dependencies, openssl, mokutil, and keyutils, that are needed to sign a kernel module so that it can be loaded on nodes where Secure Boot is enabled. The simple-kmod example shown here is not signed, so it cannot be loaded on systems with Secure Boot enabled.
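As a rough sketch of what a signing step inside a driver container build might look like, assuming you have generated a Machine Owner Key pair and enrolled it on the nodes, you could use the sign-file script shipped with kernel-devel. The key and certificate paths below are placeholders, and this step is not part of the simple-kmod example:

# Sign the built module; key and certificate paths are illustrative placeholders.
KVER=$(rpm -q --qf "%{VERSION}-%{RELEASE}.%{ARCH}" kernel-core)
/usr/src/kernels/${KVER}/scripts/sign-file sha256 /path/to/MOK.priv /path/to/MOK.der simple-kmod.ko
# Confirm that a signature was appended to the module.
modinfo simple-kmod.ko | grep -i signer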
2.3.1. Build and run the simple-kmod driver container on a cluster
Prerequisites
- You have a running OpenShift Container Platform cluster.
- You set the Image Registry Operator state to Managed for your cluster.
- You installed the OpenShift CLI (oc).
- You are logged into the OpenShift CLI as a user with cluster-admin privileges.
Procedure
Create a namespace. For example:
$ oc new-project simple-kmod-demo
The YAML defines an ImageStream for storing the simple-kmod driver container image, and a BuildConfig for building the container. Save this YAML as 0000-buildconfig.yaml.template:

apiVersion: image.openshift.io/v1
kind: ImageStream
metadata:
  labels:
    app: simple-kmod-driver-container
  name: simple-kmod-driver-container
  namespace: simple-kmod-demo
spec: {}
---
apiVersion: build.openshift.io/v1
kind: BuildConfig
metadata:
  labels:
    app: simple-kmod-driver-build
  name: simple-kmod-driver-build
  namespace: simple-kmod-demo
spec:
  nodeSelector:
    node-role.kubernetes.io/worker: ""
  runPolicy: "Serial"
  triggers:
    - type: "ConfigChange"
    - type: "ImageChange"
  source:
    git:
      ref: "master"
      uri: "https://github.com/openshift-psap/kvc-simple-kmod.git"
    type: Git
    dockerfile: |
      FROM DRIVER_TOOLKIT_IMAGE
      WORKDIR /build/
      # Expecting kmod software version as an input to the build
      ARG KMODVER
      # Grab the software from upstream
      RUN git clone https://github.com/openshift-psap/simple-kmod.git
      WORKDIR simple-kmod
      # Build and install the module
      RUN make all KVER=$(rpm -q --qf "%{VERSION}-%{RELEASE}.%{ARCH}" kernel-core) KMODVER=${KMODVER} \
       && make install KVER=$(rpm -q --qf "%{VERSION}-%{RELEASE}.%{ARCH}" kernel-core) KMODVER=${KMODVER}
      # Add the helper tools
      WORKDIR /root/kvc-simple-kmod
      ADD Makefile .
      ADD simple-kmod-lib.sh .
      ADD simple-kmod-wrapper.sh .
      ADD simple-kmod.conf .
      RUN mkdir -p /usr/lib/kvc/ \
       && mkdir -p /etc/kvc/ \
       && make install
      RUN systemctl enable kmods-via-containers@simple-kmod
  strategy:
    dockerStrategy:
      buildArgs:
        - name: KMODVER
          value: DEMO
  output:
    to:
      kind: ImageStreamTag
      name: simple-kmod-driver-container:demo

Substitute the correct driver toolkit image for the OpenShift Container Platform version you are running in place of “DRIVER_TOOLKIT_IMAGE” with the following commands:

$ OCP_VERSION=$(oc get clusterversion/version -ojsonpath={.status.desired.version})

$ DRIVER_TOOLKIT_IMAGE=$(oc adm release info $OCP_VERSION --image-for=driver-toolkit)

$ sed "s#DRIVER_TOOLKIT_IMAGE#${DRIVER_TOOLKIT_IMAGE}#" 0000-buildconfig.yaml.template > 0000-buildconfig.yaml

Create the image stream and build config with the following command:
$ oc create -f 0000-buildconfig.yaml

After the builder pod completes successfully, deploy the driver container image as a DaemonSet.

The driver container must run with the privileged security context in order to load the kernel modules on the host. The following YAML file contains the RBAC rules and the DaemonSet for running the driver container. Save this YAML as 1000-drivercontainer.yaml:

apiVersion: v1
kind: ServiceAccount
metadata:
  name: simple-kmod-driver-container
---
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  name: simple-kmod-driver-container
rules:
- apiGroups:
  - security.openshift.io
  resources:
  - securitycontextconstraints
  verbs:
  - use
  resourceNames:
  - privileged
---
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  name: simple-kmod-driver-container
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: Role
  name: simple-kmod-driver-container
subjects:
- kind: ServiceAccount
  name: simple-kmod-driver-container
userNames:
- system:serviceaccount:simple-kmod-demo:simple-kmod-driver-container
---
apiVersion: apps/v1
kind: DaemonSet
metadata:
  name: simple-kmod-driver-container
spec:
  selector:
    matchLabels:
      app: simple-kmod-driver-container
  template:
    metadata:
      labels:
        app: simple-kmod-driver-container
    spec:
      serviceAccount: simple-kmod-driver-container
      serviceAccountName: simple-kmod-driver-container
      containers:
      - image: image-registry.openshift-image-registry.svc:5000/simple-kmod-demo/simple-kmod-driver-container:demo
        name: simple-kmod-driver-container
        imagePullPolicy: Always
        command: ["/sbin/init"]
        lifecycle:
          preStop:
            exec:
              command: ["/bin/sh", "-c", "systemctl stop kmods-via-containers@simple-kmod"]
        securityContext:
          privileged: true
      nodeSelector:
        node-role.kubernetes.io/worker: ""

Create the RBAC rules and daemon set:
$ oc create -f 1000-drivercontainer.yaml
After the pods are running on the worker nodes, verify that the simple_kmod kernel module is loaded successfully on the host machines with lsmod.

Verify that the pods are running:

$ oc get pod -n simple-kmod-demo

Example output

NAME                                 READY   STATUS      RESTARTS   AGE
simple-kmod-driver-build-1-build     0/1     Completed   0          6m
simple-kmod-driver-container-b22fd   1/1     Running     0          40s
simple-kmod-driver-container-jz9vn   1/1     Running     0          40s
simple-kmod-driver-container-p45cc   1/1     Running     0          40s

Execute the lsmod command in the driver container pod:

$ oc exec -it pod/simple-kmod-driver-container-p45cc -- lsmod | grep simple

Example output

simple_procfs_kmod 16384  0
simple_kmod        16384  0
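If you prefer to confirm the module directly on a node rather than inside the driver container, a debug pod works as well. This is an optional check and the node name is a placeholder:

$ oc debug node/<worker_node_name> -- chroot /host lsmod | grep simple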
Chapter 3. Special Resource Operator
Learn about the Special Resource Operator (SRO) and how you can use it to build and manage driver containers for loading kernel modules and device drivers on nodes in an OpenShift Container Platform cluster.
The Special Resource Operator is a Technology Preview feature only. Technology Preview features are not supported with Red Hat production service level agreements (SLAs) and might not be functionally complete. Red Hat does not recommend using them in production. These features provide early access to upcoming product features, enabling customers to test functionality and provide feedback during the development process.
For more information about the support scope of Red Hat Technology Preview features, see Technology Preview Features Support Scope.
3.1. About the Special Resource Operator
The Special Resource Operator (SRO) helps you manage the deployment of kernel modules and drivers on an existing OpenShift Container Platform cluster. The SRO can be used for a case as simple as building and loading a single kernel module, or as complex as deploying the driver, device plugin, and monitoring stack for a hardware accelerator.
For loading kernel modules, the SRO is designed around the use of driver containers. Driver containers are increasingly being used in cloud-native environments, especially when run on pure container operating systems, to deliver hardware drivers to the host. Driver containers extend the kernel stack beyond the out-of-the-box software and hardware features of a specific kernel. Driver containers work on various container-capable Linux distributions. With driver containers, the host operating system stays clean and there is no clash between different library versions or binaries on the host.
The functions described require a connected environment with a constant connection to the network. These functions are not available for disconnected environments.
3.2. Installing the Special Resource Operator
As a cluster administrator, you can install the Special Resource Operator (SRO) by using the OpenShift CLI or the web console.
3.2.1. Installing the Special Resource Operator by using the CLI
As a cluster administrator, you can install the Special Resource Operator (SRO) by using the OpenShift CLI.
Prerequisites
- You have a running OpenShift Container Platform cluster.
- You installed the OpenShift CLI (oc).
- You are logged into the OpenShift CLI as a user with cluster-admin privileges.
Procedure
Install the SRO in the openshift-operators namespace:

Create the following Subscription CR and save the YAML in the sro-sub.yaml file:

Example Subscription CR

apiVersion: operators.coreos.com/v1alpha1
kind: Subscription
metadata:
  name: openshift-special-resource-operator
  namespace: openshift-operators
spec:
  channel: "stable"
  installPlanApproval: Automatic
  name: openshift-special-resource-operator
  source: redhat-operators
  sourceNamespace: openshift-marketplace

Create the subscription object by running the following command:

$ oc create -f sro-sub.yaml

Switch to the openshift-operators project:

$ oc project openshift-operators
Verification
To verify that the Operator deployment is successful, run:
$ oc get pods

Example output

NAME                                                    READY   STATUS    RESTARTS   AGE
nfd-controller-manager-7f4c5f5778-4lvvk                 2/2     Running   0          89s
special-resource-controller-manager-6dbf7d4f6f-9kl8h    2/2     Running   0          81s

A successful deployment shows a Running status.
3.2.2. Installing the Special Resource Operator by using the web console
As a cluster administrator, you can install the Special Resource Operator (SRO) by using the OpenShift Container Platform web console.
Procedure
- Log in to the OpenShift Container Platform web console.
Install the Special Resource Operator:
- In the OpenShift Container Platform web console, click Operators → OperatorHub.
- Choose Special Resource Operator from the list of available Operators, and then click Install.
- On the Install Operator page, select a specific namespace on the cluster, select the namespace created in the previous section, and then click Install.
Verification
To verify that the Special Resource Operator installed successfully:
- Navigate to the Operators → Installed Operators page.
Ensure that Special Resource Operator is listed in the openshift-operators project with a Status of InstallSucceeded.
Note: During installation, an Operator might display a Failed status. If the installation later succeeds with an InstallSucceeded message, you can ignore the Failed message.
If the Operator does not appear as installed, troubleshoot further:
- Navigate to the Operators → Installed Operators page and inspect the Operator Subscriptions and Install Plans tabs for any failure or errors under Status.
- Navigate to the Workloads → Pods page and check the logs for pods in the openshift-operators project.
3.3. Using the Special Resource Operator
The Special Resource Operator (SRO) is used to manage the build and deployment of a driver container. The objects required to build and deploy the container can be defined in a Helm chart.
The example in this section uses the simple-kmod SpecialResource object to point to a ConfigMap object that is created to store the Helm chart.
3.3.1. Building and running the simple-kmod SpecialResource by using a config map
In this example, the simple-kmod kernel module shows how the Special Resource Operator (SRO) manages a driver container. The container is defined in the Helm chart templates that are stored in a config map.
Prerequisites
- You have a running OpenShift Container Platform cluster.
- You set the Image Registry Operator state to Managed for your cluster.
- You installed the OpenShift CLI (oc).
- You are logged into the OpenShift CLI as a user with cluster-admin privileges.
- You installed the Node Feature Discovery (NFD) Operator.
- You installed the SRO.
- You installed the Helm CLI (helm).
Procedure
To create a simple-kmod SpecialResource object, define an image stream and build config to build the image, and a service account, role, role binding, and daemon set to run the container. The service account, role, and role binding are required to run the daemon set with the privileged security context so that the kernel module can be loaded.

Create a templates directory, and change into it:

$ mkdir -p chart/simple-kmod-0.0.1/templates

$ cd chart/simple-kmod-0.0.1/templates

Save this YAML template for the image stream and build config in the templates directory as 0000-buildconfig.yaml:

apiVersion: image.openshift.io/v1
kind: ImageStream
metadata:
  labels:
    app: {{.Values.specialresource.metadata.name}}-{{.Values.groupName.driverContainer}}
  name: {{.Values.specialresource.metadata.name}}-{{.Values.groupName.driverContainer}}
spec: {}
---
apiVersion: build.openshift.io/v1
kind: BuildConfig
metadata:
  labels:
    app: {{.Values.specialresource.metadata.name}}-{{.Values.groupName.driverBuild}}
  name: {{.Values.specialresource.metadata.name}}-{{.Values.groupName.driverBuild}}
  annotations:
    specialresource.openshift.io/wait: "true"
    specialresource.openshift.io/driver-container-vendor: simple-kmod
    specialresource.openshift.io/kernel-affine: "true"
spec:
  nodeSelector:
    node-role.kubernetes.io/worker: ""
  runPolicy: "Serial"
  triggers:
    - type: "ConfigChange"
    - type: "ImageChange"
  source:
    git:
      ref: {{.Values.specialresource.spec.driverContainer.source.git.ref}}
      uri: {{.Values.specialresource.spec.driverContainer.source.git.uri}}
    type: Git
  strategy:
    dockerStrategy:
      dockerfilePath: Dockerfile.SRO
      buildArgs:
        - name: "IMAGE"
          value: {{ .Values.driverToolkitImage }}
        {{- range $arg := .Values.buildArgs }}
        - name: {{ $arg.name }}
          value: {{ $arg.value }}
        {{- end }}
        - name: KVER
          value: {{ .Values.kernelFullVersion }}
  output:
    to:
      kind: ImageStreamTag
      name: {{.Values.specialresource.metadata.name}}-{{.Values.groupName.driverContainer}}:v{{.Values.kernelFullVersion}}

Save the following YAML template for the RBAC resources and daemon set in the templates directory as 1000-driver-container.yaml:

apiVersion: v1
kind: ServiceAccount
metadata:
  name: {{.Values.specialresource.metadata.name}}-{{.Values.groupName.driverContainer}}
---
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  name: {{.Values.specialresource.metadata.name}}-{{.Values.groupName.driverContainer}}
rules:
- apiGroups:
  - security.openshift.io
  resources:
  - securitycontextconstraints
  verbs:
  - use
  resourceNames:
  - privileged
---
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  name: {{.Values.specialresource.metadata.name}}-{{.Values.groupName.driverContainer}}
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: Role
  name: {{.Values.specialresource.metadata.name}}-{{.Values.groupName.driverContainer}}
subjects:
- kind: ServiceAccount
  name: {{.Values.specialresource.metadata.name}}-{{.Values.groupName.driverContainer}}
  namespace: {{.Values.specialresource.spec.namespace}}
---
apiVersion: apps/v1
kind: DaemonSet
metadata:
  labels:
    app: {{.Values.specialresource.metadata.name}}-{{.Values.groupName.driverContainer}}
  name: {{.Values.specialresource.metadata.name}}-{{.Values.groupName.driverContainer}}
  annotations:
    specialresource.openshift.io/wait: "true"
    specialresource.openshift.io/state: "driver-container"
    specialresource.openshift.io/driver-container-vendor: simple-kmod
    specialresource.openshift.io/kernel-affine: "true"
    specialresource.openshift.io/from-configmap: "true"
spec:
  updateStrategy:
    type: OnDelete
  selector:
    matchLabels:
      app: {{.Values.specialresource.metadata.name}}-{{.Values.groupName.driverContainer}}
  template:
    metadata:
      labels:
        app: {{.Values.specialresource.metadata.name}}-{{.Values.groupName.driverContainer}}
    spec:
      priorityClassName: system-node-critical
      serviceAccount: {{.Values.specialresource.metadata.name}}-{{.Values.groupName.driverContainer}}
      serviceAccountName: {{.Values.specialresource.metadata.name}}-{{.Values.groupName.driverContainer}}
      containers:
      - image: image-registry.openshift-image-registry.svc:5000/{{.Values.specialresource.spec.namespace}}/{{.Values.specialresource.metadata.name}}-{{.Values.groupName.driverContainer}}:v{{.Values.kernelFullVersion}}
        name: {{.Values.specialresource.metadata.name}}-{{.Values.groupName.driverContainer}}
        imagePullPolicy: Always
        command: ["/sbin/init"]
        lifecycle:
          preStop:
            exec:
              command: ["/bin/sh", "-c", "systemctl stop kmods-via-containers@{{.Values.specialresource.metadata.name}}"]
        securityContext:
          privileged: true
      nodeSelector:
        node-role.kubernetes.io/worker: ""
        feature.node.kubernetes.io/kernel-version.full: "{{.Values.KernelFullVersion}}"

Change into the chart/simple-kmod-0.0.1 directory:

$ cd ..

Save the following YAML for the chart as Chart.yaml in the chart/simple-kmod-0.0.1 directory:

apiVersion: v2
name: simple-kmod
description: Simple kmod will deploy a simple kmod driver-container
icon: https://avatars.githubusercontent.com/u/55542927
type: application
version: 0.0.1
appVersion: 1.0.0

From the chart directory, create the chart using the helm package command:

$ helm package simple-kmod-0.0.1/

Example output
Successfully packaged chart and saved it to: /data/<username>/git/<github_username>/special-resource-operator/yaml-for-docs/chart/simple-kmod-0.0.1/simple-kmod-0.0.1.tgz

Create a config map to store the chart files:

Create a directory for the config map files:

$ mkdir cm

Copy the Helm chart into the cm directory:

$ cp simple-kmod-0.0.1.tgz cm/simple-kmod-0.0.1.tgz

Create an index file specifying the Helm repo that contains the Helm chart:

$ helm repo index cm --url=cm://simple-kmod/simple-kmod-chart

Create a namespace for the objects defined in the Helm chart:

$ oc create namespace simple-kmod

Create the config map object:

$ oc create cm simple-kmod-chart --from-file=cm/index.yaml --from-file=cm/simple-kmod-0.0.1.tgz -n simple-kmod

Use the following SpecialResource manifest to deploy the simple-kmod object using the Helm chart that you created in the config map. Save this YAML as simple-kmod-configmap.yaml:

apiVersion: sro.openshift.io/v1beta1
kind: SpecialResource
metadata:
  name: simple-kmod
spec:
  #debug: true
  namespace: simple-kmod
  chart:
    name: simple-kmod
    version: 0.0.1
    repository:
      name: example
      url: cm://simple-kmod/simple-kmod-chart
  set:
    kind: Values
    apiVersion: sro.openshift.io/v1beta1
    kmodNames: ["simple-kmod", "simple-procfs-kmod"]
    buildArgs:
    - name: "KMODVER"
      value: "SRO"
    driverContainer:
      source:
        git:
          ref: "master"
          uri: "https://github.com/openshift-psap/kvc-simple-kmod.git"

From a command line, create the SpecialResource file:

$ oc create -f simple-kmod-configmap.yaml
To remove the simple-kmod kernel module from the node, delete the simple-kmod SpecialResource API object by using the oc delete command. The kernel module is unloaded when the driver container pod is deleted.
Verification
The SRO builds and deploys the simple-kmod driver container in the simple-kmod namespace, as defined by the simple-kmod SpecialResource object.
Use the oc get pods command to display the status of the build pods:

$ oc get pods -n simple-kmod

Example output

NAME                                                  READY   STATUS      RESTARTS   AGE
simple-kmod-driver-build-12813789169ac0ee-1-build     0/1     Completed   0          7m12s
simple-kmod-driver-container-12813789169ac0ee-mjsnh   1/1     Running     0          8m2s
simple-kmod-driver-container-12813789169ac0ee-qtkff   1/1     Running     0          8m2s

Use the oc logs command, along with the build pod name obtained from the oc get pods command above, to display the logs of the simple-kmod driver container image build:

$ oc logs pod/simple-kmod-driver-build-12813789169ac0ee-1-build -n simple-kmod

To verify that the simple-kmod kernel modules are loaded, execute the lsmod command in one of the driver container pods that was returned from the oc get pods command above:

$ oc exec -n simple-kmod -it pod/simple-kmod-driver-container-12813789169ac0ee-mjsnh -- lsmod | grep simple

Example output

simple_procfs_kmod 16384  0
simple_kmod        16384  0
The sro_kind_completed_info metric exposed by the SRO indicates the state of the different objects being deployed, which can be useful for troubleshooting the SRO CR. See "Prometheus Special Resource Operator metrics" for details.
3.3.2. Building and running the simple-kmod SpecialResource for a hub-and-spoke topology
You can use the Special Resource Operator (SRO) on a hub-and-spoke deployment in which Red Hat Advanced Cluster Management (RHACM) connects a hub cluster to one or more managed clusters.
This example procedure shows how the SRO builds driver containers in the hub. The SRO watches hub cluster resources to identify OpenShift Container Platform versions for the helm charts that it uses to create resources which it delivers to spokes.
Prerequisites
- You have a running OpenShift Container Platform cluster.
- You installed the OpenShift CLI (oc).
- You are logged into the OpenShift CLI as a user with cluster-admin privileges.
- You installed the SRO.
- You installed the Helm CLI (helm).
- You installed Red Hat Advanced Cluster Management (RHACM).
- You configured a container registry.
Procedure
Create a templates directory by running the following command:

$ mkdir -p charts/acm-simple-kmod-0.0.1/templates

Change to the templates directory by running the following command:

$ cd charts/acm-simple-kmod-0.0.1/templates

Create templates files for the BuildConfig, Policy, and PlacementRule resources.

Save this YAML template for the image stream and build config in the templates directory as 0001-buildconfig.yaml:

apiVersion: build.openshift.io/v1
kind: BuildConfig
metadata:
  labels:
    app: {{ printf "%s-%s" .Values.specialResourceModule.metadata.name .Values.kernelFullVersion | replace "." "-" | replace "_" "-" | trunc 63 }}
  name: {{ printf "%s-%s" .Values.specialResourceModule.metadata.name .Values.kernelFullVersion | replace "." "-" | replace "_" "-" | trunc 63 }}
  annotations:
    specialresource.openshift.io/wait: "true"
spec:
  nodeSelector:
    node-role.kubernetes.io/worker: ""
  runPolicy: "Serial"
  triggers:
    - type: "ConfigChange"
    - type: "ImageChange"
  source:
    dockerfile: |
      FROM {{ .Values.driverToolkitImage }} as builder
      WORKDIR /build/
      RUN git clone -b {{.Values.specialResourceModule.spec.set.git.ref}} {{.Values.specialResourceModule.spec.set.git.uri}}
      WORKDIR /build/simple-kmod
      RUN make all install KVER={{ .Values.kernelFullVersion }}
      FROM registry.redhat.io/ubi8/ubi-minimal
      RUN microdnf -y install kmod
      COPY --from=builder /etc/driver-toolkit-release.json /etc/
      COPY --from=builder /lib/modules/{{ .Values.kernelFullVersion }}/* /lib/modules/{{ .Values.kernelFullVersion }}/
  strategy:
    dockerStrategy:
      dockerfilePath: Dockerfile.SRO
      buildArgs:
        - name: "IMAGE"
          value: {{ .Values.driverToolkitImage }}
        {{- range $arg := .Values.buildArgs }}
        - name: {{ $arg.name }}
          value: {{ $arg.value }}
        {{- end }}
        - name: KVER
          value: {{ .Values.kernelFullVersion }}
  output:
    to:
      kind: DockerImage
      name: {{.Values.registry}}/{{.Values.specialResourceModule.metadata.name}}-{{.Values.groupName.driverContainer}}:{{.Values.kernelFullVersion}}

Save this YAML template for the ACM policy in the templates directory as 0002-policy.yaml:

apiVersion: policy.open-cluster-management.io/v1
kind: Policy
metadata:
  name: policy-{{.Values.specialResourceModule.metadata.name}}-ds
  annotations:
    policy.open-cluster-management.io/categories: CM Configuration Management
    policy.open-cluster-management.io/controls: CM-2 Baseline Configuration
    policy.open-cluster-management.io/standards: NIST-CSF
spec:
  remediationAction: enforce
  disabled: false
  policy-templates:
  - objectDefinition:
      apiVersion: policy.open-cluster-management.io/v1
      kind: ConfigurationPolicy
      metadata:
        name: config-{{.Values.specialResourceModule.metadata.name}}-ds
      spec:
        remediationAction: enforce
        severity: low
        namespaceselector:
          exclude:
          - kube-*
          include:
          - '*'
        object-templates:
        - complianceType: musthave
          objectDefinition:
            apiVersion: v1
            kind: Namespace
            metadata:
              name: {{.Values.specialResourceModule.spec.namespace}}
        - complianceType: mustonlyhave
          objectDefinition:
            apiVersion: v1
            kind: ServiceAccount
            metadata:
              name: {{.Values.specialResourceModule.metadata.name}}
              namespace: {{.Values.specialResourceModule.spec.namespace}}
        - complianceType: mustonlyhave
          objectDefinition:
            apiVersion: rbac.authorization.k8s.io/v1
            kind: Role
            metadata:
              name: {{.Values.specialResourceModule.metadata.name}}
              namespace: {{.Values.specialResourceModule.spec.namespace}}
            rules:
            - apiGroups:
              - security.openshift.io
              resources:
              - securitycontextconstraints
              verbs:
              - use
              resourceNames:
              - privileged
        - complianceType: mustonlyhave
          objectDefinition:
            apiVersion: rbac.authorization.k8s.io/v1
            kind: RoleBinding
            metadata:
              name: {{.Values.specialResourceModule.metadata.name}}
              namespace: {{.Values.specialResourceModule.spec.namespace}}
            roleRef:
              apiGroup: rbac.authorization.k8s.io
              kind: Role
              name: {{.Values.specialResourceModule.metadata.name}}
            subjects:
            - kind: ServiceAccount
              name: {{.Values.specialResourceModule.metadata.name}}
              namespace: {{.Values.specialResourceModule.spec.namespace}}
        - complianceType: musthave
          objectDefinition:
            apiVersion: apps/v1
            kind: DaemonSet
            metadata:
              labels:
                app: {{ printf "%s-%s" .Values.specialResourceModule.metadata.name .Values.kernelFullVersion | replace "." "-" | replace "_" "-" | trunc 63 }}
              name: {{ printf "%s-%s" .Values.specialResourceModule.metadata.name .Values.kernelFullVersion | replace "." "-" | replace "_" "-" | trunc 63 }}
              namespace: {{.Values.specialResourceModule.spec.namespace}}
            spec:
              updateStrategy:
                type: OnDelete
              selector:
                matchLabels:
                  app: {{ printf "%s-%s" .Values.specialResourceModule.metadata.name .Values.kernelFullVersion | replace "." "-" | replace "_" "-" | trunc 63 }}
              template:
                metadata:
                  labels:
                    app: {{ printf "%s-%s" .Values.specialResourceModule.metadata.name .Values.kernelFullVersion | replace "." "-" | replace "_" "-" | trunc 63 }}
                spec:
                  priorityClassName: system-node-critical
                  serviceAccount: {{.Values.specialResourceModule.metadata.name}}
                  serviceAccountName: {{.Values.specialResourceModule.metadata.name}}
                  containers:
                  - image: {{.Values.registry}}/{{.Values.specialResourceModule.metadata.name}}-{{.Values.groupName.driverContainer}}:{{.Values.kernelFullVersion}}
                    name: {{.Values.specialResourceModule.metadata.name}}
                    imagePullPolicy: Always
                    command: [sleep, infinity]
                    lifecycle:
                      preStop:
                        exec:
                          command: ["modprobe", "-r", "-a", "simple-kmod", "simple-procfs-kmod"]
                    securityContext:
                      privileged: true

Save this YAML template for the placement of policies in the templates directory as 0003-policy.yaml:

apiVersion: apps.open-cluster-management.io/v1
kind: PlacementRule
metadata:
  name: {{.Values.specialResourceModule.metadata.name}}-placement
spec:
  clusterConditions:
  - status: "True"
    type: ManagedClusterConditionAvailable
  clusterSelector:
    matchExpressions:
    - key: name
      operator: NotIn
      values:
      - local-cluster
---
apiVersion: policy.open-cluster-management.io/v1
kind: PlacementBinding
metadata:
  name: {{.Values.specialResourceModule.metadata.name}}-binding
placementRef:
  apiGroup: apps.open-cluster-management.io
  kind: PlacementRule
  name: {{.Values.specialResourceModule.metadata.name}}-placement
subjects:
- apiGroup: policy.open-cluster-management.io
  kind: Policy
  name: policy-{{.Values.specialResourceModule.metadata.name}}-ds

Change into the charts/acm-simple-kmod-0.0.1 directory by running the following command:

$ cd ..

Save the following YAML template for the chart as Chart.yaml in the charts/acm-simple-kmod-0.0.1 directory:

apiVersion: v2
name: acm-simple-kmod
description: Build ACM enabled simple-kmod driver with SpecialResourceOperator
icon: https://avatars.githubusercontent.com/u/55542927
type: application
version: 0.0.1
appVersion: 1.6.4

From the charts directory, create the chart using the helm package command:

$ helm package acm-simple-kmod-0.0.1/

Example output
Successfully packaged chart and saved it to: <directory>/charts/acm-simple-kmod-0.0.1.tgz

Create a config map to store the chart files.

Create a directory for the config map files by running the following command:

$ mkdir cm

Copy the Helm chart into the cm directory by running the following command:

$ cp acm-simple-kmod-0.0.1.tgz cm/acm-simple-kmod-0.0.1.tgz

Create an index file specifying the Helm repository that contains the Helm chart by running the following command:

$ helm repo index cm --url=cm://acm-simple-kmod/acm-simple-kmod-chart

Create a namespace for the objects defined in the Helm chart by running the following command:

$ oc create namespace acm-simple-kmod

Create the config map object by running the following command:

$ oc create cm acm-simple-kmod-chart --from-file=cm/index.yaml --from-file=cm/acm-simple-kmod-0.0.1.tgz -n acm-simple-kmod

Use the following SpecialResourceModule manifest to deploy the simple-kmod object using the Helm chart that you created in the config map. Save this YAML file as acm-simple-kmod.yaml:

apiVersion: sro.openshift.io/v1beta1
kind: SpecialResourceModule
metadata:
  name: acm-simple-kmod
spec:
  namespace: acm-simple-kmod
  chart:
    name: acm-simple-kmod
    version: 0.0.1
    repository:
      name: acm-simple-kmod
      url: cm://acm-simple-kmod/acm-simple-kmod-chart
  set:
    kind: Values
    apiVersion: sro.openshift.io/v1beta1
    buildArgs:
    - name: "KMODVER"
      value: "SRO"
    registry: <your_registry>  # Specify the URL for a registry that you have configured.
    git:
      ref: master
      uri: https://github.com/openshift-psap/kvc-simple-kmod.git
  watch:
  - path: "$.metadata.labels.openshiftVersion"
    apiVersion: cluster.open-cluster-management.io/v1
    kind: ManagedCluster
    name: spoke1
Create the special resource module by running the following command:
$ oc apply -f charts/examples/acm-simple-kmod.yaml
Verification
Check the status of the build pods by running the following command:
$ KUBECONFIG=~/hub/auth/kubeconfig oc get pod -n acm-simple-kmod

Example output

NAME                                                   READY   STATUS      RESTARTS   AGE
acm-simple-kmod-4-18-0-305-34-2-el8-4-x86-64-1-build   0/1     Completed   0          42m

Check that the policies have been created by running the following command:

$ KUBECONFIG=~/hub/auth/kubeconfig oc get placementrules,placementbindings,policies -n acm-simple-kmod

Example output

NAME                                                                       AGE   REPLICAS
placementrule.apps.open-cluster-management.io/acm-simple-kmod-placement   40m

NAME                                                                          AGE
placementbinding.policy.open-cluster-management.io/acm-simple-kmod-binding   40m

NAME                                                                 REMEDIATION ACTION   COMPLIANCE STATE   AGE
policy.policy.open-cluster-management.io/policy-acm-simple-kmod-ds   enforce              Compliant          40m

Check that the resources have been reconciled by running the following command:

$ KUBECONFIG=~/hub/auth/kubeconfig oc get specialresourcemodule acm-simple-kmod -o json | jq -r '.status'

Example output

{
  "versions": {
    "quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:6a3330ef5a178435721ff4efdde762261a9c55212e9b4534385e04037693fbe4": {
      "complete": true
    }
  }
}

Check that the resources are running in the spoke by running the following command:

$ KUBECONFIG=~/spoke1/kubeconfig oc get ds,pod -n acm-simple-kmod

Example output

NAME                                                          DESIRED   CURRENT   READY   UP-TO-DATE   AVAILABLE   NODE SELECTOR   AGE
daemonset.apps/acm-simple-kmod-4-18-0-305-45-1-el8-4-x86-64   3         3         3       3            3           <none>          26m

NAME                                                     READY   STATUS    RESTARTS   AGE
pod/acm-simple-kmod-4-18-0-305-45-1-el8-4-x86-64-brw78   1/1     Running   0          26m
pod/acm-simple-kmod-4-18-0-305-45-1-el8-4-x86-64-fqh5h   1/1     Running   0          26m
pod/acm-simple-kmod-4-18-0-305-45-1-el8-4-x86-64-m9sfd   1/1     Running   0          26m
3.4. Prometheus Special Resource Operator metrics
The Special Resource Operator (SRO) exposes the following Prometheus metrics through the metrics service:

| Metric Name | Description |
|---|---|
| sro_used_nodes | Returns the nodes that are running pods created by a SRO custom resource (CR). This metric is available for DaemonSet and Deployment objects only. |
| sro_kind_completed_info | Represents whether a kind of an object managed by a SRO CR is successfully uploaded in the cluster (value 1) or not (value 0). Examples of objects are DaemonSet, Deployment, or BuildConfig. |
| sro_states_completed_info | Represents whether the SRO has finished processing a CR successfully (value 1) or whether the SRO has not yet processed the CR (value 0). |
| sro_managed_resources_total | Returns the number of SRO CRs in the cluster, regardless of their state. |
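For ad hoc troubleshooting, you can scrape the controller's metrics endpoint directly. This is only a sketch: the metrics port and the availability of curl inside the controller container are assumptions, and in a production cluster you would normally query these metrics through the built-in monitoring stack instead.

$ oc exec -n openshift-operators deploy/special-resource-controller-manager -- \
    curl -s http://localhost:8080/metrics | grep '^sro_'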
Chapter 4. Node Feature Discovery Operator
Learn about the Node Feature Discovery (NFD) Operator and how you can use it to expose node-level information by orchestrating Node Feature Discovery, a Kubernetes add-on for detecting hardware features and system configuration.
4.1. About the Node Feature Discovery Operator
The Node Feature Discovery Operator (NFD) manages the detection of hardware features and configuration in an OpenShift Container Platform cluster by labeling the nodes with hardware-specific information. NFD labels the host with node-specific attributes, such as PCI cards, kernel, operating system version, and so on.
The NFD Operator can be found on the Operator Hub by searching for “Node Feature Discovery”.
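After the Operator and its operand are running, you can see the labels that NFD applies by inspecting any node. This is an illustrative check; the node name is a placeholder:

$ oc describe node <node_name> | grep feature.node.kubernetes.io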
4.2. Installing the Node Feature Discovery Operator
The Node Feature Discovery (NFD) Operator orchestrates all resources needed to run the NFD daemon set. As a cluster administrator, you can install the NFD Operator by using the OpenShift Container Platform CLI or the web console.
4.2.1. Installing the NFD Operator using the CLI
As a cluster administrator, you can install the NFD Operator using the CLI.
Prerequisites
- An OpenShift Container Platform cluster
- Install the OpenShift CLI (oc).
- Log in as a user with cluster-admin privileges.
Procedure
Create a namespace for the NFD Operator.
Create the following Namespace custom resource (CR) that defines the openshift-nfd namespace, and then save the YAML in the nfd-namespace.yaml file:

apiVersion: v1
kind: Namespace
metadata:
  name: openshift-nfd

Create the namespace by running the following command:
$ oc create -f nfd-namespace.yaml
Install the NFD Operator in the namespace you created in the previous step by creating the following objects:
Create the following OperatorGroup CR and save the YAML in the nfd-operatorgroup.yaml file:

apiVersion: operators.coreos.com/v1
kind: OperatorGroup
metadata:
  generateName: openshift-nfd-
  name: openshift-nfd
  namespace: openshift-nfd
spec:
  targetNamespaces:
  - openshift-nfd

Create the OperatorGroup CR by running the following command:

$ oc create -f nfd-operatorgroup.yaml

Create the following Subscription CR and save the YAML in the nfd-sub.yaml file:

Example Subscription

apiVersion: operators.coreos.com/v1alpha1
kind: Subscription
metadata:
  name: nfd
  namespace: openshift-nfd
spec:
  channel: "stable"
  installPlanApproval: Automatic
  name: nfd
  source: redhat-operators
  sourceNamespace: openshift-marketplace

Create the subscription object by running the following command:

$ oc create -f nfd-sub.yaml

Change to the openshift-nfd project:

$ oc project openshift-nfd
Verification
To verify that the Operator deployment is successful, run:
$ oc get pods

Example output

NAME                                      READY   STATUS    RESTARTS   AGE
nfd-controller-manager-7f86ccfb58-vgr4x   2/2     Running   0          10m

A successful deployment shows a Running status.
4.2.2. Installing the NFD Operator using the web console
As a cluster administrator, you can install the NFD Operator using the web console.
Procedure
- In the OpenShift Container Platform web console, click Operators → OperatorHub.
- Choose Node Feature Discovery from the list of available Operators, and then click Install.
- On the Install Operator page, select A specific namespace on the cluster, and then click Install. You do not need to create a namespace because it is created for you.
Verification
To verify that the NFD Operator installed successfully:
- Navigate to the Operators → Installed Operators page.
Ensure that Node Feature Discovery is listed in the openshift-nfd project with a Status of InstallSucceeded.
Note: During installation, an Operator might display a Failed status. If the installation later succeeds with an InstallSucceeded message, you can ignore the Failed message.
Troubleshooting
If the Operator does not appear as installed, troubleshoot further:
- Navigate to the Operators → Installed Operators page and inspect the Operator Subscriptions and Install Plans tabs for any failure or errors under Status.
- Navigate to the Workloads → Pods page and check the logs for pods in the openshift-nfd project.
4.3. Using the Node Feature Discovery Operator
The Node Feature Discovery (NFD) Operator orchestrates all resources needed to run the Node-Feature-Discovery daemon set by watching for a NodeFeatureDiscovery custom resource (CR). Based on the NodeFeatureDiscovery CR, the Operator creates the operand (NFD) components in the desired namespace. You can edit the CR to choose another namespace, image, imagePullPolicy, and nfd-worker-conf, among other options.
As a cluster administrator, you can create a NodeFeatureDiscovery instance by using the OpenShift CLI or the web console.
4.3.1. Create a NodeFeatureDiscovery instance using the CLI
As a cluster administrator, you can create a NodeFeatureDiscovery CR instance by using the OpenShift CLI.
Prerequisites
- An OpenShift Container Platform cluster
- Install the OpenShift CLI (oc).
- Log in as a user with cluster-admin privileges.
- Install the NFD Operator.
Procedure
Create the following NodeFeatureDiscovery custom resource (CR), and then save the YAML in the NodeFeatureDiscovery.yaml file:

apiVersion: nfd.openshift.io/v1
kind: NodeFeatureDiscovery
metadata:
  name: nfd-instance
  namespace: openshift-nfd
spec:
  instance: "" # instance is empty by default
  topologyupdater: false # False by default
  operand:
    image: registry.redhat.io/openshift4/ose-node-feature-discovery:v4.11
    imagePullPolicy: Always
  workerConfig:
    configData: |
      core:
      #  labelWhiteList:
      #  noPublish: false
        sleepInterval: 60s
      #  sources: [all]
      #  klog:
      #    addDirHeader: false
      #    alsologtostderr: false
      #    logBacktraceAt:
      #    logtostderr: true
      #    skipHeaders: false
      #    stderrthreshold: 2
      #    v: 0
      #    vmodule:
      ##   NOTE: the following options are not dynamically run-time configurable
      ##         and require a nfd-worker restart to take effect after being changed
      #    logDir:
      #    logFile:
      #    logFileMaxSize: 1800
      #    skipLogHeaders: false
      sources:
        cpu:
          cpuid:
      #     NOTE: whitelist has priority over blacklist
            attributeBlacklist:
              - "BMI1"
              - "BMI2"
              - "CLMUL"
              - "CMOV"
              - "CX16"
              - "ERMS"
              - "F16C"
              - "HTT"
              - "LZCNT"
              - "MMX"
              - "MMXEXT"
              - "NX"
              - "POPCNT"
              - "RDRAND"
              - "RDSEED"
              - "RDTSCP"
              - "SGX"
              - "SSE"
              - "SSE2"
              - "SSE3"
              - "SSE4.1"
              - "SSE4.2"
              - "SSSE3"
            attributeWhitelist:
        kernel:
          kconfigFile: "/path/to/kconfig"
          configOpts:
            - "NO_HZ"
            - "X86"
            - "DMI"
        pci:
          deviceClassWhitelist:
            - "0200"
            - "03"
            - "12"
          deviceLabelFields:
            - "class"
  customConfig:
    configData: |
      - name: "more.kernel.features"
        matchOn:
        - loadedKMod: ["example_kmod3"]
For more details on how to customize NFD workers, refer to the Configuration file reference of nfd-worker.
Create the NodeFeatureDiscovery CR instance by running the following command:

$ oc create -f NodeFeatureDiscovery.yaml
Verification
To verify that the instance is created, run:
$ oc get pods

Example output

NAME                                      READY   STATUS    RESTARTS   AGE
nfd-controller-manager-7f86ccfb58-vgr4x   2/2     Running   0          11m
nfd-master-hcn64                          1/1     Running   0          60s
nfd-master-lnnxx                          1/1     Running   0          60s
nfd-master-mp6hr                          1/1     Running   0          60s
nfd-worker-vgcz9                          1/1     Running   0          60s
nfd-worker-xqbws                          1/1     Running   0          60s

A successful deployment shows a Running status.
4.3.2. Create a NodeFeatureDiscovery CR using the web console
Procedure
- Navigate to the Operators → Installed Operators page.
- Find Node Feature Discovery and see a box under Provided APIs.
- Click Create instance.
- Edit the values of the NodeFeatureDiscovery CR.
- Click Create.
4.4. Configuring the Node Feature Discovery Operator
4.4.1. core
The core section contains common configuration settings that are not specific to any particular feature source.

core.sleepInterval

core.sleepInterval specifies the interval between consecutive passes of feature detection or re-detection, and thereby also the interval between node re-labeling. A non-positive value implies an infinite sleep interval; no re-detection or re-labeling is done.

This value is overridden by the deprecated --sleep-interval command line flag, if specified.

Example usage

core:
  sleepInterval: 60s

The default value is 60s.
core.sources

core.sources specifies the list of enabled feature sources. A special value of all enables all feature sources.

This value is overridden by the deprecated --sources command line flag, if specified.

Default: [all]
Example usage
core:
sources:
- system
- custom
core.labelWhiteList

core.labelWhiteList specifies a regular expression for filtering feature labels based on the label name. Non-matching labels are not published.

The regular expression is only matched against the basename part of the label, the part of the name after '/'. The label prefix, or namespace, is omitted.

This value is overridden by the deprecated --label-whitelist command line flag, if specified.

Default: null
Example usage
core:
labelWhiteList: '^cpu-cpuid'
core.noPublish

Setting core.noPublish to true disables all communication with the nfd-master. It is effectively a dry run flag; nfd-worker runs feature detection normally, but no labeling requests are sent to nfd-master.

This value is overridden by the --no-publish command line flag, if specified.

Example usage

core:
  noPublish: true

The default value is false.
core.klog
The following options specify the logger configuration, most of which can be dynamically adjusted at run-time.
The logger options can also be specified using command line flags, which take precedence over any corresponding config file options.
core.klog.addDirHeader
If set to true, core.klog.addDirHeader adds the file directory to the header of the log messages.

Default: false
Run-time configurable: yes
core.klog.alsologtostderr
Log to standard error as well as files.
Default: false
Run-time configurable: yes
core.klog.logBacktraceAt
When logging hits line file:N, emit a stack trace.
Default: empty
Run-time configurable: yes
core.klog.logDir
If non-empty, write log files in this directory.
Default: empty
Run-time configurable: no
core.klog.logFile
If not empty, use this log file.
Default: empty
Run-time configurable: no
core.klog.logFileMaxSize
core.klog.logFileMaxSize defines the maximum size a log file can grow to. The unit is megabytes. If the value is 0, the maximum file size is unlimited.

Default: 1800
Run-time configurable: no
core.klog.logtostderr
Log to standard error instead of files.

Default: true
Run-time configurable: yes
core.klog.skipHeaders
If core.klog.skipHeaders is set to true, avoid header prefixes in the log messages.

Default: false
Run-time configurable: yes
core.klog.skipLogHeaders
If core.klog.skipLogHeaders is set to true, avoid headers when opening log files.

Default: false
Run-time configurable: no
core.klog.stderrthreshold
Logs at or above this threshold go to stderr.
Default: 2
Run-time configurable: yes
core.klog.v
core.klog.v is the number for the log level verbosity.

Default: 0
Run-time configurable: yes
core.klog.vmodule
core.klog.vmodule is a comma-separated list of pattern=N settings for file-filtered logging.

Default: empty
Run-time configurable: yes
4.4.2. sources
The sources section contains feature source specific configuration parameters.

sources.cpu.cpuid.attributeBlacklist

Prevent publishing cpuid features listed in this option.

This value is overridden by sources.cpu.cpuid.attributeWhitelist, if specified.

Default: [BMI1, BMI2, CLMUL, CMOV, CX16, ERMS, F16C, HTT, LZCNT, MMX, MMXEXT, NX, POPCNT, RDRAND, RDSEED, RDTSCP, SGX, SGXLC, SSE, SSE2, SSE3, SSE4.1, SSE4.2, SSSE3]
Example usage
sources:
cpu:
cpuid:
attributeBlacklist: [MMX, MMXEXT]
sources.cpu.cpuid.attributeWhitelist
Only publish the cpuid features listed in this option.

sources.cpu.cpuid.attributeWhitelist takes precedence over sources.cpu.cpuid.attributeBlacklist.

Default: empty
Example usage
sources:
cpu:
cpuid:
attributeWhitelist: [AVX512BW, AVX512CD, AVX512DQ, AVX512F, AVX512VL]
sources.kernel.kconfigFile
sources.kernel.kconfigFile is the path of the kernel config file. If empty, NFD runs a search in the well-known standard locations.

Default: empty
Example usage
sources:
kernel:
kconfigFile: "/path/to/kconfig"
sources.kernel.configOpts
sources.kernel.configOpts represents kernel configuration options to publish as feature labels.

Default: [NO_HZ, NO_HZ_IDLE, NO_HZ_FULL, PREEMPT]
Example usage
sources:
kernel:
configOpts: [NO_HZ, X86, DMI]
sources.pci.deviceClassWhitelist
sources.pci.deviceClassWhitelist is a list of PCI device class IDs for which to publish a label. It can be specified as a main class only (for example, 03) or as a full class-subclass combination (for example, 0300). The former implies that all subclasses are accepted. The format of the labels can be further configured with deviceLabelFields.

Default: ["03", "0b40", "12"]
Example usage
sources:
pci:
deviceClassWhitelist: ["0200", "03"]
sources.pci.deviceLabelFields
sources.pci.deviceLabelFields is the set of PCI ID fields to use when constructing the name of the feature label. Valid fields are class, vendor, device, subsystem_vendor, and subsystem_device.

Default: [class, vendor]
Example usage
sources:
pci:
deviceLabelFields: [class, vendor, device]
With the example config above, NFD would publish labels such as
feature.node.kubernetes.io/pci-<class-id>_<vendor-id>_<device-id>.present=true
sources.usb.deviceClassWhitelist
sources.usb.deviceClassWhitelist is a list of USB device class IDs for which to publish a feature label. The format of the labels can be further configured with deviceLabelFields.

Default: ["0e", "ef", "fe", "ff"]
Example usage
sources:
usb:
deviceClassWhitelist: ["ef", "ff"]
sources.usb.deviceLabelFields
sources.usb.deviceLabelFields is the set of USB ID fields from which to compose the name of the feature label. Valid fields are class, vendor, and device.

Default: [class, vendor, device]
Example usage
sources:
pci:
deviceLabelFields: [class, vendor]
With the example config above, NFD would publish labels like:
feature.node.kubernetes.io/usb-<class-id>_<vendor-id>.present=true
sources.custom
The sources.custom section contains a list of user-defined features to create additional user-specific labels.

Default: empty
Example usage
source:
custom:
- name: "my.custom.feature"
matchOn:
- loadedKMod: ["e1000e"]
- pciId:
class: ["0200"]
vendor: ["8086"]
4.5. Using the NFD Topology Updater
The Node Feature Discovery (NFD) Topology Updater is a daemon responsible for examining allocated resources on a worker node. It accounts for resources that are available to be allocated to new pods on a per-zone basis, where a zone can be a Non-Uniform Memory Access (NUMA) node. The NFD Topology Updater communicates the information to nfd-master, which creates a NodeResourceTopology custom resource (CR) corresponding to the worker node.
To enable the Topology Updater workers in NFD, set the topologyupdater variable to true in the NodeFeatureDiscovery CR described in "Using the Node Feature Discovery Operator", as shown in the sketch that follows.
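The following minimal snippet shows only the relevant field. It assumes the nfd-instance CR from the earlier example and omits the rest of the spec, which stays unchanged:

apiVersion: nfd.openshift.io/v1
kind: NodeFeatureDiscovery
metadata:
  name: nfd-instance
  namespace: openshift-nfd
spec:
  topologyupdater: true  # enable the NFD Topology Updater workers
  # ... remaining fields unchanged from the earlier NodeFeatureDiscovery example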
4.5.1. NodeResourceTopology CR
When run with NFD Topology Updater, NFD creates custom resource instances corresponding to the node resource hardware topology, such as:
apiVersion: topology.node.k8s.io/v1alpha1
kind: NodeResourceTopology
metadata:
name: node1
topologyPolicies: ["SingleNUMANodeContainerLevel"]
zones:
- name: node-0
type: Node
resources:
- name: cpu
capacity: 20
allocatable: 16
available: 10
- name: vendor/nic1
capacity: 3
allocatable: 3
available: 3
- name: node-1
type: Node
resources:
- name: cpu
capacity: 30
allocatable: 30
available: 15
- name: vendor/nic2
capacity: 6
allocatable: 6
available: 6
- name: node-2
type: Node
resources:
- name: cpu
capacity: 30
allocatable: 30
available: 15
- name: vendor/nic1
capacity: 3
allocatable: 3
available: 3
4.5.2. NFD Topology Updater command line flags
To view available command line flags, run the nfd-topology-updater -help command. For example, in a podman container, run the following command:
$ podman run gcr.io/k8s-staging-nfd/node-feature-discovery:master nfd-topology-updater -help
-ca-file
The -ca-file flag is one of the three flags, together with the -cert-file and -key-file flags, that controls mutual TLS authentication on the NFD Topology Updater. This flag specifies the TLS root certificate that is used for verifying the authenticity of nfd-master.

Default: empty

The -ca-file flag must be specified together with the -cert-file and -key-file flags.
Example
$ nfd-topology-updater -ca-file=/opt/nfd/ca.crt -cert-file=/opt/nfd/updater.crt -key-file=/opt/nfd/updater.key
-cert-file
The -cert-file flag is one of the three flags, together with the -ca-file and -key-file flags, that controls mutual TLS authentication on the NFD Topology Updater. This flag specifies the TLS certificate presented for authenticating outgoing requests.

Default: empty

The -cert-file flag must be specified together with the -ca-file and -key-file flags.
Example
$ nfd-topology-updater -cert-file=/opt/nfd/updater.crt -key-file=/opt/nfd/updater.key -ca-file=/opt/nfd/ca.crt
-h, -help
Print usage and exit.
-key-file
The -key-file flag is one of the three flags, together with the -ca-file and -cert-file flags, that controls mutual TLS authentication on the NFD Topology Updater. This flag specifies the private key corresponding to the given certificate file, or -cert-file, that is used for authenticating outgoing requests.

Default: empty

The -key-file flag must be specified together with the -ca-file and -cert-file flags.
Example
$ nfd-topology-updater -key-file=/opt/nfd/updater.key -cert-file=/opt/nfd/updater.crt -ca-file=/opt/nfd/ca.crt
-kubelet-config-file
The -kubelet-config-file flag specifies the path to the Kubelet's configuration file.

Default: /host-var/lib/kubelet/config.yaml
Example
$ nfd-topology-updater -kubelet-config-file=/var/lib/kubelet/config.yaml
-no-publish
The -no-publish flag disables all communication with the nfd-master, making it a dry run flag for nfd-topology-updater. NFD Topology Updater runs resource hardware topology detection normally, but no CR requests are sent to nfd-master.

Default: false
Example
$ nfd-topology-updater -no-publish
4.5.2.1. -oneshot
The -oneshot flag causes the NFD Topology Updater to exit after one pass of resource hardware topology detection.

Default: false
Example
$ nfd-topology-updater -oneshot -no-publish
-podresources-socket
The -podresources-socket flag specifies the path to the Unix socket where kubelet exports a gRPC service to enable discovery of in-use CPUs and devices, and to provide metadata for them.

Default: /host-var/lib/kubelet/pod-resources/kubelet.sock
Example
$ nfd-topology-updater -podresources-socket=/var/lib/kubelet/pod-resources/kubelet.sock
-server
The -server flag specifies the address of the nfd-master endpoint to connect to.

Default: localhost:8080
Example
$ nfd-topology-updater -server=nfd-master.nfd.svc.cluster.local:443
-server-name-override
The -server-name-override flag specifies the common name (CN) to expect from the nfd-master TLS certificate. This flag is mostly intended for development and debugging purposes.

Default: empty
Example
$ nfd-topology-updater -server-name-override=localhost
-sleep-interval
The -sleep-interval flag specifies the interval between resource hardware topology re-examination and custom resource updates. A non-positive value implies an infinite sleep interval; no re-detection is done.

Default: 60s
Example
$ nfd-topology-updater -sleep-interval=1h
-version
Print version and exit.
-watch-namespace
The -watch-namespace flag specifies the namespace to ensure that resource hardware topology examination only happens for the pods running in the specified namespace. Pods that are not running in the specified namespace are not considered during resource accounting. This is particularly useful for testing and debugging purposes. A * value means that all of the pods across all namespaces are considered during the accounting process.

Default: *
Example
$ nfd-topology-updater -watch-namespace=rte
Legal Notice
Copyright © Red Hat
OpenShift documentation is licensed under the Apache License 2.0 (https://www.apache.org/licenses/LICENSE-2.0).
Modified versions must remove all Red Hat trademarks.
Portions adapted from https://github.com/kubernetes-incubator/service-catalog/ with modifications by Red Hat.
Red Hat, Red Hat Enterprise Linux, the Red Hat logo, the Shadowman logo, JBoss, OpenShift, Fedora, the Infinity logo, and RHCE are trademarks of Red Hat, Inc., registered in the United States and other countries.
Linux® is the registered trademark of Linus Torvalds in the United States and other countries.
Java® is a registered trademark of Oracle and/or its affiliates.
XFS® is a trademark of Silicon Graphics International Corp. or its subsidiaries in the United States and/or other countries.
MySQL® is a registered trademark of MySQL AB in the United States, the European Union and other countries.
Node.js® is an official trademark of the OpenJS Foundation.
The OpenStack® Word Mark and OpenStack logo are either registered trademarks/service marks or trademarks/service marks of the OpenStack Foundation, in the United States and other countries and are used with the OpenStack Foundation’s permission. We are not affiliated with, endorsed or sponsored by the OpenStack Foundation, or the OpenStack community.
All other trademarks are the property of their respective owners.