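5.2. Disabling the IRDMA kernel module
On some systems, including the Dell R750xa, the IRDMA kernel module causes problems for the NVIDIA Network Operator when it unloads and loads the DOCA drivers. Use the following procedure to disable the module.
Procedure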
Generate the following machine configuration file by running the following command:
$ cat <<EOF > 99-machine-config-blacklist-irdma.yaml
apiVersion: machineconfiguration.openshift.io/v1
kind: MachineConfig
metadata:
  labels:
    machineconfiguration.openshift.io/role: worker
  name: 99-worker-blacklist-irdma
spec:
  kernelArguments:
    - "module_blacklist=irdma"
EOF
Create the machine configuration on the cluster and wait for the nodes to reboot by running the following command:
$ oc create -f 99-machine-config-blacklist-irdma.yaml
Example output
machineconfig.machineconfiguration.openshift.io/99-worker-blacklist-irdma created
Verify in a debug pod on each node that the module has not been loaded by running the following command:
$ oc debug node/nvd-srv-32.nvidia.eng.rdu2.dc.redhat.com
Starting pod/nvd-srv-32nvidiaengrdu2dcredhatcom-debug-btfj2 ...
To use host binaries, run `chroot /host`
Pod IP: 10.6.135.11
If you don't see a command prompt, try pressing enter.
sh-5.1# chroot /host
sh-5.1# lsmod|grep irdma
sh-5.1#
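On clusters with many worker nodes, the per-node check can be scripted. The following is a minimal sketch, not part of the documented procedure. The function names `module_loaded` and `check_workers` are hypothetical. It assumes that the worker nodes carry the standard `node-role.kubernetes.io/worker` label and that `oc debug` can run a non-interactive command on each node. Unlike `lsmod|grep irdma`, the helper matches the module name only in the first column of the `lsmod` output, so modules that merely list `irdma` as a dependent are not counted.

```shell
#!/usr/bin/env bash
# Sketch only: helper names below are hypothetical, not part of any product.

# Reads `lsmod` output on stdin and succeeds (exit 0) only if a module
# named exactly "$1" appears in the first column.
module_loaded() {
  awk -v mod="$1" '$1 == mod { found = 1 } END { exit (found ? 0 : 1) }'
}

# Checks every worker node through a debug pod.
# Assumption: workers are labeled node-role.kubernetes.io/worker.
check_workers() {
  local node out status=0
  for node in $(oc get nodes -l node-role.kubernetes.io/worker -o name); do
    # Capture the output first so that a failed debug session is not
    # mistaken for "module not loaded".
    if ! out=$(oc debug "$node" -- chroot /host lsmod 2>/dev/null); then
      echo "$node: could not run lsmod"
      status=1
      continue
    fi
    if printf '%s\n' "$out" | module_loaded irdma; then
      echo "$node: irdma is still loaded"
      status=1
    else
      echo "$node: irdma is not loaded"
    fi
  done
  return "$status"
}
```

Run `check_workers` after the nodes have finished rebooting. A nonzero return status means that at least one node still has the module loaded or could not be checked.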