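18.4. Running the latency tests
Run the cluster latency tests to validate node tuning for your Cloud-native Network Function (CNF) workload.
Always run the latency tests with DISCOVERY_MODE=true set. If you do not, the test suite makes changes to the running cluster configuration.
When you run podman commands as a non-root or non-privileged user, mounting paths can fail with permission denied errors. To make the podman command work, append :Z to the volume definition, for example -v $(pwd)/:/kubeconfig:Z. This allows podman to do the proper SELinux relabeling.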
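Procedure
- Open a shell prompt in the directory containing the kubeconfig file.
  You provide the test image with a kubeconfig file in the current directory and its related $KUBECONFIG environment variable, mounted through a volume. This allows the running container to use the kubeconfig file from inside the container.
- Run the latency tests by entering the following command:

  $ podman run -v $(pwd)/:/kubeconfig:Z -e KUBECONFIG=/kubeconfig/kubeconfig \
      -e LATENCY_TEST_RUN=true -e DISCOVERY_MODE=true registry.redhat.io/openshift4/cnf-tests-rhel8:v4.8 \
      /usr/bin/test-run.sh -ginkgo.focus="\[performance\]\ Latency\ Test"

- Optional: Append -ginkgo.dryRun to run the latency tests in dry-run mode. This is useful for checking what the tests run.
- Optional: Append -ginkgo.v to run the tests with increased verbosity.
- Optional: To run the latency tests against a specific performance profile, run the following command, substituting appropriate values:

  $ podman run -v $(pwd)/:/kubeconfig:Z -e KUBECONFIG=/kubeconfig/kubeconfig \
      -e LATENCY_TEST_RUN=true -e LATENCY_TEST_RUNTIME=600 -e OSLAT_MAXIMUM_LATENCY=20 \
      -e PERF_TEST_PROFILE=<performance_profile> registry.redhat.io/openshift4/cnf-tests-rhel8:v4.8 \
      /usr/bin/test-run.sh -ginkgo.focus="\[performance\]\ Latency\ Test"

  where:
  - <performance_profile>
    is the name of the performance profile that you want to run the latency tests against.

Important: For valid latency test results, run the tests for at least 12 hours.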
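The commands above mount the current directory at /kubeconfig and set KUBECONFIG=/kubeconfig/kubeconfig, so the container looks for a file literally named kubeconfig in the directory where you run podman. The following is a minimal pre-flight sketch built on that observation. The function name check_kubeconfig_dir is hypothetical and is not part of the cnf-tests image or of podman.

```shell
# Hypothetical helper (an assumption for illustration, not shipped with cnf-tests).
# Checks that a file named "kubeconfig" exists in the given directory, because the
# documented command mounts $(pwd) at /kubeconfig and sets
# KUBECONFIG=/kubeconfig/kubeconfig.
check_kubeconfig_dir() {
    # Default to the current directory when no argument is given.
    if [ -f "${1:-$PWD}/kubeconfig" ]; then
        echo "ok: ${1:-$PWD}/kubeconfig"
        return 0
    fi
    echo "missing: ${1:-$PWD}/kubeconfig" >&2
    return 1
}

# Usage: report on the current directory without aborting the shell.
check_kubeconfig_dir "$PWD" || echo "copy or rename your kubeconfig file before running podman"
```

Running the check first avoids starting the container only to have the suite fail because no kubeconfig file is visible inside it.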
18.4.1. Running oslat
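The oslat test simulates a CPU-intensive DPDK application and measures all interruptions and disruptions to test how the cluster handles CPU-heavy data processing.
Always run the latency tests with DISCOVERY_MODE=true set. If you do not, the test suite makes changes to the running cluster configuration.
When you run podman commands as a non-root or non-privileged user, mounting paths can fail with permission denied errors. To make the podman command work, append :Z to the volume definition, for example -v $(pwd)/:/kubeconfig:Z. This allows podman to do the proper SELinux relabeling.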
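Prerequisites
- You have logged in to registry.redhat.io with your Customer Portal credentials.
- You have applied a cluster performance profile by using the Performance addon operator.
Procedure
- To perform the oslat test, run the following command, substituting variable values as appropriate:

  $ podman run -v $(pwd)/:/kubeconfig:Z -e KUBECONFIG=/kubeconfig/kubeconfig \
      -e LATENCY_TEST_RUN=true -e DISCOVERY_MODE=true -e ROLE_WORKER_CNF=worker-cnf \
      -e LATENCY_TEST_CPUS=7 -e LATENCY_TEST_RUNTIME=600 -e OSLAT_MAXIMUM_LATENCY=20 \
      registry.redhat.io/openshift4/cnf-tests-rhel8:v4.8 \
      /usr/bin/test-run.sh -ginkgo.v -ginkgo.focus="oslat"

  LATENCY_TEST_CPUS specifies the list of CPUs to test with the oslat command.
  The command runs the oslat tool for 10 minutes (600 seconds). The test succeeds when the maximum observed latency is lower than OSLAT_MAXIMUM_LATENCY (20 μs). If the results exceed the latency threshold, the test fails.

Important: For valid results, run the test for at least 12 hours.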
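Because LATENCY_TEST_RUNTIME=600 corresponds to 10 minutes, the value is in seconds, and a 12-hour run works out to 12 × 60 × 60 = 43200. The sketch below does that arithmetic and prints, without executing, the oslat command with the longer runtime. The function names hours_to_seconds and print_oslat_command are hypothetical, and every other flag is copied unchanged from the command above. Whether the suite needs further settings for a run of this length is not covered here.

```shell
# Convert whole hours to seconds for LATENCY_TEST_RUNTIME.
hours_to_seconds() {
    echo $(( $1 * 60 * 60 ))
}

# Print (do not run) the oslat command with a runtime of the given number of hours.
# Only LATENCY_TEST_RUNTIME differs from the documented command.
print_oslat_command() {
    runtime="$(hours_to_seconds "$1")"
    cat <<EOF
podman run -v \$(pwd)/:/kubeconfig:Z -e KUBECONFIG=/kubeconfig/kubeconfig \\
    -e LATENCY_TEST_RUN=true -e DISCOVERY_MODE=true -e ROLE_WORKER_CNF=worker-cnf \\
    -e LATENCY_TEST_CPUS=7 -e LATENCY_TEST_RUNTIME=${runtime} -e OSLAT_MAXIMUM_LATENCY=20 \\
    registry.redhat.io/openshift4/cnf-tests-rhel8:v4.8 \\
    /usr/bin/test-run.sh -ginkgo.v -ginkgo.focus="oslat"
EOF
}

# Usage: show the command for a 12-hour run so it can be reviewed before execution.
print_oslat_command 12
```

Printing the command rather than running it keeps the sketch free of side effects; paste the output into a shell once the values look right.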
Example failure output
running /usr/bin//validationsuite -ginkgo.v -ginkgo.focus=oslat
I0829 12:36:55.386776 8 request.go:668] Waited for 1.000303471s due to client-side throttling, not priority and fairness, request: GET:https://api.cnfdc8.t5g.lab.eng.bos.redhat.com:6443/apis/authentication.k8s.io/v1?timeout=32s
Running Suite: CNF Features e2e validation
==========================================

Discovery mode enabled, skipping setup
running /usr/bin//cnftests -ginkgo.v -ginkgo.focus=oslat
I0829 12:37:01.219077 20 request.go:668] Waited for 1.050010755s due to client-side throttling, not priority and fairness, request: GET:https://api.cnfdc8.t5g.lab.eng.bos.redhat.com:6443/apis/snapshot.storage.k8s.io/v1beta1?timeout=32s
Running Suite: CNF Features e2e integration tests
=================================================
Random Seed: 1630240617
Will run 1 of 142 specs

SSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSS
------------------------------
[performance] Latency Test with the oslat image
  should succeed
  /go/src/github.com/openshift-kni/cnf-features-deploy/vendor/github.com/openshift-kni/performance-addon-operators/functests/4_latency/latency.go:134
STEP: Waiting two minutes to download the latencyTest image
STEP: Waiting another two minutes to give enough time for the cluster to move the pod to Succeeded phase
Aug 29 12:37:59.324: [INFO]: found mcd machine-config-daemon-wf4w8 for node cnfdc8.clus2.t5g.lab.eng.bos.redhat.com

• Failure [49.246 seconds]
[performance] Latency Test
/go/src/github.com/openshift-kni/cnf-features-deploy/vendor/github.com/openshift-kni/performance-addon-operators/functests/4_latency/latency.go:59
  with the oslat image
  /go/src/github.com/openshift-kni/cnf-features-deploy/vendor/github.com/openshift-kni/performance-addon-operators/functests/4_latency/latency.go:112
    should succeed [It]
    /go/src/github.com/openshift-kni/cnf-features-deploy/vendor/github.com/openshift-kni/performance-addon-operators/functests/4_latency/latency.go:134

    The current latency 27 is bigger than the expected one 20 (1)
    Expected
        <bool>: false
    to be true

    /go/src/github.com/openshift-kni/cnf-features-deploy/vendor/github.com/openshift-kni/performance-addon-operators/functests/4_latency/latency.go:168

Log file created at: 2021/08/29 13:25:21
Running on machine: oslat-57c2g
Binary: Built with gc go1.16.6 for linux/amd64
Log line format: [IWEF]mmdd hh:mm:ss.uuuuuu threadid file:line] msg
I0829 13:25:21.569182 1 node.go:37] Environment information: /proc/cmdline: BOOT_IMAGE=(hd0,gpt3)/ostree/rhcos-612d89f4519a53ad0b1a132f4add78372661bfb3994f5fe115654971aa58a543/vmlinuz-4.18.0-305.10.2.rt7.83.el8_4.x86_64 ip=dhcp random.trust_cpu=on console=tty0 console=ttyS0,115200n8 ostree=/ostree/boot.0/rhcos/612d89f4519a53ad0b1a132f4add78372661bfb3994f5fe115654971aa58a543/0 ignition.platform.id=openstack root=UUID=5a4ddf16-9372-44d9-ac4e-3ee329e16ab3 rw rootflags=prjquota skew_tick=1 nohz=on rcu_nocbs=1-3 tuned.non_isolcpus=000000ff,ffffffff,ffffffff,fffffff1 intel_pstate=disable nosoftlockup tsc=nowatchdog intel_iommu=on iommu=pt isolcpus=managed_irq,1-3 systemd.cpu_affinity=0,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24,25,26,27,28,29,30,31,32,33,34,35,36,37,38,39,40,41,42,43,44,45,46,47,48,49,50,51,52,53,54,55,56,57,58,59,60,61,62,63,64,65,66,67,68,69,70,71,72,73,74,75,76,77,78,79,80,81,82,83,84,85,86,87,88,89,90,91,92,93,94,95,96,97,98,99,100,101,102,103 default_hugepagesz=1G hugepagesz=2M hugepages=128 nmi_watchdog=0 audit=0 mce=off processor.max_cstate=1 idle=poll intel_idle.max_cstate=0
I0829 13:25:21.569345 1 node.go:44] Environment information: kernel version 4.18.0-305.10.2.rt7.83.el8_4.x86_64
I0829 13:25:21.569367 1 main.go:53] Running the oslat command with arguments \
    [--duration 600 --rtprio 1 --cpu-list 4,6,52,54,56,58 --cpu-main-thread 2]
I0829 13:35:22.632263 1 main.go:59] Succeeded to run the oslat command: oslat V 2.00
Total runtime: 600 seconds
Thread priority: SCHED_FIFO:1
CPU list: 4,6,52,54,56,58
CPU for main thread: 2
Workload: no
Workload mem: 0 (KiB)
Preheat cores: 6

Pre-heat for 1 seconds...
Test starts...
Test completed.

Core:                4           6          52          54          56          58
CPU Freq:         2096        2096        2096        2096        2096        2096 (Mhz)
001 (us):  19390720316 19141129810 20265099129 20280959461 19391991159 19119877333
002 (us):         5304        5249        5777        5947        6829        4971
003 (us):           28          14         434          47         208          21
004 (us):         1388         853      123568      152817        5576           0
005 (us):       207850      223544      103827       91812      227236      231563
006 (us):        60770      122038      277581      323120      122633      122357
007 (us):       280023      223992       63016       25896      214194      218395
008 (us):        40604       25152       24368        4264       24440       25115
009 (us):         6858        3065        5815         810        3286        2116
010 (us):         1947         936        1452         151         474         361
...
Minimum:             1           1           1           1           1           1 (us)
Average:         1.000       1.000       1.000       1.000       1.000       1.000 (us)
Maximum:            37          38          49          28          28          19 (us)
Max-Min:            36          37          48          27          27          18 (us)
Duration:      599.667     599.667     599.667     599.667     599.667     599.667 (sec)

(1) In this example, the measured latency is outside the maximum allowed value.
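When reviewing saved oslat output by hand, it can help to pull out the per-core Maximum row and compare it with the threshold yourself. The following is a minimal sketch that assumes the row format shown in the example output (a Maximum: label, whitespace-separated integers, then a unit). The function names are hypothetical. The sketch takes the largest value in the row; how the test suite derives the latency figure in its own failure message is not specified here, so the two numbers may differ.

```shell
# Print the largest per-core value from the "Maximum:" row of oslat output on stdin.
# Exits non-zero if no such row is found.
oslat_max_latency() {
    awk '$1 == "Maximum:" {
        for (i = 2; i <= NF; i++)
            if ($i ~ /^[0-9]+$/ && $i + 0 > max) max = $i + 0
        found = 1
    }
    END { if (found) print max; else exit 1 }'
}

# Return 0 if the largest maximum on stdin is lower than the threshold in $1
# (microseconds), 1 if it is not, and 2 if no Maximum row was found.
oslat_below_threshold() {
    max="$(oslat_max_latency)" || return 2
    [ "$max" -lt "$1" ]
}

# Usage with the Maximum row from the example failure output.
sample='Maximum: 37 38 49 28 28 19 (us)'
printf '%s\n' "$sample" | oslat_max_latency
if printf '%s\n' "$sample" | oslat_below_threshold 20; then
    echo "below threshold"
else
    echo "not below threshold"
fi
```

The strict less-than comparison mirrors the statement that the test succeeds when the maximum observed latency is lower than OSLAT_MAXIMUM_LATENCY.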