15.4. 运行延迟测试
运行集群延迟测试,以验证 Cloud-native Network Function (CNF) 工作负载的节点调整。
始终使用 DISCOVERY_MODE=true 设置运行延迟测试。如果没有,测试套件将对正在运行的集群配置进行更改。
当以非 root 用户或非特权用户执行 podman 命令时,挂载路径可能会失败,错误为 permission denied。要使 podman 命令正常工作,请将 :Z 附加到卷创建中,例如 -v $(pwd)/:/kubeconfig:Z。这允许 podman 进行正确的 SELinux 重新标记。
流程
在包含
kubeconfig文件的目录中打开 shell 提示符。您可以在当前目录中为测试镜像提供
kubeconfig文件,及其相关的$KUBECONFIG环境变量(通过卷挂载)。这允许运行的容器使用容器内的kubeconfig文件。输入以下命令运行延迟测试:
$ podman run -v $(pwd)/:/kubeconfig:Z -e KUBECONFIG=/kubeconfig/kubeconfig \ -e LATENCY_TEST_RUN=true -e DISCOVERY_MODE=true registry.redhat.io/openshift4/cnf-tests-rhel8:v4.10 \ /usr/bin/test-run.sh -ginkgo.focus="\[performance\]\ Latency\ Test"-
可选:附加
-ginkgo.dryRun以空运行模式运行延迟测试。这对于检查测试运行的内容非常有用。 -
可选: 附加
-ginkgo.v用来运行测试,并增加详细程度。 可选: 要针对特定的性能配置集运行延迟测试,请运行以下命令,替换适当的值:
$ podman run -v $(pwd)/:/kubeconfig:Z -e KUBECONFIG=/kubeconfig/kubeconfig \ -e LATENCY_TEST_RUN=true -e LATENCY_TEST_RUNTIME=600 -e MAXIMUM_LATENCY=20 \ -e PERF_TEST_PROFILE=<performance_profile> registry.redhat.io/openshift4/cnf-tests-rhel8:v4.10 \ /usr/bin/test-run.sh -ginkgo.focus="[performance]\ Latency\ Test"其中:
- <performance_profile>
- 是您要对其运行延迟测试的性能配置集的名称。
重要如需有效延迟测试结果,请至少运行测试 12 小时。
15.4.1. 运行 hwlatdetect 复制链接链接已复制到粘贴板!
hwlatdetect 工具位于 rt-kernel 软件包中,带有常规订阅 Red Hat Enterprise Linux (RHEL) 8.x。
始终使用 DISCOVERY_MODE=true 设置运行延迟测试。如果没有,测试套件将对正在运行的集群配置进行更改。
当以非 root 用户或非特权用户执行 podman 命令时,挂载路径可能会失败,错误为 permission denied。要使 podman 命令正常工作,请将 :Z 附加到卷创建中,例如 -v $(pwd)/:/kubeconfig:Z。这允许 podman 进行正确的 SELinux 重新标记。
先决条件
- 已在集群中安装了实时内核。
-
您使用客户门户网站凭证登录到
registry.redhat.io。
流程
要运行
hwlatdetect测试,请运行以下命令,并根据情况替换变量值:$ podman run -v $(pwd)/:/kubeconfig:Z -e KUBECONFIG=/kubeconfig/kubeconfig \ -e LATENCY_TEST_RUN=true -e DISCOVERY_MODE=true -e ROLE_WORKER_CNF=worker-cnf \ -e LATENCY_TEST_RUNTIME=600 -e MAXIMUM_LATENCY=20 \ registry.redhat.io/openshift4/cnf-tests-rhel8:v4.10 \ /usr/bin/test-run.sh -ginkgo.v -ginkgo.focus="hwlatdetect"hwlatdetect测试运行了 10 分钟 (600 秒)。当最观察到的延迟低于MAXIMUM_LATENCY(20 FORWARD) 时,测试会成功运行。如果结果超过延迟阈值,测试会失败。
重要对于有效结果,测试应至少运行 12 小时。
失败输出示例
running /usr/bin/validationsuite -ginkgo.v -ginkgo.focus=hwlatdetect I0210 17:08:38.607699 7 request.go:668] Waited for 1.047200253s due to client-side throttling, not priority and fairness, request: GET:https://api.ocp.demo.lab:6443/apis/apps.openshift.io/v1?timeout=32s Running Suite: CNF Features e2e validation ========================================== Random Seed: 1644512917 Will run 0 of 48 specs SSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSS Ran 0 of 48 Specs in 0.001 seconds SUCCESS! -- 0 Passed | 0 Failed | 0 Pending | 48 Skipped PASS Discovery mode enabled, skipping setup running /usr/bin/cnftests -ginkgo.v -ginkgo.focus=hwlatdetect I0210 17:08:41.179269 40 request.go:668] Waited for 1.046001096s due to client-side throttling, not priority and fairness, request: GET:https://api.ocp.demo.lab:6443/apis/storage.k8s.io/v1beta1?timeout=32s Running Suite: CNF Features e2e integration tests ================================================= Random Seed: 1644512920 Will run 1 of 151 specs SSSSSSS ------------------------------ [performance] Latency Test with the hwlatdetect image should succeed /remote-source/app/vendor/github.com/openshift-kni/performance-addon-operators/functests/4_latency/latency.go:221 STEP: Waiting two minutes to download the latencyTest image STEP: Waiting another two minutes to give enough time for the cluster to move the pod to Succeeded phase Feb 10 17:10:56.045: [INFO]: found mcd machine-config-daemon-dzpw7 for node ocp-worker-0.demo.lab Feb 10 17:10:56.259: [INFO]: found mcd machine-config-daemon-dzpw7 for node ocp-worker-0.demo.lab Feb 10 17:11:56.825: [ERROR]: timed out waiting for the condition • Failure [193.903 seconds] [performance] Latency Test /remote-source/app/vendor/github.com/openshift-kni/performance-addon-operators/functests/4_latency/latency.go:60 with the hwlatdetect image /remote-source/app/vendor/github.com/openshift-kni/performance-addon-operators/functests/4_latency/latency.go:213 should succeed [It] /remote-source/app/vendor/github.com/openshift-kni/performance-addon-operators/functests/4_latency/latency.go:221 Log file created at: 2022/02/10 17:08:45 Running on machine: hwlatdetect-cd8b6 Binary: Built with gc go1.16.6 for linux/amd64 Log line format: [IWEF]mmdd hh:mm:ss.uuuuuu threadid file:line] msg I0210 17:08:45.716288 1 node.go:37] Environment information: /proc/cmdline: BOOT_IMAGE=(hd0,gpt3)/ostree/rhcos-56fabc639a679b757ebae30e5f01b2ebd38e9fde9ecae91c41be41d3e89b37f8/vmlinuz-4.18.0-305.34.2.rt7.107.el8_4.x86_64 random.trust_cpu=on console=tty0 console=ttyS0,115200n8 ignition.platform.id=qemu ostree=/ostree/boot.0/rhcos/56fabc639a679b757ebae30e5f01b2ebd38e9fde9ecae91c41be41d3e89b37f8/0 root=UUID=56731f4f-f558-46a3-85d3-d1b579683385 rw rootflags=prjquota skew_tick=1 nohz=on rcu_nocbs=3-5 tuned.non_isolcpus=ffffffc7 intel_pstate=disable nosoftlockup tsc=nowatchdog intel_iommu=on iommu=pt isolcpus=managed_irq,3-5 systemd.cpu_affinity=0,1,2,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24,25,26,27,28,29,30,31 + + I0210 17:08:45.716782 1 node.go:44] Environment information: kernel version 4.18.0-305.34.2.rt7.107.el8_4.x86_64 I0210 17:08:45.716861 1 main.go:50] running the hwlatdetect command with arguments [/usr/bin/hwlatdetect --threshold 1 --hardlimit 1 --duration 10 --window 10000000us --width 950000us] F0210 17:08:56.815204 1 main.go:53] failed to run hwlatdetect command; out: hwlatdetect: test duration 10 seconds detector: tracer parameters: Latency threshold: 1us1 Sample window: 10000000us Sample width: 950000us Non-sampling period: 9050000us Output File: None Starting test test finished Max Latency: 24us2 Samples recorded: 1 Samples exceeding threshold: 1 ts: 1644512927.163556381, inner:20, outer:24 ; err: exit status 1 goroutine 1 [running]: k8s.io/klog.stacks(0xc000010001, 0xc00012e000, 0x25b, 0x2710) /remote-source/app/vendor/k8s.io/klog/klog.go:875 +0xb9 k8s.io/klog.(*loggingT).output(0x5bed00, 0xc000000003, 0xc0000121c0, 0x53ea81, 0x7, 0x35, 0x0) /remote-source/app/vendor/k8s.io/klog/klog.go:829 +0x1b0 k8s.io/klog.(*loggingT).printf(0x5bed00, 0x3, 0x5082da, 0x33, 0xc000113f58, 0x2, 0x2) /remote-source/app/vendor/k8s.io/klog/klog.go:707 +0x153 k8s.io/klog.Fatalf(...) /remote-source/app/vendor/k8s.io/klog/klog.go:1276 main.main() /remote-source/app/cnf-tests/pod-utils/hwlatdetect-runner/main.go:53 +0x897 goroutine 6 [chan receive]: k8s.io/klog.(*loggingT).flushDaemon(0x5bed00) /remote-source/app/vendor/k8s.io/klog/klog.go:1010 +0x8b created by k8s.io/klog.init.0 /remote-source/app/vendor/k8s.io/klog/klog.go:411 +0xd8 goroutine 7 [chan receive]: k8s.io/klog/v2.(*loggingT).flushDaemon(0x5bede0) /remote-source/app/vendor/k8s.io/klog/v2/klog.go:1169 +0x8b created by k8s.io/klog/v2.init.0 /remote-source/app/vendor/k8s.io/klog/v2/klog.go:420 +0xdf Unexpected error: <*errors.errorString | 0xc000418ed0>: { s: "timed out waiting for the condition", } timed out waiting for the condition occurred /remote-source/app/vendor/github.com/openshift-kni/performance-addon-operators/functests/4_latency/latency.go:433 ------------------------------ SSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSS JUnit report was created: /junit.xml/cnftests-junit.xml Summarizing 1 Failure: [Fail] [performance] Latency Test with the hwlatdetect image [It] should succeed /remote-source/app/vendor/github.com/openshift-kni/performance-addon-operators/functests/4_latency/latency.go:433 Ran 1 of 151 Specs in 222.254 seconds FAIL! -- 0 Passed | 1 Failed | 0 Pending | 150 Skipped --- FAIL: TestTest (222.45s) FAIL
hwlatdetect 测试结果示例
您可以捕获以下类型的结果:
- 在每次运行后收集的粗略结果,以便对整个测试过程中所做的任何更改产生影响的历史记录。
- 基本测试和配置设置的组合。
良好结果的示例
hwlatdetect: test duration 3600 seconds
detector: tracer
parameters:
Latency threshold: 10us
Sample window: 1000000us
Sample width: 950000us
Non-sampling period: 50000us
Output File: None
Starting test
test finished
Max Latency: Below threshold
Samples recorded: 0
hwlatdetect 工具仅在示例超过指定阈值时提供输出。
错误结果的示例
hwlatdetect: test duration 3600 seconds
detector: tracer
parameters:Latency threshold: 10usSample window: 1000000us
Sample width: 950000usNon-sampling period: 50000usOutput File: None
Starting tests:1610542421.275784439, inner:78, outer:81
ts: 1610542444.330561619, inner:27, outer:28
ts: 1610542445.332549975, inner:39, outer:38
ts: 1610542541.568546097, inner:47, outer:32
ts: 1610542590.681548531, inner:13, outer:17
ts: 1610543033.818801482, inner:29, outer:30
ts: 1610543080.938801990, inner:90, outer:76
ts: 1610543129.065549639, inner:28, outer:39
ts: 1610543474.859552115, inner:28, outer:35
ts: 1610543523.973856571, inner:52, outer:49
ts: 1610543572.089799738, inner:27, outer:30
ts: 1610543573.091550771, inner:34, outer:28
ts: 1610543574.093555202, inner:116, outer:63
hwlatdetect 的输出显示多个样本超过阈值。但是,相同的输出可能会根据以下因素显示不同的结果:
- 测试的持续时间
- CPU 内核数
- 主机固件设置
在继续执行下一个延迟测试前,请确保 hwlatdetect 报告的延迟满足所需的阈值。修复硬件带来的延迟可能需要您联系系统厂商支持。
并非所有延迟高峰都与硬件相关。确保调整主机固件以满足您的工作负载要求。如需更多信息,请参阅为系统调整设置固件参数。
15.4.2. 运行 cyclictest 复制链接链接已复制到粘贴板!
cyclictest 工具测量指定 CPU 上的实时内核调度程序延迟。
始终使用 DISCOVERY_MODE=true 设置运行延迟测试。如果没有,测试套件将对正在运行的集群配置进行更改。
当以非 root 用户或非特权用户执行 podman 命令时,挂载路径可能会失败,错误为 permission denied。要使 podman 命令正常工作,请将 :Z 附加到卷创建中,例如 -v $(pwd)/:/kubeconfig:Z。这允许 podman 进行正确的 SELinux 重新标记。
先决条件
-
您使用客户门户网站凭证登录到
registry.redhat.io。 - 已在集群中安装了实时内核。
- 已使用 Performance addon operator 应用了集群性能配置集。
流程
要执行
cyclictest,请运行以下命令,并根据情况替换变量值:$ podman run -v $(pwd)/:/kubeconfig:Z -e KUBECONFIG=/kubeconfig/kubeconfig \ -e LATENCY_TEST_RUN=true -e DISCOVERY_MODE=true -e ROLE_WORKER_CNF=worker-cnf \ -e LATENCY_TEST_CPUS=10 -e LATENCY_TEST_RUNTIME=600 -e MAXIMUM_LATENCY=20 \ registry.redhat.io/openshift4/cnf-tests-rhel8:v4.10 \ /usr/bin/test-run.sh -ginkgo.v -ginkgo.focus="cyclictest"该命令运行
cyclictest工具 10 分钟(600 秒)。当观察到的延迟低于MAXIMUM_LATENCY时,测试会成功运行(在本例中,20 TOKENs)。对于电信 RAN 工作负载,对 20 个以上延迟的激增通常并不能接受。如果结果超过延迟阈值,测试会失败。
重要对于有效结果,测试应至少运行 12 小时。
失败输出示例
Discovery mode enabled, skipping setup running /usr/bin//cnftests -ginkgo.v -ginkgo.focus=cyclictest I0811 15:02:36.350033 20 request.go:668] Waited for 1.049965918s due to client-side throttling, not priority and fairness, request: GET:https://api.cnfdc8.t5g.lab.eng.bos.redhat.com:6443/apis/machineconfiguration.openshift.io/v1?timeout=32s Running Suite: CNF Features e2e integration tests ================================================= Random Seed: 1628694153 Will run 1 of 138 specs SSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSS ------------------------------ [performance] Latency Test with the cyclictest image should succeed /go/src/github.com/openshift-kni/cnf-features-deploy/vendor/github.com/openshift-kni/performance-addon-operators/functests/4_latency/latency.go:200 STEP: Waiting two minutes to download the latencyTest image STEP: Waiting another two minutes to give enough time for the cluster to move the pod to Succeeded phase Aug 11 15:03:06.826: [INFO]: found mcd machine-config-daemon-wf4w8 for node cnfdc8.clus2.t5g.lab.eng.bos.redhat.com • Failure [22.527 seconds] [performance] Latency Test /go/src/github.com/openshift-kni/cnf-features-deploy/vendor/github.com/openshift-kni/performance-addon-operators/functests/4_latency/latency.go:84 with the cyclictest image /go/src/github.com/openshift-kni/cnf-features-deploy/vendor/github.com/openshift-kni/performance-addon-operators/functests/4_latency/latency.go:188 should succeed [It] /go/src/github.com/openshift-kni/cnf-features-deploy/vendor/github.com/openshift-kni/performance-addon-operators/functests/4_latency/latency.go:200 The current latency 27 is bigger than the expected one 20 Expected <bool>: false to be true /go/src/github.com/openshift-kni/cnf-features-deploy/vendor/github.com/openshift-kni/performance-addon-operators/functests/4_latency/latency.go:219 Log file created at: 2021/08/11 15:02:51 Running on machine: cyclictest-knk7d Binary: Built with gc go1.16.6 for linux/amd64 Log line format: [IWEF]mmdd hh:mm:ss.uuuuuu threadid file:line] msg I0811 15:02:51.092254 1 node.go:37] Environment information: /proc/cmdline: BOOT_IMAGE=(hd0,gpt3)/ostree/rhcos-612d89f4519a53ad0b1a132f4add78372661bfb3994f5fe115654971aa58a543/vmlinuz-4.18.0-305.10.2.rt7.83.el8_4.x86_64 ip=dhcp random.trust_cpu=on console=tty0 console=ttyS0,115200n8 ostree=/ostree/boot.1/rhcos/612d89f4519a53ad0b1a132f4add78372661bfb3994f5fe115654971aa58a543/0 ignition.platform.id=openstack root=UUID=5a4ddf16-9372-44d9-ac4e-3ee329e16ab3 rw rootflags=prjquota skew_tick=1 nohz=on rcu_nocbs=1-3 tuned.non_isolcpus=000000ff,ffffffff,ffffffff,fffffff1 intel_pstate=disable nosoftlockup tsc=nowatchdog intel_iommu=on iommu=pt isolcpus=managed_irq,1-3 systemd.cpu_affinity=0,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24,25,26,27,28,29,30,31,32,33,34,35,36,37,38,39,40,41,42,43,44,45,46,47,48,49,50,51,52,53,54,55,56,57,58,59,60,61,62,63,64,65,66,67,68,69,70,71,72,73,74,75,76,77,78,79,80,81,82,83,84,85,86,87,88,89,90,91,92,93,94,95,96,97,98,99,100,101,102,103 default_hugepagesz=1G hugepagesz=2M hugepages=128 nmi_watchdog=0 audit=0 mce=off processor.max_cstate=1 idle=poll intel_idle.max_cstate=0 I0811 15:02:51.092427 1 node.go:44] Environment information: kernel version 4.18.0-305.10.2.rt7.83.el8_4.x86_64 I0811 15:02:51.092450 1 main.go:48] running the cyclictest command with arguments \ [-D 600 -95 1 -t 10 -a 2,4,6,8,10,54,56,58,60,62 -h 30 -i 1000 --quiet] I0811 15:03:06.147253 1 main.go:54] succeeded to run the cyclictest command: # /dev/cpu_dma_latency set to 0us # Histogram 000000 000000 000000 000000 000000 000000 000000 000000 000000 000000 000000 000001 000000 005561 027778 037704 011987 000000 120755 238981 081847 300186 000002 587440 581106 564207 554323 577416 590635 474442 357940 513895 296033 000003 011751 011441 006449 006761 008409 007904 002893 002066 003349 003089 000004 000527 001079 000914 000712 001451 001120 000779 000283 000350 000251 More histogram entries ... # Min Latencies: 00002 00001 00001 00001 00001 00002 00001 00001 00001 00001 # Avg Latencies: 00002 00002 00002 00001 00002 00002 00001 00001 00001 00001 # Max Latencies: 00018 00465 00361 00395 00208 00301 02052 00289 00327 00114 # Histogram Overflows: 00000 00220 00159 00128 00202 00017 00069 00059 00045 00120 # Histogram Overflow at cycle number: # Thread 0: # Thread 1: 01142 01439 05305 … # 00190 others # Thread 2: 20895 21351 30624 … # 00129 others # Thread 3: 01143 17921 18334 … # 00098 others # Thread 4: 30499 30622 31566 ... # 00172 others # Thread 5: 145221 170910 171888 ... # Thread 6: 01684 26291 30623 ...# 00039 others # Thread 7: 28983 92112 167011 … 00029 others # Thread 8: 45766 56169 56171 ...# 00015 others # Thread 9: 02974 08094 13214 ... # 00090 others
cyclictest 结果示例
相同的输出可能会显示不同工作负载的结果。例如,spikes 最长为 18μs 对 4G DU 工作负载是可以接受的,但对于 5G DU 工作负载不能接受。
良好结果的示例
running cmd: cyclictest -q -D 10m -p 1 -t 16 -a 2,4,6,8,10,12,14,16,54,56,58,60,62,64,66,68 -h 30 -i 1000 -m
# Histogram
000000 000000 000000 000000 000000 000000 000000 000000 000000 000000 000000 000000 000000 000000 000000 000000 000000
000001 000000 000000 000000 000000 000000 000000 000000 000000 000000 000000 000000 000000 000000 000000 000000 000000
000002 579506 535967 418614 573648 532870 529897 489306 558076 582350 585188 583793 223781 532480 569130 472250 576043
More histogram entries ...
# Total: 000600000 000600000 000600000 000599999 000599999 000599999 000599998 000599998 000599998 000599997 000599997 000599996 000599996 000599995 000599995 000599995
# Min Latencies: 00002 00002 00002 00002 00002 00002 00002 00002 00002 00002 00002 00002 00002 00002 00002 00002
# Avg Latencies: 00002 00002 00002 00002 00002 00002 00002 00002 00002 00002 00002 00002 00002 00002 00002 00002
# Max Latencies: 00005 00005 00004 00005 00004 00004 00005 00005 00006 00005 00004 00005 00004 00004 00005 00004
# Histogram Overflows: 00000 00000 00000 00000 00000 00000 00000 00000 00000 00000 00000 00000 00000 00000 00000 00000
# Histogram Overflow at cycle number:
# Thread 0:
# Thread 1:
# Thread 2:
# Thread 3:
# Thread 4:
# Thread 5:
# Thread 6:
# Thread 7:
# Thread 8:
# Thread 9:
# Thread 10:
# Thread 11:
# Thread 12:
# Thread 13:
# Thread 14:
# Thread 15:
错误结果的示例
running cmd: cyclictest -q -D 10m -p 1 -t 16 -a 2,4,6,8,10,12,14,16,54,56,58,60,62,64,66,68 -h 30 -i 1000 -m
# Histogram
000000 000000 000000 000000 000000 000000 000000 000000 000000 000000 000000 000000 000000 000000 000000 000000 000000
000001 000000 000000 000000 000000 000000 000000 000000 000000 000000 000000 000000 000000 000000 000000 000000 000000
000002 564632 579686 354911 563036 492543 521983 515884 378266 592621 463547 482764 591976 590409 588145 589556 353518
More histogram entries ...
# Total: 000599999 000599999 000599999 000599997 000599997 000599998 000599998 000599997 000599997 000599996 000599995 000599996 000599995 000599995 000599995 000599993
# Min Latencies: 00002 00002 00002 00002 00002 00002 00002 00002 00002 00002 00002 00002 00002 00002 00002 00002
# Avg Latencies: 00002 00002 00002 00002 00002 00002 00002 00002 00002 00002 00002 00002 00002 00002 00002 00002
# Max Latencies: 00493 00387 00271 00619 00541 00513 00009 00389 00252 00215 00539 00498 00363 00204 00068 00520
# Histogram Overflows: 00001 00001 00001 00002 00002 00001 00000 00001 00001 00001 00002 00001 00001 00001 00001 00002
# Histogram Overflow at cycle number:
# Thread 0: 155922
# Thread 1: 110064
# Thread 2: 110064
# Thread 3: 110063 155921
# Thread 4: 110063 155921
# Thread 5: 155920
# Thread 6:
# Thread 7: 110062
# Thread 8: 110062
# Thread 9: 155919
# Thread 10: 110061 155919
# Thread 11: 155918
# Thread 12: 155918
# Thread 13: 110060
# Thread 14: 110060
# Thread 15: 110059 155917
15.4.3. 运行 oslat 复制链接链接已复制到粘贴板!
oslat 测试模拟 CPU 密集型 DPDK 应用程序,并测量所有中断和中断来测试集群处理 CPU 大量数据处理的方式。
始终使用 DISCOVERY_MODE=true 设置运行延迟测试。如果没有,测试套件将对正在运行的集群配置进行更改。
当以非 root 用户或非特权用户执行 podman 命令时,挂载路径可能会失败,错误为 permission denied。要使 podman 命令正常工作,请将 :Z 附加到卷创建中,例如 -v $(pwd)/:/kubeconfig:Z。这允许 podman 进行正确的 SELinux 重新标记。
先决条件
-
您使用客户门户网站凭证登录到
registry.redhat.io。 - 已使用 Performance addon operator 应用集群性能配置集。
流程
要执行
oslat测试,请运行以下命令,根据需要替换变量值:$ podman run -v $(pwd)/:/kubeconfig:Z -e KUBECONFIG=/kubeconfig/kubeconfig \ -e LATENCY_TEST_RUN=true -e DISCOVERY_MODE=true -e ROLE_WORKER_CNF=worker-cnf \ -e LATENCY_TEST_CPUS=7 -e LATENCY_TEST_RUNTIME=600 -e MAXIMUM_LATENCY=20 \ registry.redhat.io/openshift4/cnf-tests-rhel8:v4.10 \ /usr/bin/test-run.sh -ginkgo.v -ginkgo.focus="oslat"LATENCY_TEST_CPUS指定使用oslat命令测试的 CPU 列表。命令运行
oslat工具 10 分钟(600 秒)。当最观察到的延迟低于MAXIMUM_LATENCY(20 FORWARD) 时,测试会成功运行。如果结果超过延迟阈值,测试会失败。
重要对于有效结果,测试应至少运行 12 小时。
失败输出示例
running /usr/bin//validationsuite -ginkgo.v -ginkgo.focus=oslat I0829 12:36:55.386776 8 request.go:668] Waited for 1.000303471s due to client-side throttling, not priority and fairness, request: GET:https://api.cnfdc8.t5g.lab.eng.bos.redhat.com:6443/apis/authentication.k8s.io/v1?timeout=32s Running Suite: CNF Features e2e validation ========================================== Discovery mode enabled, skipping setup running /usr/bin//cnftests -ginkgo.v -ginkgo.focus=oslat I0829 12:37:01.219077 20 request.go:668] Waited for 1.050010755s due to client-side throttling, not priority and fairness, request: GET:https://api.cnfdc8.t5g.lab.eng.bos.redhat.com:6443/apis/snapshot.storage.k8s.io/v1beta1?timeout=32s Running Suite: CNF Features e2e integration tests ================================================= Random Seed: 1630240617 Will run 1 of 142 specs SSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSS ------------------------------ [performance] Latency Test with the oslat image should succeed /go/src/github.com/openshift-kni/cnf-features-deploy/vendor/github.com/openshift-kni/performance-addon-operators/functests/4_latency/latency.go:134 STEP: Waiting two minutes to download the latencyTest image STEP: Waiting another two minutes to give enough time for the cluster to move the pod to Succeeded phase Aug 29 12:37:59.324: [INFO]: found mcd machine-config-daemon-wf4w8 for node cnfdc8.clus2.t5g.lab.eng.bos.redhat.com • Failure [49.246 seconds] [performance] Latency Test /go/src/github.com/openshift-kni/cnf-features-deploy/vendor/github.com/openshift-kni/performance-addon-operators/functests/4_latency/latency.go:59 with the oslat image /go/src/github.com/openshift-kni/cnf-features-deploy/vendor/github.com/openshift-kni/performance-addon-operators/functests/4_latency/latency.go:112 should succeed [It] /go/src/github.com/openshift-kni/cnf-features-deploy/vendor/github.com/openshift-kni/performance-addon-operators/functests/4_latency/latency.go:134 The current latency 27 is bigger than the expected one 201 Expected <bool>: false to be true /go/src/github.com/openshift-kni/cnf-features-deploy/vendor/github.com/openshift-kni/performance-addon-operators/functests/4_latency/latency.go:168 Log file created at: 2021/08/29 13:25:21 Running on machine: oslat-57c2g Binary: Built with gc go1.16.6 for linux/amd64 Log line format: [IWEF]mmdd hh:mm:ss.uuuuuu threadid file:line] msg I0829 13:25:21.569182 1 node.go:37] Environment information: /proc/cmdline: BOOT_IMAGE=(hd0,gpt3)/ostree/rhcos-612d89f4519a53ad0b1a132f4add78372661bfb3994f5fe115654971aa58a543/vmlinuz-4.18.0-305.10.2.rt7.83.el8_4.x86_64 ip=dhcp random.trust_cpu=on console=tty0 console=ttyS0,115200n8 ostree=/ostree/boot.0/rhcos/612d89f4519a53ad0b1a132f4add78372661bfb3994f5fe115654971aa58a543/0 ignition.platform.id=openstack root=UUID=5a4ddf16-9372-44d9-ac4e-3ee329e16ab3 rw rootflags=prjquota skew_tick=1 nohz=on rcu_nocbs=1-3 tuned.non_isolcpus=000000ff,ffffffff,ffffffff,fffffff1 intel_pstate=disable nosoftlockup tsc=nowatchdog intel_iommu=on iommu=pt isolcpus=managed_irq,1-3 systemd.cpu_affinity=0,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24,25,26,27,28,29,30,31,32,33,34,35,36,37,38,39,40,41,42,43,44,45,46,47,48,49,50,51,52,53,54,55,56,57,58,59,60,61,62,63,64,65,66,67,68,69,70,71,72,73,74,75,76,77,78,79,80,81,82,83,84,85,86,87,88,89,90,91,92,93,94,95,96,97,98,99,100,101,102,103 default_hugepagesz=1G hugepagesz=2M hugepages=128 nmi_watchdog=0 audit=0 mce=off processor.max_cstate=1 idle=poll intel_idle.max_cstate=0 I0829 13:25:21.569345 1 node.go:44] Environment information: kernel version 4.18.0-305.10.2.rt7.83.el8_4.x86_64 I0829 13:25:21.569367 1 main.go:53] Running the oslat command with arguments \ [--duration 600 --rtprio 1 --cpu-list 4,6,52,54,56,58 --cpu-main-thread 2] I0829 13:35:22.632263 1 main.go:59] Succeeded to run the oslat command: oslat V 2.00 Total runtime: 600 seconds Thread priority: SCHED_FIFO:1 CPU list: 4,6,52,54,56,58 CPU for main thread: 2 Workload: no Workload mem: 0 (KiB) Preheat cores: 6 Pre-heat for 1 seconds... Test starts... Test completed. Core: 4 6 52 54 56 58 CPU Freq: 2096 2096 2096 2096 2096 2096 (Mhz) 001 (us): 19390720316 19141129810 20265099129 20280959461 19391991159 19119877333 002 (us): 5304 5249 5777 5947 6829 4971 003 (us): 28 14 434 47 208 21 004 (us): 1388 853 123568 152817 5576 0 005 (us): 207850 223544 103827 91812 227236 231563 006 (us): 60770 122038 277581 323120 122633 122357 007 (us): 280023 223992 63016 25896 214194 218395 008 (us): 40604 25152 24368 4264 24440 25115 009 (us): 6858 3065 5815 810 3286 2116 010 (us): 1947 936 1452 151 474 361 ... Minimum: 1 1 1 1 1 1 (us) Average: 1.000 1.000 1.000 1.000 1.000 1.000 (us) Maximum: 37 38 49 28 28 19 (us) Max-Min: 36 37 48 27 27 18 (us) Duration: 599.667 599.667 599.667 599.667 599.667 599.667 (sec)- 1
- 在本例中,测量的延迟超出了最大允许的值。