Appendix A. Hardware certification tests
In this section we give more detailed information about each of the tests for hardware certification. Each test section uses the following format:
What the test covers
This section lists the types of hardware that this particular test is run on.
RHEL version supported
This section lists the versions of RHEL that the test is supported on.
What the test does
This section explains what the test scripts do. All the tests are Python scripts; if you want to know exactly which commands the tests execute, you can view them in the /usr/lib/python2.7/site-packages/rhcert/suites/hwcert/tests directory.
Preparing for the test
This section talks about the steps necessary to prepare for the test. For example, it talks about having a USB device on hand for the USB test and blank discs on hand for rewritable optical drive tests.
Executing the test
This section identifies whether the test is interactive or non-interactive and explains what command is necessary to run the test.
You can run the test in either of the following ways:
- Follow Running the certification tests using CLI and select the appropriate test name from the displayed list:
  rhcert-run
- In case of hardware detection issues or other hardware-related problems during planning, follow Manually adding and running the tests and run the rhcert-cli command, specifying the desired test name:
  rhcert-cli run --test=<test name>
Run Time
This section explains how long a run of this test takes. Timing for the supportable test is mentioned in each section because it is a required test for every run of the test suite.
A.1. Mandatory tests for hardware certification
The following tests are executed for both RHEL and RHEL AI hardware certifications:
A.1.1. self_check test
What the test covers
The self-check test confirms that all required software packages for certification are installed and unaltered, ensuring the test environment is ready for certification. Certification packages must not be modified for testing or any other purpose.
What the test does
The test has several subtests that perform the following tasks:
- Checks for valid certification packages
- Verifies the integrity of the certification rpm files and highlights the changes, if any
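If you want to verify the package state manually before a run, the following is a minimal sketch using standard rpm queries; redhat-certification is the suite package named elsewhere in this guide:

# Confirm the certification package is installed
rpm -q redhat-certification
# Verify file integrity; no output means no files were modified
rpm -V redhat-certification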
Executing the test
When you run the test suite, the rhcert tool runs the self_check test automatically as part of every run. The self_check test runs before any other test.
The output of the self_check test must be included in the test suite logs. Red Hat will reject test logs that do not contain it.
Use the following command to run the test manually, if required:
$ rhcert-cli run --test self_check
Run time
The self_check test takes around 1 minute to execute provided you haven’t modified any of the certification files.
Success Criteria
The test environment includes all the necessary certification packages and the certification files have not been modified.
A.1.2. supportable test
What the test covers
The supportable test gathers basic information about the host under test (HUT). Red Hat uses this information to verify that the system complies with the certification requisites.
What the test does
The test has several subtests that perform the following tasks:
- Confirm that the /proc/sys/kernel/tainted file contains a zero (0), which indicates that the kernel is not tainted.
- Confirm that package verification with the rpm -V command shows that no files have been modified.
- Confirm that the rpm -qa kernel command shows that the buildhost of the kernel package is a Red Hat server.
- Record the boot parameters from the /proc/cmdline file.
- Confirm that the rpm -V redhat-certification command shows that no modifications have been made to any of the certification test suite files.
- Confirm that all the modules shown by the lsmod command show up in a listing of the kernel files with the rpm -ql kernel command.
- Confirm that all modules are on the Kernel Application Binary Interface (kABI) stablelist.
- Confirm that the module vendor and buildhost are appropriate Red Hat entries.
- Confirm that the kernel is the GA kernel of the Red Hat minor release. The subtest tries to verify the kernel with data from the redhat-certification package. If the kernel is not present there, the subtest attempts to verify the kernel by using the Internet connection. To verify the kernel by using the Internet connection, you must either configure the HUT's routing and DNS resolution to access the Internet or set the ftp_proxy=http://proxy.domain:80 environment variable.
- Check for any known hardware vulnerabilities reported by the kernel. The subtest reads the files in the /sys/devices/system/cpu/vulnerabilities/ directory and exits with a warning if the files contain the word "Vulnerable".
- Confirm if the system has any offline CPUs by checking the output of the lscpu command.
- Confirm if Simultaneous Multithreading (SMT) is available, enabled, and active in the system.
- Check if there is unmaintained hardware or drivers in systems running RHEL 8 or later. Unmaintained hardware and drivers are no longer tested or updated on a routine basis. Red Hat may fix serious issues, including security issues, but you cannot expect updates on any planned cadence. Replace or remove unmaintained hardware or drivers as soon as possible.
- Check if there is deprecated hardware or drivers in systems running RHEL 8 or later. Deprecated hardware and drivers are still tested and maintained, but they are planned to become unmaintained and eventually disabled in a future release. Replace or remove deprecated hardware or drivers as soon as possible.
- Check if there is disabled hardware in systems running RHEL 8 or later. RHEL cannot use disabled hardware. Replace or remove the disabled hardware from your system before running the test again.
- Run the following checks on the software RPM packages:
  - Check the RPM build host information to isolate non-Red Hat packages. The test will ask you to explain the reasons for including the non-Red Hat packages. Red Hat will review the reasons and approve or reject each package individually.
  - Check that the installed RPM packages are from the Red Hat products available in the offering and have not been modified. Red Hat reviews verification failures in the rpm_verification_report.log file. You will need to reinstall the failed packages and rerun the test.
- Check the presence of both Red Hat and non-Red Hat firmware files in the system. The test lists the non-Red Hat files, if present, and exits with REVIEW status.
- Check the page size of the system by running the getconf PAGESIZE command.

For RHEL AI certification, the supportable test executes an additional subtest that captures the following details of the HUT:
- OS version
- Total number of AI accelerators in the system
- The list of driver modules that are loaded.
After performing these tasks, the test gathers a sosreport and the output of the dmidecode command.
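For orientation, several of the subtests above can be approximated by hand. This is an illustrative sketch, not a substitute for the supportable test:

# Kernel taint flag; 0 means not tainted
cat /proc/sys/kernel/tainted
# Boot parameters recorded by the test
cat /proc/cmdline
# Certification suite integrity; no output means unmodified
rpm -V redhat-certification
# Page size check
getconf PAGESIZE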
Executing the test
The rhcert tool runs the supportable test automatically as part of every run of the test suite. The supportable test runs before any other test.
The output of the supportable test is required as part of the test suite logs. Red Hat will reject test logs that do not contain the output of the supportable test.
Use the following command to run the test manually, if required:
$ rhcert-cli run --test supportable
Run time
The supportable test takes around 1 minute on a 2013-era, single-CPU, 3.3GHz, 6-core/12-thread Intel workstation with 8 GB of RAM running Red Hat Enterprise Linux 6.4 (AMD64 and Intel 64) that was installed using the Kickstart files in this guide. The time will vary depending on the speed of the machine and the number of RPM files that are installed.
A.1.3. sosreport test
What the test covers
The sosreport test connects to the HUT and collects information about the system's hardware and configuration for further analysis when required.
What the test does
The sos test collects configuration and diagnostic information from a HUT to assist customers in troubleshooting their system and following recommended practices. The system report subtest ensures that the sos tool functions as expected on the image or system and captures a basic sosreport.
The sos_reports/manifest.json file contains details about node hostnames and the commands run by this test.
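To confirm that the sos tool works outside the suite, you can capture a report manually. A minimal sketch, assuming RHEL 8 or later where the sos report subcommand is available:

# Run as root; --batch accepts the defaults without prompting
sos report --batch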
Executing the test
This test is non-interactive.
Run time
This is an automated test and can take a couple of minutes to complete.
A.2. RHEL hardware certification tests
The following tests are executed for RHEL hardware certification:
A.2.1. Core
What the test covers
The core test examines the system’s CPUs and ensures that they are capable of functioning properly under load.
RHEL version supported
- RHEL 8
- RHEL 9
- RHEL 10
What the test does
The core test is actually composed of two separate routines. The first test is designed to detect clock jitter. Jitter is a condition that occurs when the system clocks are out of sync with each other. The system clocks are not the same as the CPU clock speed, which is just another way of referring to the speed at which the CPUs are operating. The jitter test uses the gettimeofday() function to obtain the time as observed by each logical CPU and then analyzes the returned values. If all the CPU clocks are within .2 nanoseconds of each other, the test passes. The tolerances for the jitter test are very tight, so it is important that the rhcert tests are the only loads running on the system while the test executes. Any other compute loads that are present could interfere with the timing and cause the test to fail. The jitter test also checks which clock source the kernel is using. It prints a warning in the logs if an Intel processor is not using TSC, but this does not affect the PASS/FAIL status of the test.
The second routine run in the core test is a CPU load test. It’s the test provided by the required stress package. The stress program, which is available for use outside the rhcert suite if you are looking for a way to stress test a system, launches several simultaneous activities on the system and then monitors for any failures. Specifically it instructs each logical CPU to calculate square roots, it puts the system under memory pressure by using malloc() and free() routines to reserve and free memory respectively, and it forces writes to disk by calling sync(). These activities continue for 10 minutes, and if no failures occur within that time period, the test passes. Please see the stress manpage if you are interested in using it outside of hardware certification testing.
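If you want to reproduce a similar load outside the suite, the stress utility can be invoked directly. The worker counts below are illustrative assumptions, not the values rhcert uses:

# 12 sqrt workers, 2 malloc/free workers, 1 sync worker, for 10 minutes
stress --cpu 12 --vm 2 --io 1 --timeout 600s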
Preparing for the test
The only preparation for the core test is to install a CPU that meets the requirements that are stated in the Policy Guide.
Executing the test
The core test is non-interactive. Run the following command and then select the appropriate Core test name from the list that displays.
rhcert-run
Run time, bare-metal
The core test itself takes about 12 minutes to run on a bare-metal system. The jitter portion of the test takes a minute or two and the stress portion runs for exactly 10 minutes. The required supportable test will add about a minute to the overall run time.
Run time, full-virt guest
The fv_core test takes slightly longer than the bare-metal version, about 14 minutes, to run in a KVM guest. The added time is due to guest startup/shutdown activities and the required supportable test that runs in the guest. The required supportable test on the bare-metal system will add about a minute to the overall run time.
A.2.2. CPU scaling
What the test covers
The cpuscaling test examines a CPU’s ability to increase and decrease its clock speed according to the compute demands placed on it.
RHEL version supported
- RHEL 8
- RHEL 9
- RHEL 10
What the test does
The test exercises the CPUs at varying frequencies using different scaling governors (the set of instructions that tell the CPU when to change to higher or lower clock speeds and how fast to do so) and measures the difference in the time that it takes to complete a standardized workload. The test is scheduled when the hardware detection routines find the following directories in /sys containing more than one cpu frequency:
/sys/devices/system/cpu/cpuX/cpufreq
The cpuscaling test is planned once per package, rather than being listed once per logical CPU. When the test is run, it will determine topology via /sys/devices/system/cpu/cpuX/topology/physical_package_id, and run the test in parallel for all the logical CPUs in a particular package.
The test runs the turbostat command first to gather processor statistics. On supported architectures, turbostat checks whether the advanced statistics columns are visible in the turbostat output file and returns a warning if the file does not contain the columns. The test then attempts to execute the cstate subtest and, if that fails, executes the pstate subtest.
The test procedure for each CPU package is as follows:
The test uses the values found in the sysfs filesystem to determine the maximum and minimum CPU frequencies. You can see these values for any system with this command:
# cat /sys/devices/system/cpu/cpu0/cpufreq/scaling_available_frequencies
There will always be at least two frequencies displayed here, a maximum and a minimum, but some processors are capable of finer CPU speed control and will show more than two values in the file. Any additional CPU speeds between the max and min are not specifically used during the test, though they may be used as the CPU transitions between max and min frequencies. The test procedure is as follows:
- The test records the maximum and minimum processor speeds from the file /sys/devices/system/cpu/cpu0/cpufreq/scaling_available_frequencies.
- The userspace governor is selected and the maximum frequency is chosen.
- Maximum speed is confirmed by reading every processor's /sys/devices/system/cpu/cpuX/cpufreq/scaling_cur_freq value. If this value does not match the selected frequency, the test reports a failure.
- Every processor in the package is given the simultaneous task of calculating pi to 2x10^12 digits. The pi calculation was chosen because it takes a meaningful amount of time to complete (about 30 seconds).
- The amount of time it took to calculate pi is recorded for each CPU, and an average is calculated for the package.
- The userspace governor is selected and the minimum speed is set.
- Minimum speed is confirmed by sysfs data, with a failure occurring if any CPU is not at the requested speed.
- The same pi calculation is performed by every processor in the package and the results recorded.
- The ondemand governor is chosen, which throttles the CPU between minimum and maximum speeds depending on workload.
- Minimum speed is confirmed by sysfs data, with a failure occurring if any CPU is not at the requested speed.
- The same pi calculation is performed by every processor in the package and the results recorded.
- The performance governor is chosen, which forces the CPU to maximum speed at all times.
- Maximum speed is confirmed by sysfs data, with a failure occurring if any CPU is not at the requested speed.
- The same pi calculation is performed by every processor in the package and the results recorded.
Now the analysis is performed on the three subsections. In steps one through eight we obtain the pi calculation times at maximum and minimum CPU speeds. The difference in the time it takes to calculate pi at the two speeds should be proportional to the difference in CPU speed. For example, if a hypothetical test system had a max frequency of 2GHz and a min of 1GHz and it took the system 30 seconds to run the pi calculation at max speed, we would expect the system to take 60 seconds at min speed to calculate pi. We know that for various reasons perfect results will not be obtained, so we allow for a 10% margin of error (faster or slower than expected) on the results. In our hypothetical example, this means that the minimum speed run could take between 54 and 66 seconds and still be considered a passing test (90% of 60 = 54 and 110% of 60 = 66).
In steps nine through eleven, we test the pi calculation time using the ondemand governor. This confirms that the system can quickly increase the CPU speed to the maximum when work is being done. We take the calculation time obtained in step eleven and compare it to the maximum speed calculation time we obtained back in step five. A passing test has those two values differing by no more than 10%.
In steps twelve through fourteen, we test the pi calculation using the performance governor. This confirms that the system can hold the CPU at maximum frequency at all times. We take the pi calculation time obtained in step 14 and compare it to the maximum speed calculation time we obtained back in step five. Again, a passing test has those two values differing by no more than 10%.
An additional portion of the cpuscaling test runs when an Intel processor with the TurboBoost feature is detected by the presence of the ida CPU flag in /proc/cpuinfo. This test chooses one of the CPUs in each package, omitting CPU0 for housekeeping purposes, and measures the performance using the ondemand governor at maximum speed. It expects a result of at least 5% faster performance than the previous test, when all the cores in the package were being tested in parallel.
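To inspect or change governors by hand while experimenting, the cpupower utility from the kernel-tools package can be used. A minimal sketch, assuming the governors named above are available on your hardware:

# Current governor for CPU0
cat /sys/devices/system/cpu/cpu0/cpufreq/scaling_governor
# Available governors and frequency limits
cpupower frequency-info
# Switch governor (run as root)
cpupower frequency-set -g ondemand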
Preparing for the test
To prepare for the test, ensure that CPU frequency scaling is enabled in the BIOS and ensure that a CPU is installed that meets the requirements explained in the Policy Guide.
Executing the test
The cpuscaling test is non-interactive. Run the following command and then select the appropriate CPU scaling test name from the list that displays.
rhcert-run
Run time
The cpuscaling test takes about 42 minutes for a 2013-era, single CPU, 6-core/12-thread 3.3GHz Intel-based workstation running Red Hat Enterprise Linux 6.4, AMD64 and Intel 64. Systems with higher core counts and more populated sockets will take longer. The required supportable test will add about a minute to the overall run time.
A.2.3. Ethernet
What the test covers
The Ethernet test only appears when the speed of a network device is not recognized by the test suite. This may be due to an unplugged cable or some other fault that prevents proper detection of the connection speed. Exit the test suite, check your connection, and run the test suite again when the device is properly connected. If the problem persists, contact your Red Hat support representative for assistance.
The example below shows a system with two gigabit Ethernet devices, eth0 and eth1. Device eth0 is properly connected, but eth1 is not plugged in.
The output of the ethtool command shows the expected gigabit Ethernet speed of 1000Mb/s for eth0:
But on eth1, the ethtool command shows an unknown speed, which causes the Ethernet test to be planned.
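The original command output is not reproduced here; trimmed ethtool output for the two states would look roughly like the following (values assume gigabit NICs):

# ethtool eth0    (properly connected)
        Speed: 1000Mb/s
        Link detected: yes
# ethtool eth1    (cable unplugged)
        Speed: Unknown!
        Link detected: no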
RHEL version supported
- RHEL 8
- RHEL 9
- RHEL 10
A.2.4. fv_core
The fv_core test is a wrapper that launches the FV guest and runs a core test on it.
Starting with RHEL 9.4, this test is supported to run on ARM systems.
RHEL version supported
- RHEL 8
- RHEL 9
- RHEL 10
The first time you run any full-virtualization test, the test tool will need to obtain the FV guest files. The execution time of the test tool depends on the transfer speed of the FV guest files. For example,
- If FV guest files are located on the test server and you are using 1GbE or faster networking, it takes about a minute or two to transfer approximately 300MB of guest files.
- If the files are retrieved from the CWE API, which occurs automatically when the guest files are not installed or found on the test server, the first runtime will depend on the transfer speed from the CWE API.
When the guest files are available on the Host Under Test (HUT), they will be utilized for all the later runs of fv_* tests.
A.2.5. fv_memory
The fv_memory test is a wrapper that launches the FV guest and runs a memory test on it.
Starting with RHEL 9.4, this test is supported to run on ARM systems.
RHEL version supported
- RHEL 8
- RHEL 9
- RHEL 10
The first time you run any full-virtualization test, the test tool will need to obtain the FV guest files. The execution time of the test tool depends on the transfer speed of the FV guest files. For example,
- If FV guest files are located on the test server and you are using 1GbE or faster networking, it takes about a minute or two to transfer approximately 300MB of guest files.
- If the files are retrieved from the CWE API, which occurs automatically when the guest files are not installed or found on the test server, the first runtime will depend on the transfer speed from the CWE API.
When the guest files are available on the Host Under Test (HUT), they will be utilized for all the later runs of fv_* tests.
A.2.6. kdump
What the test covers
The kdump test uses the kdump service to check that the system can capture a vmcore file after a crash, and that the captured file is valid.
RHEL version supported
- RHEL 8
- RHEL 9
- RHEL 10
What the test does
The test includes the following subtests:
kdump with local: Using the kdump service, this subtest performs the following tasks:
- Crashes the host under test (HUT).
- Writes a vmcore file to the local /var/crash directory.
- Validates the vmcore file.
kdump with NFS: Using the kdump service, this subtest performs the following tasks:
- Mounts the /var/rhcert/export filesystem on the HUT's /var/crash directory. This filesystem is shared over NFS from the test server.
- Crashes the HUT.
- Writes a vmcore file to the /var/crash directory.
- Validates the vmcore file.
Preparing for the test
- Ensure that the HUT is connected to the test server before running the test.
- Ensure that the rhcertd process is running on the test server. The certification test suite prepares the NFS filesystem automatically. If the suite cannot set up the environment, the test fails.
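Before starting, you can also verify the kdump configuration on the HUT. A minimal sketch using standard kexec-tools commands:

# Expect "active"
systemctl is-active kdump
# Reports whether kdump is operational
kdumpctl status
# Confirm memory is reserved for the crash kernel
grep crashkernel /proc/cmdline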
Executing the test
- Log in to the HUT.
Run the kdump test:
- To use the rhcert-run command, perform the following steps:
  1. Run the rhcert-run command:
     # rhcert-run
  2. Select the kdump test.
     The test runs both subtests sequentially.
- To use the rhcert-cli command, choose whether to run both subtests sequentially or to specify a subtest:
  - To run both subtests sequentially, use the following command:
    # rhcert-cli run --test=kdump --server=<test server IP>
  - To run the kdump with local subtest only, use the following command:
    # rhcert-cli run --test=kdump --device=local
  - To run the kdump with NFS subtest only, use the following command:
    # rhcert-cli run --test=kdump --device=nfs --server=<test server IP>
    Additionally, for the kdump with NFS subtest, execute the following command on the test server:
    # rhcertd start
- Wait for the HUT to restart after the crash. The kdump service shows several messages while it saves the vmcore file to the /var/crash directory. After the vmcore file is saved, the HUT restarts.
- Log in to the HUT after the reboot. The rhcert suite verifies that the vmcore file exists and is valid. If the file does not exist or is invalid, the test fails.
If you are running the subtests sequentially, the kdump with NFS subtest starts after the validation of the previous vmcore file has completed.
Run time
The run time of the kdump test varies according to factors such as the amount of RAM in the HUT, the disk speed of the test server and the HUT, the network connection speed to the test server, and the time taken to reboot the HUT.
For a 2013-era workstation with 8GB of RAM, a 7200 RPM 6Gb/s SATA drive, a gigabit Ethernet connection to the test server, and a 1.5 minute reboot time, a local kdump test can complete in about four minutes, including the reboot. The same 2013-era workstation can complete an NFS kdump test in about five minutes to a similarly equipped network test server. The supportable test will add about a minute to the overall run time.
A.2.7. memory
What the memory test covers
The memory test is used to test system RAM. It does not test USB flash memory, SSD storage devices or any other type of RAM-based hardware. It tests main memory only.
A memory per CPU core check has been added to the planning process to verify that the HUT meets the RHEL minimum memory requirements. It is a planning condition for several of the hardware certification tests, including the ones for memory, core, realtime, and all the full-virtualization tests.
If the memory per CPU core check does not pass, the above-mentioned tests will not be planned automatically. However, these tests can be planned manually via CLI.
RHEL version supported
- RHEL 8
- RHEL 9
- RHEL 10
What the test does
The test uses the /proc/meminfo file to determine how much memory is installed in the system. Once it knows how much is installed, it checks whether the system architecture is 32-bit or 64-bit. Then it determines whether swap space is available. The test runs either once or twice with slightly different settings, depending on whether the system has swap:
- If swap is available, allocate more RAM to the memory test than is actually installed in the system. This forces the use of swap space during the run.
- Regardless of swap presence, allocate as much RAM as possible to the memory test while staying below the limit that would force out of memory (OOM) kills. This version of the test always runs.
In both iterations of the memory test, malloc() is used to allocate RAM, the RAM is dirtied with a write of an arbitrary hex string (0xDEADBEEF), and a test is performed to ensure that 0xDEADBEEF is actually stored in RAM at the expected addresses. The test calls free() to release RAM when testing is complete. Multiple threads or multiple processes will be used to allocate the RAM depending on whether the process size is greater than or less than the amount of memory to be tested.
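To see the inputs the test reads before allocating memory, a quick manual look; an illustrative sketch:

# Installed RAM as the test sees it
grep MemTotal /proc/meminfo
# Active swap devices (the test reads /proc/swaps)
swapon --show
# 32-bit or 64-bit architecture
uname -m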
Preparing for the test
Install the correct amount of RAM in the system in accordance with the rules in the Policy Guide.
Executing the test
The memory test is non-interactive. Run the following command and then select the appropriate memory test name from the list that displays.
rhcert-run
Run time, bare-metal
The memory test takes about 16 minutes to run on a 2013-era, single-CPU, 6-core/12-thread 3.3GHz Intel-based workstation with 8GB of RAM running Red Hat Enterprise Linux (AMD64 and Intel 64). The test will take longer on systems with more RAM. The required supportable test will add about a minute to the overall run time.
Run time, full-virt guest
The fv_memory test takes slightly longer than the bare-metal version, about 18 minutes, to run in a guest. The added time is due to guest startup/shutdown activities and the required supportable test that runs in the guest. The required supportable test on the bare-metal system will add about a minute to the overall run time. The fv_memory test run times will not vary as widely from machine to machine as the bare-metal memory tests, as the amount of RAM assigned to our pre-built guest is always the same. There will be variations caused by the speed of the underlying real system, but the amount of RAM in use during the test won’t change from machine to machine.
Creating and activating swap for EC2: Partners can create and activate swap for EC2 by performing steps such as those sketched below.
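The guide's original steps are not reproduced here; the following is a minimal sketch of one common approach, with an assumed 4 GB size and /swapfile path:

# Create a 4 GB swap file (run as root)
dd if=/dev/zero of=/swapfile bs=1M count=4096
chmod 600 /swapfile
mkswap /swapfile
swapon /swapfile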
A.2.8. network
What the test covers
The network test checks devices that transfer data over a TCP/IP network. The test can check multiple connection speeds and bandwidths of wired devices based on the corresponding test, as listed in the following table:
Different tests under Network test
| Ethernet test | Description |
|---|---|
| 1GigEthernet | The network test with added speed detection for 1 gigabit Ethernet connections. |
| 10GigEthernet | The network test with added speed detection for 10 gigabit Ethernet connections. |
| 20GigEthernet | The network test with added speed detection for 20 gigabit Ethernet connections. |
| 25GigEthernet | The network test with added speed detection for 25 gigabit Ethernet connections. |
| 40GigEthernet | The network test with added speed detection for 40 gigabit Ethernet connections. |
| 50GigEthernet | The network test with added speed detection for 50 gigabit Ethernet connections. |
| 100GigEthernet | The network test with added speed detection for 100 gigabit Ethernet connections. |
| 200GigEthernet | The network test with added speed detection for 200 gigabit Ethernet connections. |
| Ethernet | If the Ethernet test is listed in your local test plan, it indicates that the test suite did not recognize the speed of that device. Check the connection before attempting to test that particular device. |
RHEL version supported
- RHEL 8
- RHEL 9
What the test does
The test runs the following subtests to gather information about all the network devices:
- The bounce test on the interface is conducted using the nmcli conn up and nmcli conn down commands.
- If the root partition is not NFS or iSCSI mounted, the bounce test is performed on the interface. Additionally, all other interfaces that will not be tested are shut down to ensure that traffic is routed through the interface being tested.
- If the root partition is NFS or iSCSI mounted, the bounce test on the interface responsible for the iSCSI or NFS connection is skipped, and all other interfaces, except for the one handling the iSCSI or NFS connection, will be shut down.
- A test file is created from /dev/urandom, and its size is adjusted to the speed of your NIC.
- TCP and UDP testing - The test uses the iperf tool to:
- Test TCP latency between the test server and host under test. The test checks if the system runs into any OS timeouts and fails if it does.
- Test the bandwidth between the test server and the host under test. For wired devices, it is recommended that the speed is close to the theoretical maximum.
- Test UDP latency between the test server and host under test. The test checks if the system runs into any OS timeouts and fails if it does.
- File transfer testing - The test uses SCP to transfer a file from the host under test to the remote system or test server and then transfers it back to the host under test to check if the transfer works properly.
- ICMP (ping) test - The script causes a ping flood at the default packet size to ensure nothing in the system fails (the system should not restart or reset or anything else that indicates the inability to withstand a ping flood). 5000 packets are sent and a 100% success rate is expected. The test retries 5 times for an acceptable success rate.
- Finally, the test brings all interfaces back to their original state (active or inactive) at the end of the run.
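The subtests above can be approximated manually with the same tools. An illustrative sketch, assuming the iperf3 variant of the iperf tool and a test server at 192.168.0.10:

# On the test server: start a listener
iperf3 -s
# On the HUT: TCP bandwidth test
iperf3 -c 192.168.0.10
# On the HUT: UDP test
iperf3 -c 192.168.0.10 -u
# Ping flood, 5000 packets (run as root)
ping -f -c 5000 192.168.0.10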
Preparing for testing wired devices
You can test as many network devices as you want in each test run.
Before you begin:
- Ensure that each device is connected at its native (maximum) speed; otherwise, the test fails.
- Ensure that the test server is up and running.
- Ensure that each network device has an IP address assigned either statically or dynamically via DHCP.
- Ensure that the required firewall ports are open so that the iperf tool can run the TCP and UDP subtests.
By default, ports 52001-52101 are open. If you want to change the default ports, update the iperf-port and total-iperf-ports values in the /etc/rhcert.xml configuration file.
Example:
<server listener-port="8009" iperf-port="52001" total-iperf-ports="100">
If the firewall ports are not open, the test prompts to open the firewall ports during the test run.
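One way to open the default port range yourself is with firewalld; a sketch assuming the default 52001-52101 range:

firewall-cmd --add-port=52001-52101/tcp --add-port=52001-52101/udp
# Optional: persist the runtime change
firewall-cmd --runtime-to-permanent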
Partitionable networking
The test checks whether any of the network devices support partitioning by testing data transfer at full speed and the partitioning function.
Running the test based on the performance of the NIC:
- If the NIC runs at full speed while partitioned, configure a partition with the NIC running at its native speed and perform the network test in that configuration.
- If the NIC does not run at full speed while partitioned, run the test twice: first without partitioning to demonstrate full-speed operation, and then with partitioning enabled to demonstrate the partitioning function.
Red Hat recommends selecting either 1Gb/s or 10Gb/s for your partitioned configuration so that it conforms to the existing network speed tests.
Executing the test
The network test is non-interactive. Run the following command and then select the appropriate network test name from the list that displays.
rhcert-run
To manually add and run a specific Ethernet speed test, use the following commands:

| Speed Type | Command to manually add the Ethernet test | Command to manually run the Ethernet test |
|---|---|---|
| 1GigEthernet | rhcert-cli plan --add --test 1GigEthernet --device <device name> | rhcert-cli run --test 1GigEthernet --server <test server IP addr> |
| 10GigEthernet | rhcert-cli plan --add --test 10GigEthernet --device <device name> | rhcert-cli run --test 10GigEthernet --server <test server IP addr> |
| 20GigEthernet | rhcert-cli plan --add --test 20GigEthernet --device <device name> | rhcert-cli run --test 20GigEthernet --server <test server IP addr> |
| 25GigEthernet | rhcert-cli plan --add --test 25GigEthernet --device <device name> | rhcert-cli run --test 25GigEthernet --server <test server IP addr> |
| 40GigEthernet | rhcert-cli plan --add --test 40GigEthernet --device <device name> | rhcert-cli run --test 40GigEthernet --server <test server IP addr> |
| 50GigEthernet | rhcert-cli plan --add --test 50GigEthernet --device <device name> | rhcert-cli run --test 50GigEthernet --server <test server IP addr> |
| 100GigEthernet | rhcert-cli plan --add --test 100GigEthernet --device <device name> | rhcert-cli run --test 100GigEthernet --server <test server IP addr> |
| 200GigEthernet | rhcert-cli plan --add --test 200GigEthernet --device <device name> | rhcert-cli run --test 200GigEthernet --server <test server IP addr> |
| 400GigEthernet | rhcert-cli plan --add --test 400GigEthernet --device <device name> | rhcert-cli run --test 400GigEthernet --server <test server IP addr> |

Replace <device name> and <test server IP addr> with the appropriate values.
Run time
The network test takes about 2 minutes to test each PCIe-based, gigabit, wired Ethernet card, and the required Supportable test adds about a minute to the overall run time.
A.2.9. NetworkManageableCheck
What the test covers
The NetworkManageableCheck test runs for all the network interfaces available in the system.
RHEL version supported
- RHEL 8
- RHEL 9
- RHEL 10
What the test does
The test comprises two subtests that perform the following tasks:
- Check the BIOS device name to confirm that the interface follows the terminology set by the firmware.
  Note: BIOS device name validation runs only on x86 systems.
- Check if NetworkManager manages the interface, to evaluate the current network management status.
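To see the same management status that the test evaluates, NetworkManager's CLI can be queried directly; a minimal sketch:

# Interfaces listed as "unmanaged" are not under NetworkManager control
nmcli device status
# Overall NetworkManager state
nmcli general status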
Executing the test
The NetworkManageableCheck test is mandatory. It is planned and executed with the self_check and supportable tests to ensure thorough examination and validation of network interfaces.
Run time
The test takes around 1 minute to complete. However, the duration of the test varies depending on the specifics of the system and the number of interfaces.
A.2.10. profiler
The profiler test collects performance metrics from the host under test (HUT) and determines whether the metrics are collected from the software or the hardware Performance Monitoring Unit (PMU) supported by the RHEL kernel. If the metrics are hardware-based, the test further determines whether the PMU includes per-core counters only or per-package counters as well. The profiler test is divided into three tests: profiler_hardware_core, profiler_hardware_uncore, and profiler_software.
A.2.10.1. profiler_hardware_core
What the test covers
The profiler_hardware_core test collects performance metrics using hardware-based per core counters by checking the cycle events. The core events measure the functions of a processor core, for example, the L2 cache.
RHEL version supported
- RHEL 8
- RHEL 9
- RHEL 10
What the test does
The test is planned if core hardware event counters are found. The planning check locates the cpu*cycles files in the /sys/devices directory by running the find /sys/devices/* -type f -name 'cpu*cycles' command.
The test executes multiple commands to accumulate a sample of cycle events, checks that the cpu cycle event was detected, and checks that the samples were collected.
This test is not intended to be exhaustive, and it does not test every possible core counter event that a given processor may or may not have.
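For orientation, perf can collect a comparable cycles sample by hand. An illustrative sketch; the exact commands rhcert runs are not listed here:

# The planning check described above
find /sys/devices/* -type f -name 'cpu*cycles'
# Count hardware cycle events for 5 seconds
perf stat -e cycles -- sleep 5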
Preparing for the test
There are no special requirements to run this test.
Executing the test
The test is non-interactive. Run the following command and then select the appropriate profiler_hardware_core test name from the list that displays.
rhcert-run
Run time
The test takes approximately 30 seconds. Any other mandatory or selected tests will add to the overall run time.
A.2.10.2. profiler_hardware_uncore
What the test covers
The profiler_hardware_uncore test collects performance metrics using hardware-based package-wide counters. The uncore events measure the functions of a processor that are outside the core but are inside the package, for example, a memory controller.
RHEL version supported
- RHEL 8
- RHEL 9
- RHEL 10
What the test does
The test is planned if uncore hardware event counters are found. The test passes if it finds any uncore events and collects statistics for at least one event. The test fails if it finds uncore events but cannot collect statistics because those events are not supported.
The test executes multiple commands to collect the list of uncore events and the uncore events statistics.
This test is not intended to be exhaustive, and it does not test every possible uncore counter event that a given processor may or may not have.
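perf can enumerate and read uncore events in a comparable way. A sketch; event names vary by processor, so replace the placeholder with an event from the list:

# Enumerate uncore events, if any
perf list 'uncore*'
# Collect statistics for one event, system-wide
perf stat -a -e <uncore event> -- sleep 5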
Preparing for the test
There are no special requirements to run this test.
Executing the test
The test is non-interactive. Run the following command and then select the appropriate profiler_hardware_uncore test name from the list that displays.
rhcert-run
Run time
The test takes approximately 30 seconds. Any other mandatory or selected tests will add to the overall run time.
A.2.10.3. profiler_software
What the test covers
The profiler_software test collects performance metrics using software-based counters by checking the cpu_clock events.
Software counters can be certified using this test. However, for customers with high-performance requirements, this test can be limiting.
RHEL version supported
- RHEL 8
- RHEL 9
- RHEL 10
What the test does
The test is planned if no core hardware event counters are found.
The test executes multiple commands to accumulate the sample of cpu-clock events, checks if the cpu-clock event was detected, and checks if the samples were collected.
Preparing for the test
There are no special requirements to run this test.
Executing the test
The test is non-interactive. Run the following command and then select the appropriate profiler_software test name from the list that displays.
rhcert-run
Run time
The test takes approximately 30 seconds. Any other mandatory or selected tests will add to the overall run time.
A.2.11. PCIE_NVMe
What the PCIe_NVMe test covers
This test runs if the interface is NVMe and the device is connected through a PCIe connection.
RHEL version supported
- RHEL 8
- RHEL 9
- RHEL 10
What the PCIe_NVMe test does
This test is planned if the logical device host name string contains "nvme[0-9]".
Following are the device parameter values that are printed as a part of the test:
- logical_block_size - Used to address a location on the device.
- physical_block_size - Smallest unit on which the device can operate.
- minimum_io_size - Minimum unit preferred for random input or output of the device.
- optimal_io_size - Preferred unit of the device for streaming input or output operations.
- alignment_offset - Offset value from the underlying physical alignment.
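These values can also be read directly from sysfs. A sketch assuming the namespace block device is nvme0n1:

cat /sys/block/nvme0n1/queue/logical_block_size
cat /sys/block/nvme0n1/queue/physical_block_size
cat /sys/block/nvme0n1/queue/minimum_io_size
cat /sys/block/nvme0n1/queue/optimal_io_size
cat /sys/block/nvme0n1/alignment_offset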
A.2.12. M2_NVMe
What the M2_NVMe test covers
This test runs if the interface is NVMe and the device is connected through an M.2 connection.
RHEL version supported
- RHEL 8
- RHEL 9
- RHEL 10
Manually adding and running the test
To manually add and run the M2_NVMe test, use the following command:
rhcert-cli plan --add --test M2_NVMe --device nvme0
Following are the device parameter values that are printed as a part of the test:
- logical_block_size - Used to address a location on the device.
- physical_block_size - Smallest unit on which the device can operate.
- minimum_io_size - Minimum unit preferred for random input or output of the device.
- optimal_io_size - Preferred unit of the device for streaming input or output operations.
- alignment_offset - Offset value from the underlying physical alignment.
A.2.13. U2_NVMe
What the U2_NVMe test covers
This test runs if the interface is NVMe and the device is connected through a U.2 connection.
RHEL version supported
- RHEL 8
- RHEL 9
- RHEL 10
Manually adding and running the test
To manually add and run the U2_NVMe test, use the following command:
rhcert-cli plan --add --test U2_NVMe --device nvme0
Following are the device parameter values that are printed as a part of the test:
- logical_block_size - Used to address a location on the device.
- physical_block_size - Smallest unit on which the device can operate.
- minimum_io_size - Minimum unit preferred for random input or output of the device.
- optimal_io_size - Preferred unit of the device for streaming input or output operations.
- alignment_offset - Offset value from the underlying physical alignment.
A.2.14. U3_NVMe
What the U3_NVMe test covers
This test runs if the interface is NVMe and the device is connected through a U.3 connection.
RHEL version supported
- RHEL 8
- RHEL 9
- RHEL 10
Manually adding and running the test
To manually add and run the U3_NVMe test, use the following command:
rhcert-cli plan --add --test U3_NVMe --device nvme0
Following are the device parameter values that are printed as a part of the test:
- logical_block_size - Used to address a location on the device.
- physical_block_size - Smallest unit on which the device can operate.
- minimum_io_size - Minimum unit preferred for random input or output of the device.
- optimal_io_size - Preferred unit of the device for streaming input or output operations.
- alignment_offset - Offset value from the underlying physical alignment.
A.2.15. E3_NVMe
What the E3_NVMe test covers
This test runs if the interface is NVMe and the device is connected through an E3 connection.
RHEL version supported
- RHEL 8
- RHEL 9
- RHEL 10
Manually adding and running the test
To manually add and run the E3_NVMe test, use the following command:
rhcert-cli plan --add --test E3_NVMe --device nvme0
Following are the device parameter values that are printed as a part of the test:
- logical_block_size - Used to address a location on the device.
- physical_block_size - Smallest unit on which the device can operate.
- minimum_io_size - Minimum unit preferred for random input or output of the device.
- optimal_io_size - Preferred unit of the device for streaming input or output operations.
- alignment_offset - Offset value from the underlying physical alignment.
A.2.16. STORAGE
What the storage test covers
There are many different kinds of persistent on-line storage devices available in systems today. The STORAGE test is designed to test anything that reports an ID_TYPE of "disk" in the udev database. This includes IDE, SCSI, SATA, SAS, and SSD drives, PCIe SSD block storage devices, as well as SD media, xD media, MemoryStick and MMC cards. The test plan script reads through the udev database and looks for storage devices that meet the above criteria. When it finds one, it records the device and its parent and compares it to the parents of any other recorded devices. It does this to ensure that only devices with unique parents are tested. If the parent has not been seen before, the device is added to the test plan. This speeds up testing as only one device per controller will be tested, as per the Policy Guide.
RHEL version supported
- RHEL 8
- RHEL 9
- RHEL 10
What the test does
The STORAGE test performs the following actions on all storage devices with a unique parent:
- The script looks through the partition table to locate a swap partition that is not on an LVM or software RAID device. If found, it deactivates it with swapoff and uses that space for the test. If no swap is present, the system can still test the drive if it is completely blank (no partitions). Note that the swap device must be active in order for this to work (the test reads /proc/swaps to find the swap partitions) and that the swap partition must not be inside any kind of software-based container (no LVM or software RAID, but hardware RAID would work as it would be invisible to the system).
- The tool creates a filesystem on the device, either in the swap partition or on the blank drive.
- The filesystem is mounted and the fio or dt command is used to test the device. These are generic I/O test programs capable of testing, reading, and writing to devices. Multiple sets of test patterns verify the functionality of storage devices.
- After the mounted filesystem test, the filesystem is unmounted and a dt test is performed against the block device, ignoring the filesystem. The dt test uses the "direct" parameter to handle this.
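To get a feel for the kind of verified I/O pass that fio performs, consider the invocation below; the parameters are illustrative assumptions, not the suite's actual settings:

# Write-and-verify pass against a file on the mounted test filesystem
fio --name=certsketch --filename=/mnt/test/fio.dat --size=1G \
    --rw=randwrite --bs=4k --direct=1 --verify=crc32c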
Preparing for the test
You should install all the drives and storage controllers that are listed on the official test plan. In the case of multiple storage options, as many as can fit into the system at one time can be tested in a single run, or each storage device can be installed individually and have its own run of the storage test. You can decide on the order of testing and number of controllers present for each test. Each logical drive attached to the system must contain a swap partition in addition to any other partitions, or be totally blank. This is to provide the test with a location to create a filesystem and run the tests. The use of swap partitions will lead to a much quicker test, as devices left blank are tested in their entirety. They will almost always be significantly larger than a swap partition placed on the drive.
If testing an SD media card, use the fastest card you can obtain. While a Class 4 SD card may take 8 hours or more to run the test, a Class 10 or UHS 1/2 card can complete the test run in 30 minutes or less.
When it comes to choosing storage devices for the official test plan, the rule that the review team operates by is "one test per code path". What we mean by that is that we want to see a storage test run using every driver that a controller can use. The scenario of multiple drivers for the same controller usually involves RAID storage of some type. It’s common for storage controllers to use one driver when in regular disk mode and another when in RAID mode. Some even use multiple drivers depending on the RAID mode that they are in. The review team will analyze all storage hardware to determine the drivers that need to be used in order to fulfill all the testing requirements. That’s why you may see the same storage device listed more than once in the official test plan. Complete information on storage device testing is available in the Policy Guide.
Executing the test
The storage test is non-interactive. Run the following command and then select the appropriate STORAGE test name from the list that displays.
rhcert-run
Run time, bare-metal
The storage test takes approximately 22 minutes on a 6Gb/s SATA hard drive installed in a 2013-era workstation system. The same test takes approximately 3 minutes on a 6Gb/s SATA solid-state drive installed in a 2013-era workstation system. The required supportable test will add about a minute to the overall run time.
A.2.17. supportable test
What the test covers
The supportable test gathers basic information about the host under test (HUT). Red Hat uses this information to verify that the system complies with the certification requisites.
RHEL version supported
- RHEL 8
- RHEL 9
- RHEL 10
What the test does
The test has several subtests that perform the following tasks:
- Confirm that the /proc/sys/kernel/tainted file contains a zero (0), which indicates that the kernel is not tainted.
- Confirm that package verification with the rpm -V command shows that no files have been modified.
- Confirm that the rpm -qa kernel command shows that the buildhost of the kernel package is a Red Hat server.
- Record the boot parameters from the /proc/cmdline file.
- Confirm that the rpm -V redhat-certification command shows that no modifications have been made to any of the certification test suite files.
- Confirm that all the modules shown by the lsmod command show up in a listing of the kernel files with the rpm -ql kernel command.
- Confirm that all modules are on the Kernel Application Binary Interface (kABI) stablelist.
- Confirm that the module vendor and buildhost are appropriate Red Hat entries.
- Confirm that the kernel is the GA kernel of the Red Hat minor release. The subtest tries to verify the kernel with data from the redhat-certification package. If the kernel is not present there, the subtest attempts to verify the kernel by using the Internet connection. To verify the kernel by using the Internet connection, you must either configure the HUT's routing and DNS resolution to access the Internet or set the ftp_proxy=http://proxy.domain:80 environment variable.
- Check for any known hardware vulnerabilities reported by the kernel. The subtest reads the files in the /sys/devices/system/cpu/vulnerabilities/ directory and exits with a warning if the files contain the word "Vulnerable".
- Confirm if the system has any offline CPUs by checking the output of the lscpu command.
- Confirm if Simultaneous Multithreading (SMT) is available, enabled, and active in the system.
- Check if there is unmaintained hardware or drivers in systems running RHEL 8 or later. Unmaintained hardware and drivers are no longer tested or updated on a routine basis. Red Hat may fix serious issues, including security issues, but you cannot expect updates on any planned cadence. Replace or remove unmaintained hardware or drivers as soon as possible.
- Check if there is deprecated hardware or drivers in systems running RHEL 8 or later. Deprecated hardware and drivers are still tested and maintained, but they are planned to become unmaintained and eventually disabled in a future release. Replace or remove deprecated hardware or drivers as soon as possible.
- Check if there is disabled hardware in systems running RHEL 8 or later. RHEL cannot use disabled hardware. Replace or remove the disabled hardware from your system before running the test again.
- Run the following checks on the software RPM packages:
  - Check the RPM build host information to isolate non-Red Hat packages. The test will ask you to explain the reasons for including the non-Red Hat packages. Red Hat will review the reasons and approve or reject each package individually.
  - Check that the installed RPM packages are from the Red Hat products available in the offering and have not been modified. Red Hat reviews verification failures in the rpm_verification_report.log file. You will need to reinstall the failed packages and rerun the test.
- Check the presence of both Red Hat and non-Red Hat firmware files in the system. The test lists the non-Red Hat files, if present, and exits with REVIEW status.
- Check the page size of the system by running the getconf PAGESIZE command.

For RHEL AI certification, the supportable test executes an additional subtest that captures the following details of the HUT:
- OS version
- Total number of AI accelerators in the system
- The list of driver modules that are loaded.
After performing these tasks, the test gathers a sosreport and the output of the dmidecode command.
Executing the test
The rhcert tool runs the supportable test automatically as part of every run of the test suite. The supportable test runs before any other test.
The output of the supportable test is required as part of the test suite logs. Red Hat will reject test logs that do not contain the output of the supportable test.
Use the following command to run the test manually, if required:
$ rhcert-cli run --test supportable
Run time
The supportable test takes around 1 minute on a 2013-era, single-CPU, 3.3GHz, 6-core/12-thread Intel workstation with 8 GB of RAM running Red Hat Enterprise Linux 6.4 (AMD64 and Intel 64) that was installed using the Kickstart files in this guide. The time will vary depending on the speed of the machine and the number of RPM files that are installed.
A.2.18. VIDEO
What the test covers
For RHEL 8, the VIDEO test checks for all removable or integrated video hardware on the motherboard. Devices are selected for testing by their PCI class ID. Specifically, the test checks for a device with a PCI class as Display Controller in the udev command output.
For RHEL 9, the VIDEO test remains the same. However, for framebuffer graphics solutions, the test is planned after identifying that the display kernel driver is in use as a framebuffer and that direct rendering is not supported, as determined by the glxinfo command.
For RHEL 10, the VIDEO test checks for all the removable or integrated video hardware on the motherboard. You don’t need to run the test from a terminal within the GUI. As long as the GNOME Display Manager (GDM) is running in the background, the test automatically detects the active display sockets and executes the subtests, whether initiated from a GUI terminal or an SSH session.
RHEL version supported
- RHEL 8
- RHEL 9
- RHEL 10
What the test does
The test runs multiple subtests:
For RHEL 8 and 9:
- Check Connections - Logs the xrandr command output. This subtest is optional, and its failure does not affect the overall test result.
- Set Configuration - Checks the necessary configuration prerequisites, such as setting the display depth, flags, and configurations for the next subtest.
- The X Server Test - Starts another display server using the new configuration file and runs glxgears, a lightweight Mesa OpenGL demonstration program, to check performance.
- Log Module and Drivers - Runs xdpyinfo to determine the screen resolution and color depth. Along with that, the configuration file created at the start of the test should allow the system to run at its maximum resolution capability.
Finally, the test uses grep to search through the /var/log/Xorg.0.log logfile to determine in-use modules and drivers.
For RHEL 10:
- Check Connections: Logs the wayland-info command output; this replaces the xrandr command used on RHEL 8 and RHEL 9 systems. It shows information about the connected displays.
- Log resolution and display size: Parses the wayland-info command output and prints the resolution and display size. It raises a REVIEW status if the connected display type is virtual or if it does not meet the minimum requirement of 1024x768 resolution.
- The XWayland test: Replaces the X server test. It checks the functioning of the XWayland compatibility layer by running the glxgears demo. It also checks the command output.
- Check and log driver info: Checks the display drivers loaded by GDM to see if they are built by Red Hat.
Preparing for the test
- Ensure that a GUI is installed and running on your test system.
- Ensure that the monitor and video card in the system can run at a minimum resolution of 1024x768 with a color depth of 24 bits per pixel (bpp). Higher resolutions and color depths are also acceptable. To confirm, see the commands after this list:
  - For RHEL 8 and 9, check the xrandr command output for 1024x768 at 24 bpp or higher.
  - For RHEL 10, check the wayland-info command output for 1024x768 at 24 bpp or higher.
- If you do not see all the resolutions that the card and monitor combination can generate, remove any KVM switches between the monitor and video card.
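Illustrative commands for the resolution check (output varies by hardware; these are not necessarily the exact invocations the test scripts use):
$ xrandr | grep -w connected    # RHEL 8 and 9: lists outputs and the current mode
$ wayland-info                  # RHEL 10: shows wl_output geometry and modes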
Executing the test
The test is non-interactive. Run the following command and then select the appropriate VIDEO test name from the list that displays.
rhcert-run
First, the test system screen will go blank, and then a series of test patterns from the x11perf test program will appear. When the test finishes, it will return to the desktop or the virtual terminal screen.
Run time
The test takes about 1 minute to complete. Any other mandatory or selected tests will add to the overall run time.
A.2.19. VIDEO_DRM
What the test covers
The VIDEO_DRM test verifies a graphics controller that uses a native DRM kernel driver with basic graphics support.
The test will plan if:
- The display driver in use is identified as a kernel mode-setting driver.
- The display driver is not a framebuffer.
- Direct rendering is not supported, as identified by the glxinfo command, and the OpenGL renderer string is llvmpipe (see the example below).
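As an illustrative check of these conditions, you can inspect the two values yourself; a renderer string of llvmpipe indicates Mesa's software rasterizer:
$ glxinfo | grep -E "direct rendering|OpenGL renderer string"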
RHEL version supported
- RHEL 9
What the test does
The test verifies the functionality of the graphics controller in the same way as the VIDEO test.
Preparing for the test
- Ensure that the monitor and video card in the system can run at a resolution of 1024x768 with a color depth of 24 bits per pixel (bpp). Higher resolutions and color depths are also acceptable. Check the xrandr command output for 1024x768 at 24 bpp or higher to confirm.
- If you do not see all the resolutions that the card and monitor combination can generate, remove any KVM switches between the monitor and video card.
Executing the test
The test is non-interactive. Run the following command and then select the appropriate VIDEO_DRM test name from the list that displays.
rhcert-run
First, the test system screen will go blank, and then a series of test patterns from the x11perf test program will appear. When the test finishes, it will return to the desktop or the virtual terminal screen.
Run time
The test takes about 1 minute to complete. Any other mandatory or selected tests will add to the overall run time.
A.2.20. VIDEO_DRM_3D
What the test covers
The VIDEO_DRM_3D test verifies a graphics controller that uses a native DRM kernel driver with accelerated graphics support.
The test will plan if:
- The display driver in use is identified as a kernel mode-setting driver.
- The display driver is not a framebuffer.
- Direct rendering is supported, as identified by the glxinfo command, and the OpenGL renderer string is not llvmpipe.
The test uses PRIME GPU offloading technology to execute all the video test subtests.
RHEL version supported
- RHEL 9
- RHEL 10
What the test does
The test verifies the functionality of the graphics controller similar to the VIDEO test. In addition, the test runs the following subtests:
For RHEL 9:
- Vulkaninfo test - Logs the vulkaninfo command output to collect Vulkan information such as the device properties of identified GPUs, the Vulkan extensions supported by each GPU, recognized layers, supported image formats, and format properties.
- Glmark2 benchmarking test - Runs the glmark2 command to generate a score based on the OpenGL 2.0 and ES 2.0 benchmark suite and confirm 3D capabilities. The subtest runs the utility twice with different parameters, first with the hardware renderer and then with the software renderer. If the hardware renderer produces a better score than the software renderer, the test passes, confirming that the display controller has hardware 3D acceleration; otherwise, it fails (see the sketch after this list).
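A plausible way to reproduce the comparison manually; the test's actual parameters may differ, and LIBGL_ALWAYS_SOFTWARE=1 is the Mesa environment variable that forces the software renderer:
$ glmark2                          # hardware renderer; prints "glmark2 Score: <N>" at the end
$ LIBGL_ALWAYS_SOFTWARE=1 glmark2  # software renderer (llvmpipe)
If the first score is not clearly higher than the second, hardware acceleration is likely not in effect.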
For RHEL 10:
- Log vulkaninfo test: Logs the vulkaninfo command output.
- Test GPU capabilities: Runs the glmark2-wayland command instead of the X version. It verifies that the test system is truly hardware accelerated by comparing the scores from the hardware and software renderers.
Preparing for the test
- Ensure that a GUI is installed and running on your test system.
- Ensure that the monitor and video card in the system can run at a minimum resolution of 1024x768 with a color depth of 24 bits per pixel (bpp). Higher resolutions and color depths are also acceptable.
  - For RHEL 9, check the xrandr command output for 1024x768 at 24 bpp or higher to confirm.
  - For RHEL 10, check the wayland-info command output for 1024x768 at 24 bpp or higher to confirm.
- If you do not see all the resolutions that the card and monitor combination can generate, remove any KVM switches between the monitor and video card.
Executing the test
The test is non-interactive. Run the following command and then select the appropriate VIDEO_DRM_3D test name from the list that displays.
rhcert-run
First, the test system screen will go blank, and then a series of test patterns from the x11perf test program will appear. When the test finishes, it will return to the desktop or the virtual terminal screen.
Run time
The test takes about 1 minute to complete. Any other mandatory or selected tests will add to the overall run time.
A.2.21. Manually adding and running the tests
On rare occasions, tests may fail to plan because of problems with hardware detection or other issues with the hardware, OS, or test scripts. If this happens, contact your Red Hat support representative for further assistance. They will likely ask you to open a support ticket for the issue and then explain how to manually add a test to your local test plan using the rhcert-cli command on the HUT. Any modifications you make to the local test plan are sent to the test server, so you can continue to use the web interface on the test server to run your tests. The command is run as follows:
# rhcert-cli plan --add --test=<testname> --device=<devicename> --udi=<udi>
The options for the rhcert-cli command used here are:
- plan - Modify the test plan
- --add - Add an item to the test plan
- --test=<testname> - The test to be added. The test names are as follows:
- hwcert/kdump
- hwcert/network/Ethernet/100MegEthernet
- hwcert/network/Ethernet/1GigEthernet
- hwcert/network/Ethernet/10GigEthernet
- hwcert/network/Ethernet/40GigEthernet
- hwcert/network/NetworkManageableCheck
- hwcert/memory
- hwcert/core
- hwcert/cpuscaling
- hwcert/fvtest/fv_core
- hwcert/fvtest/fv_memory
- hwcert/profiler
- hwcert/profiler/profiler_hardware_core
- hwcert/profiler/profiler_hardware_uncore
- hwcert/profiler/profiler_software
- hwcert/storage
- hwcert/video
- hwcert/video/video_drm
- hwcert/video/video_drm_3d
- hwcert/supportable
- hwcert/storage/U2_NVME
- hwcert/storage/U3_NVME
- hwcert/storage/M2_NVME
- hwcert/storage/E3_NVME
- hwcert/storage/PCIE_NVME
The other options are needed only if a device must be specified, as in the network and storage tests that must be told which device to run on. You may need to look in various places to determine the device name or UDI; Support can help you determine the proper value. Once found, use one of the following two options to specify the device:
- --device=<devicename> - The device to be tested, identified by a device name such as "enp0s25" or "host0".
- --udi=<UDI> - The device to be tested, identified by its unique device ID (UDI) string.
Run the rhcert-cli command by specifying the test name:
rhcert-cli run --test=<test_name>
For example:
rhcert-cli run --test=audio
You can specify --device to run the test on a specific device:
rhcert-cli run --test=<test name> --device=<device name>
For example:
rhcert-cli run --test=kdump --device=nfs
It is advisable to use rhcert-cli or rhcert-run independently and save the results. Mixing both rhcert-cli and rhcert-run and saving the results together may prevent the results from being processed correctly.
A.3. RHEL AI hardware certification tests
The following tests are executed for RHEL AI hardware certification:
The tests are planned only if the underlying system is RHEL AI. The redhat-certification-hardware-ai test suite identifies whether your HUT is RHEL AI by checking the following parameters in the /etc/os-release file (see the example after the list):
- RHEL_AI_VERSION_ID
- VARIANT
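You can verify these fields manually:
$ grep -E "^(RHEL_AI_VERSION_ID|VARIANT)=" /etc/os-release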
The RHEL AI certification test suite has the following tests:
- ilab_inferencing test
- ilab_validation test
- self_check test
- supportable test
- sos report test
A.3.1. ilab_inferencing test
What the test covers
The ilab_inferencing test serves and interacts with a pre-trained model and checks whether it uses the AI accelerators during the process. Inferencing is the process by which a model processes input data and produces outputs.
For a detailed list of RHEL AI hardware requirements see Hardware requirements for inference serving Granite models.
What the test does
The ilab inferencing test captures the model name from the serve section of the ilab config file and downloads the model. The test then serves the model and interacts with it.
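To see which model will be served, you can inspect the serve section of the ilab config file; the path shown below is the typical InstructLab default and is an assumption, not taken from the test scripts:
$ grep -A 3 "^serve:" ~/.config/instructlab/config.yaml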
The test monitors and captures the AI accelerators status during the following phases:
- Before the ilab serve starts
- When ilab serve has started
- During the interaction with the loaded model
- After ilab serve stops
Preparing for the test
Before running the test, make sure you are logged in to registry.redhat.io by using the skopeo tool. This allows the ilab_inferencing test to download models from the registry during execution.
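For example, one way to log in (you are prompted for your Red Hat credentials):
$ skopeo login registry.redhat.io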
Executing the test
The test is interactive. The ilab inferencing test starts with the initialization of ilab. When prompted, select a training profile based on the accelerator configuration of your system.
At present, the init ilab subtest returns the REVIEW status.
Run time
Apart from the initial prompt, the test runs automatically and can take a couple of minutes to complete.
A.3.2. ilab_validation test
What the test covers
This test captures the model names mentioned in the generate, train, and evaluate sections of the ilab config file and downloads the models.
Some tested models can exceed 80 GB in size.
What the test does
The ilab_validation test covers end-to-end testing of ilab training. It consists of the following steps:
- Taxonomy
- Synthetic Data Generation (SDG)
- Multiphase training
- Single phase training
- Evaluation by using mmlu
- Evaluation by using mt_bench
For each of the above steps, the test captures the status of the AI accelerators after the test has run for a certain period of time.
Taxonomy
The LAB method is driven by taxonomies, an information classification method. While running the RHEL AI Hardware certification test, the test suite will perform the following functions:
- Clones the RHEL AI git repository.
- Copies the data from the location knowledge/science/astronomy/constellations/phoenix/
- Runs the ilab taxonomy diff command
Synthetic Data Generation (SDG)
A process in which Large Language Models (LLMs), seeded with human-generated samples, generate artificial data that can be used to train other LLMs.
Multiphase training
The LAB method implements a fine-tuning strategy where a model is trained on multiple datasets in separate phases called epochs. Each phase saves a checkpoint, and the best-performing checkpoint is used for further training. The fully fine-tuned model is the best performing checkpoint from the final phase.
For certification tests, training is run only for 2 epochs.
- Starts a tmux session.
- Captures the status of the AI accelerators after running the test for 5 minutes.
- After running the required commands, the test suite prints a list of checkpoints created in /root/.local/share/instructlab/checkpoints/hf_format/ (see the command after this list).
- One of the above checkpoints is randomly selected for the evaluation phase.
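To inspect the checkpoints yourself after the training step:
$ ls /root/.local/share/instructlab/checkpoints/hf_format/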
Preparing for the test
Before running the test, make sure you are logged into registry.redhat.io by using the skopeo tool. This allows the ilab_validation test to download models from the registry during execution.
Executing the test
The test is interactive and takes a couple of hours to complete, depending on the class of AI accelerators available in the HUT. Each subtest starts in a separate tmux session.
While running the SDG test, the test suite performs the following:
- Checks if pre-generated datasets are available.
- If the datasets exist, the test prompts for user confirmation to reuse or delete them:
  - Yes - Skips further steps.
  - No - Deletes the generated datasets and proceeds with the test.
- If the datasets do not exist, the test proceeds to generate them.
After running the required commands, the test suite checks that datasets containing the following names have been generated, as they are consumed in later tests (an illustrative check follows the list):
- knowledge_train_msgs
- skills_train_msgs
- messages
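A quick way to confirm the generated files; the datasets directory shown is an assumption based on the default InstructLab data layout:
$ ls /root/.local/share/instructlab/datasets/ | grep -E "knowledge_train_msgs|skills_train_msgs|messages"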
While the ilab test is running in the background, you can optionally interact with the ilab process. To do so, select one of the following options during the run time:
- Status of ilab process - Checks the current status of the ilab process running in the tmux session.
- Attach tmux session - Attaches the tmux session (read-only mode) in which the ilab process is running. To exit, press Ctrl+b and then d. See the equivalent manual command after this list.
- GPU usage - Prints the current usage of the accelerators in the system.
- Kill ilab process - Kills the currently running ilab process in the tmux session. You are prompted for a reason, after which a kill signal is sent to the ilab process. When you select this option, the ilab_validation subtest returns a FAIL status.
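If you prefer to attach manually, tmux supports a read-only attach; the session name below is a placeholder, as the test chooses its own:
$ tmux attach-session -r -t <session_name>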
The above options are available during the runtime of the ilab process. When the test run is complete, the test status is automatically updated by the observer thread running in the background.
Run time
The following are the approximate run time details of the ilab_validation test trained for 2 epochs:
- SDG - 35 minutes
- Multiphase training - 30 hours for full training, 95 minutes for short training
- Single phase training - 10 minutes
- Combined evaluation - 1 hour
The run time values vary with the class of AI accelerators present in the HUT.