Deploying RHEL 9 on Microsoft Azure
Obtaining RHEL system images and creating RHEL instances on Azure
Providing feedback on Red Hat documentation
We appreciate your feedback on our documentation. Let us know how we can improve it.
Submitting feedback through Jira (account required)
- Log in to the Jira website.
- Click Create in the top navigation bar.
- Enter a descriptive title in the Summary field.
- Enter your suggestion for improvement in the Description field. Include links to the relevant parts of the documentation.
- Click Create at the bottom of the dialogue.
Chapter 1. Introducing RHEL on public cloud platforms
Public cloud platforms offer computing resources as a service. Instead of using on-premise hardware, you can run your IT workloads, including Red Hat Enterprise Linux (RHEL) systems, as public cloud instances.
1.1. Benefits of using RHEL in a public cloud
RHEL as a cloud instance located on a public cloud platform has the following benefits over RHEL on-premises physical systems or virtual machines (VMs):
Flexible and fine-grained allocation of resources
A cloud instance of RHEL runs as a VM on a cloud platform, which typically means a cluster of remote servers maintained by the provider of the cloud service. Therefore, allocating hardware resources to the instance, such as a specific type of CPU or storage, happens on the software level and is easily customizable.
In comparison to a local RHEL system, you are also not limited by the capabilities of your physical host. Instead, you can choose from a variety of features, based on selection offered by the cloud provider.
Space and cost efficiency
You do not need to own any on-premises servers to host your cloud workloads. This avoids the space, power, and maintenance requirements associated with physical hardware.
Instead, on public cloud platforms, you pay the cloud provider directly for using a cloud instance. The cost is typically based on the hardware allocated to the instance and the time you spend using it. Therefore, you can optimize your costs based on your requirements.
Software-controlled configurations
The entire configuration of a cloud instance is saved as data on the cloud platform, and is controlled by software. Therefore, you can easily create, remove, clone, or migrate the instance. A cloud instance is also operated remotely in a cloud provider console and is connected to remote storage by default.
In addition, you can back up the current state of a cloud instance as a snapshot at any time. Afterwards, you can load the snapshot to restore the instance to the saved state.
Separation from the host and software compatibility
Similarly to a local VM, the RHEL guest operating system on a cloud instance runs on a virtualized kernel. This kernel is separate from the host operating system and from the client system that you use to connect to the instance.
Therefore, any operating system can be installed on the cloud instance. This means that on a RHEL public cloud instance, you can run RHEL-specific applications that cannot be used on your local operating system.
In addition, even if the operating system of the instance becomes unstable or is compromised, your client system is not affected in any way.
1.2. Public cloud use cases for RHEL
Deploying on a public cloud provides many benefits, but might not be the most efficient solution in every scenario. If you are evaluating whether to migrate your RHEL deployments to the public cloud, consider whether your use case will benefit from the advantages of the public cloud.
Beneficial use cases
Deploying public cloud instances is very effective for flexibly increasing and decreasing the active computing power of your deployments, also known as scaling up and scaling down. Therefore, using RHEL on public cloud is recommended in the following scenarios:
- Clusters with high peak workloads and low general performance requirements. Scaling up and down based on your demands can be highly efficient in terms of resource costs.
- Quickly setting up or expanding your clusters. This avoids high upfront costs of setting up local servers.
- Cloud instances are not affected by what happens in your local environment. Therefore, you can use them for backup and disaster recovery.
Potentially problematic use cases
- You are running an existing environment that cannot be adjusted. Customizing a cloud instance to fit the specific needs of an existing deployment may not be cost-effective in comparison with your current host platform.
- You are operating with a hard limit on your budget. Maintaining your deployment in a local data center typically provides less flexibility but more control over the maximum resource costs than the public cloud does.
1.3. Frequent concerns when migrating to a public cloud
Moving your RHEL workloads from a local environment to a public cloud platform might raise concerns about the changes involved. The following are the most commonly asked questions.
Will my RHEL work differently as a cloud instance than as a local virtual machine?
In most respects, RHEL instances on a public cloud platform work the same as RHEL virtual machines on a local host, such as an on-premises server. Notable exceptions include:
- Instead of private orchestration interfaces, public cloud instances use provider-specific console interfaces for managing your cloud resources.
- Certain features, such as nested virtualization, may not work correctly. If a specific feature is critical for your deployment, check the feature’s compatibility in advance with your chosen public cloud provider.
Will my data stay safe in a public cloud as opposed to a local server?
The data in your RHEL cloud instances belongs to you, and your public cloud provider does not have access to it. In addition, major cloud providers support data encryption in transit, which improves the security of data when migrating your virtual machines to the public cloud.
The general security of your RHEL public cloud instances is managed as follows:
- Your public cloud provider is responsible for the security of the cloud hypervisor.
- Red Hat provides the security features of the RHEL guest operating systems in your instances.
- You manage the specific security settings and practices in your cloud infrastructure.
What effect does my geographic region have on the functionality of RHEL public cloud instances?
You can use RHEL instances on a public cloud platform regardless of your geographical location. Therefore, you can run your instances in the same region as your on-premises server.
However, hosting your instances in a physically distant region might cause high latency when operating them. In addition, depending on the public cloud provider, certain regions may provide additional features or be more cost-efficient. Before creating your RHEL instances, review the properties of the hosting regions available for your chosen cloud provider.
1.4. Obtaining RHEL for public cloud deployments
To deploy a RHEL system in a public cloud environment, you need to:

- Select the optimal cloud provider for your use case, based on your requirements and the current offer on the market.

  The cloud providers currently certified for running RHEL instances are:

  - Amazon Web Services (AWS)
  - Google Cloud
  - Microsoft Azure

  Note: This document specifically describes deploying RHEL on Microsoft Azure.

- Create a RHEL cloud instance on your chosen cloud platform. For more information, see Methods for creating RHEL cloud instances.
- To keep your RHEL deployment up-to-date, use Red Hat Update Infrastructure (RHUI).
1.5. Methods for creating RHEL cloud instances
To deploy a RHEL instance on a public cloud platform, you can use one of the following methods:
- Create a system image of RHEL and import it to the cloud platform.
- Purchase a RHEL instance directly from the cloud provider marketplace.

For detailed instructions on using various methods to deploy RHEL instances on Microsoft Azure, see the following chapters in this document.
Chapter 2. Creating and automatically uploading VHD images to Microsoft Azure cloud
By using RHEL image builder, you can create .vhd images and automatically upload them to Blob Storage of the Microsoft Azure Cloud service provider.
Prerequisites
- You have root access to the system.
- You have access to the RHEL image builder interface of the RHEL web console.
- You created a blueprint. See Creating a RHEL image builder blueprint in the web console interface.
- You have a Microsoft Storage Account created.
- You have a writable Blob Storage prepared.
Procedure
- In the RHEL image builder dashboard, select the blueprint you want to use.
- Click the Images tab.
- Click Create Image to create your customized .vhd image. The Create image wizard opens.
- Select Microsoft Azure (.vhd) from the Type drop-down menu list.
- Check the Upload to Azure checkbox to upload your image to the Microsoft Azure Cloud.
- Enter the Image Size and click Next.
- On the Upload to Azure page, enter the following information:

  - On the Authentication page, enter:

    - Your Storage account name. You can find it on the Storage account page, in the Microsoft Azure portal.
    - Your Storage access key. You can find it on the Access Key Storage page.
    - Click Next.

  - On the Destination page, enter:

    - The image name.
    - The Storage container. It is the blob container to which you will upload the image. Find it under the Blob service section, in the Microsoft Azure portal.
    - Click Next.

- On the Review page, click Finish. The RHEL image builder and upload processes start.
Access the image you pushed into Microsoft Azure Cloud:

- Access the Microsoft Azure portal.
- In the search bar, type "storage account" and click Storage accounts from the list.
- On the search bar, type "Images" and select the first entry under Services. You are redirected to the Image dashboard.
- On the navigation panel, click Containers.
- Find the container you created. Inside the container is the .vhd file you created and pushed by using RHEL image builder.
Verification
Verify that you can create a VM image and launch it.
- In the search bar, type "Images" and click Images from the list.
- Click Create.
- From the dropdown list, choose the resource group you used earlier.
- Enter a name for the image.
- For the OS type, select Linux.
- For the VM generation, select Gen 2.
- Under Storage blob, click Browse and click through the storage accounts and container until you reach your VHD file.
- Click Select at the end of the page.
- Choose an Account Type, for example, Standard SSD.
- Click Review + create and then Create. Wait a few moments for the image creation.
To launch the VM, follow the steps:

- Click the image name.
- Click + Create VM from the menu bar on the header.
- Enter a name for your virtual machine.
- Complete the Size and Administrator account sections.
- Click Review + create and then Create. You can see the deployment progress.
After the deployment finishes, click the virtual machine name to retrieve the public IP address of the instance, so that you can connect to it by using SSH.

- Open a terminal to create an SSH connection to connect to the VM.
Chapter 3. Deploying a Red Hat Enterprise Linux image as a virtual machine on Microsoft Azure
To deploy a Red Hat Enterprise Linux 9 (RHEL 9) image on Microsoft Azure, follow the information below. This chapter:
- Discusses your options for choosing an image
- Lists or refers to system requirements for your host system and virtual machine (VM)
- Provides procedures for creating a custom VM from an ISO image, uploading it to Azure, and launching an Azure VM instance
You can create a custom VM from an ISO image, but Red Hat recommends that you use the Red Hat Image Builder product to create customized images for use on specific cloud providers. With Image Builder, you can create and upload an Azure Disk Image (VHD format). See Composing a Customized RHEL System Image for more information.
For a list of Red Hat products that you can use securely on Azure, refer to Red Hat on Microsoft Azure.
Prerequisites
- Sign up for a Red Hat Customer Portal account.
- Sign up for a Microsoft Azure account.
3.1. Red Hat Enterprise Linux image options on Azure
The following table lists image choices for RHEL 9 on Microsoft Azure, and notes the differences in the image options.
| Image option | Subscriptions | Sample scenario | Considerations |
|---|---|---|---|
| Deploy a Red Hat Gold Image. | Use your existing Red Hat subscriptions. | Select a Red Hat Gold Image on Azure. For details on Gold Images and how to access them on Azure, see the Red Hat Cloud Access Reference Guide. | The subscription includes the Red Hat product cost; you pay Microsoft for all other instance costs. |
| Deploy a custom image that you move to Azure. | Use your existing Red Hat subscriptions. | Upload your custom image and attach your subscriptions. | The subscription includes the Red Hat product cost; you pay Microsoft for all other instance costs. |
| Deploy an existing Azure image that includes RHEL. | The Azure images include a Red Hat product. | Choose a RHEL image when you create a VM by using the Azure console, or choose a VM from the Azure Marketplace. | You pay Microsoft hourly on a pay-as-you-go model. Such images are called "on-demand." Azure provides support for on-demand images through a support agreement. Red Hat provides updates to the images. Azure makes the updates available through the Red Hat Update Infrastructure (RHUI). |
3.2. Understanding base images
This section includes information about using preconfigured base images and their configuration settings.
3.2.1. Using a custom base image
To manually configure a virtual machine (VM), first create a base (starter) VM image. Then, you can modify configuration settings and add the packages the VM requires to operate on the cloud. You can make additional configuration changes for your specific application after you upload the image.
To prepare a cloud image of RHEL, follow the instructions in the sections below. To prepare a Hyper-V cloud image of RHEL, see Prepare a Red Hat-based virtual machine from Hyper-V Manager.
3.2.2. Required system packages
To create and configure a base image of RHEL, your host system must have the following packages installed.
| Package | Repository | Description |
|---|---|---|
| libvirt | rhel-9-for-x86_64-appstream-rpms | Open source API, daemon, and management tool for managing platform virtualization |
| virt-install | rhel-9-for-x86_64-appstream-rpms | A command-line utility for building VMs |
| libguestfs | rhel-9-for-x86_64-appstream-rpms | A library for accessing and modifying VM file systems |
| guestfs-tools | rhel-9-for-x86_64-appstream-rpms | System administration tools for VMs; includes the guestfish utility |
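The package table above can be collapsed into a single install transaction on the host. The following sketch only assembles and prints the command string (running it for real requires root and the RHEL 9 AppStream repository):

```shell
# Sketch: install every host package from the table above in one transaction.
# Shown as a command string; run it with root privileges on the RHEL 9 host.
packages="libvirt virt-install libguestfs guestfs-tools"
install_cmd="dnf install -y $packages"
echo "$install_cmd"
```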
3.2.3. Azure VM configuration settings
Azure virtual machines (VMs) must have the following configuration settings. Some of these settings are enabled during the initial VM creation. Other settings are set when provisioning the VM image for Azure. Keep these settings in mind as you move through the procedures. Refer to them as necessary.
| Setting | Recommendation |
|---|---|
| SSH | SSH must be enabled to provide remote access to your Azure VMs. |
| dhcp | The primary virtual adapter should be configured for dhcp (IPv4 only). |
| swap space | Do not create a dedicated swap file or swap partition on the OS disk. You can configure swap space with cloud-init on the ephemeral disk. |
| NIC | Choose virtio for the primary virtual network interface. |
| encryption | For custom images, use Network Bound Disk Encryption (NBDE) for full disk encryption on Azure. |
3.2.4. Configuring swap space with cloud-init on Azure
To use swap space for a Red Hat Enterprise Linux (RHEL) virtual machine (VM) on Microsoft Azure, you need to create a swap partition on the ephemeral disk. Only use the ephemeral disk for creating a swap partition, not the operating system (OS) disk or data (storage) disk. Because the ephemeral disk is deleted when the virtual machine is deleted, the swap partition is also removed.
You can use the cloud-init utility to configure a swap partition on the ephemeral disk on demand. The ephemeral disk is local storage attached to the VM, and the resource disk is storage mounted on the VM itself. Both storage types store data only temporarily: deleting, moving, or stopping the VM, or a VM failure, results in the loss of the data stored on the ephemeral or resource disk.
Do not use the ephemeral disk for persistent data. All contents, including the swap partition, are deleted when the VM is stopped or moved.
Prerequisites
- You have installed the cloud-init utility on the VM.
- You have disabled the swap configuration in the Windows Azure Linux Agent (WALA) by setting the following parameters in the /etc/waagent.conf file:

  ResourceDisk.Format=n
  ResourceDisk.EnableSwap=n
  ResourceDisk.SwapSizeMB=0

- You have an ephemeral disk available on the VM.
Procedure
- Log in to the VM.
- Create and edit the /etc/cloud/cloud.cfg.d/00-azure-swap.cfg configuration file and add the following cloud-init configuration to the file:

  # vi /etc/cloud/cloud.cfg.d/00-azure-swap.cfg

  disk_setup:
    ephemeral0:
      table_type: gpt
      layout: [66, [33, 82]]
      overwrite: true
  fs_setup:
    - device: ephemeral0.1
      filesystem: ext4
    - device: ephemeral0.2
      filesystem: swap
  mounts:
    - ["ephemeral0.1", "/mnt"]
    - ["ephemeral0.2", "none", "swap", "sw,nofail,x-systemd.requires=cloud-init.service", "0", "0"]

  This configuration:

  - Partitions the ephemeral disk (ephemeral0) with a GPT partition table.
  - Creates two partitions: 66% for a file system (mounted at /mnt) and 33% for swap space.
  - Formats the first partition as ext4 and the second partition as swap.
  - Configures automatic mounting of both partitions at boot time.

  Note: The partition layout [66, [33, 82]] allocates 66% of the disk to the first partition and 33% to the second partition. The 82 in the second partition specification indicates a Linux swap partition type. You can adjust these percentages based on your requirements.
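As a rough illustration of what the percentages in the layout mean in practice, the following sketch works out the split for a hypothetical 32 GiB ephemeral disk (the 82 is a partition type code, not a size):

```shell
# How the [66, [33, 82]] layout splits a hypothetical 32 GiB ephemeral disk.
# 66 and 33 are percentages of the disk; 82 is the Linux swap partition type.
disk_mib=$((32 * 1024))              # 32 GiB expressed in MiB
fs_mib=$(( disk_mib * 66 / 100 ))    # first partition: ext4, mounted at /mnt
swap_mib=$(( disk_mib * 33 / 100 ))  # second partition: swap
echo "filesystem: ${fs_mib} MiB, swap: ${swap_mib} MiB"
# → filesystem: 21626 MiB, swap: 10813 MiB
```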
Verify the configuration file for any errors:
  # cloud-init devel schema --config-file /etc/cloud/cloud.cfg.d/00-azure-swap.cfg

  If the configuration is valid, the command returns no errors.
Verification
After you reboot the VM, check that the swap partition is configured and active by verifying the active swap space, swap usage, and the swap partition entry in the /etc/fstab file.

- Check active swap space:

  $ swapon -s

  The output should show the swap partition from ephemeral0.2:

  Filename           Type       Size     Used  Priority
  /dev/ephemeral0.2  partition  8388604  0     -2

- Check swap usage:

  $ free -h

  The output should show swap space in the Swap row:

                total   used    free    shared  buff/cache  available
  Mem:          7.8Gi   1.2Gi   5.8Gi   16Mi    800Mi       6.3Gi
  Swap:         8.0Gi   0B      8.0Gi

- Verify that the swap partition is present in the /etc/fstab file:

  $ grep swap /etc/fstab

  The output should include an entry for the swap partition, for example:

  /dev/ephemeral0.2 none swap sw,nofail,x-systemd.requires=cloud-init.service 0 0
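If you automate this verification, the fstab check can be expressed as a field test rather than a plain substring match. A sketch using sample data (on a real VM, read /etc/fstab directly):

```shell
# Sketch: assert that a sample fstab line is a swap entry carrying the
# nofail option. The sample string stands in for a line from /etc/fstab.
line='/dev/ephemeral0.2 none swap sw,nofail,x-systemd.requires=cloud-init.service 0 0'
result=$(echo "$line" | awk '$3 == "swap" && $4 ~ /nofail/ {print "swap entry ok"}')
echo "$result"
# → swap entry ok
```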
3.2.5. Creating a base image from an ISO image
The following procedure lists the steps and initial configuration requirements for creating a custom base image from an ISO image. Once you have configured the image, you can use it as a template for creating additional VM instances.
Prerequisites
- Ensure that you have enabled your host machine for virtualization. See Enabling virtualization in RHEL 9 for information and procedures.
Procedure
- Download the latest Red Hat Enterprise Linux 9 DVD ISO image from the Red Hat Customer Portal.
Create and start a basic Red Hat Enterprise Linux VM. For instructions, see Creating virtual machines.
If you use the command line to create your VM, ensure that you set the default memory and CPUs to the capacity you want for the VM. Set your virtual network interface to virtio.
For example, the following command creates a kvmtest VM by using the rhel-9.0-x86_64-kvm.qcow2 image:

# virt-install \
    --name kvmtest --memory 2048 --vcpus 2 \
    --disk rhel-9.0-x86_64-kvm.qcow2,bus=virtio \
    --import --os-variant=rhel9.0

If you use the web console to create your VM, follow the procedure in Creating virtual machines using the web console, with these caveats:
- Do not check Immediately Start VM.
- Change your Memory size to your preferred settings.
- Before you start the installation, ensure that you have changed Model under Virtual Network Interface Settings to virtio and change your vCPUs to the capacity settings you want for the VM.
Review the following additional installation selections and modifications.
- Select Minimal Install with the standard RHEL option.
For Installation Destination, select Custom Storage Configuration. Use the following configuration information to make your selections.
- Allocate at least 500 MB for /boot; 1 GB or more is adequate.
- For file system, use xfs, ext4, or ext3 for both boot and root partitions.
- During installation, remove swap space from the OS disk. Use cloud-init on the ephemeral disk after deployment to configure swap space.
- On the Installation Summary screen, select Network and Host Name. Switch Ethernet to On.
When the install starts:
- Create a root password.
- Create an administrative user account.

When installation is complete, reboot the VM and log in to the root account. Once you are logged in as root, you can configure the image.
3.3. Configuring a custom base image for Microsoft Azure
To deploy a RHEL 9 virtual machine (VM) with specific settings in Azure, you can create a custom base image for the VM. The following sections describe additional configuration changes that Azure requires.
3.3.1. Installing Hyper-V device drivers
Microsoft provides network and storage device drivers as part of their Linux Integration Services (LIS) for Hyper-V package. You may need to install Hyper-V device drivers on the VM image prior to provisioning it as an Azure virtual machine (VM). Use the lsinitrd | grep hv command to verify that the drivers are installed.
Procedure
- Enter the following grep command to determine if the required Hyper-V device drivers are installed:

  # lsinitrd | grep hv

  In the example below, all required drivers are installed:

  # lsinitrd | grep hv
  drwxr-xr-x   2 root root      0 Aug 12 14:21 usr/lib/modules/3.10.0-932.el9.x86_64/kernel/drivers/hv
  -rw-r--r--   1 root root  31272 Aug 11 08:45 usr/lib/modules/3.10.0-932.el9.x86_64/kernel/drivers/hv/hv_vmbus.ko.xz
  -rw-r--r--   1 root root  25132 Aug 11 08:46 usr/lib/modules/3.10.0-932.el9.x86_64/kernel/drivers/net/hyperv/hv_netvsc.ko.xz
  -rw-r--r--   1 root root   9796 Aug 11 08:45 usr/lib/modules/3.10.0-932.el9.x86_64/kernel/drivers/scsi/hv_storvsc.ko.xz

  If not all of the drivers are installed, complete the remaining steps.
  Note: An hv_vmbus driver may exist in the environment. Even if this driver is present, complete the following steps.

- Create a file named hv.conf in /etc/dracut.conf.d.
- Add the following driver parameters to the hv.conf file:

  add_drivers+=" hv_vmbus "
  add_drivers+=" hv_netvsc "
  add_drivers+=" hv_storvsc "
  add_drivers+=" nvme "

  Note: Note the spaces before and after the quotes, for example, add_drivers+=" hv_vmbus ". This ensures that unique drivers are loaded in the event that other Hyper-V drivers already exist in the environment.

- Regenerate the initramfs image:

  # dracut -f -v --regenerate-all
Verification
- Reboot the machine.
- Run the lsinitrd | grep hv command to verify that the drivers are installed.
3.3.2. Making configuration changes required for a Microsoft Azure deployment
Before you deploy your custom base image to Azure, you must perform additional configuration changes to ensure that the virtual machine (VM) can properly operate in Azure.
Procedure
- Log in to the VM.
- Register the VM, and enable the Red Hat Enterprise Linux 9 repository:

  # subscription-manager register

  Installed Product Current Status:
  Product Name: Red Hat Enterprise Linux for x86_64
  Status: Subscribed

- Ensure that the cloud-init and hyperv-daemons packages are installed:

  # dnf install cloud-init hyperv-daemons -y

- Create cloud-init configuration files that are needed for integration with Azure services:
/etc/cloud/cloud.cfg.d/10-azure-kvp.cfgconfiguration file and add the following lines to that file.reporting: logging: type: log telemetry: type: hypervreporting: logging: type: log telemetry: type: hypervCopy to Clipboard Copied! Toggle word wrap Toggle overflow To add Azure as a datasource, create the
/etc/cloud/cloud.cfg.d/91-azure_datasource.cfgconfiguration file, and add the following lines to that file.datasource_list: [ Azure ] datasource: Azure: apply_network_config: Falsedatasource_list: [ Azure ] datasource: Azure: apply_network_config: FalseCopy to Clipboard Copied! Toggle word wrap Toggle overflow To configure swap space on the ephemeral disk, create the
/etc/cloud/cloud.cfg.d/00-azure-swap.cfgconfiguration file and add the following lines.ImportantThe ephemeral disk is temporary storage. Therefore, data stored on it, including swap space, is lost when the VM is deallocated or moved. Use the ephemeral disk only for temporary data such as swap space.
Copy to Clipboard Copied! Toggle word wrap Toggle overflow
To ensure that specific kernel modules are blocked from loading automatically, edit or create the
/etc/modprobe.d/blocklist.conffile and add the following lines to that file.Copy to Clipboard Copied! Toggle word wrap Toggle overflow Modify
udevnetwork device rules:Remove the following persistent network device rules if present.
rm -f /etc/udev/rules.d/70-persistent-net.rules rm -f /etc/udev/rules.d/75-persistent-net-generator.rules rm -f /etc/udev/rules.d/80-net-name-slot-rules
# rm -f /etc/udev/rules.d/70-persistent-net.rules # rm -f /etc/udev/rules.d/75-persistent-net-generator.rules # rm -f /etc/udev/rules.d/80-net-name-slot-rulesCopy to Clipboard Copied! Toggle word wrap Toggle overflow To ensure that Accelerated Networking on Azure works as intended, create a new network device rule
/etc/udev/rules.d/68-azure-sriov-nm-unmanaged.rulesand add the following line to it.SUBSYSTEM=="net", DRIVERS=="hv_pci", ACTION=="add", ENV{NM_UNMANAGED}="1"SUBSYSTEM=="net", DRIVERS=="hv_pci", ACTION=="add", ENV{NM_UNMANAGED}="1"Copy to Clipboard Copied! Toggle word wrap Toggle overflow
- Set the sshd service to start automatically:

  # systemctl enable sshd
  # systemctl is-enabled sshd

- Modify kernel boot parameters:
  - Open the /etc/default/grub file, and ensure the GRUB_TIMEOUT line has the following value:

    GRUB_TIMEOUT=10

  - Remove the following options from the end of the GRUB_CMDLINE_LINUX line if present:

    rhgb quiet

  - Ensure the /etc/default/grub file contains the following lines with all the specified options:

    GRUB_CMDLINE_LINUX="loglevel=3 crashkernel=auto console=tty1 console=ttyS0 earlyprintk=ttyS0 rootdelay=300"
    GRUB_TIMEOUT_STYLE=countdown
    GRUB_TERMINAL="serial console"
    GRUB_SERIAL_COMMAND="serial --speed=115200 --unit=0 --word=8 --parity=no --stop=1"

    Note: If you are not running workloads on HDDs, add elevator=none to the end of the GRUB_CMDLINE_LINUX line. This sets the I/O scheduler to none, which improves I/O performance on SSD-based systems.

  - Regenerate the grub.cfg file.

    On a BIOS-based machine:

    - In RHEL 9.2 and earlier:

      # grub2-mkconfig -o /boot/grub2/grub.cfg

    - In RHEL 9.3 and later:

      # grub2-mkconfig -o /boot/grub2/grub.cfg --update-bls-cmdline

    On a UEFI-based machine:

    - In RHEL 9.2 and earlier:

      # grub2-mkconfig -o /boot/grub2/grub.cfg

    - In RHEL 9.3 and later:

      # grub2-mkconfig -o /boot/grub2/grub.cfg --update-bls-cmdline

    Warning: The path for rebuilding grub.cfg is the same on both BIOS-based and UEFI-based machines. The actual grub.cfg is present at the BIOS path only. The UEFI path contains a stub file that must not be modified or recreated by using the grub2-mkconfig command. If your system uses a non-default location for grub.cfg, adjust the command accordingly.
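A quick scripted sanity check of the options above might look like the following sketch. It is illustrative only and operates on a temporary copy instead of the real /etc/default/grub:

```shell
# Sketch: verify that a grub defaults file carries the required serial-console
# options. Uses a temporary file, not the real /etc/default/grub.
grub=$(mktemp)
cat > "$grub" <<'EOF'
GRUB_TIMEOUT=10
GRUB_CMDLINE_LINUX="loglevel=3 crashkernel=auto console=tty1 console=ttyS0 earlyprintk=ttyS0 rootdelay=300"
GRUB_TERMINAL="serial console"
EOF
missing=0
for opt in 'console=ttyS0' 'rootdelay=300' 'GRUB_TIMEOUT=10'; do
  grep -q "$opt" "$grub" || { echo "missing: $opt"; missing=1; }
done
[ "$missing" -eq 0 ] && echo "all required options present"
rm -f "$grub"
```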
- Configure the Windows Azure Linux Agent (WALinuxAgent):

  - Install and enable the WALinuxAgent package:

    # dnf install WALinuxAgent -y
    # systemctl enable waagent

  - To disable swap configuration in WALinuxAgent (required when using cloud-init to manage swap), edit the following lines in the /etc/waagent.conf file:

    Provisioning.DeleteRootPassword=y
    ResourceDisk.Format=n
    ResourceDisk.EnableSwap=n
    ResourceDisk.SwapSizeMB=0

    Note: By disabling swap in WALinuxAgent, you enable cloud-init to manage the swap configuration on the ephemeral disk.
Prepare the VM for Azure provisioning:
Unregister the VM from Red Hat Subscription Manager.
# subscription-manager unregister

Clean up the existing provisioning details.

# waagent -force -deprovision

Note: This command generates warnings, which are expected because Azure handles the provisioning of VMs automatically.

Clean the shell history and shut down the VM.

# export HISTSIZE=0
# poweroff
3.4. Converting the image to a fixed VHD format
All Microsoft Azure VM images must be in a fixed VHD format. The image must be aligned on a 1 MB boundary before it is converted to VHD. To convert the image from qcow2 to a fixed VHD format and align the image, see the following procedure. Once you have converted the image, you can upload it to Azure.
Procedure
Convert the image from qcow2 to raw format.

$ qemu-img convert -f qcow2 -O raw <image-name>.qcow2 <image-name>.raw

Create a shell script that checks whether the raw image size is aligned on a 1 MB boundary and, if it is not, prints the size rounded up to the next 1 MB boundary.

Run the script. This example uses the name align.sh.

$ sh align.sh <image-xxx>.raw

- If the message "Your image is already aligned. You do not need to resize." displays, proceed to the following step.
- If a value displays, your image is not aligned.
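One way to write the alignment-check script (align.sh in the example above) is sketched below, under the assumption that the input is a raw image: for raw images, the file size reported by stat equals the virtual disk size. The aligned-case message matches the wording the procedure checks for; the rest is illustrative:

```shell
#!/bin/sh
# align.sh — check whether a raw image is aligned on a 1 MB boundary.
MB=$((1024 * 1024))
size=$(stat -c %s "$1")
rounded=$(( (size + MB - 1) / MB * MB ))

if [ "$rounded" -eq "$size" ]; then
    echo "Your image is already aligned. You do not need to resize."
else
    # Print the size rounded up to the next 1 MB boundary; pass this
    # value to 'qemu-img resize' in the alignment steps.
    echo "$rounded"
fi
```

The printed value is the rounded size to use when resizing the raw image.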
Use the following command to convert the file to a fixed VHD format. The sample uses qemu-img version 2.12.0.

$ qemu-img convert -f raw -o subformat=fixed,force_size -O vpc <image-xxx>.raw <image-xxx>.vhd

Once converted, the VHD file is ready to upload to Azure.

If the raw image is not aligned, complete the following steps to align it.

Resize the raw file by using the rounded value displayed when you ran the verification script.

$ qemu-img resize -f raw <image-xxx>.raw <rounded-value>

Convert the raw image file to a VHD format. The sample uses qemu-img version 2.12.0.

$ qemu-img convert -f raw -o subformat=fixed,force_size -O vpc <image-xxx>.raw <image-xxx>.vhd

Once converted, the VHD file is ready to upload to Azure.
3.5. Installing the Azure CLI
Complete the following steps to install the Azure command-line interface (Azure CLI 2.1). Azure CLI 2.1 is a Python-based utility that creates and manages VMs in Azure.
Prerequisites
- You need to have an account with Microsoft Azure before you can use the Azure CLI.
- The Azure CLI installation requires Python 3.x.
Procedure
Import the Microsoft repository key.
$ sudo rpm --import https://packages.microsoft.com/keys/microsoft.asc

Create a local Azure CLI repository entry.

$ sudo sh -c 'echo -e "[azure-cli]\nname=Azure CLI\nbaseurl=https://packages.microsoft.com/yumrepos/azure-cli\nenabled=1\ngpgcheck=1\ngpgkey=https://packages.microsoft.com/keys/microsoft.asc" > /etc/yum.repos.d/azure-cli.repo'

Update the dnf package index.

$ dnf check-update

Check your Python version (python --version) and install Python 3.x, if necessary.

$ sudo dnf install python3

Install the Azure CLI.

$ sudo dnf install -y azure-cli

Run the Azure CLI.

$ az
3.6. Creating resources in Azure
Complete the following procedure to create the Azure resources that you need before you can upload the VHD file and create the Azure image.
Procedure
Authenticate your system with Azure and log in.
$ az login

Note: If a browser is available in your environment, the CLI opens your browser to the Azure sign-in page. See Sign in with Azure CLI for more information and options.
Create a resource group in an Azure region.
$ az group create --name <resource-group> --location <azure-region>

Example:

[clouduser@localhost]$ az group create --name azrhelclirsgrp --location southcentralus

Create a storage account. See SKU Types for more information about valid SKU values.
$ az storage account create -l <azure-region> -n <storage-account-name> -g <resource-group> --sku <sku_type>

Example:

[clouduser@localhost]$ az storage account create -l southcentralus -n azrhelclistact -g azrhelclirsgrp --sku Standard_LRS

Get the storage account connection string.
$ az storage account show-connection-string -n <storage-account-name> -g <resource-group>

Example:

[clouduser@localhost]$ az storage account show-connection-string -n azrhelclistact -g azrhelclirsgrp
{
  "connectionString": "DefaultEndpointsProtocol=https;EndpointSuffix=core.windows.net;AccountName=azrhelclistact;AccountKey=NreGk...=="
}

Export the connection string by copying the connection string and pasting it into the following command. This string connects your system to the storage account.

$ export AZURE_STORAGE_CONNECTION_STRING="<storage-connection-string>"

Example:

[clouduser@localhost]$ export AZURE_STORAGE_CONNECTION_STRING="DefaultEndpointsProtocol=https;EndpointSuffix=core.windows.net;AccountName=azrhelclistact;AccountKey=NreGk...=="
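Rather than copying the connection string by hand, you can extract the connectionString field from the JSON that the az command prints. A minimal sketch using sed; the JSON below is a placeholder standing in for real az output, and for anything more complex than this one-line shape you should prefer a real JSON parser:

```shell
# Extract "connectionString" from output shaped like that of
# 'az storage account show-connection-string'. The JSON here is a
# placeholder; on a live system, pipe the az command output instead.
json='{ "connectionString": "DefaultEndpointsProtocol=https;AccountName=azrhelclistact;AccountKey=NreGk...==" }'

AZURE_STORAGE_CONNECTION_STRING=$(printf '%s' "$json" |
    sed -n 's/.*"connectionString": *"\([^"]*\)".*/\1/p')
export AZURE_STORAGE_CONNECTION_STRING
echo "$AZURE_STORAGE_CONNECTION_STRING"
```

The az CLI can also do this directly with its JMESPath query support, for example by appending --query connectionString -o tsv to the show-connection-string command.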
$ az storage container create -n <container-name>

Example:

[clouduser@localhost]$ az storage container create -n azrhelclistcont
{
  "created": true
}

Create a virtual network.
$ az network vnet create -g <resource-group> --name <vnet-name> --subnet-name <subnet-name>
3.7. Uploading and creating an Azure image
Complete the following steps to upload the VHD file to your container and create an Azure custom image.
The exported storage connection string does not persist after a system reboot. If any of the commands in the following steps fail, export the connection string again.
Procedure
Upload the VHD file to the storage container. It may take several minutes. To get a list of storage containers, enter the az storage container list command.

$ az storage blob upload \
    --account-name <storage-account-name> --container-name <container-name> \
    --type page --file <path-to-vhd> --name <image-name>.vhd

Example:

[clouduser@localhost]$ az storage blob upload \
    --account-name azrhelclistact --container-name azrhelclistcont \
    --type page --file rhel-image-{ProductNumber}.vhd --name rhel-image-{ProductNumber}.vhd
Percent complete: %100.0

Get the URL for the uploaded VHD file to use in the following step.

$ az storage blob url -c <container-name> -n <image-name>.vhd

Example:

$ az storage blob url -c azrhelclistcont -n rhel-image-9.vhd
"https://azrhelclistact.blob.core.windows.net/azrhelclistcont/rhel-image-9.vhd"

Create the Azure custom image.

$ az image create -n <image-name> -g <resource-group> -l <azure-region> --source <URL> --os-type linux

Note: The default hypervisor generation of the VM is V1. You can optionally specify a V2 hypervisor generation by including the option --hyper-v-generation V2. Generation 2 VMs use a UEFI-based boot architecture. See Support for generation 2 VMs on Azure for information about generation 2 VMs.

The command may return the error "Only blobs formatted as VHDs can be imported." This error may mean that the image was not aligned to the nearest 1 MB boundary before it was converted to VHD.

Example:

$ az image create -n rhel9 -g azrhelclirsgrp2 -l southcentralus --source https://azrhelclistact.blob.core.windows.net/azrhelclistcont/rhel-image-9.vhd --os-type linux
3.8. Creating and starting the VM in Azure
The following steps provide the minimum command options to create a managed-disk Azure VM from the image. See az vm create for additional options.
Procedure
Enter the following command to create the VM.
$ az vm create \
    -g <resource-group> -l <azure-region> -n <vm-name> \
    --vnet-name <vnet-name> --subnet <subnet-name> --size Standard_A2 \
    --os-disk-name <simple-name> --admin-username <administrator-name> \
    --generate-ssh-keys --image <path-to-image>

Note: The option --generate-ssh-keys creates a private/public key pair. Private and public key files are created in ~/.ssh on your system. The public key is added to the authorized_keys file on the VM for the user specified by the --admin-username option. See Other authentication methods for additional information.

Note the publicIpAddress in the command output. You need this address to log in to the VM in the following step.

Start an SSH session and log in to the VM.

$ ssh <admin-username>@<public-ip-address>
If you see a user prompt, you have successfully deployed your Azure VM.
You can now go to the Microsoft Azure portal and check the audit logs and properties of your resources. You can manage your VMs directly in this portal. If you are managing multiple VMs, you should use the Azure CLI. The Azure CLI provides a powerful interface to your resources in Azure. Enter az --help in the CLI or see the Azure CLI command reference to learn more about the commands you use to manage your VMs in Microsoft Azure.
3.9. Other authentication methods
While recommended for increased security, using the Azure-generated key pair is not required. The following examples show two methods for SSH authentication.
Example 1: These command options provision a new VM without generating a public key file. They allow SSH authentication by using a password.
$ az vm create \
-g <resource-group> -l <azure-region> -n <vm-name> \
--vnet-name <vnet-name> --subnet <subnet-name> --size Standard_A2 \
--os-disk-name <simple-name> --authentication-type password \
--admin-username <administrator-name> --admin-password <ssh-password> --image <path-to-image>
$ ssh <admin-username>@<public-ip-address>
Example 2: These command options provision a new Azure VM and allow SSH authentication by using an existing public key file.
$ az vm create \
-g <resource-group> -l <azure-region> -n <vm-name> \
--vnet-name <vnet-name> --subnet <subnet-name> --size Standard_A2 \
--os-disk-name <simple-name> --admin-username <administrator-name> \
--ssh-key-value <path-to-existing-ssh-key> --image <path-to-image>
$ ssh -i <path-to-existing-ssh-key> <admin-username>@<public-ip-address>
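For Example 2 you need an existing key pair. The following sketch generates one with ssh-keygen; the directory and file names are illustrative, and whether you pass the public key file path or its contents to --ssh-key-value can depend on your Azure CLI version, so check az vm create --help:

```shell
# Generate an SSH key pair to use with 'az vm create' and 'ssh -i'.
keydir=$(mktemp -d)                     # illustrative location
ssh-keygen -t rsa -b 3072 -N '' -f "$keydir/azure_vm_key" -q

# azure_vm_key      -> private key, used with 'ssh -i'
# azure_vm_key.pub  -> public key, supplied to the VM at creation time
ls "$keydir"
rm -rf "$keydir"
```

Keep the private key on the client machine only; Azure never needs it.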
3.10. Attaching Red Hat subscriptions
Using the subscription-manager command, you can register and attach your Red Hat subscription to a RHEL instance.
Prerequisites
- You must have enabled your subscriptions.
Procedure
Register your system.
# subscription-manager register

Attach your subscriptions.
- You can use an activation key to attach subscriptions. See Creating Red Hat Customer Portal Activation Keys for more information.
- Also, you can manually attach a subscription by using the ID of the subscription pool (Pool ID). See Attaching a host-based subscription to hypervisors.
Optional: To collect various system metrics about the instance in the Red Hat Hybrid Cloud Console, you can register the instance with Red Hat Lightspeed.
# insights-client register --display-name <display_name_value>

For information about further configuration of Red Hat Lightspeed, see Client Configuration Guide for Red Hat Lightspeed.
3.11. Setting up automatic registration on Azure Gold Images
To make deploying RHEL 9 virtual machines (VMs) on Microsoft Azure faster and more convenient, you can set up Gold Images of RHEL 9 that are automatically registered to the Red Hat Subscription Manager (RHSM).
Prerequisites
RHEL 9 Gold Images are available to you in Microsoft Azure. For instructions, see Using Gold Images on Azure.
Note: A Microsoft Azure account can only be attached to a single Red Hat account at a time. Therefore, ensure no other users require access to the Azure account before attaching it to your Red Hat account.
Procedure
- Use the Gold Image to create a RHEL 9 VM in your Azure instance. For instructions, see Creating and starting the VM in Azure.
- Start the created VM.
In the RHEL 9 VM, enable automatic registration.
# subscription-manager config --rhsmcertd.auto_registration=1

Enable the rhsmcertd service.

# systemctl enable rhsmcertd.service

Disable the redhat.repo repository.

# subscription-manager config --rhsm.manage_repos=0

- Power off the VM, and save it as a managed image on Azure. For instructions, see How to create a managed image of a virtual machine or VHD.
- Create VMs by using the managed image. They will be automatically subscribed to RHSM.
Verification
In a RHEL 9 VM created by using the above instructions, verify that the system is registered to RHSM by executing the subscription-manager identity command. On a successfully registered system, this displays the UUID of the system. For example:

# subscription-manager identity
system identity: fdc46662-c536-43fb-a18a-bbcb283102b7
name: 192.168.122.222
org name: 6340056
org ID: 6340056
3.12. Configuring kdump for Microsoft Azure instances
If a kernel crash occurs in a RHEL instance, you can use the kdump service to determine the cause of the crash. If kdump is configured correctly, it generates a dump file, known as a crash dump or vmcore file, when your instance kernel terminates unexpectedly. You can then analyze the file to find why the crash occurred and to debug your system.
For kdump to work on Microsoft Azure instances, you might need to adjust the kdump reserved memory and the vmcore target to fit VM sizes and RHEL versions.
Prerequisites
You are using a Microsoft Azure environment that supports kdump:

- Standard_DS2_v2 VM
- Standard NV16as v4
- Standard M416-208s v2
- Standard M416ms v2

- You have root permissions on the system.
- Your system meets the requirements for kdump configurations and targets. For details, see Supported kdump configurations and targets.
Procedure
Ensure that kdump and other necessary packages are installed on your system.

# dnf install kexec-tools

Verify that the default location for crash dump files is set in the kdump configuration file and that the /var/crash directory is available.

# grep -v "#" /etc/kdump.conf
path /var/crash
core_collector makedumpfile -l --message-level 7 -d 31

Based on the size and version of your RHEL virtual machine (VM) instance, decide whether you need a
vmcore target with more free space, such as /mnt/crash. To do so, use the following table.

Table 3.4. Virtual machine sizes that have been tested with GEN2 VM on Azure

| RHEL Version | Standard DS1 v2 (1 vCPU, 3.5 GiB) | Standard NV16as v4 (16 vCPUs, 56 GiB) | Standard M416-208s v2 (208 vCPUs, 5700 GiB) | Standard M416ms v2 (416 vCPUs, 11400 GiB) |
|---|---|---|---|---|
| RHEL 9.0 - RHEL 9.3 | Default | Default | Target | Target |

- Default indicates that kdump works as expected with the default memory and the default kdump target. The default kdump target is /var/crash.
- Target indicates that kdump works as expected with the default memory. However, you might need to assign a target with more free space.
If your instance requires it, assign a target with more free space, such as
/mnt/crash. To do so, edit the /etc/kdump.conf file and replace the default path.

# sed -i 's|path /var/crash|path /mnt/crash|' /etc/kdump.conf

The option path /mnt/crash represents the path to the file system in which kdump saves the crash dump file.

For more options, such as writing the crash dump file to a different partition, directly to a device, or storing it to a remote machine, see Configuring the kdump target.
If your instance requires it, increase the crash kernel size to a size sufficient for kdump to capture the vmcore by adding the respective boot parameter.

For example, for a Standard M416-208s v2 VM, the sufficient size is 512 MB, so the boot parameter would be crashkernel=512M.

Open the GRUB configuration file and add crashkernel=512M to the boot parameter line.

# vi /etc/default/grub
GRUB_CMDLINE_LINUX="console=tty1 console=ttyS0 earlyprintk=ttyS0 rootdelay=300 crashkernel=512M"
In RHEL 9.2 and earlier:
# grub2-mkconfig -o /boot/grub2/grub.cfg

In RHEL 9.3 and later:

# grub2-mkconfig -o /boot/grub2/grub.cfg --update-bls-cmdline
- Reboot the VM to allocate separate kernel crash memory to the VM.
Verification
Ensure that kdump is active and running.

# systemctl status kdump.service
Chapter 4. Configuring a Red Hat High Availability cluster on Microsoft Azure
To create a cluster where RHEL nodes automatically redistribute their workloads if a node failure occurs, use the Red Hat High Availability Add-On. Such high availability (HA) clusters can also be hosted on public cloud platforms, including Microsoft Azure. Creating RHEL HA clusters on Azure is similar to creating HA clusters in non-cloud environments, with certain specifics.
To configure a Red Hat HA cluster on Azure using Azure virtual machine (VM) instances as cluster nodes, see the following sections. The procedures in these sections assume that you are creating a custom image for Azure. You have a number of options for obtaining the RHEL 9 images you use for your cluster. See Red Hat Enterprise Linux Image Options on Azure for information on image options for Azure.
The following sections provide:
- Prerequisite procedures for setting up your environment for Azure. After you set up your environment, you can create and configure Azure VM instances.
- Procedures specific to the creation of HA clusters, which transform individual nodes into a cluster of HA nodes on Azure. These include procedures for installing the High Availability packages and agents on each cluster node, configuring fencing, and installing Azure network resource agents.
Prerequisites
- Sign up for a Red Hat Customer Portal account.
- Sign up for a Microsoft Azure account with administrator privileges.
- You need to install the Azure command-line interface (CLI). For more information, see Installing the Azure CLI.
4.1. The benefits of using high-availability clusters on public cloud platforms
A high-availability (HA) cluster is a set of computers (called nodes) that are linked together to run a specific workload. The purpose of HA clusters is to provide redundancy in case of a hardware or software failure. If a node in the HA cluster fails, the Pacemaker cluster resource manager distributes the workload to other nodes and no noticeable downtime occurs in the services that are running on the cluster.
You can also run HA clusters on public cloud platforms. In this case, you would use virtual machine (VM) instances in the cloud as the individual cluster nodes. Using HA clusters on a public cloud platform has the following benefits:
- Improved availability: In case of a VM failure, the workload is quickly redistributed to other nodes, so running services are not disrupted.
- Scalability: Additional nodes can be started when demand is high and stopped when demand is low.
- Cost-effectiveness: With the pay-as-you-go pricing, you pay only for nodes that are running.
- Simplified management: Some public cloud platforms offer management interfaces to make configuring HA clusters easier.
To enable HA on your Red Hat Enterprise Linux (RHEL) systems, Red Hat offers a High Availability Add-On. The High Availability Add-On provides all necessary components for creating HA clusters on RHEL systems. The components include high availability service management and cluster administration tools.
4.2. Creating resources in Azure
Complete the following procedure to create a resource group, storage account, virtual network, and availability set in an Azure region. You need these resources to set up a cluster on Microsoft Azure.
Procedure
Authenticate your system with Azure and log in.
$ az login

Note: If a browser is available in your environment, the CLI opens your browser to the Azure sign-in page.

Create a resource group in an Azure region.

$ az group create --name <resource-group> --location <azure-region>

Example:

[clouduser@localhost]$ az group create --name azrhelclirsgrp --location southcentralus

Create a storage account.

$ az storage account create -l <azure-region> -n <storage-account-name> -g <resource-group> --sku <sku_type> --kind StorageV2

Example:

[clouduser@localhost]$ az storage account create -l southcentralus -n azrhelclistact -g azrhelclirsgrp --sku Standard_LRS --kind StorageV2

Get the storage account connection string.

$ az storage account show-connection-string -n <storage-account-name> -g <resource-group>

Example:

[clouduser@localhost]$ az storage account show-connection-string -n azrhelclistact -g azrhelclirsgrp
{
  "connectionString": "DefaultEndpointsProtocol=https;EndpointSuffix=core.windows.net;AccountName=azrhelclistact;AccountKey=NreGk...=="
}

Export the connection string by copying the connection string and pasting it into the following command. This string connects your system to the storage account.

$ export AZURE_STORAGE_CONNECTION_STRING="<storage-connection-string>"

Example:

[clouduser@localhost]$ export AZURE_STORAGE_CONNECTION_STRING="DefaultEndpointsProtocol=https;EndpointSuffix=core.windows.net;AccountName=azrhelclistact;AccountKey=NreGk...=="

Create the storage container.

$ az storage container create -n <container-name>

Example:

[clouduser@localhost]$ az storage container create -n azrhelclistcont
{
  "created": true
}

Create a virtual network. All cluster nodes must be in the same virtual network.

$ az network vnet create -g <resource-group> --name <vnet-name> --subnet-name <subnet-name>

Create an availability set. All cluster nodes must be in the same availability set.

$ az vm availability-set create --name MyAvailabilitySet --resource-group MyResourceGroup
4.3. Required system packages for High Availability
The procedure assumes you are creating a VM image for Azure HA that uses Red Hat Enterprise Linux. To successfully complete the procedure, the following packages must be installed.
| Package | Repository | Description |
|---|---|---|
| libvirt | rhel-9-for-x86_64-appstream-rpms | Open source API, daemon, and management tool for managing platform virtualization |
| virt-install | rhel-9-for-x86_64-appstream-rpms | A command-line utility for building VMs |
| libguestfs | rhel-9-for-x86_64-appstream-rpms | A library for accessing and modifying VM file systems |
| guestfs-tools | rhel-9-for-x86_64-appstream-rpms | System administration tools for VMs; includes the guestfish utility |
4.4. Azure VM configuration settings
Azure virtual machines (VMs) must have the following configuration settings. Some of these settings are enabled during the initial VM creation. Other settings are set when provisioning the VM image for Azure. Keep these settings in mind as you move through the procedures. Refer to them as necessary.
| Setting | Recommendation |
|---|---|
| SSH | SSH must be enabled to provide remote access to your Azure VMs. |
| dhcp | The primary virtual adapter should be configured for dhcp (IPv4 only). |
| swap space | Do not create a dedicated swap file or swap partition. You can configure swap space with the WALinuxAgent. |
| NIC | Choose virtio for the primary virtual network adapter. |
| encryption | For custom images, use Network Bound Disk Encryption (NBDE) for full disk encryption on Azure. |
4.5. Installing Hyper-V device drivers
Microsoft provides network and storage device drivers as part of its Linux Integration Services (LIS) for Hyper-V package. You may need to install Hyper-V device drivers on the VM image before provisioning it as an Azure virtual machine (VM). Use the lsinitrd | grep hv command to verify that the drivers are installed.
Procedure
Enter the following lsinitrd | grep hv command to determine if the required Hyper-V device drivers are installed.

# lsinitrd | grep hv

In the example below, all required drivers are installed.

# lsinitrd | grep hv
drwxr-xr-x   2 root root      0 Aug 12 14:21 usr/lib/modules/3.10.0-932.el9.x86_64/kernel/drivers/hv
-rw-r--r--   1 root root  31272 Aug 11 08:45 usr/lib/modules/3.10.0-932.el9.x86_64/kernel/drivers/hv/hv_vmbus.ko.xz
-rw-r--r--   1 root root  25132 Aug 11 08:46 usr/lib/modules/3.10.0-932.el9.x86_64/kernel/drivers/net/hyperv/hv_netvsc.ko.xz
-rw-r--r--   1 root root   9796 Aug 11 08:45 usr/lib/modules/3.10.0-932.el9.x86_64/kernel/drivers/scsi/hv_storvsc.ko.xz

If not all of the drivers are installed, complete the remaining steps.
Note: An hv_vmbus driver may already exist in the environment. Even if this driver is present, complete the following steps.

Create a file named hv.conf in /etc/dracut.conf.d.

Add the following driver parameters to the hv.conf file.

add_drivers+=" hv_vmbus "
add_drivers+=" hv_netvsc "
add_drivers+=" hv_storvsc "
add_drivers+=" nvme "

Note: Note the spaces before and after the quotes, for example, add_drivers+=" hv_vmbus ". This ensures that unique drivers are loaded in the event that other Hyper-V drivers already exist in the environment.

Regenerate the initramfs image.

# dracut -f -v --regenerate-all
Verification

- Reboot the machine.
- Run the lsinitrd | grep hv command to verify that the drivers are installed.
4.6. Making configuration changes required for a Microsoft Azure deployment
Before you deploy your custom base image to Azure, you must perform additional configuration changes to ensure that the virtual machine (VM) can properly operate in Azure.
Procedure
- Log in to the VM.
Register the VM, and enable the Red Hat Enterprise Linux 9 repository.

# subscription-manager register
Installed Product Current Status:
Product Name: Red Hat Enterprise Linux for x86_64
Status: Subscribed

Ensure that the cloud-init and hyperv-daemons packages are installed.

# dnf install cloud-init hyperv-daemons -y

Create cloud-init configuration files that are needed for integration with Azure services:

To enable logging to the Hyper-V Data Exchange Service (KVP), create the /etc/cloud/cloud.cfg.d/10-azure-kvp.cfg configuration file and add the following lines to that file.

reporting:
  logging:
    type: log
  telemetry:
    type: hyperv

To add Azure as a datasource, create the /etc/cloud/cloud.cfg.d/91-azure_datasource.cfg configuration file, and add the following lines to that file.

datasource_list: [ Azure ]
datasource:
  Azure:
    apply_network_config: False

To configure swap space on the ephemeral disk, create the /etc/cloud/cloud.cfg.d/00-azure-swap.cfg configuration file and add the following lines.

Important: The ephemeral disk is temporary storage. Therefore, data stored on it, including swap space, is lost when the VM is deallocated or moved. Use the ephemeral disk only for temporary data such as swap space.
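As an illustrative sketch of what such a swap configuration can contain, the following cloud-init snippet partitions the Azure ephemeral resource disk for data and swap. The layout percentages, partition type code, and file system choice are assumptions for illustration, not values from this procedure; adjust them for your deployment.

```yaml
#cloud-config
# Illustrative sketch only: split the ephemeral resource disk into a data
# partition and a swap partition, then mount both. Values are assumptions.
disk_setup:
  ephemeral0:
    table_type: mbr
    layout: [66, [33, 82]]   # ~66% data, ~33% swap (MBR type 82 = Linux swap)
    overwrite: True
fs_setup:
  - device: ephemeral0.1
    type: ext4
  - device: ephemeral0.2
    type: swap
mounts:
  - ["ephemeral0.1", "/mnt"]
  - ["ephemeral0.2", "none", "swap", "sw", "0", "0"]
```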
To ensure that specific kernel modules are blocked from loading automatically, edit or create the /etc/modprobe.d/blocklist.conf file and add the following lines to that file.

Modify udev network device rules:

Remove the following persistent network device rules if present.

# rm -f /etc/udev/rules.d/70-persistent-net.rules
# rm -f /etc/udev/rules.d/75-persistent-net-generator.rules
# rm -f /etc/udev/rules.d/80-net-name-slot-rules

To ensure that Accelerated Networking on Azure works as intended, create a new network device rule /etc/udev/rules.d/68-azure-sriov-nm-unmanaged.rules and add the following line to it.

SUBSYSTEM=="net", DRIVERS=="hv_pci", ACTION=="add", ENV{NM_UNMANAGED}="1"
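As a hedged example for the module blocklist step above, such a file typically blocks modules that have no use in an Azure VM. The module list below is an illustrative assumption, not taken from this procedure; include the modules appropriate for your image.

```
# Illustrative example only: block modules not useful in an Azure VM.
blacklist nouveau
blacklist lbm-nouveau
blacklist floppy
blacklist amdgpu
```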
Set the sshd service to start automatically.

# systemctl enable sshd
# systemctl is-enabled sshd

Modify kernel boot parameters:
Open the /etc/default/grub file, and ensure the GRUB_TIMEOUT line has the following value.

GRUB_TIMEOUT=10

Remove the following options from the end of the GRUB_CMDLINE_LINUX line if present.

rhgb quiet

Ensure the /etc/default/grub file contains the following lines with all the specified options.

GRUB_CMDLINE_LINUX="loglevel=3 crashkernel=auto console=tty1 console=ttyS0 earlyprintk=ttyS0 rootdelay=300"
GRUB_TIMEOUT_STYLE=countdown
GRUB_TERMINAL="serial console"
GRUB_SERIAL_COMMAND="serial --speed=115200 --unit=0 --word=8 --parity=no --stop=1"

Note: If you are not running workloads on HDDs, add elevator=none to the end of the GRUB_CMDLINE_LINUX line. This sets the I/O scheduler to none, which improves I/O performance on SSD-based systems.

Regenerate the grub.cfg file.

On a BIOS-based machine:

In RHEL 9.2 and earlier:

# grub2-mkconfig -o /boot/grub2/grub.cfg

In RHEL 9.3 and later:

# grub2-mkconfig -o /boot/grub2/grub.cfg --update-bls-cmdline

On a UEFI-based machine:

In RHEL 9.2 and earlier:

# grub2-mkconfig -o /boot/grub2/grub.cfg

In RHEL 9.3 and later:

# grub2-mkconfig -o /boot/grub2/grub.cfg --update-bls-cmdline

Warning: The path used to rebuild grub.cfg is the same for both BIOS-based and UEFI-based machines. The actual grub.cfg is present only at the BIOS path. The UEFI path contains a stub file that must not be modified or recreated with the grub2-mkconfig command.

If your system uses a non-default location for grub.cfg, adjust the command accordingly.
Configure the Windows Azure Linux Agent (WALinuxAgent):

Install and enable the WALinuxAgent package.

# dnf install WALinuxAgent -y
# systemctl enable waagent

To disable swap configuration in WALinuxAgent (required when using cloud-init to manage swap), edit the following lines in the /etc/waagent.conf file.

Provisioning.DeleteRootPassword=y
ResourceDisk.Format=n
ResourceDisk.EnableSwap=n
ResourceDisk.SwapSizeMB=0

Note: By disabling swap in WALinuxAgent, you enable cloud-init to manage the swap configuration on the ephemeral disk.
Prepare the VM for Azure provisioning:
Unregister the VM from Red Hat Subscription Manager.

# subscription-manager unregister

Clean up the existing provisioning details.

# waagent -force -deprovision

Note: This command generates warnings, which are expected because Azure handles the provisioning of VMs automatically.

Clean the shell history and shut down the VM.

# export HISTSIZE=0
# poweroff
4.7. Creating an Azure Active Directory application
Complete the following procedure to create an Azure Active Directory (AD) application. The Azure AD application authorizes and automates access for HA operations for all nodes in the cluster.
Prerequisites
- The Azure Command Line Interface (CLI) is installed on your system.
- You are an Administrator or Owner for the Microsoft Azure subscription. You need this authorization to create an Azure AD application.
Procedure
On any node in the HA cluster, log in to your Azure account.

$ az login

Create a json configuration file for a custom role for the Azure fence agent. Use the following configuration, but replace <subscription-id> with your subscription ID.

Define the custom role for the Azure fence agent. Use the json file created in the previous step to do this.

- In the Azure web console interface, select Virtual Machine → click Identity in the left-side menu.
- Select On → Click Save → click Yes to confirm.
- Click Azure role assignments → Add role assignment.
- Select the Scope required for the role, for example, Resource Group.
- Select the required Resource Group.
- Optional: Change the Subscription if necessary.
- Select the Linux Fence Agent Role role.
- Click Save.
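As a hedged illustration of the custom-role steps above, the json file and the command that registers it commonly look like the following. The file name azure-fence-role.json and the exact action list are assumptions about what the fence agent needs, not values from this procedure.

```json
{
  "Name": "Linux Fence Agent Role",
  "description": "Allows the fence agent to power off and start virtual machines",
  "assignableScopes": ["/subscriptions/<subscription-id>"],
  "actions": [
    "Microsoft.Compute/*/read",
    "Microsoft.Compute/virtualMachines/powerOff/action",
    "Microsoft.Compute/virtualMachines/start/action"
  ],
  "notActions": [],
  "dataActions": [],
  "notDataActions": []
}
```

The role definition would then be registered with, for example, $ az role definition create --role-definition azure-fence-role.json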
Verification
Display nodes visible to Azure AD.

# fence_azure_arm --msi -o list
node1,
node2,
[...]

If this command outputs all nodes on your cluster, the AD application has been configured successfully.
4.8. Converting the image to a fixed VHD format
All Microsoft Azure VM images must be in a fixed VHD format. The image must be aligned on a 1 MB boundary before it is converted to VHD. To convert the image from qcow2 to a fixed VHD format and align the image, see the following procedure. Once you have converted the image, you can upload it to Azure.
Procedure
Convert the image from qcow2 to raw format.

$ qemu-img convert -f qcow2 -O raw <image-name>.qcow2 <image-name>.raw

Create a shell script with the following content.

Run the script. This example uses the name align.sh.

$ sh align.sh <image-xxx>.raw

- If the message "Your image is already aligned. You do not need to resize." displays, proceed to the following step.
- If a value displays, your image is not aligned.
Use the following command to convert the file to a fixed VHD format.

The sample uses qemu-img version 2.12.0.

$ qemu-img convert -f raw -o subformat=fixed,force_size -O vpc <image-xxx>.raw <image-xxx>.vhd

Once converted, the VHD file is ready to upload to Azure.

If the raw image is not aligned, complete the following steps to align it.

Resize the raw file by using the rounded value displayed when you ran the verification script.

$ qemu-img resize -f raw <image-xxx>.raw <rounded-value>

Convert the raw image file to a VHD format.

The sample uses qemu-img version 2.12.0.

$ qemu-img convert -f raw -o subformat=fixed,force_size -O vpc <image-xxx>.raw <image-xxx>.vhd

Once converted, the VHD file is ready to upload to Azure.
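The core logic of the verification script referenced above is rounding the raw image size up to the next 1 MB boundary. It can be sketched as follows; this is a minimal sketch with an illustrative hard-coded size, whereas the real script would read the size with qemu-img info.

```shell
# Sketch of the alignment check: round the raw image size up to the
# next 1 MiB boundary. The size is an illustrative placeholder; in
# practice it would come from `qemu-img info <image>.raw`.
MB=$((1024 * 1024))
size=1572864513                     # example raw image size in bytes
if [ $((size % MB)) -eq 0 ]; then
  echo "Your image is already aligned. You do not need to resize."
else
  rounded_size=$(( (size / MB + 1) * MB ))
  echo "rounded size = $rounded_size"   # prints: rounded size = 1573912576
fi
```

The rounded value is what you would pass to qemu-img resize in the alignment step above.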
4.9. Uploading and creating an Azure image
Complete the following steps to upload the VHD file to your container and create an Azure custom image.
The exported storage connection string does not persist after a system reboot. If any of the commands in the following steps fail, export the connection string again.
Procedure
Upload the VHD file to the storage container. It may take several minutes. To get a list of storage containers, enter the az storage container list command.

$ az storage blob upload \
    --account-name <storage-account-name> --container-name <container-name> \
    --type page --file <path-to-vhd> --name <image-name>.vhd

Example:

[clouduser@localhost]$ az storage blob upload \
    --account-name azrhelclistact --container-name azrhelclistcont \
    --type page --file rhel-image-9.vhd --name rhel-image-9.vhd
Percent complete: %100.0

Get the URL for the uploaded VHD file to use in the following step.

$ az storage blob url -c <container-name> -n <image-name>.vhd

Example:

$ az storage blob url -c azrhelclistcont -n rhel-image-9.vhd
"https://azrhelclistact.blob.core.windows.net/azrhelclistcont/rhel-image-9.vhd"

Create the Azure custom image.

$ az image create -n <image-name> -g <resource-group> -l <azure-region> --source <URL> --os-type linux

Note: The default hypervisor generation of the VM is V1. You can optionally specify a V2 hypervisor generation by including the option --hyper-v-generation V2. Generation 2 VMs use a UEFI-based boot architecture. See Support for generation 2 VMs on Azure for information about generation 2 VMs.

The command may return the error "Only blobs formatted as VHDs can be imported." This error may mean that the image was not aligned to the nearest 1 MB boundary before it was converted to VHD.

Example:

$ az image create -n rhel9 -g azrhelclirsgrp2 -l southcentralus --source https://azrhelclistact.blob.core.windows.net/azrhelclistcont/rhel-image-9.vhd --os-type linux
4.10. Installing Red Hat HA packages and agents
Complete the following steps on all nodes.
Procedure
Launch an SSH terminal session and connect to the VM by using the administrator name and public IP address.

$ ssh administrator@PublicIP

To get the public IP address for an Azure VM, open the VM properties in the Azure Portal or enter the following Azure CLI command.

$ az vm list -g <resource_group> -d --output table

Example:

[clouduser@localhost ~] $ az vm list -g azrhelclirsgrp -d --output table
Name    ResourceGroup    PowerState      PublicIps       Location
------  ---------------  --------------  --------------  --------------
node01  azrhelclirsgrp   VM running      192.98.152.251  southcentralus

Register the VM with Red Hat.

$ sudo -i
# subscription-manager register

Disable all repositories.

# subscription-manager repos --disable=*

Enable the RHEL 9 Server HA repositories.

# subscription-manager repos --enable=rhel-9-for-x86_64-highavailability-rpms

Update all packages.

# dnf update -y

Install the Red Hat High Availability Add-On software packages, along with the Azure fencing agent, from the High Availability channel.

# dnf install pcs pacemaker fence-agents-azure-arm

The user hacluster was created during the pcs and pacemaker installation in the previous step. Create a password for hacluster on all cluster nodes. Use the same password for all nodes.

# passwd hacluster

Add the high-availability service to the RHEL firewall if firewalld.service is installed.

# firewall-cmd --permanent --add-service=high-availability
# firewall-cmd --reload

Start the pcsd service and enable it to start on boot.

# systemctl start pcsd.service
# systemctl enable pcsd.service
Created symlink from /etc/systemd/system/multi-user.target.wants/pcsd.service to /usr/lib/systemd/system/pcsd.service.
Verification
Ensure the pcsd service is running.
4.11. Creating a cluster
Complete the following steps to create the cluster of nodes.
Procedure
On one of the nodes, enter the following command to authenticate the pcs user hacluster. In the command, specify the name of each node in the cluster.

# pcs host auth <hostname1> <hostname2> <hostname3>

Create the cluster.

# pcs cluster setup <cluster_name> <hostname1> <hostname2> <hostname3>
Verification
Enable the cluster.

[root@node01 clouduser]# pcs cluster enable --all
node02: Cluster Enabled
node03: Cluster Enabled
node01: Cluster Enabled

Start the cluster.

[root@node01 clouduser]# pcs cluster start --all
node02: Starting Cluster...
node03: Starting Cluster...
node01: Starting Cluster...
4.12. Fencing overview
If communication with a single node in the cluster fails, then other nodes in the cluster must be able to restrict or release access to resources that the failed cluster node may have access to. This cannot be accomplished by contacting the cluster node itself, as the cluster node may not be responsive. Instead, you must provide an external method, called fencing, which uses a fence agent.
A node that is unresponsive may still be accessing data. The only way to be certain that your data is safe is to fence the node by using STONITH. STONITH is an acronym for "Shoot The Other Node In The Head," and it protects your data from being corrupted by rogue nodes or concurrent access. Using STONITH, you can be certain that a node is truly offline before allowing the data to be accessed from another node.
4.13. Creating a fencing device
Complete the following steps to configure fencing. Run these commands from any node in the cluster.
Prerequisites
You need to set the cluster property stonith-enabled to true.
Procedure
Identify the Azure node name for each RHEL VM. You use the Azure node names to configure the fence device.

# fence_azure_arm \
    -l <AD-Application-ID> -p <AD-Password> \
    --resourceGroup <MyResourceGroup> --tenantId <Tenant-ID> \
    --subscriptionId <Subscription-ID> -o list
pcs stonith describe fence_azure_arm
# pcs stonith describe fence_azure_armCopy to Clipboard Copied! Toggle word wrap Toggle overflow Example:
pcs stonith describe fence_apc Stonith options: password: Authentication key password_script: Script to run to retrieve password
# pcs stonith describe fence_apc Stonith options: password: Authentication key password_script: Script to run to retrieve passwordCopy to Clipboard Copied! Toggle word wrap Toggle overflow WarningFor fence agents that provide a method option, do not specify a value of cycle as it is not supported and can cause data corruption.
Some fence devices can fence only a single node, while other devices can fence multiple nodes. The parameters you specify when you create a fencing device depend on what your fencing device supports and requires.
You can use the pcmk_host_list parameter when creating a fencing device to specify all of the machines that are controlled by that fencing device.

You can use the pcmk_host_map parameter when creating a fencing device to map host names to the specifications that the fence device understands.

Create a fence device.
# pcs stonith create clusterfence fence_azure_arm

- To ensure immediate and complete fencing, disable ACPI Soft-Off on all cluster nodes. For information about disabling ACPI Soft-Off, see Disabling ACPI for use with integrated fence device.
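In practice, the create command above also needs the Azure AD credentials and, typically, a host map. The following is a hedged example; all parameter values are placeholders, and the exact parameter set depends on your fence-agent version.

```
# pcs stonith create clusterfence fence_azure_arm \
    username=<AD-Application-ID> password=<AD-Password> \
    resourceGroup=<MyResourceGroup> tenantId=<Tenant-ID> \
    subscriptionId=<Subscription-ID> \
    pcmk_host_map="node01:azure-vm-name01;node02:azure-vm-name02"
```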
Verification
Test the fencing agent for one of the other nodes.

# pcs stonith fence azurenodename

Start the node that was fenced in the previous step.

# pcs cluster start <hostname>

Check the status to verify the node started.

# pcs status
4.14. Creating an Azure internal load balancer
The Azure internal load balancer removes cluster nodes that do not answer health probe requests.
Perform the following procedure to create an Azure internal load balancer. Each step references a specific Microsoft procedure and includes the settings for customizing the load balancer for HA.
Prerequisites
Procedure
- Create a Basic load balancer. Select Internal load balancer, the Basic SKU, and Dynamic for the type of IP address assignment.
- Create a back-end address pool. Associate the backend pool to the availability set created while creating Azure resources in HA. Do not set any target network IP configurations.
- Create a health probe. For the health probe, select TCP and enter port 61000. You can use any TCP port number that does not interfere with another service. For certain HA product applications (for example, SAP HANA and SQL Server), you may need to work with Microsoft to identify the correct port to use.
- Create a load balancer rule. To create the load balancing rule, the default values are prepopulated. Ensure that Floating IP (direct server return) is set to Enabled.
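If you prefer the Azure CLI over the web console, the health probe and rule steps above can be sketched as follows. Resource, probe, and rule names are placeholders, and the flags may vary slightly between CLI versions.

```
$ az network lb probe create -g <resource-group> --lb-name <lb-name> \
    -n ha-probe --protocol tcp --port 61000
$ az network lb rule create -g <resource-group> --lb-name <lb-name> \
    -n ha-rule --protocol tcp \
    --frontend-port <service-port> --backend-port <service-port> \
    --probe-name ha-probe --floating-ip true
```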
4.15. Configuring the load balancer resource agent
After you have created the health probe, you must configure the load balancer resource agent. This resource agent runs a service that answers health probe requests from the Azure load balancer and removes cluster nodes that do not answer requests.
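Conceptually, the responder that this agent runs is a persistent TCP listener on the probe port, so that the node holding the cluster resources accepts the load balancer's TCP health probes. A simplified sketch of the mechanism, not the agent's actual implementation:

```
# Simplified illustration only: keep a TCP listener on the probe port
# (61000, as configured for the health probe) so the Azure load
# balancer's TCP probe receives a successful handshake.
ncat --keep-open --listen 61000
```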
Procedure
Install the nmap-ncat resource agents on all nodes.

# dnf install nmap-ncat resource-agents-cloud

Perform the following steps on a single node.

Create the pcs resources and group. Use your load balancer FrontendIP for the IPaddr2 address.

# pcs resource create resource-name IPaddr2 ip="10.0.0.7" --group cluster-resources-group

Configure the load balancer resource agent.

# pcs resource create resource-loadbalancer-name azure-lb port=port-number --group cluster-resources-group
Verification
Run pcs status to see the results.

[root@node01 clouduser]# pcs status
Chapter 5. Configuring RHEL on Azure with Secure Boot
Secure Boot is a mechanism in the Unified Extensible Firmware Interface (UEFI) specification to control the execution of programs at boot time. Secure Boot verifies digital signatures of the boot loader and its components at boot time, to ensure only trusted and authorized programs are executed, and also prevent unauthorized programs from loading.
Secure Boot is enabled for publicly available RHEL images on the Azure platform. By default, the Allowed Signature database (db) contains Microsoft certificates. Microsoft Azure allows adding custom certificates to UEFI Secure Boot variables when a new image version is registered in Azure Compute Gallery.
5.1. Understanding Secure Boot for RHEL on cloud
Secure Boot is a feature of Unified Extensible Firmware Interface (UEFI) that ensures only trusted and digitally signed programs and components, such as the boot loader and kernel, are executed during boot time. Secure Boot verifies digital signatures against trusted keys stored in hardware, and aborts the boot process if it detects any components that have been tampered with or that are signed by untrusted entities. This prevents malicious software from compromising the operating system.
Secure Boot is an essential component for configuring a Confidential Virtual Machine (CVM), as it guarantees that only trusted entities are present in the boot chain. It provides authenticated access to specific device paths through defined interfaces, which ensures that only the latest configuration is used, and which also permanently overwrites earlier configurations. Additionally, when the Red Hat Enterprise Linux kernel boots with Secure Boot enabled, it enters lockdown mode, which ensures that only kernel modules signed by a trusted vendor are loaded. Therefore, Secure Boot improves the security of the operating system boot sequence.
5.1.1. Components of Secure Boot
The Secure Boot mechanism consists of firmware, signature databases, cryptographic keys, boot loader, hardware modules, and the operating system. The following are the components of the UEFI trusted variables:
- Key Exchange Key database (KEK): An exchange of public keys to establish trust between the RHEL operating system and the VM firmware. You can also update the Allowed Signature database (db) and the Forbidden Signature database (dbx) by using these keys.
- Platform Key database (PK): A self-signed single-key database to establish trust between the VM firmware and the cloud platform. The PK also updates the KEK database.
- Allowed Signature database (db): A database that maintains a list of certificates or binary hashes to check whether a binary file is allowed to boot on the system. Additionally, all certificates from db are imported to the .platform keyring of the RHEL kernel. This feature allows you to add and load signed third-party kernel modules in lockdown mode.
- Forbidden Signature database (dbx): A database that maintains a list of certificates or binary hashes that are forbidden to boot on the system.
Binary files are checked against the dbx database and the Secure Boot Advanced Targeting (SBAT) mechanism. SBAT allows you to revoke older versions of specific binaries while keeping the certificate that signed those binaries valid.
5.1.2. Stages of Secure Boot for RHEL on Cloud
When a RHEL instance boots in the Unified Kernel Image (UKI) mode and with Secure Boot enabled, the RHEL instance interacts with the cloud service infrastructure in the following sequence:
- Initialization: When a RHEL instance boots, the cloud-hosted firmware initially boots and implements the Secure Boot mechanism.
- Variable store initialization: The firmware initializes UEFI variables from a variable store, a dedicated storage area for information that firmware needs to manage for the boot process and runtime operations. When the RHEL instance boots for the first time, the store is initialized from default values associated with the VM image.
- Boot loader: When booted, the firmware loads the first stage boot loader. For the RHEL instance in a x86 UEFI environment, the first stage boot loader is shim. The shim boot loader authenticates and loads the next stage of the boot process and acts as a bridge between UEFI and GRUB.
  - The shim x86 binary in RHEL is currently signed by the Microsoft Corporation UEFI CA 2011 certificate so that the RHEL instance can boot in the Secure Boot enabled mode on various hardware and virtualized platforms where the Allowed Signature database (db) contains the default Microsoft certificates.
  - The shim binary extends the list of trusted certificates with Red Hat Secure Boot CA and, optionally, with a Machine Owner Key (MOK).
- UKI: The shim binary loads the RHEL UKI (the kernel-uki-virt package). The UKI is signed by the corresponding certificate, Red Hat Secure Boot Signing 504 on the x86_64 architecture, which can be found in the redhat-sb-certs package. This certificate is signed by Red Hat Secure Boot CA, and thus passes the check.
- UKI add-ons: To use the UKI cmdline extensions, the RHEL kernel checks their signatures against db, MOK, and certificates shipped with shim to ensure that the extensions are signed by either the operating system vendor (Red Hat) or a user.
When the RHEL kernel boots in the Secure Boot mode, it enters lockdown mode. After entering lockdown, the RHEL kernel adds the db keys to the .platform keyring and the MOK keys to the .machine keyring. During the kernel build process, standard RHEL kernel modules, such as kernel-modules-core, kernel-modules, and kernel-modules-extra, are signed with an ephemeral key, which consists of a private and a public key. After each kernel build completes, the private key is discarded, so it cannot be used to sign third-party modules. Use certificates from db and MOK for this purpose instead.
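The ephemeral-key idea can be illustrated with a short sketch. File names here are illustrative, and `openssl dgst` stands in for the kernel's module-signing machinery:

```shell
# Sign a module-like file with an ephemeral key pair, verify it with the
# public half, then discard the private key so it can never sign anything
# else - analogous to the ephemeral key used while building the standard
# RHEL kernel modules.
openssl genpkey -algorithm RSA -pkeyopt rsa_keygen_bits:2048 -out ephemeral.key
openssl pkey -in ephemeral.key -pubout -out ephemeral.pub
printf 'pretend-kernel-module' > example.ko
openssl dgst -sha256 -sign ephemeral.key -out example.sig example.ko
openssl dgst -sha256 -verify ephemeral.pub -signature example.sig example.ko
rm -f ephemeral.key    # private half discarded after the "build"
```

Because only the public half survives, already-signed modules still verify, but no new module can be signed with that key, which is why third-party modules must be signed with a db or MOK certificate instead.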
5.2. Configuring a RHEL VM on Azure with Secure Boot
To ensure that your Red Hat Enterprise Linux instance on the Azure cloud platform has a secured operating system booting process, use Secure Boot. When a custom RHEL Azure image is registered, the image consists of pre-stored Unified Extensible Firmware Interface (UEFI) variables for Secure Boot. This enables all the instances launched from the RHEL images to use the Secure Boot mechanism with the required variables on the first boot.
Microsoft Azure supports Secure Boot with Trusted Launch VMs. These VMs provide security mechanisms to protect against rootkits and bootkits, while providing additional features such as a virtual Trusted Platform Module (vTPM). When creating an instance by using the GUI, you can find the Enable secure boot option under the Configure security features setting.
Prerequisites
- You have installed the following packages:
  - python3
  - openssl
  - efivar
  - keyutils
  - python3-virt-firmware
- You have installed the azure-cli utility. For details, see Installing the Azure CLI on Linux.
Procedure
Generate a custom certificate
custom_db.cer by using the openssl utility.

Convert the certificate into base64-encoded format:

$ base64 -w0 custom_db.cer
MIIFIjCCAwqgAwIBAgITNf23J4k0d8c0NR ....
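The openssl commands for the certificate generation step are not reproduced in this rendering. A minimal sketch, with an assumed key file name and subject, might look like this:

```shell
# Assumption: create a self-signed certificate and convert it to DER form
# as custom_db.cer, the file name the rest of the procedure uses.
openssl req -new -x509 -newkey rsa:2048 -sha256 -days 3650 -nodes \
    -subj "/CN=Custom Secure Boot db key" \
    -keyout custom_db.key -out custom_db.pem
openssl x509 -in custom_db.pem -outform DER -out custom_db.cer
```

The `base64 -w0` step in the procedure then produces a single-line encoding of `custom_db.cer` suitable for embedding in the ARM template.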
Create and edit an azure-example-template.json Azure Resource Manager (ARM) file for registering a new Azure Compute Gallery image version.

Use the azure-cli utility to register the image version:

$ az deployment group create --name <example-deployment> \
    --resource-group <example-resource-group> \
    --template-file <example-template.json>

- Reboot the instance from the Azure Portal.
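The contents of azure-example-template.json are not reproduced in this rendering. As a rough, hedged sketch of the relevant shape — the resource type, API version, and uefiSettings property names below are assumptions based on the Azure Compute Gallery image-version schema, so verify them against the current ARM reference before use:

```json
{
  "$schema": "https://schema.management.azure.com/schemas/2019-04-01/deploymentTemplate.json#",
  "contentVersion": "1.0.0.0",
  "resources": [
    {
      "type": "Microsoft.Compute/galleries/images/versions",
      "apiVersion": "2022-03-03",
      "name": "<example_gallery>/<example_image_definition>/<example_version>",
      "location": "<example_location>",
      "properties": {
        "securityProfile": {
          "uefiSettings": {
            "signatureTemplateNames": [ "MicrosoftUefiCertificateAuthorityTemplate" ],
            "additionalSignatures": {
              "db": [ { "type": "x509", "value": [ "<base64_encoded_custom_db.cer>" ] } ]
            }
          }
        },
        "storageProfile": {
          "source": { "id": "<example_managed_image_id>" }
        }
      }
    }
  ]
}
```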
Verification
Check if the newly created RHEL instance has Secure Boot enabled:
$ mokutil --sb-state
SecureBoot enabled

Use the keyctl utility to verify the kernel keyring for the custom certificate.
Chapter 6. Configuring RHEL on Public Cloud Platforms with Intel TDX
Intel Trust Domain Extensions (TDX) is a security type of Confidential Virtual Machine (CVM), which provides a secure and isolated environment for VMs. This approach is an advancement over the earlier technology, Intel Software Guard Extensions (SGX).
SGX provides VM isolation from the hypervisor and cloud service providers by creating secure memory regions known as enclaves. Application code stored in enclaves has access to memory and data stored inside the enclave, making it inaccessible to outside entities.
TDX creates hardware-isolated VMs called Trust Domains (TDs). It ensures that only a VM can access its own memory and that TD VMs are isolated from the Virtual Machine Manager (VMM), hypervisors, other VMs, and the host. As a result, even while using resources provided by the hypervisor and CPU, TD VMs remain secure, maintaining data confidentiality and integrity.
The main difference between SGX and TDX is that SGX works at application level while TDX works at virtualization level by limiting hypervisor access.
Before deploying Red Hat Enterprise Linux (RHEL) on a public cloud platform, always check with the corresponding cloud service provider for the support status and certification of the particular RHEL instance type.
6.1. Understanding Intel TDX secure boot process
- Initialization and measurement: A TDX-enabled hypervisor sets the initial state of a VM. This hypervisor loads the firmware binary file into the VM memory and sets the initial register state. The Intel processor measures the initial state of the VM and provides details to verify the initial state of the VM.
- Firmware: The VM initiates the UEFI firmware. The firmware might include stateful or stateless Virtual Trusted Platform Module (vTPM) implementation. Stateful vTPM maintains persistent cryptographic state across VM reboots and migrations, whereas stateless vTPM generates fresh cryptographic state for each VM session without persistence. Virtual Machine Privilege Levels (VMPL) technology isolates vTPM from the guest. VMPL offers hardware-enforced privilege isolation between different VM components and the hypervisor.
- vTPM: Depending on your cloud service provider, for stateful vTPM implementation, the UEFI firmware might perform a remote attestation to decrypt the persistent state of vTPM. The vTPM also gathers data about the boot process, such as Secure Boot state, certificates used for signing boot artifacts, or UEFI binary hashes.
- Shim: When the UEFI firmware finishes the initialization process, it searches for the EFI (Extensible Firmware Interface) system partition. Then, the UEFI firmware verifies and executes the first stage boot loader from there. For RHEL, this is shim. The shim program allows non-Microsoft operating systems to load the second stage boot loader from the EFI system partition.
  - shim uses a Red Hat certificate to verify the second stage boot loader (grub) or the Red Hat Unified Kernel Image (UKI).
  - grub or UKI unpacks, verifies, and executes the Linux kernel, the initramfs, and the kernel command line. This process ensures that the Linux kernel is loaded in a trusted and secured environment.
- Initramfs: In the initramfs, vTPM information automatically unlocks the encrypted root partition when full disk encryption is used.
  - When the root volume becomes available, initramfs transfers the execution flow there.
- Attestation: The VM tenant gets access to the system and can perform a remote attestation to ensure that the accessed VM is an untampered Confidential Virtual Machine (CVM). Attestation is performed based on information from the Intel processor and vTPM. This process confirms the authenticity and reliability of the initial CPU and memory state of the RHEL instance and Intel processor.
- TEE: This process creates a Trusted Execution Environment (TEE) to ensure that booting of the VM is in a trusted and secured environment.
6.2. Configuring a RHEL VM on Azure with Intel TDX
By using Intel TDX, you can create hardware-isolated VMs known as Trust Domains (TDs). TDX ensures that only the VM has access to its resources, which remain inaccessible to hypervisors and hosts.
Prerequisites
- You have installed the openssh and openssh-clients packages.
- You have installed the Azure CLI utility. For details, see Installing the Azure CLI on Linux.
- You have launched the RHEL instance from a supported Azure instance type. For details, see Azure Confidential VM options.
Procedure
Log in to Azure by using the azure-cli utility:

$ az login

Create an Azure resource group for the selected availability zone:
$ az group create --name <example_resource_group> --location westeurope

Deploy a RHEL instance with TDX enabled, for example, the Standard_DC2eds_v5 instance type.

Connect to the RHEL instance:
$ ssh <example_azure_user>@<example_ip_address_of_the_instance>
Verification
Check kernel logs to verify status of TDX:
$ dmesg | grep -i tdx
[    0.733613] Memory Encryption Features active: Intel TDX
[    4.320222] systemd[1]: Detected confidential virtualization tdx.
[    5.977432] systemd[1]: Detected confidential virtualization tdx.

Check metadata of the RHEL instance configuration:
$ az vm show --resource-group <example_resource_group> \
    --name <example_rhel_instance> \
    --query "securityProfile.enableTrustedDomainExtensions" \
    --output json
Chapter 7. Configuring RHEL on Public Cloud Platforms with AMD SEV SNP
AMD Secure Encrypted Virtualization with Secure Nested Paging (SEV-SNP) aims to prevent VM integrity-based attacks and reduce the dangers of memory integrity violations. For the secure boot process, AMD processors offer three hardware-based security mechanisms: Secure Encrypted Virtualization (SEV), SEV Encrypted State (SEV-ES), and SEV Secure Nested Paging (SEV-SNP).
- SEV: The SEV mechanism encrypts virtual machine (VM) memory to prevent the hypervisor from accessing VM data.
- SEV-ES: SEV with Encrypted State (SEV-ES) extends SEV by encrypting CPU register states. This mechanism prevents the hypervisor from accessing or modifying VM CPU registers. Despite providing isolation between hypervisor and VM, it is still vulnerable to memory integrity attacks.
SEV-SNP: SEV-SNP is an enhancement to SEV-ES that adds memory integrity protection along with VM encryption. This mechanism prevents the hypervisor from modifying page tables to redirect VM memory access, protecting against replay attacks and memory tampering.
Note: Before deploying Red Hat Enterprise Linux (RHEL) on a public cloud platform, always check with the corresponding cloud service provider for the support status and certification of the particular RHEL instance type.
7.1. Properties of SEV-SNP
- Secure Processor: The AMD EPYC processor integrates a Secure Processor (SP) subsystem. The AMD SP is a dedicated hardware component to manage keys and encryption operations.
- Memory Integrity: For managing virtualization and isolation, the memory management unit (MMU) utilizes page tables to translate virtual addresses to guest-physical addresses. SEV-SNP uses nested page tables for translating guest-physical addresses to host-physical addresses. Once nested page tables are defined, the hypervisor or host cannot alter page tables to redirect the VM into accessing different pages, which protects memory integrity. SEV-SNP uses this method to offer protection against replay attacks and malicious modifications to VM memory.
- Memory Encryption: The AMD EPYC processor hides the memory encryption key, which remains hidden from both host and VM.
- Attestation report for verification: A CPU-generated report about RHEL instance information in an authorized cryptographic format. This process confirms the authenticity and reliability of the initial CPU and memory state of the RHEL instance and the AMD processor.
Note: Even if a hypervisor creates the primary memory and CPU register state of the VM, they remain hidden and inaccessible to the hypervisor after initialization of that VM.
7.2. Understanding AMD SEV SNP secure boot process
- Initialization and measurement: A SEV-SNP enabled hypervisor sets the initial state of a VM. This hypervisor loads firmware binary into the VM memory and sets the initial register state. AMD Secure Processor (SP) measures the initial state of the VM and provides details to verify the initial state of the VM.
- Firmware: The VM initiates the UEFI firmware. The firmware might include either stateful or stateless Virtual Trusted Platform Module (vTPM) implementation. Stateful vTPM maintains persistent cryptographic state across VM reboots and migrations, whereas stateless vTPM generates fresh cryptographic state for each VM session without persistence. Virtual Machine Privilege Levels (VMPL) technology isolates vTPM from the guest. VMPL offers hardware-enforced privilege isolation between different VM components and the hypervisor.
vTPM: Depending on your cloud service provider, for stateful vTPM implementation, the UEFI firmware might perform a remote attestation to decrypt the persistent state of vTPM.
- The vTPM also measures facts about the boot process such as Secure Boot state, certificates used for signing boot artifacts, UEFI binary hashes, and so on.
- Shim: When the UEFI firmware finishes the initialization process, it searches for the EFI (Extensible Firmware Interface) system partition. Then, the UEFI firmware verifies and executes the first stage boot loader from there. For RHEL, this is shim. The shim program allows non-Microsoft operating systems to load the second stage boot loader from the EFI system partition.
  - shim uses a Red Hat certificate to verify the second stage boot loader (grub) or the Red Hat Unified Kernel Image (UKI).
  - grub or UKI unpacks, verifies, and executes the Linux kernel, the initial RAM file system (initramfs), and the kernel command line. This process ensures that the Linux kernel is loaded in a trusted and secured environment.
- Initramfs: In the initramfs, vTPM information automatically unlocks the encrypted root partition when full disk encryption is used.
  - When the root volume becomes available, initramfs transfers the execution flow to the root volume.
- Attestation: The VM tenant gets access to the system and can perform a remote attestation to ensure that the accessed VM is an untampered Confidential Virtual Machine (CVM). Attestation is performed based on information from AMD SP and vTPM. This process confirms the authenticity and reliability of the initial CPU and memory state of the RHEL instance and AMD processor.
- TEE: This process creates a Trusted Execution Environment (TEE) to ensure that booting of the VM is in a trusted and secured environment.
7.3. Configuring a RHEL VM on Azure with AMD SEV SNP
AMD Secure Encrypted Virtualization with Secure Nested Paging (SEV-SNP) is a security type of the Confidential Virtual Machine (CVM) technology for Red Hat Enterprise Linux (RHEL) on Azure Virtual Machines (VMs), available only for the AMD EPYC processor family. SEV-SNP provides a trusted boot environment so that the entire boot process is secured and protected, and the hypervisor and cloud service provider cannot access the data.
Prerequisites
- You have installed the openssh and openssh-clients packages.
- You have installed the Azure CLI utility. For details, see Installing the Azure CLI.
- You have launched the RHEL instance from a supported Azure instance type. For details, see Supported VM sizes for CVM.
Procedure
Log in to Azure by using the Azure CLI utility:
$ az login

Create an Azure resource group for the selected availability zone:
$ az group create --name <example_resource_group> --location eastus

Deploy a RHEL instance with SEV-SNP, for example, the Standard_DC4as_v5 instance type.

Connect to the RHEL instance:
$ ssh <example_azure_user>@<example_ip_address_of_VM>
Verification
Check kernel logs to verify status of SEV-SNP:
$ sudo dmesg | grep -i sev
[    0.547223] Memory Encryption Features active: AMD SEV
[    4.843171] kvm-guest: setup_efi_kvm_sev_migration : EFI live migration variable not found
Chapter 8. Deploying an HPC cluster on Azure by using RHEL system roles
High-performance computing (HPC) workloads on Microsoft Azure require specialized configurations for optimal performance and scalability. The HPC RHEL system role automates the configuration of RHEL images with HPC-specific optimizations, including InfiniBand support, performance tuning, and required libraries.
After configuring an HPC-enabled image, you can generalize the virtual machine and create reusable image versions in Azure Compute Gallery. These images serve as the foundation for deploying HPC clusters on Microsoft Azure by using Azure CycleCloud, a cluster orchestration tool that integrates with the Slurm workload manager to schedule and manage computational jobs. Environment Modules provides a flexible framework for managing multiple software versions and their dependencies across HPC cluster nodes.
8.1. Configuring a RHEL Azure VM for HPC by using the HPC RHEL system role
To configure a high-performance computing (HPC) RHEL system role on a customized Red Hat Enterprise Linux (RHEL) image, you can use the cloud-init utility. With cloud-init, you can automate the installation of Ansible collections and the running of Ansible playbooks on Microsoft Azure.
Use one of the following methods to configure an HPC RHEL system role on a customized RHEL image.
8.1.1. Configuring a RHEL HPC VM by using Azure Portal
By using Ansible with the ansible-core utility, you can automate the configuration of custom RHEL images for Azure by applying RHEL system roles during the image building process. Utilities like cloud-init embed high-performance computing (HPC) RHEL system role configurations to create and configure an HPC RHEL image before deployment in Azure.
Prerequisites
- You have an active Azure cloud subscription.
Procedure
- Go to the Azure Console.
- Click Virtual Machines → Create Virtual Machine.
Select the following configurations for a virtual machine from the Basics tab:
- In the Virtual machine name field, enter your VM name.
- Security type: Standard
- Image → See All Images → Search for Red Hat Enterprise Linux (RHEL) for High Performance Computing (HPC) on Azure → Select Red Hat Enterprise Linux for HPC 9.6 VM - x64 Gen2
- VM architecture: x64
Size: Standard_NC4as_T4_v3 - 4 vcpus, 28 GiB memory
Note: To get optimal performance, use only GPU-optimized VMs, such as the NC and ND series. For details, see Virtual machine sizes in Azure.
Go to the Advanced tab and enter the following details in the Custom data field:
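The custom data for this step is not reproduced in this rendering. As a hedged cloud-config sketch only — the collection name, role name, and playbook contents below are assumptions, not the exact content of this procedure:

```yaml
#cloud-config
# Sketch: install ansible-core and stage a playbook at the path the
# procedure later runs (/root/hpc_full_install.yaml). The collection and
# role names are assumptions.
package_update: true
packages:
  - ansible-core
write_files:
  - path: /root/hpc_full_install.yaml
    permissions: '0600'
    content: |
      - name: Configure the VM for HPC
        hosts: localhost
        connection: local
        roles:
          - redhat.rhel_system_roles.hpc
runcmd:
  - ansible-galaxy collection install redhat.rhel_system_roles
```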
- Click the Review + create button to create a VM with the given configuration.
- In the Azure Console, check that the VM has been deployed successfully and is ready for use.
- Click the Go to resource button.
- Copy the Public IP address.
- In the Azure Console, check that the VM is running.
Connect to the VM:
$ ssh -i ~/.ssh/azure_hpc <example_azureuser>@<192.0.2.101>

Check the VM status:
$ sudo cloud-init status --wait

Once ready, run the HPC RHEL system role:
$ sudo ANSIBLE_LOG_PATH=/var/log/ansible_hpc_full_install.log ansible-playbook /root/hpc_full_install.yaml --verbose

Reboot the VM.
Important: Wait for the completion of the initial reboot, as the HPC RHEL system role configurations finalize during this phase.
Verification
Connect to the VM through SSH:
$ ssh -i <example_private_key.pem> <example_azureuser>@<192.0.2.101>

Verify the list of installed packages:
$ sudo dnf list installed | grep -i -E 'nvidia-driver|cuda-toolkit|nccl|fabric-manager|rdma|openmpi'

Verify the installed Lmod environment modules:

$ ml available

where:
- L: Module is loaded
- D: Default module
8.1.2. Configuring a RHEL HPC VM by using Azure CLI
By using Ansible with the ansible-core utility, you can automate the configuration of custom Red Hat Enterprise Linux (RHEL) images for Azure by applying RHEL system roles during the image building process. Utilities like cloud-init and Azure CLI manage high-performance computing (HPC) RHEL system roles to create and configure an HPC RHEL image before deployment in Azure.
Prerequisites
- You have an active Azure cloud subscription.
Procedure
Connect to the Azure portal:
$ az login

Create a key pair:
$ ssh-keygen -t ed25519 -C "<azureuser@hpc>" -f ~/.ssh/azure_hpc

Edit the user-data.yml file with the following details:

$ vi user-data.yml

Create a resource group:
$ az group create --name <example_vm_resource_group> --location <example_location>

Select and accept the terms and conditions of the relevant image for your account:
For North America (NA) or Global accounts, use:
$ az vm image terms accept --urn "redhat:rh-rhel-hpc:rh-rhel-hpc96:latest"

For Europe, the Middle East and Africa (EMEA) accounts, use:
$ az vm image terms accept --urn "redhat-limited:rh-rhel-hpc:rh-rhel-hpc96:latest"
Create an image based on the specified configuration in the last step:
- In the Azure Console, check that the VM is running.
Connect to the VM through SSH:
$ ssh -i ~/.ssh/azure_hpc <example_azureuser>@<192.0.2.101>

Check the VM status:
$ sudo cloud-init status --wait

Once ready, run the HPC RHEL system role:
$ sudo ANSIBLE_LOG_PATH=/var/log/ansible_hpc_full_install.log ansible-playbook /root/hpc_full_install.yaml --verbose

Reboot the VM.
Important: Wait for the completion of the initial reboot, as the HPC RHEL system role configurations finalize during this phase.
Verification
Connect to the VM through SSH:
ssh -i <example_private_key.pem> <example_azureuser>@<192.0.2.101>
$ ssh -i <example_private_key.pem> <example_azureuser>@<192.0.2.101>Copy to Clipboard Copied! Toggle word wrap Toggle overflow Verify the list of installed packages:
$ sudo dnf list installed | grep -i -E 'nvidia-driver|cuda-toolkit|nccl|fabric-manager|rdma|openmpi'

Verify the installed Lmod environment modules:

$ ml available

where:
- L: Module is loaded
- D: Default module
8.2. Generalizing an Azure virtual machine for image creation
By generalizing a virtual machine (VM), you prepare the VM to use as a template or base image for image versioning. In this process, you need to remove specific data, stop the VM, deallocate resources, and mark the VM as generalized. By using generalized images, you can create multiple image versions from the same image.
Prerequisites
- You have already configured a RHEL HPC image. For details, see Configuring a RHEL HPC image by using the HPC RHEL system role.
Procedure
Connect to the VM through SSH:
$ ssh -i <example_private_key.pem> <example_azureuser>@<192.0.2.101>

Remove any temporary users, network, and host information:
$ sudo waagent -deprovision+user -force

- Log out or press Ctrl + D to close the SSH session.
Stop the VM:
$ az vm stop --name <example_vm_name> --resource-group <example_vm_resource_group>

Deallocate the resources to stop incurring Azure charges:
$ az vm deallocate --name <example_vm_name> --resource-group <example_vm_resource_group>

Generalize the VM to make sure that this image is generic and ready to clone:
$ az vm generalize --name <example_vm_name> --resource-group <example_vm_resource_group>
8.3. Preparing an Azure Image version from a generalized VM
To create a reusable Azure image version from a generalized virtual machine, you must first create a resource group to organize related resources such as compute, network, and storage. Within this resource group, set up an Azure Compute Gallery to manage and share images across your organization. Define image definitions in the gallery to logically group images and specify their properties and requirements. Based on these image definitions, you can create multiple image versions for consistency and scalability.
With an image version, you can create replicas and multiple versions of the same image. With Azure Compute Gallery, you can create marketplace-compatible custom images to share across your organization. You can use Azure CLI or Azure Cloud Shell. For details on Azure Cloud Shell, see Get started with Azure Cloud Shell.
Prerequisites
- You have installed Azure CLI. For details, see Installing the Azure CLI.
- You have created a generalized VM. For details, see Generalizing an Azure virtual machine for image creation.
Procedure
Create a resource group for hosting the gallery:
$ az group create --name <example_image_resource_group> --location <example_location>

Create a gallery in the above resource group:
$ az sig create --resource-group <example_image_resource_group> \
    --gallery-name <example_image_gallery_name>

Set the security type to Standard for the subscription:

$ az feature register --name UseStandardSecurityType \
    --namespace Microsoft.Compute

Register the provider:
$ az provider register --namespace Microsoft.Compute

Create an image definition to manage image versions:
- <Publisher>
- The entity or organization that provides the image.
- <Offer>
- A collection of related images from a publisher.
- <Stock Keeping Unit (SKU)>
- An edition within an offer, indicating a major release.
- <VERSION>
- A version number of a given SKU.
Get information about the image:
$ az vm list --output table

Use the generalized image name from the output:
$ az vm get-instance-view --resource-group <example_vm_resource_group> \
    --name <example_vm_name> \
    --query id

- Copy the ID of the image definition from the output.
Create the image version by using the image ID obtained in the previous step:
Optional: Delete the VM and associated resources:
$ az vm delete --resource-group <example_image_resource_group> \
    --name <example_vm_name>
8.4. Deploying an HPC cluster by using Azure CycleCloud and Slurm
You can configure a Red Hat Enterprise Linux (RHEL) high-performance computing (HPC) cluster on Azure Cloud. HPC clusters are useful for solving complex problems that require intense processing and computation, by distributing tasks across multiple machines, also known as nodes.
Azure CycleCloud, a cloud-native orchestrator, manages HPC clusters for Azure Cloud. With Azure CycleCloud, you can administer HPC clusters with automatic deployment and scaling of suitable workloads. Azure CycleCloud manages parallel computing jobs and resources, and sets up the Slurm workload manager. Slurm, in turn, manages resource allocation for scheduling and running tasks in the cluster. The following steps use Slurm and Azure CycleCloud 8.x to deploy and manage a RHEL HPC cluster.
To configure a RHEL HPC cluster in the Azure environment, you can use Microsoft Azure services such as Azure CycleCloud. Use these tools at your own risk.
Prerequisites
- You have an active Azure cloud subscription.
- You have a RHEL HPC image. For details, see Configuring a RHEL HPC image by using the HPC RHEL system role.
- You have a generalized VM. For details, see Generalizing an Azure virtual machine for image creation.
- You have a prepared Azure image version. For details, see Preparing an Azure image version from a generalized VM.
Procedure
Install and deploy CycleCloud on Azure:
- For Azure Marketplace installation, see Install Azure CycleCloud from Azure Marketplace.
- For manual installation, see Install Azure CycleCloud manually.
Display the ID of the custom RHEL HPC image:

Provision a Slurm workload manager with Azure CycleCloud by following the steps in Run Slurm with CycleCloud:
Warning: Due to a known limitation of IPv4 addresses, selecting the Public Head Node option causes provisioning of the Slurm head node to fail. As a workaround, ensure the Public Head Node option is unchecked, and determine the most suitable way of accessing the Slurm head node in your environment. For details, see the related Slurm issue on GitHub.

Note: Use the custom RHEL image ID obtained in the earlier step for all cluster nodes. For details, see how to specify a custom OS image.
- On the CycleCloud homepage, select the existing Slurm cluster.
- To launch the Slurm cluster, click Start.
- Log in to the Slurm head node by selecting the cluster view and clicking Connect. Use the standard Slurm command-line tools to schedule HPC jobs. For details, see the How do I submit jobs? (Slurm) section.
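The image ID lookup in the procedure above can be sketched with `az sig image-version show`; the resource group, gallery, definition, and version names are hypothetical placeholders:

```shell
# Hedged sketch: prints the full resource ID of the custom image version,
# which you then supply as the custom OS image for all cluster nodes.
az sig image-version show \
    --resource-group <example_image_resource_group> \
    --gallery-name <example_gallery> \
    --gallery-image-definition <example_image_definition> \
    --gallery-image-version <VERSION> \
    --query id --output tsv
```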
8.5. Managing Environment Modules for HPC clusters
The environment module subsystem uses Lua-based environment modules (Lmod) to list installed modules. With Lmod, you can dynamically modify the environment by loading and unloading various software packages and their dependencies. It manages multiple versions of compilers, libraries, and applications in the high-performance computing (HPC) environment, so you can select specific software configurations for each version. To do this, you can use the module utility, which is shortened to ml.
8.5.1. Commands for environment modules management
You can use the following commands to manage environment modules:
Display all commands related to the module utility:

$ module help

Display the individual command options and syntax:
$ module <command> help

Display details for a specific module:

$ ml whatis pmix/pmix-4.2.9
pmix/pmix-4.2.9 : Description: PMIx 4.2.9 installed in /opt/pmix/4.2.9
pmix/pmix-4.2.9 : Version: 4.2.9-1

List available modules in the HPC environment:
$ ml available

where:

- L: the module is loaded
- D: the default module
- The (L) annotation indicates that the mpi/openmpi-5.0.8-cuda12-gpu module is loaded.
- This also loads the pmix/pmix-4.2.9 module as a dependency.
- The module system automatically loads and unloads dependent modules as needed.
List the loaded modules:

$ ml list
Currently Loaded Modules:
  1) pmix/pmix-4.2.9   2) mpi/openmpi-5.0.8-cuda12-gpu

Unload a module and its dependencies:

$ ml unload mpi/openmpi-5.0.8-cuda12-gpu
$ ml list
No modules loaded

Load a module and its dependencies:

$ ml load mpi/openmpi-5.0.8-cuda12-gpu
$ ml list
Currently Loaded Modules:
  1) pmix/pmix-4.2.9   2) mpi/openmpi-5.0.8-cuda12-gpu

Switch the loaded module:
$ ml swap mpi/openmpi-x86_64
The following have been reloaded with a version change:
  1) mpi/openmpi-5.0.8-cuda12-gpu => mpi/openmpi-x86_64

List the loaded modules:

$ ml list
Currently Loaded Modules:
  1) mpi/openmpi-x86_64
8.5.2. Modulefiles layout and rules
A modulefile is a script that defines and manages environment variables for loading, unloading, and switching software environments on HPC systems. The recommended directory structure and naming conventions enable efficient organization of environment modulefiles, including module definitions, wrapper modules, and consistent rules for managing multiple software environments in RHEL-based HPC deployments.
- Manual method for module definitions
  The module utility automatically discovers the module definitions placed in the /usr/share/modulefiles directory. If you have other directories with module definitions, you need to add them to the MODULEPATH environment variable.
- Wrapper modules
  To avoid modifying environment variables and to allow Lmod to find package-specific modules, place the wrapper modules in the /usr/share/modulefiles directory so that these modules modify MODULEPATH and load the relevant module from the external location. An example of this style of wrapper is the mpi/hpcx-2.24.1 environment module.
- Module file format
  Lmod supports environment modules written in Lua as well as in the Tool Command Language (Tcl) format. Lua scripting is the preferred method for defining environment modules. Use the .lua extension with the 755 permission set. The module name for the ml command omits the .lua suffix. Examples given in this document use the Lua script interfaces. The layout has the following requirements:
  - All packages that provide the same functionality, for example MPI libraries, are in a common module subdirectory. In this case, ../mpi/ for all the MPI library variants.
  - Add a conflict() definition to the module for the package subdirectory (for example, conflict("mpi")), which allows only one module of that package type to load at a time.
  - Keep package naming consistent. Use the feature name as the subdirectory name, while the individual modules for each package instance should be named according to the following pattern:

    <package provider>-<version>-<build>-<options>

    For multiple MPI library variants, some are from different providers, while others are multiple builds of a given package with different compilers and build options. In such cases, the naming scheme provides consistency.
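The layout rules above can be illustrated with a minimal, hypothetical wrapper modulefile. The package name, version, and paths are examples only, and /tmp stands in for /usr/share/modulefiles in this sketch:

```shell
# Hedged sketch: create a hypothetical wrapper modulefile that follows the
# layout rules above (common mpi/ subdirectory, conflict("mpi"), .lua, 755).
mkdir -p /tmp/modulefiles/mpi

cat > /tmp/modulefiles/mpi/example-mpi-1.0-gcc.lua <<'EOF'
-- Wrapper module: only one module from the mpi/ subdirectory loads at a time.
conflict("mpi")
-- Point Lmod at the package-specific module tree in the external location.
prepend_path("MODULEPATH", "/opt/example-mpi/1.0/modulefiles")
EOF

chmod 755 /tmp/modulefiles/mpi/example-mpi-1.0-gcc.lua
ls -l /tmp/modulefiles/mpi
```

The module would then be visible to ml as mpi/example-mpi-1.0-gcc, without the .lua suffix.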
8.5.3. Available MPI environments
In an HPC Slurm cluster, the message passing interface (MPI) environment provides runtime support for applications to communicate and synchronize across nodes. Slurm integrates with MPI implementations such as OpenMPI to manage distributed jobs efficiently and optimize use of cluster resources. Slurm uses mpirun to orchestrate execution across allocated nodes for scalable, high-performance jobs. With the provided RHEL HPC cluster, the following MPI environments are available:
- openmpi.x86_64: Standard Open MPI build that does not provide CUDA or GPU acceleration support.
- openmpi-5.0.8-cuda-gpu: Open MPI module compiled with CUDA support for GPU-aware MPI communication.
- hpcx-2.24.1: Comprehensive NVIDIA HPC software stack based on Open MPI.
- hpcx-2.24.1-pmix: HPC-X build configured with Process Management Interface (PMIx) for Slurm integration and job scheduling.
For consistency in the naming of MPI libraries, specify Unified Communication X (UCX) as the point-to-point messaging layer (PML) implementation for the MPI infrastructure:

$ mpirun -mca pml ucx .
- openmpi.x86_64
- This module provides a standard build of OpenMPI 4.1.1. Although it includes PMIx 3.x support, it does not provide CUDA or GPU acceleration support. The RHEL InfiniBand drivers and infrastructure provide InfiniBand (IB)/Remote Direct Memory Access (RDMA) functionality. The OpenMPI libraries are compiled with gcc-11.4. Use this module if your application does not use CUDA language extensions or require GPU offload support.
- openmpi-5.0.8-cuda-gpu
This module provides OpenMPI 5.0.8 with PMIx 4.x support and CUDA and NVIDIA GPU acceleration support by using the NVIDIA HPC-X package libraries. The RHEL InfiniBand drivers and infrastructure provide InfiniBand (IB)/Remote Direct Memory Access (RDMA) functionality. The OpenMPI libraries are compiled with gcc-11.5. Only select this module if you are using CUDA-enabled applications that require GPU acceleration.
Note: In the absence of InfiniBand NICs, a warning is displayed when InfiniBand NIC autodiscovery fails:

$ mpirun -n 2 /lib64/openmpi/bin/mpitests-osu_allreduce
...
[test-vm] Error: coll_hcoll_module.c:312 - mca_coll_hcoll_comm_query() Hcol library init failed
...

You can remove this warning by adding the -mca coll ^hcoll parameter to the mpirun command:

$ mpirun -mca coll ^hcoll -n 2 /lib64/openmpi/bin/mpitests-osu_allreduce
# Size       Avg Latency(us)
1            8.36
2            6.85
- This module provides an NVIDIA-built OpenMPI 4.1.5 environment. It has no PMIx support, but provides CUDA and NVIDIA GPU acceleration support. The RHEL InfiniBand drivers and infrastructure provide InfiniBand (IB)/Remote Direct Memory Access (RDMA) functionality.
- hpcx-2.24.1-pmix
- This module provides the same environment as the mpi/hpcx-2.24.1 module and also has PMIx 4.x support enabled.