Technical Reference
The technical architecture of Red Hat Virtualization environments
Chapter 1. Introduction
1.1. Red Hat Virtualization Manager
The Red Hat Virtualization Manager provides centralized management for a virtualized environment. A number of different interfaces can be used to access the Red Hat Virtualization Manager. Each interface facilitates access to the virtualized environment in a different manner.
Figure 1.1. Red Hat Virtualization Manager Architecture
The Red Hat Virtualization Manager provides graphical interfaces and an Application Programming Interface (API). Each interface connects to the Manager, an application delivered by an embedded instance of the Red Hat JBoss Enterprise Application Platform. There are a number of other components which support the Red Hat Virtualization Manager in addition to Red Hat JBoss Enterprise Application Platform.
1.2. Red Hat Virtualization Host
A Red Hat Virtualization environment has one or more hosts attached to it. A host is a server that provides the physical hardware that virtual machines make use of.
Red Hat Virtualization Host (RHVH) runs an optimized operating system installed using a special, customized installation media specifically for creating virtualization hosts.
Red Hat Enterprise Linux hosts are servers running a standard Red Hat Enterprise Linux operating system that has been configured after installation to permit use as a host.
Both methods of host installation result in hosts that interact with the rest of the virtualized environment in the same way, and so both are referred to as hosts.
Figure 1.2. Host Architecture
- Kernel-based Virtual Machine (KVM)
- The Kernel-based Virtual Machine (KVM) is a loadable kernel module that provides full virtualization through the use of the Intel VT or AMD-V hardware extensions. Though KVM itself runs in kernel space, the guests running upon it run as individual QEMU processes in user space. KVM allows a host to make its physical hardware available to virtual machines.
- QEMU
- QEMU is a multi-platform emulator used to provide full system emulation. QEMU emulates a full system, for example a PC, including one or more processors, and peripherals. QEMU can be used to launch different operating systems or to debug system code. QEMU, working in conjunction with KVM and a processor with appropriate virtualization extensions, provides full hardware assisted virtualization.
- Red Hat Virtualization Manager Host Agent, VDSM
- In Red Hat Virtualization, VDSM initiates actions on virtual machines and storage. It also facilitates inter-host communication. VDSM monitors host resources such as memory, storage, and networking. Additionally, VDSM manages tasks such as virtual machine creation, statistics accumulation, and log collection. A VDSM instance runs on each host and receives management commands from the Red Hat Virtualization Manager using the re-configurable port 54321.
- VDSM-REG
- VDSM uses VDSM-REG to register each host with the Red Hat Virtualization Manager. VDSM-REG supplies information about itself and its host using port 80 or port 443.
- libvirt
- Libvirt facilitates the management of virtual machines and their associated virtual devices. When Red Hat Virtualization Manager initiates virtual machine life-cycle commands (start, stop, reboot), VDSM invokes libvirt on the relevant host machines to execute them.
- Storage Pool Manager, SPM
The Storage Pool Manager (SPM) is a role assigned to one host in a data center. The SPM host has sole authority to make all storage domain structure metadata changes for the data center. This includes creation, deletion, and manipulation of virtual disks, snapshots, and templates. It also includes allocation of storage for sparse block devices on a Storage Area Network (SAN). The role of SPM can be migrated to any host in a data center. As a result, all hosts in a data center must have access to all the storage domains defined in the data center.
Red Hat Virtualization Manager ensures that the SPM is always available. In case of storage connectivity errors, the Manager re-assigns the SPM role to another host.
- Guest Operating System
Guest operating systems do not need to be modified to be installed on virtual machines in a Red Hat Virtualization environment. The guest operating system, and any applications on the guest, are unaware of the virtualized environment and run normally.
Red Hat provides enhanced device drivers that allow faster and more efficient access to virtualized devices. You can also install the Red Hat Virtualization Guest Agent on guests, which provides enhanced guest information to the management console.
1.3. Components that Support the Manager
- Red Hat JBoss Enterprise Application Platform
- Red Hat JBoss Enterprise Application Platform is a Java application server. It provides a framework to support efficient development and delivery of cross-platform Java applications. The Red Hat Virtualization Manager is delivered using Red Hat JBoss Enterprise Application Platform.
The version of the Red Hat JBoss Enterprise Application Platform bundled with Red Hat Virtualization Manager is not to be used to serve other applications. It has been customized for the specific purpose of serving the Red Hat Virtualization Manager. Using the Red Hat JBoss Enterprise Application Platform that is included with the Manager for additional purposes adversely affects its ability to service the Red Hat Virtualization environment.
- Gathering Reports and Historical Data
The Red Hat Virtualization Manager includes a data warehouse that collects monitoring data about hosts, virtual machines, and storage. A number of pre-defined reports are available. Customers can analyze their environments and create reports using any query tools that support SQL.
The Red Hat Virtualization Manager installation process creates two databases. These databases are created on a PostgreSQL instance that is selected during installation.
- The engine database is the primary data store used by the Red Hat Virtualization Manager. Information about the virtualization environment, such as its state, configuration, and performance, is stored in this database.
- The ovirt_engine_history database contains configuration information and statistical metrics which are collated over time from the engine operational database. The configuration data in the engine database is examined every minute, and changes are replicated to the ovirt_engine_history database. Tracking the changes to the database provides information on the objects in the database. This enables you to analyze and enhance the performance of your Red Hat Virtualization environment and resolve difficulties.
For more information on generating reports based on the ovirt_engine_history database see the History Database in the Red Hat Virtualization Data Warehouse Guide.
Important: The replication of data to the ovirt_engine_history database is performed by the RHEVM History Service, ovirt-engine-dwhd.
- Directory services
- Directory services provide centralized network-based storage of user and organizational information. Types of information stored include application settings, user profiles, group data, policies, and access control. The Red Hat Virtualization Manager supports Active Directory, Identity Management (IdM), OpenLDAP, and Red Hat Directory Server 9. There is also a local, internal domain for administration purposes only. This internal domain has only one user: the admin user.
1.4. Storage
Red Hat Virtualization uses a centralized storage system for virtual disks, templates, snapshots, and ISO files. Storage is logically grouped into storage pools, which are composed of storage domains. A storage domain is a combination of storage capacity and metadata that describes the internal structure of the storage. See Storage Domain Types.
The data domain is the only one required by each data center. A data storage domain is exclusive to a single data center. Export and ISO domains are optional. Storage domains are shared resources, and must be accessible to all hosts in a data center.
1.5. Network
The Red Hat Virtualization network architecture facilitates connectivity between the different elements of the Red Hat Virtualization environment. The network architecture not only supports network connectivity, it also allows for network segregation.
Figure 1.3. Network Architecture
Networking is defined in Red Hat Virtualization in several layers. The underlying physical networking infrastructure must be in place and configured to allow connectivity between the hardware and the logical components of the Red Hat Virtualization environment.
- Networking Infrastructure Layer
The Red Hat Virtualization network architecture relies on some common hardware and software devices:
- Network Interface Controllers (NICs) are physical network interface devices that connect a host to the network.
- Virtual NICs (vNICs) are logical NICs that operate using the host’s physical NICs. They provide network connectivity to virtual machines.
- Bonds bind multiple NICs into a single interface.
- Bridges are a packet-forwarding technique for packet-switching networks. They form the basis of virtual machine logical networks.
- Logical Networks
Logical networks allow segregation of network traffic based on environment requirements. The types of logical network are:
- logical networks that carry virtual machine network traffic,
- logical networks that do not carry virtual machine network traffic,
- optional logical networks,
- and required networks.
All logical networks can either be required or optional.
A logical network that carries virtual machine network traffic is implemented at the host level as a software bridge device. By default, one logical network is defined during the installation of the Red Hat Virtualization Manager: the ovirtmgmt management network.
Other logical networks that can be added by an administrator are: a dedicated storage logical network, and a dedicated display logical network. Logical networks that do not carry virtual machine traffic do not have an associated bridge device on hosts. They are associated with host network interfaces directly.
Red Hat Virtualization segregates management-related network traffic from migration-related network traffic. This makes it possible to use a dedicated network (without routing) for live migration, and ensures that the management network (ovirtmgmt) does not lose its connection to hypervisors during migrations.
- Explanation of logical networks on different layers
- Logical networks have different implications for each layer of the virtualization environment.
Data Center Layer
Logical networks are defined at the data center level. Each data center has the ovirtmgmt management network by default. Further logical networks are optional but recommended. Designation as a VM Network and a custom MTU can be set at the data center level. A logical network that is defined for a data center must also be added to the clusters that use the logical network.
Cluster Layer
Logical networks are made available from a data center, and must be added to the clusters that will use them. Each cluster is connected to the management network by default. You can optionally add to a cluster logical networks that have been defined for the cluster’s parent data center. When a required logical network has been added to a cluster, it must be implemented for each host in the cluster. Optional logical networks can be added to hosts as needed.
Host Layer
Virtual machine logical networks are implemented for each host in a cluster as a software bridge device associated with a given network interface. Non-virtual machine logical networks do not have associated bridges, and are associated with host network interfaces directly. Each host has the management network implemented as a bridge using one of its network devices as a result of being included in a Red Hat Virtualization environment. Further required logical networks that have been added to a cluster must be associated with network interfaces on each host to become operational for the cluster.
Virtual Machine Layer
Logical networks can be made available to virtual machines in the same way that a network can be made available to a physical machine. A virtual machine can have its virtual NIC connected to any virtual machine logical network that has been implemented on the host that runs it. The virtual machine then gains connectivity to any other devices or destinations that are available on the logical network it is connected to.
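The layering described above can be sketched as a small data model. This is a minimal illustration only: the class and function names are hypothetical and are not part of any Red Hat Virtualization API. It assumes that a required network must be present on every host in the cluster, and that a virtual NIC can only connect to a virtual machine network implemented on the host that runs the virtual machine.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class LogicalNetwork:
    name: str
    vm_network: bool = True   # True: implemented on hosts as a software bridge
    required: bool = True     # True: must be implemented on every host in the cluster


@dataclass
class Host:
    name: str
    # Names of the logical networks attached to this host's interfaces.
    networks: set = field(default_factory=set)


def missing_required_networks(cluster_networks, host):
    """Return the required cluster networks that the host has not implemented."""
    return sorted(
        net.name
        for net in cluster_networks
        if net.required and net.name not in host.networks
    )


def can_attach_vnic(network, host):
    """A vNIC can only use a VM network that is implemented on the given host."""
    return network.vm_network and network.name in host.networks


cluster_networks = [
    LogicalNetwork("ovirtmgmt"),
    LogicalNetwork("storage", vm_network=False),
    LogicalNetwork("display", required=False),
]
host = Host("host1", {"ovirtmgmt"})
print(missing_required_networks(cluster_networks, host))  # ['storage']
```

In this sketch the host is not yet operational for the cluster because the required storage network is missing, while the optional display network can be added later as needed.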
Example 1.1. Management Network
The management logical network, named ovirtmgmt, is created automatically when the Red Hat Virtualization Manager is installed. The ovirtmgmt network is dedicated to management traffic between the Red Hat Virtualization Manager and hosts. If no other specifically purposed bridges are set up, ovirtmgmt is the default bridge for all traffic.
1.6. Data Centers
A data center is the highest level of abstraction in Red Hat Virtualization. A data center contains three types of information:
- Storage
- This includes storage types, storage domains, and connectivity information for storage domains. Storage is defined for a data center, and available to all clusters in the data center. All host clusters within a data center have access to the same storage domains.
- Logical networks
- This includes details such as network addresses, VLAN tags and STP support. You can define logical networks for a data center and apply them to clusters.
- Clusters
- Clusters are groups of hosts with compatible processor cores, either AMD or Intel processors. Clusters are migration domains; virtual machines can be live-migrated to any host within a cluster, and not to other clusters. One data center can hold multiple clusters, and each cluster can contain multiple hosts.
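The idea of a cluster as a migration domain can be illustrated with a short check. This is a sketch under stated assumptions: `HostRef` and `can_live_migrate` are hypothetical names, and the real Manager applies further scheduling constraints that are not modeled here.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class HostRef:
    name: str
    cluster: str
    data_center: str


def can_live_migrate(source, target):
    """Live migration is only possible to a different host in the same cluster."""
    return (
        source.name != target.name
        and source.data_center == target.data_center
        and source.cluster == target.cluster
    )


host_a = HostRef("host-a", "cluster1", "dc1")
host_b = HostRef("host-b", "cluster1", "dc1")
host_c = HostRef("host-c", "cluster2", "dc1")

print(can_live_migrate(host_a, host_b))  # True
print(can_live_migrate(host_a, host_c))  # False
```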
1.7. Data Center and Cluster Compatibility Levels
Red Hat Virtualization data centers and clusters have a compatibility version.
The data center compatibility version indicates the version of Red Hat Virtualization that the data center is intended to be compatible with. All clusters in the data center must support the desired compatibility level.
The cluster compatibility version indicates the features of Red Hat Virtualization supported by all of the hosts in the cluster. The cluster compatibility is set according to the version of the least capable host operating system in the cluster.
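As a rough illustration of the "least capable host" rule, the following sketch takes the highest compatibility level that each host supports and returns the lowest of them. The function names and the input mapping are assumptions made for this example, not a Manager interface.

```python
def parse_version(text):
    """Turn a version string such as '4.6' into a comparable tuple (4, 6)."""
    return tuple(int(part) for part in text.split("."))


def cluster_compatibility(host_levels):
    """Return the highest level that every host in the cluster supports.

    host_levels maps a host name to the highest compatibility level
    that the host supports (hypothetical input format).
    """
    if not host_levels:
        raise ValueError("a cluster needs at least one host")
    lowest = min(parse_version(level) for level in host_levels.values())
    return ".".join(str(part) for part in lowest)


print(cluster_compatibility({"host1": "4.7", "host2": "4.5", "host3": "4.6"}))  # 4.5
```

Versions are compared numerically rather than as strings, so a hypothetical level 4.10 would correctly rank above 4.9.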
The table below provides a compatibility matrix of RHV versions and the required data center and cluster compatibility levels.
Compatibility Level | RHV Version | Description |
---|---|---|
4.7 | 4.4 | Compatibility Level 4.7 was introduced in RHV 4.4 to support new features introduced by RHEL 8.6 hypervisors. |
4.6 | 4.4.6 | Compatibility Level 4.6 was introduced in RHV 4.4.6 to support new features introduced by RHEL 8.4 hypervisors with Advanced Virtualization 8.4 packages. |
4.5 | 4.4.3 | Compatibility Level 4.5 was introduced in RHV 4.4.3 to support new features introduced by RHEL 8.3 hypervisors with Advanced Virtualization 8.3 packages. |
Limitations
Virtio NICs are enumerated as a different device after upgrading the cluster compatibility level to 4.6. Therefore, the NICs might need to be reconfigured. Red Hat recommends that you test the virtual machines before you upgrade the cluster by setting the cluster compatibility level to 4.6 on the virtual machine and verifying the network connection.
If the network connection for the virtual machine fails, configure the virtual machine with a custom emulated machine that matches the current emulated machine, for example pc-q35-rhel8.3.0 for 4.5 compatibility version, before upgrading the cluster.
Chapter 2. Storage
2.1. Storage Domains Overview
A storage domain is a collection of images that have a common storage interface. A storage domain contains complete images of templates and virtual machines (including snapshots), ISO files, and metadata about themselves. A storage domain can be made of either block devices (SAN - iSCSI or FCP) or a file system (NAS - NFS, GlusterFS, or other POSIX compliant file systems).
GlusterFS Storage is deprecated, and will no longer be supported in future releases.
On NAS, all virtual disks, templates, and snapshots are files.
On SAN (iSCSI/FCP), each virtual disk, template or snapshot is a logical volume. Block devices are aggregated into a logical entity called a volume group, and then divided by LVM (Logical Volume Manager) into logical volumes for use as virtual hard disks. See Red Hat Enterprise Linux Configuring and managing logical volumes for more information on LVM.
Virtual disks can have one of two formats, either QCOW2 or raw. The type of storage can be either sparse or preallocated. Snapshots are always sparse but can be taken for disks of either format.
Virtual machines that share the same storage domain can be migrated between hosts that belong to the same cluster.
2.2. Types of Storage Backing Storage Domains
Storage domains can be implemented using block based and file based storage.
- File Based Storage
The file based storage types supported by Red Hat Virtualization are NFS, GlusterFS, other POSIX compliant file systems, and storage local to hosts.
Note: GlusterFS Storage is deprecated, and will no longer be supported in future releases.
File based storage is managed externally to the Red Hat Virtualization environment.
NFS storage is managed by a Red Hat Enterprise Linux NFS server, or other third party network attached storage server.
Hosts can manage their own local storage file systems.
- Block Based Storage
Block storage uses unformatted block devices. Block devices are aggregated into volume groups by the Logical Volume Manager (LVM). An instance of LVM runs on all hosts, unaware of the instances running on other hosts. VDSM adds clustering logic on top of LVM by scanning volume groups for changes. When changes are detected, VDSM updates individual hosts by telling them to refresh their volume group information. The hosts divide the volume group into logical volumes, writing logical volume metadata to disk. If more storage capacity is added to an existing storage domain, the Red Hat Virtualization Manager causes VDSM on each host to refresh volume group information.
A Logical Unit Number (LUN) is an individual block device. One of the supported block storage protocols, iSCSI or Fibre Channel, is used to connect to a LUN. The Red Hat Virtualization Manager manages software iSCSI connections to the LUNs. All other block storage connections are managed externally to the Red Hat Virtualization environment. Any changes in a block based storage environment, such as the creation, extension, or deletion of logical volumes and the addition of a new LUN, are handled by LVM on a specially selected host called the Storage Pool Manager. Changes are then synced by VDSM, which refreshes storage metadata across all hosts in the cluster.
2.3. Storage Domain Types
Red Hat Virtualization supports the following types of storage domains. Each description lists the storage types that the domain supports.
The Data Domain stores the hard disk images of all virtual machines in the Red Hat Virtualization environment. Disk images may contain an installed operating system or data stored or generated by a virtual machine. Data storage domains support NFS, iSCSI, FCP, GlusterFS and POSIX compliant storage. A data domain cannot be shared between multiple data centers.
Note: GlusterFS Storage is deprecated, and will no longer be supported in future releases.
The Export Domain provides transitory storage for hard disk images and virtual machine templates being transferred between data centers. Additionally, export storage domains store backed up copies of virtual machines. Export storage domains support NFS storage. Multiple data centers can access a single export storage domain but only one data center can use it at a time.
Note: The Export domain is deprecated. Storage data domains can be unattached from a data center and imported to another data center in the same environment, or in a different environment. Virtual machines, floating virtual disks, and templates can then be uploaded from the imported storage domain to the attached data center.
The ISO Domain stores ISO files, also called images. ISO files are representations of physical CDs or DVDs. In the Red Hat Virtualization environment the common types of ISO files are operating system installation disks, application installation disks, and guest agent installation disks. These images can be attached to virtual machines and booted in the same way that physical disks are inserted into a disk drive and booted. ISO storage domains allow all hosts within the data center to share ISOs, eliminating the need for physical optical media.
Note: The ISO domain is a deprecated storage domain type. The ISO Uploader tool has been deprecated. Red Hat recommends uploading ISO images to the data domain with the Administration Portal or with the REST API.
2.4. Storage Formats for Virtual Disks
- QCOW2 Formatted Virtual Machine Storage
QCOW2 is a storage format for virtual disks. QCOW stands for QEMU copy-on-write. The QCOW2 format decouples the physical storage layer from the virtual layer by adding a mapping between logical and physical blocks. Each logical block is mapped to its physical offset, which enables storage over-commitment and virtual machine snapshots, where each QCOW volume only represents changes made to an underlying virtual disk.
The initial mapping points all logical blocks to the offsets in the backing file or volume. When a virtual machine writes data to a QCOW2 volume after a snapshot, the relevant block is read from the backing volume, modified with the new information and written into a new snapshot QCOW2 volume. Then the map is updated to point to the new place.
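The copy-on-write behavior described above can be modeled with a toy layered volume. This is a conceptual sketch only: it does not reproduce the QCOW2 on-disk format, it writes whole blocks rather than modifying part of a block, and the `CowVolume` class is a hypothetical name.

```python
class CowVolume:
    """Toy copy-on-write layer: a block map on top of an optional backing volume."""

    def __init__(self, backing=None):
        self.backing = backing
        self.blocks = {}  # logical block number -> data stored in this layer

    def read(self, block):
        # Prefer data held in this layer, otherwise fall through to the backing volume.
        if block in self.blocks:
            return self.blocks[block]
        if self.backing is not None:
            return self.backing.read(block)
        return b""  # unallocated block

    def write(self, block, data):
        # The change is recorded in this layer only; the backing volume is untouched.
        self.blocks[block] = data


base = CowVolume()
base.write(0, b"boot")
base.write(1, b"data-v1")

snapshot_layer = CowVolume(backing=base)  # new top layer created after a snapshot
snapshot_layer.write(1, b"data-v2")

print(snapshot_layer.read(0))  # b'boot'
print(snapshot_layer.read(1))  # b'data-v2'
print(base.read(1))            # b'data-v1'
```

Because the top layer stores only the blocks that changed, the snapshot state in the backing volume stays intact and the new layer consumes space only for modified data.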
- Raw
The raw storage format has a performance advantage over QCOW2 in that no formatting is applied to virtual disks stored in the raw format. Virtual machine data operations on virtual disks stored in raw format require no additional work from hosts. When a virtual machine writes data to a given offset in its virtual disk, the I/O is written to the same offset on the backing file or logical volume.
Raw format requires that the entire space of the defined image be preallocated unless using externally managed thin provisioned LUNs from a storage array.
2.5. Virtual Disk Storage Allocation Policies
- Preallocated Storage
- All of the storage required for a virtual disk is allocated prior to virtual machine creation. If a 20 GB disk image is created for a virtual machine, the disk image uses 20 GB of storage domain capacity. Preallocated disk images cannot be enlarged. Preallocating storage can mean faster write times because no storage allocation takes place during runtime, at the cost of flexibility. Allocating storage this way reduces the capacity of the Red Hat Virtualization Manager to overcommit storage. Preallocated storage is recommended for virtual machines used for high intensity I/O tasks with less tolerance for latency in storage. Generally, server virtual machines fit this description.
If thin provisioning functionality provided by your storage back-end is being used, preallocated storage should still be selected from the Administration Portal when provisioning storage for virtual machines.
- Sparsely Allocated Storage
- The upper size limit for a virtual disk is set at virtual machine creation time. Initially, the disk image does not use any storage domain capacity. Usage grows as the virtual machine writes data to disk, until the upper limit is reached. Capacity is not returned to the storage domain when data in the disk image is removed. Sparsely allocated storage is appropriate for virtual machines with low or medium intensity I/O tasks with some tolerance for latency in storage. Generally, desktop virtual machines fit this description.
If thin provisioning functionality is provided by your storage back-end, it should be used as the preferred implementation of thin provisioning. Storage should be provisioned from the graphical user interface as preallocated, leaving thin provisioning to the back-end solution.
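The difference between the two allocation policies can be shown with a simplified capacity model. The `VirtualDisk` class below is hypothetical and works in whole gigabytes; it only tracks how much storage domain capacity a disk consumes, following the behavior described in this section.

```python
class VirtualDisk:
    """Toy model of storage domain capacity used by one virtual disk."""

    def __init__(self, virtual_size_gb, preallocated):
        self.virtual_size_gb = virtual_size_gb
        self.data_gb = 0
        # Preallocated disks consume their full size up front; sparse disks start empty.
        self.allocated_gb = virtual_size_gb if preallocated else 0

    def write(self, gb):
        if self.data_gb + gb > self.virtual_size_gb:
            raise ValueError("write exceeds the virtual size of the disk")
        self.data_gb += gb
        # Allocation grows on demand and never exceeds the virtual size.
        self.allocated_gb = max(self.allocated_gb, self.data_gb)

    def delete(self, gb):
        # Removing data does not return capacity to the storage domain.
        self.data_gb = max(0, self.data_gb - gb)


thick = VirtualDisk(20, preallocated=True)
thin = VirtualDisk(20, preallocated=False)
thin.write(5)
thin.delete(3)

print(thick.allocated_gb)  # 20
print(thin.allocated_gb)   # 5
```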
2.6. Storage Metadata Versions in Red Hat Virtualization
Red Hat Virtualization stores information about storage domains as metadata on the storage domains themselves. Each major release of Red Hat Virtualization has seen improved implementations of storage metadata.
V1 metadata (Red Hat Virtualization 2.x series)
- Each storage domain contains metadata describing its own structure, and all of the names of physical volumes that are used to back virtual disks.
- Master domains additionally contain metadata for all the domains and physical volume names in the storage pool. The total size of this metadata is limited to 2 KB, limiting the number of storage domains that can be in a pool.
- Template and virtual machine base images are read only.
- V1 metadata is applicable to NFS, iSCSI, and FC storage domains.
V2 metadata (Red Hat Enterprise Virtualization 3.0)
- All storage domain and pool metadata is stored as logical volume tags rather than written to a logical volume. Metadata about virtual disk volumes is still stored in a logical volume on the domains.
- Physical volume names are no longer included in the metadata.
- Template and virtual machine base images are read only.
- V2 metadata is applicable to iSCSI, and FC storage domains.
V3 metadata (Red Hat Enterprise Virtualization 3.1 and later)
- All storage domain and pool metadata is stored as logical volume tags rather than written to a logical volume. Metadata about virtual disk volumes is still stored in a logical volume on the domains.
- Virtual machine and template base images are no longer read only. This change enables live snapshots, live storage migration, and clone from snapshot.
- Support for Unicode metadata is added, for non-English volume names.
- V3 metadata is applicable to NFS, GlusterFS, POSIX, iSCSI, and FC storage domains.
Note: GlusterFS Storage is deprecated, and will no longer be supported in future releases.
V4 metadata (Red Hat Virtualization 4.1 and later)
- Support for QCOW2 compat levels - the QCOW image format includes a version number to allow introducing new features that change the image format so that it is incompatible with earlier versions. Newer QEMU versions (1.7 and above) support QCOW2 version 3, which is not backwards compatible, but introduces improvements such as zero clusters and improved performance.
- A new xleases volume to support VM leases - this feature adds the ability to acquire a lease per virtual machine on shared storage without attaching the lease to a virtual machine disk.
A VM lease offers two important capabilities:
- Avoiding split-brain.
- Starting a VM on another host if the original host becomes non-responsive, which improves the availability of HA VMs.
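To make the QCOW2 compat levels mentioned above more concrete, the following sketch reads the format version from the start of an image header. It assumes the layout given in the published QCOW2 specification: a 4-byte magic value followed by a big-endian 32-bit version number, where version 2 corresponds to compat level 0.10 and version 3 to compat level 1.1. The function names are illustrative, and the header in the example is synthetic rather than read from a real image.

```python
import struct

QCOW2_MAGIC = b"QFI\xfb"
COMPAT_BY_VERSION = {2: "0.10", 3: "1.1"}


def qcow2_version(header):
    """Return the QCOW2 format version from the first 8 bytes of an image."""
    if len(header) < 8:
        raise ValueError("need at least 8 bytes of header")
    magic, version = struct.unpack(">4sI", header[:8])
    if magic != QCOW2_MAGIC:
        raise ValueError("not a QCOW2 image")
    return version


# Synthetic header for illustration: magic followed by version 3.
sample_header = struct.pack(">4sI", QCOW2_MAGIC, 3)
version = qcow2_version(sample_header)
print(version, COMPAT_BY_VERSION[version])  # 3 1.1
```

To inspect a real image, the first 8 bytes of the file could be passed to `qcow2_version` in the same way.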
V5 metadata (Red Hat Virtualization 4.3 and later)
- Support for 4K (4096 byte) block storage.
- Support for variable SANLOCK alignments.
- Support for new properties:
  - BLOCK_SIZE - stores the block size of the storage domain in bytes.
  - ALIGNMENT - determines the formatting and size of the xlease volume (1MB to 8MB). It is determined by the maximum number of hosts to be supported (a value provided by the user) and the disk block size.
    For example, a 512b block size and support for 2000 hosts results in a 1MB xlease volume.
    A 4K block size with 2000 hosts results in an 8MB xlease volume.
    The default value of maximum hosts is 250, resulting in an xlease volume of 1MB for 4K disks.
- Deprecated properties:
  - The LOGBLKSIZE, PHYBLKSIZE, MTIME, and POOL_UUID fields were removed from the storage domain metadata.
  - The SIZE (size in blocks) field was replaced by CAP (size in bytes).
- You cannot boot from a 4K format disk, as the boot disk always uses a 512 byte emulation.
- The nfs format always uses 512 bytes.
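The relationship between block size, maximum hosts, and the ALIGNMENT value can be sketched as a lookup. Only three data points come from the text above (512 bytes with 2000 hosts gives 1MB, 4K with 2000 hosts gives 8MB, and 4K with 250 hosts gives 1MB). The intermediate 2MB and 4MB steps in this sketch are an assumption based on the doubling pattern, and the function name is hypothetical.

```python
MIB = 1024 * 1024

# (alignment in MiB, maximum hosts supported) for 4K block storage.
# The 2 MiB and 4 MiB rows are assumed, not stated in this document.
ALIGNMENT_STEPS_4K = ((1, 250), (2, 500), (4, 1000), (8, 2000))


def xlease_alignment(block_size, max_hosts):
    """Return the alignment in bytes for a block size and a host limit."""
    if max_hosts < 1:
        raise ValueError("max_hosts must be at least 1")
    if block_size == 512:
        if max_hosts > 2000:
            raise ValueError("at most 2000 hosts are supported")
        return 1 * MIB
    if block_size == 4096:
        for align_mib, host_limit in ALIGNMENT_STEPS_4K:
            if max_hosts <= host_limit:
                return align_mib * MIB
        raise ValueError("at most 2000 hosts are supported")
    raise ValueError("block size must be 512 or 4096 bytes")


print(xlease_alignment(512, 2000) // MIB)   # 1
print(xlease_alignment(4096, 2000) // MIB)  # 8
print(xlease_alignment(4096, 250) // MIB)   # 1
```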
2.7. Storage Domain Autorecovery in Red Hat Virtualization
Hosts in a Red Hat Virtualization environment monitor storage domains in their data centers by reading metadata from each domain. A storage domain becomes inactive when all hosts in a data center report that they cannot access the storage domain.
Rather than disconnecting an inactive storage domain, the Manager assumes that the storage domain has become inactive temporarily, because of a temporary network outage for example. Once every 5 minutes, the Manager attempts to re-activate any inactive storage domains.
Administrator intervention may be required to remedy the cause of the storage connectivity interruption, but the Manager handles re-activating storage domains as connectivity is restored.
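A single pass of this autorecovery behavior might look like the sketch below. The state names, the `can_access` callback, and the function name are assumptions for illustration; the only value taken from the text is the 5 minute interval between attempts.

```python
RETRY_INTERVAL_SECONDS = 5 * 60  # the Manager retries once every 5 minutes


def autorecover_pass(domain_states, can_access):
    """Try to re-activate every inactive storage domain once.

    domain_states maps a domain name to "active" or "inactive".
    can_access is a callable that reports whether a domain is reachable again.
    Returns the names of the domains that were re-activated in this pass.
    """
    reactivated = []
    for name, state in domain_states.items():
        if state == "inactive" and can_access(name):
            domain_states[name] = "active"
            reactivated.append(name)
    return reactivated


states = {"data1": "active", "data2": "inactive", "iso1": "inactive"}
reachable = {"data1", "data2"}  # iso1 is still unreachable

print(autorecover_pass(states, lambda name: name in reachable))  # ['data2']
print(states["iso1"])  # inactive
```

A scheduler would call `autorecover_pass` every `RETRY_INTERVAL_SECONDS`, so a domain such as `iso1` is picked up automatically on a later pass once its connectivity is restored.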
2.8. The Storage Pool Manager
Red Hat Virtualization uses metadata to describe the internal structure of storage domains. Structural metadata is written to a segment of each storage domain. Hosts work with the storage domain metadata based on a single writer, and multiple readers configuration. Storage domain structural metadata tracks image and snapshot creation and deletion, and volume and domain extension.
The host that can make changes to the structure of the data domain is known as the Storage Pool Manager (SPM). The SPM coordinates all metadata changes in the data center, such as creating and deleting disk images, creating and merging snapshots, copying images between storage domains, creating templates and storage allocation for block devices. There is one SPM for every data center. All other hosts can only read storage domain structural metadata.
A host can be manually selected as the SPM, or it can be assigned by the Red Hat Virtualization Manager. The Manager assigns the SPM role by causing a potential SPM host to attempt to assume a storage-centric lease. The lease allows the SPM host to write storage metadata. It is storage-centric because it is written to the storage domain rather than being tracked by the Manager or hosts. Storage-centric leases are written to a special logical volume in the master storage domain called leases. Metadata about the structure of the data domain is written to a special logical volume called metadata. The leases logical volume protects the metadata logical volume from changes.
The Manager uses VDSM to issue the spmStart command to a host, causing VDSM on that host to attempt to assume the storage-centric lease. If the host is successful it becomes the SPM and retains the storage-centric lease until the Red Hat Virtualization Manager requests that a new host assume the role of SPM.
The Manager moves the SPM role to another host if:
- The SPM host cannot access all storage domains, but can access the master storage domain
- The SPM host is unable to renew the lease because of a loss of storage connectivity or the lease volume is full and no write operation can be performed
- The SPM host crashes
Figure 2.1. The Storage Pool Manager Exclusively Writes Structural Metadata.
2.9. Storage Pool Manager Selection Process
If a host has not been manually assigned the Storage Pool Manager (SPM) role, the SPM selection process is initiated and managed by the Red Hat Virtualization Manager.
First, the Red Hat Virtualization Manager requests that VDSM confirm which host has the storage-centric lease.
The Red Hat Virtualization Manager tracks the history of SPM assignment from the initial creation of a storage domain onward. The availability of the SPM role is confirmed in three ways:
- The "getSPMstatus" command: the Manager uses VDSM to check with the host that had SPM status last and receives one of "SPM", "Contending", or "Free".
- The metadata volume for a storage domain contains the last host with SPM status.
- The metadata volume for a storage domain contains the version of the last host with SPM status.
If an operational, responsive host retains the storage-centric lease, the Red Hat Virtualization Manager marks that host SPM in the Administration Portal. No further action is taken.
If the SPM host does not respond, it is considered unreachable. If power management has been configured for the host, it is automatically fenced. If not, it requires manual fencing. The Storage Pool Manager role cannot be assigned to a new host until the previous Storage Pool Manager is fenced.
When the SPM role and storage-centric lease are free, the Red Hat Virtualization Manager assigns them to a randomly selected operational host in the data center.
If the SPM role assignment fails on a new host, the Red Hat Virtualization Manager adds the host to a list containing hosts the operation has failed on, marking these hosts as ineligible for the SPM role. This list is cleared at the beginning of the next SPM selection process so that all hosts are again eligible.
The Red Hat Virtualization Manager continues to request that the Storage Pool Manager role and storage-centric lease be assumed by a randomly selected host that is not on the list of failed hosts until the SPM selection succeeds.
Each time the current SPM is unresponsive or unable to fulfill its responsibilities, the Red Hat Virtualization Manager initiates the Storage Pool Manager selection process.
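The selection loop described in this section can be sketched as follows. This is a minimal illustration and not Manager code: the function `select_spm`, the `try_assume_lease` callback (standing in for the spmStart request), and the host names are hypothetical. Returning `None` when every host has failed is also an assumption made so that the sketch terminates.

```python
import random

def select_spm(hosts, try_assume_lease, rng=random):
    """Pick an SPM from the operational hosts, retrying until one succeeds.

    hosts: list of operational host names (hypothetical).
    try_assume_lease: callback standing in for the spmStart request;
        returns True if the host acquired the storage-centric lease.
    rng: source of randomness; anything with a choice() method.
    """
    failed = set()  # starts empty for every selection process
    while True:
        candidates = [h for h in hosts if h not in failed]
        if not candidates:
            return None  # assumption: give up once every host has failed
        host = rng.choice(candidates)  # random pick among eligible hosts
        if try_assume_lease(host):
            return host
        failed.add(host)  # ineligible for the rest of this selection
```

For example, `select_spm(["host1", "host2", "host3"], lambda h: h == "host2")` keeps trying until it lands on `host2`, skipping any host that has already failed in this round.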
2.10. Exclusive Resources and Sanlock in Red Hat Virtualization
Certain resources in the Red Hat Virtualization environment must be accessed exclusively.
The SPM role is one such resource. If more than one host were to become the SPM, there would be a risk of data corruption as the same data could be changed from two places at once.
Prior to Red Hat Enterprise Virtualization 3.1, SPM exclusivity was maintained and tracked using a VDSM feature called safelease. The lease was written to a special area on all of the storage domains in a data center. All of the hosts in an environment could track SPM status in a network-independent way. VDSM’s safelease maintained exclusivity of only one resource: the SPM role.
Sanlock provides the same functionality, but treats the SPM role as one of the resources that can be locked. Sanlock is more flexible because it allows additional resources to be locked.
Applications that require resource locking can register with Sanlock. Registered applications can then request that Sanlock lock a resource on their behalf, so that no other application can access it. For example, instead of VDSM locking the SPM status, VDSM now requests that Sanlock do so.
Locks are tracked on disk in a lockspace. There is one lockspace for every storage domain. In the case of the lock on the SPM resource, each host’s liveness is tracked in the lockspace by the host’s ability to renew the hostid it received from the Manager when it connected to storage, and to write a timestamp to the lockspace at a regular interval. The ids logical volume tracks the unique identifiers of each host, and is updated every time a host renews its hostid. The SPM resource can only be held by a live host.
Resources are tracked on disk in the leases logical volume. A resource is said to be taken when its representation on disk has been updated with the unique identifier of the process that has taken it. In the case of the SPM role, the SPM resource is updated with the hostid that has taken it.
The Sanlock process on each host only needs to check the resources once to see that they are taken. After an initial check, Sanlock can monitor the lockspaces until the timestamp of the host with a locked resource becomes stale.
Sanlock monitors the applications that use resources. For example, VDSM is monitored for SPM status and hostid. If the host is unable to renew its hostid from the Manager, it loses exclusivity on all resources in the lockspace. Sanlock updates the resource to show that it is no longer taken.
If the SPM host is unable to write a timestamp to the lockspace on the storage domain for a given amount of time, the host’s instance of Sanlock requests that the VDSM process release its resources. If the VDSM process responds, its resources are released, and the SPM resource in the lockspace can be taken by another host.
If VDSM on the SPM host does not respond to requests to release resources, Sanlock on the host kills the VDSM process. If the kill command is unsuccessful, Sanlock escalates by attempting to kill VDSM using SIGKILL. If the SIGKILL is unsuccessful, Sanlock depends on the watchdog daemon to reboot the host.
Every time VDSM on the host renews its hostid and writes a timestamp to the lockspace, the watchdog daemon receives a pet. When VDSM is unable to do so, the watchdog daemon is no longer being petted. After the watchdog daemon has not received a pet for a given amount of time, it reboots the host. This final level of escalation, if reached, guarantees that the SPM resource is released, and can be taken by another host.
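The escalation ladder and the watchdog timer described above can be modeled with a short sketch. It is illustrative only and does not call Sanlock: the names `escalate` and `Watchdog`, the step labels, and the timeout value are hypothetical, and each boolean argument simply stands in for whether that step succeeded.

```python
def escalate(release_ok, kill_ok, sigkill_ok):
    """Return the ordered steps taken until the SPM resource is released."""
    steps = ["request_release"]      # ask VDSM to release its resources
    if release_ok:
        return steps
    steps.append("kill")             # kill the VDSM process
    if kill_ok:
        return steps
    steps.append("sigkill")          # escalate to SIGKILL
    if sigkill_ok:
        return steps
    steps.append("watchdog_reboot")  # final level: the host is rebooted
    return steps


class Watchdog:
    """Reboots the host if it has not been petted within the timeout."""

    def __init__(self, timeout_seconds, now=0.0):
        self.timeout_seconds = timeout_seconds  # hypothetical value
        self.last_pet = now

    def pet(self, now):
        # Called each time the hostid is renewed and a timestamp is written.
        self.last_pet = now

    def should_reboot(self, now):
        return now - self.last_pet > self.timeout_seconds
```

Because the last step does not depend on VDSM cooperating, every path through `escalate` ends with the resource released, which mirrors the guarantee stated in the text.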
2.11. Thin Provisioning and Storage Over-Commitment
The Red Hat Virtualization Manager provides provisioning policies to optimize storage usage within the virtualization environment. A thin provisioning policy allows you to over-commit storage resources, provisioning storage based on the actual storage usage of your virtualization environment.
Storage over-commitment is the allocation of more storage to virtual machines than is physically available in the storage pool. Generally, virtual machines use less storage than what has been allocated to them. Thin provisioning allows a virtual machine to operate as if the storage defined for it has been completely allocated, when in fact only a fraction of the storage has been allocated.
While the Red Hat Virtualization Manager provides its own thin provisioning function, you should use the thin provisioning functionality of your storage back-end if it provides one.
To support storage over-commitment, VDSM defines a threshold which compares logical storage allocation with actual storage usage. This threshold is used to make sure that the data written to a disk image is smaller than the logical volume that backs the disk image. QEMU identifies the highest offset written to in a logical volume, which indicates the point of greatest storage use. VDSM monitors the highest offset marked by QEMU to ensure that the usage does not cross the defined threshold. So long as VDSM continues to indicate that the highest offset remains below the threshold, the Red Hat Virtualization Manager knows that the logical volume in question has sufficient storage to continue operations.
When QEMU indicates that usage has risen to exceed the threshold limit, VDSM communicates to the Manager that the disk image will soon reach the size of its logical volume. The Red Hat Virtualization Manager requests that the SPM host extend the logical volume. This process can be repeated as long as the data storage domain for the data center has available space. When the data storage domain runs out of available free space, you must manually add storage capacity to expand it.
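The threshold check can be illustrated with a small sketch. The function names, the 50% threshold ratio, and the 1 GiB extension chunk are assumptions chosen for illustration; the values VDSM actually uses are not stated in this section.

```python
GIB = 1024 ** 3

def needs_extension(highest_offset, lv_size, threshold_ratio=0.5):
    """True if the highest written offset has crossed the threshold."""
    return highest_offset > lv_size * threshold_ratio

def extend_until_fits(highest_offset, lv_size, domain_free,
                      chunk=GIB, threshold_ratio=0.5):
    """Extend the logical volume in chunks while the threshold is exceeded
    and the storage domain still has free space.

    Returns (new_lv_size, remaining_domain_free).
    """
    while (needs_extension(highest_offset, lv_size, threshold_ratio)
           and domain_free >= chunk):
        lv_size += chunk
        domain_free -= chunk
    return lv_size, domain_free
```

With these assumed values, a 1 GiB logical volume whose highest written offset is 900 MiB is extended once, to 2 GiB. If the domain has no free space, the volume is left unchanged, which corresponds to the case where storage capacity must be added manually.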
2.12. Logical Volume Extension
The Red Hat Virtualization Manager uses thin provisioning to overcommit the storage available in a storage pool, and allocates more storage than is physically available. Virtual machines write data as they operate. A virtual machine with a thinly-provisioned disk image will eventually write more data than the logical volume backing its disk image can hold. When this happens, logical volume extension is used to provide additional storage and facilitate the continued operations for the virtual machine.
Red Hat Virtualization provides a thin provisioning mechanism over LVM. When using QCOW2 formatted storage, Red Hat Virtualization relies on the host system process qemu-kvm to map storage blocks on disk to logical blocks in a sequential manner. This allows, for example, the definition of a logical 100 GB disk backed by a 1 GB logical volume. When qemu-kvm crosses a usage threshold set by VDSM, the local VDSM instance makes a request to the SPM for the logical volume to be extended by another one gigabyte. VDSM on the host running a virtual machine in need of volume extension notifies the SPM VDSM that more space is required. The SPM extends the logical volume and the SPM VDSM instance causes the host VDSM to refresh volume group information and recognize that the extend operation is complete. The host can continue operations.
Logical Volume extension does not require that a host know which other host is the SPM; it could even be the SPM itself. The storage extension communication is done via a storage mailbox. The storage mailbox is a dedicated logical volume on the data storage domain. A host that needs the SPM to extend a logical volume writes a message in an area designated to that particular host in the storage mailbox. The SPM periodically reads the incoming mail, performs requested logical volume extensions, and writes a reply in the outgoing mail. After sending the request, a host monitors its incoming mail for responses every two seconds. When the host receives a successful reply to its logical volume extension request, it refreshes the logical volume map in device mapper to recognize the newly allocated storage.
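The mailbox exchange can be pictured with an in-memory stand-in. This is a sketch of the message flow only: the class `StorageMailbox`, its method names, and the dictionary layout are hypothetical and do not reflect the on-disk format of the mailbox logical volume.

```python
class StorageMailbox:
    """In-memory stand-in for the mailbox logical volume."""

    def __init__(self):
        self.requests = {}  # host_id -> (lv_name, extra_bytes)
        self.replies = {}   # host_id -> (lv_name, new_size)

    def request_extension(self, host_id, lv_name, extra_bytes):
        # Each host writes only to the area designated for it.
        self.requests[host_id] = (lv_name, extra_bytes)

    def spm_process(self, lv_sizes):
        """SPM side: read pending requests, extend the volumes, and reply."""
        for host_id, (lv_name, extra) in list(self.requests.items()):
            lv_sizes[lv_name] += extra
            self.replies[host_id] = (lv_name, lv_sizes[lv_name])
            del self.requests[host_id]

    def poll_reply(self, host_id):
        """Host side: return the reply if one has arrived, else None."""
        return self.replies.pop(host_id, None)
```

The requesting host never addresses the SPM directly. It only writes to and polls its own slot, which is why it does not need to know which host currently holds the SPM role.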
When the physical storage available to a storage pool is nearly exhausted, multiple images can run out of usable storage with no means to replenish their resources. A storage pool that exhausts its storage causes QEMU to return an enospc error, which indicates that the device no longer has any storage available. At this point, running virtual machines are automatically paused and manual intervention is required to add a new LUN to the volume group.
When a new LUN is added to the volume group, the Storage Pool Manager automatically distributes the additional storage to logical volumes that need it. The automatic allocation of additional resources allows the relevant virtual machines to automatically continue operations uninterrupted or resume operations if stopped.
2.13. The Effect of Storage Domain Actions on Storage Capacity
- Power on, power off, and reboot a stateless virtual machine
- These three processes affect the copy-on-write (COW) layer in a stateless virtual machine. For more information, see the Stateless row of the Virtual Machine General Settings table in the Virtual Machine Management Guide.
- Create a storage domain
Creating a block storage domain results in files with the same names as the seven LVs shown below, and initially should take less capacity.
ids      64f87b0f-88d6-49e9-b797-60d36c9df497 -wi-ao---- 128.00m
inbox    64f87b0f-88d6-49e9-b797-60d36c9df497 -wi-a----- 128.00m
leases   64f87b0f-88d6-49e9-b797-60d36c9df497 -wi-a-----   2.00g
master   64f87b0f-88d6-49e9-b797-60d36c9df497 -wi-ao----   1.00g
metadata 64f87b0f-88d6-49e9-b797-60d36c9df497 -wi-a----- 512.00m
outbox   64f87b0f-88d6-49e9-b797-60d36c9df497 -wi-a----- 128.00m
xleases  64f87b0f-88d6-49e9-b797-60d36c9df497 -wi-a-----   1.00g
- Delete a storage domain
- Deleting a storage domain frees up capacity on the disk by the same amount of capacity that the process deleted.
- Migrate a storage domain
- Migrating a storage domain does not use additional storage capacity. For more information about migrating storage domains, see Migrating Storage Domains Between Data Centers in the Same Environment in the Administration Guide.
- Move a virtual disk to another storage domain
Migrating a virtual disk requires enough free space to be available on the target storage domain. You can see the target domain’s approximate free space in the Administration Portal.
The storage types in the move process affect the visible capacity. For example, if you move a preallocated disk from block storage to file storage, the resulting free space may be considerably smaller than the initial free space.
Live migrating a virtual disk to another storage domain also creates a snapshot, which is automatically merged after the migration is complete. To learn more about moving virtual disks, see Moving a Virtual Disk in the Administration Guide.
- Pause a storage domain
- Pausing a storage domain does not use any additional storage capacity.
- Create a snapshot of a virtual machine
Creating a snapshot of a virtual machine can affect the storage domain capacity.
- Creating a live snapshot uses memory snapshots by default and generates two additional volumes per virtual machine. The first volume is the sum of the memory, video memory, and 200 MB of buffer. The second volume contains the virtual machine configuration, which is several MB in size. When using block storage, rounding up occurs to the nearest unit Red Hat Virtualization can provide.
- Creating an offline snapshot initially consumes 1 GB of block storage and is dynamic up to the size of the disk.
- Cloning a snapshot creates a new disk the same size as the original disk.
- Committing a snapshot removes all child volumes, depending on where in the chain the commit occurs.
- Deleting a snapshot eventually removes the child volume for each disk and is only supported with a running virtual machine.
- Previewing a snapshot creates a temporary volume per disk, so sufficient capacity must be available to allow the creation of the preview.
- Undoing a snapshot preview removes the temporary volume created by the preview.
- Attach and remove direct LUNs
- Attaching and removing direct LUNs does not affect the storage domain since they are not a storage domain component. For more information, see Overview of Live Storage Migration in the Administration Guide.
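As a worked example of the live snapshot sizing rule given earlier in this section (memory plus video memory plus a 200 MB buffer), the following sketch computes the size of the first additional volume. The function name and the 128 MiB block unit used for rounding are assumptions; the text does not state which unit Red Hat Virtualization rounds to on block storage.

```python
MIB = 1024 ** 2

def memory_volume_size(memory_bytes, video_memory_bytes,
                       buffer_bytes=200 * MIB, block_unit=None):
    """Size of the memory volume created by a live snapshot.

    block_unit: if given, round up to a multiple of this unit
        (hypothetical value, used for block storage only).
    """
    size = memory_bytes + video_memory_bytes + buffer_bytes
    if block_unit:
        size = -(-size // block_unit) * block_unit  # ceiling division
    return size
```

For a virtual machine with 4096 MiB of memory and 16 MiB of video memory, the unrounded result is 4312 MiB. With the assumed 128 MiB unit, it rounds up to 4352 MiB.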
Chapter 3. Networking
3.1. Host networking
On the data link layer (layer 2), RHV enables the configuration of Linux bonds to connect to VLANs and define the MTU for network interfaces. These networks can be shared via Linux bridges to virtual machines.
For SR-IOV, you can configure the number of virtual functions and their mapping to logical networks.
FCoE manages its own VLANs. These FCoE managed VLANs are used exclusively for storage access. They are invisible to the Manager and any virtual machines.
iSCSI manages iSCSI bonds. They are not part of RHV’s visible host network configuration. You can use iSCSI without iSCSI bonds, which are useful only to improve the reliability of iSCSI storage.
All hosts in a cluster must use either IPv4 or IPv6 as the IP stack for their management network. Dual stack is not supported.
You can configure the DNS resolver that the host uses.
It is also possible to manage network roles and QoS.
3.2. Virtual machine networking types
In RHV, virtual NICs of virtual machines can connect to the following types of networks:
- Linux bridges
- SR-IOV NICs
- RHV’s internal OVN
The following diagram shows the structure of these three approaches, where:
- Host 1 represents Linux bridges
- Host 2 represents SR-IOV NICs
- Host 3 represents OVN
Feature | Linux bridge | SR-IOV | RHV internal OVN |
---|---|---|---|
Isolation from physical host networks | Layer 3, Separate IP network possible | Layer 2, Separate VLANs possible | Isolated |
Live Migration | x | x | x |
QoS | x | | |
Port Mirroring | x | | |
configuration of plugged vNIC | x | x | |
MAC address management | x | x | x |
MTU propagation | x | x | |
VLAN filtering, might require configuration on the physical switch | x | x | Technology Preview |
MAC Spoofing Protection | x | x | |
IP Spoofing Protection | x | x | |
Predefined Network Filters | x | | |
Custom Layer 3/4 Filtering | x | | |
NAT | | | |
DHCP/Router Advertisements | x | | |
Layer 3 Router | x | | |
Performance | | | |
virtual machine network data encapsulation | flat, VLAN | flat, VLAN | Stable: GENEVE; Technology Preview: flat, VLAN |
Networking choices for various scenarios
Linux bridge is the default, and the most proven option. It fits most use cases.
For scenarios that require very low network latency or a large number of Ethernet frames, consider investing in SR-IOV. Keep in mind, however, that SR-IOV requires hardware support and additional configuration steps.
RHV’s internal OVN networks enable virtual machines to communicate with each other without any manual network configuration.
The Manager provides only a subset of software-defined networking (SDN) features and user interfaces. To use all SDN features, similar to RHV’s internal OVN or a third-party SDN, you need to use an additional client, such as CloudForms.
You can combine all network types on a single host and connect them to the same virtual machine.
3.2.1. Interaction with guest operating system
RHV supports the initial configuration of a virtual machine by providing configuration data via cloud-init. If the qemu-guest-agent runs inside the virtual machine, RHV can report the IP addresses of the virtual machine.
If the virtual machine uses a VirtIO NIC, the MTU of the RHV logical network is provided to the guest operating system. The guest operating system can pick up the MTU from DHCPv4 or IPv6 router advertisements if the logical network supports these advertisements.
3.2.2. Host and virtual machine networking
Linux bridge networking separates virtual machine and host networking on OSI layer 3. Therefore the networking configuration, including VLAN, bonding, and MTU, is shared between the host and its virtual machines.
To reduce their attack surface, hosts should not assign IP addresses to VLANs that are connected to virtual machines. By not assigning IP addresses, the hosts can avoid potential confusion caused by virtual machine traffic.
The IP address associated with the Linux bridge is not required to be within the same subnet as the virtual machines that use the bridge for connectivity. If the bridge is assigned an IP address on the same subnet as the virtual machines that use it, the host is addressable within the logical network by virtual machines. As a rule, it is not recommended to run network exposed services on a virtualization host.
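Whether a bridge address shares a subnet with a virtual machine can be checked with Python's standard `ipaddress` module. The function name and the addresses below are illustrative, and sharing a subnet is used here only as a proxy for the host being addressable from that virtual machine.

```python
import ipaddress

def host_addressable_from_vm(bridge_ip_with_prefix, vm_ip):
    """True if the VM address falls within the bridge's IP subnet."""
    bridge = ipaddress.ip_interface(bridge_ip_with_prefix)
    return ipaddress.ip_address(vm_ip) in bridge.network
```

For example, a bridge configured as `10.64.32.134/23` shares a subnet with a virtual machine at `10.64.33.20`, but not with one at `10.64.34.5`.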
3.3. Network Architecture
Networking in Red Hat Virtualization includes basic networking, networking within a cluster, and host networking configurations.
- Basic networking
- The basic hardware and software elements that facilitate networking.
- Networking within a cluster
- Network interactions among cluster objects such as hosts, logical networks and virtual machines.
- Host networking configurations
- Supported configurations for networking within a host.
A well designed and built network ensures that high bandwidth tasks receive adequate bandwidth, that latency does not impact user interactions, and that virtual machines can be successfully migrated within a migration domain. A poorly built network can cause unacceptable latency, and migration and cloning failures that result from network flooding.
An alternative method of managing your network is by integrating with Cisco Application Centric Infrastructure (ACI), by configuring Red Hat Virtualization on Cisco’s Application Policy Infrastructure Controller (APIC) version 3.1(1) and later according to Cisco’s documentation. On the Red Hat Virtualization side, all that is required is connecting the hosts' NICs to the network and the virtual machines' vNICs to the required network. The remaining configuration tasks are managed by Cisco ACI.
3.4. Basic Networking Terms
Red Hat Virtualization provides networking functionality between virtual machines, virtualization hosts, and wider networks using:
- Logical networks
- A network interface controller (NIC)
- A Linux bridge
- A Bond
- A virtual network interface controller (vNIC)
- A virtual LAN (VLAN)
NICs, Linux bridges, and vNICs enable network communication between hosts, virtual machines, local area networks, and the Internet. Bonds and VLANs are optionally implemented to enhance security, fault tolerance, and network capacity.
3.5. Network Interface Controller
The network interface controller (NIC) is a network adapter or LAN adapter that connects a computer to a computer network. The NIC operates on both the physical and data link layers of the machine and enables network connectivity. All virtualization hosts in a Red Hat Virtualization environment have at least one NIC, though it is more common for a host to have two or more NICs.
One physical NIC can have multiple virtual NICs (vNICs) logically connected to it. A virtual NIC acts as a network interface for a virtual machine. To distinguish between a vNIC and the NIC that supports it, the Red Hat Virtualization Manager assigns each vNIC a unique MAC address.
3.6. Linux bridge
A Linux bridge is a software device that uses packet forwarding in a packet-switched network. Bridging allows multiple network interface devices to share the connectivity of one NIC and appear on a network as separate physical devices. The bridge examines a packet’s source addresses to determine relevant target addresses. Once the target address is determined, the bridge adds the location to a table for future reference. This allows a host to redirect network traffic to virtual machine associated vNICs that are members of a bridge.
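The table-based forwarding described above can be sketched as a minimal learning bridge. This is a simplified model and not the kernel implementation: the class name, the port names, and the shortened MAC strings are hypothetical, and entry aging is omitted.

```python
class LearningBridge:
    """Minimal model of a bridge that learns where MAC addresses live."""

    def __init__(self, ports):
        self.ports = set(ports)
        self.table = {}  # MAC address -> port it was last seen on

    def forward(self, in_port, src_mac, dst_mac):
        """Learn the sender's location and return the output ports."""
        self.table[src_mac] = in_port
        out_port = self.table.get(dst_mac)
        if out_port is None:
            # Unknown destination: send out of every port except the source.
            return sorted(self.ports - {in_port})
        return [] if out_port == in_port else [out_port]
```

The first frame to an unknown destination is flooded. Once the destination has sent a frame of its own, later traffic to it goes to a single port, which is how a host directs traffic to the vNIC of one particular virtual machine.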
Custom properties can be defined for both the bridge and the Ethernet connection. VDSM passes the network definition and custom properties to the setup network hook script.
3.7. Bonds
A bond is a collection of multiple network interface cards into a single software-defined device. Because bonded network interfaces combine the transmission capability of the network interface cards included in the bond to act as a single network interface, they can provide greater transmission speed than that of a single network interface card. Also, because all network interface cards in the bond must fail for the bond itself to fail, bonding provides increased fault tolerance. However, one limitation is that the network interface cards that form a bonded network interface must be of the same make and model to ensure that all network interface cards in the bond support the same options and modes.
The packet dispersal algorithm for a bond is determined by the bonding mode used.
Modes 1, 2, 3 and 4 support both virtual machine (bridged) and non-virtual machine (bridgeless) network types. Modes 0, 5 and 6 support non-virtual machine (bridgeless) networks only.
3.8. Bonding Modes
Red Hat Virtualization uses Mode 4 by default, but supports the following common bonding modes:
Mode 0 (round-robin policy)
- Transmits packets through network interface cards in sequential order. Packets are transmitted in a loop that begins with the first available network interface card in the bond and ends with the last available network interface card in the bond. All subsequent loops then start with the first available network interface card. Mode 0 offers fault tolerance and balances the load across all network interface cards in the bond. However, Mode 0 cannot be used in conjunction with bridges, and is therefore not compatible with virtual machine logical networks.
Mode 1 (active-backup policy)
- Sets all network interface cards to a backup state while one network interface card remains active. In the event of failure in the active network interface card, one of the backup network interface cards replaces that network interface card as the only active network interface card in the bond. The MAC address of the bond in Mode 1 is visible on only one port to prevent any confusion that might otherwise be caused if the MAC address of the bond changed to reflect that of the active network interface card. Mode 1 provides fault tolerance and is supported in Red Hat Virtualization.
Mode 2 (XOR policy)
- Selects the network interface card through which to transmit packets based on the result of an XOR operation on the source and destination MAC addresses modulo network interface card slave count. This calculation ensures that the same network interface card is selected for each destination MAC address used. Mode 2 provides fault tolerance and load balancing and is supported in Red Hat Virtualization.
Mode 3 (broadcast policy)
- Transmits all packets to all network interface cards. Mode 3 provides fault tolerance and is supported in Red Hat Virtualization.
Mode 4 (IEEE 802.3ad policy)
- Creates aggregation groups in which the interfaces share the same speed and duplex settings. Mode 4 uses all network interface cards in the active aggregation group in accordance with the IEEE 802.3ad specification and is supported in Red Hat Virtualization.
Mode 5 (adaptive transmit load balancing policy)
- Ensures the distribution of outgoing traffic accounts for the load on each network interface card in the bond and that the current network interface card receives all incoming traffic. If the network interface card assigned to receive traffic fails, another network interface card is assigned to the role of receiving incoming traffic. Mode 5 cannot be used in conjunction with bridges, therefore it is not compatible with virtual machine logical networks.
Mode 6 (adaptive load balancing policy)
- Combines Mode 5 (adaptive transmit load balancing policy) with receive load balancing for IPv4 traffic without any special switch requirements. ARP negotiation is used for balancing the receive load. Mode 6 cannot be used in conjunction with bridges, therefore it is not compatible with virtual machine logical networks.
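Two of the packet dispersal policies above are simple enough to sketch. The code below follows the descriptions in this section literally and is not the Linux bonding driver's hash, which may use different bits or additional fields. The function names are hypothetical, and `BRIDGED_MODES` restates the mode list given in section 3.7.

```python
from itertools import cycle

# Modes that support virtual machine (bridged) networks, per section 3.7.
BRIDGED_MODES = {1, 2, 3, 4}

def mac_to_int(mac):
    """Convert 'aa:bb:cc:dd:ee:ff' to an integer."""
    return int(mac.replace(":", ""), 16)

def xor_select(src_mac, dst_mac, nic_count):
    """Mode 2 sketch: (source MAC XOR destination MAC) modulo NIC count."""
    return (mac_to_int(src_mac) ^ mac_to_int(dst_mac)) % nic_count

def round_robin(nics):
    """Mode 0 sketch: an endless iterator over the NICs in order."""
    return cycle(nics)
```

Because `xor_select` depends only on the two MAC addresses and the NIC count, a given source and destination pair always maps to the same NIC index, while `round_robin` spreads consecutive packets across all NICs regardless of destination.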
3.9. Switch Configuration for Bonding
Switch configurations vary per the requirements of your hardware. Refer to the deployment and networking configuration guides for your operating system.
For every type of switch, it is important to set up the switch bonding with the Link Aggregation Control Protocol (LACP) and not the Cisco Port Aggregation Protocol (PAgP).
3.10. Virtual Network Interface Cards
Virtual network interface cards (vNICs) are virtual network interfaces that are based on the physical NICs of a host. Each host can have multiple NICs, and each NIC can be a base for multiple vNICs.
When you attach a vNIC to a virtual machine, the Red Hat Virtualization Manager creates several associations between the virtual machine to which the vNIC is being attached, the vNIC itself, and the physical host NIC on which the vNIC is based. Specifically, when a vNIC is attached to a virtual machine, a new vNIC and MAC address are created on the physical host NIC on which the vNIC is based. Then, the first time the virtual machine starts after that vNIC is attached, libvirt assigns the vNIC a PCI address. The MAC address and PCI address are then used to obtain the name of the vNIC (for example, eth0) in the virtual machine.
The process for assigning MAC addresses and associating those MAC addresses with PCI addresses is slightly different when creating virtual machines based on templates or snapshots:
- If PCI addresses have already been created for a template or snapshot, the vNICs on virtual machines created based on that template or snapshot are ordered in accordance with those PCI addresses. MAC addresses are then allocated to the vNICs in that order.
- If PCI addresses have not already been created for a template, the vNICs on virtual machines created based on that template are ordered alphabetically. MAC addresses are then allocated to the vNICs in that order.
- If PCI addresses have not already been created for a snapshot, the Red Hat Virtualization Manager allocates new MAC addresses to the vNICs on virtual machines based on that snapshot.
Once created, vNICs are added to a network bridge device. The network bridge devices connect virtual machines to virtual logical networks.
Running the ip addr show command on a virtualization host shows all of the vNICs that are associated with virtual machines on that host. Also visible are any network bridges that have been created to back logical networks, and any NICs used by the host.
[root@rhev-host-01 ~]# ip addr show
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 16436 qdisc noqueue state UNKNOWN
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
    inet 127.0.0.1/8 scope host lo
    inet6 ::1/128 scope host
       valid_lft forever preferred_lft forever
2: eth0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UP qlen 1000
    link/ether 00:21:86:a2:85:cd brd ff:ff:ff:ff:ff:ff
    inet6 fe80::221:86ff:fea2:85cd/64 scope link
       valid_lft forever preferred_lft forever
3: wlan0: <NO-CARRIER,BROADCAST,MULTICAST,UP> mtu 1500 qdisc mq state DOWN qlen 1000
    link/ether 00:21:6b:cc:14:6c brd ff:ff:ff:ff:ff:ff
5: ;vdsmdummy;: <BROADCAST,MULTICAST> mtu 1500 qdisc noop state DOWN
    link/ether 4a:d5:52:c2:7f:4b brd ff:ff:ff:ff:ff:ff
6: bond0: <BROADCAST,MULTICAST,MASTER> mtu 1500 qdisc noop state DOWN
    link/ether 00:00:00:00:00:00 brd ff:ff:ff:ff:ff:ff
7: bond4: <BROADCAST,MULTICAST,MASTER> mtu 1500 qdisc noop state DOWN
    link/ether 00:00:00:00:00:00 brd ff:ff:ff:ff:ff:ff
8: bond1: <BROADCAST,MULTICAST,MASTER> mtu 1500 qdisc noop state DOWN
    link/ether 00:00:00:00:00:00 brd ff:ff:ff:ff:ff:ff
9: bond2: <BROADCAST,MULTICAST,MASTER> mtu 1500 qdisc noop state DOWN
    link/ether 00:00:00:00:00:00 brd ff:ff:ff:ff:ff:ff
10: bond3: <BROADCAST,MULTICAST,MASTER> mtu 1500 qdisc noop state DOWN
    link/ether 00:00:00:00:00:00 brd ff:ff:ff:ff:ff:ff
11: ovirtmgmt: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UNKNOWN
    link/ether 00:21:86:a2:85:cd brd ff:ff:ff:ff:ff:ff
    inet 10.64.32.134/23 brd 10.64.33.255 scope global ovirtmgmt
    inet6 fe80::221:86ff:fea2:85cd/64 scope link
       valid_lft forever preferred_lft forever
The console output from the command shows several devices: one loopback device (lo), one Ethernet device (eth0), one wireless device (wlan0), one VDSM dummy device (;vdsmdummy;), five bond devices (bond0, bond4, bond1, bond2, bond3), and one network bridge (ovirtmgmt).
vNICs are all members of a network bridge device and logical network. Bridge membership can be displayed using the brctl show command:
[root@rhev-host-01 ~]# brctl show
bridge name     bridge id               STP enabled     interfaces
ovirtmgmt       8000.e41f13b7fdd4       no              vnet002
                                                        vnet001
                                                        vnet000
                                                        eth0
The console output from the brctl show command shows that the virtio vNICs are members of the ovirtmgmt bridge. All of the virtual machines that the vNICs are associated with are connected to the ovirtmgmt logical network. The eth0 NIC is also a member of the ovirtmgmt bridge. The eth0 device is cabled to a switch that provides connectivity beyond the host.
3.11. Virtual LAN (VLAN)
A VLAN (Virtual LAN) is an attribute that can be applied to network packets. Network packets can be "tagged" into a numbered VLAN. A VLAN is a security feature used to isolate network traffic at the switch level. VLANs are separate and mutually exclusive. The Red Hat Virtualization Manager is VLAN-aware and able to tag and redirect VLAN traffic; however, VLAN implementation requires a switch that supports VLANs.
At the switch level, ports are assigned a VLAN designation. A switch applies a VLAN tag to traffic originating from a particular port, marking the traffic as part of a VLAN, and ensures that responses carry the same VLAN tag. A VLAN can extend across multiple switches. VLAN tagged network traffic on a switch is undetectable except by machines connected to a port designated with the correct VLAN. A given port can be tagged into multiple VLANs, which allows traffic from multiple VLANs to be sent to a single port, to be deciphered using software on the machine that receives the traffic.
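To make the idea of a tag concrete, the sketch below builds and parses the 4-byte IEEE 802.1Q tag that is inserted into an Ethernet frame. The function names are illustrative. The layout (a 16-bit TPID of 0x8100, followed by a 3-bit priority, a 1-bit drop eligible indicator, and a 12-bit VLAN ID) comes from the 802.1Q standard rather than from this document.

```python
import struct

TPID_8021Q = 0x8100  # EtherType value that identifies an 802.1Q tag

def vlan_tag(vlan_id, priority=0, dei=0):
    """Build the 4-byte 802.1Q tag for a VLAN ID in the range 1-4094."""
    if not 1 <= vlan_id <= 4094:
        raise ValueError("VLAN ID must be between 1 and 4094")
    tci = (priority << 13) | (dei << 12) | vlan_id
    return struct.pack("!HH", TPID_8021Q, tci)  # network byte order

def parse_vlan_id(tag):
    """Extract the VLAN ID from a 4-byte 802.1Q tag."""
    tpid, tci = struct.unpack("!HH", tag)
    if tpid != TPID_8021Q:
        raise ValueError("not an 802.1Q tag")
    return tci & 0x0FFF  # low 12 bits hold the VLAN ID
```

A machine on a port that carries several VLANs reads the VLAN ID of each frame in this way to decide which logical network the frame belongs to.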
3.12. Network Labels
You can use network labels to simplify several administrative tasks associated with creating and administering logical networks and associating those logical networks with physical host network interfaces and bonds.
A network label is a plain text, human readable label that you can attach to a logical network or a physical host network interface. Follow these rules when creating a label:
- There is no limit on the length of a label.
- You must use a combination of lowercase and uppercase letters, underscores and hyphens.
- You cannot use spaces or special characters.
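The rules above can be expressed as a simple validation check. This is a sketch based only on the rules as worded here, not the Manager's actual validation. The function name is hypothetical, and because digits are not mentioned in the rules, accepting them is left as an explicit, off-by-default assumption.

```python
import re

def is_valid_label(label, allow_digits=False):
    """Check a network label against the rules listed above.

    allow_digits: assumption switch; the rules do not mention digits.
    """
    pattern = r"[A-Za-z0-9_-]+" if allow_digits else r"[A-Za-z_-]+"
    return re.fullmatch(pattern, label) is not None
```

For example, `Prod_net-A` passes, while `prod net` and `net!` are rejected because they contain a space and a special character.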
Attaching a label to a logical network or physical host network interface creates an association with other logical networks or physical host network interfaces to which the same label has been attached:
Network Label Associations
- When you attach a label to a logical network, that logical network will be automatically associated with any physical host network interfaces with the given label.
- When you attach a label to a physical host network interface, any logical networks with the given label will be automatically associated with that physical host network interface.
- Changing the label attached to a logical network or physical host network interface is the same as removing a label and adding a new label. The association between related logical networks or physical host network interfaces is updated.
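The association rules above amount to a label match between logical networks and host network interfaces. The following is a minimal sketch of that idea, not the Manager's implementation; the function name and the dictionary layout are assumptions made for illustration.

```python
def associate_by_label(network_labels, nic_labels):
    """Map each labeled logical network to the host NICs sharing its label.

    network_labels: dict of logical network name -> label (or None)
    nic_labels:     dict of host NIC or bond name -> label (or None)
    Both structures are hypothetical and exist only for this sketch.
    """
    associations = {}
    for network, label in network_labels.items():
        if label is None:
            continue  # unlabeled networks are not associated automatically
        associations[network] = sorted(
            nic for nic, nic_label in nic_labels.items() if nic_label == label
        )
    return associations


networks = {"storage_net": "storage", "display_net": "display", "misc_net": None}
nics = {"eth1": "storage", "eth2": "display", "bond0": "storage"}
print(associate_by_label(networks, nics))
```

Changing a label in either dictionary and calling the function again models the "remove and add" behavior described above: the associations are simply recomputed.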
Network Labels and Clusters
- When a labeled logical network is added to a cluster and there is a physical host network interface in that cluster with the same label, the logical network is automatically added to that physical host network interface.
- When a labeled logical network is detached from a cluster and there is a physical host network interface in that cluster with the same label, the logical network is automatically detached from that physical host network interface.
Network Labels and Logical Networks With Roles
When a labeled logical network is assigned to act as a display network or migration network, that logical network is then configured on the physical host network interface using DHCP so that the logical network can be assigned an IP address.
Setting a label on a role network (for instance, a migration network or a display network) causes a mass deployment of that network on all hosts. Such mass additions of networks are achieved through the use of DHCP. This method was chosen over typing in static addresses because entering many static IP addresses by hand does not scale.
3.13. Cluster Networking
Cluster level networking objects include:
- Clusters
- Logical Networks
A data center is a logical grouping of multiple clusters and each cluster is a logical group of multiple hosts. The following diagram depicts the contents of a single cluster.
Figure 3.1. Networking within a cluster
Hosts in a cluster all have access to the same storage domains. Hosts in a cluster also have logical networks applied to the cluster. For a virtual machine logical network to become operational for use with virtual machines, the network must be defined and implemented for each host in the cluster using the Red Hat Virtualization Manager. Other logical network types can be implemented on only the hosts that use them.
Multi-host network configuration automatically applies any updated network settings to all of the hosts within the data center to which the network is assigned.
3.14. Logical Networks
Logical networking enables the Red Hat Virtualization environment to separate network traffic by type. For example, the ovirtmgmt network is created by default during the installation of Red Hat Virtualization to be used for management communication between the Manager and hosts. A typical use for logical networks is to group network traffic with similar requirements and usage together. In many cases, a storage network and a display network are created by an administrator to isolate traffic of each respective type for optimization and troubleshooting.
Red Hat Virtualization supports the following logical network types:
- Logical networks that carry only host network traffic, such as storage or migration traffic
- Logical networks that carry host and virtual machine network traffic
- Logical networks that carry only virtual machine network traffic, such as OVN networks
Logical networks are defined at the data center level.
If necessary, the Red Hat Virtualization Manager automatically instantiates logical networks on the host, depending on the type of virtual machine network. For more information, see virtual machine networking types.
Example 3.1. Example usage of a logical network.
A system administrator wants to use a logical network to test a web server.
There are two hosts called Red Host and White Host in a cluster called Pink Cluster in a data center called Purple Data Center. Both Red Host and White Host have been using the default logical network, ovirtmgmt, for all networking functions. The system administrator responsible for Pink Cluster decides to isolate network testing for a web server by placing the web server and some client virtual machines on a separate logical network. She decides to call the new logical network test_logical_network.
- She creates a new logical network, named test_logical_network, for the Purple Data Center with VLAN tagging enabled. VLAN tagging is necessary when you have two logical networks connected to the same physical NIC. She applies test_logical_network to the Pink Cluster.
- In Red Host, she attaches test_logical_network to a physical NIC that will be included in the bridge that RHV creates. The network is non-operational until she sets up the corresponding bridge in all hosts in the cluster by adding a physical network interface on each host in the Pink cluster to test_logical_network. She repeats this step for White Host. When both White Host and Red Host have the test_logical_network logical network bridged to a physical network interface, the test_logical_network becomes operational and is ready to be used by virtual machines.
- She associates the virtual machines on the Red Host and White Host with the new network.
3.15. Required Networks, Optional Networks, and Virtual Machine Networks
A required network is a logical network that must be available to all hosts in a cluster. When a host’s required network becomes non-operational, virtual machines running on that host are migrated to another host; the extent of this migration is dependent upon the chosen scheduling policy. This is beneficial if you have virtual machines running mission critical workloads.
An optional network is a logical network that has not been explicitly declared as Required. Optional networks can be implemented on only the hosts that use them. The presence or absence of optional networks does not affect the Operational status of a host. When a non-required network becomes non-operational, the virtual machines running on the network are not migrated to another host. This prevents unnecessary I/O overload caused by mass migrations. Note that when a logical network is created and added to clusters, the Required box is selected by default.
To change a network’s Required designation, from the Administration Portal, select a network, click the Cluster tab, and click the Manage Networks button.
A virtual machine network, called a VM network in the user interface, is a logical network designated to carry only virtual machine network traffic. A virtual machine network can be required or optional. Virtual machines that use an optional virtual machine network only start on hosts with that network.
3.16. Port Mirroring
Port mirroring copies layer 3 network traffic on a given logical network and host to a virtual interface on a virtual machine. This virtual machine can be used for network debugging and tuning, intrusion detection, and monitoring the behavior of other virtual machines on the same host and logical network.
The only traffic copied is internal to one logical network on one host. There is no increase in traffic on the network external to the host. However, a virtual machine with port mirroring enabled uses more host CPU and RAM than other virtual machines.
Port mirroring is enabled or disabled in the vNIC profiles of logical networks, and has the following limitations:
- Hot linking vNICs with a profile that has port mirroring enabled is not supported.
- Port mirroring cannot be altered when the vNIC profile is attached to a virtual machine.
Given the above limitations, it is recommended that you enable port mirroring on an additional, dedicated vNIC profile.
Enabling port mirroring reduces the privacy of other network users.
3.17. Host Networking Configurations
Reading Cluster Networking first can help you understand these networking configurations.
Common types of networking configurations for virtualization hosts include:
Bridge and NIC configuration.
This configuration uses a Linux bridge to connect one or more virtual machines to the host’s NIC.
An example of this configuration is the automatic creation of the ovirtmgmt network when installing the Red Hat Virtualization Manager. Then, during host installation, the Red Hat Virtualization Manager installs VDSM on the host. The VDSM installation process creates the ovirtmgmt bridge, which obtains the host’s IP address to enable communication with the Manager.
Important: All hosts in a cluster must use either IPv4 or IPv6 as the IP stack for their management network. Dual stack is not supported.
Bridge, VLAN, and NIC configuration.
A VLAN can be included in the bridge and NIC configuration to provide a secure channel for data transfer over the network and to support connecting multiple bridges to a single NIC using multiple VLANs.
Bridge, Bond, and VLAN configuration.
A bond creates a logical link that combines two (or more) physical Ethernet links. The resultant benefits include NIC fault tolerance and potential bandwidth extension, depending on the bonding mode.
Multiple Bridge, Multiple VLAN, and NIC configuration.
This configuration connects a NIC to multiple VLANs.
For example, to connect a single NIC to two VLANs, the network switch can be configured to pass network traffic that has been tagged into one of the two VLANs to one NIC on the host. The host uses two vNICs to separate VLAN traffic, one for each VLAN. Traffic tagged into either VLAN then connects to a separate bridge by having the appropriate vNIC as a bridge member. Each bridge, in turn, connects to multiple virtual machines.
Note: You can also bond multiple NICs to facilitate a connection with multiple VLANs. Each VLAN in this configuration is defined over the bond comprising the multiple NICs. Each VLAN connects to an individual bridge and each bridge connects to one or more guests.
Chapter 4. Power Management
4.1. Introduction to Power Management and Fencing
The Red Hat Virtualization environment is most flexible and resilient when power management and fencing have been configured. Power management allows the Red Hat Virtualization Manager to control host power cycle operations, most importantly to reboot hosts on which problems have been detected. Fencing is used to isolate problem hosts from a functional Red Hat Virtualization environment by rebooting them, in order to prevent performance degradation. Fenced hosts can then be returned to responsive status through administrator action and be reintegrated into the environment.
Power management and fencing make use of special dedicated hardware in order to restart hosts independently of host operating systems. The Red Hat Virtualization Manager connects to a power management device using a network IP address or hostname. In the context of Red Hat Virtualization, a power management device and a fencing device are the same thing.
4.2. Power Management by Proxy in Red Hat Virtualization
The Red Hat Virtualization Manager does not communicate directly with fence agents. Instead, the Manager uses a proxy to send power management commands to a host power management device. The Manager uses VDSM to execute power management device actions, so another host in the environment is used as a fencing proxy.
You can select between:
- Any host in the same cluster as the host requiring fencing.
- Any host in the same data center as the host requiring fencing.
A viable fencing proxy host has a status of either UP or Maintenance.
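The proxy selection rule can be sketched as a filter over the hosts known to the Manager. This is an illustrative sketch only; the dictionary keys and the function name are hypothetical and do not correspond to a Manager API.

```python
def fencing_proxy_candidates(hosts, target_name, scope="cluster"):
    """Return names of hosts that could act as fencing proxy for the target.

    hosts: list of dicts with hypothetical keys
           'name', 'cluster', 'data_center', 'status'.
    scope: 'cluster' or 'data_center', mirroring the two choices above.
    """
    if scope not in ("cluster", "data_center"):
        raise ValueError("scope must be 'cluster' or 'data_center'")
    target = next(h for h in hosts if h["name"] == target_name)
    return [
        h["name"]
        for h in hosts
        if h["name"] != target_name              # a host cannot proxy for itself
        and h[scope] == target[scope]            # same cluster or same data center
        and h["status"] in ("UP", "Maintenance")  # viable proxy statuses
    ]


example_hosts = [
    {"name": "red", "cluster": "pink", "data_center": "purple", "status": "NonResponsive"},
    {"name": "white", "cluster": "pink", "data_center": "purple", "status": "UP"},
    {"name": "green", "cluster": "teal", "data_center": "purple", "status": "Maintenance"},
]
print(fencing_proxy_candidates(example_hosts, "red", scope="cluster"))
print(fencing_proxy_candidates(example_hosts, "red", scope="data_center"))
```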
4.3. Power Management
The Red Hat Virtualization Manager is capable of rebooting hosts that have entered a non-operational or non-responsive state, as well as preparing to power off under-utilized hosts to save power. This functionality depends on a properly configured power management device. The Red Hat Virtualization environment supports the following power management devices:
- American Power Conversion (apc)
- IBM Bladecenter (Bladecenter)
- Cisco Unified Computing System (cisco_ucs)
- Dell Remote Access Card 5 (drac5)
- Dell Remote Access Card 7 (drac7)
- Electronic Power Switch (eps)
- HP BladeSystem (hpblade)
- Integrated Lights Out (ilo, ilo2, ilo3, ilo4)
- Intelligent Platform Management Interface (ipmilan)
- Remote Supervisor Adapter (rsa)
- Fujitsu-Siemens RSB (rsb)
- Western Telematic, Inc (wti)
HP servers should use ilo3 or ilo4, Dell servers use drac5 or Integrated Dell Remote Access Controllers (idrac), and IBM servers use ipmilan. Integrated Management Module (IMM) uses the IPMI protocol, and therefore IMM users can use ipmilan.
APC 5.x power management devices are not supported by the apc fence agent. Use the apc_snmp fence agent instead.
In order to communicate with the listed power management devices, the Red Hat Virtualization Manager makes use of fence agents. The Red Hat Virtualization Manager allows administrators to configure a fence agent for the power management device in their environment with parameters the device will accept and respond to. Basic configuration options can be configured using the graphical user interface. Special configuration options can also be entered, and are passed un-parsed to the fence device. Special configuration options are specific to a given fence device, while basic configuration options are for functionalities provided by all supported power management devices. The basic functionalities provided by all power management devices are:
- Status: check the status of the host.
- Start: power on the host.
- Stop: power down the host.
- Restart: restart the host. Actually implemented as stop, wait, status, start, wait, status.
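The restart sequence listed above (stop, wait, status, start, wait, status) can be sketched as follows. The FakePowerDevice class is a hypothetical stand-in for a power management device; real fence agents are separate programs with their own options, which this sketch does not model.

```python
import time


class FakePowerDevice:
    """Hypothetical stand-in for a power management device."""

    def __init__(self):
        self.powered_on = True

    def stop(self):
        self.powered_on = False

    def start(self):
        self.powered_on = True

    def status(self):
        return "on" if self.powered_on else "off"


def restart(device, wait_seconds=0.0):
    """Restart a host as: stop, wait, status, start, wait, status."""
    steps = []
    device.stop()
    steps.append("stop")
    time.sleep(wait_seconds)
    steps.append("wait")
    if device.status() != "off":
        raise RuntimeError("host did not power off")
    steps.append("status")
    device.start()
    steps.append("start")
    time.sleep(wait_seconds)
    steps.append("wait")
    if device.status() != "on":
        raise RuntimeError("host did not power on")
    steps.append("status")
    return steps


print(restart(FakePowerDevice()))
```

Checking the status after each power operation is what lets the caller confirm that the host was actually powered off before it is powered on again.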
Best practice is to test the power management configuration once when initially configuring it, and occasionally after that to ensure continued functionality.
Resilience is provided by properly configured power management devices in all of the hosts in an environment. Fencing agents allow the Red Hat Virtualization Manager to communicate with host power management devices to bypass the operating system on a problem host, and isolate the host from the rest of its environment by rebooting it. The Manager can then reassign the SPM role, if it was held by the problem host, and safely restart any highly available virtual machines on other hosts.
4.4. Fencing
In the context of the Red Hat Virtualization environment, fencing is a host reboot initiated by the Manager using a fence agent and performed by a power management device. Fencing allows a cluster to react to unexpected host failures as well as enforce power saving, load balancing, and virtual machine availability policies.
Fencing ensures that the role of Storage Pool Manager (SPM) is always assigned to a functional host. If the fenced host was the SPM, the SPM role is relinquished and reassigned to a responsive host. Because the host with the SPM role is the only host that is able to write data domain structure metadata, a non-responsive, un-fenced SPM host causes its environment to lose the ability to create and destroy virtual disks, take snapshots, extend logical volumes, and all other actions that require changes to data domain structure metadata.
When a host becomes non-responsive, all of the virtual machines that are currently running on that host can also become non-responsive. However, the non-responsive host retains the lock on the virtual machine hard disk images for virtual machines it is running. Attempting to start a virtual machine on a second host and assign the second host write privileges for the virtual machine hard disk image can cause data corruption.
Fencing allows the Red Hat Virtualization Manager to assume that the lock on a virtual machine hard disk image has been released; the Manager can use a fence agent to confirm that the problem host has been rebooted. When this confirmation is received, the Red Hat Virtualization Manager can start a virtual machine from the problem host on another host without risking data corruption. Fencing is the basis for highly-available virtual machines. A virtual machine that has been marked highly-available cannot be safely started on an alternate host without the certainty that doing so will not cause data corruption.
When a host becomes non-responsive, the Red Hat Virtualization Manager allows a grace period of thirty (30) seconds to pass before any action is taken, to allow the host to recover from any temporary errors. If the host has not become responsive by the time the grace period has passed, the Manager automatically begins to mitigate any negative impact from the non-responsive host. The Manager uses the fencing agent for the power management card on the host to stop the host, confirm it has stopped, start the host, and confirm that the host has been started. When the host finishes booting, it attempts to rejoin the cluster that it was a part of before it was fenced. If the issue that caused the host to become non-responsive has been resolved by the reboot, then the host is automatically set to Up status and is once again capable of starting and hosting virtual machines.
4.5. Soft-Fencing Hosts
Hosts can sometimes become non-responsive due to an unexpected problem, and though VDSM is unable to respond to requests, the virtual machines that depend upon VDSM remain alive and accessible. In these situations, restarting VDSM returns VDSM to a responsive state and resolves this issue.
"SSH Soft Fencing" is a process where the Manager attempts to restart VDSM via SSH on non-responsive hosts. If the Manager fails to restart VDSM via SSH, the responsibility for fencing falls to the external fencing agent if an external fencing agent has been configured.
Soft-fencing over SSH works as follows. Fencing must be configured and enabled on the host, and a valid proxy host (a second host, in an UP state, in the data center) must exist. When the connection between the Manager and the host times out, the following happens:
- On the first network failure, the status of the host changes to "connecting".
- The Manager then makes three attempts to ask VDSM for its status, or it waits for an interval determined by the load on the host. The length of the interval is calculated from configuration values as follows: TimeoutToResetVdsInSeconds (the default is 60 seconds) + DelayResetPerVmInSeconds (the default is 0.5 seconds) * (the count of running virtual machines on the host) + DelayResetForSpmInSeconds (the default is 20 seconds) * (1 if the host runs as SPM, or 0 if it does not). To give VDSM the maximum amount of time to respond, the Manager chooses the longer of the two options mentioned above (three attempts to retrieve the status of VDSM or the interval determined by the above formula).
- If the host does not respond when that interval has elapsed, vdsm restart is executed via SSH.
- If vdsm restart does not succeed in re-establishing the connection between the host and the Manager, the status of the host changes to Non Responsive and, if power management is configured, fencing is handed off to the external fencing agent.
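The interval formula in the steps above can be worked through with a short calculation. The function below is a sketch; its name and parameter names are invented for the example, while the default values are the ones stated for TimeoutToResetVdsInSeconds, DelayResetPerVmInSeconds, and DelayResetForSpmInSeconds.

```python
def vds_reset_interval_seconds(
    running_vms,
    is_spm,
    timeout_to_reset_vds=60.0,  # TimeoutToResetVdsInSeconds default
    delay_reset_per_vm=0.5,     # DelayResetPerVmInSeconds default
    delay_reset_for_spm=20.0,   # DelayResetForSpmInSeconds default
):
    """Return the load-dependent wait interval, in seconds."""
    return (
        timeout_to_reset_vds
        + delay_reset_per_vm * running_vms
        + delay_reset_for_spm * (1 if is_spm else 0)
    )


# A host running 10 virtual machines that also holds the SPM role:
# 60 + 0.5 * 10 + 20 * 1 = 85 seconds.
print(vds_reset_interval_seconds(running_vms=10, is_spm=True))
```

The sketch computes only the interval; the comparison against the time taken by the three status attempts, of which the Manager uses the longer, is not modeled here.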
Soft-fencing over SSH can be executed on hosts that have no power management configured. This is distinct from "fencing": fencing can be executed only on hosts that have power management configured.
4.6. Using Multiple Power Management Fencing Agents
Single agents are treated as primary agents. The secondary agent is valid when there are two fencing agents, for example for dual-power hosts in which each power switch has two agents connected to the same power switch. Agents can be of the same or different types.
Having multiple fencing agents on a host increases the reliability of the fencing procedure. For example, when the sole fencing agent on a host fails, the host will remain in a non-operational state until it is manually rebooted. The virtual machines previously running on the host will be suspended, and only fail over to another host in the cluster after the original host is manually fenced. With multiple agents, if the first agent fails, the next agent can be called.
When two fencing agents are defined on a host, they can be configured to use a concurrent or sequential flow:
- Concurrent: Both primary and secondary agents have to respond to the Stop command for the host to be stopped. If one agent responds to the Start command, the host will go up.
- Sequential: To stop or start a host, the primary agent is used first, and if it fails, the secondary agent is used.
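The difference between the two flows can be expressed as simple boolean logic. This is a sketch under the assumption that each agent either succeeds or fails at a command; the function names and the mode strings are hypothetical.

```python
def stop_succeeds(mode, primary_ok, secondary_ok):
    """Whether the host counts as stopped, given each agent's Stop result."""
    if mode == "concurrent":
        # Both agents have to respond to Stop for the host to be stopped.
        return primary_ok and secondary_ok
    if mode == "sequential":
        # The primary agent is tried first; the secondary only if it fails.
        return primary_ok or secondary_ok
    raise ValueError("mode must be 'concurrent' or 'sequential'")


def start_succeeds(mode, primary_ok, secondary_ok):
    """Whether the host counts as started, given each agent's Start result."""
    if mode not in ("concurrent", "sequential"):
        raise ValueError("mode must be 'concurrent' or 'sequential'")
    # In both flows, one responding agent is enough to bring the host up.
    return primary_ok or secondary_ok


print(stop_succeeds("concurrent", True, False))
print(stop_succeeds("sequential", False, True))
print(start_succeeds("concurrent", False, True))
```

The asymmetry in the concurrent flow matches a dual-power host: the host is only off when both power feeds are off, but it comes up as soon as either feed is on.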
Chapter 5. Load Balancing, Scheduling, and Migration
5.1. Load Balancing, Scheduling, and Migration
Individual hosts have finite hardware resources, and are susceptible to failure. To mitigate against failure and resource exhaustion, hosts are grouped into clusters, which are essentially a grouping of shared resources. A Red Hat Virtualization environment responds to changes in demand for host resources using load balancing policy, scheduling, and migration. The Manager is able to ensure that no single host in a cluster is responsible for all of the virtual machines in that cluster. Conversely, the Manager is able to recognize an underutilized host, and migrate all virtual machines off of it, allowing an administrator to shut down that host to save power.
Available resources are checked as a result of three events:
- Virtual machine start - Resources are checked to determine on which host a virtual machine will start.
- Virtual machine migration - Resources are checked in order to determine an appropriate target host.
- Time elapses - Resources are checked at a regular interval to determine whether individual host load is in compliance with cluster load balancing policy.
The Manager responds to changes in available resources by using the load balancing policy for a cluster to schedule the migration of virtual machines from one host in a cluster to another. The relationship between load balancing policy, scheduling, and virtual machine migration is discussed in the following sections.
5.2. Load Balancing Policy
Load balancing policy is set for a cluster, which includes one or more hosts that may each have different hardware parameters and available memory. The Red Hat Virtualization Manager uses a load balancing policy to determine which host in a cluster to start a virtual machine on. Load balancing policy also allows the Manager to determine when to move virtual machines from over-utilized hosts to under-utilized hosts.
The load balancing process runs once every minute for each cluster in a data center. It determines which hosts are over-utilized, which hosts are under-utilized, and which are valid targets for virtual machine migration. The determination is made based on the load balancing policy set by an administrator for a given cluster. The options for load balancing policies are VM_Evenly_Distributed, Evenly_Distributed, Power_Saving, Cluster_Maintenance, and None.
5.3. Load Balancing Policy: VM_Evenly_Distributed
A virtual machine evenly distributed load balancing policy distributes virtual machines evenly between hosts based on a count of the virtual machines. The high virtual machine count is the maximum number of virtual machines that can run on each host, beyond which qualifies as overloading the host. The VM_Evenly_Distributed policy allows an administrator to set a high virtual machine count for hosts. The maximum inclusive difference in virtual machine count between the most highly-utilized host and the least-utilized host is also set by an administrator. The cluster is balanced when every host in the cluster has a virtual machine count that falls inside this migration threshold. The administrator also sets the number of slots for virtual machines to be reserved on SPM hosts. The SPM host will have a lower load than other hosts, so this variable defines how many fewer virtual machines than other hosts it can run. If any host is running more virtual machines than the high virtual machine count and at least one host has a virtual machine count that falls outside of the migration threshold, virtual machines are migrated one by one to the host in the cluster that has the lowest CPU utilization. One virtual machine is migrated at a time until every host in the cluster has a virtual machine count that falls within the migration threshold.
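The trigger condition described above can be sketched as a check over per-host virtual machine counts. This is a simplified illustration: the function and parameter names are assumptions, the slots reserved on the SPM host are deliberately left out, and the threshold values in the example are made up.

```python
def needs_balancing(vm_counts, high_vm_count, migration_threshold):
    """Return True if the VM_Evenly_Distributed trigger condition holds.

    vm_counts: dict of host name -> number of running virtual machines
               (hypothetical structure for this sketch).
    """
    if not vm_counts:
        return False
    counts = vm_counts.values()
    # Some host runs more virtual machines than the high virtual machine count.
    over_limit = any(count > high_vm_count for count in counts)
    # The difference between the most and least loaded host exceeds the
    # threshold; the threshold is inclusive, so an equal difference is fine.
    outside_threshold = max(counts) - min(counts) > migration_threshold
    return over_limit and outside_threshold


print(needs_balancing({"red": 12, "white": 3, "blue": 4},
                      high_vm_count=10, migration_threshold=5))
```

In a fuller model, migrations would then move one virtual machine at a time and re-evaluate this condition after each move.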
5.4. Load Balancing Policy: Evenly_Distributed
Figure 5.1. Evenly Distributed Scheduling Policy
An evenly distributed load balancing policy selects the host for a new virtual machine according to lowest CPU load or highest available memory. The maximum CPU load and minimum available memory that are allowed for hosts in a cluster for a set amount of time are defined by the evenly distributed scheduling policy’s parameters. Beyond these limits the environment’s performance will degrade. The evenly distributed policy allows an administrator to set these levels for running virtual machines. If a host has reached the defined maximum CPU load or minimum available memory and the host stays there for more than the set time, virtual machines on that host are migrated one by one to the host in the cluster that has the lowest CPU load or highest available memory, depending on which parameter is being utilized. Host resources are checked once per minute, and one virtual machine is migrated at a time until the host CPU load is below the defined limit or the host available memory is above the defined limit.
5.5. Load Balancing Policy: Power_Saving
Figure 5.2. Power Saving Scheduling Policy
A power saving load balancing policy selects the host for a new virtual machine according to lowest CPU or highest available memory. The maximum CPU load and minimum available memory that are allowed for hosts in a cluster for a set amount of time are defined by the power saving scheduling policy’s parameters. Beyond these limits the environment’s performance will degrade. The power saving parameters also define the minimum CPU load and maximum available memory allowed for hosts in a cluster for a set amount of time before the continued operation of a host is considered an inefficient use of electricity. If a host has reached the maximum CPU load or minimum available memory and stays there for more than the set time, the virtual machines on that host are migrated one by one to the host that has the lowest CPU or highest available memory, depending on which parameter is being utilized. Host resources are checked once per minute, and one virtual machine is migrated at a time until the host CPU load is below the defined limit or the host available memory is above the defined limit. If the host’s CPU load falls below the defined minimum level or the host’s available memory rises above the defined maximum level, the virtual machines on that host are migrated to other hosts in the cluster as long as the other hosts in the cluster remain below maximum CPU load and above minimum available memory. When an under-utilized host is cleared of its remaining virtual machines, the Manager will automatically power down the host machine, and restart it again when load balancing requires it or there are not enough free hosts in the cluster.
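The power saving thresholds can be illustrated with a small classifier. This sketch considers CPU load only and ignores the memory parameters; the function name, parameter names, and example threshold values are all assumptions for illustration, not the policy's actual property names or defaults.

```python
def classify_host(cpu_load_pct, minutes_at_level, low_pct, high_pct, duration_minutes):
    """Classify a host under a power-saving style policy (CPU only).

    A host only counts as over- or under-utilized once it has stayed at
    that level for more than the configured duration.
    """
    if minutes_at_level <= duration_minutes:
        return "ok"
    if cpu_load_pct >= high_pct:
        return "over-utilized"   # migrate virtual machines away one by one
    if cpu_load_pct < low_pct:
        return "under-utilized"  # candidate for evacuation and power-down
    return "ok"


# Example thresholds (made up): low 20%, high 80%, duration 2 minutes.
for load, minutes in [(90, 5), (10, 5), (50, 5), (90, 1)]:
    print(load, minutes, classify_host(load, minutes, 20, 80, 2))
```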
5.6. Load Balancing Policy: None
If no load balancing policy is selected, virtual machines are started on the host within a cluster with the lowest CPU utilization and available memory. To determine CPU utilization a combined metric is used that takes into account the virtual CPU count and the CPU usage percent. This approach is the least dynamic, as the only host selection point is when a new virtual machine is started. Virtual machines are not automatically migrated to reflect increased demand on a host.
An administrator must decide which host is an appropriate migration target for a given virtual machine. Virtual machines can also be associated with a particular host using pinning. Pinning prevents a virtual machine from being automatically migrated to other hosts. For environments where resources are highly consumed, manual migration is the best approach.
5.7. Load Balancing Policy: Cluster_Maintenance
A cluster maintenance scheduling policy limits activity in a cluster during maintenance tasks. When a cluster maintenance policy is set:
- No new virtual machines may be started, except highly available virtual machines. (Users can create highly available virtual machines and start them manually.)
- In the event of host failure, highly available virtual machines will restart properly and any virtual machine can migrate.
5.8. Highly Available Virtual Machine Reservation
A highly available (HA) virtual machine reservation policy enables the Red Hat Virtualization Manager to monitor cluster capacity for highly available virtual machines. The Manager has the capability to flag individual virtual machines for High Availability, meaning that in the event of a host failure, these virtual machines will be rebooted on an alternative host. This policy balances highly available virtual machines across the hosts in a cluster. If any host in the cluster fails, the remaining hosts can support the migrating load of highly available virtual machines without affecting cluster performance. When highly available virtual machine reservation is enabled, the Manager ensures that appropriate capacity exists within a cluster for HA virtual machines to migrate in the event that their existing host fails unexpectedly.
5.9. Scheduling
In Red Hat Virtualization, scheduling refers to the way the Red Hat Virtualization Manager selects a host in a cluster as the target for a new or migrated virtual machine.
For a host to be eligible to start a virtual machine or accept a migrated virtual machine from another host, it must have enough free memory and CPUs to support the requirements of the virtual machine being started on or migrated to it. A virtual machine will not start on a host with an overloaded CPU. By default, a host’s CPU is considered overloaded if it has a load of more than 80% for 5 minutes, but these values can be changed using scheduling policies. If multiple hosts are eligible targets, one will be selected based on the load balancing policy for the cluster. For example, if the Evenly_Distributed policy is in effect, the Manager chooses the host with the lowest CPU utilization. If the Power_Saving policy is in effect, the host with the lowest CPU utilization between the maximum and minimum service levels will be selected. The Storage Pool Manager (SPM) status of a given host also affects eligibility as a target for starting virtual machines or virtual machine migration. A non-SPM host is a preferred target host; for instance, the first virtual machine started in a cluster will not run on the SPM host if the SPM role is held by a host in that cluster.
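The eligibility check and the Evenly_Distributed selection described above can be sketched as a filter followed by a minimum. The host dictionary keys and function names are hypothetical, and the preference for non-SPM hosts is not modeled; only the default overload rule (more than 80% for 5 minutes) is taken from the text.

```python
def is_cpu_overloaded(cpu_load_pct, minutes_at_load, max_load_pct=80, max_minutes=5):
    """Default rule: load above 80% sustained for 5 minutes."""
    return cpu_load_pct > max_load_pct and minutes_at_load >= max_minutes


def pick_host_evenly_distributed(hosts, vm_memory_mb, vm_cpus):
    """Return the name of the eligible host with the lowest CPU load, or None."""
    eligible = [
        h for h in hosts
        if h["free_memory_mb"] >= vm_memory_mb
        and h["free_cpus"] >= vm_cpus
        and not is_cpu_overloaded(h["cpu_load_pct"], h["minutes_at_load"])
    ]
    if not eligible:
        return None
    return min(eligible, key=lambda h: h["cpu_load_pct"])["name"]


hosts = [
    {"name": "red", "free_memory_mb": 8192, "free_cpus": 4,
     "cpu_load_pct": 35, "minutes_at_load": 10},
    {"name": "white", "free_memory_mb": 16384, "free_cpus": 8,
     "cpu_load_pct": 90, "minutes_at_load": 6},   # overloaded CPU
    {"name": "blue", "free_memory_mb": 1024, "free_cpus": 2,
     "cpu_load_pct": 5, "minutes_at_load": 1},    # not enough free memory
]
print(pick_host_evenly_distributed(hosts, vm_memory_mb=4096, vm_cpus=2))
```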
See Scheduling Policies in the Administration Guide for more information.
5.10. Migration
The Red Hat Virtualization Manager uses migration to enforce load balancing policies for a cluster. Virtual machine migration takes place according to the load balancing policy for a cluster and current demands on hosts within a cluster. Migration can also be configured to automatically occur when a host is fenced or moved to maintenance mode. The Red Hat Virtualization Manager first migrates virtual machines with the lowest CPU utilization. This is calculated as a percentage, and does not take into account RAM usage or I/O operations, except as I/O operations affect CPU utilization. If more than one virtual machine has the same CPU usage, the one that will be migrated first is the first virtual machine returned by the database query run by the Red Hat Virtualization Manager to determine virtual machine CPU usage.
Virtual machine migration has the following limitations by default:
- A bandwidth limit of 52 MiBps is imposed on each virtual machine migration.
- A migration will time out after 64 seconds per GB of virtual machine memory.
- A migration will abort if progress is stalled for 240 seconds.
- Concurrent outgoing migrations are limited to one per CPU core per host, or 2, whichever is smaller.
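The default limits above translate into simple arithmetic. The sketch below uses the stated defaults; the constant and function names are invented for the example and are not vdsm.conf keys.

```python
BANDWIDTH_LIMIT_MIBPS = 52    # per-migration bandwidth limit
TIMEOUT_SECONDS_PER_GB = 64   # timeout per GB of virtual machine memory
STALL_ABORT_SECONDS = 240     # abort if no progress for this long
MAX_OUTGOING_CAP = 2          # upper bound on concurrent outgoing migrations


def migration_timeout_seconds(vm_memory_gb):
    """Timeout for migrating a virtual machine with the given memory size."""
    return TIMEOUT_SECONDS_PER_GB * vm_memory_gb


def max_outgoing_migrations(cpu_cores):
    """One per CPU core on the host, or 2, whichever is smaller."""
    return min(cpu_cores, MAX_OUTGOING_CAP)


# A virtual machine with 8 GB of memory times out after 64 * 8 = 512 seconds.
print(migration_timeout_seconds(8))
# A 16-core host is still limited to 2 concurrent outgoing migrations.
print(max_outgoing_migrations(16))
```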
See Understanding live migration "migration_max_bandwidth" and "max_outgoing_migrations" parameters in vdsm.conf for details about tuning migration settings.
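As a worked example of the default limits listed above, the following Python sketch computes the timeout, the concurrency cap, and a rough lower bound on transfer time. The function names are hypothetical, and the transfer estimate assumes 1 GB is treated as 1024 MiB and that no memory pages need to be re-sent, which is rarely true for a busy guest.

```python
# Default limits taken from the list above.
BANDWIDTH_MIBPS = 52        # per-migration bandwidth limit
TIMEOUT_S_PER_GB = 64       # timeout per GB of virtual machine memory
STALL_ABORT_S = 240         # abort if progress stalls this long
MAX_OUTGOING_CAP = 2        # upper bound on concurrent outgoing migrations


def migration_timeout_seconds(vm_memory_gb: float) -> float:
    """Time after which a migration of this size times out."""
    return TIMEOUT_S_PER_GB * vm_memory_gb


def max_concurrent_outgoing(host_cpu_cores: int) -> int:
    """One per CPU core per host, or 2, whichever is smaller."""
    return min(host_cpu_cores, MAX_OUTGOING_CAP)


def min_transfer_seconds(vm_memory_gb: float) -> float:
    """Lower bound on copy time at the bandwidth limit (assumes 1 GB = 1024 MiB)."""
    return vm_memory_gb * 1024 / BANDWIDTH_MIBPS


if __name__ == "__main__":
    gb = 16
    print(f"timeout: {migration_timeout_seconds(gb):.0f} s")
    print(f"best-case transfer: {min_transfer_seconds(gb):.0f} s")
    print(f"concurrent outgoing on 8 cores: {max_concurrent_outgoing(8)}")
```

For a 16 GB virtual machine the timeout is 1024 seconds, while a single uninterrupted pass at the bandwidth limit takes roughly 315 seconds, so the default timeout leaves room for several rounds of re-sending changed memory.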
Chapter 6. Directory Services
6.1. Directory Services
The Red Hat Virtualization platform relies on directory services for user authentication and authorization. Interactions with all Manager interfaces, including the VM Portal, Administration Portal, and REST API, are limited to authenticated, authorized users. Virtual machines within the Red Hat Virtualization environment can use the same directory services to provide authentication and authorization; however, they must be configured to do so. The currently supported providers of directory services for use with the Red Hat Virtualization Manager are Identity Management (IdM), Red Hat Directory Server 9 (RHDS), Active Directory (AD), and OpenLDAP. The Red Hat Virtualization Manager interfaces with the directory server for:
- Portal logins (User, Power User, Administrator, REST API).
- Queries to display user information.
- Adding the Manager to a domain.
Authentication is the verification and identification of a party who generated some data, and of the integrity of the generated data. A principal is the party whose identity is verified. The verifier is the party who demands assurance of the principal’s identity. In the case of Red Hat Virtualization, the Manager is the verifier and a user is a principal. Data integrity is the assurance that the data received is the same as the data generated by the principal.
Confidentiality and authorization are closely related to authentication. Confidentiality protects data from disclosure to those not intended to receive it. Strong authentication methods can optionally provide confidentiality. Authorization determines whether a principal is allowed to perform an operation. Red Hat Virtualization uses directory services to associate users with roles and provide authorization accordingly. Authorization is usually performed after the principal has been authenticated, and may be based on information local or remote to the verifier.
During installation, a local, internal domain is automatically configured for administration of the Red Hat Virtualization environment. After the installation is complete, more domains can be added.
6.2. Local Authentication: Internal Domain
The Red Hat Virtualization Manager creates a limited, internal administration domain during installation. This domain is not the same as an AD or IdM domain, because it exists based on a key in the Red Hat Virtualization PostgreSQL database rather than as a directory service user on a directory server. The internal domain is also different from an external domain because the internal domain will only have one user: the admin@internal user. Taking this approach to initial authentication allows Red Hat Virtualization to be evaluated without requiring a complete, functional directory server, and ensures an administrative account is available to troubleshoot any issues with external directory services.
The admin@internal user is for the initial configuration of an environment. This includes installing and accepting hosts, adding external AD or IdM authentication domains, and delegating permissions to users from external domains.
6.3. Remote Authentication Using GSSAPI
In the context of Red Hat Virtualization, remote authentication refers to authentication that is handled by a remote service, not the Red Hat Virtualization Manager. Remote authentication is used for user or API connections coming to the Manager from within an AD, IdM, or RHDS domain. The Red Hat Virtualization Manager must be configured by an administrator using the engine-manage-domains tool to be a part of an RHDS, AD, or IdM domain. This requires that the Manager be provided with credentials for an account from the RHDS, AD, or IdM directory server for the domain with sufficient privileges to join a system to the domain. After domains have been added, domain users can be authenticated by the Red Hat Virtualization Manager against the directory server using a password. The Manager uses a framework called the Simple Authentication and Security Layer (SASL) which in turn uses the Generic Security Services Application Program Interface (GSSAPI) to securely verify the identity of a user, and ascertain the authorization level available to the user.
Figure 6.1. GSSAPI Authentication
Chapter 7. Templates and Pools
7.1. Templates and Pools
The Red Hat Virtualization environment provides administrators with tools to simplify the provisioning of virtual machines to users. These are templates and pools. A template is a shortcut that allows an administrator to quickly create a new virtual machine based on an existing, pre-configured virtual machine, bypassing operating system installation and configuration. This is especially helpful for virtual machines that will be used like appliances, for example web server virtual machines. If an organization uses many instances of a particular web server, an administrator can create a virtual machine that will be used as a template, installing an operating system, the web server, any supporting packages, and applying unique configuration changes. The administrator can then create a template based on the working virtual machine that will be used to create new, identical virtual machines as they are required.
Virtual machine pools are groups of virtual machines based on a given template that can be rapidly provisioned to users. Permission to use virtual machines in a pool is granted at the pool level; a user who is granted permission to use the pool will be assigned any virtual machine from the pool. Inherent in a virtual machine pool is the transitory nature of the virtual machines within it. Because users are assigned virtual machines without regard for which virtual machine in the pool they have used in the past, pools are not suited for purposes which require data persistence. Virtual machine pools are best suited for scenarios where either user data is stored in a central location and the virtual machine is a means to accessing and using that data, or data persistence is not important. The creation of a pool results in the creation of the virtual machines that populate the pool, in a stopped state. These are then started on user request.
7.2. Templates
To create a template, an administrator creates and customizes a virtual machine. Desired packages are installed, customized configurations are applied, and the virtual machine is prepared for its intended purpose in order to minimize the changes that must be made to it after deployment. An optional but recommended step before creating a template from a virtual machine is generalization. Generalization is used to remove details like system user names, passwords, and timezone information that will change upon deployment. Generalization does not affect customized configurations. Generalization of Windows and Linux guests in the Red Hat Virtualization environment is discussed further in Templates in the Virtual Machine Management Guide. Red Hat Enterprise Linux guests are generalized using sys-unconfig. Windows guests are generalized using sys-prep.
When the virtual machine that provides the basis for a template is satisfactorily configured, generalized if desired, and stopped, an administrator can create a template from the virtual machine. Creating a template from a virtual machine causes a read-only copy of the specially configured virtual disk to be created. The read-only image forms the backing image for all subsequently created virtual machines that are based on that template. In other words, a template is essentially a customized read-only virtual disk with an associated virtual hardware configuration. The hardware can be changed in virtual machines created from a template, for instance, provisioning two gigabytes of RAM for a virtual machine created from a template that has one gigabyte of RAM. The template virtual disk, however, cannot be changed as doing so would result in changes for all virtual machines based on the template.
When a template has been created, it can be used as the basis for multiple virtual machines. Virtual machines are created from a given template using a Thin provisioning method or a Clone provisioning method. Virtual machines that are cloned from templates take a complete writable copy of the template base image, sacrificing the space savings of the thin creation method in exchange for no longer depending on the presence of the template. Virtual machines that are created from a template using the thin method use the read-only image from the template as a base image, requiring that the template and all virtual machines created from it be stored on the same storage domain. Changes to data and newly generated data are stored in a copy-on-write image. Each virtual machine based on a template uses the same base read-only image, as well as a copy-on-write image that is unique to the virtual machine. This provides storage savings by limiting the number of times identical data is kept in storage. Furthermore, frequent use of the read-only backing image can cause the data being accessed to be cached, resulting in a net performance increase.
7.3. Pools
Virtual machine pools allow for rapid provisioning of numerous identical virtual machines to users as desktops. Users who have been granted permission to access and use virtual machines from a pool receive an available virtual machine based on their position in a queue of requests. Virtual machines in a pool do not allow data persistence; each time a virtual machine is assigned from a pool, it is allocated in its base state. This is ideally suited to be used in situations where user data is stored centrally.
Virtual machine pools are created from a template. Each virtual machine in a pool uses the same backing read-only image, and uses a temporary copy-on-write image to hold changed and newly generated data. Virtual machines in a pool are different from other virtual machines in that the copy-on-write layer that holds user-generated and -changed data is lost at shutdown. The implication of this is that a virtual machine pool requires no more storage than the template that backs it, plus some space for data generated or changed during use. Virtual machine pools are an efficient way to provide computing power to users for some tasks without the storage cost of providing each user with a dedicated virtual desktop.
Example 7.1. Example Pool Usage
A technical support company employs 10 help desk staff. However, only five are working at any given time. Instead of creating ten virtual machines, one for each help desk employee, a pool of five virtual machines can be created. Help desk employees allocate themselves a virtual machine at the beginning of their shift and return it to the pool at the end.
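The allocation pattern in this example can be modeled with a small Python class. This is a toy sketch, not the Manager's pool implementation: the `VmPool` class, its method names, and the virtual machine names are hypothetical.

```python
from typing import Dict, List, Optional


class VmPool:
    """Toy model of a pool: a fixed set of identical virtual machines handed out on request."""

    def __init__(self, size: int) -> None:
        self._free: List[str] = [f"pool-vm-{i}" for i in range(1, size + 1)]
        self._assigned: Dict[str, str] = {}

    def allocate(self, user: str) -> Optional[str]:
        """Give the user any available virtual machine, or None if the pool is exhausted."""
        if user in self._assigned:
            return self._assigned[user]
        if not self._free:
            return None
        vm = self._free.pop(0)
        self._assigned[user] = vm
        return vm

    def release(self, user: str) -> None:
        """Return the user's virtual machine to the pool for the next user."""
        vm = self._assigned.pop(user, None)
        if vm is not None:
            self._free.append(vm)


if __name__ == "__main__":
    pool = VmPool(size=5)
    shift = [f"staff-{i}" for i in range(1, 6)]
    for person in shift:
        print(person, "->", pool.allocate(person))
    print("sixth request:", pool.allocate("staff-6"))   # pool is exhausted
    pool.release("staff-1")
    print("after a release:", pool.allocate("staff-6"))
```

Because any free virtual machine can be handed to any permitted user, five virtual machines are enough for ten staff as long as no more than five work at once.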
Chapter 8. Virtual Machine Snapshots
8.1. Snapshots
Snapshots are a storage function that allows an administrator to create a restore point of a virtual machine’s operating system, applications, and data at a certain point in time. Snapshots save the data currently present in a virtual machine hard disk image as a COW volume and allow for a recovery to the data as it existed at the time the snapshot was taken. A snapshot causes a new COW layer to be created over the current layer. All write actions performed after a snapshot is taken are written to the new COW layer.
It is important to understand that a virtual machine hard disk image is a chain of one or more volumes. From the perspective of a virtual machine, these volumes appear as a single disk image. A virtual machine is oblivious to the fact that its disk is comprised of multiple volumes.
The terms COW volume and COW layer are used interchangeably; however, layer more clearly recognizes the temporal nature of snapshots. Each snapshot is created to allow an administrator to discard unsatisfactory changes made to data after the snapshot is taken. Snapshots provide similar functionality to the Undo function present in many word processors.
Snapshots of virtual machine hard disks marked shareable and those that are based on Direct LUN connections are not supported, live or otherwise.
The three primary snapshot operations are:
- Creation, which involves the first snapshot created for a virtual machine.
- Previews, which involves previewing a snapshot to determine whether or not to restore the system data to the point in time that the snapshot was taken.
- Deletion, which involves deleting a restoration point that is no longer required.
For task-based information about snapshot operations, see Snapshots in the Red Hat Virtualization Virtual Machine Management Guide.
8.2. Live Snapshots in Red Hat Virtualization
Snapshots of virtual machine hard disks marked shareable and those that are based on Direct LUN connections are not supported, live or otherwise.
Any other virtual machine that is not being cloned or migrated can have a snapshot taken when running, paused, or stopped.
When a live snapshot of a virtual machine is initiated, the Manager requests that the SPM host create a new volume for the virtual machine to use. When the new volume is ready, the Manager uses VDSM to instruct libvirt and qemu on the host running the virtual machine to begin using the new volume for virtual machine write operations. If the virtual machine is able to write to the new volume, the snapshot operation is considered a success and the virtual machine stops writing to the previous volume. If the virtual machine is unable to write to the new volume, the snapshot operation is considered a failure, and the new volume is deleted.
The virtual machine requires access to both its current volume and the new one from the time when a live snapshot is initiated until after the new volume is ready, so both volumes are opened with read-write access.
Virtual machines with an installed guest agent that supports quiescing can ensure filesystem consistency across snapshots. Registered Red Hat Enterprise Linux guests can install the qemu-guest-agent to enable quiescing before snapshots.
If a quiescing-compatible guest agent is present on a virtual machine when a snapshot is taken, VDSM uses libvirt to communicate with the agent to prepare for a snapshot. Outstanding write actions are completed, and then filesystems are frozen before a snapshot is taken. When the snapshot is complete, and libvirt has switched the virtual machine to the new volume for disk write actions, the filesystem is thawed, and writes to disk resume.
All live snapshots are attempted with quiescing enabled. If the snapshot command fails because there is no compatible guest agent present, the live snapshot is re-initiated without the use-quiescing flag. When a virtual machine is reverted to its pre-snapshot state with quiesced filesystems, it boots cleanly with no filesystem check required. Reverting to a previous snapshot using an un-quiesced filesystem requires a filesystem check on boot.
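The try-with-quiescing, then retry-without behavior can be sketched as follows. This is a minimal illustration, not VDSM code: `FakeVm`, `GuestAgentUnavailable`, and `live_snapshot` are hypothetical names, and the fake virtual machine only records which attempts were made.

```python
from typing import List


class GuestAgentUnavailable(Exception):
    """Raised by the hypothetical snapshot call when quiescing is requested
    but no compatible guest agent responds."""


class FakeVm:
    """Stand-in for a running virtual machine; records each snapshot attempt."""

    def __init__(self, has_agent: bool) -> None:
        self.has_agent = has_agent
        self.attempts: List[bool] = []

    def snapshot(self, quiesce: bool) -> None:
        self.attempts.append(quiesce)
        if quiesce and not self.has_agent:
            raise GuestAgentUnavailable("no quiescing-capable guest agent")


def live_snapshot(vm: FakeVm) -> bool:
    """Attempt a quiesced snapshot first; fall back to an un-quiesced one.

    Returns True if the snapshot was taken with filesystems quiesced.
    """
    try:
        vm.snapshot(quiesce=True)
        return True
    except GuestAgentUnavailable:
        vm.snapshot(quiesce=False)
        return False


if __name__ == "__main__":
    for has_agent in (True, False):
        vm = FakeVm(has_agent)
        print(f"agent={has_agent}: quiesced={live_snapshot(vm)}, attempts={vm.attempts}")
```

The return value matters later: a snapshot taken without quiescing is the case that requires a filesystem check when the virtual machine is reverted to it.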
8.3. Snapshot Creation
In Red Hat Virtualization the initial snapshot for a virtual machine is different from subsequent snapshots in that the initial snapshot retains its format, either QCOW2 or raw. The first snapshot for a virtual machine uses existing volumes as a base image. Additional snapshots are additional COW layers tracking the changes made to the data stored in the image since the previous snapshot.
As depicted in Initial Snapshot Creation, the creation of a snapshot causes the volumes that comprise a virtual disk to serve as the base image for all subsequent snapshots.
Figure 8.1. Initial Snapshot Creation
Snapshots taken after the initial snapshot result in the creation of new COW volumes in which data that is created or changed after the snapshot is taken will be stored. Each newly created COW layer contains only COW metadata. Data that is created by using and operating the virtual machine after a snapshot is taken is written to this new COW layer. When a virtual machine is used to modify data that exists in a previous COW layer, the data is read from the previous layer, and written into the newest layer. Virtual machines locate data by checking each COW layer from most recent to oldest, transparently to the virtual machine.
Figure 8.2. Additional Snapshot Creation
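The layered lookup described in this section can be modeled with a short Python sketch. It is a conceptual illustration only, not the QCOW2 format or VDSM's volume handling: the `DiskChain` class is hypothetical and each layer is reduced to a dictionary mapping block numbers to data.

```python
from typing import Dict, List, Optional


class DiskChain:
    """Toy model of a virtual disk as a chain of layers, oldest first."""

    def __init__(self, base: Dict[int, str]) -> None:
        self.layers: List[Dict[int, str]] = [dict(base)]

    def snapshot(self) -> None:
        """Add a new, empty COW layer; earlier layers are no longer written to."""
        self.layers.append({})

    def write(self, block: int, data: str) -> None:
        """All writes land in the newest layer."""
        self.layers[-1][block] = data

    def read(self, block: int) -> Optional[str]:
        """Check each layer from most recent to oldest."""
        for layer in reversed(self.layers):
            if block in layer:
                return layer[block]
        return None


if __name__ == "__main__":
    disk = DiskChain({0: "boot", 1: "config-v1"})
    disk.snapshot()
    disk.write(1, "config-v2")
    print(disk.read(0))        # served from the base layer
    print(disk.read(1))        # served from the newest layer
    print(disk.layers[0][1])   # the base layer still holds the original data
```

Because the older layer is never modified after the snapshot, discarding the newest layer is enough to return the disk to its state at snapshot time.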
8.4. Monitoring snapshot health with the image discrepancies tool
The RHV Image Discrepancies tool analyzes image data in the Storage Domain and RHV Database. It alerts you if it finds discrepancies in volumes and volume attributes, but does not fix those discrepancies. Use this tool in a variety of scenarios, such as:
- Before upgrading versions, to avoid carrying over broken volumes or chains to the new version.
- Following a failed storage operation, to detect volumes or attributes in a bad state.
- After restoring the RHV database or storage from backup.
- Periodically, to detect potential problems before they worsen.
- To analyze snapshot- or live storage migration-related issues, and to verify system health after fixing these types of problems.
Prerequisites
- Required Versions: this tool was introduced in RHV version 4.3.8 with rhv-log-collector-analyzer-0.2.15-0.el7ev.
- Because data collection runs simultaneously at different places and is not atomic, stop all activity in the environment that can modify the storage domains. That is, do not create or remove snapshots, edit, move, create, or remove disks. Otherwise, false detection of inconsistencies may occur. Virtual Machines can remain running normally during the process.
Procedure
To run the tool, enter the following command on the RHV Manager:
# rhv-image-discrepancies
- If the tool finds discrepancies, rerun it to confirm the results, especially if there is a chance some operations were performed while the tool was running.
This tool includes any Export and ISO storage domains and may report discrepancies for them. If so, these can be ignored, as these storage domains do not have entries for images in the RHV database.
Understanding the results
The tool reports the following:
- If there are volumes that appear on the storage but are not in the database, or appear in the database but are not on the storage.
- If some volume attributes differ between the storage and the database.
Sample output:
Checking storage domain c277ad93-0973-43d9-a0ca-22199bc8e801
    Looking for missing images...
    No missing images found
    Checking discrepancies between SD/DB attributes...
    image ef325650-4b39-43cf-9e00-62b9f7659020 has a different attribute capacity on storage(2696984576) and on DB(2696986624)
    image 852613ce-79ee-4adc-a56a-ea650dcb4cfa has a different attribute capacity on storage(5424252928) and on DB(5424254976)
Checking storage domain c64637b4-f0e8-408c-b8af-6a52946113e2
    Looking for missing images...
    No missing images found
    Checking discrepancies between SD/DB attributes...
    No discrepancies found
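If the report needs to be reviewed for many storage domains, the attribute lines can be collected with a short script. The sketch below is an assumption-laden example: the regular expressions are derived only from the sample output shown here, the `parse_discrepancies` function is hypothetical, and the real tool's wording may differ between versions.

```python
import re
from typing import Dict, List, Tuple

SAMPLE = """\
Checking storage domain c277ad93-0973-43d9-a0ca-22199bc8e801
Looking for missing images...
No missing images found
Checking discrepancies between SD/DB attributes...
image ef325650-4b39-43cf-9e00-62b9f7659020 has a different attribute capacity on storage(2696984576) and on DB(2696986624)
image 852613ce-79ee-4adc-a56a-ea650dcb4cfa has a different attribute capacity on storage(5424252928) and on DB(5424254976)
Checking storage domain c64637b4-f0e8-408c-b8af-6a52946113e2
Looking for missing images...
No missing images found
Checking discrepancies between SD/DB attributes...
No discrepancies found
"""

# Patterns inferred from the sample output only.
DOMAIN_RE = re.compile(r"Checking storage domain (\S+)")
ATTR_RE = re.compile(
    r"image (\S+) has a different attribute (\w+) "
    r"on storage\((\d+)\) and on DB\((\d+)\)"
)

Discrepancy = Tuple[str, str, int, int]  # image, attribute, storage value, DB value


def parse_discrepancies(text: str) -> Dict[str, List[Discrepancy]]:
    """Group attribute discrepancies by the storage domain they were reported under."""
    results: Dict[str, List[Discrepancy]] = {}
    current = None
    for raw in text.splitlines():
        line = raw.strip()
        domain = DOMAIN_RE.match(line)
        if domain:
            current = domain.group(1)
            results[current] = []
            continue
        attr = ATTR_RE.match(line)
        if attr and current is not None:
            image, name, storage, db = attr.groups()
            results[current].append((image, name, int(storage), int(db)))
    return results


if __name__ == "__main__":
    for domain, items in parse_discrepancies(SAMPLE).items():
        print(domain, "->", len(items), "discrepancies")
        for image, name, storage, db in items:
            print(f"  {image}: {name} differs by {db - storage} bytes")
```

In the sample, both reported images show a capacity that is 2048 bytes larger in the database than on storage.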
8.5. Snapshot Previews
To select which snapshot a virtual disk will be reverted to, the administrator can preview all previously created snapshots.
From the available snapshots per guest, the administrator can select a snapshot volume to preview its contents. As depicted in Preview Snapshot, each snapshot is saved as a COW volume, and when it is previewed, a new preview layer is copied from the snapshot being previewed. The guest interacts with the preview instead of the actual snapshot volume.
After the administrator previews the selected snapshot, the preview can be committed to restore the guest data to the state captured in the snapshot. If the administrator commits the preview, the guest is attached to the preview layer.
After a snapshot is previewed, the administrator can select Undo to discard the preview layer of the viewed snapshot. The layer that contains the snapshot itself is preserved despite the preview layer being discarded.
Figure 8.3. Preview Snapshot
8.6. Snapshot Deletion
You can delete individual snapshots that are no longer required. Deleting a snapshot removes the ability to restore a virtual disk to that particular restoration point. It does not necessarily reclaim the disk space consumed by the snapshot, nor does it delete the data. The disk space will only be reclaimed if a subsequent snapshot has overwritten the data of the deleted snapshot. For example, if the third snapshot out of five snapshots is deleted, the unchanged data in the third snapshot must be preserved on the disk for the fourth and fifth snapshots to be usable; however, if the fourth or fifth snapshot has overwritten the data of the third, then the third snapshot has been made redundant and the disk space can be reclaimed. Aside from potential disk space reclamation, deleting a snapshot may also improve the performance of the virtual machine.
Figure 8.4. Snapshot Deletion
Snapshot deletion is handled as an asynchronous block job in which VDSM maintains a record of the operation in the recovery file for the virtual machine so that the job can be tracked even if VDSM is restarted or the virtual machine is shut down during the operation. Once the operation begins, the snapshot being deleted cannot be previewed or used as a restoration point, even if the operation fails or is interrupted. In operations in which the active layer is to be merged with its parent, the operation is split into a two-stage process during which data is copied from the active layer to the parent layer, and disk writes are mirrored to both the active layer and the parent. Finally, the job is considered complete once the data in the snapshot being deleted has been merged with its parent snapshot and VDSM synchronizes the changes throughout the image chain.
If the deletion fails, fix the underlying problem (for example, a failed host, an inaccessible storage device, or even a temporary network issue) and try again.
Chapter 9. Hardware Drivers and Devices
9.1. Virtualized Hardware
Red Hat Virtualization presents three distinct types of system devices to virtualized guests. These hardware devices all appear as physically attached hardware devices to the virtualized guest but the device drivers work in different ways.
- Emulated devices
- Emulated devices, sometimes referred to as virtual devices, exist entirely in software. Emulated device drivers are a translation layer between the operating system running on the host (which manages the source device) and the operating systems running on the guests. The device level instructions directed to and from the emulated device are intercepted and translated by the hypervisor. Any device of the same type as that being emulated and recognized by the Linux kernel is able to be used as the backing source device for the emulated drivers.
- Para-virtualized devices
- Para-virtualized devices require the installation of device drivers on the guest operating system providing it with an interface to communicate with the hypervisor on the host machine. This interface is used to allow traditionally intensive tasks such as disk I/O to be performed outside of the virtualized environment. Lowering the overhead inherent in virtualization in this manner is intended to allow guest operating system performance closer to that expected when running directly on physical hardware.
- Physically shared devices
- Certain hardware platforms allow virtualized guests to directly access various hardware devices and components. This process in virtualization is known as passthrough or device assignment. Passthrough allows devices to appear and behave as if they were physically attached to the guest operating system.
9.2. Stable Device Addresses in Red Hat Virtualization
Virtual hardware PCI address allocations are persisted in the ovirt-engine database.
PCI addresses are allocated by QEMU at virtual machine creation time, and reported to VDSM by libvirt. VDSM reports them back to the Manager, where they are stored in the ovirt-engine database.
When a virtual machine is started, the Manager sends VDSM the device addresses from the database. VDSM passes them to libvirt, which starts the virtual machine using the PCI device addresses that were allocated when the virtual machine was run for the first time.
When a device is removed from a virtual machine, all references to it, including the stable PCI address, are also removed. If a device is added to replace the removed device, it is allocated a PCI address by QEMU, which is unlikely to be the same as the device it replaced.
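The address lifecycle described in this section can be illustrated with a small Python model. It is a sketch under loud assumptions: the `AddressStore` class stands in for the ovirt-engine database, QEMU's allocation is reduced to a per-machine slot counter starting at an arbitrary slot, and none of the names correspond to real Manager, VDSM, or libvirt interfaces.

```python
from typing import Dict, List, Tuple


class AddressStore:
    """Toy stand-in for the database that persists device PCI addresses."""

    FIRST_SLOT = 3  # arbitrary starting slot for this illustration

    def __init__(self) -> None:
        self._addresses: Dict[Tuple[str, str], str] = {}
        self._next_slot: Dict[str, int] = {}

    def start_vm(self, vm: str, devices: List[str]) -> Dict[str, str]:
        """Return the address of every device, allocating only for new devices."""
        assigned: Dict[str, str] = {}
        for dev in devices:
            key = (vm, dev)
            if key not in self._addresses:
                # First run with this device: allocate (modeled as a counter) and persist.
                slot = self._next_slot.get(vm, self.FIRST_SLOT)
                self._addresses[key] = f"0000:00:{slot:02x}.0"
                self._next_slot[vm] = slot + 1
            assigned[dev] = self._addresses[key]
        return assigned

    def remove_device(self, vm: str, dev: str) -> None:
        """Removing a device also removes its stored address."""
        self._addresses.pop((vm, dev), None)


if __name__ == "__main__":
    store = AddressStore()
    first = store.start_vm("vm1", ["disk0", "nic0"])
    second = store.start_vm("vm1", ["disk0", "nic0"])
    print(first == second)                     # addresses are stable across starts
    store.remove_device("vm1", "nic0")
    third = store.start_vm("vm1", ["disk0", "nic1"])
    print(first["nic0"], "->", third["nic1"])  # the replacement gets a new address
```

In this simplified model a replacement device always receives a different address; in a real environment it is merely unlikely to match, as stated above.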
9.3. Central Processing Unit (CPU)
Each host within a cluster has a number of virtual CPUs (vCPUs). The virtual CPUs are in turn exposed to guests running on the hosts. All virtual CPUs exposed by hosts within a cluster are of the type selected when the cluster was initially created via Red Hat Virtualization Manager. Mixing of virtual CPU types within a cluster is not possible.
Each available virtual CPU type has characteristics based on physical CPUs of the same name. The virtual CPU is indistinguishable from the physical CPU to the guest operating system.
Support for x2APIC:
All virtual CPU models provided by Red Hat Enterprise Linux 7 hosts include support for x2APIC. This provides an Advanced Programmable Interrupt Controller (APIC) to better handle hardware interrupts.
9.4. System Devices
System devices are critical for the guest to run and cannot be removed. Each system device attached to a guest also takes up an available PCI slot. The default system devices are:
- Host bridge
- ISA bridge and USB bridge (The USB and ISA bridges are the same device)
- Graphics card using the VGA or qxl driver
- Memory balloon device
For information about how to use PCI Express and conventional PCI devices with Intel Q35-based virtual machines, see Using PCI Express and Conventional PCI Devices with the Q35 Virtual Machine.
9.5. Network Devices
Red Hat Virtualization is able to expose three different types of network interface controller to guests. The type of network interface controller to expose to a guest is chosen when the guest is created but is changeable from the Red Hat Virtualization Manager.
- The e1000 network interface controller exposes a virtualized Intel PRO/1000 (e1000) to guests.
- The virtio network interface controller exposes a para-virtualized network device to guests.
- The rtl8139 network interface controller exposes a virtualized Realtek Semiconductor Corp RTL8139 to guests.
Multiple network interface controllers are permitted per guest. Each controller added takes up an available PCI slot on the guest.
9.6. Graphics Devices
The SPICE or VNC graphics protocols can be used to connect to the emulated graphics devices.
You can select a Video Type in the Administration Portal:
- QXL: Emulates a para-virtualized video card that works best with QXL guest drivers
- VGA: Emulates a dummy VGA card with Bochs VESA extensions
- BOCHS: Emulates a dummy VGA card without legacy emulation for guest machines that run with UEFI. This is the default display video card emulator for UEFI servers.
For a virtual machine of type server that is set with UEFI and uses compatibility level 4.6 or above, BOCHS is the default value of Video Type.
In Red Hat Virtualization 4.4.5, you must do the following to enable this feature:
- Run the following command:
  engine-config --set EnableBochsDisplay=true --cver=<version>
  where <version> is the compatibility version.
- Restart the engine.
- Set Video Type to BOCHS manually.
9.7. Storage Devices
Storage devices and storage pools can use the block device drivers to attach storage devices to virtualized guests. Note that the storage drivers are not storage devices. The drivers are used to attach a backing storage device, file or storage pool volume to a virtualized guest. The backing storage device can be any supported type of storage device, file, or storage pool volume.
- The IDE driver exposes an emulated block device to guests. The emulated IDE driver can be used to attach any combination of up to four virtualized IDE hard disks or virtualized IDE CD-ROM drives to each virtualized guest. The emulated IDE driver is also used to provide virtualized DVD-ROM drives.
- The VirtIO driver exposes a para-virtualized block device to guests. The para-virtualized block driver is a driver for all storage devices supported by the hypervisor attached to the virtualized guest (except for floppy disk drives, which must be emulated).
9.8. Sound Devices
Two emulated sound devices are available:
- The ac97 emulates an Intel 82801AA AC97 Audio compatible sound card.
- The es1370 emulates an ENSONIQ AudioPCI ES1370 sound card.
9.9. Serial Driver
The para-virtualized serial driver (virtio-serial) is a bytestream-oriented, character stream driver. The para-virtualized serial driver provides a simple communication interface between the host’s user space and the guest’s user space where networking is not available or is unusable.
9.10. Balloon Driver
The balloon driver allows guests to express to the hypervisor how much memory they require. The balloon driver allows the host to efficiently allocate memory to the guest and allow free memory to be allocated to other guests and processes.
Guests using the balloon driver can mark sections of the guest’s RAM as not in use (balloon inflation). The hypervisor can free the memory and use the memory for other host processes or other guests on that host. When the guest requires the freed memory again, the hypervisor can reallocate RAM to the guest (balloon deflation).
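The inflate and deflate accounting can be shown with a minimal Python model. It is a conceptual sketch only, not the virtio balloon driver: the `Balloon` class and its fields are hypothetical, and memory is tracked as whole megabytes.

```python
class Balloon:
    """Toy accounting model of a memory balloon inside one guest."""

    def __init__(self, guest_ram_mb: int) -> None:
        self.guest_ram_mb = guest_ram_mb
        self.balloon_mb = 0   # RAM currently marked as not in use by the guest

    @property
    def usable_mb(self) -> int:
        """RAM the guest can still use."""
        return self.guest_ram_mb - self.balloon_mb

    def inflate(self, mb: int) -> int:
        """Mark guest RAM as not in use; returns the amount the host can reclaim."""
        granted = min(mb, self.usable_mb)
        self.balloon_mb += granted
        return granted

    def deflate(self, mb: int) -> int:
        """Give RAM back to the guest; returns the amount actually returned."""
        returned = min(mb, self.balloon_mb)
        self.balloon_mb -= returned
        return returned


if __name__ == "__main__":
    balloon = Balloon(guest_ram_mb=4096)
    print("reclaimed:", balloon.inflate(1024), "usable:", balloon.usable_mb)
    print("returned:", balloon.deflate(2048), "usable:", balloon.usable_mb)
```

Inflation never takes more than the guest has, and deflation never returns more than was previously reclaimed, so the guest's total stays at its configured size.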
Appendix A. Enumerated Value Translation
The API uses Red Hat Virtualization Query Language to perform search queries. For more information, see Searches in the Introduction to the Administration Portal.
Note that certain enumerated values in the API require a different search query when using the Query Language. The following table provides a translation for these key enumerated values according to resource type.
API Enumerable Type | API Enumerable Value | Query Language Value |
---|---|---|
Appendix B. Event Codes
This table lists all event codes.
Code | Name | Severity | Message |
---|---|---|---|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
|
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|