30.2. System Requirements
Processor Architectures
One or more processors implementing the Intel 64 instruction set are required: that is, a processor of the AMD64 or Intel 64 architecture.
RAM
Each VDO volume has two distinct memory requirements:
- The VDO module requires 370 MB plus an additional 268 MB for each 1 TB of physical storage managed.
- The Universal Deduplication Service (UDS) index requires a minimum of 250 MB of DRAM, which is also the default amount that deduplication uses. For details on the memory usage of UDS, see Section 30.2.1, “UDS Index Memory Requirements”.
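The two requirements above can be combined into a rough estimate. The following is a minimal sketch that only applies the figures stated in this section; the function names are hypothetical and are not part of any VDO tool.

```python
# Rough RAM estimate for one VDO volume, using the figures from this section.
# Function and constant names are illustrative assumptions, not a VDO API.

VDO_BASE_MB = 370        # fixed cost of the VDO module
VDO_PER_TB_MB = 268      # additional cost per 1 TB of physical storage
UDS_INDEX_MIN_MB = 250   # minimum (and default) UDS index memory


def vdo_module_ram_mb(physical_tb):
    """RAM needed by the VDO module for the given physical size in TB."""
    return VDO_BASE_MB + VDO_PER_TB_MB * physical_tb


def total_ram_mb(physical_tb, index_mb=UDS_INDEX_MIN_MB):
    """VDO module RAM plus the UDS index RAM, both in MB."""
    return vdo_module_ram_mb(physical_tb) + index_mb


# Example: 4 TB of physical storage with the default 250 MB index.
print(total_ram_mb(4))  # 370 + 4 * 268 + 250 = 1692
```

Pass a larger `index_mb` value if the UDS index is configured with more than the default amount of memory.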
Storage
A VDO volume is a thinly provisioned block device. To prevent running out of physical space, place the volume on top of storage that you can expand at a later time. Examples of such expandable storage are LVM volumes or MD RAID arrays.
A single VDO volume can be configured to use up to 256 TB of physical storage. See Section 30.2.2, “VDO Storage Space Requirements” for the calculations that determine the usable size of a VDO-managed volume from the physical size of the storage pool given to VDO.
Additional System Software
VDO depends on the following software:
- LVM
- Python 2.7
The `yum` package manager will install all necessary software dependencies automatically.
Placement of VDO in the Storage Stack
As a general rule, you should place certain storage layers under VDO and others on top of VDO:
- Under VDO: DM-Multipath, DM-Crypt, and software RAID (LVM or `mdraid`).
- On top of VDO: LVM cache, LVM snapshots, and LVM Thin Provisioning.
The following configurations are not supported:
- VDO on top of VDO volumes: storage → VDO → LVM → VDO
- VDO on top of LVM Snapshots
- VDO on top of LVM Cache
- VDO on top of the loopback device
- VDO on top of LVM Thin Provisioning
- Encrypted volumes on top of VDO: storage → VDO → DM-Crypt
- Partitions on a VDO volume: `fdisk`, `parted`, and similar partitions
- RAID (LVM, MD, or any other type) on top of a VDO volume
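The unsupported configurations listed above can be expressed as a simple check on a planned storage stack. This is only a sketch: the layer labels (`"vdo"`, `"lvm-thin"`, and so on) are hypothetical names chosen for this illustration, and the function does not inspect a real system.

```python
# Check a planned stack, listed from bottom to top, against the unsupported
# configurations in this section. All layer labels are illustrative assumptions.

UNSUPPORTED_BELOW_VDO = {"vdo", "lvm-snapshot", "lvm-cache", "loop", "lvm-thin"}
UNSUPPORTED_ABOVE_VDO = {"dm-crypt", "partition", "raid"}


def unsupported_layers(stack):
    """Return a list of problems found for every VDO layer in the stack."""
    problems = []
    for i, layer in enumerate(stack):
        if layer != "vdo":
            continue
        # Layers that must not sit anywhere underneath this VDO layer.
        for below in stack[:i]:
            if below in UNSUPPORTED_BELOW_VDO:
                problems.append(f"VDO on top of {below}")
        # Layers that must not sit anywhere above this VDO layer.
        for above in stack[i + 1:]:
            if above in UNSUPPORTED_ABOVE_VDO:
                problems.append(f"{above} on top of VDO")
    return problems


print(unsupported_layers(["storage", "vdo", "lvm", "vdo"]))
print(unsupported_layers(["storage", "dm-crypt", "vdo", "lvm-thin"]))
```

The first call reports VDO stacked on VDO; the second returns an empty list because DM-Crypt under VDO and LVM Thin Provisioning on top of VDO follow the general placement rule.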
Important
VDO supports two write modes: `sync` and `async`. When VDO is in `sync` mode, writes to the VDO device are acknowledged when the underlying storage has written the data permanently. When VDO is in `async` mode, writes are acknowledged before being written to persistent storage.
It is critical to set the VDO write policy to match the behavior of the underlying storage. By default, VDO write policy is set to the `auto` option, which selects the appropriate policy automatically.
For more information, see Section 30.4.2, “Selecting VDO Write Modes”.
30.2.1. UDS Index Memory Requirements
The UDS index consists of two parts:
- A compact in-memory representation that contains at most one entry per unique block.
- An on-disk component that records the associated block names presented to the index as they occur, in order.
UDS uses an average of 4 bytes per entry in memory (including cache).
The on-disk component maintains a bounded history of data passed to UDS. UDS provides deduplication advice for data that falls within this deduplication window, containing the names of the most recently seen blocks. The deduplication window allows UDS to index data as efficiently as possible while limiting the amount of memory required to index large data repositories. Despite the bounded nature of the deduplication window, most datasets which have high levels of deduplication also exhibit a high degree of temporal locality — in other words, most deduplication occurs among sets of blocks that were written at about the same time. Furthermore, in general, data being written is more likely to duplicate data that was recently written than data that was written a long time ago. Therefore, for a given workload over a given time interval, deduplication rates will often be the same whether UDS indexes only the most recent data or all the data.
Because duplicate data tends to exhibit temporal locality, it is rarely necessary to index every block in the storage system. Were this not so, the cost of index memory would outstrip the savings of reduced storage costs from deduplication. Index size requirements are more closely related to the rate of data ingestion. For example, consider a storage system with 100 TB of total capacity but with an ingestion rate of 1 TB per week. With a deduplication window of 4 TB, UDS can detect most redundancy among the data written within the last month.
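The arithmetic behind this example is a single division of the deduplication window by the ingestion rate. A minimal sketch, with a hypothetical function name:

```python
# How many weeks of newly written data fit inside the deduplication window.
# The function name is an illustrative assumption.

def window_coverage_weeks(window_tb, ingest_tb_per_week):
    """Weeks of ingested data covered by a window of the given size."""
    if ingest_tb_per_week <= 0:
        raise ValueError("ingestion rate must be positive")
    return window_tb / ingest_tb_per_week


# The example from the text: a 4 TB window at 1 TB per week.
print(window_coverage_weeks(4, 1))  # 4.0 weeks, roughly one month
```

The total capacity of the storage system (100 TB in the example) does not enter the calculation; only the ingestion rate and the window size do.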
UDS's Sparse Indexing feature (the recommended mode for VDO) further exploits temporal locality by attempting to retain only the most relevant index entries in memory. UDS can maintain a deduplication window that is ten times larger while using the same amount of memory. While the sparse index provides the greatest coverage, the dense index provides more advice. For most workloads, given the same amount of memory, the difference in deduplication rates between dense and sparse indexes is negligible.
The memory required for the index is determined by the desired size of the deduplication window:
- For a dense index, UDS will provide a deduplication window of 1 TB per 1 GB of RAM. A 1 GB index is generally sufficient for storage systems of up to 4 TB.
- For a sparse index, UDS will provide a deduplication window of 10 TB per 1 GB of RAM. A 1 GB sparse index is generally sufficient for up to 40 TB of physical storage.
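The two ratios above translate directly between index memory and window size. The following sketch applies only those ratios; the function names are hypothetical.

```python
# Convert between UDS index RAM and deduplication window size, using the
# ratios stated above. Function names are illustrative assumptions.

DENSE_TB_PER_GB = 1    # dense index: 1 TB of window per 1 GB of RAM
SPARSE_TB_PER_GB = 10  # sparse index: 10 TB of window per 1 GB of RAM


def dedup_window_tb(index_ram_gb, sparse=False):
    """Deduplication window (TB) provided by the given index RAM (GB)."""
    ratio = SPARSE_TB_PER_GB if sparse else DENSE_TB_PER_GB
    return index_ram_gb * ratio


def index_ram_gb_for_window(window_tb, sparse=False):
    """Index RAM (GB) needed for the desired deduplication window (TB)."""
    ratio = SPARSE_TB_PER_GB if sparse else DENSE_TB_PER_GB
    return window_tb / ratio


print(dedup_window_tb(0.25))                       # 0.25 TB, dense
print(dedup_window_tb(0.25, sparse=True))          # 2.5 TB, sparse
print(index_ram_gb_for_window(40, sparse=True))    # 4.0 GB
```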
For concrete examples of UDS Index memory requirements, see Section 30.2.3, “Examples of VDO System Requirements by Physical Volume Size”.
30.2.2. VDO Storage Space Requirements
VDO requires storage space both for VDO metadata and for the actual UDS deduplication index:
- VDO writes two types of metadata to its underlying physical storage:
  - The first type scales with the physical size of the VDO volume and uses approximately 1 MB for each 4 GB of physical storage plus an additional 1 MB per slab.
  - The second type scales with the logical size of the VDO volume and consumes approximately 1.25 MB for each 1 GB of logical storage, rounded up to the nearest slab.

  See Section 30.1.3, “VDO Volume” for a description of slabs.
- The UDS index is stored within the VDO volume group and is managed by the associated VDO instance. The amount of storage required depends on the type of index and the amount of RAM allocated to the index. For each 1 GB of RAM, a dense UDS index will use 17 GB of storage, and a sparse UDS index will use 170 GB of storage.
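The rules of thumb above can be combined into an approximate overhead estimate. This is a sketch under stated assumptions: a slab size of 2 GB, 1 GB treated as 1024 MB, and hypothetical function names. Adjust `slab_gb` to match the actual volume configuration.

```python
import math

# Approximate VDO storage overhead from the rules of thumb in this section.
# Assumptions: slab_gb defaults to 2 GB and 1 GB = 1024 MB. All names are
# illustrative and are not part of any VDO tool.


def physical_metadata_mb(physical_gb, slab_gb=2):
    """About 1 MB per 4 GB of physical storage plus 1 MB per slab."""
    slabs = math.ceil(physical_gb / slab_gb)
    return physical_gb / 4 + slabs


def logical_metadata_mb(logical_gb, slab_gb=2):
    """About 1.25 MB per 1 GB of logical storage, rounded up to whole slabs."""
    slab_mb = slab_gb * 1024
    raw_mb = 1.25 * logical_gb
    return math.ceil(raw_mb / slab_mb) * slab_mb


def index_storage_gb(index_ram_gb, sparse=False):
    """17 GB (dense) or 170 GB (sparse) of storage per 1 GB of index RAM."""
    return index_ram_gb * (170 if sparse else 17)


# Example: 1 TB physical, 2 TB logical, 250 MB dense index.
print(physical_metadata_mb(1024))   # 256 + 512 slabs = 768.0 MB
print(logical_metadata_mb(2048))    # 2560 MB rounded up to 2 slabs = 4096 MB
print(index_storage_gb(0.25))       # 4.25 GB
```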
For concrete examples of VDO storage requirements, see Section 30.2.3, “Examples of VDO System Requirements by Physical Volume Size”.
30.2.3. Examples of VDO System Requirements by Physical Volume Size
The following tables provide approximate system requirements of VDO based on the size of the underlying physical volume. Each table lists requirements appropriate to the intended deployment, such as primary storage or backup storage.
The exact numbers depend on your configuration of the VDO volume.
Primary Storage Deployment
In the primary storage case, the UDS index is between 0.01% and 25% of the size of the physical volume.
Physical Volume Size | 10 GB – 1 TB | 2–10 TB | 11–50 TB | 51–100 TB | 101–256 TB |
---|---|---|---|---|---|
RAM Usage | 250 MB | Dense: 1 GB; Sparse: 250 MB | 2 GB | 3 GB | 12 GB |
Disk Usage | 2.5 GB | Dense: 10 GB; Sparse: 22 GB | 170 GB | 255 GB | 1020 GB |
Index Type | Dense | Dense or Sparse | Sparse | Sparse | Sparse |
Backup Storage Deployment
In the backup storage case, the UDS index covers the size of the backup set but is not bigger than the physical volume. If you expect the backup set or the physical size to grow in the future, factor this into the index size.
Physical Volume Size | 10 GB – 1 TB | 2–10 TB | 11–50 TB | 51–100 TB | 101–256 TB |
---|---|---|---|---|---|
RAM Usage | 250 MB | 2 GB | 10 GB | 20 GB | 26 GB |
Disk Usage | 2.5 GB | 170 GB | 850 GB | 1700 GB | 3400 GB |
Index Type | Dense | Sparse | Sparse | Sparse | Sparse |