Chapter 1. Introduction to VDO on LVM
The Virtual Data Optimizer (VDO) feature provides inline block-level deduplication, compression, and thin provisioning for storage. You can manage VDO as a type of Logical Volume Manager (LVM) logical volume (LV), similar to LVM thin-provisioned volumes.
VDO volumes on LVM (LVM-VDO) contain the following components:
- VDO pool LV
  - This is the backing physical device that stores, deduplicates, and compresses data for the VDO LV. The VDO pool LV sets the physical size of the VDO volume, which is the amount of data that VDO can store on the disk.
  - Currently, each VDO pool LV can hold only one VDO LV. As a result, VDO deduplicates and compresses each VDO LV separately. Duplicate data that is stored on separate LVs does not benefit from the data optimization of a single VDO volume.
- VDO LV
  - This is the virtual, provisioned device on top of the VDO pool LV. The VDO LV sets the provisioned, logical size of the VDO volume, which is the amount of data that applications can write to the volume before deduplication and compression occur.
- dm-vdo
  - The `dm-vdo` module is a Linux Device Mapper target that provides a deduplicated, compressed, and thin-provisioned block storage volume.
  - The `dm-vdo` module exposes a block device that the VDO pool LV uses to create a VDO LV. The VDO LV is then used by the system.
  - When `dm-vdo` receives a request to read a logical block of data from a VDO volume, it maps the requested logical block to the underlying physical block, and then reads and returns the requested data.
  - When `dm-vdo` receives a request to write a block of data to a VDO volume, it checks whether the request is a DISCARD or TRIM request, or whether the data consists entirely of zeros. If either of these conditions is met, `dm-vdo` updates its block map and acknowledges the request. Otherwise, VDO processes and optimizes the data.
  - The `dm-vdo` module internally uses the Universal Deduplication Service (UDS) index on the volume and analyzes incoming data to find duplicates. For each new piece of data, UDS determines whether that piece is identical to any previously stored piece of data. If the index finds a match, the storage system can verify the accuracy of that match and then update its records to avoid storing the same data twice.
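The read and write paths described above can be illustrated with a small conceptual model. This is only a sketch under stated assumptions, not how `dm-vdo` is implemented: the class name `ToyDedupVolume`, the 4096-byte block size, and the use of SHA-256 as a stand-in for the UDS index are all hypothetical choices made for illustration, and compression and space reclamation are left out.

```python
import hashlib

BLOCK_SIZE = 4096                 # assumed block size for this toy model
ZERO_BLOCK = bytes(BLOCK_SIZE)    # a block that contains only zeros


class ToyDedupVolume:
    """Toy model of a deduplicating volume (illustrative only)."""

    def __init__(self):
        self.block_map = {}   # logical block number -> physical block number
        self.physical = {}    # physical block number -> stored data
        self.index = {}       # digest -> physical block number (stand-in for UDS)
        self.next_pbn = 0

    def discard(self, lbn):
        # DISCARD/TRIM: only the block map changes, no data is written.
        self.block_map.pop(lbn, None)

    def write(self, lbn, data):
        if data == ZERO_BLOCK:
            # Zeroed data: record the block as unmapped and return.
            self.block_map.pop(lbn, None)
            return
        digest = hashlib.sha256(data).digest()
        pbn = self.index.get(digest)
        # The index only suggests a candidate; verify it before sharing.
        if pbn is None or self.physical[pbn] != data:
            pbn = self.next_pbn
            self.next_pbn += 1
            self.physical[pbn] = data
            self.index[digest] = pbn
        self.block_map[lbn] = pbn

    def read(self, lbn):
        # Map the logical block to its physical block, then return the data.
        pbn = self.block_map.get(lbn)
        return ZERO_BLOCK if pbn is None else self.physical[pbn]
```

In this model, writing the same block to two logical addresses consumes one physical block, and zeroed or discarded blocks consume none.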
If you are already familiar with the structure of an LVM thin-provisioned implementation, see the following table. It shows how the different aspects of VDO are presented to the system.
| | Physical device | Provisioned device |
|---|---|---|
| VDO on LVM | VDO pool LV | VDO LV |
| LVM thin provisioning | Thin pool | Thin volume |
Because VDO is thin-provisioned, the file system and applications see only the logical space in use, not the physical space that is actually available. Use scripting to monitor the available physical space and generate an alert if usage exceeds a threshold.
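A monitoring script of this kind might look like the following minimal sketch. It assumes that the `data_percent` field reported by `lvs` reflects the physical usage of the VDO pool LV on your system. The function names and the 80 percent default threshold are hypothetical choices, not recommended values.

```python
import subprocess


def parse_data_percent(lvs_output):
    """Parse 'lv_name data_percent' lines into a {name: percent} dict."""
    usage = {}
    for line in lvs_output.splitlines():
        fields = line.split()
        if len(fields) != 2:
            continue  # LVs without a data_percent value print an empty column
        name, percent = fields
        try:
            usage[name] = float(percent)
        except ValueError:
            continue  # skip anything that is not a plain number
    return usage


def over_threshold(usage, threshold_percent):
    """Return the sorted names of LVs whose usage exceeds the threshold."""
    return sorted(name for name, pct in usage.items() if pct > threshold_percent)


def check_volume_group(vg_name, threshold_percent=80.0):
    """Query lvs for one volume group and list LVs above the threshold."""
    result = subprocess.run(
        ["lvs", "--noheadings", "-o", "lv_name,data_percent", vg_name],
        capture_output=True, text=True, check=True,
    )
    return over_threshold(parse_data_percent(result.stdout), threshold_percent)
```

You could call `check_volume_group()` from a scheduled job and send a notification whenever the returned list is not empty. The volume group name and the alerting mechanism depend on your environment.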