In this guide, you’ll learn how to use the Virtual Data Optimizer (VDO) to compress and deduplicate data on storage devices so that storage space is used efficiently. The VDO Linux device mapper target is available for installation on RHEL, CentOS, and Fedora.


VDO reduces the data footprint on block devices by compressing data and eliminating duplicate blocks, which saves disk space and can even increase data throughput. VDO includes two kernel modules:

  • kvdo module – Transparently controls data compression.
  • uds module – Handles data deduplication.

The VDO layer is placed on top of an existing block storage device, such as a local disk, a RAID device, or an encrypted device. Storage layers such as file systems and LVM logical volumes are then placed on top of the VDO device.

Step 1: Install Virtual Data Optimizer (VDO)

On RHEL and CentOS, you can install the Virtual Data Optimizer (VDO) Linux device mapper by running the command below.

sudo yum -y install vdo kmod-kvdo

Wait for the installation to complete.

Loaded plugins: fastestmirror
Determining fastest mirrors
 * base: centos.mirror.liquidtelecom.com
 * extras: centos.mirror.liquidtelecom.com
 * updates: centos.mirror.liquidtelecom.com
base                                                | 3.6 kB     00:00     
extras                                              | 2.9 kB     00:00     
updates                                             | 2.9 kB     00:00     
(1/2): extras/7/x86_64/primary_db                     | 159 kB   00:06     
(2/2): updates/7/x86_64/primary_db                    | 6.7 MB   00:40     
Resolving Dependencies
--> Running transaction check
---> Package kmod-kvdo.x86_64 0:6.1.2.41-5.el7 will be installed
---> Package vdo.x86_64 0:6.1.2.41-4.el7 will be installed
--> Finished Dependency Resolution

Dependencies Resolved

===========================================================================
 Package          Arch          Version                  Repository   Size
===========================================================================
Installing:
 kmod-kvdo        x86_64        6.1.2.41-5.el7           base        317 k
 vdo              x86_64        6.1.2.41-4.el7           base        624 k

Transaction Summary
===========================================================================
Install  2 Packages

Total download size: 941 k
Installed size: 4.5 M
Downloading packages:
(1/2): kmod-kvdo-6.1.2.41-5.el7.x86_64.rpm            | 317 kB   00:01     
(2/2): vdo-6.1.2.41-4.el7.x86_64.rpm                  | 624 kB   00:24     
---------------------------------------------------------------------------
Total                                          38 kB/s | 941 kB  00:24     
Running transaction check
Running transaction test
Transaction test succeeded
Running transaction
  Installing : kmod-kvdo-6.1.2.41-5.el7.x86_64                         1/2 
  Installing : vdo-6.1.2.41-4.el7.x86_64                               2/2 
  Verifying  : kmod-kvdo-6.1.2.41-5.el7.x86_64                         1/2 
  Verifying  : vdo-6.1.2.41-4.el7.x86_64                               2/2 

Installed:
  kmod-kvdo.x86_64 0:6.1.2.41-5.el7       vdo.x86_64 0:6.1.2.41-4.el7      

Complete!
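
After the installation, you may want to confirm that both kernel modules are actually loaded. The snippet below is a minimal sketch, not part of the original procedure: modules_loaded is a hypothetical helper that reads lsmod-style text from standard input and reports whether kvdo and uds appear in it.

```shell
#!/bin/sh
# Sketch: report whether the kvdo and uds modules appear in lsmod-style
# input (module name in the first column). The helper name is hypothetical.
modules_loaded() {
    awk '
        $1 == "kvdo" { kvdo = 1 }
        $1 == "uds"  { uds = 1 }
        END {
            printf "kvdo=%s uds=%s\n", (kvdo ? "loaded" : "missing"), (uds ? "loaded" : "missing")
            # Exit status 0 only when both modules were seen.
            exit !(kvdo && uds)
        }'
}

# Typical use on a host where the packages are installed:
#   sudo modprobe kvdo
#   lsmod | modules_loaded
```

The helper only parses text, so it is safe to run anywhere. The exit status lets you use it in a script condition.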

Step 2: Creating VDO volume

VDO volumes are the logical devices that you create using VDO. They are treated like disk partitions. You format them with a file system, and then a VDO volume can be mounted just like a regular file system. If you prefer LVM, you can use a VDO volume as an LVM physical volume.

I have a 10 GB disk that will be used for this exercise.

$  lsblk  /dev/sdb
NAME MAJ:MIN RM SIZE RO TYPE MOUNTPOINT
sdb    8:16   0  10G  0 disk 

This is how you’ll create a VDO volume:

$ sudo vdo create --name myvdo --device /dev/sdb --vdoLogicalSize 5G
Creating VDO myvdo
Starting VDO myvdo
Starting compression on VDO myvdo
VDO instance 0 volume is ready at /dev/mapper/myvdo

Where:

  • myvdo is the name of the logical device that VDO presents to the user.
  • /dev/sdb is the block device used by the VDO volume.
  • 5G is the logical size of the VDO volume. This option is optional, and the logical size can be larger than the physical size of the underlying block device.
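
Because the logical size can exceed the physical size, it can help to keep an eye on the logical-to-physical ratio you are committing to. The following is a small sketch of that arithmetic. The vdo_ratio helper is hypothetical and assumes both sizes are given in whole gigabytes.

```shell
#!/bin/sh
# Sketch: print the logical-to-physical ratio for a planned VDO volume.
# Arguments: logical size and physical size, both in whole gigabytes.
# The function name is hypothetical and not part of the vdo tooling.
vdo_ratio() {
    logical_gb=$1
    physical_gb=$2
    if [ "$physical_gb" -le 0 ]; then
        echo "physical size must be positive" >&2
        return 1
    fi
    awk -v l="$logical_gb" -v p="$physical_gb" 'BEGIN { printf "%.2f\n", l / p }'
}

vdo_ratio 5 10    # 5G logical on a 10G disk, prints 0.50
vdo_ratio 30 10   # a volume three times the disk size, prints 3.00
```

A ratio above 1.00 means the volume is thin provisioned, so it only holds its full logical size if the data compresses and deduplicates well enough.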

Display a list of both started and non-started volumes.

$ sudo vdo list --all
myvdo

Run the vdo status command to analyze the volume.

$ sudo vdo status -n myvdo
VDO status:
  Date: '2020-02-12 12:42:04+03:00'
  Node: repo-server-01.safaricom.net
Kernel module:
  Loaded: true
  Name: kvdo
  Version information:
    kvdo version: 6.1.2.41
Configuration:
  File: /etc/vdoconf.yml
  Last modified: '2020-02-12 12:39:36'
VDOs:
  myvdo:
    Acknowledgement threads: 1
    Activate: enabled
    Bio rotation interval: 64
    Bio submission threads: 4
    Block map cache size: 128M
    Block map period: 16380
.......

Compression and Deduplication should be enabled.

$ sudo vdo status -n myvdo | grep -E 'Compression|Deduplication'
    Compression: enabled
    Deduplication: enabled

You can grow the logical size of an existing volume with the vdo growLogical command. I’ll grow the volume to a logical size of 10 GB.

sudo vdo growLogical -n  myvdo --vdoLogicalSize 10G

Confirm:

$ sudo vdo status -n myvdo | grep size
    Block map cache size: 128M
    Block size: 4096
    Logical size: 10G
    Physical size: 10G
    Read cache size: 0M
    Slab size: 2G
        block map cache size: 134217728
        block size: 4096
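
To pull just the two sizes out of that output, a small filter along these lines may be handy. This is only a sketch that assumes the "Logical size" and "Physical size" labels shown above. The vdo_sizes name is hypothetical.

```shell
#!/bin/sh
# Sketch: print the logical and physical size found in `vdo status` text.
# Reads from stdin. The function name is hypothetical.
vdo_sizes() {
    awk -F': *' '
        $1 ~ /^ *Logical size$/  { l = $2 }
        $1 ~ /^ *Physical size$/ { p = $2 }
        END { printf "logical=%s physical=%s\n", l, p }'
}

# Typical use:
#   sudo vdo status -n myvdo | vdo_sizes
```

With the status output shown above, this would print logical=10G physical=10G.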

Step 3: Formatting the VDO volume with a file system

You can format the VDO volume with a file system of your choice, or create a PV, VG and LV from it.

$ sudo mkfs.xfs /dev/mapper/myvdo

For LVM creation:

# Create PV
$ sudo  pvcreate /dev/mapper/myvdo
Physical volume "/dev/mapper/myvdo" successfully created.

# Create VG
$ sudo vgcreate vg01 /dev/mapper/myvdo
Volume group "vg01" successfully created

# Create LV
$ sudo lvcreate -n lv01 -l 100%FREE vg01
Logical volume "lv01" created.

# Create a file system
$ sudo mkfs -t xfs /dev/mapper/vg01-lv01 
meta-data=/dev/mapper/vg01-lv01  isize=512    agcount=4, agsize=655104 blks
         =                       sectsz=4096  attr=2, projid32bit=1
         =                       crc=1        finobt=0, sparse=0
data     =                       bsize=4096   blocks=2620416, imaxpct=25
         =                       sunit=0      swidth=0 blks
naming   =version 2              bsize=4096   ascii-ci=0 ftype=1
log      =internal log           bsize=4096   blocks=2560, version=2
         =                       sectsz=4096  sunit=1 blks, lazy-count=1
realtime =none                   extsz=4096   blocks=0, rtextents=0

You can now register the new device, create a mount point, and mount the volume.

sudo udevadm settle
sudo mkdir /myvdo

--- For standard VDO volume ---
$ sudo mount /dev/mapper/myvdo /myvdo

--- For LVM ---
$ sudo mount /dev/mapper/vg01-lv01 /myvdo
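
The mounts above do not survive a reboot. If you want a persistent mount, an /etc/fstab entry is the usual route. The sketch below only prints a candidate line. The vdo_fstab_line helper is hypothetical, and the x-systemd.requires=vdo.service option is an assumption based on Red Hat's guidance for VDO volumes, so verify it against the documentation for your release before using it.

```shell
#!/bin/sh
# Sketch: print a candidate /etc/fstab line for a file system on VDO.
# Arguments: device, mount point, optional file system type (default xfs).
# The mount options are an assumption; check them for your release.
vdo_fstab_line() {
    device=$1
    mountpoint=$2
    fstype=${3:-xfs}
    printf '%s %s %s defaults,x-systemd.requires=vdo.service 0 0\n' \
        "$device" "$mountpoint" "$fstype"
}

vdo_fstab_line /dev/mapper/myvdo /myvdo
```

The function prints the line instead of editing /etc/fstab, so you can review it before appending it yourself.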

You can also display stats in human-readable form.

$ sudo vdostats --human-readable
Device                    Size      Used Available Use% Space saving%
/dev/mapper/myvdo        10.0G      4.0G      6.0G  40%           98%

Step 4: Testing Deduplication

I’ll download an ISO file for testing deduplication.

wget http://mirror.centos.org/centos/7/os/x86_64/images/boot.iso

Copy the file to /myvdo directory.

sudo cp boot.iso /myvdo/boot1.iso

Compare the storage stats from before and after the copy.

--- Before copy ---
$ sudo vdostats --human-readable
Device                    Size      Used Available Use% Space saving%
/dev/mapper/myvdo        10.0G      4.0G      6.0G  40%           98%

--- After copy ---
$ sudo vdostats --human-readable
Device                    Size      Used Available Use% Space saving%
/dev/mapper/myvdo        10.0G      4.2G      5.8G  42%            7%

Notice that the value of the Used field increased from 4.0G to 4.2G, because the copied file occupies space on the volume.

Let’s do a second copy of the same file.

sudo cp boot.iso /myvdo/boot2.iso

View volume stats again.

$ sudo vdostats --human-readable
Device                    Size      Used Available Use% Space saving%
/dev/mapper/myvdo        10.0G      4.2G      5.8G  42%           52%

The used space did not change. Instead, the space saving percentage increased to 52%, which shows that deduplication avoided storing the redundant copy of the same file.
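
If you repeat this test a few times, reading the saving percentage by eye gets tedious. A filter like the one below could extract it for a single device. It is a sketch that assumes the column layout of the vdostats --human-readable output shown above. The vdo_saving_pct name is hypothetical.

```shell
#!/bin/sh
# Sketch: print the "Space saving%" value (last column) for one device.
# Reads `vdostats --human-readable` output from stdin.
# The function name is hypothetical.
vdo_saving_pct() {
    device=$1
    awk -v dev="$device" '$1 == dev { sub(/%$/, "", $NF); print $NF }'
}

# Typical use:
#   sudo vdostats --human-readable | vdo_saving_pct /dev/mapper/myvdo
```

Run against the last output above, it would print 52.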
