This is actually a kind of anniversary post. 10 years ago I wrote a howto about compressing VirtualBox VDI disk images by zeroing the unused space. Today, although the hypervisor, the OS, and even my whole build environment have changed, I’m still using the same technique to keep virtual disk images as small as possible.
This time I’m reducing the size of VMDK disk images during a CI/CD pipeline build using the qemu-img tool.
The basic idea is the same as 10 years ago: create and set up the disk image; as soon as you are finished, delete unnecessary files and zero-fill the free space. Then use ‘qemu-img’ to either compact the disk image (if this is supported) or create a copy of it minus the free space.
This works because most hypervisors support dynamically allocated virtual disk image formats. They grow every time a disk sector is written for the first time until the virtual disk reaches its maximum defined capacity. This is a big advantage because it allows the efficient storage of mostly empty virtual disk images (e.g. right after their initial setup).
But their size is not automatically reduced when files are deleted: the sectors those files occupied still hold the old data. This is where the ‘zeroing the empty space’ step kicks in. Once these sectors contain only zeros, they can be treated as ‘empty’ in the virtual disk image file. The next time the image is compacted or copied, they will not be carried over to the new image.
In the case of a freshly installed Linux system, you can fill up the empty space by creating a temporary file on the root partition, filling it with zeros until all remaining disk space is used up, then deleting it and shutting down the system.
dd if=/dev/zero of=/target/temp.img bs=4M; sync; rm /target/temp.img
I’m assuming here that the target partition on the virtual disk you want to compress is mounted as /target. The first part (dd, short for disk dump) writes zeros to a temporary file until all available disk space on the partition is used up; dd then stops with a “No space left on device” error, which is expected here. After the file system is synchronized (to make sure everything has actually been written to disk), the temporary file is deleted again.
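In a CI/CD pipeline the non-zero exit status of dd can abort the job if the shell runs with `set -e`. Here is a minimal sketch of the same step wrapped in a function that tolerates the expected failure. The function name `zero_fill_free_space` and its optional second argument (a block count, useful for a dry run) are my own additions for illustration, not part of the original one-liner:

```shell
#!/bin/sh
# Zero-fill the free space below a mount point, then remove the filler file.
# NOTE: zero_fill_free_space is a hypothetical helper name; the optional
# second argument (number of 4M blocks) is an assumption added for dry runs.
zero_fill_free_space() {
    target="$1"
    max_blocks="${2:-}"
    filler="$target/temp.img"

    if [ -n "$max_blocks" ]; then
        # Dry run: write only a limited number of blocks.
        dd if=/dev/zero of="$filler" bs=4M count="$max_blocks" 2>/dev/null || true
    else
        # Full run: dd ends with "No space left on device" and a non-zero
        # exit status once the partition is full, so the failure is ignored.
        dd if=/dev/zero of="$filler" bs=4M 2>/dev/null || true
    fi

    # Flush everything to disk before deleting the filler file.
    sync
    rm -f "$filler"
}
```

Calling `zero_fill_free_space /target` should behave like the one-liner above; `zero_fill_free_space /target 1` writes a single 4M block only, which is handy for checking the pipeline step without filling the disk.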
I’m then using qemu-img to create a copy of the disk image. Here both the source and the destination are VMDK images:
qemu-img convert -p -f vmdk -O vmdk /path/to/src.vmdk /path/to/dest.vmdk
The option -p displays a progress meter (it is optional), -f specifies the source format, and -O the destination format. The last two parameters are the input and output filenames.
If you prefer qcow2 formatted disk images, you can make use of the compress option to reduce the image size even further. The source format here is again VMDK; the option -c stands for compress:
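To see how much the copy actually saved, you can compare the space both files occupy on disk. This is a small sketch using `du`; the helper names `disk_usage_kib` and `report_savings` are made up for this example:

```shell
#!/bin/sh
# Print how many KiB a file actually occupies on disk.
# NOTE: disk_usage_kib and report_savings are hypothetical helper names.
disk_usage_kib() {
    du -k "$1" | cut -f1
}

# Compare the on-disk size of the source image and its compacted copy.
report_savings() {
    before=$(disk_usage_kib "$1")
    after=$(disk_usage_kib "$2")
    echo "before: ${before} KiB, after: ${after} KiB, saved: $((before - after)) KiB"
}
```

For example, `report_savings /path/to/src.vmdk /path/to/dest.vmdk`. Running `qemu-img info` on an image is another option; it reports both the virtual size and the disk size.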
qemu-img convert -c -p -f vmdk -O qcow2 /path/to/src_image.vmdk /path/to/dest_image.qcow2
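If a pipeline needs both variants, the two commands can be put behind one small function. This is only a sketch: the function name `compact_image` and the `QEMU_IMG` variable (which lets you swap in another binary, or `echo` for a dry run) are assumptions for illustration. Compression is requested for qcow2 only, mirroring the two commands in this post:

```shell
#!/bin/sh
# Binary to call; override it (e.g. QEMU_IMG=echo) to print the command
# instead of running it. NOTE: QEMU_IMG and compact_image are hypothetical names.
QEMU_IMG="${QEMU_IMG:-qemu-img}"

# Usage: compact_image <source.vmdk> <vmdk|qcow2> <destination>
compact_image() {
    src="$1"
    dest_format="$2"
    dest="$3"

    case "$dest_format" in
        qcow2)
            # qcow2 copy with compression (-c), as in the second command.
            "$QEMU_IMG" convert -c -p -f vmdk -O qcow2 "$src" "$dest"
            ;;
        vmdk)
            # Plain VMDK to VMDK copy, as in the first command.
            "$QEMU_IMG" convert -p -f vmdk -O vmdk "$src" "$dest"
            ;;
        *)
            echo "unsupported destination format: $dest_format" >&2
            return 1
            ;;
    esac
}
```

With `QEMU_IMG=echo` set, `compact_image /path/to/src_image.vmdk qcow2 /path/to/dest_image.qcow2` just prints the arguments it would pass to qemu-img, which makes it easy to check the call before a real run.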
So in the end, by simply creating a copy and thereby getting rid of empty space, the resulting image can be reduced to only a bit more than the overall size of the files in the image.