Monthly Archive for June, 2013

Enabling QEMU Guest Agent anddddd FSTRIM (AGAIN)


In an earlier post I walked through reclaiming disk space from guests using FSTRIM and in a follow up I showed how to do the same thing with thin Logical Volumes as the sparse backing storage for the disk images. In both of the previous posts I logged in to the guest first and then executed the fstrim command in order to release the free blocks back to the underlying block devices.

Thankfully, due to some recent work, this operation has now been exposed externally via qemu Guest Agent and can be executed remotely via libvirt. To enable qemu Guest Agent, first I added a virtio-serial device that the host and guest will use for communication. I did this by adding the following to the guest's xml:

<channel type='unix'> <source mode='bind' path='/var/lib/libvirt/qemu/Fedora19.agent'/> <target type='virtio' name='org.qemu.guest_agent.0'/> </channel>

After a power cycle of the guest I added the qemu-guest-agent rpm inside of my guest. Then I started the qemu-guest-agent service using systemctl as shown below:

[root@guest ~]# yum install qemu-guest-agent ... Installed: qemu-guest-agent.x86_64 2:1.4.2-2.fc19 Complete! [root@guest ~]# systemctl start qemu-guest-agent.service [root@guest ~]# systemctl status qemu-guest-agent.service qemu-guest-agent.service - QEMU Guest Agent Loaded: loaded (/usr/lib/systemd/system/qemu-guest-agent.service; static) Active: active (running) since Sun 2013-06-02 16:38:18 EDT; 6s ago Main PID: 913 (qemu-ga) CGroup: name=systemd:/system/qemu-guest-agent.service └─913 /usr/bin/qemu-ga

Finally I could test out the fstrim functionality (again)! In the host I copied the file into the guest.

[root@host ~]# du -sh /guests/Fedora19.img 1.3G /guests/Fedora19.img [root@host ~]# [root@host ~]# scp /tmp/code.tar.gz root@192.168.100.136:/root/ root@192.168.100.136's password: code.tar.gz 100% 1134MB 81.0MB/s 00:14 [root@host ~]# du -sh /guests/Fedora19.img 2.4G /guests/Fedora19.img

Then, inside the guest I deleted the file:

[root@guest ~]# rm /root/code.tar.gz rm: remove regular file ‘/root/code.tar.gz’? y

And finally I can remotely execute the guest-fstrim command via virsh:

[root@host ~]# [root@host ~]# virsh qemu-agent-command Fedora19 '{"execute":"guest-fstrim"}' {"return":{}} [root@host ~]# du -sh /guests/Fedora19.img 1.3G /guests/Fedora19.img

This is powerful stuff because I can now remotely (via libvirt) direct all of my guests, whether there be 5 or 5000, to all give back free space to underlying sparse storage devices.

The full guest libvirt XML from this post can be found here.

Hope everyone is having a great summer!

Dusty

Guest Discard/FSTRIM On Thin LVs


In my last post I showed how to recover space from disk images backed by sparse files. As a small addition I'd like to also show how to do the same with a guest disk image that is backed by a thinly provisioned Logical Volume.

First things first, I modified the /etc/lvm/lvm.conf file to have the issue_discards = 1 option set. I'm not 100% sure this is needed but I did it at the time so I wanted to include it here.

Next I created a new VG (vgthin) out of a spare partition and then created an thin LV pool (lvthinpool) inside the VG. Finally I created a thin LV within the pool (lvthin). This is all shown below:

[root@host ~]# vgcreate vgthin /dev/sda3 Volume group "vgthin" successfully created [root@host ~]# lvcreate --thinpool lvthinpool --size 20G vgthin Logical volume "lvthinpool" created [root@host ~]# [root@host ~]# lvcreate --name lvthin --virtualsize 10G --thin vgthin/lvthinpool Logical volume "lvthin" created

To observe the usages of the thin LV and the thin pool you can use the lvs command and take note of the Data% column:

[root@host ~]# lvs vgthin LV VG Attr LSize Pool Origin Data% Move Log Copy% Convert lvthin vgthin Vwi-aotz- 10.00g lvthinpool 0.00 lvthinpool vgthin twi-a-tz- 20.00g 0.00

Next I needed to add the disk to the guest. I did it using the following xml and virsh command. Note from my previous post that the scsi controller inside of my guest is a virtio-scsi controller and that I am adding the discard='unmap' option.

[root@host ~]# cat <<EOF > /tmp/thinLV.xml <disk type='block' device='disk'> <driver name='qemu' type='raw' discard='unmap'/> <source dev='/dev/vgthin/lvthin'/> <target dev='sdb' bus='scsi'/> </disk> EOF [root@host ~]# [root@host ~]# virsh attach-device Fedora19 /tmp/thinLV.xml --config ...

After a quick power cycle of the guest I then created a filesystem on the new disk (sdb) and mounted it within the guest.

[root@guest ~]# mkfs.ext4 /dev/sdb ... [root@guest ~]# [root@guest ~]# mount /dev/sdb /mnt/

Same as last time, I then copied a large file into the guest. After I did so you can see from the lvs output that the thin LV is now using 11% of its allotted space within the pool.

[root@host ~]# lvs vgthin LV VG Attr LSize Pool Origin Data% Move Log Copy% Convert lvthin vgthin Vwi-aotz- 10.00g lvthinpool 1.34 lvthinpool vgthin twi-a-tz- 20.00g [root@host ~]# [root@host ~]# scp /tmp/code.tar.gz root@192.168.100.136:/mnt/ root@192.168.100.136's password: code.tar.gz 100% 1134MB 29.8MB/s 00:38 [root@host ~]# [root@host ~]# lvs vgthin LV VG Attr LSize Pool Origin Data% Move Log Copy% Convert lvthin vgthin Vwi-aotz- 10.00g lvthinpool 11.02 lvthinpool vgthin twi-a-tz- 20.00g

It was then time for a little TRIM action:

[root@guest ~]# df -kh /mnt/ Filesystem Size Used Avail Use% Mounted on /dev/sdb 9.8G 1.2G 8.1G 13% /mnt [root@guest ~]# [root@guest ~]# [root@guest ~]# rm /mnt/code.tar.gz rm: remove regular file ‘/mnt/code.tar.gz’? y [root@guest ~]# [root@guest ~]# df -kh /mnt/ Filesystem Size Used Avail Use% Mounted on /dev/sdb 9.8G 23M 9.2G 1% /mnt [root@guest ~]# fstrim -v /mnt/ /mnt/: 1.2 GiB (1329049600 bytes) trimmed

And from within the host we can see that the utilization of the thin LV has appropriately dwindled back down to ~2.85%

[root@host ~]# lvs vgthin LV VG Attr LSize Pool Origin Data% Move Log Copy% Convert lvthin vgthin Vwi-aotz- 10.00g lvthinpool 2.85 lvthinpool vgthin twi-a-tz- 20.00g 1.42

Again I have posted my full guest libvirt XML here.

Dusty

PS See here for a more thorough example of creating thin LVs.

Recover Space From VM Disk Images By Using Discard/FSTRIM


Sparse guest disk image files are a dream. I can have many guests on a small amount of storage because they are only using what they need. Of course, if each guest were to suddenly use all of the space in their filesystems then the host filesystem containing the guest disk images would fill up as well. However, since filesystems grow over time rather than overnight, with proper monitoring you can foresee this event and add more storage as needed.

Sparse guest disk images aren't all bells and whistles though. Over time files are created/deleted within the filesystems on the disk images and the images themselves are no longer as compact as they were in the past. There is good news though; we can recover the space from all of those deleted files!

A Little History


With the rise of SSDs has come along a new low level command known as TRIM that allows the filesystem to notify the underlying block device of blocks that are no longer in use by the filesystem. This allows for improved performance in SSDs because delete operations can be handled in advance of write operations, thus speeding up writes.

Fortunately for us this TRIM notification also has plenty of application with thinly provisioned block devices. If the filesystem can notify a thin LV or a sparse disk image of blocks that are no longer being used then the blocks can be released back to the pool of available space.

"So I should be able to recover space from my guest disk images, right?" The answer is "yes"! It is relatively new, but virtio-scsi devices (QEMU) support TRIM operations. This is available in QEMU 1.5.0 by adding discard=unmap to the -drive option. You can also bypass the QEMU command line by using Libvirt 1.0.6 and adding the discard=unmap option to disk XML.

Creating/Configuring Guest For Discard


To take advantage of discard/TRIM operations I needed a guest that utilizes virtio-scsi. I created a guest with a virtio-scsi backed device by using the following virt-install command.

[root@host ~]# virt-install --name Fedora19 --disk path=/guests/Fedora19.img,size=30,bus=scsi --controller scsi,model=virtio-scsi --network=bridge:virbr0,model=virtio --accelerate --ram 2048 -c /images/F19.iso

The XML that was generated clearly shows that scsi controller 0 is of model virtio-scsi and thus all scsi devices on that controller will be virtio-scsi devices.

<controller type='scsi' index='0' model='virtio-scsi'> <address type='pci' domain='0x0000' bus='0x00' slot='0x07' function='0x0'/> </controller>

The next step was to actually notify QEMU that we want to relay discard operations from the guest to the host. This is supported in QEMU 1.5.0 (since commit a9384aff5315e7568b6ebc171f4a482e01f06526 ). Fortunately libvirt also added support for this in version 1.0.6 (since commit a7c4202cdd12208dcd107fde3b79b2420d863370 ).

For libvirt, to make all discard/TRIM operations be passed from the guest back to the host I had to add the discard='unmap' to the disk XML description. After adding the option the XML looked like the following block:

<disk type='file' device='disk'> <driver name='qemu' type='raw' discard='unmap'/> <source file='/guests/Fedora19.img'/> <target dev='sda' bus='scsi'/> <address type='drive' controller='0' bus='0' target='0' unit='0'/> </disk>

Trimming The Fat


After a power cycle of the guest I am now able to test it out. First I checked the disk image size and then copied a 1.2G file into the guest. Afterwards I confirmed the sparse disk image had increased size in the host.

[root@host ~]# du -sh /guests/Fedora19.img 1.1G /guests/Fedora19.img [root@host ~]# [root@host ~]# du -sh /tmp/code.tar.gz 1.2G /tmp/code.tar.gz [root@host ~]# [root@host ~]# scp /tmp/code.tar.gz root@192.168.100.136:/root/ root@192.168.100.136's password: code.tar.gz 100% 1134MB 81.0MB/s 00:14 : [root@host ~]# [root@host ~]# du -sh /guests/Fedora19.img 2.1G /guests/Fedora19.img

Within the guest I then deleted the file and executed the fstrim command in order to notify the block devices that the blocks for that file (and any other file that had been deleted) are no longer being used by the filesystem.

[root@guest ~]# rm /root/code.tar.gz rm: remove regular file ‘/root/code.tar.gz’? y [root@guest ~]# [root@guest ~]# fstrim -v / /: 1.3 GiB (1372569600 bytes) trimmed

As can be seen from the output of the fstrim command approximately 1.3G were trimmed. A final check of the guest disk image confirms that the space was recovered in the host filesystem.

[root@host ~]# du -sh /guests/Fedora19.img 1.1G /guests/Fedora19.img

If anyone is interested I have posted my full guest libvirt XML here .

Until Next Time,
Dusty

NOTE: An easy way to tell if trim operations are supported in the guest is to cat out the /sys/block/sda/queue/discard_* files. On my system that supports trim operations it looks like:

[root@guest ~]# cat /sys/block/sda/queue/discard_* 4096 4294966784 0

TRIM/SSD Reference Material:
https://patrick-nagel.net/blog/archives/337
http://www.linux-kvm.org/wiki/images/7/77/2012-forum-thin-provisioning.pdf
http://www.outflux.net/blog/archives/2012/02/15/discard-hole-punching-and-trim/