Archive for the 'libvirt' Category

qemu-img Backing Files: A Poor Man's Snapshot/Rollback

I often like to write up detailed steps when trying to reproduce a bug or a working setup. VMs are great for this because they can be manipulated easily. To manipulate their disk images I use qemu-img to create new disk images that use other disk images as a backing store. This is what I like to call a "poor man's" way to do snapshots because the snapshotting process is a bit manual, but that is also why I like it; I don't touch the original disk image at all, so I have full confidence I haven't compromised it.

NOTE: I use QEMU/KVM/libvirt, so those are the tools used in this example.

Taking A Snapshot

In order to take a snapshot, first shut down the VM and then simply create a new disk image that uses the original disk image as a backing store:

$ sudo virsh shutdown F21server
Domain F21server is being shutdown
$ sudo qemu-img create -f qcow2 -b /guests/F21server.img /guests/F21server.qcow2.snap
Formatting '/guests/F21server.qcow2.snap', fmt=qcow2 size=21474836480 backing_file='/guests/F21server.img' encryption=off cluster_size=65536 lazy_refcounts=off

This new disk image is a copy-on-write (COW) snapshot of the original image, which means any writes will go into the new image, but any reads of unmodified blocks will be served from the original image. A benefit of this is that the size of the new file starts off near zero and increases only as modifications are made.
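You can verify the relationship between the two images at any time with qemu-img info. The output below is a sketch of what to expect right after creation; the exact disk size will vary, but it should be tiny:

$ sudo qemu-img info /guests/F21server.qcow2.snap
image: /guests/F21server.qcow2.snap
file format: qcow2
virtual size: 20G (21474836480 bytes)
disk size: 196K
cluster_size: 65536
backing file: /guests/F21server.img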

To get the virtual machine to pick up and start using the new COW disk image, we need to modify the libvirt XML to point it at the new file:

$ sudo virt-xml F21server --edit target=vda --disk driver_type=qcow2,path=/guests/F21server.qcow2.snap --print-diff
--- Original XML
+++ Altered XML
@@ -27,8 +27,8 @@
   <devices>
     <emulator>/usr/bin/qemu-kvm</emulator>
     <disk type="file" device="disk">
-      <driver name="qemu" type="raw"/>
-      <source file="/guests/F21server.img"/>
+      <driver name="qemu" type="qcow2"/>
+      <source file="/guests/F21server.qcow2.snap"/>
       <target dev="vda" bus="virtio"/>
       <address type="pci" domain="0x0000" bus="0x00" slot="0x07" function="0x0"/>
     </disk>
$
$ sudo virt-xml F21server --edit target=vda --disk driver_type=qcow2,path=/guests/F21server.qcow2.snap
Domain 'F21server' defined successfully.

You can now start your VM and make changes as you wish. Be destructive if you like; the original disk image hasn't been touched.
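Starting the guest back up is just a normal virsh start:

$ sudo virsh start F21server
Domain F21server started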

After making a few changes I had around 15M of differences between the original image and the snapshot:

$ du -sh /guests/F21server.img
21G     /guests/F21server.img
$ du -sh /guests/F21server.qcow2.snap
15M     /guests/F21server.qcow2.snap

Going Back

To go back to the point where you started, you must first delete the file that you created (/guests/F21server.qcow2.snap). You then have two options:

  • Create another disk image using the original as a backing file.
  • Go back to using the original image directly.

If you want to continue testing and be able to return to your starting point, delete and recreate the COW snapshot disk image:

$ sudo rm /guests/F21server.qcow2.snap
$ sudo qemu-img create -f qcow2 -b /guests/F21server.img /guests/F21server.qcow2.snap
Formatting '/guests/F21server.qcow2.snap', fmt=qcow2 size=21474836480 backing_file='/guests/F21server.img' encryption=off cluster_size=65536 lazy_refcounts=off

If you want to go back to your original setup, then you'll also need to change the XML back to what it was before:

$ sudo rm /guests/F21server.qcow2.snap
$ sudo virt-xml F21server --edit target=vda --disk driver_type=raw,path=/guests/F21server.img
Domain 'F21server' defined successfully.

Committing Changes

If you decide that the changes you have made are ones you want to carry forward, you can commit the changes in the COW disk image back into the backing disk image. In the case below I have 15M worth of changes that get committed back into the original image. I then edit the XML accordingly and can start the guest with all the changes baked back into the original disk image:

$ sudo qemu-img info /guests/F21server.qcow2.snap
image: /guests/F21server.qcow2.snap
file format: qcow2
virtual size: 20G (21474836480 bytes)
disk size: 15M
cluster_size: 65536
backing file: /guests/F21server.img
$ sudo qemu-img commit /guests/F21server.qcow2.snap
Image committed.
$ sudo rm /guests/F21server.qcow2.snap
$ sudo virt-xml F21server --edit target=vda --disk driver_type=raw,path=/guests/F21server.img
Domain 'F21server' defined successfully.

Fin

This backing file approach is useful because it's much more convenient than making multiple copies of huge disk image files, but it can be used for much more than just snapshotting and reverting changes. It can also be used, for example, to start 100 virtual machines from a common backing image, saving an enormous amount of disk space. Go ahead and try it!
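As a quick sketch of that last idea, a loop like the one below would stamp out overlay images for many guests from one common base. The /guests/base.img path and the count are hypothetical, and each guest would still need its own libvirt definition:

$ for i in $(seq 1 100); do sudo qemu-img create -f qcow2 -b /guests/base.img /guests/vm${i}.qcow2; done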

Happy Snapshotting!
Dusty

Nested Virt and Fedora 20 Virt Test Day

Introduction


I decided this year to take part in the Fedora Virtualization Test Day on October 8th. In order to take part I needed a system with Fedora 20 installed so that I could then create VMs on top of it. Since I like my current setup and didn't have a spare hard drive lying around that I wanted to wipe, I decided to give nested virtualization a shot.

Most of the documentation I have seen for nested virtualization comes from Kashyap Chamarthy. Relevant posts are here, here, and here. He has done a great job with these tutorials, and this post is nothing more than my notes on what I found to work for me.

Steps


With nested virtualization, the OS/hypervisor that touches the physical hardware is known as L0. The first level of virtualized guest is known as L1. The second level of virtualized guest (the guest inside a guest) is known as L2. In my setup I ultimately wanted F19 (L0), F20 (L1), and F20 (L2).
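Before starting, it's worth confirming that the L0 host CPU actually exposes the VMX extensions. A non-zero count from a check like this means they are present (the number itself is just the count of logical CPUs, so yours will vary):

[root@L0 ~]# grep -c vmx /proc/cpuinfo
8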

First, in order to pass the Intel VMX extensions along to the guest, I created a modprobe config file that instructs the kvm_intel kernel module to allow nested virtualization support:

[root@L0 ~]# echo "options kvm-intel nested=y" > /etc/modprobe.d/nestvirt.conf

After a reboot I could confirm that the kvm_intel module was configured for nested virt:

[root@L0 ~]# cat /sys/module/kvm_intel/parameters/nested
Y

Next I converted an existing Fedora 20 installation to use "host-passthrough" (see here) so that the L1 guest would see the same processor (with VMX extensions) as my L0 host. To do this I modified the cpu XML tags as follows in the libvirt XML definition:

<cpu mode='host-passthrough'> </cpu>
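I applied the change with virsh edit; the domain name F20 below is a placeholder for whatever your L1 guest is called:

[root@L0 ~]# virsh edit F20
Domain F20 XML configuration edited.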

After powering up the guest I could see that the processor the L1 guest sees is indeed the same as the host's:

[root@L1 ~]# cat /proc/cpuinfo | grep "model name"
model name      : Intel(R) Core(TM) i7-3770 CPU @ 3.40GHz
model name      : Intel(R) Core(TM) i7-3770 CPU @ 3.40GHz
model name      : Intel(R) Core(TM) i7-3770 CPU @ 3.40GHz
model name      : Intel(R) Core(TM) i7-3770 CPU @ 3.40GHz

Next I decided to enable nested virt in the L1 guest by adding the same modprobe config file as I did on L0. I did this based on a tip from Kashyap in the #fedora-test-day chat that it tends to give about a 10X performance improvement in the L2 guests:

[root@L1 ~]# echo "options kvm-intel nested=y" > /etc/modprobe.d/nestvirt.conf

After a reboot I could then create and install L2 guests using virt-install and virt-manager (a sketch of the kind of command I used follows below). This seemed to work fine, except that I would periodically see an unknown NMI in the guest:

[   14.324786] Uhhuh. NMI received for unknown reason 30 on CPU 0.
[   14.325046] Do you have a strange power saving mode enabled?
[   14.325046] Dazed and confused, but trying to continue

I believe the issue I was seeing may be the one documented in kernel BZ#58941. After asking about it in the chat I was informed that for the best experience with nested virt I should move to a 3.12 kernel. I decided to leave that exercise for another day :).
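For reference, here is a rough sketch of the virt-install command I mean, run on the L1 host; the guest name, disk path, and ISO location are hypothetical placeholders:

[root@L1 ~]# virt-install --name F20-L2 --ram 2048 --disk path=/guests/F20-L2.img,size=10 --network=bridge:virbr0,model=virtio -c /images/F20.iso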

Have a great day!

Dusty

Enabling QEMU Guest Agent and FSTRIM (AGAIN)


In an earlier post I walked through reclaiming disk space from guests using FSTRIM, and in a follow-up I showed how to do the same thing with thin logical volumes as the sparse backing storage for the disk images. In both of the previous posts I logged in to the guest first and then executed the fstrim command in order to release the free blocks back to the underlying block devices.

Thankfully, due to some recent work, this operation has now been exposed externally via the QEMU Guest Agent and can be executed remotely via libvirt. To enable the QEMU Guest Agent, I first added a virtio-serial device that the host and guest will use for communication. I did this by adding the following to the guest's XML:

<channel type='unix'>
  <source mode='bind' path='/var/lib/libvirt/qemu/Fedora19.agent'/>
  <target type='virtio' name='org.qemu.guest_agent.0'/>
</channel>

After a power cycle of the guest I installed the qemu-guest-agent RPM inside the guest, then started the qemu-guest-agent service using systemctl as shown below:

[root@guest ~]# yum install qemu-guest-agent
...
Installed:
  qemu-guest-agent.x86_64 2:1.4.2-2.fc19
Complete!
[root@guest ~]# systemctl start qemu-guest-agent.service
[root@guest ~]# systemctl status qemu-guest-agent.service
qemu-guest-agent.service - QEMU Guest Agent
   Loaded: loaded (/usr/lib/systemd/system/qemu-guest-agent.service; static)
   Active: active (running) since Sun 2013-06-02 16:38:18 EDT; 6s ago
 Main PID: 913 (qemu-ga)
   CGroup: name=systemd:/system/qemu-guest-agent.service
           └─913 /usr/bin/qemu-ga

Finally I could test out the fstrim functionality (again)! From the host I copied a large file into the guest:

[root@host ~]# du -sh /guests/Fedora19.img
1.3G    /guests/Fedora19.img
[root@host ~]#
[root@host ~]# scp /tmp/code.tar.gz root@192.168.100.136:/root/
root@192.168.100.136's password:
code.tar.gz                                  100% 1134MB  81.0MB/s   00:14
[root@host ~]# du -sh /guests/Fedora19.img
2.4G    /guests/Fedora19.img

Then, inside the guest I deleted the file:

[root@guest ~]# rm /root/code.tar.gz
rm: remove regular file ‘/root/code.tar.gz’? y

And finally I can remotely execute the guest-fstrim command via virsh:

[root@host ~]# virsh qemu-agent-command Fedora19 '{"execute":"guest-fstrim"}'
{"return":{}}
[root@host ~]# du -sh /guests/Fedora19.img
1.3G    /guests/Fedora19.img

This is powerful stuff because I can now remotely (via libvirt) direct all of my guests, whether there are 5 or 5000, to give back free space to the underlying sparse storage devices.
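In fact, a small shell loop over virsh list is all it takes. Here is a sketch, assuming every running guest has the agent configured as described above:

[root@host ~]# for dom in $(virsh list --name); do virsh qemu-agent-command $dom '{"execute":"guest-fstrim"}'; done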

The full guest libvirt XML from this post can be found here.

Hope everyone is having a great summer!

Dusty

Recover Space From VM Disk Images By Using Discard/FSTRIM


Sparse guest disk image files are a dream. I can have many guests on a small amount of storage because they are only using what they need. Of course, if each guest were to suddenly use all of the space in its filesystems, then the host filesystem containing the guest disk images would fill up as well. However, since filesystems grow over time rather than overnight, with proper monitoring you can foresee this event and add more storage as needed.
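A quick way to see this sparseness in action is to compare an image's apparent size with what it actually has allocated; the image path and sizes below are illustrative:

[root@host ~]# du -sh --apparent-size /guests/Fedora19.img
30G     /guests/Fedora19.img
[root@host ~]# du -sh /guests/Fedora19.img
1.1G    /guests/Fedora19.img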

Sparse guest disk images aren't all bells and whistles, though. Over time, files are created and deleted within the filesystems on the disk images, and the images themselves are no longer as compact as they once were. There is good news though: we can recover the space from all of those deleted files!

A Little History


With the rise of SSDs has come a new low-level command known as TRIM that allows the filesystem to notify the underlying block device of blocks that are no longer in use by the filesystem. This allows for improved performance on SSDs because delete operations can be handled in advance of write operations, thus speeding up writes.

Fortunately for us, this TRIM notification also has plenty of application to thinly provisioned block devices. If the filesystem can notify a thin LV or a sparse disk image of blocks that are no longer being used, then those blocks can be released back to the pool of available space.
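For the thin LV case, one way to watch this happen is to check the pool's data usage before and after a trim; a sketch, with a hypothetical volume group "vg" and thin pool "pool" (your names and numbers will differ):

[root@host ~]# lvs -o lv_name,data_percent vg/pool
  LV   Data%
  pool 42.17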

"So I should be able to recover space from my guest disk images, right?" The answer is "yes"! It is relatively new, but virtio-scsi devices (QEMU) support TRIM operations. This is available in QEMU 1.5.0 by adding discard=unmap to the -drive option. You can also bypass the QEMU command line by using Libvirt 1.0.6 and adding the discard=unmap option to disk XML.

Creating/Configuring Guest For Discard


To take advantage of discard/TRIM operations I needed a guest that utilizes virtio-scsi. I created a guest with a virtio-scsi-backed device by using the following virt-install command:

[root@host ~]# virt-install --name Fedora19 --disk path=/guests/Fedora19.img,size=30,bus=scsi --controller scsi,model=virtio-scsi --network=bridge:virbr0,model=virtio --accelerate --ram 2048 -c /images/F19.iso

The XML that was generated clearly shows that SCSI controller 0 is of model virtio-scsi, and thus all SCSI devices on that controller will be virtio-scsi devices:

<controller type='scsi' index='0' model='virtio-scsi'>
  <address type='pci' domain='0x0000' bus='0x00' slot='0x07' function='0x0'/>
</controller>

The next step was to actually notify QEMU that we want to relay discard operations from the guest to the host. This is supported in QEMU 1.5.0 (since commit a9384aff5315e7568b6ebc171f4a482e01f06526). Fortunately libvirt also added support for this in version 1.0.6 (since commit a7c4202cdd12208dcd107fde3b79b2420d863370).

For libvirt, to make all discard/TRIM operations be passed from the guest back to the host, I had to add the discard='unmap' attribute to the disk XML description. After adding the option the XML looked like the following block:

<disk type='file' device='disk'>
  <driver name='qemu' type='raw' discard='unmap'/>
  <source file='/guests/Fedora19.img'/>
  <target dev='sda' bus='scsi'/>
  <address type='drive' controller='0' bus='0' target='0' unit='0'/>
</disk>

Trimming The Fat


After a power cycle of the guest I was able to test it out. First I checked the disk image size and then copied a 1.2G file into the guest. Afterwards I confirmed the sparse disk image had increased in size on the host:

[root@host ~]# du -sh /guests/Fedora19.img
1.1G    /guests/Fedora19.img
[root@host ~]#
[root@host ~]# du -sh /tmp/code.tar.gz
1.2G    /tmp/code.tar.gz
[root@host ~]#
[root@host ~]# scp /tmp/code.tar.gz root@192.168.100.136:/root/
root@192.168.100.136's password:
code.tar.gz                                  100% 1134MB  81.0MB/s   00:14
[root@host ~]# du -sh /guests/Fedora19.img
2.1G    /guests/Fedora19.img

Within the guest I then deleted the file and executed the fstrim command in order to notify the block device that the blocks for that file (and any other files that had been deleted) were no longer being used by the filesystem:

[root@guest ~]# rm /root/code.tar.gz
rm: remove regular file ‘/root/code.tar.gz’? y
[root@guest ~]#
[root@guest ~]# fstrim -v /
/: 1.3 GiB (1372569600 bytes) trimmed

As can be seen from the output of the fstrim command, approximately 1.3 GiB was trimmed. A final check of the guest disk image confirms that the space was recovered in the host filesystem:

[root@host ~]# du -sh /guests/Fedora19.img
1.1G    /guests/Fedora19.img

If anyone is interested, I have posted my full guest libvirt XML here.

Until Next Time,
Dusty

NOTE: An easy way to tell whether TRIM operations are supported in the guest is to cat the /sys/block/sda/queue/discard_* files. On my system, which supports TRIM operations, it looks like this:

[root@guest ~]# cat /sys/block/sda/queue/discard_*
4096
4294966784
0

TRIM/SSD Reference Material:
https://patrick-nagel.net/blog/archives/337
http://www.linux-kvm.org/wiki/images/7/77/2012-forum-thin-provisioning.pdf
http://www.outflux.net/blog/archives/2012/02/15/discard-hole-punching-and-trim/

Share a Folder Between KVM Host and Guest


I often find myself in situations where I need to share information or files between a KVM host and a KVM guest. With libvirt version 0.8.5 and newer there is support for mounting a shared folder between a host and guest. I decided to try this out on my Fedora 17 host with a Fedora 17 guest.

Using the libvirt <filesystem> XML tag, I created the following XML that defines a filesystem device:

<filesystem type='mount' accessmode='mapped'>
  <source dir='/tmp/shared'/>
  <target dir='tag'/>
</filesystem>

Note that the target dir is not necessarily a mount point, but rather a string (a mount tag) that is exported to the guest and used when mounting inside the guest.

In order to get this XML into the guest I used virsh edit F17, where F17 is the domain name of my guest. This opens the guest XML in the vi text editor. I then inserted the XML at the end of the <devices> section, closed vi, and started the guest.

Once the guest had booted, I used the following command to mount the shared folder in the guest:

[root@F17 ~]# mount -t 9p -o trans=virtio,version=9p2000.L tag /mnt/shared/

And voila! I can now access the /tmp/shared directory of the host inside of the guest.
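If you want the mount to persist across reboots, an /etc/fstab entry along these lines should do it (a sketch based on the mount options above):

tag  /mnt/shared  9p  trans=virtio,version=9p2000.L  0  0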

Note: I had some SELinux denials as a result of doing this. If I were using this as a long-term solution I would clean them up, but for now I just disabled SELinux temporarily by using sudo setenforce 0 on the host.

Resources:
http://www.linux-kvm.org/page/9p_virtio
http://wiki.qemu.org/Documentation/9psetup
http://libvirt.org/formatdomain.html#elementsFilesystems