One of the most compelling features of virtualization for me is the speed and flexibility in provisioning systems. There are a number of logical steps you can perform quite easily in order to get systems up really quickly. Just this week on an average KVM host with 10k RPM disks I brought up a minimal Fedora 13 VM ready to go in DNS on the network in just over 30 seconds: 14 to build, 18 to boot. It still brings a smile to my face. :) So, let's get started.
There are two basic components to a quick provisioning process:
1. Source preparation - This is the piece that takes an existing system and packages it up (what I'll call bundling) such that it can be quickly placed onto the disk of new VMs. As you'll see, I use trusty old tar to do this.
2. VM Build Process - This is the big piece that does "the rest." Everything that a new VM needs will be done here in a completely automated "one click" process. At a high level, MACs and IPs will be assigned, disk device(s) will be allocated, filesystem(s) will be made, the bundle will be extracted onto the new filesystems, local system files will be modified, DNS will be updated, and the VM will be booted.
Source Preparation
This section is going to be short. Once I have a machine in a state where it's ready to be used as a template for future VMs, I run a bundle process on it. The meat of the script to perform this is:
bundle=/tmp/$name.tar.gz
cd /
tar czpf $bundle --exclude "./tmp/*" \
--exclude ./lost+found \
--one-file-system . 2>&1 | grep -v 'socket ignored'
I can get away with this because I build my template systems on a single filesystem to make this process a snap. The operating system goes in one filesystem and data will be mounted separately (likely with logical volume management) so the --one-file-system option will ignore it. Simply adjust the tar command if you have broken out /var, /boot, and others so that you don't miss any required files.
The actual script that I use to perform this process does more than just this, but not all that much. Besides spending time getting options, printing usage information, and providing a means to update itself, it transfers the resulting tar file up to a web service that I have running on a host serving as a bundle repository. That way, my bundles get "published" and can be used across the hosting environment. Note that one large side benefit of the script's simplicity is that bundles can be used for both Xen and KVM virtual machines. It will be up to the provisioning process to make the minor adjustments to account for hypervisor differences.
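The tar step can be wrapped in a small function so it can be tried against a scratch directory before pointing it at a real template. This is only a minimal sketch rather than the actual script: `make_bundle` and its two arguments are hypothetical names, and it assumes GNU tar.

```shell
# Hypothetical helper: bundle the tree under <root> into <bundle>.
# Assumes GNU tar (for --one-file-system and the --exclude patterns).
make_bundle() {
    local root="$1" bundle="$2" messages status
    messages=$(tar -C "$root" -czpf "$bundle" \
        --exclude './tmp/*' \
        --exclude './lost+found' \
        --one-file-system . 2>&1)
    status=$?
    # Hide the harmless "socket ignored" noise but keep any other message.
    printf '%s\n' "$messages" | grep -v -e 'socket ignored' -e '^$' >&2 || true
    # Return tar's own exit status so callers can detect a failed bundle.
    return "$status"
}

# Example: bundle the running system, as in the snippet above.
#   make_bundle / /tmp/$name.tar.gz
```

Capturing tar's messages first, instead of piping them straight into grep, keeps tar's exit status available to the caller.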
VM Build Process
As I just mentioned, the bundles I create can be used with both Xen and KVM. When using libvirt as a middleman between configuration and hypervisor there are very few differences in the provisioning processes. I'm pretty anal so I do have separate programs to do both Xen and KVM, but that's by no means required and I probably will just combine the two at some point.
One of the differences between Xen and KVM is that Xen has the ability to assign logical volumes as individual disk partitions in the guest VM. For example, /dev/vg_xen/testvm-root can be presented to the VM named testvm as /dev/xvda1. This is great, because a filesystem can be created on that volume on the Xen host, quickly mounted up, and then the bundle can be untarred directly into it. When the VM boots, it'll see that filesystem as /dev/xvda1, its root filesystem.
From my research and testing, KVM does not have this ability. Instead, logical volumes can only be presented as whole disks, so /dev/vg_kvm/testvm-vda could only be presented as /dev/vda in the VM. This brings the (minor) annoyance of having to partition the disk from the KVM host first and then play the kpartx game before unbundling. I'll do that in the example below since this method could also be used for Xen.
Ok, with that out of the way, let's get down to it. I find that it's easiest to start with a list of all the information I'll need for a fully functional VM and then go from there:
- VM name (obviously)
- MAC address: Not absolutely required but I like to assign these programmatically.
- IP address: Again, not absolutely required since you could just use DHCP, but I like to assign these and possibly push the assignments to DHCP.
- Bridge: You'll need to know what bridge to use for your VM's network.
- Disk Device: This piece can be huge ... it can be something as simple as a file path or it can become a big process to generate an iSCSI LUN on a remote server and connect to it from the provisioning host. In this example, I'll use a local LVM volume group. In future blog entries I'll probably talk a lot about iSCSI.
- CPU count
- Memory
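As an aside on the MAC address item, here is one possible way to assign MACs programmatically. It is an assumption for illustration, not the method used in my real scripts: the hypothetical `gen_mac` hashes the VM name with md5sum and puts the first three bytes behind the 52:54:00 prefix that libvirt conventionally uses for KVM guests.

```shell
# Hypothetical helper: derive a stable MAC address from the VM name.
# The first six hex digits of the name's MD5 hash become the last three bytes.
gen_mac() {
    printf '%s' "$1" | md5sum | \
        sed 's/^\(..\)\(..\)\(..\).*$/52:54:00:\1:\2:\3/'
}

# Example:
#   mac=$(gen_mac testvm)
```

The same name always yields the same MAC, which is handy for rebuilds. Only 24 bits come from the hash, so a real script should still check the result against the addresses already assigned.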
For the purposes of this blog entry, I'm going to chop the stuff that isn't necessarily all that interesting: MAC/IP/bridge assignment and CPU/memory definition. So for simplicity I'll take these items as variables used by a bash function named vm_build. I'll just spit out the function (from the KVM build process) in chunks and discuss along the way. Where my "real" scripts do more I'll try to make a comment.
vm_build()
{
vm=$1
### do name validation in real script
echo
echo "=== Creating new linux domain: $vm ==="
echo Bundle: $bundle
echo Memory: $mem
echo CPUs: $cpu
echo MAC: $mac
echo IP: $ip
echo Bridge: $bridge
echo Volume Group: $vg
echo Root Filesystem Size: $size_root
echo Swap Size: $size_swap
echo Filesystem Type: $fs
echo Boot on Completion: $boot
xml=/etc/libvirt/qemu/${vm}.xml
sed "s/\$NAME/$vm/" $BASE_XML | \
sed "s/\$MEM/$mem/" | \
sed "s/\$CPU/$cpu/" | \
sed "s/\$IP/$ip/" | \
sed "s/\$BRIDGE/$bridge/" | \
sed "s/\$MAC/$mac/" > $xml
This first chunk just displays the variables that will be defined by the time the function is ready to build a VM. You could take these as command-line options, have functions pick them, or, like I do, use a mixture of both. I generally take bundle, memory, cpu, size_root, size_swap, and fs as command-line options since they're a unique decision for each VM. The network info and disk volume get generated programmatically, since it would be annoying to manually create unique MACs and manually pick open IP addresses. :)
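To make the placeholder substitution at the end of that chunk concrete, here is a minimal sketch against a tiny stand-in template. The XML is a hypothetical, cut-down libvirt domain definition, not the real template, and `write_sample_template` and `render_xml` are illustrative helper names. It uses one sed invocation with several expressions instead of a chain of pipes.

```shell
# Hypothetical stand-in for $BASE_XML; a real template will have more in it.
# Note: libvirt reads <memory> in KiB by default, so $mem is assumed to be KiB here.
write_sample_template() {
    cat > "$1" <<'XML'
<domain type='kvm'>
  <name>$NAME</name>
  <memory>$MEM</memory>
  <vcpu>$CPU</vcpu>
  <os>
    <type arch='x86_64'>hvm</type>
    <boot dev='hd'/>
  </os>
  <devices>
    <disk type='block' device='disk'>
      <source dev='$ROOT'/>
      <target dev='vda' bus='virtio'/>
    </disk>
    <interface type='bridge'>
      <mac address='$MAC'/>
      <source bridge='$BRIDGE'/>
      <model type='virtio'/>
    </interface>
  </devices>
</domain>
XML
}

# Fill the placeholders from the same shell variables vm_build uses.
# "@" is the sed delimiter because $lv_root contains slashes.
render_xml() {
    sed -e "s@\$NAME@$vm@" \
        -e "s@\$MEM@$mem@" \
        -e "s@\$CPU@$cpu@" \
        -e "s@\$MAC@$mac@" \
        -e "s@\$BRIDGE@$bridge@" \
        -e "s@\$ROOT@$lv_root@" "$1"
}

# Example:
#   write_sample_template /tmp/sample.xml
#   render_xml /tmp/sample.xml > /tmp/testvm.xml
```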
Once those variables are in place, the sed pipeline feeds them into a template libvirt XML file (defined as $BASE_XML), which we'll use to define the VM. Here's a link to this template:
base_kvm_linux.xml
Now to the dirty part ... the actual disk provisioning.
echo creating logical volumes...
lv_size=$((size_root + size_swap))
lv_root=/dev/$vg/$vm
lvcreate -L ${lv_size}m -n $lv_root
if [ $? -ne 0 ]; then
echo "error: could not create root volume for $vm...aborting!"
return 3
fi
sed -i "s@\$ROOT@$lv_root@" $xml
parted -s $lv_root mktable msdos
parted -s -- $lv_root mkpart primary ext2 0 $size_root
parted -s -- $lv_root mkpart primary linux-swap $size_root -1
parted -s -- $lv_root set 1 boot on
# brutal hack workaround since parted creates weird mapper entries
mapper_name=`echo $vm | sed 's/-/--/g'`
dmsetup remove /dev/mapper/*${mapper_name}*p1
dmsetup remove /dev/mapper/*${mapper_name}*p2
kpartx -a -p P $lv_root
part_root=/dev/mapper/${vm}P1
part_swap=/dev/mapper/${vm}P2
mkswap $part_swap
echo building root filesystem...
mkfs.$fs -q $part_root
mkdir -p /mnt/${vm}
mount $part_root /mnt/${vm}
echo extracting operating system...
tar -C /mnt/$vm -xzf $bundle
sync # gotta sync otherwise the grub will occasionally fail???
The above allocates a logical volume, partitions it, plays device mapper games to expose the partitions on the KVM host, makes swap on one partition, creates a filesystem (type specified by $fs - I use ext3 and ext4 as needed) on the other and mounts it, and finally extracts the bundle. It's a lot of steps for what is really not much activity. As I mentioned before, with Xen this *can* be simplified by creating two logical volumes: one for root and one for swap. The mkfs runs directly on the root logical volume and that volume gets mounted, which eliminates all the partitioning steps and device mapper complexity.
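The Xen simplification can be sketched as follows. This is a hedged illustration rather than my Xen script: `xen_disk_steps` is a hypothetical helper that only prints the commands, and the "-swap" volume name is an assumption (the "-root" name follows the /dev/vg_xen/testvm-root example from earlier).

```shell
# Hypothetical helper: print (not run) the disk steps for the two-volume Xen layout.
# Arguments: volume group, VM name, root size (MB), swap size (MB), filesystem type.
xen_disk_steps() {
    local vg="$1" vm="$2" size_root="$3" size_swap="$4" fs="$5"
    local lv_root="/dev/$vg/${vm}-root" lv_swap="/dev/$vg/${vm}-swap"
    cat <<EOF
lvcreate -L ${size_root}m -n ${vm}-root $vg
lvcreate -L ${size_swap}m -n ${vm}-swap $vg
mkfs.$fs -q $lv_root
mkswap $lv_swap
mkdir -p /mnt/$vm
mount $lv_root /mnt/$vm
EOF
}

# Example: review the output first, then run it for real.
#   xen_disk_steps vg_xen testvm 4096 512 ext4
#   xen_disk_steps vg_xen testvm 4096 512 ext4 | sh -e
```

Compared with the KVM chunk there is no parted, dmsetup, or kpartx step at all, because each logical volume maps directly to a guest partition.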
From here, we go to the local modifications. For simplicity's sake again, I'll just assume that we only build Red Hat based systems. Debian would have its configuration files in other locations.
# update the hostname
sed -i "s/HOSTNAME=.*/HOSTNAME=${vm}.${dns_domain}/" \
/mnt/$vm/etc/sysconfig/network
# change the grub boot device
sed -i 's@root=[^ ]\{1,\} @root=/dev/vda1 @' \
/mnt/$vm/boot/grub/grub.conf
# create a base fstab using the virtio vda devices
cat >/mnt/$vm/etc/fstab <<+++
/dev/vda1 / $fs defaults,noatime 1 1
/dev/vda2 swap swap defaults 0 0
tmpfs /dev/shm tmpfs defaults 0 0
devpts /dev/pts devpts gid=5,mode=620 0 0
sysfs /sys sysfs defaults 0 0
proc /proc proc defaults 0 0
+++
# clear out the MAC address
sed -i '/^HWADDR/d' \
/mnt/$vm/etc/sysconfig/network-scripts/ifcfg-eth0
sed -i "s/^IPADDR=.*/IPADDR=$ip/" \
/mnt/$vm/etc/sysconfig/network-scripts/ifcfg-eth0
# for now, just remove persistent rules files
rm -f /mnt/$vm/etc/udev/rules.d/70-persistent-net.rules
# disable selinux
if [ -f /mnt/$vm/etc/sysconfig/selinux ]; then
echo 'SELINUX=disabled' > /mnt/$vm/etc/sysconfig/selinux
fi
Mostly it's just a matter of making sure that the system boots up on the device partitions that KVM is presenting (Xen will use xvda rather than vda) and that our eth0 network interface comes up with its own MAC address and IP. SELinux is disabled since it will hose stuff up in the event that our template VM had it enabled.
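Since Debian keeps its configuration elsewhere, here is a rough sketch of how the hostname step could branch on what it finds in the unpacked bundle. `set_vm_hostname` is a hypothetical helper, and it assumes that Debian-style systems want only the short name in /etc/hostname; the rest of the local modifications would need similar treatment.

```shell
# Hypothetical helper: set the hostname inside an unpacked bundle at <root>.
# Assumes GNU sed (for -i).
set_vm_hostname() {
    local root="$1" fqdn="$2"
    if [ -f "$root/etc/sysconfig/network" ]; then
        # Red Hat family: same edit as in the chunk above.
        sed -i "s/^HOSTNAME=.*/HOSTNAME=$fqdn/" "$root/etc/sysconfig/network"
    elif [ -f "$root/etc/hostname" ]; then
        # Debian family: /etc/hostname holds the short name only.
        echo "${fqdn%%.*}" > "$root/etc/hostname"
    else
        echo "error: no known hostname file under $root" >&2
        return 1
    fi
}

# Example:
#   set_vm_hostname /mnt/$vm ${vm}.${dns_domain}
```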
Now we need to make sure that the new VM will actually boot with grub. In Xen this is not at all required since it uses the beautiful concept of pygrub, but as far as I am aware KVM has no such ability and it requires that a boot loader be installed on each VM's root volume.
mount --bind /dev /mnt/$vm/dev
mount -t proc none /mnt/$vm/proc
cat >/mnt/$vm/vm-grub <<EOF
#!/bin/bash
ln -s $part_root ${lv_root}1
grub <<+++
device (hd0) $lv_root
root (hd0,0)
setup (hd0)
+++
rm -f ${lv_root}1
EOF
chmod 755 /mnt/$vm/vm-grub
echo chroot /mnt/$vm /vm-grub
chroot /mnt/$vm /vm-grub
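rm -f /mnt/$vm/vm-grub # don't leave the helper script inside the new VM
umount /mnt/$vm/proc
umount /mnt/$vm/dev
umount /mnt/$vm
rmdir /mnt/$vm # rmdir, not rm -rf: deletes nothing if the umount failed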
kpartx -d -p P $lv_root
If there is a better, simpler way to do this, please, I am all ears. I struggled through this piece the longest in the conversion of my Xen provisioning scripts to KVM and would really love to clean this up. I mean, I love taking any chance I can to use chroot and all, but one command sure would be nice. :)
Anyway, finally, we get to the definition of the VM and its registration with DNS.
echo -n Defining VM...
virsh define $xml &>/dev/null
if [ $? -eq 0 ]; then
echo "success!"
echo Configuring VM to autostart...
virsh autostart $vm
if [ "$boot" -eq 1 ]; then
virsh start $vm
fi
else
echo "failure!"
fi
### call DNS registration in the real script
echo
echo "=== Creation of $vm complete ==="
}
And that's it! Not so bad, eh? :) The VM is built and started up if desired. When using ext4 and a compressed bundle of 400MB or so, the fastest I've provisioned a machine on 10k RPM disks was 14 seconds. I'd love to bring this number down to single digits so perhaps a nice RAID of 15k RPM disks is in my future.
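The DNS registration call is left out of the function above. As a minimal sketch, assuming a DNS server that accepts dynamic updates through nsupdate (which may not match how my real script does it), a hypothetical `dns_update_script` helper could generate the update commands:

```shell
# Hypothetical helper: emit an nsupdate script that (re)registers an A record.
# Arguments: fully qualified name, IP address, optional TTL (defaults to 3600).
dns_update_script() {
    local fqdn="$1" ip="$2" ttl="${3:-3600}"
    cat <<EOF
update delete $fqdn A
update add $fqdn $ttl A $ip
send
EOF
}

# Example: /path/to/keyfile is a placeholder for your own update key.
#   dns_update_script "${vm}.${dns_domain}" "$ip" | nsupdate -k /path/to/keyfile
```

Deleting the old A record before adding the new one keeps a rebuilt VM from ending up with two addresses.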
I hope this proves useful and I'd absolutely love to receive any and all feedback. Until next time!