vmserver
This tutorial covers how to set up a production server intended to be used as a virtualization stack for a small business or educator. I am currently running a Supermicro 6028U-TRTP+ with dual 12-core Xeon E5-2650 CPUs at 2.2GHz, 384GB RAM, four two-way mirrors of Samsung enterprise SSDs for the primary vdev, and two two-way mirrors of 16TB platters for the backup vdev. All drives are SAS. I am using a 500W PSU: I estimated the RAM at about 5-10W per stick, the motherboard at about 100W, and the drives consuming most of the rest at roughly 18-22W per drive. The next step was to install Debian on the bare metal to control and manage the virtualization environment; the virtualization stack is virsh and kvm/qemu. For the file system and drive formatting, I used LUKS and pam_mount to open an encrypted home partition and mapped home directory. I use this encrypted home directory to store keys for the zfs pool and/or other sensitive data, thus protecting them behind FDE. Additionally, I create encrypted zfs datasets in each pool that are unlocked by the keys on the LUKS home partition. Instead of tracking each UUID down on your initial build, do the following:
zpool create -m /mnt/pool pool -f \
    mirror sda sdb \
    mirror sdc sdd \
    mirror sde sdf \
    mirror sdg sdh
zpool export pool
zpool import -d /dev/disk/by-id pool
Once the pool is created, you can create your encrypted datasets. To do so, I made some unlock keys with the dd command and placed them in a hidden directory inside the LUKS-encrypted home partition mentioned above:
dd if=/dev/random of=/secure/area/example.key bs=1 count=32
zfs create -o encryption=on -o keyformat=raw \
    -o keylocation=file:///secure/area/example.key pool/dataset
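Since a raw key file unlocks every dataset encrypted with it, it is worth locking down permissions on the key directory as well. A minimal sketch, assuming a placeholder key directory (substitute the hidden directory inside your LUKS-encrypted home partition):

```shell
#!/bin/sh
# Sketch: generate a raw 32-byte key with restrictive permissions.
# KEYDIR is a placeholder; point it at the hidden directory inside
# the LUKS-encrypted home partition in a real deployment.
KEYDIR="${KEYDIR:-$HOME/.zfs-keys}"
mkdir -p "$KEYDIR"
chmod 700 "$KEYDIR"                       # only the owner may list keys
dd if=/dev/random of="$KEYDIR/example.key" bs=1 count=32 2>/dev/null
chmod 600 "$KEYDIR/example.key"           # only the owner may read the key
```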
When the system reboots, the pools will automatically mount, but the encrypted datasets won't, because the LUKS-protected keys aren't available until you mount the home partition by logging in as the user that holds them. For security reasons, this must be done manually, or it defeats the entire purpose. So, once the administrator has logged in as that user in a screen session (remember, pam_mount unlocks the home partition at login), they simply detach from that session and then load the keys and mount the datasets as follows:
zfs load-key pool/dataset
zfs mount pool/dataset
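With several datasets, loading each key by hand gets tedious; a loop like the following sketch handles them all at once. The dataset names are placeholders, and the loop only calls zfs when the tool is actually present:

```shell
#!/bin/sh
# Sketch: unlock and mount every encrypted dataset in one pass.
# The dataset names below are placeholders for your own.
DATASETS="pool/dataset1 pool/dataset2"

for ds in $DATASETS; do
    if command -v zfs >/dev/null 2>&1; then
        zfs load-key "$ds"    # reads the key file set in keylocation
        zfs mount "$ds"
    else
        echo "would run: zfs load-key $ds && zfs mount $ds"
    fi
done
```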
If you have a lot of datasets, you can write a simple script to load them all at once. Since we have zfs, it's a good idea to take regular snapshots. To do that, I created a small shell script with the following commands and set it to run every 6 hours, i.e., 4 times a day:
DATE=$(date +"%Y%m%d-%H:%M:%S")
/usr/sbin/zfs snapshot -r pool/vm1dataset@backup_$DATE
/usr/sbin/zfs snapshot -r pool/vm2dataset@backup_$DATE
/usr/sbin/zfs snapshot pool@backup_$DATE
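To get the every-six-hours cadence, a cron entry along these lines works; the script path and file name are assumptions:

```
# /etc/cron.d/zfs-snapshots -- sketch; script path is an assumption
0 */6 * * * root /usr/local/sbin/zfs-snapshots.sh
```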
Make sure to manage your snapshots and retain only as many as you need, as large numbers of snapshots will impact performance. If you need to zap all of them and start over, you can use this command:
zfs list -H -o name -t snapshot | xargs -n1 zfs destroy
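For routine pruning, rather than destroying everything, a sketch like this keeps only the newest N snapshots of one dataset. The dataset name and retention count are assumptions, and the pipeline only runs when zfs is installed:

```shell
#!/bin/sh
# Sketch: destroy all but the newest KEEP snapshots of one dataset.
# pool/dataset and KEEP=28 (one week at 4/day) are assumptions.
KEEP=28
if command -v zfs >/dev/null 2>&1; then
    # -S creation sorts newest first; tail skips the first KEEP lines
    zfs list -H -o name -t snapshot -S creation pool/dataset \
        | tail -n +"$((KEEP + 1))" \
        | xargs -r -n1 zfs destroy
else
    echo "zfs not installed; pruning pipeline shown for reference"
fi
```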
Off-site full backups are essential, but they take a long time to download. For that reason, it's best to keep the images as small as possible. When using cp in your workflow, make sure to specify --sparse=always. Before powering the virtual hard disk back up, run virt-sparsify on the image to free up blocks that are allocated on the host but not actually used in the VM. For the VM to designate those blocks as empty, ensure that you are running fstrim within the VM. If you want the ls command to show the size of the virtual disk that remains after the zeroing, run qemu-img convert on the image, which creates a new copy without the ballooned size. The purged virtual hard disk image can then be copied to a backup directory, where one can compress and tarball it to further reduce its size. I use BSD tar with pbzip2 compression, which makes ridiculously small images; GNU tar glitches with the script for some reason. BSD tar can be installed with sudo apt install libarchive-tools. I made a script to automate all of those steps for a qcow2 image, and adapted it to work for raw images as well.
vm-bu-production-QCOW-loop.sh
vm-bu-production-RAW-loop.sh
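The scripts themselves aren't reproduced here, but the core qcow2 loop looks roughly like this sketch; the source and destination paths are assumptions, and virt-sparsify performs the block reclamation described above:

```shell
#!/bin/sh
# Sketch of the qcow2 backup loop: sparsify each image, then tarball
# it with BSD tar + pbzip2. SRC and DEST are assumed paths.
SRC="${SRC:-/mnt/vms/images}"
DEST="${DEST:-/backups/tarballs}"

for img in "$SRC"/*.qcow2; do
    [ -e "$img" ] || continue             # skip if the glob matched nothing
    name=$(basename "$img" .qcow2)
    virt-sparsify --compress "$img" "$DEST/$name-sparse.qcow2"
    bsdtar --use-compress-program=pbzip2 \
        -cf "$DEST/$name.tar.bz2" -C "$DEST" "$name-sparse.qcow2"
    rm "$DEST/$name-sparse.qcow2"         # keep only the tarball
done
```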
On the off-site backup machine, I originally pulled the tarballs down with a one-line rsync script, adjusting its cron timing to coordinate with when the tarballs are created.
sudo rsync -av --log-file=/home/logs/backup-of-vm-tarballs.log --ignore-existing -e 'ssh -i /home/user/.ssh/id_rsa' root@domain.com:/backups/tarballs/ /media/user/Backups/
Since then, I've switched to using rsnapshot to pull down the tarballs in some cases. The rsnapshot configurations can be found here:
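For reference, an rsnapshot configuration for this job looks roughly like the fragment below. The hostname, paths, and retention values are assumptions, and note that rsnapshot requires TAB-separated fields:

```
# /etc/rsnapshot.conf fragment (sketch; fields must be TAB-separated)
snapshot_root	/media/user/Backups/rsnapshot/
retain	daily	7
retain	weekly	4
backup	root@domain.com:/backups/tarballs/	vmserver/
```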
– Network Bridge Setup / VMs –
Up until now, I've covered how to provision the machines with virt-manager, how to back up the machines on the physical host, and how to pull those backups down to an off-site workstation. Now I will discuss how to assign each VM an external IP. The first step is to provision the physical host with a virtual switch (wrongly called a bridge) to which VMs can connect. To do this, I kept it simple and used ifup, the bridge-utils package, and some manual editing of /etc/network/interfaces.
sudo apt install bridge-utils
sudo brctl addbr br0
sudo nano /etc/network/interfaces
Now that you have created the virtual switch, you need to reconfigure the physical host's /etc/network/interfaces file to use it. In my case, I used one IP for the host itself and another for the switch, meaning that two ethernet cables are plugged into my physical host. I did this so that if I hose my virtual switch settings, I still have a separate connection to the box. Here's the configuration in interfaces:
# eth0 [1st physical port]
auto enp8s0g0
iface enp8s0g0 inet static
    address 8.25.76.160
    netmask 255.255.255.0
    gateway 8.25.76.1
    dns-nameservers 8.8.8.8
# eth1 [2nd physical port]
auto enp8s0g1
iface enp8s0g1 inet manual
auto br0
iface br0 inet static
    address 8.25.76.159
    netmask 255.255.255.0
    gateway 8.25.76.1
    bridge_ports enp8s0g1
    dns-nameservers 8.8.8.8
After that, either reboot or run systemctl restart networking.service to make the changes current. Execute ip a and you should see both external IPs on two separate interfaces, and you should see br0 state UP in the output of the second interface enp8s0g1. You should also run some ping 8.8.8.8 and ping google.com tests to confirm you can route. If anyone wants to do this in a home, small business, or other non-public-facing environment, you can easily use dhcp and provision the home/small business server's interfaces file as follows:
auto eth1
iface eth1 inet manual
auto br0
iface br0 inet dhcp
    bridge_ports eth1
The above home version allows, for example, users to have a virtual machine that gets an IP address on your LAN, which makes ssh/xrdp access far easier. If you have any trouble routing on the physical host, it could be that you do not have nameservers set up. If that's the case, do the following:
echo "nameserver 8.8.8.8" > /etc/resolv.conf
systemctl restart networking.service
Now that the virtual switch is set up, I can provision VMs and connect them to br0 in virt-manager. You can provision the VMs within the GUI using X passthrough, or use the command line. First, create a virtual disk of your desired size by executing sudo qemu-img create -f raw new.img 1000G and then run something like this:
sudo virt-install --name=new.img \
    --os-type=linux \
    --os-variant=debian10 \
    --vcpus=1 \
    --ram=2048 \
    --disk path=/mnt/vms/students/new.img \
    --graphics spice \
    --location=/mnt/vms/isos/debian-11.4.0-amd64-netinst.iso \
    --network bridge:br0
The machine will open in virt-viewer, but if you lose the connection you can reconnect easily with:
virt-viewer --connect qemu:///system --wait new.img
Once you finish installation, configure the guest OS interfaces file (sudo nano /etc/network/interfaces) with the IP you intend to assign it. You should have something like this:
auto epr1
iface epr1 inet static
    address 8.25.76.158
    netmask 255.255.255.0
    gateway 8.25.76.1
    dns-nameservers 8.8.8.8
If you are creating VMs attached to a virtual switch in the smaller home/business environment, then adjust the guest OS by executing sudo nano /etc/network/interfaces and using something like this recipe:
auto epr1
iface epr1 inet dhcp
If your guest OS uses Ubuntu, you will need extra steps to ensure that the guest OS can route. This is because Ubuntu-based distros have deprecated ifupdown in favor of netplan and disabled manual editing of /etc/resolv.conf. So, either learn netplan syntax and make interface changes in its YAML format, or install the optional resolvconf package to restore ifupdown functionality. To do the latter, adjust the VM provision script above (or use the virt-manager GUI with X passthrough) to temporarily use NAT, then override the Ubuntu defaults and restore ifupdown functionality as follows:
sudo apt install ifupdown
sudo apt remove --purge netplan.io
sudo apt install resolvconf
sudo nano /etc/resolvconf/resolv.conf.d/tail   # add the line: nameserver 8.8.8.8
systemctl restart networking.service
You should once again execute ping 8.8.8.8 and ping google.com to confirm you can route within the guest OS. If it fails, reboot and try again. It's a good idea at this point to check netstat -tulpn on both the host and in any VMs to ensure only approved services are listening.

When I first began spinning up machines, I would make template machines and then use virt-clone to make new machines, which I would then tweak for the new use case. You always get ssh host-key errors this way, and it is cumbersome and not clean. Over time, I found out how to pass preseed.cfg files to Debian through virt-install, so now I simply spin up new images with the desired parameters, and the preseed.cfg file passes nameservers, network configuration details, and ssh keys into the newly created machine. Although related, that topic stands on its own, so I wrote up the steps I took over at preseed.

One other thing people might want to do is enable some type of GUI-based monitoring tool on the physical host, like munin, cacti, or smokeping, in order to monitor snmp or other characteristics of the VMs. If so, make sure you only run those web administration panels locally and/or block ports 443/80 in a firewall. You will want to put the physical host behind a vpn, as I've documented in vpnserver-debian, and then just access it by its internal IP. This completes the tutorial on setting up a virtualization stack with virsh and qemu/kvm.
— oemb1905 2024/02/17 20:46