  • btrfsreminders
  • Jonathan Haack
  • Haack's Networking
  • support@haacksnetworking.org

btrfsreminders


Introduction

This tutorial is for Debian users who want to create a pool from JBOD disks using BTRFS and its RAID10 equivalent. Setups like this are common and helpful for virtualization environments and for hosting multiple services, whether for serious home hobbyist use or small-business production. These approaches are not designed for enterprise or large-scale production.

Overview of Design Model

Encrypting the home partition is essential because it ensures that the pool key is never directly exposed; it's behind LUKS on the boot volume, and the sysadmin keeps this credential stored offsite in KeePassXC. Thus, the physical layer is protected by LUKS with integrity. As for PAM's mounting utilities (libpam-mount), I use this method because it allows for easy remote reboots: there is no need to enter an FDE key at the post-BIOS FDE splash screen or to log in to IPMI each time. Instead, you encrypt home and then unlock it after a remote reboot by starting a screen session and running su - user; after that, detach from the session with ctrl-a d. In short, this method provides two advantages: a secure LUKS-encrypted location for keys/credentials that is not exposed if a physical compromise takes place, and the use of built-in PAM and simple UNIX login infrastructure to avoid cumbersome BIOS/IPMI-level FDE unlocking after reboot.

Installation Instructions

Let's install the BTRFS and LUKS tools and identify your hard drives:

sudo apt-get install cryptsetup libpam-mount btrfs-progs
ls -lah /dev/disk/by-id/

This installs the required packages and lists the stable /dev/disk/by-id/ paths for your hard drives. You can also use blkid, but I find ls easier here. After identifying your JBOD drives, create the key files and format each drive with LUKS, something like:

mkdir -p /home/user/.unlock
dd if=/dev/random of=/home/user/.unlock/vm.key bs=1 count=32
dd if=/dev/random of=/home/user/.unlock/wh.key bs=1 count=32
cryptsetup luksFormat /dev/disk/by-id/scsi-35002538a98416870 --key-file /home/user/.unlock/vm.key --type luks2 --cipher aes-xts-plain64 --key-size 512 --pbkdf argon2id --pbkdf-memory 4194304 --pbkdf-parallel 4 --iter-time 4000 --sector-size 4096 --use-random
cryptsetup luksFormat /dev/disk/by-id/scsi-35002538a98356f30 --key-file /home/user/.unlock/vm.key --type luks2 --cipher aes-xts-plain64 --key-size 512 --pbkdf argon2id --pbkdf-memory 4194304 --pbkdf-parallel 4 --iter-time 4000 --sector-size 4096 --use-random
cryptsetup luksFormat /dev/disk/by-id/scsi-35002538a983571d0 --key-file /home/user/.unlock/vm.key --type luks2 --cipher aes-xts-plain64 --key-size 512 --pbkdf argon2id --pbkdf-memory 4194304 --pbkdf-parallel 4 --iter-time 4000 --sector-size 4096 --use-random
cryptsetup luksFormat /dev/disk/by-id/scsi-35002538a98356590 --key-file /home/user/.unlock/vm.key --type luks2 --cipher aes-xts-plain64 --key-size 512 --pbkdf argon2id --pbkdf-memory 4194304 --pbkdf-parallel 4 --iter-time 4000 --sector-size 4096 --use-random
cryptsetup luksFormat /dev/disk/by-id/scsi-35002538a0840a300 --key-file /home/user/.unlock/vm.key --type luks2 --cipher aes-xts-plain64 --key-size 512 --pbkdf argon2id --pbkdf-memory 4194304 --pbkdf-parallel 4 --iter-time 4000 --sector-size 4096 --use-random
cryptsetup luksFormat /dev/disk/by-id/scsi-35002538a98356500 --key-file /home/user/.unlock/vm.key --type luks2 --cipher aes-xts-plain64 --key-size 512 --pbkdf argon2id --pbkdf-memory 4194304 --pbkdf-parallel 4 --iter-time 4000 --sector-size 4096 --use-random
cryptsetup luksFormat /dev/disk/by-id/scsi-35002538a084065d0 --key-file /home/user/.unlock/vm.key --type luks2 --cipher aes-xts-plain64 --key-size 512 --pbkdf argon2id --pbkdf-memory 4194304 --pbkdf-parallel 4 --iter-time 4000 --sector-size 4096 --use-random
cryptsetup luksFormat /dev/disk/by-id/scsi-35002538a98357220 --key-file /home/user/.unlock/vm.key --type luks2 --cipher aes-xts-plain64 --key-size 512 --pbkdf argon2id --pbkdf-memory 4194304 --pbkdf-parallel 4 --iter-time 4000 --sector-size 4096 --use-random
cryptsetup luksFormat /dev/disk/by-id/scsi-35000c500d775df03 --key-file /home/user/.unlock/wh.key --type luks2 --cipher aes-xts-plain64 --key-size 512 --pbkdf argon2id --pbkdf-memory 4194304 --pbkdf-parallel 4 --iter-time 4000 --sector-size 4096 --use-random
cryptsetup luksFormat /dev/disk/by-id/scsi-35000c500d7694517 --key-file /home/user/.unlock/wh.key --type luks2 --cipher aes-xts-plain64 --key-size 512 --pbkdf argon2id --pbkdf-memory 4194304 --pbkdf-parallel 4 --iter-time 4000 --sector-size 4096 --use-random
cryptsetup luksFormat /dev/disk/by-id/scsi-35000c500d7771943 --key-file /home/user/.unlock/wh.key --type luks2 --cipher aes-xts-plain64 --key-size 512 --pbkdf argon2id --pbkdf-memory 4194304 --pbkdf-parallel 4 --iter-time 4000 --sector-size 4096 --use-random
cryptsetup luksFormat /dev/disk/by-id/scsi-35000c500cb1689e3 --key-file /home/user/.unlock/wh.key --type luks2 --cipher aes-xts-plain64 --key-size 512 --pbkdf argon2id --pbkdf-memory 4194304 --pbkdf-parallel 4 --iter-time 4000 --sector-size 4096 --use-random
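
Since the twelve commands above differ only in the disk ID and the key file, the same step can be sketched as a loop. This is only a sketch under a few assumptions: it prints the commands rather than running them, luks_format_cmd is a hypothetical helper name (not part of cryptsetup), and only two IDs per group are listed to keep it short.

```shell
#!/bin/bash
# Sketch only: prints the luksFormat commands instead of running them.
# The disk IDs are copied from the commands above; extend the lists as needed.

KEY_DIR="/home/user/.unlock"

# Print one luksFormat command for a by-id name and a key file name.
# (luks_format_cmd is a hypothetical helper, named for illustration only.)
luks_format_cmd() {
  local id="$1" key="$2"
  printf 'cryptsetup luksFormat /dev/disk/by-id/%s --key-file %s/%s --type luks2 --cipher aes-xts-plain64 --key-size 512 --pbkdf argon2id --pbkdf-memory 4194304 --pbkdf-parallel 4 --iter-time 4000 --sector-size 4096 --use-random\n' \
    "$id" "$KEY_DIR" "$key"
}

# The SSDs share vm.key and the platters share wh.key.
for id in scsi-35002538a98416870 scsi-35002538a98356f30; do
  luks_format_cmd "$id" vm.key
done
for id in scsi-35000c500d775df03 scsi-35000c500d7694517; do
  luks_format_cmd "$id" wh.key
done
```

Review the printed commands before running any of them, since luksFormat destroys whatever is on the target disk.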

After you create the crypts on the hard drives and specify the key, you need to open (unlock) each crypt and give it a dedicated, unique short name, which becomes its device node under /dev/mapper/. You do this as follows:

cryptsetup luksOpen /dev/disk/by-id/scsi-35002538a98416870 ssd1 --key-file /home/user/.unlock/vm.key
cryptsetup luksOpen /dev/disk/by-id/scsi-35002538a98356f30 ssd2 --key-file /home/user/.unlock/vm.key
cryptsetup luksOpen /dev/disk/by-id/scsi-35002538a983571d0 ssd3 --key-file /home/user/.unlock/vm.key
cryptsetup luksOpen /dev/disk/by-id/scsi-35002538a98356590 ssd4 --key-file /home/user/.unlock/vm.key
cryptsetup luksOpen /dev/disk/by-id/scsi-35002538a0840a300 ssd5 --key-file /home/user/.unlock/vm.key
cryptsetup luksOpen /dev/disk/by-id/scsi-35002538a98356500 ssd6 --key-file /home/user/.unlock/vm.key
cryptsetup luksOpen /dev/disk/by-id/scsi-35002538a084065d0 ssd7 --key-file /home/user/.unlock/vm.key
cryptsetup luksOpen /dev/disk/by-id/scsi-35002538a98357220 ssd8 --key-file /home/user/.unlock/vm.key
cryptsetup luksOpen /dev/disk/by-id/scsi-35000c500d775df03 hdd1 --key-file /home/user/.unlock/wh.key
cryptsetup luksOpen /dev/disk/by-id/scsi-35000c500d7694517 hdd2 --key-file /home/user/.unlock/wh.key
cryptsetup luksOpen /dev/disk/by-id/scsi-35000c500d7771943 hdd3 --key-file /home/user/.unlock/wh.key
cryptsetup luksOpen /dev/disk/by-id/scsi-35000c500cb1689e3 hdd4 --key-file /home/user/.unlock/wh.key
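
The naming scheme above (ssd1, ssd2, ... and hdd1, hdd2, ...) can also be generated rather than typed. A minimal sketch, assuming the same key directory as above; luks_open_cmds is a hypothetical helper name, and it only prints the commands so nothing is unlocked by running it.

```shell
#!/bin/bash
# Sketch only: prints luksOpen commands; nothing is unlocked.

KEY_DIR="/home/user/.unlock"

# Print luksOpen commands for a group of disks, naming them <prefix>1, <prefix>2, ...
# Usage: luks_open_cmds <prefix> <key-file-name> <by-id-name>...
# (luks_open_cmds is a hypothetical helper, named for illustration only.)
luks_open_cmds() {
  local prefix="$1" key="$2" n=1 id
  shift 2
  for id in "$@"; do
    printf 'cryptsetup luksOpen /dev/disk/by-id/%s %s%d --key-file %s/%s\n' \
      "$id" "$prefix" "$n" "$KEY_DIR" "$key"
    n=$((n + 1))
  done
}

# Two IDs per group are shown; pass the full lists in the same order as above.
luks_open_cmds ssd vm.key scsi-35002538a98416870 scsi-35002538a98356f30
luks_open_cmds hdd wh.key scsi-35000c500d775df03 scsi-35000c500d7694517
```

The order of the IDs decides which disk becomes ssd1, ssd2, and so on, so keep it stable between runs.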

Now that we have opened the crypts at /dev/mapper/ssd# and /dev/mapper/hdd#, we can easily create a filesystem on them and/or pool them together as we see fit. I've chosen to replicate RAID10 with BTRFS as closely as possible. It should be noted that it's not a perfect replication of RAID10, since BTRFS mirrors and stripes per chunk rather than across whole devices. In the commands below, we create the pools, mount them, and then verify them:

mkdir -p /mnt/vm
mkdir -p /mnt/wh
mkfs.btrfs -f -d raid10 -m raid1 --checksum=xxhash --nodesize=32k /dev/mapper/ssd1 /dev/mapper/ssd2 /dev/mapper/ssd3 /dev/mapper/ssd4 /dev/mapper/ssd5 /dev/mapper/ssd6 /dev/mapper/ssd7 /dev/mapper/ssd8
mkfs.btrfs -f -d raid10 -m raid1 --checksum=xxhash --nodesize=32k /dev/mapper/hdd1 /dev/mapper/hdd2 /dev/mapper/hdd3 /dev/mapper/hdd4
mount -o compress-force=zstd:3,noatime,autodefrag,space_cache=v2,discard=async,commit=120 /dev/mapper/ssd1 /mnt/vm
mount -o compress=zstd:3,noatime,autodefrag,space_cache=v2,discard=async,commit=120 /dev/mapper/hdd1 /mnt/wh
btrfs filesystem show /mnt/vm
btrfs filesystem show /mnt/wh
df -h #verify all looks right!
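
As a rough sanity check on what df should report: with raid10 data and raid1 metadata, every chunk is stored twice, so roughly half of the raw space ends up usable. The arithmetic can be sketched as below. The 1000 GB device size is a placeholder I picked for illustration (the real drive sizes are not stated here), and raid10_usable is a hypothetical helper name; btrfs filesystem usage remains the authoritative number.

```shell
#!/bin/bash
# Sketch only: back-of-the-envelope usable capacity for a two-copy profile
# (raid10 data, raid1 metadata) on equally sized devices.
# (raid10_usable is a hypothetical helper, named for illustration only.)
raid10_usable() {
  local devices="$1" size_each="$2"
  # two copies of every chunk, so about half of the raw space is usable
  echo $(( devices * size_each / 2 ))
}

# Placeholder sizes: 1000 GB per device is an assumption, not the real hardware.
echo "vm pool, 8 devices: about $(raid10_usable 8 1000) GB usable"
echo "wh pool, 4 devices: about $(raid10_usable 4 1000) GB usable"
```

Compression and metadata overhead will move the real figure, so treat this only as an order-of-magnitude check.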

After the first reboot, I set persistent compression. I do this afterwards because I was getting errors when trying to set it during the initial pool build. Here's what I do for compression:

btrfs property set /mnt/vm compression zstd:3
btrfs property set /mnt/wh compression zstd:3

Maintenance and Monitoring

Once that's done and you've rebooted and tested things a few times, you can safely make a mount script for remote rebooting. This way, you reboot, log in to your user and detach, then run a simple script to unlock and mount the BTRFS pools, and you are done! Create the script with nano /usr/local/bin/btrfs-mount-datasets.sh, make it executable with chmod 750 /usr/local/bin/btrfs-mount-datasets.sh, and enter something like:

#!/bin/bash
#open SSD crypts
cryptsetup luksOpen /dev/disk/by-id/scsi-35002538a98416870 ssd1 --key-file /home/user/.unlock/vm.key
cryptsetup luksOpen /dev/disk/by-id/scsi-35002538a98356f30 ssd2 --key-file /home/user/.unlock/vm.key
cryptsetup luksOpen /dev/disk/by-id/scsi-35002538a983571d0 ssd3 --key-file /home/user/.unlock/vm.key
cryptsetup luksOpen /dev/disk/by-id/scsi-35002538a98356590 ssd4 --key-file /home/user/.unlock/vm.key
cryptsetup luksOpen /dev/disk/by-id/scsi-35002538a0840a300 ssd5 --key-file /home/user/.unlock/vm.key
cryptsetup luksOpen /dev/disk/by-id/scsi-35002538a98356500 ssd6 --key-file /home/user/.unlock/vm.key
cryptsetup luksOpen /dev/disk/by-id/scsi-35002538a084065d0 ssd7 --key-file /home/user/.unlock/vm.key
cryptsetup luksOpen /dev/disk/by-id/scsi-35002538a98357220 ssd8 --key-file /home/user/.unlock/vm.key
#open PLATTER crypts
cryptsetup luksOpen /dev/disk/by-id/scsi-35000c500d775df03 hdd1 --key-file /home/user/.unlock/wh.key
cryptsetup luksOpen /dev/disk/by-id/scsi-35000c500d7694517 hdd2 --key-file /home/user/.unlock/wh.key
cryptsetup luksOpen /dev/disk/by-id/scsi-35000c500d7771943 hdd3 --key-file /home/user/.unlock/wh.key
cryptsetup luksOpen /dev/disk/by-id/scsi-35000c500cb1689e3 hdd4 --key-file /home/user/.unlock/wh.key
#mount the btrfs r10 pool for vm
mount -o compress-force=zstd:3,noatime,autodefrag,space_cache=v2,discard=async,commit=120 /dev/mapper/ssd1 /mnt/vm
#mount the btrfs r10 pool for wh
mount -o compress=zstd:3,noatime,autodefrag,space_cache=v2,discard=async,commit=120 /dev/mapper/hdd1 /mnt/wh
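
If the script above is run twice, luksOpen and mount will complain that things are already open or mounted. One way to make it safe to re-run is to guard each step with a small check. This is a minimal sketch, not part of the script above: crypt_is_open and is_mounted are hypothetical helper names, and MAPPER_DIR is an override I added so the check can be tried without real devices.

```shell
#!/bin/bash
# Sketch only: helper checks that would let the mount script be re-run safely.
# MAPPER_DIR is overridable (an assumption for illustration and testing).
MAPPER_DIR="${MAPPER_DIR:-/dev/mapper}"

# Succeeds if the named mapping (e.g. ssd1) is already open.
crypt_is_open() {
  [ -e "$MAPPER_DIR/$1" ]
}

# Succeeds if the given path appears as a mount point in a /proc/mounts-style file.
# Usage: is_mounted /mnt/vm [mounts-file]
is_mounted() {
  local target="$1" mounts="${2:-/proc/mounts}"
  awk -v t="$target" '$2 == t { found = 1 } END { exit !found }' "$mounts"
}

# Example use inside the mount script (commented out so this sketch has no side effects):
# crypt_is_open ssd1 || cryptsetup luksOpen /dev/disk/by-id/scsi-35002538a98416870 ssd1 --key-file /home/user/.unlock/vm.key
# is_mounted /mnt/vm || mount -o compress-force=zstd:3,noatime,autodefrag,space_cache=v2,discard=async,commit=120 /dev/mapper/ssd1 /mnt/vm
```

With guards like these, a second run skips whatever is already open or mounted instead of printing errors.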

This script is designed to be run manually after a reboot. In order: you reboot, log in to the admin user via ssh, and unlock the encrypted home that holds the key directory by starting screen and running su - user (detach with ctrl-a d). After detaching, simply run the mount script with /bin/bash /usr/local/bin/btrfs-mount-datasets.sh. In the weeks ahead, it is essential to scrub the pools regularly. For that, put the following commands in a cronjob:

/usr/bin/btrfs scrub start /mnt/vm
/usr/bin/btrfs scrub start /mnt/wh

To check the status, you use:

/usr/bin/btrfs scrub status /mnt/vm
/usr/bin/btrfs scrub status /mnt/wh
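
If you want the status check to be scriptable rather than read by eye, the output can be tested for a clean error summary. A minimal sketch, assuming the status output contains a line of the form "Error summary:    no errors found" when the scrub is clean (this is what recent btrfs-progs versions print, but check it against your own output first); scrub_is_clean is a hypothetical helper name.

```shell
#!/bin/bash
# Sketch only: succeeds if scrub status text on stdin reports a clean error summary.
# Assumes a line like "Error summary:    no errors found" in the status output.
# (scrub_is_clean is a hypothetical helper, named for illustration only.)
scrub_is_clean() {
  grep -q '^Error summary:[[:space:]]*no errors found'
}

# Example use (commented out so this sketch has no side effects):
# /usr/bin/btrfs scrub status /mnt/vm | scrub_is_clean || echo "vm pool: scrub reported errors"
# /usr/bin/btrfs scrub status /mnt/wh | scrub_is_clean || echo "wh pool: scrub reported errors"
```

Note that a scrub that is still running or was never started also fails this check, so a warning means "look closer", not necessarily "data is damaged".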

In addition to scrubbing, I compiled a slew of commands to assess pool health more granularly. I put this script on a cronjob which runs and sends me a statistics report every hour:

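#!/bin/bash
DATE=$(date +"%Y%m%d-%H:%M:%S")
LOG="/root/vitals.log"
# copy everything the script prints to the log as well as to stdout (for the cron mail)
exec > >(tee -a "$LOG") 2>&1

echo "Vitals report for $DATE"
echo "Here are the RAM usage stats ..."
free -h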
 
echo "Here are the btrfs stats for the vm pool ..."
btrfs filesystem show /mnt/vm
btrfs filesystem df /mnt/vm
btrfs filesystem usage /mnt/vm
btrfs device usage /mnt/vm
btrfs scrub status /mnt/vm
btrfs device stats /mnt/vm
btrfs device stats /mnt/vm -c
mount | grep /mnt/vm
dmesg | grep -i btrfs | tail -n 40
dmesg | grep -E 'sd[ac-h]' | tail -n 30
btrfs fi show /mnt/vm | grep -i missing
btrfs fi df -h /mnt/vm
btrfs fi usage -T /mnt/vm
btrfs qgroup show /mnt/vm 2>/dev/null
btrfs subvolume list -a /mnt/vm
btrfs balance status /mnt/vm
 
echo "Here are the btrfs stats for the wh pool ..."
btrfs filesystem show /mnt/wh
btrfs filesystem df /mnt/wh
btrfs filesystem usage /mnt/wh
btrfs device usage /mnt/wh
btrfs scrub status /mnt/wh
btrfs device stats /mnt/wh
btrfs device stats /mnt/wh -c
mount | grep /mnt/wh
dmesg | grep -i btrfs | tail -n 40
dmesg | grep -E 'sd[ac-h]' | tail -n 30
btrfs fi show /mnt/wh | grep -i missing
btrfs fi df -h /mnt/wh
btrfs fi usage -T /mnt/wh
btrfs qgroup show /mnt/wh 2>/dev/null
btrfs subvolume list -a /mnt/wh
btrfs balance status /mnt/wh
 
for disk in \
  /dev/disk/by-id/wwn-0x5002538a98416870 \
  /dev/disk/by-id/wwn-0x5002538a98356f30 \
  /dev/disk/by-id/wwn-0x5002538a983571d0 \
  /dev/disk/by-id/wwn-0x5002538a0840a300 \
  /dev/disk/by-id/wwn-0x5002538a98356500 \
  /dev/disk/by-id/wwn-0x5002538a98356590 \
  /dev/disk/by-id/wwn-0x5002538a084065d0 \
  /dev/disk/by-id/wwn-0x5002538a98357220 \
  /dev/disk/by-id/wwn-0x5000c500d775df03 \
  /dev/disk/by-id/wwn-0x5000c500d7694517 \
  /dev/disk/by-id/wwn-0x5000c500d7771943 \
  /dev/disk/by-id/wwn-0x5000c500cb1689e3; do
  # the pipeline exits 0 even when nothing matches, so fall back to N/A on empty output
  temp=$(sudo smartctl -a "$disk" | grep 'Current Drive Temperature' | awk '{print $4}')
  echo "$disk: ${temp:-N/A}°C"
done
 
for disk in \
  /dev/disk/by-id/ata-SATA_SSD_22100512800207 \
  /dev/disk/by-id/ata-SATA_SSD_22100512800205; do
  # the pipeline exits 0 even when nothing matches, so fall back to N/A on empty output
  temp=$(sudo smartctl -a "$disk" | grep '^194 Temperature_Celsius' | head -n 1 | awk '{print $10}')
  echo "$disk: ${temp:-N/A}°C"
done
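
The hourly report above is long, and the lines that matter most are the btrfs device stats counters that are not zero. A minimal sketch for pulling those out, assuming the usual two-column output of btrfs device stats (for example "[/dev/mapper/ssd1].write_io_errs    0"); nonzero_device_stats is a hypothetical helper name.

```shell
#!/bin/bash
# Sketch only: print the device stats counters that are non-zero.
# Reads "btrfs device stats" output on stdin; exits 0 if every counter is zero,
# 1 if at least one is not.
# (nonzero_device_stats is a hypothetical helper, named for illustration only.)
nonzero_device_stats() {
  awk 'NF == 2 && $2 != 0 { print; bad = 1 } END { exit bad }'
}

# Example use (commented out so this sketch has no side effects):
# btrfs device stats /mnt/vm | nonzero_device_stats || echo "vm pool: non-zero error counters above"
# btrfs device stats /mnt/wh | nonzero_device_stats || echo "wh pool: non-zero error counters above"
```

The -c flag used in the script already signals non-zero counters through its exit status; the filter here just makes the offending lines easy to spot in the report.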
computing/btrfsreminders.1770565624.txt.gz · Last modified: by oemb1905