-------------------------------------------
* **RAID**
* **Jonathan Haack**
* **Haack's Networking**
* **netcmnd@jonathanhaack.com**
-------------------------------------------
This tutorial grew out of wanting a server setup for self-hosters that can survive a remote reboot, encrypts the data on the server that matters most, and, most importantly, gives me a chance to replace a hard drive if one were to fail. For those reasons, I chose a RAID1 array as the wrapper and layered luks and pam_mount on top of it once the array and the debian OS were installed on my host. For short, I call this the 'RAID' tutorial, since that was the primary non-negotiable: all of this pivots around being able to replace failing hard drives first, and then building other 'backup' solutions on top of that later (see the [[https://wiki.haacksnetworking.com/doku.php?id=computing:rsyncrsnapshot|rsyncrsnapshot tutorial]]). For now, it is important to set up the RAID, luks, and pam_mount correctly, and here is what I got to work flawlessly. I used a debian stretch .iso that I wrote to a flash drive (full amd64DVD1.iso).
(Optional: I recommend using the full DVD.iso rather than the netinst.iso. I also do not use a network mirror and do not install any packages beyond the defaults, as I noticed this speeds up the install a lot. Once the system is bootable and up, you can install the packages and rebuild the OS as needed. Debian documents the repositories on the [[https://wiki.debian.org/SourcesList|Debian Repositories]] web page, so make sure that /etc/apt/sources.list is updated to match those examples. //Again, do this after the final RAID syncing below.// Also, when you save time like this, you need to install sudo yourself (yes, really). Why wait? The RAID1 array is syncing and slowing everything down a lot. Just wait until it finishes, reboot, and then update. And take as little time as possible during the install. That's the plan I use ... ok, moving on!)
-------------------------------------------
Hardware Required
* Matching 4TB drives (some prefer identical models, others prefer identical size and speed but different models)
* Host Machine (4GB RAM minimum, 16GB ideal)
Using the ncurses installer with manual partitioning, select each drive heading and press enter; for each one, the installer will prompt you to give the drive a new GPT partition table. After that, I partition as follows:
/dev/sda1 - BIOS
/dev/sda2 - 128GB (use as RAID)
/dev/sda3 - 16GB (use as RAID)
/dev/sda4 - 3.9TB (use as RAID)
The other drive was set up identically.
/dev/sdg1 - BIOS
/dev/sdg2 - 128GB (use as RAID)
/dev/sdg3 - 16GB (use as RAID)
/dev/sdg4 - 3.9TB (use as RAID)
From there, I use the configure software RAID option and create new md devices for sda2/sdg2, sda3/sdg3, and sda4/sdg4. Do not create a mirror between the BIOS partitions. If you attempt to, the installer will not let you use that md device as a BIOS partition on the following screen, and it takes a lot of time to back out, reformat everything, and build the array properly, so avoid it from the beginning. Once this is done, you will have:
md0 - 128GB - format it as xfs, mount point "/"
md1 - 16GB - format this as a swap partition
md2 - 3.9TB - format it as xfs, use as "/home"
Once that is done (still in the ncurses/text-based installer), you can continue with the installation and choose the options you desire. (Do not forget to install GRUB to both hard drives, each of which has the reserved BIOS partition that was made earlier.) Again, I do not add any additional software or utilize the mirror during install, in order to speed up the install process. Later, I will add all the packages I need and adjust sources.list, etc., but I do this ... __after__ the syncing completes. This tutorial will have you encrypt the md2 partition within the RAID1 array __after__ the OS installation completes, without using the installer for that piece. Once the OS installs, and at any time before or after the encryption, you can use the command below to monitor the 'syncing' of the RAID1 array:
cat /proc/mdstat
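If you would rather not eyeball that output, here is a minimal sketch that boils it down to one line per array. The function name raid_sync_report is made up for this example (it is not part of mdadm), and it assumes the usual /proc/mdstat layout, where a progress line with a percentage means the array is syncing and resync=PENDING means it is waiting to start:

```shell
# Hypothetical helper: summarize each md array as "idle", "pending" or "syncing".
# $1 = file to read (defaults to /proc/mdstat)
raid_sync_report() {
    awk '
        # A new array stanza starts: print the previous one, reset the state
        /^md[0-9]+ :/ { if (name != "") print name, state; name = $1; state = "idle" }
        # The array is waiting for its resync to be started
        /resync=PENDING/ { state = "pending" }
        # A progress line such as "resync =  2.1%" or "recovery = 12.6%"
        /(resync|recovery) *= *[0-9.]+%/ { state = "syncing" }
        END { if (name != "") print name, state }
    ' "${1:-/proc/mdstat}"
}
```

Run raid_sync_report with no argument on the server, or pass it a saved copy of /proc/mdstat as the first argument.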
But again ... __wait__ until you complete the steps below before you update or heavily customize this system. It needs to sync and it will be slow, and you still have not encrypted your /home partition. To encrypt /home with luks after the install, switch to tty1 when the system boots (ctrl-alt-f1, or whichever f-key gives you a text console) and log in as root at the tty1 shell. Once logged in, perform the following on md2 (the "/home" partition on the RAID mirror). I adapted the encryption instructions that [[https://jasonschaefer.com/encrypting-home-dir-decrypting-on-login-pam/|Jason Schaefer]] covered in his blog so that encrypted servers can survive remote reboots. Here is my simplified version of what he has written there:
apt-get install cryptsetup libpam-mount rsync
su - root
rsync -av /home /backup
umount /home/
cryptsetup luksFormat /dev/md2
cryptsetup luksOpen /dev/md2 home
mkfs.xfs -L home /dev/mapper/home   # xfs is optional, use what you want
mount /dev/mapper/home /home/
rsync -av /backup/home/ /home
nano /etc/fstab
Just add "#" before the line for "/home", since pam_mount will now handle that mount instead of fstab. Now, let's configure the pam_mount set up:
cp /etc/security/pam_mount.conf.xml /root/
nano /etc/security/pam_mount.conf.xml
The last command above opens the pam_mount configuration file in the nano text editor. Once this file is opened, locate the "" section, and immediately underneath, enter a configuration similar to the one I use:
If you forgot how to locate the UUID or ID, here are a few different ways:
ls -lah /dev/disk/by-uuid/
ls -lah /dev/disk/by-id/
blkid
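Once you have the UUID of /dev/md2, a volume entry can be put together along these lines. This is only a sketch under assumptions: the function name pam_mount_volume_line, the user name, and the all-zero UUID are placeholders, and it assumes the crypt is mounted at /home using pam_mount's fstype="crypt". Keep in mind that pam_mount hands the login password to the volume, so the luks passphrase has to match that user's password:

```shell
# Hypothetical helper: print a pam_mount <volume> element for a luks device.
# $1 = login user, $2 = UUID of the luks device (from blkid or /dev/disk/by-uuid)
pam_mount_volume_line() {
    printf '<volume user="%s" fstype="crypt" path="/dev/disk/by-uuid/%s" mountpoint="/home" />\n' "$1" "$2"
}

# Example with placeholder values; the output is what goes into pam_mount.conf.xml
pam_mount_volume_line user 00000000-0000-0000-0000-000000000000
```

Swap in your own user name and the UUID reported for /dev/md2 before pasting the line into the file.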
Anyway, once this is done, the crypt for home is set up. Now, let's encrypt swap. Again, we will not encrypt the file system root, because pam_mount allows an easy remote reboot as long as we only encrypt the home directory. Additionally, the web server root will also be located in /home/server/ in this tutorial, but that set up and configuration is beyond this tutorial's scope. I am in the process of adding the topic of virtual hosts outside of the /var/www configuration to my tutorial entitled "[[https://jonathanhaack.com/dokuwiki/doku.php?id=computing:apachesurvival|apachesurvival]]." Anyway, for swap, do the following:
swapoff -a
nano /etc/crypttab
Enter something like this in the crypttab file that just opened:
md1_crypt /dev/disk/by-id/md1byidcodejustdolslahondevdiskbyid /dev/urandom cipher=aes-xts-plain64,size=256,swap
Now, make sure to comment out the /etc/fstab entry for swap and replace it with something like this:
/dev/mapper/md1_crypt none swap sw 0 0
Once that configuration is entered, start the mapping (cryptdisks_start looks the name up in crypttab, so the entry has to exist first) and bring the encrypted swap back up:
cryptdisks_start md1_crypt
swapon -av
swapon -sv
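Before rebooting, it can be worth checking that the two files agree with each other. Here is a minimal sketch, assuming the mapping name md1_crypt used above; the function name check_crypt_swap is made up for this example, and it only reads the files:

```shell
# Hypothetical helper: verify that crypttab and fstab both describe the swap mapping.
# $1 = mapping name, $2 = crypttab path (optional), $3 = fstab path (optional)
check_crypt_swap() {
    name=$1
    crypttab=${2:-/etc/crypttab}
    fstab=${3:-/etc/fstab}

    # crypttab needs an uncommented line for the mapping whose options include "swap"
    if ! awk -v n="$name" '$1 == n && $4 ~ /(^|,)swap(,|$)/ { ok = 1 } END { exit !ok }' "$crypttab"; then
        echo "crypttab: no swap entry for $name" >&2
        return 1
    fi

    # fstab needs an uncommented swap line that points at /dev/mapper/<name>
    if ! awk -v d="/dev/mapper/$name" '$1 == d && $3 == "swap" { ok = 1 } END { exit !ok }' "$fstab"; then
        echo "fstab: no swap line for /dev/mapper/$name" >&2
        return 1
    fi
    return 0
}
```

On the server, check_crypt_swap md1_crypt && echo "swap config looks consistent" is enough; the extra arguments are only there so the check can be pointed at copies of the files.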
When you reboot, you may find that the swap array stopped syncing. If so, remember you can always check the syncing status of the RAID1 array or restart the syncing as follows:
cat /proc/mdstat               # check
mdadm --readwrite /dev/mdX     # restart (replace mdX with the array, e.g. md1 for swap)
Okay, the point of this is to combine RAID with pam_mount handling the mounting of the crypt, and we also need ssh pubkey authentication. But, without some tweaks, sshd would never be able to read the authorized public keys after a reboot, because they live in the crypt on the server. So, we need to move a copy of the public keys outside the crypt. Here is how:
cp -a ~/.ssh/authorized_keys /opt/authorized_keys
sudo nano /etc/ssh/sshd_config
Add the following parameter in the section that pertains to it:
AuthorizedKeysFile /opt/authorized_keys
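Since a mistake here can lock you out after the next reboot, a quick check of the file does not hurt. Here is a minimal sketch; the function names sshd_first_value and sshd_keys_outside_home are made up for this example, and it assumes a single sshd_config file with no Match blocks or includes (sshd uses the first value it finds for each keyword):

```shell
# Hypothetical helper: print the first uncommented value of an sshd_config keyword.
# $1 = keyword (case-insensitive), $2 = config file (defaults to /etc/ssh/sshd_config)
sshd_first_value() {
    awk -v k="$1" 'tolower($1) == tolower(k) { print $2; exit }' "${2:-/etc/ssh/sshd_config}"
}

# Hypothetical helper: succeed only if AuthorizedKeysFile is an absolute path outside /home.
# Relative paths and %h paths resolve inside the user's home, i.e. inside the crypt.
sshd_keys_outside_home() {
    keys=$(sshd_first_value AuthorizedKeysFile "$1")
    case $keys in
        /home/*) return 1 ;;
        /*)      return 0 ;;
        *)       return 1 ;;
    esac
}
```

For example, sshd_keys_outside_home && echo "keys are readable before the crypt is mounted". Running sshd -T as root prints the settings sshd actually ends up with, if you want a second opinion.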
Now, when you reboot, even if PasswordAuthentication and ChallengeResponseAuthentication are set to "no," as long as PubkeyAuthentication is set to "yes" you will be able to do the following to mount the crypt. The key gets you in over ssh, and the su step then asks for the user's password inside that ssh session; PAM hands that password to pam_mount, which opens the crypt. sshd_config should also have UsePAM yes.
ssh user@xx.xx.xx.xx
screen
su - user
Now, press ctrl-a, then d, to detach from the screen, so that the su session (and with it the mounted crypt) stays alive after you log out.
exit
Okay, now you have survived reboot with a RAID array, with pam doing its magic for you on the crypt ... hardly a pain if it saves you a trip.
-------------------------------------------
When a drive fails, issue the commands below to remove the drive and clone the partitioning system with sfdisk to the new hard drive (healthy drive first, new drive second):
(draft) sfdisk -d /dev/sdg | sfdisk /dev/sdz
(draft) cat /proc/mdstat
(draft) mdadm --readwrite /dev/mdX
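When a hard drive fails (as reported by SMART), follow these steps with the machine still on and the failed drive still inside. The device names below come from my host, so match the partition numbers to your own layout (with the partitioning above, md0, md1, and md2 sit on partitions 2, 3, and 4):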
mdadm --manage /dev/md0 --fail /dev/sdi1
mdadm --manage /dev/md1 --fail /dev/sdi2
mdadm --manage /dev/md2 --fail /dev/sdi3
mdadm --manage /dev/md0 --remove /dev/sdi1
mdadm --manage /dev/md1 --remove /dev/sdi2
mdadm --manage /dev/md2 --remove /dev/sdi3
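sudo poweroff
Swap the failed drive for the new one and boot back up. Then clone the partition table from the healthy drive (sdj here) to the new drive (sdi here), checking first that the device names did not change across the reboot:
sfdisk -d /dev/sdj | sfdisk /dev/sdi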
mdadm --manage /dev/md0 --add /dev/sdi1
mdadm --manage /dev/md1 --add /dev/sdi2
mdadm --manage /dev/md2 --add /dev/sdi3
dpkg-reconfigure -plow grub-pc
cat /proc/mdstat
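Because it is easy to mistype a partition number in the middle of a drive failure, here is a minimal sketch that only prints the sequence above for a given pair of drives, so you can read it over before running anything. The function name raid_replace_plan and the array:partition argument format are made up for this example, and it assumes the new drive shows up under the same name as the failed one:

```shell
# Hypothetical helper: print (not run) the mdadm/sfdisk steps for swapping a drive.
# $1 = failed drive (e.g. sdi), $2 = healthy drive (e.g. sdj),
# remaining args = array:partition-number pairs (e.g. md0:1 md1:2 md2:3)
raid_replace_plan() {
    failed=$1
    healthy=$2
    shift 2
    # Mark every member on the failed drive as faulty, then pull it from its array
    for action in fail remove; do
        for pair in "$@"; do
            echo "mdadm --manage /dev/${pair%%:*} --$action /dev/$failed${pair##*:}"
        done
    done
    echo "# power off, swap the drive, boot again"
    # Copy the partition table from the healthy drive to the new one
    echo "sfdisk -d /dev/$healthy | sfdisk /dev/$failed"
    # Add the new partitions back so the arrays start rebuilding
    for pair in "$@"; do
        echo "mdadm --manage /dev/${pair%%:*} --add /dev/$failed${pair##*:}"
    done
}
```

Calling raid_replace_plan sdi sdj md0:1 md1:2 md2:3 prints the same mdadm and sfdisk lines as above; nothing is executed until you paste the lines yourself.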
Let syncing finish, then reboot and run the following, with sdX being the new drive:
sudo grub-install /dev/sdX
sudo update-grub
sudo update-grub2
Next up, how to create a RAID1 array on a running host
-- -- -- -- --
This tutorial is a designated "Invariant Section" of the "Technotronic" section of Haack's Wiki as described on the [[https://jonathanhaack.com/dokuwiki/doku.php?id=start|Start Page]].
--- //[[netcmnd@jonathanhaack.com|oemb1905]] 2019/01/13 12:25//