Read-only root filesystem on a Debian Buster system

Wed 02 September 2020

First, a little background

I was hoping to set up a read-only SD card with a RAM-based filesystem on a Raspberry Pi. SD cards in a Pi are kind of notorious for getting corrupted and leaving the system unbootable after the power's been pulled a few times, and understandably so. I decided to play around on a Debian 10 VM first, to wrap my head around what was going on.

The boot process, or the parts we care about

Hopefully you saw my last post about adding scripts for use by initramfs-tools. That's the package Debian uses to manage initramfs scripts.

I'm sure I'm doing some amount of hand-waving here, but the boot process looks something like this:

1. Bootloader hands off control to kernel with an "initial RAM device" (initrd) set up.
2. Kernel starts and unpacks the initramfs image into its initial root filesystem, mounted at /.
3. Unless the command-line parameter rdinit= is passed to the kernel specifying the location of some other program, the kernel runs /init from the initramfs as PID 1. In our case, this is a shell script. (The init= parameter names the program to run from the real root filesystem later on.)
4. The initramfs-tools(7) man page describes the stages this script goes through.
5. We end with the real root filesystem mounted at /root.
6. The filesystem table in /etc/fstab is read and the entries are mounted. This usually includes an entry to (re)mount the rootfs as read/write.

Like I mentioned in the previous article, the advantage to this approach is that it's the Linux kernel, not the bootloader, mounting these filesystems, so they can be more complex filesystems and the bootloader doesn't have to duplicate functionality already present in the Linux kernel.

The goal

We would like to insert a program (script) between steps 5 and 6 described above. That program should turn our read-only root filesystem, mounted at /root, into an overlay-filesystem of the read-only root filesystem underneath a writable filesystem. This could be a RAM filesystem, if we don't want the files modified or created there to persist, or it could be a disk-backed filesystem (including a filesystem on the network).

The script

The first step is to copy the example script from the man page, and strip it to a bare minimum:

#!/bin/sh

PREREQ=""
prereqs()
{
    echo "$PREREQ"
}

case $1 in
prereqs)
    prereqs
    exit 0
    ;;
esac

. /scripts/functions

Now we can define some variables:

root_ro=/mnt/root-ro
root_rw=/mnt/root-rw
root_rw_upper=${root_rw}/upper
root_rw_work=${root_rw}/work

This reveals something I glossed over. There are typically three components to an overlay filesystem: the lower layer, often read-only; the upper layer, read/write; and the "work" layer, read/write. The upper and work layers are kind of a pair and don't make sense independently. This is further complicated by the fact that the modern kernel supports multiple lower layers and does not require the upper/work layer.

The "work" layer, which we'll call workdir moving forward, is not really explained in the kernel documentation for overlayfs--only the requirement for it is specified there. The important thing is that it has to be an empty directory on the same filesystem as the upper layer. We'll stick a pin in it and try to understand it later.
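To make the pieces concrete, here is a minimal sketch of how the layers end up in the option string of an overlay mount. The helper name overlay_opts is made up for illustration; it only builds the string and mounts nothing. When there are several lower layers they are joined with colons, with the topmost one listed first.

```shell
#!/bin/sh
# Hypothetical helper (not part of initramfs-tools): build the -o option
# string for an overlay mount.
# Usage: overlay_opts UPPER WORK LOWER [LOWER...]
overlay_opts() {
    upper=$1
    work=$2
    shift 2
    lower=$1
    shift
    # Any further lower layers are appended with ':' (leftmost is topmost)
    for dir in "$@"; do
        lower="${lower}:${dir}"
    done
    echo "lowerdir=${lower},upperdir=${upper},workdir=${work}"
}

overlay_opts /mnt/root-rw/upper /mnt/root-rw/work /mnt/root-ro
```

The result is what would be passed to mount -t overlay after -o.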

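So first we have to make the mount points (skipping any failure checking--do it in a real implementation!):

mkdir -p ${root_ro} ${root_rw}

Now make a RAM filesystem for the upper layer, and create the upper and work directories on it. They have to be created after the tmpfs is mounted; anything created under ${root_rw} beforehand would be hidden by the mount:

mount -t tmpfs tmpfs-root ${root_rw}
mkdir -p ${root_rw_upper} ${root_rw_work}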

Then we move the /root mount (initramfs-tools provides its path in ${rootmnt}) to /mnt/root-ro:

mount -o move ${rootmnt} ${root_ro}

Now mount the overlay filesystem:

mount -t overlay overlay -o lowerdir=${root_ro},upperdir=${root_rw_upper},workdir=${root_rw_work} ${rootmnt}

But we aren't done yet--we have to move /mnt/root-ro and /mnt/root-rw somewhere that will be visible when /root becomes /!

mkdir -p ${rootmnt}${root_ro} ${rootmnt}${root_rw}
mount -o move ${root_ro} ${rootmnt}${root_ro}
mount -o move ${root_rw} ${rootmnt}${root_rw}
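With the failure checking that was skipped above added back in, the whole sequence might look like the sketch below. The run helper and the DRY_RUN variable are hypothetical additions for illustration: they let the sequence be printed without touching any mounts. The fallback that moves the real root back when the overlay mount fails is also just one possible choice. Note that upper and work are created only once the tmpfs is mounted, since they have to live on it.

```shell
#!/bin/sh
# Sketch only: run and DRY_RUN are hypothetical, not part of initramfs-tools.
rootmnt=${rootmnt:-/root}   # set by initramfs-tools in the real environment
root_ro=/mnt/root-ro
root_rw=/mnt/root-rw

# Print the command when DRY_RUN=1, otherwise execute it.
run() {
    if [ "${DRY_RUN:-0}" = 1 ]; then
        echo "$*"
    else
        "$@"
    fi
}

setup_overlay_root() {
    run mkdir -p "$root_ro" "$root_rw" || return 1
    run mount -t tmpfs tmpfs-root "$root_rw" || return 1
    # upper and work must exist on the tmpfs, so create them after mounting it
    run mkdir -p "$root_rw/upper" "$root_rw/work" || return 1
    run mount -o move "$rootmnt" "$root_ro" || return 1
    if ! run mount -t overlay overlay \
        -o "lowerdir=$root_ro,upperdir=$root_rw/upper,workdir=$root_rw/work" \
        "$rootmnt"; then
        # Put the real root back so the boot can continue without the overlay
        run mount -o move "$root_ro" "$rootmnt"
        return 1
    fi
    # Make both layers visible inside the new root
    run mkdir -p "$rootmnt$root_ro" "$rootmnt$root_rw" || return 1
    run mount -o move "$root_ro" "$rootmnt$root_ro" || return 1
    run mount -o move "$root_rw" "$rootmnt$root_rw" || return 1
}

DRY_RUN=1
setup_overlay_root
```

With DRY_RUN=1 this only prints the mkdir and mount commands in order, which is a handy way to read the sequence before trusting it with a boot.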

Loading the overlay module

We can't forget that the overlay kernel module has to be loaded for this to work! If it isn't compiled into the kernel (CONFIG_OVERLAY_FS=y means it's built in, CONFIG_OVERLAY_FS=m means it's a loadable module), we have to add the line overlay to /etc/initramfs-tools/modules.
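A rough sketch of that check and edit follows. Both function names are made up for illustration, and I'm assuming the kernel config is installed as /boot/config-$(uname -r), which is where Debian's kernel packages normally put it. The demo at the end works on a scratch file rather than the real modules file.

```shell
#!/bin/sh
# Hypothetical helper: report how overlayfs is configured in a kernel config file.
overlay_config() {
    case $(grep '^CONFIG_OVERLAY_FS=' "$1" 2>/dev/null) in
        CONFIG_OVERLAY_FS=y) echo builtin ;;
        CONFIG_OVERLAY_FS=m) echo module ;;
        *) echo missing ;;
    esac
}

# Hypothetical helper: append MODULE to FILE unless that exact line is already there.
ensure_module_line() {
    grep -qxF "$2" "$1" 2>/dev/null || echo "$2" >> "$1"
}

# Demo on a scratch file; calling it twice still leaves a single line.
demo=$(mktemp)
ensure_module_line "$demo" overlay
ensure_module_line "$demo" overlay
cat "$demo"
rm -f "$demo"

# On the real system, as root (assumed paths):
#   [ "$(overlay_config "/boot/config-$(uname -r)")" = module ] &&
#       ensure_module_line /etc/initramfs-tools/modules overlay
```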

Rebuilding the initramfs

After these changes, we have to rebuild the initramfs by running, as root:

update-initramfs -u
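It can be worth checking that the module actually made it into the rebuilt image. lsinitramfs, which ships with initramfs-tools, lists an image's contents; the small filter below is a hypothetical helper that looks for the overlay module in such a listing. The image path in the comment assumes Debian's usual /boot/initrd.img-$(uname -r) naming, and the path fed to it in the last line is a made-up sample.

```shell
#!/bin/sh
# Hypothetical helper: succeed if a file listing on stdin includes the overlay module.
has_overlay_module() {
    grep -q '/overlay\.ko'
}

# On the real system (assumed image path):
#   lsinitramfs "/boot/initrd.img-$(uname -r)" | has_overlay_module &&
#       echo "overlay module is in the initramfs"

# Sample listing line, for illustration only
printf '%s\n' 'usr/lib/modules/4.19.0-10-amd64/kernel/fs/overlayfs/overlay.ko' |
    has_overlay_module && echo "found"
```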

Booting with overlayfs

Now we should be in good shape to boot the system with an overlaid root filesystem. We can test it by creating a file in /, then rebooting: if the overlay is working, the file is gone after the reboot, because it only ever lived in the tmpfs upper layer. (Normally I would have tested by making a file in the home directory, but that was actually a separate filesystem on the VM I had lying around--I would have to merge it into the root filesystem or make a separate overlayfs for it!)
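A quick way to confirm that / really is an overlay after booting is to look at the mount table. The helper below is a hypothetical sketch that prints the filesystem type of the last entry mounted at / in a mounts file, defaulting to /proc/mounts.

```shell
#!/bin/sh
# Hypothetical helper: print the filesystem type of the last mount on /
# found in a mounts table (default: /proc/mounts).
root_fstype() {
    awk '$2 == "/" { fstype = $3 } END { print fstype }' "${1:-/proc/mounts}"
}

root_fstype
```

On a system booted with the script in place this should print overlay; on a normally booted system it prints whatever the root filesystem is, such as ext4.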

Editing the underlying filesystem

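The read-only part can be remounted as read/write. Be careful, though: the kernel documentation for overlayfs says that changing an underlying filesystem while it is part of a mounted overlay leaves the overlay's behavior undefined, so it's safest to reboot after making changes this way.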

mount -o remount,rw /mnt/root-ro

Future questions

What is the work layer actually used for?
After the mountpoints have been created and the underlying filesystems set up, can the overlayfs be mounted from /etc/fstab instead of the script running in the initramfs context?