Chapter 7. Managing file systems in image mode for RHEL
Currently, image mode for RHEL uses OSTree as a backend, and enables composefs for storage by default. The /opt and /usr/local paths are plain directories, and not symbolic links into /var. This enables you to easily install third-party content in derived container images that write into /opt, for example.
7.1. Physical and logical root with /sysroot
When a system is fully booted, it is similar to chroot, that is, the operating system changes the apparent root directory for the current running process and its children. The physical host root filesystem is mounted at /sysroot. The chroot filesystem is called a deployment root.
The remaining filesystem paths are part of a deployment root, which is used as the final target for the system boot. The system uses the ostree= kernel argument to find the deployment root.
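To make the lookup concrete, the following is a minimal sketch of extracting the ostree= value from a kernel command line. The function name ostree_karg and the sample command line are assumptions made for illustration; on a booted system the real command line is available in /proc/cmdline.

```shell
#!/bin/sh
# ostree_karg CMDLINE
# Print the value of the first ostree= argument found in CMDLINE.
# The function name is hypothetical, chosen for this sketch.
ostree_karg() {
    # Put each argument on its own line, then keep the ostree= value.
    printf '%s\n' "$1" | tr -s ' \t' '\n' | sed -n 's/^ostree=//p' | head -n 1
}

# Made-up command line for illustration, not output from a real system.
sample='BOOT_IMAGE=/vmlinuz root=UUID=0000 rw ostree=/ostree/boot.1/default/abc123/0'
ostree_karg "$sample"
```

On a running system, a call such as `ostree_karg "$(cat /proc/cmdline)"` would print the path that identifies the booted deployment, if the argument is present.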
/usr
This filesystem keeps all operating system content in /usr, with directories such as /bin working as symbolic links to /usr/bin.
With composefs enabled, /usr is not different from /. Both directories are part of the same immutable image, so you do not need to perform a full UsrMove with a bootc system.
/usr/local
The base image is configured with /usr/local as the default directory.
/etc
The /etc directory contains mutable persistent state by default, but it supports enabling the etc.transient config option. When the directory is in mutable persistent state, it performs a 3-way merge across upgrades:
- Uses the new default /etc as a base
- Applies the diff between current and previous /etc to the new /etc directory
- Retains locally modified files that are different from the default /usr/etc of the same deployment in /etc.
The ostree-finalize-staged.service executes these tasks during shutdown time, before creating the new boot loader entry.
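The 3-way merge described above can be sketched on ordinary directories. This is a simplified illustration and not the OSTree implementation: the function name merge_etc and its four directory arguments are hypothetical, only flat regular files are handled, and it assumes that a file deleted locally stays deleted after the merge.

```shell
#!/bin/sh
# merge_etc OLD_DEFAULT CURRENT NEW_DEFAULT RESULT
# Hypothetical helper: all four arguments are directories of flat files.
merge_etc() {
    old=$1 cur=$2 new=$3 out=$4
    mkdir -p "$out"

    # 1. Use the new default configuration as the base.
    for f in "$new"/*; do
        [ -f "$f" ] && cp "$f" "$out/"
    done

    # 2. Re-apply local changes: files that were added, or that differ
    #    from the old default, override the new default.
    for f in "$cur"/*; do
        [ -f "$f" ] || continue
        name=${f##*/}
        if [ ! -f "$old/$name" ] || ! cmp -s "$f" "$old/$name"; then
            cp "$f" "$out/$name"
        fi
    done

    # 3. Assumption of this sketch: files removed locally stay removed.
    for f in "$old"/*; do
        [ -f "$f" ] || continue
        name=${f##*/}
        [ -f "$cur/$name" ] || rm -f "$out/$name"
    done
}
```

With this sketch, an unmodified file picks up the content from the new default, while a file that was edited or created locally keeps its local content.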
This happens because many components of a Linux system ship default configuration files in the /etc directory. Even if the default package does not ship it, by default the software only checks for config files in /etc. In image-based update systems other than bootc that have no distinct versions of /etc, the directory is populated only at installation time and is not changed at any point after installation. As a result, the /etc system state is influenced by the initial image version, which can make it difficult to apply a change, for example to /etc/sudoers.conf, without external intervention. For more details about file configuration, see Building and testing RHEL bootc images.
/var
The content in the /var directory is persistent by default. You can also make /var or its subdirectories mount points, whether network or tmpfs.
There is just one /var directory. If it is not a distinct partition, then physically the /var directory is a bind mount into /ostree/deploy/$stateroot/var and is shared across the deployments of the available boot loader entries.
By default, the content in /var acts as a volume, that is, the content from the container image is copied during the initial installation time, and is not updated thereafter.
The /var and the /etc directories are different. You can use /etc for relatively small configuration files, and the expected configuration files are often bound to the operating system binaries in /usr. The /var directory holds arbitrarily large data, for example, system logs and databases, and by default is not rolled back if the operating system state is rolled back.
For example, making an update such as dnf downgrade postgresql should not affect the physical database in /var/lib/postgres. Similarly, running bootc update or bootc rollback does not affect this application data.
Having /var separate also makes it work cleanly to stage new operating system updates before applying them, that is, updates are downloaded and ready, but only take effect on reboot. The same applies to a Docker volume, because it decouples the application code from its data.
If you want applications to have a pre-created directory structure, for example, /var/lib/postgresql, use systemd tmpfiles.d for this. You can also use StateDirectory=<directory> in units.
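As a rough sketch of the tmpfiles.d approach, the following shell function writes a tmpfiles.d entry that asks systemd to create a state directory under /var at boot. The function name write_tmpfiles, the exampleapp path, user, and group, and the 0750 mode are all assumptions chosen for illustration, not values taken from the product.

```shell
#!/bin/sh
# write_tmpfiles ROOT
# Hypothetical helper: writes a tmpfiles.d entry under ROOT/usr/lib/tmpfiles.d.
# "exampleapp" (path, user, and group) is a placeholder name.
write_tmpfiles() {
    root=${1:-}
    mkdir -p "$root/usr/lib/tmpfiles.d"
    # Fields: type (d = create directory), path, mode, user, group,
    # age ("-" means the content is never cleaned up automatically).
    printf 'd /var/lib/exampleapp 0750 exampleapp exampleapp -\n' \
        > "$root/usr/lib/tmpfiles.d/exampleapp.conf"
}
```

Passing a staging directory as ROOT lets you inspect the generated file before copying it into an image. Passing an empty ROOT during a container build writes the entry directly under /usr/lib/tmpfiles.d.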
Other directories
Shipping content in /run, /proc, or other API filesystems in container images is not supported. Apart from that, other top-level directories, such as /usr and /opt, are lifecycled with the container image.
/opt
With bootc using composefs, the /opt directory is read-only, alongside other top-level directories such as /usr.
When software needs to write to its own directory in /opt/exampleapp, a common pattern is to use a symbolic link to redirect to, for example, /var for operations such as log files:
RUN rmdir /opt/exampleapp/logs && ln -sr /var/log/exampleapp /opt/exampleapp/logs
Optionally, you can configure the systemd unit that launches the service to do these mounts dynamically. For example:
BindPaths=/var/log/exampleapp:/opt/exampleapp/logs
Enabling transient root
To enable a fully transient writable rootfs by default, set the following option in prepare-root.conf.
[root]
transient = true
This enables software to write transiently to /opt, with symbolic links to /var for content that must persist.
7.2. Version selection and bootup
Image mode for RHEL uses GRUB by default, with the exception of s390x architectures. Each version of image mode for RHEL currently available on a system has a menu entry.
The menu entry references an OSTree deployment, which consists of a Linux kernel, an initramfs, and a hash linking to an OSTree commit that you can pass by using the ostree= kernel argument.
During bootup, OSTree reads the kernel argument to determine which deployment to use as the root filesystem. Each update or change to the system, such as a package installation or the addition of kernel arguments, creates a new deployment.
This enables rolling back to a previous deployment if the update causes problems.