OVERHEAD

Building USIs with mkosi


…The hell’s a “USI”?

Let’s start from the very beginning…

  • Initramfs - A CPIO archive file containing the first root filesystem passed to the kernel. Its main purpose is to find, mount, and otherwise set up the “real” system before switch-root’ing into it, effectively replacing the initramfs with the system being switched into.
  • UKI - A single PE file that contains, but is not limited to:
    • An initramfs: See previous list item.
    • The kernel: As opposed to being a separate file in the ESP.
    • A “stub” (aka. systemd-stub): The executable run when the PE file is executed, whose purpose is to initialise, check, and “measure” (calculating hashes for TPM-related tasks) the boot-up environment.
    • And other content related to the boot process (e.g. CPU microcode, a boot screen image, etc).
  • USI - An otherwise normal UKI file, whose defining trait is that it does not switch to a real system, and is instead intended to be used as is.

In other words, a USI is an entire OS in a single UEFI executable file. Cool!

This one section had to define eight acronyms (skipping “CPU” & “OS” because of how common they are) just for you to maybe understand what I wrote. Less cool.

And mkosi?

mkosi, an acronym for “Make Operating System Image”; the 9th one and counting…

😑️

…is an incredibly useful tool for creating OS images in multiple formats, including USIs! Just give it some configs and some scripts, and it’ll do most of the tedious work for you.

Obviously, I can’t know exactly what you want to build, but as a starting point, here’s a rundown of how to build a USI, and some space-saving tips to make sure the resulting USI doesn’t completely fill your ESP.

Building a USI

Let’s start with a config:

An example of a USI mkosi config.

[Distribution]
Distribution=arch

[Content]
Keymap=uk
MakeInitrd=no
Bootloader=none		# USIs don't need bootloaders installed into them.
KernelCommandLine=rw
Timezone=Europe/London
Locale=en_GB.UTF-8
RootPassword=root	# Or, use `Autologin=yes` instead.

Packages=
	base
	less		# For nicer scrolling output for commands.
	linux
	linux-firmware	# Otherwise, the USI won't boot on most hardware.
	systemd

[Output]
Format=uki
CompressOutput=zst
ImageId=example-usi

The key config options for generating a USI are:

  • Format=uki: Generate a UKI as the output.
  • MakeInitrd=no: Don’t create the /etc/initrd-release symlink to /usr/lib/os-release, but instead, create the /etc/os-release symlink as if building a non-initramfs formatted image.

The rest of the config options can be changed to suit your needs. In my case, I’m using Arch Linux due to my familiarity with its tooling; however, you may find that other distributions are better suited to USIs, or that they better fit your own knowledge.

Optimising the output size

Hah, so uh, you know how I haven’t been writing as much lately… yeah, here’s why.

All output size reductions, either as a note at the start of each section or in the form of # Trims ~... comments, are based on the size differences of the compressed USI file before & after a change. Therefore, they will be inaccurate for you, and they should only be used for rough relative comparisons.
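
As a rough sketch of how such a comparison could be made, the function below prints the size difference in KiB between two builds of the same USI. The size_delta_kib name is made up for this example, and it assumes you kept a copy of the previous build's output.

```sh
#!/bin/sh

# Print how many KiB smaller the file "$2" is compared to the file "$1".
# The result is truncated to a whole number; a negative number means
# that "$2" is the larger file.
size_delta_kib() {
	before="$(wc -c < "$1")"
	after="$(wc -c < "$2")"
	echo $(( (before - after) / 1024 ))
}
```

For example, size_delta_kib old.efi.zst new.efi.zst, with both file names being placeholders for your own output files.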

Kernel modules

Estimated output size reduction: Varies.

By default, all kernel modules are included, which adds a huge amount of content. Conveniently, multiple config options exist for changing this behaviour to your needs.

Before mkosi version 20, you would have to specify KernelModulesExclude=.* to exclude everything, and then manually construct a list of kernel modules under KernelModulesInclude=. As of version 20, however, there’s the KernelModulesIncludeHost= config option which, when set to true, includes all the kernel modules your host OS currently has loaded, saving you from manually specifying most of the modules you’d want.

An mkosi config snippet that only includes kernel modules loaded on the host OS.

[Content]
KernelModulesExclude=.*
KernelModulesIncludeHost=yes

The KernelModules* config options also remove unnecessary files from under /usr/lib/firmware/, except for /usr/lib/firmware/nvidia/ which needs to be removed manually for some reason.

RemoveFiles=, and splitting the gcc-libs package

Estimated output size reduction: ~14MiB.

The RemoveFiles= config option allows you to specify what files and directories should be removed from the output, with this step happening just before the mkosi.finalize script is executed. Using the mkosi-initrd configs for inspiration, we can remove unnecessary library files that come from the gcc-libs Arch Linux package.

See the mkosi-initrd directory under the mkosi repo for various RemoveFiles= lines to include.

An mkosi config snippet that removes unnecessary library files owned by the gcc-libs package.

RemoveFiles=
	# See the `mkosi-initrd` directory in the `mkosi` repo. Trims ~12.3MiB.
	/usr/lib/libgfortran.so*	# Trims ~935KiB.
	/usr/lib/libgomp.so*		# Trims ~0KiB.
	/usr/lib/libgo.so*		# Trims ~7.8MiB.
	/usr/lib/libgphobos.so*		# Trims ~1.9MiB.
	/usr/lib/libobjc.so*		# Trims ~83KiB.
	/usr/lib/libasan.so*		# Trims ~467KiB.
	/usr/lib/libtsan.so*		# Trims ~938KiB.
	/usr/lib/liblsan.so*		# Trims ~220KiB.
	/usr/lib/libubsan.so*		# Trims ~268KiB.
	/usr/lib/libstdc++.so*		# Trims ~978KiB.
	/usr/lib/libgdruntime.so*	# Trims ~0KiB.

This step is unnecessary for Fedora USIs, since they already split their gcc-libs package into multiple packages.

Although these libraries are “unnecessary” for successfully running the USI, some non-critical commands like pkgdata or xgettext (both missing the libstdc++.so* files) will cease to work. Thankfully, instead of leaving it to you to find out what commands are broken by the missing libraries, I’ve done that work for you…

An mkosi config snippet that removes binaries that are missing their required libraries.

RemoveFiles=
	# Binaries with missing libraries. Trims ~1.9MiB.
	/usr/bin/arpd		# Trims ~8KiB. Missing `libdb-5.3.so`.
	/usr/bin/derb		# Trims ~16KiB. Missing `libstdc++.so.6`.
	/usr/bin/dwp		# Trims ~198KiB. Missing `libstdc++.so.6`.
	/usr/bin/escapesrc	# Trims ~23KiB. Missing `libstdc++.so.6`.
	/usr/bin/genbrk		# Trims ~17KiB. Missing `libstdc++.so.6`.
	/usr/bin/genccode	# Trims ~13KiB. Missing `libstdc++.so.6`.
	/usr/bin/gencfu		# Trims ~5KiB. Missing `libstdc++.so.6`.
	/usr/bin/gencmn		# Trims ~6KiB. Missing `libstdc++.so.6`.
	/usr/bin/gencnval	# Trims ~28KiB. Missing `libstdc++.so.6`.
	/usr/bin/gendict	# Trims ~0KiB. Missing `libstdc++.so.6`.
	/usr/bin/gennorm2	# Trims ~29KiB. Missing `libstdc++.so.6`.
	/usr/bin/genrb		# Trims ~74KiB. Missing `libstdc++.so.6`.
	/usr/bin/gensprep	# Trims ~0KiB. Missing `libstdc++.so.6`.
	/usr/bin/gp-*		# Trims ~326KiB. Missing `libstdc++.so.6`.
	/usr/bin/gr2fonttest	# Trims ~1KiB. Missing `libstdc++.so.6`.
	/usr/bin/icu-config	# Trims ~33KiB. Probably useless without the other `icu*` commands.
	/usr/bin/icuexportdata	# Trims ~1KiB. Missing `libstdc++.so.6`.
	/usr/bin/icuinfo	# Trims ~0KiB. Missing `libstdc++.so.6`.
	/usr/bin/icupkg		# Trims ~18KiB. Missing `libstdc++.so.6`.
	/usr/bin/ld.gold	# Trims ~897KiB. Missing `libstdc++.so.6`.
	/usr/bin/makeconv	# Trims ~34KiB. Missing `libstdc++.so.6`.
	/usr/bin/memusagestat	# Trims ~5KiB. Missing `libgd.so.3`.
	/usr/bin/memusage	# Trims ~0KiB. Missing `memusagestat`.
	/usr/lib/libmemusage.so	# Trims ~40KiB. Unused without binaries(?).
	/usr/bin/msg*		# Trims ~120KiB. Missing `lib{stdc++.so.6,gomp.so.1}`.
	/usr/bin/pinentry-*	# Trims ~0KiB. Missing `lib{gcr-base-3.so.1,gtk-x11-2.0.so.0,KF5WaylandClient.so.5}`.
	/usr/bin/pkgdata	# Trims ~15KiB. Missing `libstdc++.so.6`.
	/usr/bin/pzstd		# Trims ~0KiB. Missing `libstdc++.so.6`.
	/usr/bin/uconv		# Trims ~19KiB. Missing `libstdc++.so.6`.
	/usr/bin/xgettext	# Trims ~171KiB. Missing `libstdc++.so.6`.
	/usr/bin/xmlcatalog	# Trims ~0KiB. Missing `libstdc++.so.6`.
	/usr/bin/xmllint	# Trims ~32KiB. Missing `libstdc++.so.6`.

If you want to find this list of binaries yourself, then:

  1. Create a script which runs every command under /usr/bin/ with the --help argument, and then grep the standard error output for the string loading shared libraries, which indicates the binary failed due to a missing library.
    • Do not run this on a real system since some commands, regardless of argument, will attempt to do real changes.
  2. Check all shell scripts under /usr/bin/ for any commands the first step revealed are broken (e.g. memusage is a Bash script which executes memusagestat).
  3. (optional) Then, if you hate having free time, change the script to print what exact libraries are missing for each binary, so you can write the line comments as shown above.
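
A minimal sketch of the first step could look like the following. The find_broken_bins and run_limited names are made up for this example, and the timeout command (used only if it's available) is an extra safety measure not mentioned above. The same warning about not running this on a real system applies.

```sh
#!/bin/sh

# Run a command with a 5 second limit if `timeout` exists, so a command
# that never exits cannot stall the loop; otherwise, run it unrestricted.
run_limited() {
	if command -v timeout > /dev/null 2>&1; then
		timeout 5 "$@"
	else
		"$@"
	fi
}

# Print every executable file directly under the directory "$1" whose
# `--help` run writes the dynamic loader's "loading shared libraries"
# error to standard error.
find_broken_bins() {
	for bin in "$1"/*; do
		[ -f "$bin" ] && [ -x "$bin" ] || continue
		# Send standard error down the pipe and discard standard output.
		if run_limited "$bin" --help < /dev/null 2>&1 > /dev/null |
			grep -q "loading shared libraries"; then
			printf '%s\n' "$bin"
		fi
	done
}
```

Running find_broken_bins /usr/bin inside the image (e.g. from a chrooted script) should then print one broken binary per line, ready for step two.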

You can also remove most arbitrary binaries you don’t think users will use, the binaries provided by the krb5 package being a good example (that package’s library files are required, so you can’t not install it). And additionally, you can remove /usr/lib/modules/*/vmlinuz since the kernel is already embedded in the USI file (though only in mkosi version 20 or later, otherwise the build will fail).

pacman specific tweaks

Estimated output size reduction: Varies.

The databases under /var/lib/pacman/sync/ can easily be restored by running pacman -Syu. However, removing everything under /var/lib/pacman/local/ would break pacman --query, which is generally useful for analysing dependencies, checking package versions, etc. So, if you want to keep that functionality while still trimming out most of the bytes, you can remove just the files and mtree files from every package’s directory.

An mkosi config snippet that removes /var/lib/pacman/ files, without breaking the pacman --query command.

RemoveFiles=
	/var/lib/pacman/local/*/files		# Trims ~261KiB.
	/var/lib/pacman/local/*/mtree		# Trims ~2.2MiB.
	/var/lib/pacman/sync/*			# Trims ~8.5MiB.

The above # Trims ~... comment values are based on my own USI package selection; expect vastly different numbers for your own setups.

Stripping all binaries

Estimated output size reduction: ~16MiB.

The strip command is provided by the binutils Arch Linux package.

Stripping binaries as described here may lead to corruption according to some sources, and should be used with at least some level of caution. Do not use it as a general space saving tool on your host OS.

That being said, I’ve not seen any USIs become broken after this step, and for the amount it trims from the output, it’s honestly kinda worth it. But again, USIs grant you the safety net of being an in-memory operating system, while regular OSs do not.

Instead of adding the strip command to the USI, we can choose to make the mkosi.finalize script not chroot into the USI and instead use the command from the host. To make an mkosi script run on the host, ensure that:

  1. The script does not have the .chroot file extension (e.g. mkosi.finalize.chroot).
  2. The script does not use the mkosi-chroot command, whether as a prefix to the command or function in question, or in a line such as:
    if [ "$container" != "mkosi" ]; then exec mkosi-chroot "$CHROOT_SCRIPT" "$@"; fi

A Bash mkosi.finalize script which runs the strip --strip-unneeded command on all eligible files.

#!/bin/bash
## Bash is required: the script relies on `{bin,lib}` brace expansion and
## `export -f`, neither of which is available in a POSIX-only `sh`.

strip_bin() {
	## Remember, the `strip` command is from the host OS.
	if [ "$(file --brief --mime-encoding "$1")" = "binary" ]; then
		strip --strip-unneeded "$1" > /dev/null 2>&1
	fi
}

strip_all_bins() {
	# Strip all binaries under `/usr/{bin,lib}` (which is all of them).
	## Trims ~16.2MiB.
	printf "Stripping all binaries...\n"
	find "$BUILDROOT"/usr/{bin,lib}/ -xdev -type f \
		-not \( -path "*lib/firmware*" -prune \) \
		-not \( -path "*lib/modules*" -prune \) \
		! -empty \
		! -iname "*.sh" \
		! -iname "*.csh" \
		! -iname "*.js" \
		! -iname "*.css" \
		! -iname "*.xml" \
		! -iname "*.html" \
		! -iname "*.txt" \
		! -iname "*.conf" \
		! -iname "*.json" \
		! -iname "*.path" \
		! -iname "*.link" \
		! -iname "*.mount" \
		! -iname "*.timer" \
		! -iname "*.slice" \
		! -iname "*.socket" \
		! -iname "*.target" \
		! -iname "*.preset" \
		! -iname "*.service" \
		! -iname "*.network" \
		! -iname "*.example" \
		! -iname "*.tar" \
		! -iname "*.gz" \
		! -iname "*.xz" \
		! -iname "*.bz2" \
		! -iname "*.zst" \
		! -iname "*.gpg" \
		-print0 |
		xargs -0 -P 16 -I % bash -c 'strip_bin "$@"' _ %
}

## The Bash processes spawned by `xargs` in the `strip_all_bins` function
## can only see the `strip_bin` function if it has been exported.
export -f strip_bin

## Run the `strip_all_bins` function in the background to save blocking other commands.
strip_all_bins &

wait

The long list of find options ensures the strip_all_bins function skips nearly all invalid files I know of, with the xargs -P 16 command running 16 strip_bin functions in parallel for even faster execution.

The value of the $BUILDROOT variable is the path to the output’s root (e.g. /home/user/.cache/mkosi-workspacey7tuhq4o/root). Don’t panic if you forget this variable, though: mkosi only has a partial & mostly read-only view of the host, so it shouldn’t lead to you accidentally stripping your host’s binaries instead of the output’s.
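
If you'd rather not rely on that, a small guard can make the script fail early instead. This is only a sketch, and require_buildroot is a hypothetical helper name rather than anything provided by mkosi.

```sh
#!/bin/sh

# Return non-zero (after printing an error) if $BUILDROOT is unset, empty,
# or does not point at an existing directory.
require_buildroot() {
	if [ -z "${BUILDROOT:-}" ] || [ ! -d "$BUILDROOT" ]; then
		echo "BUILDROOT is unset or not a directory; refusing to continue." >&2
		return 1
	fi
}
```

Calling require_buildroot || exit 1 before strip_all_bins would stop the script before find is ever executed.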

As always though: Make. Frequent. Backups. (that applies outside of mkosi too)

XKB

Estimated output size reduction: ~500KiB.

Remember the XKB blog post? Well, you should be able to use the script from that blog post to remove unnecessary files under /usr/share/X11/xkb/ yourself, but, what about /usr/share/X11/locale/?

In a mkosi.prepare script, unconditionally keep /usr/share/X11/locale/{compose.dir,en_US.UTF-8/}, and also keep your locale’s directory under /usr/share/X11/locale/ if it exists. Functionally (unless you’re using dead keys), this is just to stop XKB warning messages; you can remove the locale/ directory as a whole without major consequences.

/usr/share/locale/ & locale-gen

Estimated output size reduction: ~22MiB.

Fun fact: Did you know the /usr/share/locale/ directory is completely unused after you run the locale-gen command?

Yep, locale-gen first reads the contents of the /usr/share/locale/ directory, but then, it generates the /usr/lib/locale/locale-archive archive file, after which /usr/share/locale/ can be entirely removed. Once you’ve done that though, the locale-gen & localedef commands won’t function anymore, so feel free to remove the commands too.

/usr/share/kbd/

Estimated output size reduction: ~1.6MiB.

Assuming your kernel was compiled with built-in console fonts, which you can check by running zgrep "FONT_.*x.*=y" /proc/config.gz and seeing if anything’s listed, you can remove the font-related directories from under /usr/share/kbd/ (after which, the setfont command won’t be useful, much like locale-gen).

Otherwise, you’ll need to manually filter out what files you need.

Perhaps using the XKB method involving the inotifywait command, with setfont used instead of xkbcli, could work here? Untested.

Non-XKB keymaps

Unfortunately, VTs don’t understand XKB layouts, and can only parse the keymaps as found under /usr/share/kbd/keymaps/. Thankfully, the ckbcomp command exists for compiling an XKB layout into a keymap, enabling you to remove everything else under /usr/share/kbd/keymaps/.

e.g. ckbcomp -layout gb > /usr/share/kbd/keymaps/xkb.kmap

You can then set Keymap=xkb in the mkosi config to use this keymap file. Additionally, compressed .kmap files are still readable, so compressing the file is a good idea since keymaps are highly compressible (e.g. the zstd --long --ultra -22 command shrinks the gb XKB layout keymap from 120KiB to 3KiB).

The catch? Arch Linux doesn’t officially package the ckbcomp tool, so you’ll need to either:

  1. Install ckbcomp from the AUR onto your host, and then run an mkosi script on the host as shown in the ‘Stripping all binaries’ section.
  2. Download the latest ckbcomp release, and place the tool under mkosi.skeleton so it’s available to the mkosi.prepare & mkosi.build scripts (remember to remove it with RemoveFiles= afterwards).
    • Bear in mind that ckbcomp is a Perl script, so you’ll need to have perl installed to execute it. I personally handle this by setting BuildPackages=perl in the mkosi config, and then running ckbcomp -layout gb > "$DESTDIR"/usr/share/kbd/keymaps/xkb.kmap in a chrooted mkosi.build script so perl isn’t installed in the output.
  3. Compile ckbcomp in a mkosi.build script, and run ckbcomp as described in the list item of option two.

xkbcli compile-keymap doesn’t touch all the files ckbcomp expects to exist, so do the XKB trimming after ckbcomp is run, or alternatively, store the trimmed version at /usr/share/X11.tmp/ temporarily and replace /usr/share/X11/ after the mkosi.build script is run.

Terminfo

Estimated output size reduction: ~700KiB.

The /usr/share/terminfo/ files are responsible for informing tools what a terminal is capable of (e.g. the supported character set), which means keeping the files for terminals you’ll never use isn’t necessary. As a good example of one to keep, the file for VTs is located at /usr/share/terminfo/l/linux.
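
As a sketch of how that trimming could be scripted, the function below deletes every terminfo entry except the ones you name. trim_terminfo is a made-up name, and which entries to keep (beyond linux for VTs) is an assumption you'll need to check against the terminals you actually use.

```sh
#!/bin/sh

# Delete every terminfo entry under the directory "$1" except the entries
# named in the remaining arguments, then remove any directories left empty.
trim_terminfo() {
	terminfo_dir="$1"; shift
	# Entries can be regular files or symlinks, depending on the build.
	find "$terminfo_dir" \( -type f -o -type l \) | while IFS= read -r entry; do
		keep=no
		for name in "$@"; do
			[ "$(basename "$entry")" = "$name" ] && keep=yes
		done
		[ "$keep" = yes ] || rm -f -- "$entry"
	done
	find "$terminfo_dir" -depth -type d -empty -exec rmdir {} \;
}
```

In a host-run mkosi.finalize script, this could be called as trim_terminfo "$BUILDROOT"/usr/share/terminfo linux.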

Remove unused timezones

Estimated output size reduction: ~300KiB.

If you don’t change timezones very often, you can consider removing unused timezones from under /usr/share/zoneinfo/, as well as removing /usr/share/zoneinfo-leaps/ (/usr/share/zoneinfo-posix/ is just a symlink to zoneinfo/).

upx

Estimated output size reduction: Varies.

Similar to the strip command, use the upx command with caution due to possible corruption issues.

upx is a tool for creating self-decompressing binaries, significantly decreasing the size of existing binaries, but, with the major downside that the binaries are fully decompressed into memory for each execution of that binary.

Admittedly, I don’t use this tool anywhere, except for /usr/bin/Xwayland as cage currently requires it to exist (which will be fixed in the next update), so feel free to use this tool as you see fit.

And much more…

For comparison, my mkosi.conf.d/remove.conf config, containing only the RemoveFiles config option, is 155 lines long (with comments and empty lines removed). Past this point however, the size differences are usually pretty small, or require significantly reducing functionality.

And for those curious…

Yes, I kept counting: There are 14 fucking acronyms in this blog post.

WHY!?