What is this for?
NixOS's traditional initrd is generated by listing the paths that
should be included in initrd and copying the full runtime closure of
those paths into the archive. For almost any executable, this means
copying the entirety of huge packages like glibc when only a few
files, such as the shared libraries, are needed. To
solve this, NixOS does a variety of patchwork to edit the files being
copied in so they only refer to small, patched up paths. For instance,
executables and their shared library dependencies are copied into an
extraUtils derivation, and every ELF file is patched to refer to
files in that output.
The problem with this is that it is often difficult to correctly patch
some things. For instance, systemd bakes the path to the mount
command into the binary, so patchelf is no help. Instead, it's very
often easier to simply copy the desired files to their original store
locations in initrd and not copy their entire runtime closure. This
does mean that it is the burden of the developer to ensure that all
necessary dependencies are copied in, as closures won't be
consulted. However, it is rare that full closures are actually
desirable, so in the traditional initrd, the developer was likely to
do manual work on patching the dependencies explicitly anyway.
How it works
This program is similar to its inspiration (find-libs from the
traditional initrd), except that it also handles symlinks and
directories according to certain rules. As input, it receives a
sequence of pairs of paths. The first path is an object to copy into
initrd. The second path (if not empty) is the path to a symlink that
should be placed in the initrd, pointing to that object. How that
object is copied depends on its type.
- A regular file is copied directly to the same absolute path in the initrd.
  - If it is also an ELF file, then all of its direct shared library dependencies are also listed as objects to be copied.
- A directory's direct children are listed as objects to be copied, and a directory at the same absolute path in the initrd is created.
- A symlink's target is listed as an object to be copied.
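The rules above can be sketched roughly as follows. This is a minimal illustration using only the standard library, not the code in src: the names Action, classify, and plan are hypothetical, and the shared library lookup is passed in as a closure (elf_deps) because resolving ELF dependencies needs a real ELF parser, which the sketch leaves out. It only plans what would be copied; it does not write anything.

```rust
use std::collections::{HashSet, VecDeque};
use std::fs;
use std::io::{self, Read};
use std::path::{Path, PathBuf};

/// What the rules say to do with one object (hypothetical type for this sketch).
#[derive(Debug, PartialEq)]
pub enum Action {
    /// Copy the file to the same absolute path in the initrd.
    CopyFile { is_elf: bool },
    /// Create the directory and list its direct children as objects.
    CreateDir { children: Vec<PathBuf> },
    /// List the symlink's target as an object.
    FollowSymlink { target: PathBuf },
}

/// ELF files start with the four magic bytes 0x7f 'E' 'L' 'F'.
pub fn is_elf(path: &Path) -> io::Result<bool> {
    let mut magic = [0u8; 4];
    let mut file = fs::File::open(path)?;
    match file.read_exact(&mut magic) {
        Ok(()) => Ok(magic == *b"\x7fELF"),
        // A file shorter than four bytes cannot be ELF.
        Err(e) if e.kind() == io::ErrorKind::UnexpectedEof => Ok(false),
        Err(e) => Err(e),
    }
}

/// Decide which rule applies to `path` without following a final symlink.
pub fn classify(path: &Path) -> io::Result<Action> {
    let ty = fs::symlink_metadata(path)?.file_type();
    if ty.is_symlink() {
        let link = fs::read_link(path)?;
        // A relative target is interpreted relative to the symlink's directory.
        let target = match path.parent() {
            Some(parent) if link.is_relative() => parent.join(&link),
            _ => link,
        };
        Ok(Action::FollowSymlink { target })
    } else if ty.is_dir() {
        let mut children = Vec::new();
        for entry in fs::read_dir(path)? {
            children.push(entry?.path());
        }
        children.sort(); // deterministic order for the sketch
        Ok(Action::CreateDir { children })
    } else {
        // Assumption: anything else is a regular file.
        Ok(Action::CopyFile { is_elf: is_elf(path)? })
    }
}

/// Process objects from a queue. Newly discovered objects are appended
/// to the queue rather than recursed into immediately.
pub fn plan<F>(roots: &[PathBuf], elf_deps: F) -> io::Result<Vec<(PathBuf, Action)>>
where
    F: Fn(&Path) -> Vec<PathBuf>, // assumed: returns direct shared library paths
{
    let mut queue: VecDeque<PathBuf> = roots.iter().cloned().collect();
    let mut seen = HashSet::new();
    let mut steps = Vec::new();
    while let Some(object) = queue.pop_front() {
        if !seen.insert(object.clone()) {
            continue; // already handled
        }
        let action = classify(&object)?;
        match &action {
            Action::CopyFile { is_elf: true } => queue.extend(elf_deps(&object)),
            Action::CopyFile { is_elf: false } => {}
            Action::CreateDir { children } => queue.extend(children.iter().cloned()),
            Action::FollowSymlink { target } => queue.push_back(target.clone()),
        }
        steps.push((object, action));
    }
    Ok(steps)
}
```

Calling plan with a list of root objects and a dependency lookup returns the objects in the order they would be handled, each paired with the rule that applied. The seen set is a choice made for the sketch so that an object reachable in several ways is only planned once.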
There are a couple of quirks to mention here. First, the term "object"
refers to the final file path that the developer intends to have
copied into initrd. This means any parent directory is not considered
an object just because its child was listed as an object in the
program input; instead those intermediate directories are simply
created in support of the target object. Second, shared libraries,
directory children, and symlink targets aren't immediately recursed,
because they simply get listed as objects themselves, and are
therefore traversed when they themselves are processed. Finally,
symlinks in the intermediate directories leading to an object are
preserved, meaning an input object /a/symlink/b will just result in
initrd containing /a/symlink -> /target/b and /target/b, even if
/target has other children. Preserving symlinks in this manner is
important for things like systemd.
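One way to picture how symlinks in intermediate directories can be found is to walk an object path one component at a time. The sketch below is an illustration under stated assumptions, not the program's own code: Walk and walk_intermediate are hypothetical names, only the standard library is used, and chains of symlinks or targets containing `..` are not normalized. It records each symlink it passes through as found on disk and reports where the object actually lives.

```rust
use std::fs;
use std::io;
use std::path::{Component, Path, PathBuf};

/// Result of walking one object path (hypothetical type for this sketch).
#[derive(Debug, PartialEq)]
pub struct Walk {
    /// Symlinks found in intermediate directories, as (link path, raw target).
    pub links: Vec<(PathBuf, PathBuf)>,
    /// Where the object lives once intermediate symlinks are followed.
    pub resolved: PathBuf,
}

/// Walk `object` component by component, noting every intermediate symlink.
/// The final component is the object itself and is left untouched here.
pub fn walk_intermediate(object: &Path) -> io::Result<Walk> {
    let components: Vec<Component> = object.components().collect();
    let last = components.len().saturating_sub(1);
    let mut links = Vec::new();
    let mut current = PathBuf::new();
    for (i, component) in components.iter().enumerate() {
        current.push(component);
        if i == last {
            break;
        }
        if fs::symlink_metadata(&current)?.file_type().is_symlink() {
            let target = fs::read_link(&current)?;
            links.push((current.clone(), target.clone()));
            // Continue the walk from the symlink's target; a relative
            // target is interpreted relative to the symlink's directory.
            current = match current.parent() {
                Some(parent) if target.is_relative() => parent.join(&target),
                _ => target,
            };
        }
    }
    Ok(Walk { links, resolved: current })
}
```

For an object given through a symlinked directory, links holds that one symlink and resolved points at the file under the symlink's target, so siblings of the file in the target directory never enter the picture.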
These rules automate the most important and obviously necessary copying that needs to be done in most cases, allowing programs and configuration files to go unpatched, while keeping the content of the initrd to a minimum.
Why Rust?
- A prototype of this logic was written in Bash, in an attempt to keep with its find-libs ancestor, but that program was difficult to write, and ended up taking several minutes to run. This program runs in less than a second, and the code is substantially easier to work with.
- This will not require end users to install a Rust toolchain to use NixOS, as long as this tool is cached by Hydra. And if you're bootstrapping NixOS from source, rustc is already required anyway.
- Rust was favored over Python for its type system, and because if you want to go fast, why not go really fast?