depot/third_party/tvl/users/tazjin/presentations/bootstrapping-2018/notes.org
Default email c4fb0432ae Project import generated by Copybara.
GitOrigin-RevId: 3fc1143a04da49a92c3663813c6a0c1e8ccd477f
2020-09-29 23:42:59 -04:00

3.1 KiB

Bootstrapping, reproducibility, etc.

Compiler bootstrapping

This section contains notes about compiler bootstrapping, the history thereof, which compilers need it - and so on:

C

Haskell

  • self-hosted compiler (GHC)

Common Lisp

CL is fairly interesting in this space because it is a language that is defined via an ANSI standard that compiler implementations normally actually follow!

CL has several ecosystem components that focus on making abstracting away implementation-specific calls and if a self-hosted compiler is written in CL using those components it can be cross-bootstrapped.

Python

A note on runtimes

Sometimes the compiler just isn't enough …

LLVM

JVM

Slide thoughts:

  1. Hardware trust has been discussed here a bunch, most recently during the puri.sm talk. Hardware trust is important, as we see with IME, but it's striking that people often take a leap to "I'm now on my trusted Debian with free software". Unless you built it yourself from scratch (Spoiler: you haven't) you're placing trust in what is basically foreign binary blobs. Agenda: Implications/attack vectors of this, state of the chicken & egg, the topic of reproducibility, what can you do? (Nix!)
  2. Chicken-and-egg issue

    It's an important milestone for a language to become self-hosted: You begin doing a kind of dogfeeding, you begin to enforce reliability & consistency guarantees to avoid having to redo your own codebase constantly and so on.

    However, the implication is now that you need your own compiler to compile itself.

    Common examples:

    • C/C++ compilers needed to build C/C++ compilers: GCC 4.7 was the last version of GCC that could be built with a standard C-compiler, nowadays it is mostly written in C++. Certain versions of GCC can be built with LLVM/Clang. Clang/LLVM can be compiled by itself and also GCC.
    • Rust was originally written in OCAML but moved to being self-hosted in 2011. Currently rustc-releases are always built with a copy of the previous release. It's relatively new so we can build the chain all the way.

    Notable exceptions: Some popular languages are not self-hosted, for example Clojure. Languages also have runtimes, which may be written in something else (e.g. Haskell -> C runtime)

How to help:

Most of this advice is about reproducible builds, not bootstrapping, as that is a much harder project.

  • fix reproducibility issues listed in Debian's issue tracker (focus on non-Debian specific ones though)
  • experiment with NixOS / GuixSD to get a better grasp on the problem space of reproducibility

If you want to contribute to bootstrapping, look at bootstrappable.org and their wiki. Several initiatives such as MES could need help!