Reproducible container images

August 21, 2020

Reproducible (or deterministic) builds produce bit-for-bit identical artifacts from the same source code, no matter who runs the build or when. I will not go into the details of why reproducible builds are desirable. Instead I will walk through an example in which the build produces a container image (aka Docker image), and show how to make that build reproducible.

Suppose your team is working on a project that consists of a Git repository with the code for your app and a Dockerfile. For simplicity let’s assume that your code is just a simple shell script that prints “Hello, World!”, so that your repository looks like this:

├── Dockerfile
└── script.sh

with Dockerfile:

FROM busybox

ADD script.sh .

ENTRYPOINT ["/script.sh"]

and script.sh:

#!/bin/sh

echo "Hello, World!"

The problem

One of your team members, say Bob, checks out the repository and builds the image. In what follows I will use Buildah and Podman to work with Docker images instead of the traditional Docker tools: Buildah builds container images like docker build does, while Podman runs containers. Unlike Docker, which requires you to be a member of the docker group in order to use it (effectively giving you root privileges), Buildah and Podman can be run by an unprivileged user who is not a member of any special group, and therefore provide a much safer alternative. The Buildah equivalent of docker build is buildah bud:

$ buildah bud .
STEP 1: FROM busybox
STEP 2: ADD script.sh .
STEP 3: ENTRYPOINT ["/script.sh"]
STEP 4: COMMIT
Getting image source signatures
Copying blob 514c3a3e64d4 skipped: already exists
Copying blob eb8995c05b8a done
Copying config dfd10908b6 done
Writing manifest to image destination
Storing signatures
--> dfd10908b64
dfd10908b6402d65151bd316c9fe7ecb44e12dffdb6f25344818393e47f045e9

Now another of your team members, Alice, checks out the repository and builds it:

$ buildah bud .
STEP 1: FROM busybox
STEP 2: ADD script.sh .
STEP 3: ENTRYPOINT ["/script.sh"]
STEP 4: COMMIT
Getting image source signatures
Copying blob 514c3a3e64d4 skipped: already exists
Copying blob eb8995c05b8a [--------------------------------------] 0.0b / 0.0b
Copying config 852e0ad309 done
Writing manifest to image destination
Storing signatures
--> 852e0ad3092
852e0ad3092d0b78dc6838da3f76e0919c087ee0d1b472b40c3892ce28611bd7

Notice the last two lines of each output, which are the short and full hash of the resulting container image: they differ between the two builds. In fact Bob and Alice built two different container images from the same code with the same Dockerfile. If we run them:

$ podman run --rm dfd10908b64
Hello, World!
$ podman run --rm 852e0ad3092
Hello, World!

we see the same output from both, but nevertheless they are different images. What caused the two builds to produce different outputs? There are two main reasons:

  1. Dependencies: if our software has external dependencies and different people build with different versions of those dependencies, the result will be different.

  2. Timestamps: if the final artifact of the build process (a Docker image in our case) contains timestamps of some of its components, builds at different times produce different artifacts.

Fix dependencies

Our shell script is extremely simple and doesn’t have any dependencies other than /bin/sh. Real production software often has hundreds of dependencies, and it’s important to specify the exact versions of all of them. In modern JavaScript projects, for example, this is done by tools like npm or yarn, which store exact version information in package-lock.json and yarn.lock files respectively.

Even our simple example depends on /bin/sh, and we use the first line of our Dockerfile, FROM busybox, to provide it. We can change it to specify the exact version, e.g. FROM busybox:1.32.0, so that whoever builds the image pulls version 1.32.0 of BusyBox, even ten years from now. We can do even better than that: 1.32.0 is just a tag attached to the image, and nothing prevents it from being moved to a different image at any point in time. Docker images however are content addressable, meaning we can target a specific image using a hash of its content, and the hash, unlike the version tag, cannot be changed (if you want to know more about how the hash is calculated I recommend taking a look at the image format specification). To find out the hash of version 1.32.0 of the BusyBox image, use podman inspect:

$ podman pull busybox:1.32.0
...
$ podman inspect busybox:1.32.0
[
    {
        "Id": "018c9d7b792b4be80095d957533667279843acf9a46c973067c8d1dff31ea8b4",
        "Digest": "sha256:400ee2ed939df769d4681023810d2e4fb9479b8401d97003c710d0e20f7c49c6",
        ...
    }
]

The "Digest" line is what you are looking for. Change your Dockerfile as follows:

# FROM busybox:1.32.0
FROM busybox@sha256:400ee2ed939df769d4681023810d2e4fb9479b8401d97003c710d0e20f7c49c6

ADD script.sh .

ENTRYPOINT ["/script.sh"]

Notice the @ sign before sha256:... in the FROM statement. Let’s see if our build is reproducible now. Unfortunately Alice and Bob still end up with different images. There is more work to be done, namely fixing timestamps.
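The difference between a mutable tag and an immutable digest can be illustrated with a toy content-addressed store. This is only a sketch built from mktemp and sha256sum: the directory layout and the helper names put_blob, tag and resolve are hypothetical and have nothing to do with how a real registry or Podman stores images.

```shell
#!/bin/sh
set -eu

# Toy store: blobs are named after their SHA-256, tags are small text files
store=$(mktemp -d)
trap 'rm -rf "$store"' EXIT
mkdir "$store/blobs" "$store/tags"

# Store stdin under its own SHA-256 digest and print that digest
put_blob() {
    tmp="$store/blobs/incoming"
    cat > "$tmp"
    digest=$(sha256sum "$tmp" | cut -d' ' -f1)
    mv "$tmp" "$store/blobs/$digest"
    echo "$digest"
}

# A tag is just a mutable name that points at a digest
tag() { echo "$2" > "$store/tags/$1"; }
resolve() { cat "$store/tags/$1"; }

old=$(echo "busybox build A" | put_blob)
tag 1.32.0 "$old"

new=$(echo "busybox build B" | put_blob)
tag 1.32.0 "$new"   # the same tag now points at different content

echo "tag 1.32.0 resolves to: $(resolve 1.32.0)"
echo "old digest still gives: $(cat "$store/blobs/$old")"
```

Re-pointing the tag changes what the tag resolves to, while the old digest keeps identifying exactly the content it was computed from. That is the property the FROM busybox@sha256:... line relies on.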

Fix timestamps

Timestamp indeterminism enters the generation of a Docker image in two places. First, when we ADD or COPY files to the image, those files get the timestamp of the moment they are copied. The same is true when we generate a file inside the Dockerfile using a RUN instruction. Second, according to the specification every image has a creation timestamp. Fixing this indeterminism requires setting all timestamps to a predefined date, e.g. the zero Unix time 1970-01-01T00:00:00.000Z. This cannot be done using a Dockerfile instruction, but requires some low-level commands provided by Buildah.
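The effect of file timestamps can be reproduced without any container tooling, because image layers are tar archives and a tar archive records each file's modification time. The following is a minimal sketch, assuming GNU tar, touch and sha256sum are available; the helper name archive_hash and the two sample timestamps are made up for the demonstration.

```shell
#!/bin/sh
set -eu

work=$(mktemp -d)
trap 'rm -rf "$work"' EXIT

printf '#!/bin/sh\necho "Hello, World!"\n' > "$work/script.sh"

# Archive the file and print the SHA-256 of the archive. The ustar format
# records the modification time, but not the access or change times.
archive_hash() {
    tar --format=ustar -C "$work" -cf - script.sh | sha256sum | cut -d' ' -f1
}

# Two "builds" at different times: same content, different mtime
touch -d @1000000000 "$work/script.sh"
build_a=$(archive_hash)
touch -d @2000000000 "$work/script.sh"
build_b=$(archive_hash)

# Two "builds" with the mtime pinned to the Unix epoch
touch -d @0 "$work/script.sh"
pinned_a=$(archive_hash)
touch -d @0 "$work/script.sh"
pinned_b=$(archive_hash)

if [ "$build_a" != "$build_b" ]; then
    echo "unpinned mtime: archives differ"
fi
if [ "$pinned_a" = "$pinned_b" ]; then
    echo "pinned mtime: archives are identical"
fi
```

The file content never changes in this sketch. Only the modification time does, and that alone is enough to change the hash of the archive.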

Create a shell script buildah.sh and make it executable:

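#!/bin/sh
# Stop at the first failing command instead of committing a broken image
set -e

# Create a working container from the pinned base image
ctr=$(buildah from busybox@sha256:400ee2ed939df769d4681023810d2e4fb9479b8401d97003c710d0e20f7c49c6)

# Copy the script in and reset its modification time to the Unix epoch
mnt=$(buildah mount "$ctr")
cp script.sh "$mnt"
touch --date=@0 "$mnt/script.sh"
buildah umount "$ctr"

# Same as ENTRYPOINT ["/script.sh"], with an empty CMD
buildah config --entrypoint '["/script.sh"]' --cmd '' "$ctr"

# Commit with the image creation timestamp set to the epoch, then clean up
buildah commit --omit-timestamp "$ctr"
buildah rm "$ctr"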

This script mimics the commands in our Dockerfile but allows a little more fine-tuning. The buildah from line is equivalent to the FROM instruction of a Dockerfile and creates a container from the specified image. buildah mount lets us interact with the container as a mounted filesystem, so it can be used to copy or modify files in it. We copy the script.sh file and then change its timestamp to zero using touch. Next we set ENTRYPOINT and CMD using buildah config. Finally, with buildah commit we generate our Docker image from the container. The --omit-timestamp flag is necessary to set the image creation timestamp to zero as well. We clean up by removing the temporary container with buildah rm.

One last detail: running buildah mount without root privileges requires running the whole script in a “simulated” environment (a user namespace), which is provided by buildah unshare:

buildah unshare ./buildah.sh

That’s it! Run the above script as many times as you want: it will always produce the same image, bit by bit.
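A claim like this is easy to check mechanically by running the build twice and comparing the results. Below is a minimal sketch of such a check. The function name check_reproducible is hypothetical, and it assumes that the build command prints the image ID as the last line of its output. The script above may print other IDs after the commit, so it might need a small adjustment before being used this way. The two stand-in "builds" exist only so that the sketch runs without Buildah installed.

```shell
#!/bin/sh
set -eu

# Run a build command twice and compare the last line of its output,
# which is assumed to be the ID of the image that was built.
check_reproducible() {
    first=$("$@" | tail -n 1)
    second=$("$@" | tail -n 1)
    if [ "$first" = "$second" ]; then
        echo "reproducible: $first"
    else
        echo "NOT reproducible: $first != $second" >&2
        return 1
    fi
}

# Stand-in builds (hypothetical): one always prints the same ID,
# the other prints a different value on every run.
stable_build() { printf 'building...\nabc123\n'; }
counter=$(mktemp)
trap 'rm -f "$counter"' EXIT
unstable_build() { echo x >> "$counter"; wc -l < "$counter"; }

check_reproducible stable_build
check_reproducible unstable_build || echo "second build differed, as expected"
```

With a real build, the call would look something like check_reproducible buildah unshare ./buildah.sh, provided the assumption about the last output line holds.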

Final remarks

We achieved our goal of building Docker images deterministically. Real-world examples may have other sources of indeterminism and must be checked carefully.

It should be noted however that there is a small drawback in this process: all images have the same creation time. This can be particularly annoying in CI environments where we build a new image for each commit. To solve this problem we can tag each image with the Git commit hash it was built from. This makes sense not only because the commit hash is a unique identifier, but also because, with a reproducible build, anyone who builds at that particular commit gets the same Docker image.
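The tagging scheme could be sketched as a small helper that turns a commit hash into an image reference. The function name image_ref and the image name hello are hypothetical, and the check assumes a 40-character SHA-1 commit hash as printed by git rev-parse HEAD. The hash in the example is a placeholder, not a real commit.

```shell
#!/bin/sh
set -eu

# Build an image reference "<name>:<commit>" from a full Git commit hash.
# Rejects anything that is not 40 lowercase hexadecimal characters.
image_ref() {
    name=$1
    commit=$2
    case $commit in
        *[!0-9a-f]*)
            echo "not a commit hash: $commit" >&2
            return 1
            ;;
    esac
    if [ "${#commit}" -ne 40 ]; then
        echo "not a full commit hash: $commit" >&2
        return 1
    fi
    echo "$name:$commit"
}

# In a real checkout the hash would come from: git rev-parse HEAD
ref=$(image_ref hello 0123456789abcdef0123456789abcdef01234567)
echo "$ref"
```

The resulting reference can then be passed as the image name argument of buildah commit, after the container, so that the image produced for each commit carries that commit's hash as its tag.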


© 2020, Nicola Squartini