Hacker's Handbook


Dev Containers Part 3: UIDs and file ownership

Posted: 2024-10-29

This is the third entry in a series of posts on devcontainers; the first entry is here.

In the previous post, we showed how to configure devcontainers for a small project and how to access the environment from within Emacs.

The project directory mount

A running container is an isolated bubble, a separate operating system instance, and by default it does not have access to the host's file system. In order for the code running inside the container to have access to your project files (but no other files on your system), the project root directory in your local file system must be mounted inside the container. This is one of the things that devcontainers do automatically for you.

Typically, the container runs a version of Linux, such as Ubuntu or Alpine, and it will see your files through the eyes of a Linux file system with Unix-style User Identifiers (UIDs), Group Identifiers (GIDs), and permission flags (read/write/execute). Although it is possible to run Windows or macOS as the operating system inside the container, this article is only about the Linux case.

Your host system, on the other hand, could be anything - Windows (with an NTFS file system), macOS (with the Apple File System), or another Linux system natively running a typical Linux file system like ext4.

When a devcontainer is started (automatically by VSCode or some other editor, or manually via the devcontainer CLI), the project root .../DIRNAME in the host file system gets mounted in the container under the path /workspace/DIRNAME, using a Docker bind mount, so the project's files can be accessed from within. This should just work with no effort from you.
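If the defaults do not suit you, the mount can be overridden in devcontainer.json. As a sketch, the default behaviour described above corresponds roughly to setting the workspaceMount and workspaceFolder properties explicitly, using devcontainer.json's built-in substitution variables:

```json
{
    "workspaceMount": "source=${localWorkspaceFolder},target=/workspace/${localWorkspaceFolderBasename},type=bind",
    "workspaceFolder": "/workspace/${localWorkspaceFolderBasename}"
}
```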

File ownership

The big question is how the "user" inside the container - the processes running on your behalf - can be allowed to read and modify these files. For this to work, it must look to the container as if the files are owned by the same user account as the one running the processes, or alternatively as if the processes are running as the super-user (root in Linux) who is allowed to do anything.

On Windows and macOS, there is a virtualization layer that maps between the external user's own user ID and group ID (the ones used in the native file system on the host) and the Linux-style UID and GID used by the processes and the mounted files inside the container. Regardless of whether the UID inside the container is 0 (root) or something else like 1000 (typically the first normal user account on Linux), the virtualization translates the file ownership so that the processes in the container may access the files, and from outside the container it looks as if it was you who made the changes directly in the host file system, e.g. in NTFS on Windows.

On Linux, however, there is no virtualization, just sandboxing, and all UIDs and GIDs are passed through directly between the container and the host system. This is more efficient, but it can also be quite confusing.

Devcontainers on Linux

If you're only running on macOS or Windows and it's not your job to actually fiddle with the devcontainer configuration, then congratulations: you can skip the rest of this article. But if you need to understand what happens on a Linux host and want to make sure your containers are configured properly, read on.

No UID/GID translation

Symptoms of this lack of translation can be:

  1. Processes in the container are not allowed to read or write files in the workspace.

    Typically, this happens if the UID inside the container is 1000 but your actual UID on the host system is 1001, or vice versa.

  2. Files or directories that were created inside the container cannot be modified or deleted outside the container because they are owned by root.

    This happens if the container processes run as UID 0. They will always be allowed to create and modify files in the workspace, but any files or directories they create will be owned by UID 0 in the host file system as well.
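A quick way to diagnose both symptoms is to compare the numeric IDs on each side of the mount. This sketch uses only standard tools (ls -n prints UIDs and GIDs as numbers instead of names):

```shell
# Run this in the project directory on the host, and for comparison run the
# same commands inside the container (e.g. via `devcontainer exec`).
id -u       # the UID your processes run as on this side of the mount
ls -ln .    # numeric owner (3rd column) and group (4th column) of the files
```

If id -u inside the container prints a different number than the owner column from ls -ln, that is symptom 1; files and directories owned by 0 on the host side are symptom 2.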

UID Update

Devcontainers do have a "UID update" functionality on Linux. It is, however, not a mapping like the one done on Windows or macOS. It is just some extra code that, as the container starts, checks whether the external UID differs from the internal one, and if so, updates the UID within the container (by straight up modifying the internal /etc/passwd file) to be the same as the external one. The group ID is updated in the same way.

The problem is that it only works under these conditions:

  • The UID inside the container must not be 0 (root), since that account is special and cannot be given another ID.
  • The external UID must not already be in use by some other account inside the container, so that the update does not cause a clash.

If these conditions are not met, UID updating is skipped, and the container will run with its default UID.

When it works

  • If your external user happens to be UID 1000, as it typically is on a single-user Linux system, and the container also runs as UID 1000 internally, then no UID update is needed and things just work.
  • If the container runs as any account except root, such as UID 1000, and your external user is something else, such as UID 1001 (or 1017, or whatever), as it might be on a multi-user Linux system, then as long as the container does not have another internal account that could clash with yours, the UID update will happen and things will also just work.

When it doesn't work

  • Many containers are designed to run as root internally, through a USER declaration in the Dockerfile, and this will result in symptom 2 above.
  • Some containers have multiple user accounts internally, which can prevent the UID update. For example, if the container is set up to run its processes as UID 1001 and your external UID is 1000, then the UID update would try to make the container use 1000 instead - but if that UID is already in use within the container, the update fails, and the container processes will run with their default UID, probably resulting in symptom 1 above.

The latter can easily happen when the Dockerfile that specifies the dev container starts from a base image, assumes that the image has no user with UID 1000 or above, and then creates a new user with something like RUN useradd for its own purposes. If that assumption is wrong and UID 1000 already exists in the base image, the new user gets UID 1001 instead; this might work fine on Windows and macOS thanks to the virtualization, but it breaks on a Linux host.

Overriding the base container

Luckily, most problems can be fixed in the devcontainer.json file, if it is not in your power to rebuild the container image.

First of all, you can set containerUser to override the default user for all processes, or you can set remoteUser to override the user only for devcontainer commands.

"containerUser": "name-or-UID"

This is like specifying USER in a Dockerfile, so all the container's processes will launch as that user by default.

"remoteUser": "name-or-UID"

This only changes the user for running commands from the outside with devcontainer exec. This is also what happens when a tool like VSCode runs stuff inside the container. (What devcontainer exec does is basically just parse the devcontainer.json file and then run docker exec -u name-or-UID.)

In most documentation, remoteUser is the recommended one to set. I have not yet found a good explanation of when it is preferable to set containerUser instead.
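As a concrete sketch, a minimal devcontainer.json that sets the user could look like this (using one of Microsoft's devcontainer base images, whose default non-root user is vscode):

```json
{
    "image": "mcr.microsoft.com/devcontainers/base:ubuntu-22.04",
    "remoteUser": "vscode"
}
```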

Doing the right thing

Depending on how the base container is set up, you will need to do different things in order to make UID updating work, so that the UID inside the container will be the same as your external user ID:

  • If the base container runs as root, but also has a single non-root user that you can use (such as ubuntu or vscode), you only need to specify that internal user name or UID with remoteUser or containerUser.

  • If the base container runs as root and does not have a usable non-root user, don't panic - you just need to create that user when the container starts. Thankfully, someone has already solved this, so you do not need to write any weird scripts yourself.

    Specify the user name with remoteUser or containerUser as above, but also add the following feature declaration (see https://github.com/nils-geistmann/devcontainers-features/blob/main/src/create-remote-user/README.md):

    "features": {
        "ghcr.io/nils-geistmann/devcontainers-features/create-remote-user:0": {
        }
    }
  • If the base container already has a user with UID 1000, but it cannot be the devcontainer user for some reason, then adding a new user is not enough - the external UID will often be 1000, and then the automatic UID updating will fail because UID 1000 is already in use internally. One possibility is then to use a wrapper Dockerfile that changes UIDs or removes users; see https://code.visualstudio.com/remote/advancedcontainers/add-nonroot-user#_change-the-uidgid-of-an-existing-container-user

  • If you for some reason want to prevent UID update from happening at all, it can be disabled by setting "updateRemoteUserUID": false
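For the wrapper-Dockerfile approach mentioned above, the idea is to renumber the existing internal user at build time, before the container is ever started. A sketch along the lines of the linked VSCode documentation - the image name, user name, and target IDs are examples you would adapt to your situation:

```dockerfile
FROM some-base-image:latest

# Hypothetical example values - adapt to your base image and your host UID/GID.
ARG USERNAME=devuser
ARG USER_UID=1000
ARG USER_GID=$USER_UID

# Renumber the existing user and group, and fix ownership of the home directory.
RUN groupmod --gid $USER_GID $USERNAME \
    && usermod --uid $USER_UID --gid $USER_GID $USERNAME \
    && chown -R $USER_UID:$USER_GID /home/$USERNAME
```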

Devcontainer-ready images

A lot of the documentation on dev containers assumes you're using VSCode, and often also that you are using a devcontainer-ready image that is already set up with a suitable non-root user and some dev tools, and not just some basic Docker image like ubuntu:22.04.

In that case you typically do not need to set remoteUser or use the create-remote-user feature. For example, in https://github.com/devcontainers/images the images mcr.microsoft.com/devcontainers/base:ubuntu-20.04 and mcr.microsoft.com/devcontainers/base:ubuntu-22.04 both have a default user vscode with UID 1000.(*)

If you write your own Dockerfile for your devcontainer configuration instead of referring to some existing image, you should ensure that it is set up similarly so that it can be expected to work for everyone.

(*) In Microsoft's devcontainer images, the USER setting is actually root, but the images have additional devcontainer.metadata which can provide the same settings as devcontainer.json, and this contains an entry "remoteUser":"vscode", so that when started from VSCode or with devcontainer up, the user will be vscode. When started with a plain Docker command, the metadata is not parsed.

Ubuntu 24 hiccups

As of Ubuntu 24 (Noble Numbat), the Ubuntu base image comes with a non-root user ubuntu with UID 1000 - though by default the container will run as root. This means that you need to set remoteUser to ubuntu, but you must not use the create-remote-user feature (which you needed on Ubuntu 22 and earlier).

The corresponding devcontainer image from Microsoft (mcr.microsoft.com/devcontainers/base:ubuntu-24.04) has the additional vscode user as usual, but in the earlier versions of this image it gets UID 1001. When your external UID is 1000, as it often is, the UID update is skipped because UID 1000 is in use by the ubuntu user. This has been fixed in mcr.microsoft.com/devcontainers/base:1.2.0-ubuntu-24.04 and later (by deleting the ubuntu user). If you cannot switch to that version, you can either explicitly set "remoteUser": "ubuntu", or you can wrap the image in a Dockerfile that does RUN userdel -r ubuntu.
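If you go the wrapper route for one of the affected Ubuntu 24 images, the whole Dockerfile can be as small as this (a sketch built directly from the userdel command above):

```dockerfile
FROM mcr.microsoft.com/devcontainers/base:ubuntu-24.04

# Free up UID 1000 so the automatic UID update can renumber the vscode user.
RUN userdel -r ubuntu
```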

Emacs

If you're using Emacs with Tramp to access the files (after starting the container with devcontainer up --workspace-folder .), you can open a file in the container by writing the path as

/docker:<container-name-or-ID>:/path/to/file

(e.g. after pressing C-x C-f). This will however use the container's default user, because Emacs does not know about the devcontainer.json file. To select a different user you can write

/docker:<username-or-ID>@<container-name-or-ID>:/path/to/file

This should work by default in Emacs 29 or later.

Conclusion

The way Docker containers map onto different platforms, and how to configure devcontainers to take this into account, can be a source of much hair pulling and long sessions of googling. We hope this piece can be of help.

- Richard


Happi Hacking AB
KIVRA: 556912-2707
106 31 Stockholm