Surprising subtleties of Docker permissions

The fundamental building block of our analysis platform is an analyzer. Since the static analysis world works in many different languages and can require many different libraries, each analyzer is its own Docker image, and the Dockerfile is provided by the analysis author. We provide the analyzer's inputs in the /analysis/inputs folder (where the list of inputs is determined by a manifest file), and once the container has finished running, we look for its output in /analysis/output. Usually, we do this by bind-mounting a directory on the host to the /analysis folder. When we run in our CI environment on Circle, we have to fall back to docker cp, since Circle's docker-in-docker solution uses a remote docker daemon, meaning that the container isn't necessarily running on the same machine as the code that launched it.
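
To make the two strategies concrete, here is a minimal sketch of the docker commands each one might involve. It only builds and prints the command lines; it does not run Docker. The image name, host directory, container name, and the function names are hypothetical, and the exact commands our client issues may differ.

```python
import os
import shlex

# Container-side folder described in the text.
CONTAINER_ROOT = "/analysis"


def bind_mount_commands(image, host_dir):
    """Local daemon: mount the host directory directly at /analysis."""
    host_dir = os.path.abspath(host_dir)
    return [
        ["docker", "run", "--rm", "-v", f"{host_dir}:{CONTAINER_ROOT}", image],
    ]


def copy_commands(image, host_dir, name):
    """Remote daemon: bind mounts cannot see our files, so copy in and out."""
    host_dir = os.path.abspath(host_dir)
    return [
        # Create the container without starting it, so files can be copied in.
        ["docker", "create", "--name", name, image],
        # A trailing "/." copies the directory's contents, not the directory.
        ["docker", "cp", f"{host_dir}/.", f"{name}:{CONTAINER_ROOT}"],
        # Run the analyzer and wait for it to exit.
        ["docker", "start", "--attach", name],
        # Fetch the results, then clean up the stopped container.
        ["docker", "cp", f"{name}:{CONTAINER_ROOT}/output/.", f"{host_dir}/output"],
        ["docker", "rm", name],
    ]


def plan(image, host_dir, daemon_is_remote, name="analyzer-run"):
    """Pick a strategy based on whether the daemon shares our filesystem."""
    if daemon_is_remote:
        return copy_commands(image, host_dir, name)
    return bind_mount_commands(image, host_dir)


if __name__ == "__main__":
    # Hypothetical image name and working directory, for illustration only.
    for cmd in plan("example/analyzer:latest", "./work", daemon_is_remote=True):
        print(shlex.join(cmd))
```

The bind-mount path is a single command because the container reads and writes the host directory in place. The copy path needs an explicit create, copy, start, copy sequence because nothing is shared between the two machines.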

This seems like it'd work, and for a while, it did. But when we started running our client on Linux hosts, we ran into weird issues related to filesystem permissions.

A digression into POSIX filesystem permissions

Before getting into detail, let's explore the typical POSIX filesystem access control model. This model is shared by macOS, BSD, Linux, and other similar operating systems (notably, not Windows). If you're familiar with how these permissions work, including what write and execute permissions mean on a directory, you can skip to the next section.

Each file has an owner, which is stored as a number known as a user ID, and a group, which is similarly stored as a group ID. The permissions entry for a file controls who can read, write, or execute the file, and this can be controlled separately for the owner, for users in the file's group, and for all other users (AKA 'other'). This is typically represented in a form like `rwxr-x
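
As a rough illustration of the owner/group/other model described above, the following sketch decodes a permission mode with Python's standard stat module. The mode 0o640 and the numeric IDs are made-up values for the example, and the helper names are hypothetical. The sketch ignores the superuser and any ACLs, which can override these basic checks.

```python
import stat


def permission_class(file_uid, file_gid, uid, gids):
    """Return which permission triple applies to a user.

    Only one class is ever consulted: the owner triple if the user owns
    the file, else the group triple if the user is in the file's group,
    else the triple for everyone else.
    """
    if uid == file_uid:
        return "owner"
    if file_gid in gids:
        return "group"
    return "other"


def allowed(mode, cls):
    """Decode the read/write/execute bits for one class from a mode."""
    shift = {"owner": 6, "group": 3, "other": 0}[cls]
    bits = (mode >> shift) & 0o7
    return {
        "read": bool(bits & 0o4),
        "write": bool(bits & 0o2),
        "execute": bool(bits & 0o1),
    }


if __name__ == "__main__":
    # Hypothetical regular file: owner may read and write, group may read.
    mode = stat.S_IFREG | 0o640
    print(stat.filemode(mode))  # -rw-r-----

    # Hypothetical IDs: file owned by uid 1000, gid 100; caller is uid 2000
    # and a member of groups 100 and 200, so the group triple applies.
    cls = permission_class(1000, 100, uid=2000, gids={100, 200})
    print(cls, allowed(mode, cls))
```

Note that the classes are checked in order and do not combine: a file's owner gets only the owner triple, even if the group or other triple would grant more.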
