CI Worker Threat Model
This document describes the security boundary for docstore's CI execution environment and the mitigations in place to limit the blast radius of malicious user code.
Trust boundary
The ci-worker binary is trusted. Everything that runs after it hands execution
to BuildKit — user-defined build steps executing inside the Kata CLH
microVM — is treated as adversarial.
ci-worker binary (trusted)
├── claims job from ci-scheduler (K8s SA proof)
├── fetches ci.yaml via request_token
├── obtains presigned archive URL via request_token
└── hands off to BuildKit ← trust boundary
    └── user build steps (untrusted)
        ├── reads request_token from /run/secrets/
        └── has host network namespace (--oci-worker-net=host)
Credentials available inside the VM
| Credential | How obtained | Notes |
|---|---|---|
| request_token | BuildKit secret mount at /run/secrets/docstore_oidc_request_token | Readable by any build step |
| OIDC token URL | BuildKit secret mount at /run/secrets/docstore_oidc_request_url | Needed to exchange request_token for JWT |
| GCP metadata server | Plain HTTP to 169.254.169.254 | See mitigations below |
| Docker daemon | tcp://localhost:2375, unauthenticated; DOCKER_HOST is set | Gives full container control within the VM |
| Cluster-internal network | --oci-worker-net=host gives build containers the VM's network namespace | Can reach cluster services |
What the request_token can do
The request_token is a short-lived opaque token bound to a single CI job. It
is accepted by endpoints on the docstore server and the ci-scheduler. All
docstore endpoints enforce that job.Repo matches the URL path repo:
| Server | Endpoint | Purpose |
|---|---|---|
| docstore | POST /repos/{repo}/-/archive/presign | Get presigned source archive URL |
| docstore | POST /repos/{repo}/-/check/{name}/logs | Upload check run log content |
| docstore | GET /repos/{repo}/-/ci/config | Fetch .docstore/ci.yaml for the job's branch/sequence |
| docstore | POST /repos/{repo}/-/check | Report check run status |
| ci-scheduler | POST /jobs/{id}/heartbeat | Keep job alive (cluster-internal only) |
| ci-scheduler | POST /jobs/{id}/complete | Report job completion (cluster-internal only) |
The ci-scheduler endpoints are only reachable from within the cluster
(ci-scheduler.docstore-ci.svc.cluster.local). Both validate the request_token
and enforce that the token's job ID matches the URL {id}.
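The binding checks described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the docstore or ci-scheduler implementation: the `Job` type, the in-memory `JOBS` table, and all function names are hypothetical, and a real token lookup would not be a plain dictionary.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Job:
    """Hypothetical record of the single job a request_token is bound to."""
    id: str
    repo: str


# Hypothetical token store: request_token -> the job it was issued for.
JOBS = {"tok-123": Job(id="job-1", repo="acme/repo-a")}


def lookup_job(request_token: str) -> Optional[Job]:
    """Resolve a token to its job, or None if the token is unknown."""
    return JOBS.get(request_token)


def docstore_allows(request_token: str, url_repo: str) -> bool:
    """docstore-style check: job.Repo must equal the repo in the URL path."""
    job = lookup_job(request_token)
    return job is not None and job.repo == url_repo


def scheduler_allows(request_token: str, url_job_id: str) -> bool:
    """ci-scheduler-style check: the token's job ID must equal the URL {id}."""
    job = lookup_job(request_token)
    return job is not None and job.id == url_job_id
```

Under these assumptions, a token stolen by a build step in `acme/repo-a` is rejected for any other repo's path and for any other job's heartbeat or completion endpoint.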
The request_token can also be exchanged at the ci-oidc endpoint for a short-lived OIDC JWT. The audience determines what the JWT can access:
- aud=ci-registry: authenticate to the BuildKit layer cache registry
- aud=docstore: authenticate to the docstore API (see below)
OIDC JWT (aud=docstore) permissions
The OIDC JWT is validated by the docstore server. After validation, the request is checked against an allowlist before reaching the inner API mux:
- The URL path repo must match jobID.Repo, so there is no cross-repo access.
- The endpoint must be permitted by the job's declared permissions.
Default permissions (no permissions: block in ci.yaml): checks: write only,
which allows POST /repos/{own-repo}/-/check.
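As a rough sketch of how such an allowlist gate might behave, assuming the JWT has already been validated: the `ALLOWLIST` structure and the `gate` function are hypothetical, and only the `checks: write` entry is taken from this document. The endpoints behind other permissions are not listed here, so they are left out rather than guessed.

```python
from typing import Dict, FrozenSet, Iterable, Tuple

# Maps a declared permission to the (method, path suffix) pairs it unlocks.
# Hypothetical structure; only the "checks: write" entry comes from the text.
ALLOWLIST: Dict[str, FrozenSet[Tuple[str, str]]] = {
    "checks: write": frozenset({("POST", "/-/check")}),
}

# Applied when ci.yaml has no permissions: block.
DEFAULT_PERMISSIONS = frozenset({"checks: write"})


def gate(job_repo: str, permissions: Iterable[str], method: str, path: str) -> bool:
    """Return True only if the path targets the job's own repo AND the
    (method, endpoint) pair is unlocked by one of the job's permissions."""
    prefix = f"/repos/{job_repo}"
    # Require the "/-/" separator right after the repo so that a repo named
    # "acme/repo-a-evil" cannot pass a prefix match for "acme/repo-a".
    if not path.startswith(prefix + "/-/"):
        return False
    suffix = path[len(prefix):]
    return any((method, suffix) in ALLOWLIST.get(p, frozenset()) for p in permissions)
```

With the default permissions, `gate("acme/repo-a", DEFAULT_PERMISSIONS, "POST", "/repos/acme/repo-a/-/check")` passes, while the same request against another repo or with another method is denied.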
Elevated permissions can be declared in .docstore/ci.yaml:
permissions:
  contents: write   # commit, branch, merge, rebase, purge
  proposals: write  # open proposals, post reviews/comments
  issues: write     # create/close/comment on issues
  releases: write   # create/delete releases
  ci: write         # trigger CI runs on own repo
Permissions are evaluated at job dispatch time, not at request time. For proposal (PR) jobs, permissions are read from the target branch (base branch) ci.yaml, not the source branch. A PR cannot grant itself elevated permissions — they only take effect after the permission change is reviewed and merged. See ci.md for details.
The on: trigger filter (which controls whether CI runs at all for a given
proposal) is evaluated from the source branch ci.yaml. A PR author can
therefore suppress or expand which base branches trigger CI on their branch, but
this does not affect permissions, which remain base-branch-gated.
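The split between the two ci.yaml sources can be illustrated with a small sketch. It assumes both files have already been parsed into dictionaries; `EffectiveConfig` and `effective_config` are hypothetical names, the `on:` value is treated as opaque, and whether a declared `permissions:` block replaces or extends the default is an assumption made here (it replaces it).

```python
from dataclasses import dataclass
from typing import Any, Dict

# Default when the base-branch ci.yaml has no permissions: block.
DEFAULT_PERMISSIONS = {"checks": "write"}


@dataclass(frozen=True)
class EffectiveConfig:
    permissions: Dict[str, str]  # always taken from the base branch
    on: Any                      # always taken from the source branch


def effective_config(base_ci: Dict[str, Any], source_ci: Dict[str, Any]) -> EffectiveConfig:
    """Combine the two ci.yaml views for a proposal job at dispatch time."""
    if "permissions" in base_ci:
        # Assumption: a declared block replaces the default outright.
        permissions = dict(base_ci["permissions"])
    else:
        permissions = dict(DEFAULT_PERMISSIONS)
    # Any permissions: block in source_ci is ignored on purpose.
    return EffectiveConfig(permissions=permissions, on=source_ci.get("on"))
```

In this sketch, a proposal whose source branch adds `contents: write` still dispatches with only `checks: write` until that change lands on the base branch.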
GCP metadata server
The entrypoint-worker.sh adds an iptables rule to block outbound traffic to
169.254.169.254 before buildkitd and dockerd start:
iptables -I OUTPUT -d 169.254.169.254 -j DROP
This is not a hard security boundary. Because build steps run with host networking and the Docker daemon is unauthenticated, a sufficiently motivated build step can remove this rule and reach the metadata server:
docker run --net=host --cap-add=NET_ADMIN alpine \
  sh -c "iptables -F OUTPUT && curl -H 'Metadata-Flavor: Google' \
  http://metadata.google.internal/computeMetadata/v1/instance/service-accounts/default/token"
The real security boundary is the ci-runner GCP service account's IAM grants, which are intentionally minimal (see below). The iptables rule is defense-in-depth.
ci-runner GCP service account permissions
The ci-runner@dlorenc-chainguard.iam.gserviceaccount.com SA is bound to the
ci-worker Kubernetes service account via Workload Identity. Its grants are:
| Scope | Role | Rationale |
|---|---|---|
| Project | roles/artifactregistry.reader | Pull the ci-worker container image |
No other project-level roles. No bucket-level grants.
Notably absent and intentionally so:
- No roles/cloudsql.client — ci-worker talks to ci-scheduler over HTTP; it
never connects to the database directly.
- No GCS access — log writes go through the docstore server's
request_token-gated endpoint; ci-worker has no direct GCS dependency.
ci-registry cache access
The BuildKit layer cache registry uses a separate SA
(ci-registry@dlorenc-chainguard.iam.gserviceaccount.com) with
roles/storage.objectAdmin on the cache bucket. Access is scoped at two levels:
- Org-level: the OIDC JWT audience ci-registry is required.
- Repo-level: auth.go enforces exact repo equality, so a token for acme/repo-a can only push/pull acme/repo-a:* refs, not acme/repo-b:*.
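The two scoping levels can be sketched as a single check. This is not the contents of auth.go: `registry_allows` is a hypothetical function, it assumes the JWT signature has already been verified, and it assumes refs take the form `repo:tag`.

```python
def registry_allows(token_aud: str, token_repo: str, ref: str) -> bool:
    """Return True if a token may push/pull the given cache ref.

    Level 1: the token audience must be exactly "ci-registry".
    Level 2: the repo part of the ref must equal the token's repo exactly.
    A prefix match is not enough.
    """
    if token_aud != "ci-registry":
        return False
    # Split on the last ":" so the tag is separated from the repo path.
    ref_repo, sep, _tag = ref.rpartition(":")
    if not sep:
        # Assumption: refs without a tag are rejected in this sketch.
        return False
    return ref_repo == token_repo
```

Exact equality matters here: a prefix comparison would let a token for `acme/repo-a` touch refs under a repo such as `acme/repo-a-extra`.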
K8s service account token
The K8s SA token for the ci-worker pod is used to claim jobs from ci-scheduler
(k8sproof validation). The scheduler enforces one-claim-per-pod: once a pod has
claimed a job, its SA token cannot be used to claim another. A malicious build
step that steals the SA token and calls /claim will receive a rejection.
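A one-claim-per-pod rule of this kind might look like the sketch below. It assumes the scheduler has already validated the SA token and extracted a stable pod identity from it; `ClaimTracker` and `pod_uid` are hypothetical names, and a real scheduler would persist this state rather than keep it in memory.

```python
import threading
from typing import Dict


class ClaimTracker:
    """Tracks which pods have already claimed a job (in-memory sketch)."""

    def __init__(self) -> None:
        self._lock = threading.Lock()
        self._claimed: Dict[str, str] = {}  # pod identity -> claimed job ID

    def claim(self, pod_uid: str, job_id: str) -> bool:
        """Record the claim and return True, or return False if this pod
        has already claimed any job."""
        with self._lock:
            if pod_uid in self._claimed:
                return False
            self._claimed[pod_uid] = job_id
            return True
```

Because the legitimate ci-worker claims its job before handing off to BuildKit, a later `/claim` made with the same pod's stolen token hits the rejection path.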
What is NOT reachable
- Other tenants' request_tokens or source archives: separate Kata VMs, no state sharing between jobs
- The OIDC JWT signing key: lives in GCP KMS, never touches the VM
- Cross-repo API operations: enforced at the OIDC JWT allowlist gate
- Other tenants' presigned archive URLs: job.Repo == URL repo enforced in the presign handler
- Cross-org ci-registry operations: enforced in auth.go
- Cloud SQL: ci-runner SA has no cloudsql.client grant
- Other tenants' build logs: ci-runner SA has no GCS grants; log access goes through the docstore server, which enforces repo-level authorization
Residual risks and future work
- iptables bypass: a privileged build step with Docker daemon access can remove the metadata server block. Mitigated by minimal SA permissions. Long-term fix: run buildkitd/dockerd as a separate less-privileged process, or use a network policy at the Kata VM level.
- Cluster-internal network: host networking gives build steps access to cluster services. The ci-scheduler and docstore server do not accept requests from arbitrary cluster workloads, but this is worth hardening with NetworkPolicy.
- Cache poisoning within same org: repo-level scoping in ci-registry prevents cross-repo cache poisoning. Cache integrity relies on BuildKit's content-addressable layer verification.