SafeDisk AI

BuildKit Cache Disk Full During Container Builds

Large container pipelines can build successfully and still fail in a later stage because BuildKit retains intermediate snapshots that fill the disk. Treat this as a stage-boundary policy problem: measure builder cache, images, containerd snapshots, and final artifacts before deciding whether pruning should be the default (with an opt-out) or an explicit, opt-in step.

Free BuildKit decision card

Separate reusable cache from one-shot intermediate layers.

On disposable builders, prune is usually safe after a successful stage. On persistent builders, preserve cache only when it has a measured reuse benefit and will not block the next stage.

builder cache -> final image -> next stage headroom -> prune policy
Read-only evidence

Capture peak stage footprint before pruning.

This packet shows whether disk is held by BuildKit cache, containerd snapshots, Docker images, builder-stage artifacts, app/game build output, or the host filesystem.

df -h; docker system df -v; docker builder du; du -sh /var/lib/docker/buildkit
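
If the packet needs to be taken at several stage boundaries, the same commands can be wrapped so each capture lands in its own labeled directory. This is a minimal sketch, not a prescribed tool: the function name capture_evidence, the label argument, and the default /tmp/disk-evidence location are assumptions for illustration. It does not change Docker or BuildKit state, but it does write small text files, so point it at a volume that still has room.

```shell
# Capture the evidence packet into a timestamped, labeled directory.
# capture_evidence, its arguments, and the default path are hypothetical names.
capture_evidence() {
  local label="$1"                          # e.g. before-engine, after-engine
  local out_root="${2:-/tmp/disk-evidence}" # assumed location, override as needed
  local out_dir
  out_dir="$out_root/$(date +%Y%m%dT%H%M%S)-$label"
  mkdir -p "$out_dir" || return 1

  # Each command may fail (docker missing, path absent, no permission).
  # stderr is kept with stdout so the packet shows what could not be measured.
  df -h                           > "$out_dir/df.txt"          2>&1 || true
  docker system df -v             > "$out_dir/system-df.txt"   2>&1 || true
  docker builder du               > "$out_dir/builder-du.txt"  2>&1 || true
  du -sh /var/lib/docker/buildkit > "$out_dir/buildkit-du.txt" 2>&1 || true

  # Print the directory so the caller can attach or diff it later.
  echo "$out_dir"
}
```

Calling capture_evidence before-engine and then capture_evidence after-engine gives two directories that can be compared with diff -r to see which consumer grew during the stage.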
Request $99 build policy Request $29 incident triage

Runbook: Define The Stage Boundary

  1. Measure disk before the engine/base image stage, after the final image is committed, and before the game/app stage starts.
  2. Separate final image size from retained intermediate snapshots. A successful build can still leave hundreds of GB in BuildKit.
  3. For disposable hosts, run docker builder prune -f after a successful stage and before the next disk-heavy stage.
  4. For persistent hosts, add a cache size cap and an age policy, and offer an explicit escape hatch (for example a --keep-cache or --no-prune option on the build script) only where reuse is valuable.
  5. Add a free-space admission gate before the next stage. The build should fail early with a readable message while logs can still be written.
  6. Update disk requirements to peak footprint: source clone + builder artifacts + final image + BuildKit cache + next-stage output.
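
Steps 5 and 6 can be sketched as two small shell helpers: one sums the peak-footprint components, the other refuses to start the next stage when free space is below that number. The function names and the sample sizes in the usage note are hypothetical; only df -Pk (POSIX output, 1024-byte blocks) is relied on.

```shell
# Sum peak-footprint components, all given in whole GB (step 6).
peak_footprint_gb() {
  local total=0 part
  for part in "$@"; do
    total=$((total + part))
  done
  echo "$total"
}

# Available space in whole GB on the filesystem that holds path $1.
# df -P keeps each filesystem on one line; column 4 is "Available" in KiB.
avail_gb() {
  df -Pk "$1" | awk 'NR==2 { print int($4 / 1024 / 1024) }'
}

# Admission gate (step 5): fail early, with a readable message, when the
# next stage would not fit. Returns 1 if short on space, 2 if unmeasurable.
require_free_gb() {
  local path="$1" need_gb="$2" have_gb
  have_gb=$(avail_gb "$path")
  if [ -z "$have_gb" ]; then
    echo "disk gate: could not measure free space on $path" >&2
    return 2
  fi
  if [ "$have_gb" -lt "$need_gb" ]; then
    echo "disk gate: $path has ${have_gb} GB free, next stage needs ${need_gb} GB" >&2
    return 1
  fi
  echo "disk gate: $path has ${have_gb} GB free (need ${need_gb} GB), continuing"
}
```

With made-up component sizes, need=$(peak_footprint_gb 40 120 60 200 80) yields 500, and require_free_gb /var/lib/docker "$need" || exit 1 stops the pipeline before the disk-heavy stage begins rather than partway through it.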
Copy-ready issue reply

Use this when BuildKit cache fills the host between stages.

This keeps the fix framed around stage boundaries and peak disk footprint, not just "add more disk."

I would frame this as a stage-boundary policy: after the engine image is committed, the next stage should not inherit the full intermediate cache unless the builder is intentionally persistent and the cache has measured reuse value.

Evidence I would capture before/after the engine stage:

df -hT / /var/lib/docker /var/lib/containerd
df -i / /var/lib/docker /var/lib/containerd
docker system df -v
docker builder du
du -xh /var/lib/docker/buildkit /var/lib/containerd 2>/dev/null | sort -h | tail -80

For disposable or one-shot builders, make prune the default after a successful stage and add an opt-out like --keep-cache. For persistent builders, add a cache cap/TTL and a free-space gate before the game/app build starts.
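
One way to express that policy in a pipeline script is a helper that prints the prune command for a given builder kind, so the choice can be reviewed before anything is deleted. This is a sketch under assumptions: prune_command, the disposable/persistent labels, the keep-cache argument, and the 72h and 50GB thresholds are all illustrative. The --filter until= and --keep-storage options belong to docker builder prune, but newer Docker releases offer other space flags, so check docker builder prune --help on the target host.

```shell
# Print the prune command for a builder kind; nothing is executed here.
# $1: "disposable" or "persistent" (hypothetical labels)
# $2: "1" to opt out and keep the cache (stands in for a --keep-cache flag)
prune_command() {
  local kind="$1" keep_cache="${2:-0}"

  if [ "$keep_cache" = "1" ]; then
    echo ":"    # opt-out: shell no-op, cache is left untouched
    return 0
  fi

  case "$kind" in
    disposable)
      # One-shot host: drop dangling build cache after a successful stage.
      echo "docker builder prune -f"
      ;;
    persistent)
      # Long-lived host: remove entries unused for 72h, keep up to 50GB.
      # Both thresholds are placeholders to be replaced by measured values.
      echo "docker builder prune -f --filter until=72h --keep-storage 50GB"
      ;;
    *)
      echo "unknown builder kind: $kind" >&2
      return 2
      ;;
  esac
}
```

Printing rather than running keeps the helper safe to call in a dry run; once the output looks right, the pipeline can execute it with eval "$(prune_command disposable)" immediately after the stage succeeds.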
Paid scope

Turn one BuildKit disk-full failure into a build policy.

The $99 policy is for teams running large container pipelines where intermediate layer cache, source builds, final images, and app/game artifacts compete for one host volume. You get a safe/review/do-not-touch boundary, prune policy, and free-space gates.

No secrets, private Dockerfiles, or registry credentials needed. Redacted disk output is enough to start.

Do Not Prune Blindly