CPU Image Pulls CUDA Dependencies And Fills CI Disk
A hosted runner can run out of disk even when the failing image is supposed to be CPU-only. The real bug is often a dependency contract leak: vLLM, GPU torch, nvidia wheels, CUDA toolkit, flashinfer, or triton enter the default backend image instead of the GPU image.
Separate backend control-plane dependencies from opt-in GPU inference dependencies.
Generic cache cleanup only hides the problem. Add a build-time contract that fails when a CPU/backend image contains CUDA packages, and keep the GPU/vLLM dependency set owned by the image that actually runs it.
Quick check: `pip freeze | grep -Ei 'vllm|nvidia|cuda|triton|torch'; df -h /; docker system df`
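The disk part of that check can be turned from a log line into a gate. The sketch below is one possible shape, not a prescribed implementation: the helper names `available_kib` and `has_free_space` are hypothetical, and the 10 GiB threshold in the wiring comment is an assumption to tune for your image sizes. It parses `df -Pk` output, because the POSIX `-P` format keeps each filesystem on one line.

```shell
#!/bin/sh
# Hypothetical helper: read `df -Pk` output on stdin and print the
# "Available" column (field 4) of the first filesystem line.
available_kib() {
    awk 'NR == 2 { print $4 }'
}

# Hypothetical helper: succeed when at least $1 KiB are free according
# to the `df -Pk` output read from stdin.
has_free_space() {
    required_kib=$1
    free_kib=$(available_kib)
    [ -n "$free_kib" ] && [ "$free_kib" -ge "$required_kib" ]
}

# Example wiring before the heavy build step (threshold is an assumption):
# df -h /; docker system df
# df -Pk / | has_free_space 10485760 || { echo "not enough disk for build" >&2; exit 1; }
```

Failing early with a clear message is cheaper than letting the build die partway through a layer export, but it remains a symptom check. It does not replace the dependency contract.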
First Response Runbook
- Define the default backend image contract: no vLLM, CUDA toolkit, nvidia wheels, flashinfer, triton, or GPU torch closure unless explicitly enabled.
- Move GPU inference dependencies into an optional extra, separate requirements file, or dedicated GPU image.
- Log `df -h /`, Docker/buildx disk usage, and `pip freeze | grep -Ei "vllm|nvidia|cuda|triton|torch"` before the smoke build.
- Add a build-time assertion that fails when the default image contains CUDA packages.
- Keep the GPU image smoke path separate so removing vLLM from the backend does not break intended GPU deployments.
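The build-time assertion from the runbook could look roughly like the sketch below. The package names come from the contract above. The regular expression, the variable `FORBIDDEN_RE`, and the helper names `find_gpu_packages` and `assert_cpu_only` are illustrative assumptions and will likely need tuning against your own lockfile.

```shell
#!/bin/sh
# Illustrative pattern (an assumption, not a complete list):
#  - exact names for vllm, triton, flashinfer
#  - any package whose name starts with nvidia- or cuda-
#  - torch builds carrying a +cuXXX local version tag
# A plain `torch==X` from PyPI on Linux typically drags in nvidia-* wheels,
# which the nvidia rule is meant to catch.
FORBIDDEN_RE='^(vllm|triton|flashinfer(-python)?)==|^(nvidia|cuda)[-_]|^torch==.*\+cu[0-9]'

# Hypothetical helper: print forbidden entries found in a `pip freeze`
# listing read from stdin (prints nothing when the listing is clean).
find_gpu_packages() {
    grep -Ei "$FORBIDDEN_RE" || true
}

# Hypothetical helper: return non-zero when the listing on stdin
# violates the CPU-only contract, naming the offenders on stderr.
assert_cpu_only() {
    offenders=$(find_gpu_packages)
    if [ -n "$offenders" ]; then
        echo "CPU image contract violated by:" >&2
        echo "$offenders" >&2
        return 1
    fi
}

# Example wiring as the last step of the default image build:
# pip freeze | assert_cpu_only
```

Anchoring each name at the start of the line keeps unrelated packages such as `tritonclient` or a `+cpu` torch build from tripping the guard, which a bare substring grep would flag.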
Use this runbook when a CPU or backend image pulls GPU dependencies.
The point is to turn a one-off disk cleanup into a dependency architecture guard: an image-contract check, not only a runner cleanup.
Acceptance checks:
- Default backend image does not install vllm, cuda-toolkit, nvidia-* wheels, flashinfer, triton, or GPU torch by default.
- GPU/vLLM dependencies live in an optional extra, separate requirements file, or dedicated GPU image.
- Smoke test logs `df -h /`, `docker system df`, and `pip freeze | grep -Ei "vllm|nvidia|cuda|triton|torch"` before heavy build steps.
- Build fails if the default image dependency closure contains CUDA packages.
- GPU image keeps its own smoke path proving vLLM still installs where intended.
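The last acceptance check is the mirror image of the CPU guard: the GPU image should prove that the inference stack is still there. A minimal sketch, assuming the smoke step can run `pip freeze` inside the GPU image. The helper name `assert_packages_present` is hypothetical, and which packages to require is your call.

```shell
#!/bin/sh
# Hypothetical helper: succeed only when every package named in the
# arguments appears in the `pip freeze` listing read from stdin.
assert_packages_present() {
    listing=$(cat)
    for pkg in "$@"; do
        # Match "name==version" as well as "name @ url" direct references.
        if ! printf '%s\n' "$listing" | grep -Eiq "^${pkg}(==| @ )"; then
            echo "GPU image is missing expected package: $pkg" >&2
            return 1
        fi
    done
}

# Example wiring inside the GPU image smoke step:
# pip freeze | assert_packages_present vllm torch
```

Running both directions, absence in the default image and presence in the GPU image, is what keeps a well-meant cleanup of the backend requirements from silently breaking GPU deployments.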
Paid Scope
The $29 incident triage reviews one failing build and returns the safest next diagnostic step. The $99 team pilot turns one representative Docker CI disk-full failure into a backend-vs-GPU image dependency contract, guardrail assertions, and a runner disk preflight checklist.