AI agent storage incident

Codex SQLite WAL Disk Full On Linux

If `~/.codex/logs_2.sqlite-wal` grows to GBs or hundreds of GBs, deleting the visible WAL may not free space. Stale or suspended Codex TUI sessions can keep deleted WAL file descriptors open, so `du ~/.codex` looks small while `df -h` still says the filesystem is full.

Free first read

Prove whether space is held by a deleted WAL inode.

The first pass is read-only: compare `du` and `df`, list deleted open files, and identify stale Codex PIDs. Checkpoint only afterwards, once the stale readers are gone.

du -xsh ~/.codex; df -h "$HOME"; lsof -nP +L1 | sort -k7,7nr
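To put a number on the gap between `du` and `df`, the `lsof +L1` listing can be totalled. The following is a minimal sketch, not part of the original runbook: `sum_deleted_bytes` is a hypothetical helper name, and it assumes the default `lsof +L1` column layout (`SIZE/OFF` in column 7, `DEVICE` in column 6, `NODE` in column 9). It counts each inode once, even when several processes hold it.

```shell
# Hypothetical helper: read `lsof -nP +L1` output on stdin and print the
# total bytes held by deleted-but-open files whose line matches a pattern.
# Assumes the default column layout: $6 = DEVICE, $7 = SIZE/OFF, $9 = NODE.
sum_deleted_bytes() {
  awk -v pat="$1" '
    NR > 1 && $7 ~ /^[0-9]+$/ && $0 ~ pat && !seen[$6 ":" $9]++ { total += $7 }
    END { printf "%.0f\n", total }
  '
}

# Example call (read-only):
#   lsof -nP +L1 2>/dev/null | sum_deleted_bytes 'sqlite-wal'
```

If the printed total is close to the difference between what `df` reports as used and what `du` can see, the space is most likely held by deleted WAL inodes rather than by visible files.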

What is happening

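SQLite WAL files are normal, but they only shrink when a checkpoint completes. A checkpoint cannot move past the oldest snapshot that an open reader still uses, so a long-lived reader can keep the WAL from being reset or truncated. If a stale Codex process still holds a deleted `logs_2.sqlite-wal`, `du` no longer counts the file because its directory entry is gone, but the filesystem stays full because the deleted inode remains allocated until the last file descriptor closes.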
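The deleted-inode behavior can be reproduced safely on a scratch file, without touching `~/.codex`. This is a minimal sketch under stated assumptions: `demo_deleted_inode` is a hypothetical name, and file descriptor 3 is assumed to be free in the calling shell.

```shell
# Hypothetical demo: a file's data outlives its directory entry for as long
# as a file descriptor stays open on it.
demo_deleted_inode() {
  tmp=$(mktemp) || return 1
  head -c 1048576 /dev/zero > "$tmp"   # write 1 MiB to a scratch file
  exec 3<"$tmp"                        # hold the file open on fd 3
  rm -f "$tmp"                         # remove the visible directory entry
  [ ! -e "$tmp" ] || return 1          # the path is gone
  wc -c <&3 | tr -d ' '                # the bytes are still readable via fd 3
  exec 3<&-                            # closing the fd releases the space
}
```

Calling `demo_deleted_inode` prints the number of bytes still reachable through the open descriptor after the path has been removed, which is the same situation a stale Codex session creates with a deleted WAL.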

Safe recovery order

Stop starting new Codex sessions while the filesystem is near full.
Use `lsof +L1` to prove whether deleted WAL or SHM files are still open.
Identify stale or suspended Codex processes with `ps`, `tmux ls`, and `fuser`.
Exit stale sessions cleanly if possible; otherwise terminate only the stale PIDs that hold the deleted WAL.
Run a SQLite checkpoint only after readers release the database.
Re-check `df -h`, `du -xsh ~/.codex`, and `lsof +L1` before deleting anything else.
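The "prove, then identify" steps above can also be done without `lsof` by reading `/proc` directly. The sketch below is an illustration, not the author's tooling: `find_deleted_holders` is a hypothetical helper, it assumes a Linux `/proc` filesystem, and it only sees processes whose file descriptors the current user may read.

```shell
# Hypothetical helper: print "PID target" for every open file descriptor
# whose link target contains the pattern and is marked "(deleted)".
# Read-only: it inspects /proc and changes nothing.
find_deleted_holders() {
  pattern=$1
  for fd in /proc/[0-9]*/fd/*; do
    target=$(readlink "$fd" 2>/dev/null) || continue
    case $target in
      *"$pattern"*" (deleted)")
        pid=${fd#/proc/}      # strip the leading /proc/
        pid=${pid%%/*}        # keep only the PID component
        printf '%s %s\n' "$pid" "$target"
        ;;
    esac
  done | sort -u
}

# Example call:
#   find_deleted_holders 'logs_2.sqlite-wal'
```

Each PID it reports can then be checked with `ps -o pid,stat,etime,cmd -p PID` before deciding whether the session is stale enough to exit or terminate.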

Copy-ready Linux runbook

Use this when `du` and `df` disagree.

The runbook separates visible file size from deleted-open-inode allocation, then checkpoints the WAL after stale readers are gone.

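# Visible size under ~/.codex versus filesystem usage.
du -xsh "$HOME/.codex" 2>/dev/null
df -h "$HOME"
# Deleted-but-open files over 1 GB: size, PID, command, FD, path.
lsof -nP +L1 2>/dev/null | awk 'NR>1 && $7 ~ /^[0-9]+$/ && $7 > 1000000000 {print $7, $2, $1, $4, $10}' | sort -nr | head -40 | numfmt --field=1 --to=iec --suffix=B
# PIDs that still hold the live database open.
fuser "$HOME/.codex/logs_2.sqlite" 2>/dev/null
# Run only after the stale readers above have exited.
sqlite3 "$HOME/.codex/logs_2.sqlite" "PRAGMA wal_checkpoint(TRUNCATE);"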
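`PRAGMA wal_checkpoint(TRUNCATE)` returns one row with three integers: a busy flag, the number of frames in the WAL, and the number of frames checkpointed. A checkpoint that was blocked by a reader reports `1` in the first column, so it is worth reading the result rather than assuming success. The sketch below is a hedged illustration: `interpret_checkpoint` and `guarded_checkpoint` are hypothetical helpers, and they assume the `sqlite3` CLI prints in its default `|`-separated list mode.

```shell
# Hypothetical helper: classify a "busy|log|checkpointed" result row.
interpret_checkpoint() {
  busy=${1%%|*}          # first column: 1 if the checkpoint was blocked
  rest=${1#*|}
  log=${rest%%|*}        # second column: -1 if the database is not in WAL mode
  if [ "$log" = "-1" ]; then
    echo "not-wal"
  elif [ "$busy" = "1" ]; then
    echo "blocked"
  else
    echo "complete"
  fi
}

# Hypothetical wrapper: refuse to checkpoint while any process visible to
# fuser still has the database open, then report how the checkpoint ended.
guarded_checkpoint() {
  db=$1
  if fuser "$db" >/dev/null 2>&1; then
    echo "refusing: $db is still open in another process" >&2
    return 1
  fi
  result=$(sqlite3 "$db" "PRAGMA wal_checkpoint(TRUNCATE);") || return 1
  interpret_checkpoint "$result"
}

# Example call:
#   guarded_checkpoint "$HOME/.codex/logs_2.sqlite"
```

A `blocked` result means a reader is still attached and the WAL was not truncated; go back to the stale-process step rather than repeating the checkpoint.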

Do not delete first

Do not keep deleting visible WAL files if `lsof +L1` shows deleted inodes are still open.
Do not remove all of `~/.codex` before exporting or backing up session state you care about.
Do not run broad cache cleaners while the issue is actually a live file descriptor problem.
Do not kill every shell or tmux process; target the stale Codex PIDs that hold the deleted WAL.

Recurring team issue

Turn this recovery into a team-safe agent storage policy.

The $99 policy is for teams running Codex, agent CLIs, tmux sessions, or long-lived SQLite-backed tools on shared Linux workstations and build hosts. You get the deleted-inode recovery runbook, WAL checkpoint rules, stale-process guardrails, and monitoring thresholds for one representative environment.

No mail app or GitHub login? Send this directly from any inbox.

liuminsheng3@gmail.com - SafeDisk Codex WAL Recovery Payment Link