AI Agent Transcript JSONL Retention Policy
Long-running agent sessions, group chats, and topic-bound workers can append every message, tool call, and tool result into active transcript `.jsonl` files. When those files are uncapped, startup, maintenance, and gateway reads can burn CPU on an ever-growing tail.
Choose the raw transcript cap, compaction, and disk-budget behavior before the next outage.
Use this when `.jsonl` session transcripts, tool results, replay state, or topic-bound group sessions grow past the point where ordinary session pruning can protect the gateway.
In short: cap transcript bytes, truncate tool output, and test oversized replay.
Read-only evidence
Measure transcript size, line count, tool-result growth, and replay cost first.
These checks are safe to discuss in public: they do not require transcript contents, secrets, prompts, or user messages.
find sessions -name '*.jsonl' -size +10M -print
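The `find` command above only lists oversized files. A minimal sketch for the remaining measurements (bytes, line count, longest line) follows. It prints numbers and paths only, never transcript content. The function name `transcript_stats`, the `sessions` default directory, and the size-threshold argument are illustrative assumptions, not part of any particular tool.

```shell
# Hypothetical helper: report size metrics for oversized transcripts.
# Output is tab-separated: BYTES  LINES  LONGEST_LINE_CHARS  PATH
# Only sizes and paths are printed; transcript content never is.
transcript_stats() {
  dir=${1:-sessions}    # assumed transcript directory
  min_size=${2:-+10M}   # any find(1) -size expression
  find "$dir" -type f -name '*.jsonl' -size "$min_size" -print |
  while IFS= read -r f; do
    bytes=$(wc -c < "$f")
    lines=$(wc -l < "$f")
    # One very long line usually points at a single large tool result.
    longest=$(awk '{ n = length($0); if (n > m) m = n } END { print m + 0 }' "$f")
    printf '%d\t%d\t%d\t%s\n' "$((bytes))" "$((lines))" "$longest" "$f"
  done
}
```

Something like `transcript_stats sessions +10M | sort -rn | head` would then list the largest transcripts first.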
Runbook: Cap The Transcript, Not Just The Session Index
- Separate index retention from transcript retention. A `sessions.json` rotation or entry cap does not prove active `.jsonl` transcripts are bounded.
- Choose a product policy: raw transcript rotation, compaction-only with successor transcript, or both.
- Truncate or summarize large tool results before writing them to transcript files.
- Account for active, preserved sessions in the disk budget. Long-lived human topic sessions are often exactly the files that ordinary pruning skips, so they can grow without any limit applying to them.
- Add a migration path for existing oversized transcripts: archive, summarize, split, or mark as cold before gateway startup scans them repeatedly.
- Test with a deterministic oversized `.jsonl` fixture and assert maintenance/startup work is bounded.
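The last runbook step can be sketched in shell as follows. This is an illustration under stated assumptions, not any gateway's actual startup code: `make_fixture` and `bounded_tail` are hypothetical names, the record shape (`seq`, `type`, `content`) is invented for the fixture, and the byte budget is arbitrary.

```shell
# Hypothetical fixture builder: write N deterministic JSON lines to a file.
# Usage: make_fixture PATH LINE_COUNT
make_fixture() {
  awk -v n="$2" 'BEGIN {
    for (i = 1; i <= n; i++)
      printf "{\"seq\":%d,\"type\":\"tool_result\",\"content\":\"%s\"}\n", i, "xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx"
  }' > "$1"
}

# Hypothetical bounded reader: emit whole lines from the end of the file,
# reading at most MAX_BYTES no matter how large the file has grown.
# Usage: bounded_tail PATH MAX_BYTES
bounded_tail() {
  path=$1
  max_bytes=$2
  size=$(wc -c < "$path")
  if [ "$((size))" -le "$max_bytes" ]; then
    cat "$path"
  else
    # The byte cut usually lands mid-line, so drop the first line of the
    # window. If a single line exceeds MAX_BYTES, nothing is emitted.
    tail -c "$max_bytes" "$path" | tail -n +2
  fi
}
```

A regression test could build a fixture far larger than the budget and assert that the reader's output stays under the budget and still ends with the newest record.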
Copy-ready issue reply
Use this when agent transcripts grow without bound.
This keeps the maintainer conversation on product decisions and acceptance tests rather than local cleanup alone.
I would split the fix into a product decision and a bounded-maintenance regression test.
Acceptance checks I would want before closing this:
- Make the config surface explicit: sessions.json retention is separate from active transcript .jsonl retention.
- Pick one supported policy: raw transcript rotation, compaction-only successor transcripts, or both.
- Add an oversized .jsonl fixture and prove startup/maintenance does not repeatedly scan the full unbounded tail.
- Truncate or summarize large tool results before append, so one noisy tool call cannot create a 100 MB active transcript.
- Make the disk-budget path account for protected long-lived topic/group sessions, not only evictable inactive sessions.
- Document migration behavior for existing oversized transcripts: archive, summarize, split, or leave intentionally uncapped.
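The truncate-before-append check above could look roughly like this shell filter. It is a sketch only: `cap_output` is a hypothetical name, the marker text is invented, and JSON-encoding the capped text into a transcript record is left to whatever writes the `.jsonl` line.

```shell
# Hypothetical filter: cap tool output read from stdin at MAX_BYTES and
# add a marker recording how much was dropped.
# Usage: some_tool | cap_output MAX_BYTES
cap_output() {
  max_bytes=$1
  buf=$(mktemp)
  cat > "$buf"                      # spool stdin so the size is known
  size=$(($(wc -c < "$buf")))
  if [ "$size" -le "$max_bytes" ]; then
    cat "$buf"
  else
    # Note: a byte cut can split a multi-byte UTF-8 character.
    head -c "$max_bytes" "$buf"
    printf '\n[truncated: kept %d of %d bytes]\n' "$max_bytes" "$size"
  fi
  rm -f "$buf"
}
```

The marker adds a few dozen bytes on top of the cap, so the per-result limit should be set slightly below the real budget.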
Do Not Delete First
- The only transcript that reproduces the replay or startup CPU spike.
- Session index files before confirming whether the oversized file is active, preserved, or cold.
- Compaction successor files before checking whether they are the intended mitigation path.
- Tool-result evidence that explains which command class is causing transcript growth.