MySQL Binlog Disk Full Service Outage
When MySQL binary logs fill a small VPS, the symptom is often much larger than "delete a few files": MySQL waits on ENOSPC, web workers back up, CPU climbs, and the application returns 502s. Capture the binlog evidence first, then choose a retention, replication, or template fix that will not break point-in-time recovery.
Prove whether binary logs are the outage driver before deleting anything.
The first pass should answer four questions: how full the filesystem is, how much space binlogs consume, whether replication/PITR depends on them, and whether the template should disable or expire binlogs for this deployment.
Measure binlog growth without touching database state.
These checks are intentionally read-only. They show filesystem pressure, binlog count/size, current MySQL binary-log settings, and whether the service is configured as a single-node app database or a replication/PITR source.
df -h; du -sh /var/lib/mysql/binlog*; mysql -e "SHOW BINARY LOGS;"
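To turn the binlog listing into a single number, the per-file sizes can be summed. The following is a minimal sketch, assuming the mysql client is run in batch mode so that SHOW BINARY LOGS prints tab-separated rows with the file size in the second column; the function name sum_binlog_bytes is hypothetical.

```shell
# Hypothetical helper: reads tab-separated SHOW BINARY LOGS rows on stdin
# (log name, file size in bytes, ...) and prints the total size in bytes.
sum_binlog_bytes() {
    # %.0f avoids integer clamping in some awk implementations for large totals.
    awk -F '\t' '{ total += $2 } END { printf "%.0f\n", total }'
}
```

It could be fed like this, which stays read-only: mysql -N -B -e "SHOW BINARY LOGS" | sum_binlog_bytes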
Runbook: Pick The Safe Binlog Fix
- Confirm the exact failure: MySQL writing a binlog.* file with OS errno 28, not generic Docker overlay, app logs, or inode exhaustion.
- Measure binlog space separately from the database directory. A single-node app can often use a different policy than a replication or point-in-time-recovery source.
- Check whether binary logs are required. If there is no replica, no CDC consumer, and no PITR backup process, the template may not need binary logging at all.
- If binlogs are required, set an explicit retention budget: an expiry window in seconds or days, a ceiling on total binlog bytes enforced through monitoring, and an alert that fires before the disk reaches the write-failure cliff.
- If switching database engines or templates, treat it as migration work. Validate backups, import path, healthchecks, and resource settings before replacing the stateful service.
- Convert the incident into a reusable guard: no unbounded logs by default, a minimum free-space check, and a template review whenever upstream image defaults change.
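The minimum free-space check from the last step can be sketched as a small shell guard. This is an illustration only: the function names and the 80/90 thresholds are assumptions, not values from the runbook, and a real deployment would wire the result into its own alerting.

```shell
# Hypothetical guard functions; names and thresholds are illustrative.

# Print the used percentage (integer, no % sign) of the filesystem holding a path.
disk_used_pct() {
    # df -P forces POSIX single-line output; column 5 is the capacity, e.g. "83%".
    df -P "$1" | awk 'NR == 2 { sub(/%/, "", $5); print $5 }'
}

# Classify a used percentage against warn and critical thresholds.
classify_usage() {
    used=$1
    warn=$2
    crit=$3
    if [ "$used" -ge "$crit" ]; then
        echo critical
    elif [ "$used" -ge "$warn" ]; then
        echo warn
    else
        echo ok
    fi
}
```

For example, classify_usage "$(disk_used_pct /var/lib/mysql)" 80 90 prints ok, warn, or critical. Keeping the classification separate from the df call makes the thresholds easy to review and test without a full disk.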
Use this when MySQL binlogs filled a VPS.
This reply keeps the discussion focused on the policy decision: disable binlogs for single-node templates, or retain them with explicit limits when replication/PITR needs them.
I would split the fix into immediate recovery and template policy.
Read-only evidence before deleting binlogs:
df -h
df -i
du -sh /var/lib/mysql /var/lib/mysql/binlog* 2>/dev/null | sort -h
mysql -e "SHOW VARIABLES LIKE 'log_bin'; SHOW VARIABLES LIKE 'binlog_expire_logs_seconds'; SHOW VARIABLES LIKE 'expire_logs_days'; SHOW BINARY LOGS;"
mysql -e "SHOW REPLICA STATUS\G" 2>/dev/null || mysql -e "SHOW SLAVE STATUS\G" 2>/dev/null || true
For a single-node app template with no replica, CDC, or point-in-time recovery flow, I would avoid unbounded binary logging by default: either disable it explicitly or set a short expiry plus a disk alert. For deployments that do need binlogs, the template should make retention visible and bounded so a small VPS cannot silently turn binlogs into a full service outage.
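A short expiry can be expressed as follows. This sketch assumes MySQL 8.0 or later, where binlog_expire_logs_seconds can be changed at runtime and SET PERSIST is available; the function names and the 3-day window are hypothetical. The helper only prints the statement so it can be reviewed before anything is changed.

```shell
# Hypothetical helpers; the retention window is an example, not a recommendation.

# Convert a retention window in days to seconds.
retention_seconds() {
    echo $(( $1 * 86400 ))
}

# Print (do not execute) the statement that would set the expiry.
expiry_statement() {
    printf 'SET PERSIST binlog_expire_logs_seconds = %s;\n' "$(retention_seconds "$1")"
}
```

Running expiry_statement 3 prints the statement for a 3-day window; once reviewed, it could be passed to mysql -e. Disabling binary logging entirely is a startup option rather than a runtime setting, so that choice belongs in the template's server configuration instead.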
Turn one binlog outage into a safer template policy.
The $99 policy is for teams shipping one-click templates, self-hosted app stacks, or database-backed CI/dev services. You get the safe/review/do-not-touch cleanup boundary, retention settings, monitoring guard, and rollout checklist for one representative incident.
Do Not Delete First
- Current binlog files before checking replication, CDC, backup, or point-in-time-recovery requirements.
- Database data directories, Docker volumes, or compose state while MySQL may still be writing.
- The first MySQL log lines that prove the binlog filename and the exact ENOSPC sequence.
- Template defaults without pinning image version, database engine, retention policy, and migration path.
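One way to preserve the ENOSPC evidence before any cleanup is to filter the MySQL log for the relevant lines and store them on a different filesystem. The sketch below is an assumption about how that might look: the function name is hypothetical, and the pattern simply matches lines that mention errno 28 or "No space left on device".

```shell
# Hypothetical helper: reads log text on stdin and keeps only lines
# that mention errno 28 or the ENOSPC message text.
enospc_lines() {
    grep -iE 'errno:? ?28|No space left on device'
}
```

The log source depends on the deployment: a container's log stream or the MySQL error log file can be piped into enospc_lines, with the output redirected to a path that is not on the full disk.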