Automating Prod-to-Local DB Sync for 7 Postgres Databases
I run 7 production Postgres databases on a single VPS. For a while, debugging production issues meant either SSHing in and running queries live — which is risky — or manually dumping and restoring whichever database I needed at the moment, which is tedious.
I finally built a proper solution: a daily cron that syncs all 7 databases to local, automatically. Here’s how it works and what I learned building it.
The Goal
Every morning, I want a local copy of each production database, read-only locked, ready to query. If the sync fails, I want to know before I start working — not when I’m mid-debug session.
The databases: menthos, aethos-pilot, coda-ce, hub, forge, govisales, aethos-lab.
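In the script, that list is a plain bash array, with each database mapped to its own local Postgres port. A minimal sketch (the base port and the sequential mapping are my illustration, not the script's real assignments):

```shell
#!/usr/bin/env bash
# The seven databases; port assignments below are placeholders for illustration.
DBS=(menthos aethos-pilot coda-ce hub forge govisales aethos-lab)
BASE_PORT=5433
for i in "${!DBS[@]}"; do
  echo "${DBS[$i]} -> localhost:$((BASE_PORT + i))"
done
```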
Why Not pg_restore --clean?
My first instinct was to use pg_restore --clean to drop and recreate objects in place. This fails once foreign key constraints are involved.
Postgres errors when you drop a table that other tables still reference through foreign keys, and --clean drops objects individually into the existing schema rather than wiping it, so a restore can trip over those dependencies. The reliable approach is to drop and recreate the entire database before restoring:
psql -h localhost -p "$LOCAL_PORT" -U postgres -c "DROP DATABASE IF EXISTS $DB_NAME;"
psql -h localhost -p "$LOCAL_PORT" -U postgres -c "CREATE DATABASE $DB_NAME;"
pg_restore -h localhost -p "$LOCAL_PORT" -U postgres -d "$DB_NAME" "$DUMP_FILE"
No FK errors. Every time.
Read-Only Lock After Restore
After restoring, I lock the database to prevent accidental writes to what is essentially a prod mirror:
psql -h localhost -p "$LOCAL_PORT" -U postgres -d "$DB_NAME" \
  -c "ALTER DATABASE $DB_NAME SET default_transaction_read_only = on;"
This doesn’t prevent reads or schema inspection — it just means any INSERT, UPDATE, or DELETE will get a clean error. Useful guard when you’re debugging and muscle memory sends you down the wrong path.
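Undoing the lock is the same ALTER DATABASE with off. A small helper sketch for when a write to a mirror is genuinely intended (the function name is mine; connection flags as above, and only new sessions pick up the changed default):

```shell
# Deliberately unlock one local mirror for writes (sketch; run on purpose only).
# Sessions opened before this call keep the old read-only default.
unlock_db() {
  local db="$1" port="$2"
  psql -h localhost -p "$port" -U postgres \
    -c "ALTER DATABASE $db SET default_transaction_read_only = off;"
}
```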
The Bash Gotcha That Cost 2 Hours
I used set -euo pipefail for safety, and had a counter to track failed syncs:
# ❌ This breaks with set -e when FAILED_COUNT=0
(( FAILED_COUNT++ ))
# ✅ This works correctly
FAILED_COUNT=$((FAILED_COUNT + 1))
(( expr )) exits with status 1 when the expression evaluates to 0. FAILED_COUNT++ is a post-increment, so with the counter at 0 the expression's value is 0, the exit status is 1, and set -e terminates the script on the spot. I spent two hours confused because the script was silently stopping on its first iteration, before any failure had occurred.
Use $((var + 1)) for arithmetic in set -e scripts.
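A minimal reproduction of the gotcha, safe to run:

```shell
#!/usr/bin/env bash
set -euo pipefail
count=0
# (( count++ )) would evaluate to 0 here (post-increment), yielding exit
# status 1, and set -e would abort the script on that line.
count=$((count + 1))   # arithmetic expansion: exit status 0 regardless of value
echo "count=$count"
```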
Retry Logic
Network hiccups happen. Each sync gets 3 attempts before being marked as failed:
ATTEMPTS=0
SUCCESS=false
while [ "$ATTEMPTS" -lt 3 ] && [ "$SUCCESS" = "false" ]; do
  if sync_database "$DB_NAME" "$LOCAL_PORT"; then
    SUCCESS=true
  else
    ATTEMPTS=$((ATTEMPTS + 1))
    log "Attempt $ATTEMPTS failed for $DB_NAME"
    sleep 5
  fi
done

if [ "$SUCCESS" = "false" ]; then
  echo "$DB_NAME" >> "$FAILED_QUEUE"
fi
Failed databases are written to ~/.aria/sync-failed-queue.json for manual review.
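For reference, the sync_database the loop calls looks roughly like this. The prod-vps host alias and the /tmp dump path are placeholders; the real script has its own values:

```shell
# Sketch of sync_database: dump over SSH, then drop/recreate/restore locally.
# "prod-vps" and the /tmp dump path are assumptions, not the real config.
sync_database() {
  local db="$1" port="$2" dump="/tmp/${1}.dump"
  ssh prod-vps "pg_dump -Fc -U postgres $db" > "$dump" || return 1
  psql -h localhost -p "$port" -U postgres -c "DROP DATABASE IF EXISTS $db;" || return 1
  psql -h localhost -p "$port" -U postgres -c "CREATE DATABASE $db;" || return 1
  pg_restore -h localhost -p "$port" -U postgres -d "$db" "$dump"
}
```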
Sync Report for ARIA
After the script runs, it writes a sync-report.json:
{
  "timestamp": "2026-03-27T02:00:05Z",
  "ok": 7,
  "failed": 0,
  "skipped": 0,
  "results": {
    "menthos": "ok",
    "aethos-pilot": "ok",
    "hub": "ok"
  }
}
ARIA reads this file during the morning briefing. If failed > 0, it surfaces an alert before I see anything else. I’ve caught VPS connectivity issues this way before opening a single terminal.
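The briefing-side check is only a few lines. A self-contained sketch against a sample report (the real path and reader live in ARIA; sed stands in for a JSON parser here so the check works without jq):

```shell
#!/usr/bin/env bash
# Write a sample report (same shape as above) to a temp file for the demo.
REPORT=$(mktemp)
cat > "$REPORT" <<'EOF'
{ "timestamp": "2026-03-27T02:00:05Z", "ok": 6, "failed": 1, "skipped": 0 }
EOF
# Extract the "failed" count with sed, matching the known key.
FAILED=$(sed -n 's/.*"failed": *\([0-9][0-9]*\).*/\1/p' "$REPORT")
if [ "${FAILED:-0}" -gt 0 ]; then
  echo "ALERT: $FAILED database sync(s) failed last night"
fi
```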
Cron Setup
# Runs at 02:00 every day
0 2 * * * /home/user/scripts/sync-prod-dbs.sh >> /home/user/.aria/logs/db-sync.log 2>&1
02:00 is intentional — low traffic, no active sessions, and the sync is done well before the morning briefing at 07:00.
Auto-Create Missing Local Containers
If a database doesn’t have a local Postgres container yet, the script creates one:
# grep -qx matches the whole line, so "hub" can't accidentally match "hub-staging"
if ! docker ps --format '{{.Names}}' | grep -qx "$CONTAINER_NAME"; then
  docker run -d \
    --name "$CONTAINER_NAME" \
    -e POSTGRES_PASSWORD=localdev \
    -p "$LOCAL_PORT:5432" \
    postgres:16-alpine
  # Poll until Postgres actually accepts connections; a fixed sleep is flaky
  until docker exec "$CONTAINER_NAME" pg_isready -U postgres >/dev/null 2>&1; do
    sleep 1
  done
fi
This means the first sync for a new project just works, without manual setup.
What I’d Do Differently
The script grew organically and has some rough edges. If I rewrote it today:
- Use a config file instead of hardcoded arrays. Adding a new database currently means editing the script.
- SSH multiplexing for performance. Each pg_dump opens a new SSH tunnel; with ControlMaster, they'd reuse the same connection.
- Structured logs (JSON) instead of plain text. ARIA can parse JSON; plain text requires regex.
It’s not perfect, but it runs every night without intervention, and I’ve caught three production anomalies by querying local mirrors that I would have missed otherwise.