Backup, Restore & High Availability
Plan database backups, test restores, implement point-in-time recovery, and set up high availability.
🎯 What You'll Learn
- Logical backups with pg_dump and pg_restore
- Physical backups, WAL archiving, and point-in-time recovery
- The 3-2-1 backup strategy and automated scheduling
- Testing and verifying backups (RPO and RTO)
- High availability with streaming replication and failover
📦 Logical Backups
Logical backups export your database as SQL statements or a portable archive. Like photocopying every page of a book — slow for huge books, but you can read any page and restore individual chapters.
pg_dump & pg_restore
Export and import databases with logical backups
-- LOGICAL BACKUPS: SQL dump files (portable, human-readable)
-- pg_dump: backup a single database
-- pg_dump -h localhost -U postgres -d mydb -F custom -f backup.dump
-- -F custom = compressed binary format (recommended)
-- -F plain = SQL text (readable but larger)
-- -F directory = parallel dump to directory
-- Backup specific tables only:
-- pg_dump -t orders -t customers mydb > partial_backup.sql
-- Backup schema only (no data):
-- pg_dump --schema-only mydb > schema.sql
-- Backup data
...💾 Physical Backups & PITR
Physical backups copy the raw data files. Combined with WAL archiving, they enable point-in-time recovery — like rewinding a DVR to the exact second before something went wrong.
💡 Pro Tip — PITR Is Your Safety Net
Point-in-time recovery lets you restore to any second in the past. If someone runs DELETE FROM users at 2:35 PM, you can recover to 2:34:59 PM. Always enable WAL archiving in production.
Physical Backups & PITR
Base backups, WAL archiving, and time-travel recovery
-- PHYSICAL BACKUPS: copy data files directly (faster for large DBs)
-- PostgreSQL: pg_basebackup
-- pg_basebackup -h localhost -U replication_user -D /backup/base
-- --checkpoint=fast --wal-method=stream -P
-- This copies the entire data directory + WAL files
-- Can be used to create replicas or point-in-time recovery
-- WAL (Write-Ahead Log) archiving for continuous backup:
-- postgresql.conf:
-- wal_level = replica
-- archive_mode = on
-- archive_command = 'cp %p /wal_archive/%f'
-- Poi
...📋 Backup Strategy
Follow the 3-2-1 rule: 3 copies, 2 storage types, 1 offsite. Automate everything and verify regularly.
3-2-1 Backup Strategy
Schedules, automation, and verification scripts
-- BACKUP STRATEGY: the 3-2-1 rule
-- 3 copies of your data
-- 2 different storage types (local + cloud)
-- 1 offsite copy (different region/provider)
-- Example production schedule:
-- | Backup Type | Frequency | Retention |
-- |----------------|--------------|---------------|
-- | WAL archiving | Continuous | 7 days |
-- | Full pg_dump | Daily 2 AM | 30 days |
-- | Base backup | Weekly | 90 days |
-- | Monthly export | 1st of month | 1 year
...🧪 Testing Backups
An untested backup is not a backup — it's a hope. Regularly restore to a test database and verify row counts, integrity constraints, and recovery time.
⚠️ Common Mistake
Never testing restores until a real disaster strikes. Schedule monthly restore tests. The first time you discover your backups are corrupt should not be during an outage.
Testing & Verification
Verify backups, RPO, and RTO metrics
-- TESTING BACKUPS: untested backups are worthless!
-- Restore to a test database and verify:
-- createdb test_restore
-- pg_restore -d test_restore production_backup.dump
-- Verification queries:
SELECT 'Table counts' AS check_type;
SELECT schemaname, tablename,
(xpath('/row/count/text()',
query_to_xml(format('SELECT COUNT(*) FROM %I.%I', schemaname, tablename),
false, true, '')))[1]::text::bigint AS row_count
FROM pg_tables
WHERE schemaname = 'public'
ORDER BY tablenam
...🏗️ High Availability
For zero-downtime requirements, set up streaming replication with automatic failover. If the primary goes down, a replica takes over within seconds.
High Availability
Streaming replication and automatic failover
-- HIGH AVAILABILITY: minimize downtime
-- Streaming Replication (hot standby):
-- Primary → Replica (real-time WAL streaming)
-- Replica can serve read queries
-- If primary fails → promote replica
-- Setup on PRIMARY:
-- postgresql.conf:
-- wal_level = replica
-- max_wal_senders = 5
-- synchronous_commit = on -- for sync replication
-- pg_hba.conf:
-- host replication repl_user replica_ip/32 md5
-- Setup on REPLICA:
-- pg_basebackup -h primary_ip -U repl_user -D /data -P --slot=replica1
-
...📋 Quick Reference
| Method | RPO | RTO |
|---|---|---|
| Daily pg_dump | Up to 24h | Minutes–hours |
| WAL + PITR | Seconds–minutes | 15–60 min |
| Sync replication | Zero | Seconds |
| Cloud snapshots | Hours | Minutes |
🎉 Lesson Complete!
You now have a complete backup and recovery strategy. Next, dive deeper into database replication!
Sign up for free to track which lessons you've completed and get learning reminders.