Backup and Restore¶
This guide covers backup and restore procedures for the RotaCC system. It is intended for anyone responsible for keeping the system running -- whether that is a developer, a sysadmin, or an on-call responder.
Overview¶
The system provides two independent backup mechanisms:
| | JSON (dumpdata) | pg_dump |
|---|---|---|
| What it captures | Application data from specific Django models | Complete PostgreSQL database |
| Format | Gzip-compressed JSON (.json.gz) | Gzip-compressed SQL (.sql.gz) |
| Selective restore | Yes -- pick individual models | No -- restores the entire database |
| Requires PostgreSQL | No | Yes |
| Recommended for | Quick snapshots, migrating specific data | Full disaster recovery |
Use pg_dump for your primary backups. It is the safer, more complete option. JSON backups are useful when you need to move individual models between environments or inspect backup contents by hand.
Backups are stored under the backups/ directory at the project root. Every backup is tracked in the database via a BackupMetadata record (see Backup Metadata below).
Automatic Backups¶
Two Celery beat tasks run automatically:
| Task | Schedule | What it does |
|---|---|---|
| daily_pg_dump_backup | Every day at 03:00 UTC | Creates a pg_dump backup |
| cleanup_old_backups_task | Every day at 04:00 UTC | Deletes old backups using tiered retention |
These are configured in rota/settings/base.py under CELERY_BEAT_SCHEDULE.
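For orientation, the schedule entries might look roughly like the sketch below. This is illustrative only -- the entry keys and the cleanup task's module path are assumptions; only the daily_pg_dump_backup path is documented (it matches the task name referenced under Troubleshooting).

from celery.schedules import crontab

CELERY_BEAT_SCHEDULE = {
    "daily-pg-dump-backup": {
        "task": "tasks.backup_tasks.daily_pg_dump_backup",
        "schedule": crontab(hour=3, minute=0),  # 03:00 UTC
    },
    "cleanup-old-backups": {
        # Module path assumed; only the task name appears in this guide.
        "task": "tasks.backup_tasks.cleanup_old_backups_task",
        "schedule": crontab(hour=4, minute=0),  # 04:00 UTC
    },
}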
Backup failure alerting¶
If the daily pg_dump fails, the task:
- Retries up to 3 times with exponential backoff (~60 s, ~120 s, ~240 s).
- Sends an alert email to all admins who have opted in to system failure notifications.
- Logs the failure for investigation.
If you receive a backup failure email, check the Celery worker logs and the backups/ directory for disk space issues.
Automatic cleanup and retention policy¶
The cleanup task applies different rules depending on backup type:
pg_dump backups (tiered retention):
| Tier | Retention |
|---|---|
| Daily | Keep all backups within the last 30 days |
| Weekly | Keep one backup per ISO week for 12 weeks |
| Monthly | Keep one backup per month for 12 months |
| Older than 12 months | Delete |
JSON backups (simple retention):
- Keep everything within the last 6 months; delete the rest.
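In code terms, the tiered pg_dump decision works like the sketch below. This is illustrative only -- the function and set names are invented here, 12 months is approximated as 365 days, and the real implementation may differ.

from datetime import timedelta

def keep_pg_dump_backup(ts, now, kept_weeks, kept_months):
    """Illustrative tiered-retention check; iterate backups newest first."""
    age = now - ts
    if age <= timedelta(days=30):
        return True  # daily tier: keep everything from the last 30 days
    if age <= timedelta(weeks=12):
        week = ts.isocalendar()[:2]  # (ISO year, ISO week)
        if week in kept_weeks:
            return False
        kept_weeks.add(week)
        return True  # weekly tier: newest backup seen for this ISO week
    if age <= timedelta(days=365):  # ~12 months
        month = (ts.year, ts.month)
        if month in kept_months:
            return False
        kept_months.add(month)
        return True  # monthly tier: newest backup seen for this month
    return False  # older than 12 months: delete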
Manual Backups¶
pg_dump backup (recommended)¶
Run the management command from the project root:
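Assuming the management command is named pg_backup (the counterpart to the pg_restore_backup command used later in this guide -- confirm with manage.py help):

# Command name is an assumption -- verify with: uv run python manage.py help
uv run python manage.py pg_backup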
Expected output:
Creating pg_dump backup...
pg_dump backup created successfully
File: backup_2026-05-07_030000.sql.gz
Path: /path/to/project/backups/backup_2026-05-07_030000.sql.gz
Size: 1.24 MB
Add a description to help identify the backup later:
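# Command name assumed as above; --description mirrors backup_data's flag.
uv run python manage.py pg_backup --description "Pre-upgrade backup"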
Prerequisites:
- The database engine must be PostgreSQL.
- The pg_dump command must be available on the system (postgresql-client package).
JSON backup¶
For an application-level backup of Django model data:
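uv run python manage.py backup_data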
Expected output:
Creating backup...
Backup created successfully
File: backup_20260507_143000.json.gz
Path: /path/to/project/backups/backup_20260507_143000.json.gz
Size: 256.00 KB
Common options:
# Add a description
uv run python manage.py backup_data --description "Pre-migration backup"
# Back up only specific models
uv run python manage.py backup_data --models config.Shift,config.LeaveRequest
# Exclude audit logs to save space
uv run python manage.py backup_data --no-include-audit
# Show per-model record counts
uv run python manage.py backup_data --verbose
# Copy the backup to a custom location
uv run python manage.py backup_data --output /tmp/rota-backup.json.gz
# Attribute the backup to a user
uv run python manage.py backup_data --user admin
Manual cleanup¶
Preview what would be deleted:
# Simple retention (default 6 months)
uv run python manage.py cleanup_old_backups --dry-run
# Tiered retention for pg_dump backups
uv run python manage.py cleanup_old_backups --tiered --dry-run
Actually delete old backups:
# Delete JSON backups older than 6 months
uv run python manage.py cleanup_old_backups
# Delete using tiered retention for pg_dump, plus 6-month JSON cleanup
uv run python manage.py cleanup_old_backups --tiered
# Custom retention period for JSON backups
uv run python manage.py cleanup_old_backups --retention-months 12
Restoring¶
The restore procedure depends on the backup type.
Restoring from a pg_dump backup¶
This replaces the entire database. Use this for full disaster recovery.
Step 1: Identify the backup.
Find the backup ID from the admin interface (see Admin Interface) or query the database:
uv run python manage.py shell -c "
from backup_restore.models import BackupMetadata
for b in BackupMetadata.objects.filter(backup_type='pg_dump').order_by('-timestamp')[:5]:
print(f'{b.id} {b.timestamp} {b.filename} {b.get_file_size_display()}')
"
Step 2: Verify the backup (optional but recommended).
There is no dry-run mode for pg_dump restore. Instead, verify that the backup file exists on disk and that its metadata looks correct in the admin interface.
Step 3: Restore the backup.
You must provide the backup UUID and the actual database name as a safety check:
uv run python manage.py pg_restore_backup \
--backup-id <uuid> \
--database-name rota_db \
--confirm
Expected output:
Creating safety backup before restore...
Safety backup created: backup_2026-05-07_143500.sql.gz
Enabling maintenance mode...
Maintenance mode enabled. Non-staff users will see 503 page.
Restoring from backup: backup_2026-05-07_030000.sql.gz
Database restored successfully!
Maintenance mode disabled.
The restore process automatically:
- Creates a safety backup of the current database before overwriting anything.
- Enables maintenance mode (non-staff users see a 503 page).
- Restores the database from the backup file.
- Disables maintenance mode (even if the restore fails).
Restoring from a JSON backup¶
This replaces application data for the models included in the backup.
Step 1: Validate the backup without restoring.
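uv run python manage.py restore_data \
    --input backups/<filename>.json.gz \
    --dry-run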
Expected output:
Validating backup file...
Backup file is valid
Version: 1.1
Timestamp: 2026-05-07T14:30:00+00:00
File size: 262144 bytes
Backup contents:
- SystemConfiguration: 1 records
- User: 5 records
- Clinician: 12 records
...
[DRY RUN] No data was restored
To actually restore, run again with --confirm flag
Always run --dry-run first. It validates the file structure and shows you exactly what is in the backup.
Step 2: Perform the restore.
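uv run python manage.py restore_data \
    --input backups/<filename>.json.gz \
    --confirm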
You can add --verbose for an interactive confirmation prompt and per-model output.
Important warnings:
- The restore deletes all existing data for each model before inserting the backup data.
- The entire restore runs inside an atomic transaction. If anything fails, all changes are rolled back.
- Models are restored in dependency order to satisfy foreign key constraints.
Backup Metadata¶
Every backup is tracked in the BackupMetadata model (backup_restore.BackupMetadata). This is what each field records:
| Field | Purpose |
|---|---|
| id | UUID primary key -- use this to reference a specific backup |
| filename | Name of the backup file on disk |
| file_path | Full filesystem path to the backup file |
| timestamp | When the backup was created |
| description | Optional human-readable description |
| backup_type | json or pg_dump |
| models_included | List of model names (JSON backups only; empty for pg_dump) |
| record_counts | Per-model record counts (JSON backups only; empty for pg_dump) |
| file_size_bytes | Size of the backup file on disk |
| created_by | The user who created the backup (null for automated backups) |
| is_valid | Whether the backup file passed validation |
| created_at | When the metadata record was created |
Querying backup history¶
From a Django shell:
from backup_restore.models import BackupMetadata
# List the 10 most recent backups
for b in BackupMetadata.objects.order_by('-timestamp')[:10]:
print(f"{b.timestamp} {b.backup_type:7} {b.get_file_size_display():>10} {b.filename}")
# Count backups by type
from django.db.models import Count
BackupMetadata.objects.values('backup_type').annotate(count=Count('id'))
# Find backups created by a specific user
BackupMetadata.objects.filter(created_by__username='admin')
# Find backups older than 90 days
from django.utils import timezone
from datetime import timedelta
cutoff = timezone.now() - timedelta(days=90)
BackupMetadata.objects.filter(timestamp__lt=cutoff).count()
Admin Interface¶
The backup system provides a web interface in the Django admin for staff users.
Viewing backups¶
Navigate to Django Admin and look under Backup Restore for Backup Metadata. The list view shows:
- Filename and timestamp
- Models included (first 3 shown, with a count of remaining)
- File size (human-readable)
- Validity status
- Who created the backup
You can filter by validity and date, and search by filename or description.
Downloading backups¶
From the backup list, each backup has a download action that serves the backup file directly from disk.
Creating a JSON backup¶
From the backup list page, the "Create Backup" button opens a form where you can:
- Add a description
- Select specific models to include (or leave empty for all models)
Uploading a backup¶
The "Upload Backup" button allows you to upload a backup file from another system. Accepted file types:
- .json.gz -- JSON backups
- .sql.gz -- pg_dump backups
- .sql -- uncompressed SQL dumps
Maximum upload size is 100 MB (configurable via MAX_BACKUP_UPLOAD_SIZE in settings).
The system validates the file on upload and rejects anything that does not look like a valid backup.
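For example, raising the limit in settings might look like this (a sketch -- it assumes the setting holds a byte count):

MAX_BACKUP_UPLOAD_SIZE = 200 * 1024 * 1024  # 200 MB; byte-count assumption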
Restoring a backup¶
Each backup in the list has restore actions:
JSON restore: Click "Restore", review the backup contents and validation details, type CONFIRM in the text field, and submit.
pg_dump restore: Click "PostgreSQL Restore" (only shown for pg_dump backups). You must type the actual database name and check a confirmation checkbox. The system creates a safety backup before proceeding.
Disaster Recovery¶
If something goes wrong with the database -- a failed migration, accidental data deletion, or corruption -- follow this procedure.
Step 1: Assess the situation¶
Determine the scope of the problem:
- Is the application still running? Can users log in?
- Is the database responding?
- What changed recently? (Check the audit log if accessible.)
Step 2: Create a safety backup¶
Before doing anything else, take a backup of the current state even if it is damaged:
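# Command name assumed, as in the Manual Backups section above.
uv run python manage.py pg_backup --description "Safety backup during incident"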
If pg_dump fails because the database is in a bad state, try a JSON backup of whatever models are accessible:
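uv run python manage.py backup_data --description "Incident safety backup (JSON)"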
Step 3: Choose a backup to restore¶
List recent backups:
uv run python manage.py shell -c "
from backup_restore.models import BackupMetadata
for b in BackupMetadata.objects.filter(backup_type='pg_dump', is_valid=True).order_by('-timestamp')[:10]:
print(f'{b.id} {b.timestamp} {b.filename} {b.get_file_size_display()}')
"
Pick the most recent valid backup from before the problem occurred.
Step 4: Restore¶
For a full restore, use the pg_dump procedure:
uv run python manage.py pg_restore_backup \
--backup-id <uuid> \
--database-name rota_db \
--confirm
For a partial restore (specific models only), use the JSON procedure with --dry-run first:
uv run python manage.py restore_data \
--input backups/<filename>.json.gz \
--dry-run
uv run python manage.py restore_data \
--input backups/<filename>.json.gz \
--confirm --verbose
Step 5: Verify¶
After restoring, check the following:
- Can you log in? Open the site and log in with an admin account.
- Are clinicians present? Check the clinician list in the admin.
- Are recent shifts visible? Look at the rota for the current period.
- Is Celery running? Check that the Celery worker and beat processes are healthy.
- Run a validation query:
uv run python manage.py shell -c "
from config.models import Clinician, Shift
from django.contrib.auth import get_user_model
User = get_user_model()
print(f'Users: {User.objects.count()}')
print(f'Clinicians: {Clinician.objects.count()}')
print(f'Shifts: {Shift.objects.count()}')
"
Compare these counts against the record_counts stored in the backup metadata.
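For JSON backups, the stored counts can be pulled straight from the metadata record:

uv run python manage.py shell -c "
from backup_restore.models import BackupMetadata
b = BackupMetadata.objects.get(id='<uuid>')
print(b.record_counts)
"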
Step 6: Document¶
Record what happened, which backup was restored, and the timestamp of both the incident and the recovery. The restore process creates an audit log entry automatically, but a human note in your incident log is also valuable.
Troubleshooting¶
Backup creation fails with "pg_dump command not found"¶
Install the PostgreSQL client tools:
# Ubuntu / Debian
sudo apt-get install postgresql-client
# CentOS / RHEL
sudo yum install postgresql
# Docker -- add to your Dockerfile
RUN apt-get update && apt-get install -y postgresql-client
Backup creation fails silently (Celery task)¶
Check:
- Celery worker logs: look for the task name tasks.backup_tasks.daily_pg_dump_backup.
- Disk space on the backups/ directory: df -h /path/to/project/backups/.
- File permissions: the Celery worker process needs write access to backups/.
- Database connectivity: can the worker reach PostgreSQL?
Restore fails with a validation error¶
The backup file may be corrupted, or the data model may have changed since the backup was created. Try:
- Use --dry-run to see if the file parses at all.
- Check whether any new migrations have been applied since the backup was taken. You may need to roll back migrations to match the backup's schema.
- Use a more recent backup that matches the current schema.
"Database name mismatch" during pg_restore¶
The --database-name argument must exactly match the NAME value in DATABASES['default'] in your settings. Check your current database name:
uv run python manage.py shell -c "
from django.conf import settings
print(settings.DATABASES['default']['NAME'])
"
Backup file is missing from disk¶
The BackupMetadata record exists but the file has been deleted or moved. You can clean up orphaned metadata by deleting the record from the admin interface.
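A quick way to list orphaned records (a sketch using the documented file_path field):

uv run python manage.py shell -c "
import os
from backup_restore.models import BackupMetadata
for b in BackupMetadata.objects.all():
    if not os.path.exists(b.file_path):
        print(f'Orphaned: {b.id} {b.filename}')
"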
Passwords after restore¶
User passwords are stored in backups as Django password hashes (e.g., pbkdf2_sha256$...). After restoring, users can log in with their original passwords. Passwords are never stored in plaintext.