
Backup and Restore

This guide covers backup and restore procedures for the RotaCC system. It is intended for anyone responsible for keeping the system running -- whether that is a developer, a sysadmin, or an on-call responder.

Overview

The system provides two independent backup mechanisms:

|                      | JSON (dumpdata)                              | pg_dump                               |
|----------------------|----------------------------------------------|---------------------------------------|
| What it captures     | Application data from specific Django models | Complete PostgreSQL database          |
| Format               | Gzip-compressed JSON (.json.gz)              | Gzip-compressed SQL (.sql.gz)         |
| Selective restore    | Yes -- pick individual models                | No -- restores the entire database    |
| Requires PostgreSQL  | No                                           | Yes                                   |
| Recommended for      | Quick snapshots, migrating specific data     | Full disaster recovery                |

Use pg_dump for your primary backups. It is the safer, more complete option. JSON backups are useful when you need to move individual models between environments or inspect backup contents by hand.

Backups are stored under the backups/ directory at the project root. Every backup is tracked in the database via a BackupMetadata record (see Backup Metadata below).

Automatic Backups

Two Celery beat tasks run automatically:

| Task                     | Schedule               | What it does                               |
|--------------------------|------------------------|--------------------------------------------|
| daily_pg_dump_backup     | Every day at 03:00 UTC | Creates a pg_dump backup                   |
| cleanup_old_backups_task | Every day at 04:00 UTC | Deletes old backups using tiered retention |

These are configured in rota/settings/base.py under CELERY_BEAT_SCHEDULE.
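A beat schedule entry for the daily backup task looks roughly like the sketch below. This is illustrative only -- the actual entries live in rota/settings/base.py and may name tasks or times differently.

```python
# Hypothetical CELERY_BEAT_SCHEDULE fragment for the daily backup task.
# The task path matches the one referenced in Troubleshooting below;
# everything else is an assumption about the real settings file.
from celery.schedules import crontab

CELERY_BEAT_SCHEDULE = {
    "daily-pg-dump-backup": {
        "task": "tasks.backup_tasks.daily_pg_dump_backup",
        "schedule": crontab(hour=3, minute=0),  # 03:00 UTC daily
    },
}
```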

Backup failure alerting

If the daily pg_dump fails, the task:

  1. Retries up to 3 times with exponential backoff (~60 s, ~120 s, ~240 s).
  2. Sends an alert email to all admins who have opted in to system failure notifications.
  3. Logs the failure for investigation.

If you receive a backup failure email, check the Celery worker logs and the backups/ directory for disk space issues.
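The retry schedule above is a standard doubling backoff. A minimal sketch of the delay calculation, assuming a 60-second base (the real task's backoff parameters may differ):

```python
# Illustrative backoff schedule: ~60 s before the first retry,
# doubling on each of the 3 retries described above.
BASE_DELAY_S = 60
MAX_RETRIES = 3

def retry_delays(base=BASE_DELAY_S, retries=MAX_RETRIES):
    """Return the delay (in seconds) before each retry attempt."""
    return [base * 2 ** attempt for attempt in range(retries)]

print(retry_delays())  # [60, 120, 240]
```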

Automatic cleanup and retention policy

The cleanup task applies different rules depending on backup type:

pg_dump backups (tiered retention):

| Tier                 | Retention                                  |
|----------------------|--------------------------------------------|
| Daily                | Keep all backups within the last 30 days   |
| Weekly               | Keep one backup per ISO week for 12 weeks  |
| Monthly              | Keep one backup per month for 12 months    |
| Older than 12 months | Delete                                     |

JSON backups (simple retention):

  • Keep everything within the last 6 months; delete the rest.
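The tier boundaries can be sketched as a simple age classification. This is an illustration of the policy described above, not the cleanup task's actual code (which keys weekly/monthly tiers to ISO weeks and calendar months rather than plain day counts):

```python
from datetime import datetime, timedelta

def tier_for(backup_time: datetime, now: datetime) -> str:
    """Classify a pg_dump backup by age into the retention tiers above.
    Illustrative approximation: 12 months is treated as 365 days."""
    age = now - backup_time
    if age <= timedelta(days=30):
        return "daily"      # keep everything in this window
    if age <= timedelta(weeks=12):
        return "weekly"     # keep one per ISO week
    if age <= timedelta(days=365):
        return "monthly"    # keep one per month
    return "expired"        # eligible for deletion

now = datetime(2026, 5, 7)
print(tier_for(datetime(2026, 5, 1), now))   # daily
print(tier_for(datetime(2026, 3, 1), now))   # weekly
print(tier_for(datetime(2025, 9, 1), now))   # monthly
print(tier_for(datetime(2024, 1, 1), now))   # expired
```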

Manual Backups

Run the management command from the project root:

uv run python manage.py pg_dump_backup

Expected output:

Creating pg_dump backup...
pg_dump backup created successfully
  File: backup_2026-05-07_030000.sql.gz
  Path: /path/to/project/backups/backup_2026-05-07_030000.sql.gz
  Size: 1.24 MB

Add a description to help identify the backup later:

uv run python manage.py pg_dump_backup --description "Before clinician import"

Prerequisites:

  • The database engine must be PostgreSQL.
  • The pg_dump command must be available on the system (postgresql-client package).
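The backup filename embeds the creation timestamp. A small sketch of generating and parsing that pattern, assuming the format shown in the example output above:

```python
from datetime import datetime

# Assumed from the example output: backup_2026-05-07_030000.sql.gz
FILENAME_FORMAT = "backup_%Y-%m-%d_%H%M%S.sql.gz"

def backup_filename(ts: datetime) -> str:
    """Build a pg_dump backup filename for a given timestamp."""
    return ts.strftime(FILENAME_FORMAT)

def backup_timestamp(name: str) -> datetime:
    """Recover the creation timestamp from a backup filename."""
    return datetime.strptime(name, FILENAME_FORMAT)

print(backup_filename(datetime(2026, 5, 7, 3, 0, 0)))
# backup_2026-05-07_030000.sql.gz
```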

JSON backup

For an application-level backup of Django model data:

uv run python manage.py backup_data

Expected output:

Creating backup...
Backup created successfully
  File: backup_20260507_143000.json.gz
  Path: /path/to/project/backups/backup_20260507_143000.json.gz
  Size: 256.00 KB

Common options:

# Add a description
uv run python manage.py backup_data --description "Pre-migration backup"

# Back up only specific models
uv run python manage.py backup_data --models config.Shift,config.LeaveRequest

# Exclude audit logs to save space
uv run python manage.py backup_data --no-include-audit

# Show per-model record counts
uv run python manage.py backup_data --verbose

# Copy the backup to a custom location
uv run python manage.py backup_data --output /tmp/rota-backup.json.gz

# Attribute the backup to a user
uv run python manage.py backup_data --user admin

Manual cleanup

Preview what would be deleted:

# Simple retention (default 6 months)
uv run python manage.py cleanup_old_backups --dry-run

# Tiered retention for pg_dump backups
uv run python manage.py cleanup_old_backups --tiered --dry-run

Actually delete old backups:

# Delete JSON backups older than 6 months
uv run python manage.py cleanup_old_backups

# Delete using tiered retention for pg_dump, plus 6-month JSON cleanup
uv run python manage.py cleanup_old_backups --tiered

# Custom retention period for JSON backups
uv run python manage.py cleanup_old_backups --retention-months 12

Restoring

The restore procedure depends on the backup type.

Restoring from a pg_dump backup

This replaces the entire database. Use this for full disaster recovery.

Step 1: Identify the backup.

Find the backup ID from the admin interface (see Admin Interface) or query the database:

uv run python manage.py shell -c "
from backup_restore.models import BackupMetadata
for b in BackupMetadata.objects.filter(backup_type='pg_dump').order_by('-timestamp')[:5]:
    print(f'{b.id}  {b.timestamp}  {b.filename}  {b.get_file_size_display()}')
"

Step 2: Run a dry run (optional but recommended).

There is no dry-run mode for pg_dump restore. Instead, verify the backup file exists and the metadata looks correct in the admin interface.

Step 3: Restore the backup.

You must provide the backup UUID and the actual database name as a safety check:

uv run python manage.py pg_restore_backup \
  --backup-id <uuid> \
  --database-name rota_db \
  --confirm

Expected output:

Creating safety backup before restore...
Safety backup created: backup_2026-05-07_143500.sql.gz
Enabling maintenance mode...
Maintenance mode enabled. Non-staff users will see 503 page.
Restoring from backup: backup_2026-05-07_030000.sql.gz
Database restored successfully!
Maintenance mode disabled.

The restore process automatically:

  1. Creates a safety backup of the current database before overwriting anything.
  2. Enables maintenance mode (non-staff users see a 503 page).
  3. Restores the database from the backup file.
  4. Disables maintenance mode (even if the restore fails).
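Step 4 above is the important safety property: maintenance mode comes back off even when the restore raises. The pattern is a try/finally guard, sketched here with stand-in functions (the real command's maintenance-mode hooks are internal and may look different):

```python
from contextlib import contextmanager

events = []  # records the order of operations, for illustration

def enable_maintenance_mode():
    events.append("enabled")

def disable_maintenance_mode():
    events.append("disabled")

@contextmanager
def maintenance_window():
    """Ensure maintenance mode is lifted even if the restore fails."""
    enable_maintenance_mode()
    try:
        yield
    finally:
        disable_maintenance_mode()  # runs on success and on failure

# Simulate a restore that fails partway through:
try:
    with maintenance_window():
        raise RuntimeError("restore failed")
except RuntimeError:
    pass

print(events)  # ['enabled', 'disabled']
```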

Restoring from a JSON backup

This replaces application data for the models included in the backup.

Step 1: Validate the backup without restoring.

uv run python manage.py restore_data --input backups/backup_20260507_143000.json.gz --dry-run

Expected output:

Validating backup file...
Backup file is valid
  Version: 1.1
  Timestamp: 2026-05-07T14:30:00+00:00
  File size: 262144 bytes

Backup contents:
  - SystemConfiguration: 1 records
  - User: 5 records
  - Clinician: 12 records
  ...

[DRY RUN] No data was restored
To actually restore, run again with --confirm flag

Always run --dry-run first. It validates the file structure and shows you exactly what is in the backup.

Step 2: Perform the restore.

uv run python manage.py restore_data \
  --input backups/backup_20260507_143000.json.gz \
  --confirm

You can add --verbose for an interactive confirmation prompt and per-model output.

Important warnings:

  • The restore deletes all existing data for each model before inserting the backup data.
  • The entire restore runs inside an atomic transaction. If anything fails, all changes are rolled back.
  • Models are restored in dependency order to satisfy foreign key constraints.
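Dependency ordering is a topological sort over foreign-key relationships. A sketch using the standard library's graphlib -- the model names and dependency map below are illustrative, not the restore command's actual metadata:

```python
from graphlib import TopologicalSorter

# Hypothetical map: each model lists the models its foreign keys
# point at. The real restore derives this from Django model metadata.
DEPENDENCIES = {
    "User": [],
    "Clinician": ["User"],
    "Shift": ["Clinician"],
    "LeaveRequest": ["Clinician"],
}

order = list(TopologicalSorter(DEPENDENCIES).static_order())
print(order)  # User before Clinician, Clinician before Shift/LeaveRequest
```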

Backup Metadata

Every backup is tracked in the BackupMetadata model (backup_restore.BackupMetadata). This is what each field records:

| Field           | Purpose                                                        |
|-----------------|----------------------------------------------------------------|
| id              | UUID primary key -- use this to reference a specific backup    |
| filename        | Name of the backup file on disk                                |
| file_path       | Full filesystem path to the backup file                        |
| timestamp       | When the backup was created                                    |
| description     | Optional human-readable description                            |
| backup_type     | json or pg_dump                                                |
| models_included | List of model names (JSON backups only; empty for pg_dump)     |
| record_counts   | Per-model record counts (JSON backups only; empty for pg_dump) |
| file_size_bytes | Size of the backup file on disk                                |
| created_by      | The user who created the backup (null for automated backups)   |
| is_valid        | Whether the backup file passed validation                      |
| created_at      | When the metadata record was created                           |

Querying backup history

From a Django shell:

from backup_restore.models import BackupMetadata

# List the 10 most recent backups
for b in BackupMetadata.objects.order_by('-timestamp')[:10]:
    print(f"{b.timestamp}  {b.backup_type:7}  {b.get_file_size_display():>10}  {b.filename}")

# Count backups by type
from django.db.models import Count
BackupMetadata.objects.values('backup_type').annotate(count=Count('id'))

# Find backups created by a specific user
BackupMetadata.objects.filter(created_by__username='admin')

# Find backups older than 90 days
from django.utils import timezone
from datetime import timedelta
cutoff = timezone.now() - timedelta(days=90)
BackupMetadata.objects.filter(timestamp__lt=cutoff).count()

Admin Interface

The backup system provides a web interface in the Django admin for staff users.

Viewing backups

Navigate to Django Admin and look under Backup Restore for Backup Metadata. The list view shows:

  • Filename and timestamp
  • Models included (first 3 shown, with a count of remaining)
  • File size (human-readable)
  • Validity status
  • Who created the backup

You can filter by validity and date, and search by filename or description.

Downloading backups

From the backup list, each backup has a download action that serves the backup file directly from disk.

Creating a JSON backup

From the backup list page, the "Create Backup" button opens a form where you can:

  • Add a description
  • Select specific models to include (or leave empty for all models)

Uploading a backup

The "Upload Backup" button allows you to upload a backup file from another system. Accepted file types:

  • .json.gz -- JSON backups
  • .sql.gz -- pg_dump backups
  • .sql -- uncompressed SQL dumps

Maximum upload size is 100 MB (configurable via MAX_BACKUP_UPLOAD_SIZE in settings).

The system validates the file on upload and rejects anything that does not look like a valid backup.

Restoring a backup

Each backup in the list has restore actions:

JSON restore: Click "Restore", review the backup contents and validation details, type CONFIRM in the text field, and submit.

pg_dump restore: Click "PostgreSQL Restore" (only shown for pg_dump backups). You must type the actual database name and check a confirmation checkbox. The system creates a safety backup before proceeding.

Disaster Recovery

If something goes wrong with the database -- a failed migration, accidental data deletion, or corruption -- follow this procedure.

Step 1: Assess the situation

Determine the scope of the problem:

  • Is the application still running? Can users log in?
  • Is the database responding?
  • What changed recently? (Check the audit log if accessible.)

Step 2: Create a safety backup

Before doing anything else, take a backup of the current state even if it is damaged:

uv run python manage.py pg_dump_backup --description "Pre-recovery snapshot of current state"

If pg_dump fails because the database is in a bad state, try a JSON backup of whatever models are accessible:

uv run python manage.py backup_data --description "Emergency partial backup"

Step 3: Choose a backup to restore

List recent backups:

uv run python manage.py shell -c "
from backup_restore.models import BackupMetadata
for b in BackupMetadata.objects.filter(backup_type='pg_dump', is_valid=True).order_by('-timestamp')[:10]:
    print(f'{b.id}  {b.timestamp}  {b.filename}  {b.get_file_size_display()}')
"

Pick the most recent valid backup from before the problem occurred.

Step 4: Restore

For a full restore, use the pg_dump procedure:

uv run python manage.py pg_restore_backup \
  --backup-id <uuid> \
  --database-name rota_db \
  --confirm

For a partial restore (specific models only), use the JSON procedure with --dry-run first:

uv run python manage.py restore_data \
  --input backups/<filename>.json.gz \
  --dry-run

uv run python manage.py restore_data \
  --input backups/<filename>.json.gz \
  --confirm --verbose

Step 5: Verify

After restoring, check the following:

  1. Can you log in? Open the site and log in with an admin account.
  2. Are clinicians present? Check the clinician list in the admin.
  3. Are recent shifts visible? Look at the rota for the current period.
  4. Is Celery running? Check that the Celery worker and beat processes are healthy.
  5. Run a validation query:
uv run python manage.py shell -c "
from config.models import Clinician, Shift
from django.contrib.auth import get_user_model
User = get_user_model()
print(f'Users: {User.objects.count()}')
print(f'Clinicians: {Clinician.objects.count()}')
print(f'Shifts: {Shift.objects.count()}')
"

Compare these counts against the record_counts stored in the backup metadata.
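The comparison is a dict diff between the stored record_counts and the counts you just queried. A minimal sketch with hypothetical numbers:

```python
# Illustrative check: expected counts come from the backup's
# BackupMetadata.record_counts; actual counts from the live database.
expected = {"User": 5, "Clinician": 12, "Shift": 340}
actual = {"User": 5, "Clinician": 12, "Shift": 338}

mismatches = {
    model: (expected[model], actual.get(model))
    for model in expected
    if actual.get(model) != expected[model]
}
print(mismatches)  # {'Shift': (340, 338)}
```

An empty result means every model restored the expected number of records; anything else is worth investigating before declaring recovery complete.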

Step 6: Document

Record what happened, which backup was restored, and the timestamp of both the incident and the recovery. The restore process creates an audit log entry automatically, but a human note in your incident log is also valuable.

Troubleshooting

Backup creation fails with "pg_dump command not found"

Install the PostgreSQL client tools:

# Ubuntu / Debian
sudo apt-get install postgresql-client

# CentOS / RHEL
sudo yum install postgresql

# Docker -- add to your Dockerfile
RUN apt-get update && apt-get install -y postgresql-client

Backup creation fails silently (Celery task)

Check:

  1. Celery worker logs: look for the task name tasks.backup_tasks.daily_pg_dump_backup.
  2. Disk space on the backups/ directory: df -h /path/to/project/backups/.
  3. File permissions: the Celery worker process needs write access to backups/.
  4. Database connectivity: can the worker reach PostgreSQL?

Restore fails with a validation error

The backup file may be corrupted, or the data model may have changed since the backup was created. Try:

  1. Use --dry-run to see if the file parses at all.
  2. Check whether any new migrations have been applied since the backup was taken. You may need to roll back migrations to match the backup's schema.
  3. Use a more recent backup that matches the current schema.

"Database name mismatch" during pg_restore

The --database-name argument must exactly match the NAME value in DATABASES['default'] in your settings. Check your current database name:

uv run python manage.py shell -c "
from django.conf import settings
print(settings.DATABASES['default']['NAME'])
"

Backup file is missing from disk

The BackupMetadata record exists but the file has been deleted or moved. You can clean up orphaned metadata by deleting the record from the admin interface.

Passwords after restore

User passwords are stored in backups as Django password hashes (e.g., pbkdf2_sha256$...). After restoring, users can log in with their original passwords. Passwords are never stored in plaintext.