
Runbook: Data Inconsistency

Last Updated: 2026-02-21
Severity: High
Estimated TTR: 2 hours
Owner: Development Team

Symptoms

  • Calculated values don't match expected results
  • Shift counts don't add up correctly
  • Deficiency values are incorrect
  • Leave balances are wrong

Detection

  • Alert: Manual report or user complaint
  • Dashboard: Data Quality Dashboard (if configured)
  • Query: Run data validation scripts
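
The "validation scripts" referenced above are project-specific. As a minimal sketch of what such a check can do, the snippet below recomputes a shift count from raw records and compares it to a stored aggregate; the record shape and status names are illustrative, not the project's actual schema:

```python
from datetime import date

def validate_shift_count(shifts, stored_count, start, end):
    """Recompute the worked-shift count for a date window and compare
    it to the stored aggregate; returns (consistent, recomputed)."""
    recomputed = sum(
        1 for s in shifts
        if start <= s["date"] <= end and s["status"] == "WORKED"
    )
    return recomputed == stored_count, recomputed

# Illustrative records; a real script would pull these from the Shift table.
shifts = [
    {"date": date(2024, 1, 5), "status": "WORKED"},
    {"date": date(2024, 1, 6), "status": "CANCELLED"},
    {"date": date(2024, 2, 1), "status": "WORKED"},  # outside the window
]
consistent, recomputed = validate_shift_count(
    shifts, stored_count=1, start=date(2024, 1, 1), end=date(2024, 1, 31)
)
```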

Diagnosis Steps

  1. Identify specific inconsistent data:

    # Check shift counts for specific clinician
    docker compose exec web python manage.py shell
    >>> from datetime import date
    >>> from config.models import Clinician, Shift
    >>> from calculations.shift_counter import ShiftCounter
    >>> clinician = Clinician.objects.get(id=<id>)
    >>> counter = ShiftCounter()
    >>> counter.count_worked_shifts(clinician, date(2024, 1, 1), date(2024, 1, 31))
    

  2. Verify source data:

    # Check shift records in database
    docker compose exec web python manage.py shell
    >>> from config.models import Shift
    >>> shifts = Shift.objects.filter(clinician_id=<id>, date__gte='2024-01-01', date__lte='2024-01-31')
    >>> for s in shifts:
    ...     print(s.date, s.type, s.duration, s.status)
    
    Expected: Shifts match the expected schedule
    If different: Source data may be incorrect

  3. Check for race conditions:

    # Check audit logs for concurrent modifications
    docker compose exec web python manage.py shell
    >>> from config.models import AuditLog
    >>> logs = AuditLog.objects.filter(
    ...     entity_type='Shift',
    ...     action__in=['CREATE', 'UPDATE']
    ... ).order_by('-timestamp')[:50]
    
    Expected: Sequential modifications
    If different: May indicate race conditions

  4. Verify calculation logic:

    # Run calculation with debug logging
    docker compose exec web python manage.py shell
    >>> import logging
    >>> logging.basicConfig(level=logging.DEBUG)
    >>> # Re-run calculation
    
    Expected: Calculation follows the expected logic
    If different: A logic bug may exist

Root Causes

Cause                                  Likelihood  How to Confirm
Incomplete migration                   High        Check that all migrations are applied
Race condition in concurrent updates   Medium      Check audit logs for timing
Cache invalidation issue               Low         Clear the cache and recheck
Business rule misunderstanding         High        Compare expected vs. actual calculation logic
Database constraint bypass             Low         Check data integrity constraints

Resolution Steps

For Incomplete Migration

  1. Check migration status:

    docker compose exec web python manage.py showmigrations
    

  2. Apply any missing migrations:

    # Create backup first
    docker compose exec web python manage.py dumpdata > backup_before_migration.json
    
    # Apply migrations
    docker compose exec web python manage.py migrate
    
    # Verify data integrity
    docker compose exec web python manage.py check
    

  3. Verify: Data consistency restored

For Race Conditions

  1. Identify affected records:

    # Find audit entries that share an exact timestamp (possible concurrent writes)
    docker compose exec web python manage.py shell
    >>> from config.models import AuditLog
    >>> from django.db.models import Count
    >>> duplicates = AuditLog.objects.values('entity_id', 'timestamp')\
    ...     .annotate(count=Count('id'))\
    ...     .filter(count__gt=1)
    

  2. Implement locking for critical operations:

    # Use select_for_update() to prevent race conditions
    from django.db import transaction
    
    with transaction.atomic():
        clinician = Clinician.objects.select_for_update().get(id=<id>)
        # Perform calculation and update
        clinician.save()
    

  3. Verify: No new race conditions detected
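
The duplicate-timestamp heuristic from step 1 can also be expressed without Django, for quick offline analysis of exported audit entries; the field names here are illustrative:

```python
from collections import Counter
from datetime import datetime

def find_concurrent_writes(entries):
    """Flag (entity_id, timestamp) pairs that occur more than once --
    a heuristic for two writers racing on the same record."""
    counts = Counter((e["entity_id"], e["timestamp"]) for e in entries)
    return [key for key, n in counts.items() if n > 1]

entries = [
    {"entity_id": 7, "timestamp": datetime(2024, 1, 5, 9, 0, 0)},
    {"entity_id": 7, "timestamp": datetime(2024, 1, 5, 9, 0, 0)},  # same instant
    {"entity_id": 8, "timestamp": datetime(2024, 1, 5, 9, 1, 0)},
]
suspects = find_concurrent_writes(entries)
```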

For Cache Issues

  1. Clear all caches:

    # WARNING: FLUSHALL deletes every key in all Redis databases
    docker compose exec redis redis-cli FLUSHALL
    

  2. Restart services to clear in-memory cache:

    docker compose restart web
    docker compose restart celery_worker
    

  3. Recalculate affected data:

    # Trigger recalculation for affected period
    docker compose exec web python manage.py shell
    >>> from calculations.tasks import recalculate_clinician
    >>> recalculate_clinician(clinician_id=<id>, start_date=..., end_date=...)
    

  4. Verify: Data now consistent
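
To confirm the cache really was the culprit, compare the cached value against a fresh recomputation. A sketch, with a plain dict standing in for the real cache backend and illustrative key names:

```python
def check_cache_consistency(cache_get, recompute, key):
    """Return (consistent, cached, fresh): a mismatch points at a stale
    cache rather than bad source data."""
    cached = cache_get(key)
    fresh = recompute(key)
    return cached == fresh, cached, fresh

# A dict stands in for the real cache backend in this sketch.
cache = {"deficiency:42": 8}
consistent, cached, fresh = check_cache_consistency(
    cache.get, lambda key: 10, "deficiency:42"
)
```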

For Business Rule Issues

  1. Document expected behavior:

    # Create a test case encoding the expected behavior
    def test_specific_business_rule():
        clinician = create_test_clinician()
        result = calculate_deficiency(clinician, start_date, end_date, as_of_date)
        assert result == expected_value
    

  2. Update calculation logic if needed:

    # Fix calculation to match business rules
    def calculate_deficiency(clinician, start_date, end_date, as_of_date):
        # Updated logic here
        pass
    

  3. Verify: All tests pass
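
As an illustration only (the real rule lives with the business), a deficiency calculation often reduces to a non-negative shortfall against the expected hours for the period:

```python
def calculate_deficiency_hours(expected_hours, worked_hours):
    """Illustrative rule: deficiency is the non-negative shortfall of
    worked hours against the expectation for the period."""
    return max(expected_hours - worked_hours, 0)

# e.g. a clinician expected to work 160 hours who logged 152
deficiency = calculate_deficiency_hours(160, 152)
```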

Verification

After applying the fix, verify:

  - [ ] Inconsistent data corrected
  - [ ] Validation scripts pass
  - [ ] No new inconsistencies reported
  - [ ] Audit logs show expected behavior

Prevention

  • Add database constraints for critical data
  • Implement proper transaction isolation
  • Add validation checks before data commits
  • Use locking for critical operations
  • Regular data integrity audits
  • Comprehensive test coverage for business rules
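
"Validation checks before data commits" can be as simple as a guard that rejects obviously inconsistent records before they are saved; the field names below are illustrative:

```python
def validate_before_commit(record):
    """Return a list of validation errors; an empty list means the
    record is safe to persist."""
    errors = []
    if record.get("leave_balance", 0) < 0:
        errors.append("leave_balance must be non-negative")
    if record.get("worked_shifts", 0) > record.get("scheduled_shifts", 0):
        errors.append("worked_shifts cannot exceed scheduled_shifts")
    return errors

good = validate_before_commit(
    {"leave_balance": 5, "worked_shifts": 3, "scheduled_shifts": 4}
)
bad = validate_before_commit(
    {"leave_balance": -1, "worked_shifts": 5, "scheduled_shifts": 4}
)
```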

Escalation

  • If unresolved after 4 hours, escalate to: Tech Lead
  • If data loss possible, escalate immediately: Senior Developer
  • On-call contact: See on-call roster
  • Related runbooks: calculation_failures.md, performance_degradation.md
  • Related migrations: Check migration log for recent schema changes