Runbook: Data Inconsistency
Last Updated: 2026-02-21 | Severity: High | Estimated TTR: 2 hours | Owner: Development Team
Symptoms
- Calculated values don't match expected results
- Shift counts don't add up correctly
- Deficiency values are incorrect
- Leave balances are wrong
Detection
- Alert: Manual report or user complaint
- Dashboard: Data Quality Dashboard (if configured)
- Query: Run data validation scripts
Diagnosis Steps
- Identify specific inconsistent data:

  ```
  # Check shift counts for a specific clinician
  docker compose exec web python manage.py shell
  >>> from datetime import date
  >>> from config.models import Clinician, Shift
  >>> from calculations.shift_counter import ShiftCounter
  >>> clinician = Clinician.objects.get(id=<id>)  # replace <id> with the clinician's primary key
  >>> counter = ShiftCounter()
  >>> counter.count_worked_shifts(clinician, date(2024, 1, 1), date(2024, 1, 31))
  ```
- Verify source data:
  Expected: Shifts match the expected schedule.
  If different: Source data may be incorrect.
- Check for race conditions:
  Expected: Modifications to a record occur sequentially.
  If different: Overlapping modifications may indicate a race condition.
- Verify calculation logic:
  Expected: Calculation follows the expected logic.
  If different: A logic bug may exist.
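The race-condition check above comes down to looking for writes to the same record that land very close together. The following is a minimal sketch of that idea, assuming audit entries can be exported as `(record_id, timestamp, actor)` tuples. That shape, the one-second window, and the sample names are all assumptions for illustration, not the project's real audit log schema.

```python
from datetime import datetime, timedelta
from itertools import groupby


def find_overlapping_writes(entries, window=timedelta(seconds=1)):
    """Return (record_id, earlier, later) for writes to one record within `window`.

    `entries` is an iterable of (record_id, timestamp, actor) tuples.
    This tuple layout is hypothetical; adapt it to the real audit log.
    """
    suspicious = []
    # Sort by record, then by time, so consecutive entries are comparable.
    ordered = sorted(entries, key=lambda e: (e[0], e[1]))
    for record_id, group in groupby(ordered, key=lambda e: e[0]):
        writes = list(group)
        # Compare each write with the one immediately before it.
        for earlier, later in zip(writes, writes[1:]):
            if later[1] - earlier[1] <= window:
                suspicious.append((record_id, earlier, later))
    return suspicious


# Hypothetical sample data.
audit_log = [
    (42, datetime(2024, 1, 15, 9, 0, 0, 100000), "alice"),
    (42, datetime(2024, 1, 15, 9, 0, 0, 350000), "bob"),
    (42, datetime(2024, 1, 15, 14, 30, 0), "alice"),
    (7, datetime(2024, 1, 15, 9, 0, 0), "carol"),
]

hits = find_overlapping_writes(audit_log)
for record_id, earlier, later in hits:
    gap = later[1] - earlier[1]
    print(f"record {record_id}: {earlier[2]} and {later[2]} wrote within {gap}")
```

A hit is only a lead, not proof: two writes close together are harmless if the second one read the result of the first. Widen or narrow `window` to match how long the critical operation actually takes.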
Root Causes
| Cause | Likelihood | How to Confirm |
|---|---|---|
| Incomplete migration | High | Check if all migrations applied |
| Race condition in concurrent updates | Medium | Check audit logs for timing |
| Cache invalidation issue | Low | Clear cache and recheck |
| Business rule misunderstanding | High | Verify expected vs actual calculation logic |
| Database constraint bypass | Low | Check data integrity constraints |
Resolution Steps
For Incomplete Migration
- Check migration status.
- Apply any missing migrations.
- Verify: Data consistency restored
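In a Django project, `manage.py showmigrations --plan` lists every migration with `[X]` (applied) or `[ ]` (not applied), and `manage.py migrate` applies the pending ones. The sketch below shows one way to pull the unapplied names out of captured plan output. The migration names in the sample are hypothetical, and the helper is an illustration rather than part of the project.

```python
import re

# Matches plan lines such as "[X]  config.0001_initial" or "[ ]  config.0002_x".
PLAN_LINE = re.compile(r"^\[(?P<mark>[X ])\]\s+(?P<name>\S+)")


def unapplied_migrations(plan_output):
    """Return the migration names that the plan output marks as not applied."""
    pending = []
    for line in plan_output.splitlines():
        match = PLAN_LINE.match(line.strip())
        if match and match.group("mark") == " ":
            pending.append(match.group("name"))
    return pending


# Hypothetical output captured from `showmigrations --plan`.
sample = """\
[X]  contenttypes.0001_initial
[X]  config.0001_initial
[ ]  config.0002_add_leave_balance
[ ]  calculations.0001_initial
"""

pending = unapplied_migrations(sample)
print(pending)
```

Following the command style used earlier in this runbook, the output would come from `docker compose exec web python manage.py showmigrations --plan`. An empty result after running `migrate` suggests the schema is current, though it does not confirm that any data migrations produced the expected values.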
For Race Conditions
- Identify affected records.
- Implement locking for critical operations.
- Verify: No new race conditions detected
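One way to protect a read-modify-write sequence is optimistic locking: each row carries a version number, and an update only succeeds if the version is unchanged since the read. The sketch below demonstrates the pattern with the standard library's `sqlite3` so it runs anywhere. The `leave_balance` table, its columns, and the `apply_delta` helper are hypothetical. In Django, the pessimistic alternative is `select_for_update()` inside `transaction.atomic()`.

```python
import sqlite3


def apply_delta(conn, record_id, delta, attempts=3):
    """Add `delta` to a balance, retrying if another writer changed the row."""
    for _ in range(attempts):
        row = conn.execute(
            "SELECT balance, version FROM leave_balance WHERE id = ?",
            (record_id,),
        ).fetchone()
        if row is None:
            raise LookupError(f"no leave_balance row with id {record_id}")
        balance, version = row
        # The WHERE clause makes the write conditional on the version we read.
        cursor = conn.execute(
            "UPDATE leave_balance SET balance = ?, version = version + 1 "
            "WHERE id = ? AND version = ?",
            (balance + delta, record_id, version),
        )
        conn.commit()
        if cursor.rowcount == 1:
            return balance + delta
        # rowcount == 0: the row changed after our read, so read again.
    raise RuntimeError("gave up after repeated concurrent modifications")


# Hypothetical table used only for this demonstration.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE leave_balance "
    "(id INTEGER PRIMARY KEY, balance REAL, version INTEGER)"
)
conn.execute("INSERT INTO leave_balance VALUES (1, 10.0, 0)")
conn.commit()

new_balance = apply_delta(conn, 1, -2.5)

# A writer still holding the old version (0) no longer matches any row.
stale = conn.execute(
    "UPDATE leave_balance SET balance = 99, version = version + 1 "
    "WHERE id = 1 AND version = 0"
)
stale_rejected = stale.rowcount == 0
print(new_balance, stale_rejected)
```

Optimistic locking suits operations where conflicts are rare, because nothing blocks and only the losing writer retries. Where conflicts are frequent, taking a row lock up front is usually simpler.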
For Cache Issues
- Clear all caches.
- Restart services to clear in-memory cache.
- Recalculate affected data.
- Verify: Data now consistent
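Before or after clearing, it can help to confirm that stale cache entries were really the cause by comparing cached values with freshly recalculated ones. The sketch below uses a plain dictionary as a stand-in for the cache. The key format, the sample data, and the `recompute_shift_count` function are hypothetical.

```python
def find_stale_entries(cache, recompute, keys):
    """Return {key: (cached, fresh)} for keys whose cached value is outdated."""
    stale = {}
    for key in keys:
        if key not in cache:
            continue  # nothing cached, so nothing can be stale
        fresh = recompute(key)
        if cache[key] != fresh:
            stale[key] = (cache[key], fresh)
    return stale


# Hypothetical source rows: one list element per worked shift.
source_rows = {"clinician:1": [1, 1, 1], "clinician:2": [1, 1]}
# Hypothetical cached counts; the second one is out of date.
cache = {"clinician:1": 3, "clinician:2": 5}


def recompute_shift_count(key):
    """Recalculate the count directly from the source rows."""
    return len(source_rows[key])


stale = find_stale_entries(cache, recompute_shift_count, source_rows)
print(stale)

# Drop only the stale entries so they are rebuilt on next access.
for key in stale:
    del cache[key]
```

If this comparison finds nothing stale, the cache is probably not the cause and the other root causes deserve attention first. If the project uses Django's cache framework, `django.core.cache.cache.clear()` empties the default cache in one call.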
For Business Rule Issues
- Document expected behavior.
- Update calculation logic if needed.
- Verify: All tests pass
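Expected behavior is easiest to agree on when it is written as concrete cases that can be run against the calculation. The sketch below records each rule as a row of inputs and an expected result, then reports the mismatches. The `count_shifts_in_range` function is a stand-in with an assumed inclusive-range rule, not the project's `ShiftCounter`, and the cases are illustrative.

```python
from datetime import date


def count_shifts_in_range(shift_dates, start, end):
    """Stand-in rule: count shifts dated within [start, end], ends inclusive."""
    return sum(1 for d in shift_dates if start <= d <= end)


JAN_START, JAN_END = date(2024, 1, 1), date(2024, 1, 31)

# Each case documents one agreed rule: (description, shifts, start, end, expected).
CASES = [
    ("shift on the first day counts", [date(2024, 1, 1)], JAN_START, JAN_END, 1),
    ("shift on the last day counts", [date(2024, 1, 31)], JAN_START, JAN_END, 1),
    ("shift after the range is ignored", [date(2024, 2, 1)], JAN_START, JAN_END, 0),
]


def failed_cases(func, cases):
    """Run every case and return (description, expected, actual) for mismatches."""
    failures = []
    for description, shifts, start, end, expected in cases:
        actual = func(shifts, start, end)
        if actual != expected:
            failures.append((description, expected, actual))
    return failures


failures = failed_cases(count_shifts_in_range, CASES)
print(failures)
```

Boundary cases such as the first and last day of a period are where a misunderstood rule tends to show up, so they are worth listing explicitly. Once the real rules are confirmed with the business owner, the same cases can move into the project's test suite.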
Verification
After applying the fix, verify:
- [ ] Inconsistent data corrected
- [ ] Validation scripts pass
- [ ] No new inconsistencies reported
- [ ] Audit logs show expected behavior
Prevention
- Add database constraints for critical data
- Implement proper transaction isolation
- Add validation checks before data commits
- Use locking for critical operations
- Run regular data integrity audits
- Maintain comprehensive test coverage for business rules
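As an illustration of the first item, a database-level constraint rejects invalid values no matter which code path writes them. The sketch below uses the standard library's `sqlite3` and a hypothetical `leave_balance` table. The rule that a balance may never be negative is an assumption for the example and may not match the real business rule.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical table; the CHECK clause enforces the assumed rule in the database.
conn.execute(
    """
    CREATE TABLE leave_balance (
        clinician_id INTEGER PRIMARY KEY,
        days_remaining REAL NOT NULL CHECK (days_remaining >= 0)
    )
    """
)
conn.execute("INSERT INTO leave_balance VALUES (1, 12.5)")
conn.commit()

try:
    # This write violates the constraint, so the database refuses it.
    conn.execute(
        "UPDATE leave_balance SET days_remaining = -3 WHERE clinician_id = 1"
    )
    rejected = False
except sqlite3.IntegrityError:
    rejected = True

remaining = conn.execute(
    "SELECT days_remaining FROM leave_balance WHERE clinician_id = 1"
).fetchone()[0]
print(rejected, remaining)
```

In a Django project the equivalent is a `CheckConstraint` listed in the model's `Meta.constraints`, which is created through a migration. Application-level validation is still useful for friendly error messages, but the constraint is what stops a bypass.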
Escalation
- If unresolved after 4 hours, escalate to: Tech Lead
- If data loss possible, escalate immediately: Senior Developer
- On-call contact: See on-call roster
Related Issues
- Related runbooks: calculation_failures.md, performance_degradation.md
- Related migrations: Check migration log for recent schema changes