Rollback Strategy - Recovery Procedures
Overview
This document describes the rollback procedure for each phase of the architecture simplification. Each phase can be rolled back independently, and a full system rollback to the pre-simplification state is also available.
Pre-Execution Backup
Before Starting ANY Phase
# 1. Create backup branch (capture the timestamp once so both commands use the same name)
BACKUP_BRANCH="backup-$(date +%Y%m%d-%H%M%S)"
git checkout -b "$BACKUP_BRANCH"
git push origin "$BACKUP_BRANCH"
# 2. Tag current state
git tag -a pre-simplification-$(date +%Y%m%d) \
-m "State before architecture simplification"
git push origin --tags
# 3. Export docker volumes
docker run --rm -v admin_postgres_data:/data -v $(pwd):/backup \
alpine tar czf /backup/postgres-backup-$(date +%Y%m%d).tar.gz /data
docker run --rm -v admin_redis_data:/data -v $(pwd):/backup \
alpine tar czf /backup/redis-backup-$(date +%Y%m%d).tar.gz /data
# 4. Export MinIO data (if documents exist)
docker run --rm -v admin_minio_data:/data -v $(pwd):/backup \
alpine tar czf /backup/minio-backup-$(date +%Y%m%d).tar.gz /data
# 5. Document current state
docker compose ps > container-state-$(date +%Y%m%d).txt
docker network ls > network-state-$(date +%Y%m%d).txt
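A backup archive that cannot be read back will not help during a rollback, so it is worth checking each archive right after creating it. The following is a minimal sketch, not part of the original procedure; `verify_archive` is a hypothetical helper name. It succeeds only if the file exists, is non-empty, and can be listed by `tar`.

```shell
#!/bin/sh
# verify_archive is a hypothetical helper, not part of the project.
# Returns 0 only if the file exists, is non-empty, and tar can list it.
verify_archive() {
    archive="$1"
    if [ ! -s "$archive" ]; then
        echo "MISSING or empty: $archive" >&2
        return 1
    fi
    if ! tar tzf "$archive" > /dev/null 2>&1; then
        echo "UNREADABLE: $archive" >&2
        return 1
    fi
    echo "OK: $archive"
}

# Usage, assuming the archive names from step 3 and 4 above:
#   for name in postgres redis minio; do
#       verify_archive "$name-backup-$(date +%Y%m%d).tar.gz" || exit 1
#   done
```

Listing the archive only confirms that it is a readable gzip-compressed tarball; it does not prove that the database files inside it are consistent.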
Per-Phase Rollback Procedures
Phase 1: Docker Compose Rollback
Rollback Trigger:
- docker-compose.yml validation fails
- Containers fail to start
- Network errors
- Volume mount issues
Rollback Steps:
# 1. Stop current containers
docker compose down
# 2. Restore docker-compose.yml
git checkout HEAD~1 -- docker-compose.yml
# 3. Restart with original config
docker compose up -d
# 4. Verify original state
docker compose ps # Should show 14 containers
Validation:
- 14 containers running
- All containers healthy
- No errors in logs
Risk: Low - file-based rollback, no data loss
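The "14 containers running" check can be scripted instead of counted by eye. The sketch below is an illustration under assumptions: `check_container_count` is a hypothetical helper, and the usage line assumes a Docker Compose v2 release whose `ps` command supports `--status` and `-q` together.

```shell
#!/bin/sh
# check_container_count is a hypothetical helper. It reads one container ID
# (or name) per line on stdin and succeeds only if the number of non-empty
# lines equals the expected count given as $1.
check_container_count() {
    expected="$1"
    # grep -c prints the count even when it is 0 (and then exits non-zero)
    actual=$(grep -c . || true)
    if [ "$actual" -eq "$expected" ]; then
        echo "OK: $actual containers running"
    else
        echo "MISMATCH: expected $expected, found $actual" >&2
        return 1
    fi
}

# Usage (assumes Docker Compose v2):
#   docker compose ps --status running -q | check_container_count 14
```

A matching count does not imply the containers are healthy, so the health and log checks in the validation list still apply.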
Phase 2: Remove Tenant Rollback
Rollback Trigger:
- Build errors after tenant code removal
- Application won't start
- Tests failing
- Missing functionality
Rollback Steps:
# 1. Restore deleted files
git checkout HEAD~1 -- backend/src/core/middleware/tenant.ts
git checkout HEAD~1 -- backend/src/core/config/tenant.ts
git checkout HEAD~1 -- backend/src/features/tenant-management/
# 2. Restore modified files
git checkout HEAD~1 -- backend/src/app.ts
git checkout HEAD~1 -- backend/src/core/plugins/auth.plugin.ts
# 3. Rebuild backend
cd backend
npm install
npm run build
# 4. Restart backend container
docker compose restart mvp-backend # or admin-backend if Phase 1 not done
Validation:
- Backend builds successfully
- Backend starts without errors
- Tests pass
- Tenant functionality restored
Risk: Low-Medium - code rollback, no data impact
Phase 3: Filesystem Storage Rollback
Rollback Trigger:
- Document upload/download fails
- File system errors
- Permission issues
- Data access errors
Rollback Steps:
# 1. Stop backend
docker compose stop mvp-backend
# 2. Restore storage adapter
git checkout HEAD~1 -- backend/src/core/storage/
# 3. Restore documents feature
git checkout HEAD~1 -- backend/src/features/documents/
# 4. Re-add MinIO to docker-compose.yml
git checkout HEAD~1 -- docker-compose.yml
# (Only MinIO service, keep other Phase 1 changes if applicable)
# 5. Restore MinIO data if backed up
docker volume create admin_minio_data
docker run --rm -v admin_minio_data:/data -v $(pwd):/backup \
alpine tar xzf /backup/minio-backup-YYYYMMDD.tar.gz -C /
# 6. Rebuild and restart
docker compose up -d admin-minio
docker compose restart mvp-backend
Validation:
- MinIO container running
- Document upload works
- Document download works
- Existing documents accessible
Risk: Medium - requires MinIO restore, potential data migration
Phase 4: Config Cleanup Rollback
Rollback Trigger:
- Service connection failures
- Authentication errors
- Missing configuration
- Environment variable errors
Rollback Steps:
# 1. Restore config files
git checkout HEAD~1 -- config/app/production.yml
git checkout HEAD~1 -- .env
git checkout HEAD~1 -- .env.development
# 2. Restore secrets
git checkout HEAD~1 -- secrets/app/
git checkout HEAD~1 -- secrets/platform/
# 3. Restart affected services
docker compose restart mvp-backend mvp-platform
Validation:
- Backend connects to database
- Backend connects to Redis
- Platform service accessible
- Auth0 integration works
Risk: Low - configuration rollback, no data loss
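Missing configuration is one of the rollback triggers for this phase, and it can be detected before services are restarted. The sketch below is illustrative only: `missing_vars` is a hypothetical helper, and the variable names in the usage line are placeholders, not the project's actual settings.

```shell
#!/bin/sh
# missing_vars is a hypothetical helper: it prints the name of every listed
# variable that is unset or empty, and returns non-zero if any were found.
# Arguments must be valid shell variable names.
missing_vars() {
    status=0
    for name in "$@"; do
        # Indirect lookup of the variable whose name is stored in $name
        eval "value=\${$name:-}"
        if [ -z "$value" ]; then
            echo "$name"
            status=1
        fi
    done
    return "$status"
}

# Usage (placeholder names, adjust to the real configuration):
#   missing_vars DATABASE_URL REDIS_URL AUTH0_DOMAIN || echo "config incomplete"
```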
Phase 5: Network Simplification Rollback
Rollback Trigger:
- Service discovery failures
- Network isolation broken
- Container communication errors
- Traefik routing issues
Rollback Steps:
# 1. Stop all services
docker compose down
# 2. Remove simplified networks if they still exist (docker compose down already removes the networks it created)
docker network rm motovaultpro_frontend motovaultpro_backend motovaultpro_database
# 3. Restore network configuration
git checkout HEAD~1 -- docker-compose.yml
# (Restore only networks section if possible)
# 4. Recreate networks and restart
docker compose up -d
# 5. Verify routing
curl http://localhost:8080/api/http/routers # Traefik API: lists configured routers
Validation:
- All 5 networks exist
- Services can communicate
- Traefik routes correctly
- No network errors
Risk: Medium - requires container restart, brief downtime
Phase 6: Backend Updates Rollback
Rollback Trigger:
- Service reference errors
- API connection failures
- Database connection issues
- Build failures
Rollback Steps:
# 1. Restore backend code
git checkout HEAD~1 -- backend/src/core/config/config-loader.ts
git checkout HEAD~1 -- backend/src/features/vehicles/external/
# 2. Rebuild backend
cd backend
npm run build
# 3. Restart backend
docker compose restart mvp-backend
Validation:
- Backend starts successfully
- Connects to database
- Platform client works
- Tests pass
Risk: Low - code rollback, no data impact
Phase 7: Database Updates Rollback
Rollback Trigger:
- Database connection failures
- Schema errors
- Migration failures
- Data access issues
Rollback Steps:
# 1. Restore database configuration
git checkout HEAD~1 -- backend/src/_system/migrations/
git checkout HEAD~1 -- docker-compose.yml
# (Only database section)
# 2. Restore database volume if corrupted
docker compose down mvp-postgres
docker volume rm mvp_postgres_data
docker volume create admin_postgres_data
docker run --rm -v admin_postgres_data:/data -v $(pwd):/backup \
alpine tar xzf /backup/postgres-backup-YYYYMMDD.tar.gz -C /
# 3. Restart database
docker compose up -d mvp-postgres
# 4. Re-run migrations if needed
docker compose exec mvp-backend node dist/_system/migrations/run-all.js
Validation:
- Database accessible
- All tables exist
- Data intact
- Migrations current
Risk: High - potential data loss if volume restore needed
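The restore commands use a `YYYYMMDD` placeholder that has to be replaced with a real backup date. Since this is the highest-risk phase, it may help to resolve the archive name before any volume is removed. The sketch below is an assumption-laden illustration: `latest_backup` is a hypothetical helper, and it relies on the `<name>-backup-YYYYMMDD.tar.gz` naming used in the Pre-Execution Backup section.

```shell
#!/bin/sh
# latest_backup is a hypothetical helper. Because the archive names embed a
# YYYYMMDD stamp, lexical order equals chronological order, so the last
# match in sorted glob order is the newest backup.
latest_backup() {
    dir="$1"
    prefix="$2"
    newest=""
    for f in "$dir/$prefix"-backup-*.tar.gz; do
        [ -e "$f" ] || continue   # the glob matched nothing
        newest="$f"               # globs expand in sorted order
    done
    if [ -z "$newest" ]; then
        echo "no $prefix backup found in $dir" >&2
        return 1
    fi
    echo "$newest"
}

# Usage: resolve the archive first, and stop if none exists.
#   archive=$(latest_backup "$(pwd)" postgres) || exit 1
#   echo "Would restore from $(basename "$archive")"
```

Running this before `docker volume rm` means a missing backup aborts the procedure while the current volume is still intact.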
Phase 8: Platform Service Rollback
Rollback Trigger:
- Platform API failures
- Database connection errors
- Service crashes
- API endpoint errors
Rollback Steps:
# 1. Stop simplified platform service
docker compose down mvp-platform
# 2. Restore platform service files
git checkout HEAD~1 -- mvp-platform-services/vehicles/
# 3. Restore full platform architecture in docker-compose.yml
git checkout HEAD~1 -- docker-compose.yml
# (Only platform services section)
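# 4. Restore platform database (this archive is not created by the Pre-Execution Backup steps; create it separately before starting Phase 8)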
docker volume create mvp_platform_vehicles_db_data
docker run --rm -v mvp_platform_vehicles_db_data:/data -v $(pwd):/backup \
alpine tar xzf /backup/platform-db-backup-YYYYMMDD.tar.gz -C /
# 5. Restart all platform services
docker compose up -d mvp-platform-vehicles-api mvp-platform-vehicles-db \
mvp-platform-vehicles-redis mvp-platform-vehicles-etl
Validation:
- Platform service accessible
- API endpoints work
- VIN decode works
- Hierarchical data loads
Risk: Medium-High - requires multi-container restore
Phase 9: Documentation Rollback
Rollback Trigger:
- Incorrect documentation
- Missing instructions
- Broken links
- Confusion among team
Rollback Steps:
# 1. Restore all documentation
git checkout HEAD~1 -- README.md CLAUDE.md AI-INDEX.md
git checkout HEAD~1 -- docs/
git checkout HEAD~1 -- .ai/context.json
git checkout HEAD~1 -- Makefile
git checkout HEAD~1 -- backend/src/features/*/README.md
Validation:
- Documentation accurate
- Examples work
- Makefile commands work
Risk: None - documentation only, no functional impact
Phase 10: Frontend Rollback
Rollback Trigger:
- Build errors
- Runtime errors
- UI broken
- API calls failing
Rollback Steps:
# 1. Restore frontend code
git checkout HEAD~1 -- frontend/src/
# 2. Rebuild frontend
cd frontend
npm install
npm run build
# 3. Restart frontend container
docker compose restart mvp-frontend
Validation:
- Frontend builds successfully
- UI loads without errors
- Auth works
- API calls work
Risk: Low - frontend rollback, no data impact
Phase 11: Testing Rollback
Note: The testing phase does not modify code; it only validates. If tests fail, roll back the appropriate phases based on failure analysis.
Full System Rollback
Complete Rollback to Pre-Simplification State
When to Use:
- Multiple phases failing
- Unrecoverable errors
- Production blocker
- Need to abort entire simplification
Rollback Steps:
# 1. Stop all services
docker compose down
# 2. Restore entire codebase
git checkout pre-simplification-YYYYMMDD # tag created in Pre-Execution Backup, step 2
# 3. Restore volumes
docker volume rm mvp_postgres_data mvp_redis_data
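# docker volume create accepts only one volume name per invocation
docker volume create admin_postgres_data
docker volume create admin_redis_data
docker volume create admin_minio_data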
docker run --rm -v admin_postgres_data:/data -v $(pwd):/backup \
alpine tar xzf /backup/postgres-backup-YYYYMMDD.tar.gz -C /
docker run --rm -v admin_redis_data:/data -v $(pwd):/backup \
alpine tar xzf /backup/redis-backup-YYYYMMDD.tar.gz -C /
docker run --rm -v admin_minio_data:/data -v $(pwd):/backup \
alpine tar xzf /backup/minio-backup-YYYYMMDD.tar.gz -C /
# 4. Restart all services
docker compose up -d
# 5. Verify original state
docker compose ps # Should show 14 containers
make test
Validation:
- All 14 containers running
- All tests passing
- Application functional
- Data intact
Duration: 15-30 minutes
Risk: Low if backups are good
Partial Rollback Scenarios
Scenario 1: Keep Infrastructure Changes, Rollback Backend
# Keep Phase 1 (Docker), roll back Phases 2-11
git checkout pre-simplification-YYYYMMDD -- backend/ frontend/
docker compose restart mvp-backend mvp-frontend
Scenario 2: Keep Config Cleanup, Rollback Code
# Keep Phase 4, roll back Phases 1-3, 5-11
git checkout pre-simplification-YYYYMMDD -- docker-compose.yml backend/src/ frontend/src/
Scenario 3: Rollback Only Storage
# Roll back Phase 3 only
git checkout HEAD~1 -- backend/src/core/storage/ backend/src/features/documents/
# Requires the MinIO service to be defined in docker-compose.yml (see Phase 3, step 4)
docker compose up -d admin-minio
Rollback Decision Matrix
| Failure Type | Rollback Scope | Risk | Duration |
|---|---|---|---|
| Container start fails | Phase 1 | Low | 5 min |
| Build errors | Specific phase | Low | 10 min |
| Test failures | Investigate, partial | Medium | 15-30 min |
| Data corruption | Full + restore | High | 30-60 min |
| Network issues | Phase 5 | Medium | 10 min |
| Platform API down | Phase 8 | Medium | 15 min |
| Critical production bug | Full system | Medium | 30 min |
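For scripted incident handling, the matrix can be encoded as a simple lookup. This is only a sketch: `rollback_scope` is a hypothetical helper, and the short failure keys are invented labels for the rows of the table above.

```shell
#!/bin/sh
# rollback_scope is a hypothetical helper that mirrors the decision matrix.
# The failure keys (container-start, build, ...) are invented labels.
rollback_scope() {
    case "$1" in
        container-start) echo "Phase 1" ;;
        build)           echo "Specific phase" ;;
        tests)           echo "Investigate, partial" ;;
        data-corruption) echo "Full + restore" ;;
        network)         echo "Phase 5" ;;
        platform-api)    echo "Phase 8" ;;
        production-bug)  echo "Full system" ;;
        *)
            echo "unknown failure type: $1" >&2
            return 1
            ;;
    esac
}

# Usage:
#   echo "Recommended scope: $(rollback_scope network)"
```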
Post-Rollback Actions
After any rollback:
1. Document the Issue:
# Create incident report
echo "Rollback performed: $(date)" >> docs/redesign/ROLLBACK-LOG.md
echo "Reason: [description]" >> docs/redesign/ROLLBACK-LOG.md
echo "Phases rolled back: [list]" >> docs/redesign/ROLLBACK-LOG.md
2. Analyze Root Cause:
   - Review logs
   - Identify failure point
   - Document lessons learned
3. Plan Fix:
   - Address root cause
   - Update phase documentation
   - Add validation checks
4. Retry (if appropriate):
   - Apply fix
   - Re-execute phase
   - Validate thoroughly
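The incident-report commands under Document the Issue can be wrapped in a small function so that every rollback is logged in the same shape. This is a sketch under assumptions: `log_rollback` and the `ROLLBACK_LOG` override variable are hypothetical names; the default path is the one used in this document.

```shell
#!/bin/sh
# log_rollback is a hypothetical helper that appends one incident entry.
# ROLLBACK_LOG is a hypothetical override; it defaults to the log path
# used in this document.
log_rollback() {
    reason="$1"
    phases="$2"
    log_file="${ROLLBACK_LOG:-docs/redesign/ROLLBACK-LOG.md}"
    # Create the log directory if it does not exist yet
    mkdir -p "$(dirname "$log_file")"
    {
        echo "Rollback performed: $(date)"
        echo "Reason: $reason"
        echo "Phases rolled back: $phases"
        echo ""
    } >> "$log_file"
}

# Usage:
#   log_rollback "Document upload failed after storage change" "3"
```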
Emergency Contacts
If rollback fails or assistance needed:
- Technical Lead: [contact]
- DevOps Lead: [contact]
- Database Admin: [contact]
- Emergency Hotline: [contact]
Rollback Testing
Before starting simplification, test rollback procedures:
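# Dry run rollback
git checkout -b rollback-test
# Commit a baseline file (it must exist in HEAD~1 for the rollback to find it)
echo "original" > test.txt
git add test.txt
git commit -m "test: baseline"
# Make test change
echo "changed" > test.txt
git commit -am "test: change"
# Rollback test
git checkout HEAD~1 -- test.txt
# Verify rollback works
cat test.txt # Should print "original"
# Clean up (-f discards the staged rollback so the branch switch succeeds)
git checkout -f main
git branch -D rollback-test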