motovaultpro/docs/redesign/ROLLBACK-STRATEGY.md
Eric Gullickson 046c66fc7d Redesign
2025-11-01 21:27:42 -05:00

# Rollback Strategy - Recovery Procedures
## Overview
This document provides comprehensive rollback procedures for each phase of the simplification. Each phase can be rolled back independently, and full system rollback is available.
## Pre-Execution Backup
### Before Starting ANY Phase
```bash
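# 1. Create backup branch (compute the name once so both commands use the same timestamp)
BACKUP_BRANCH="backup-$(date +%Y%m%d-%H%M%S)"
git branch "$BACKUP_BRANCH" # creates the branch without switching to it
git push origin "$BACKUP_BRANCH"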
# 2. Tag current state
git tag -a pre-simplification-$(date +%Y%m%d) \
-m "State before architecture simplification"
git push origin --tags
# 3. Export docker volumes
docker run --rm -v admin_postgres_data:/data -v $(pwd):/backup \
alpine tar czf /backup/postgres-backup-$(date +%Y%m%d).tar.gz /data
docker run --rm -v admin_redis_data:/data -v $(pwd):/backup \
alpine tar czf /backup/redis-backup-$(date +%Y%m%d).tar.gz /data
# 4. Export MinIO data (if documents exist)
docker run --rm -v admin_minio_data:/data -v $(pwd):/backup \
alpine tar czf /backup/minio-backup-$(date +%Y%m%d).tar.gz /data
# 5. Document current state
docker compose ps > container-state-$(date +%Y%m%d).txt
docker network ls > network-state-$(date +%Y%m%d).txt
```
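A backup that cannot be read is of no use during a rollback, so it may be worth checking each archive right after it is written. The following is a minimal sketch: `verify_backup` is a hypothetical helper name, not something that exists in the repository, and it only confirms that the file is a non-empty, readable gzip tarball. It does not validate the data inside.

```bash
# Hypothetical helper: succeed only if the archive exists, is non-empty,
# and tar can list its contents.
verify_backup() {
  local archive="$1"
  if [ ! -s "$archive" ]; then
    echo "missing or empty: $archive" >&2
    return 1
  fi
  # tar tzf lists the entries without extracting anything.
  if ! tar tzf "$archive" > /dev/null 2>&1; then
    echo "unreadable archive: $archive" >&2
    return 1
  fi
  echo "ok: $archive"
}
```

For example, `verify_backup postgres-backup-YYYYMMDD.tar.gz` could be run for each archive before starting Phase 1.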
---
## Per-Phase Rollback Procedures
### Phase 1: Docker Compose Rollback
**Rollback Trigger:**
- docker-compose.yml validation fails
- Containers fail to start
- Network errors
- Volume mount issues
**Rollback Steps:**
```bash
# 1. Stop current containers
docker compose down
# 2. Restore docker-compose.yml
git checkout HEAD~1 -- docker-compose.yml
# 3. Restart with original config
docker compose up -d
# 4. Verify original state
docker compose ps # Should show 14 containers
```
**Validation:**
- [ ] 14 containers running
- [ ] All containers healthy
- [ ] No errors in logs
**Risk:** Low - file-based rollback, no data loss
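The "14 containers running" check can be scripted rather than counted by eye. This is a sketch under assumptions: `check_container_count` is a hypothetical helper that reads one service name per line on stdin, so it can be fed from `docker compose ps` or from a saved state file.

```bash
# Hypothetical helper: compare the number of non-empty lines on stdin
# with the expected container count.
check_container_count() {
  local expected="$1"
  local actual
  # grep -c exits non-zero when it counts 0 lines, hence the "|| true".
  actual=$(grep -c . || true)
  if [ "$actual" -eq "$expected" ]; then
    echo "ok: $actual containers"
  else
    echo "mismatch: expected $expected, found $actual" >&2
    return 1
  fi
}
```

With Docker Compose v2, one possible invocation is `docker compose ps --services --status running | check_container_count 14`.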
---
### Phase 2: Remove Tenant Rollback
**Rollback Trigger:**
- Build errors after tenant code removal
- Application won't start
- Tests failing
- Missing functionality
**Rollback Steps:**
```bash
# 1. Restore deleted files
git checkout HEAD~1 -- backend/src/core/middleware/tenant.ts
git checkout HEAD~1 -- backend/src/core/config/tenant.ts
git checkout HEAD~1 -- backend/src/features/tenant-management/
# 2. Restore modified files
git checkout HEAD~1 -- backend/src/app.ts
git checkout HEAD~1 -- backend/src/core/plugins/auth.plugin.ts
# 3. Rebuild backend (the subshell keeps the working directory at the repo root)
(cd backend && npm install && npm run build)
# 4. Restart backend container
docker compose restart mvp-backend # or admin-backend if Phase 1 not done
```
**Validation:**
- [ ] Backend builds successfully
- [ ] Backend starts without errors
- [ ] Tests pass
- [ ] Tenant functionality restored
**Risk:** Low-Medium - code rollback, no data impact
---
### Phase 3: Filesystem Storage Rollback
**Rollback Trigger:**
- Document upload/download fails
- File system errors
- Permission issues
- Data access errors
**Rollback Steps:**
```bash
# 1. Stop backend
docker compose stop mvp-backend
# 2. Restore storage adapter
git checkout HEAD~1 -- backend/src/core/storage/
# 3. Restore documents feature
git checkout HEAD~1 -- backend/src/features/documents/
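# 4. Re-add MinIO to docker-compose.yml
# A plain `git checkout HEAD~1 -- docker-compose.yml` restores the whole file.
# Use -p to accept only the MinIO hunks and keep other Phase 1 changes if applicable.
git checkout -p HEAD~1 -- docker-compose.yml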
# 5. Restore MinIO data if backed up
docker volume create admin_minio_data
docker run --rm -v admin_minio_data:/data -v $(pwd):/backup \
alpine tar xzf /backup/minio-backup-YYYYMMDD.tar.gz -C /
# 6. Rebuild the backend, start MinIO, and restart the backend
(cd backend && npm run build)
docker compose up -d admin-minio
docker compose restart mvp-backend
```
**Validation:**
- [ ] MinIO container running
- [ ] Document upload works
- [ ] Document download works
- [ ] Existing documents accessible
**Risk:** Medium - requires MinIO restore, potential data migration
---
### Phase 4: Config Cleanup Rollback
**Rollback Trigger:**
- Service connection failures
- Authentication errors
- Missing configuration
- Environment variable errors
**Rollback Steps:**
```bash
# 1. Restore config files
git checkout HEAD~1 -- config/app/production.yml
git checkout HEAD~1 -- .env
git checkout HEAD~1 -- .env.development
# 2. Restore secrets
git checkout HEAD~1 -- secrets/app/
git checkout HEAD~1 -- secrets/platform/
# 3. Restart affected services
docker compose restart mvp-backend mvp-platform
```
**Validation:**
- [ ] Backend connects to database
- [ ] Backend connects to Redis
- [ ] Platform service accessible
- [ ] Auth0 integration works
**Risk:** Low - configuration rollback, no data loss
---
### Phase 5: Network Simplification Rollback
**Rollback Trigger:**
- Service discovery failures
- Network isolation broken
- Container communication errors
- Traefik routing issues
**Rollback Steps:**
```bash
# 1. Stop all services
docker compose down
# 2. Remove simplified networks
docker network rm motovaultpro_frontend motovaultpro_backend motovaultpro_database
# 3. Restore network configuration
# Use -p to restore only the networks section; without it the whole file is restored.
git checkout -p HEAD~1 -- docker-compose.yml
# 4. Recreate networks and restart
docker compose up -d
# 5. Verify routing
curl http://localhost:8080/api/http/routers # Traefik API: lists the HTTP routers
```
**Validation:**
- [ ] All 5 networks exist
- [ ] Services can communicate
- [ ] Traefik routes correctly
- [ ] No network errors
**Risk:** Medium - requires container restart, brief downtime
---
### Phase 6: Backend Updates Rollback
**Rollback Trigger:**
- Service reference errors
- API connection failures
- Database connection issues
- Build failures
**Rollback Steps:**
```bash
# 1. Restore backend code
git checkout HEAD~1 -- backend/src/core/config/config-loader.ts
git checkout HEAD~1 -- backend/src/features/vehicles/external/
# 2. Rebuild backend (the subshell keeps the working directory at the repo root)
(cd backend && npm run build)
# 3. Restart backend
docker compose restart mvp-backend
```
**Validation:**
- [ ] Backend starts successfully
- [ ] Connects to database
- [ ] Platform client works
- [ ] Tests pass
**Risk:** Low - code rollback, no data impact
---
### Phase 7: Database Updates Rollback
**Rollback Trigger:**
- Database connection failures
- Schema errors
- Migration failures
- Data access issues
**Rollback Steps:**
```bash
# 1. Restore database configuration
git checkout HEAD~1 -- backend/src/_system/migrations/
# Use -p to restore only the database section; without it the whole file is restored.
git checkout -p HEAD~1 -- docker-compose.yml
# 2. Restore database volume if corrupted
docker compose down mvp-postgres
docker volume rm mvp_postgres_data
docker volume create admin_postgres_data
docker run --rm -v admin_postgres_data:/data -v $(pwd):/backup \
alpine tar xzf /backup/postgres-backup-YYYYMMDD.tar.gz -C /
# 3. Restart database
docker compose up -d mvp-postgres
# 4. Re-run migrations if needed
docker compose exec mvp-backend node dist/_system/migrations/run-all.js
```
**Validation:**
- [ ] Database accessible
- [ ] All tables exist
- [ ] Data intact
- [ ] Migrations current
**Risk:** High - potential data loss if volume restore needed
---
### Phase 8: Platform Service Rollback
**Rollback Trigger:**
- Platform API failures
- Database connection errors
- Service crashes
- API endpoint errors
**Rollback Steps:**
```bash
# 1. Stop simplified platform service
docker compose down mvp-platform
# 2. Restore platform service files
git checkout HEAD~1 -- mvp-platform-services/vehicles/
# 3. Restore full platform architecture in docker-compose.yml
# Use -p to restore only the platform services section; without it the whole file is restored.
git checkout -p HEAD~1 -- docker-compose.yml
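# 4. Restore platform database
# NOTE: the Pre-Execution Backup steps do not create platform-db-backup-*.tar.gz;
# this step only works if that archive was exported separately beforehand.
docker volume create mvp_platform_vehicles_db_data
docker run --rm -v mvp_platform_vehicles_db_data:/data -v $(pwd):/backup \
alpine tar xzf /backup/platform-db-backup-YYYYMMDD.tar.gz -C /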
# 5. Restart all platform services
docker compose up -d mvp-platform-vehicles-api mvp-platform-vehicles-db \
mvp-platform-vehicles-redis mvp-platform-vehicles-etl
```
**Validation:**
- [ ] Platform service accessible
- [ ] API endpoints work
- [ ] VIN decode works
- [ ] Hierarchical data loads
**Risk:** Medium-High - requires multi-container restore
---
### Phase 9: Documentation Rollback
**Rollback Trigger:**
- Incorrect documentation
- Missing instructions
- Broken links
- Confusion among team
**Rollback Steps:**
```bash
# 1. Restore all documentation
git checkout HEAD~1 -- README.md CLAUDE.md AI-INDEX.md
git checkout HEAD~1 -- docs/
git checkout HEAD~1 -- .ai/context.json
git checkout HEAD~1 -- Makefile
# Quote the pathspec so git expands it; an unquoted shell glob misses deleted files
git checkout HEAD~1 -- 'backend/src/features/*/README.md'
```
**Validation:**
- [ ] Documentation accurate
- [ ] Examples work
- [ ] Makefile commands work
**Risk:** None - documentation only, no functional impact
---
### Phase 10: Frontend Rollback
**Rollback Trigger:**
- Build errors
- Runtime errors
- UI broken
- API calls failing
**Rollback Steps:**
```bash
# 1. Restore frontend code
git checkout HEAD~1 -- frontend/src/
# 2. Rebuild frontend (the subshell keeps the working directory at the repo root)
(cd frontend && npm install && npm run build)
# 3. Restart frontend container
docker compose restart mvp-frontend
```
**Validation:**
- [ ] Frontend builds successfully
- [ ] UI loads without errors
- [ ] Auth works
- [ ] API calls work
**Risk:** Low - frontend rollback, no data impact
---
### Phase 11: Testing Rollback
**Note:** The testing phase doesn't modify code; it only validates. If tests fail, roll back the appropriate phases based on failure analysis.
---
## Full System Rollback
### Complete Rollback to Pre-Simplification State
**When to Use:**
- Multiple phases failing
- Unrecoverable errors
- Production blocker
- Need to abort entire simplification
**Rollback Steps:**
```bash
# 1. Stop all services
docker compose down
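# 2. Restore entire codebase (the tag created before execution carries a date suffix)
git checkout pre-simplification-YYYYMMDD
# 3. Restore volumes
docker volume rm mvp_postgres_data mvp_redis_data
# docker volume create accepts only one volume name per invocation
for v in admin_postgres_data admin_redis_data admin_minio_data; do
  docker volume create "$v"
done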
docker run --rm -v admin_postgres_data:/data -v $(pwd):/backup \
alpine tar xzf /backup/postgres-backup-YYYYMMDD.tar.gz -C /
docker run --rm -v admin_redis_data:/data -v $(pwd):/backup \
alpine tar xzf /backup/redis-backup-YYYYMMDD.tar.gz -C /
docker run --rm -v admin_minio_data:/data -v $(pwd):/backup \
alpine tar xzf /backup/minio-backup-YYYYMMDD.tar.gz -C /
# 4. Restart all services
docker compose up -d
# 5. Verify original state
docker compose ps # Should show 14 containers
make test
```
**Validation:**
- [ ] All 14 containers running
- [ ] All tests passing
- [ ] Application functional
- [ ] Data intact
**Duration:** 15-30 minutes
**Risk:** Low if backups are good
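Because the full rollback removes and overwrites volumes, it can help to preview the exact restore commands before running them. The sketch below wraps the restore command used throughout this document; `run`, `restore_volume`, and the `DRY_RUN` variable are hypothetical names introduced for illustration.

```bash
# Hypothetical helper: print the command when DRY_RUN=1, otherwise execute it.
run() {
  if [ "${DRY_RUN:-0}" = "1" ]; then
    echo "$*"
  else
    "$@"
  fi
}

# Hypothetical wrapper: create the volume, then unpack an archive from the
# current directory into it (archives store their files under data/).
restore_volume() {
  local volume="$1" archive="$2"
  run docker volume create "$volume" &&
  run docker run --rm -v "$volume:/data" -v "$PWD:/backup" \
    alpine tar xzf "/backup/$archive" -C /
}
```

Setting `DRY_RUN=1` before calling `restore_volume admin_postgres_data postgres-backup-YYYYMMDD.tar.gz` prints the two Docker commands without touching any volume.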
---
## Partial Rollback Scenarios
### Scenario 1: Keep Infrastructure Changes, Rollback Backend
```bash
# Keep Phase 1 (Docker), roll back Phases 2-11
# (use the dated tag created in the pre-execution backup)
git checkout pre-simplification-YYYYMMDD -- backend/ frontend/
docker compose restart mvp-backend mvp-frontend
```
### Scenario 2: Keep Config Cleanup, Rollback Code
```bash
# Keep Phase 4, roll back Phases 1-3, 5-11
# (use the dated tag created in the pre-execution backup)
git checkout pre-simplification-YYYYMMDD -- docker-compose.yml backend/src/ frontend/src/
```
### Scenario 3: Rollback Only Storage
```bash
# Rollback Phase 3 only
git checkout HEAD~1 -- backend/src/core/storage/ backend/src/features/documents/
docker compose up -d admin-minio
```
---
## Rollback Decision Matrix
| Failure Type | Rollback Scope | Risk | Duration |
|--------------|---------------|------|----------|
| Container start fails | Phase 1 | Low | 5 min |
| Build errors | Specific phase | Low | 10 min |
| Test failures | Investigate, partial | Medium | 15-30 min |
| Data corruption | Full + restore | High | 30-60 min |
| Network issues | Phase 5 | Medium | 10 min |
| Platform API down | Phase 8 | Medium | 15 min |
| Critical production bug | Full system | Medium | 30 min |
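If the matrix is consulted from a script, for instance an on-call helper, it could be encoded as a simple lookup. This is only a sketch: the function name `rollback_scope` and the short failure-type keys are assumptions made for illustration, while the scopes are copied from the table above.

```bash
# Hypothetical lookup: map a failure type to the rollback scope listed in
# the decision matrix. The key names are illustrative.
rollback_scope() {
  case "$1" in
    container-start) echo "Phase 1" ;;
    build-error)     echo "Specific phase" ;;
    test-failure)    echo "Investigate, partial" ;;
    data-corruption) echo "Full + restore" ;;
    network)         echo "Phase 5" ;;
    platform-api)    echo "Phase 8" ;;
    production-bug)  echo "Full system" ;;
    *) echo "unknown failure type: $1" >&2; return 1 ;;
  esac
}
```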
---
## Post-Rollback Actions
After any rollback:
1. **Document the Issue:**
```bash
# Create incident report
echo "Rollback performed: $(date)" >> docs/redesign/ROLLBACK-LOG.md
echo "Reason: [description]" >> docs/redesign/ROLLBACK-LOG.md
echo "Phases rolled back: [list]" >> docs/redesign/ROLLBACK-LOG.md
```
2. **Analyze Root Cause:**
- Review logs
- Identify failure point
- Document lessons learned
3. **Plan Fix:**
- Address root cause
- Update phase documentation
- Add validation checks
4. **Retry (if appropriate):**
- Apply fix
- Re-execute phase
- Validate thoroughly
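The three `echo` lines in step 1 can be folded into one function so that every entry has the same shape. A minimal sketch follows, assuming the log path from step 1 as the default; `log_rollback` is a hypothetical name.

```bash
# Hypothetical helper: append one rollback entry to the log file.
# Arguments: reason, phases rolled back, optional log file path.
log_rollback() {
  local reason="$1" phases="$2"
  local logfile="${3:-docs/redesign/ROLLBACK-LOG.md}"
  {
    echo "Rollback performed: $(date)"
    echo "Reason: $reason"
    echo "Phases rolled back: $phases"
    echo # blank line separates entries
  } >> "$logfile"
}
```

For example: `log_rollback "MinIO restore failed" "3"`.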
---
## Emergency Contacts
If the rollback fails or assistance is needed:
- Technical Lead: [contact]
- DevOps Lead: [contact]
- Database Admin: [contact]
- Emergency Hotline: [contact]
---
## Rollback Testing
Before starting simplification, test rollback procedures:
```bash
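# Dry run rollback on a throwaway branch
git checkout -b rollback-test
# Baseline commit (the file must exist in HEAD~1, or the checkout below fails)
echo "original" > test.txt
git add test.txt
git commit -m "test: baseline"
# Make test change
echo "changed" > test.txt
git commit -am "test: change"
# Rollback test
git checkout HEAD~1 -- test.txt
# Verify rollback works
cat test.txt # prints "original"
# Commit the restored file so the branch switch is not blocked, then clean up
git commit -m "test: rollback"
git checkout main
git branch -D rollback-test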
```