motovaultpro/docs/changes/vehicles-dropdown-v1/implementation-checklist.md
Eric Gullickson a052040e3a Initial Commit
2025-09-17 16:09:15 -05:00

# Vehicle ETL Integration - Implementation Checklist
## Overview
This checklist provides step-by-step execution guidance for implementing the Vehicle ETL integration. Each item includes verification steps and dependencies to ensure successful completion.
## Pre-Implementation Requirements
- [ ] **Docker Environment Ready**: Docker and Docker Compose installed and functional
- [ ] **Main Application Running**: MotoVaultPro backend and frontend operational
- [ ] **NHTSA Database Backup**: VPICList backup file available in `vehicle-etl/volumes/mssql/backups/`
- [ ] **Network Ports Available**: Ports 5433 (MVP Platform DB) and 1433 (MSSQL) available
- [ ] **Git Branch Created**: Feature branch created for implementation
- [ ] **Backup Taken**: Complete backup of current working state
---
## Phase 1: Infrastructure Setup
### ✅ Task 1.1: Add MVP Platform Database Service
**Files**: `docker-compose.yml`
- [ ] Add `mvp-platform-database` service definition
- [ ] Configure PostgreSQL 15-alpine image
- [ ] Set database name to `mvp-platform-vehicles`
- [ ] Configure user `mvp_platform_user`
- [ ] Set port mapping to `5433:5432`
- [ ] Add health check configuration
- [ ] Add volume `mvp_platform_data`
**Verification**:
```bash
docker-compose config | grep -A 20 "mvp-platform-database"
```
### ✅ Task 1.2: Add MSSQL Source Database Service
**Files**: `docker-compose.yml`
- [ ] Add `mssql-source` service definition
- [ ] Configure MSSQL Server 2019 image
- [ ] Set SA password from environment variable
- [ ] Configure backup volume mount
- [ ] Add health check with 60s start period
- [ ] Add volume `mssql_source_data`
**Verification**:
```bash
docker-compose config | grep -A 15 "mssql-source"
```
### ✅ Task 1.3: Add ETL Scheduler Service
**Files**: `docker-compose.yml`
- [ ] Add `etl-scheduler` service definition
- [ ] Configure build context to `./vehicle-etl`
- [ ] Set all required environment variables
- [ ] Add dependency on both databases with health checks
- [ ] Configure logs volume mount
- [ ] Add volume `etl_scheduler_data`
**Verification**:
```bash
docker-compose config | grep -A 25 "etl-scheduler"
```
### ✅ Task 1.4: Update Backend Environment Variables
**Files**: `docker-compose.yml`
- [ ] Add `MVP_PLATFORM_DB_HOST` environment variable to backend
- [ ] Add `MVP_PLATFORM_DB_PORT` environment variable
- [ ] Add `MVP_PLATFORM_DB_NAME` environment variable
- [ ] Add `MVP_PLATFORM_DB_USER` environment variable
- [ ] Add `MVP_PLATFORM_DB_PASSWORD` environment variable
- [ ] Add dependency on `mvp-platform-database`
**Verification**:
```bash
docker-compose config | grep -A 10 "MVP_PLATFORM_DB"
```
### ✅ Task 1.5: Update Environment Files
**Files**: `.env.example`, `.env`
- [ ] Add `MVP_PLATFORM_DB_PASSWORD` to .env.example
- [ ] Add `MSSQL_SOURCE_PASSWORD` to .env.example
- [ ] Add ETL configuration variables
- [ ] Update local `.env` file if it exists
**Verification**:
```bash
grep "MVP_PLATFORM_DB_PASSWORD" .env.example
```
### ✅ Phase 1 Validation
- [ ] **Docker Compose Valid**: `docker-compose config` succeeds
- [ ] **Services Start**: `docker-compose up -d mvp-platform-database mssql-source` succeeds
- [ ] **Health Checks Pass**: Both databases show healthy status
- [ ] **Database Connections**: Can connect to both databases
- [ ] **Logs Directory Created**: `./vehicle-etl/logs/` exists
**Critical Check**:
```bash
docker-compose ps | grep -E "(mvp-platform-database|mssql-source)" | grep "healthy"
```
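Before running the health-check commands above it can help to wait for the mapped ports (5433 and 1433) to actually accept connections; a stdlib-only sketch (the helper name is hypothetical):

```python
# Poll a TCP port until it accepts connections or a timeout elapses.
# Useful before the Phase 1 health checks, since MSSQL in particular
# has a 60s start period.
import socket
import time

def wait_for_port(host: str, port: int, timeout: float = 60.0) -> bool:
    """Return True once the port accepts a TCP connection, False on timeout."""
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        try:
            with socket.create_connection((host, port), timeout=2):
                return True
        except OSError:
            time.sleep(1)
    return False
```

For example, `wait_for_port("localhost", 5433)` should return True once the MVP Platform database container is accepting connections.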
---
## Phase 2: Backend Migration
### ✅ Task 2.1: Remove External vPIC Dependencies
**Files**: `backend/src/features/vehicles/external/` (directory)
- [ ] Delete entire `external/vpic/` directory
- [ ] Remove `VPIC_API_URL` from `environment.ts`
- [ ] Add MVP Platform DB configuration to `environment.ts`
**Verification**:
```bash
ls backend/src/features/vehicles/external/ 2>/dev/null || echo "Directory removed ✅"
grep "VPIC_API_URL" backend/src/core/config/environment.ts || echo "VPIC_API_URL removed ✅"
```
### ✅ Task 2.2: Create MVP Platform Database Connection
**Files**: `backend/src/core/config/database.ts`
- [ ] Add `mvpPlatformPool` export
- [ ] Configure connection with MVP Platform DB parameters
- [ ] Set appropriate pool size (10 connections)
- [ ] Configure idle timeout
**Verification**:
```bash
grep "mvpPlatformPool" backend/src/core/config/database.ts
```
### ✅ Task 2.3: Create MVP Platform Repository
**Files**: `backend/src/features/vehicles/data/mvp-platform.repository.ts`
- [ ] Create `MvpPlatformRepository` class
- [ ] Implement `decodeVIN()` method
- [ ] Implement `getMakes()` method
- [ ] Implement `getModelsForMake()` method
- [ ] Implement `getTransmissions()` method
- [ ] Implement `getEngines()` method
- [ ] Implement `getTrims()` method
- [ ] Export singleton instance
**Verification**:
```bash
grep "export class MvpPlatformRepository" backend/src/features/vehicles/data/mvp-platform.repository.ts
```
### ✅ Task 2.4: Create VIN Decoder Service
**Files**: `backend/src/features/vehicles/domain/vin-decoder.service.ts`
- [ ] Create `VinDecoderService` class
- [ ] Implement VIN validation logic
- [ ] Implement cache-first decoding
- [ ] Implement model year extraction from VIN
- [ ] Add comprehensive error handling
- [ ] Export singleton instance
**Verification**:
```bash
grep "export class VinDecoderService" backend/src/features/vehicles/domain/vin-decoder.service.ts
```
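The service itself is TypeScript, but the validation and model-year logic it needs is language-agnostic. A minimal sketch of the standard 17-character VIN rules in Python (function names are hypothetical, not the actual service API):

```python
# Standard VIN rules: 17 characters, no I/O/Q, check digit at position 9,
# model year encoded at position 10. Values and weights are the published
# check-digit transliteration tables.
TRANSLITERATION = {
    **{str(d): d for d in range(10)},
    "A": 1, "B": 2, "C": 3, "D": 4, "E": 5, "F": 6, "G": 7, "H": 8,
    "J": 1, "K": 2, "L": 3, "M": 4, "N": 5, "P": 7, "R": 9,
    "S": 2, "T": 3, "U": 4, "V": 5, "W": 6, "X": 7, "Y": 8, "Z": 9,
}
WEIGHTS = [8, 7, 6, 5, 4, 3, 2, 10, 0, 9, 8, 7, 6, 5, 4, 3, 2]
# Position-10 codes repeat on a 30-year cycle (A=1980... or A=2010...).
YEAR_CODES = "ABCDEFGHJKLMNPRSTVWXY123456789"

def is_valid_vin(vin: str) -> bool:
    """17 allowed characters with a correct check digit at position 9."""
    vin = vin.upper()
    if len(vin) != 17 or any(c not in TRANSLITERATION for c in vin):
        return False
    total = sum(TRANSLITERATION[c] * w for c, w in zip(vin, WEIGHTS))
    check = "X" if total % 11 == 10 else str(total % 11)
    return vin[8] == check

def model_year(vin: str, cycle_start: int = 1980) -> int:
    """Decode position 10; the caller picks the 30-year cycle (1980 or 2010)."""
    return cycle_start + YEAR_CODES.index(vin[9].upper())
```

Disambiguating the cycle (e.g. 2003 vs 2033) typically uses position 7: a letter there implies the 2010+ cycle.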
### ✅ Task 2.5: Update Vehicles Service
**Files**: `backend/src/features/vehicles/domain/vehicles.service.ts`
- [ ] Remove imports for `vpicClient`
- [ ] Add imports for `vinDecoderService` and `mvpPlatformRepository`
- [ ] Replace `vpicClient.decodeVIN()` with `vinDecoderService.decodeVIN()`
- [ ] Add `getDropdownMakes()` method
- [ ] Add `getDropdownModels()` method
- [ ] Add `getDropdownTransmissions()` method
- [ ] Add `getDropdownEngines()` method
- [ ] Add `getDropdownTrims()` method
- [ ] Update cache prefix to `mvp-platform:vehicles`
**Verification**:
```bash
grep "vpicClient" backend/src/features/vehicles/domain/vehicles.service.ts || echo "vpicClient removed ✅"
grep "mvp-platform:vehicles" backend/src/features/vehicles/domain/vehicles.service.ts
```
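The cache-first decoding pattern under the `mvp-platform:vehicles` prefix can be sketched as follows (the real service is TypeScript backed by Redis; the key layout here is an assumption):

```python
# Cache-first VIN decode: consult the cache under the mvp-platform:vehicles
# prefix, fall back to the database, and populate the cache on a miss.
from typing import Callable, Optional

CACHE_PREFIX = "mvp-platform:vehicles"

def decode_with_cache(vin: str, cache: dict,
                      decode_from_db: Callable[[str], Optional[dict]]) -> Optional[dict]:
    """Return the cached decode if present; otherwise query and cache it."""
    key = f"{CACHE_PREFIX}:vin:{vin.upper()}"
    hit = cache.get(key)
    if hit is not None:
        return hit
    result = decode_from_db(vin)
    if result is not None:
        cache[key] = result  # a real implementation would also set a TTL
    return result
```

In production the `dict` would be a Redis client and entries would carry an expiry so catalog refreshes propagate.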
### ✅ Phase 2 Validation
- [ ] **TypeScript Compiles**: `npm run build` succeeds in backend directory
- [ ] **No vPIC References**: `grep -ri "vpic" backend/src/features/vehicles/` returns no results (case-insensitive, to catch `VPIC_API_URL` as well as `vpicClient`)
- [ ] **Database Connection Test**: MVP Platform database accessible from backend
- [ ] **VIN Decoder Test**: VIN decoding service functional
**Critical Check**:
```bash
cd backend && npm run build && echo "Backend compilation successful ✅"
```
---
## Phase 3: API Migration
### ✅ Task 3.1: Update Vehicles Controller
**Files**: `backend/src/features/vehicles/api/vehicles.controller.ts`
- [ ] Remove imports for `vpicClient`
- [ ] Add import for updated `VehiclesService`
- [ ] Update `getDropdownMakes()` method to use MVP Platform
- [ ] Update `getDropdownModels()` method
- [ ] Update `getDropdownTransmissions()` method
- [ ] Update `getDropdownEngines()` method
- [ ] Update `getDropdownTrims()` method
- [ ] Maintain exact response format compatibility
- [ ] Add performance monitoring
- [ ] Add database error handling
**Verification**:
```bash
grep "vehiclesService.getDropdownMakes" backend/src/features/vehicles/api/vehicles.controller.ts
```
### ✅ Task 3.2: Verify Routes Configuration
**Files**: `backend/src/features/vehicles/api/vehicles.routes.ts`
- [ ] Confirm dropdown routes remain unauthenticated
- [ ] Verify no `preHandler: fastify.authenticate` on dropdown routes
- [ ] Ensure CRUD routes still require authentication
**Verification**:
```bash
grep -A 3 "dropdown/makes" backend/src/features/vehicles/api/vehicles.routes.ts | grep "preHandler" || echo "No auth on dropdown routes ✅"
```
### ✅ Task 3.3: Add Health Check Endpoint
**Files**: `vehicles.controller.ts`, `vehicles.routes.ts`
- [ ] Add `healthCheck()` method to controller
- [ ] Add `testMvpPlatformConnection()` method to service
- [ ] Add `/vehicles/health` route (unauthenticated)
- [ ] Test MVP Platform database connectivity
**Verification**:
```bash
grep "healthCheck" backend/src/features/vehicles/api/vehicles.controller.ts
```
### ✅ Phase 3 Validation
- [ ] **API Format Tests**: All dropdown endpoints return correct format
- [ ] **Authentication Tests**: Dropdown endpoints unauthenticated, CRUD authenticated
- [ ] **Performance Tests**: All endpoints respond < 100ms
- [ ] **Health Check**: `/api/vehicles/health` returns healthy status
**Critical Check**:
```bash
curl -s http://localhost:3001/api/vehicles/dropdown/makes | jq '.[0]' | grep "Make_ID"
```
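The format compatibility being tested can be expressed as a small shape check. `Make_ID` appears in the critical check above; `Make_Name` is an assumed companion field from the vPIC-compatible format being preserved:

```python
# Shape check for the /api/vehicles/dropdown/makes response: a non-empty
# list of objects carrying at least Make_ID (int) and Make_Name (str).
def makes_format_ok(payload) -> bool:
    """True if the payload matches the expected dropdown makes shape."""
    if not isinstance(payload, list) or not payload:
        return False
    return all(
        isinstance(item, dict)
        and isinstance(item.get("Make_ID"), int)
        and isinstance(item.get("Make_Name"), str)
        for item in payload
    )
```

A format test would feed this the parsed JSON from each dropdown endpoint and fail on any shape drift.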
---
## Phase 4: Scheduled ETL Implementation
### ✅ Task 4.1: Create ETL Dockerfile
**Files**: `vehicle-etl/docker/Dockerfile.etl`
- [ ] Base on Python 3.11-slim
- [ ] Install cron and system dependencies
- [ ] Install Python requirements
- [ ] Copy ETL source code
- [ ] Set up cron configuration
- [ ] Add health check
- [ ] Configure entrypoint
**Verification**:
```bash
ls vehicle-etl/docker/Dockerfile.etl
```
### ✅ Task 4.2: Create Cron Setup Script
**Files**: `vehicle-etl/docker/setup-cron.sh`
- [ ] Create script with execute permissions
- [ ] Configure cron job from environment variable
- [ ] Set proper file permissions
- [ ] Apply cron job to system
**Verification**:
```bash
test -x vehicle-etl/docker/setup-cron.sh && echo "Executable ✅"
```
### ✅ Task 4.3: Create Container Entrypoint
**Files**: `vehicle-etl/docker/entrypoint.sh`
- [ ] Start cron daemon in background
- [ ] Handle shutdown signals properly
- [ ] Support initial ETL run option
- [ ] Keep container running
**Verification**:
```bash
grep "cron -f" vehicle-etl/docker/entrypoint.sh
```
### ✅ Task 4.4: Update ETL Main Module
**Files**: `vehicle-etl/etl/main.py`
- [ ] Support `build-catalog` command
- [ ] Test all connections before ETL
- [ ] Implement complete ETL pipeline
- [ ] Add comprehensive error handling
- [ ] Write completion markers
**Verification**:
```bash
grep "build-catalog" vehicle-etl/etl/main.py
```
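The dispatch expected of `etl/main.py` can be sketched as below. The function names, marker filename, and status-JSON shape are assumptions, not the actual module:

```python
# Sketch of the main-module flow: test connections first, run the requested
# command, and always write a completion marker for the monitoring script.
import json
import time
from pathlib import Path

def run_command(command: str, *, test_connections, build_catalog,
                marker_dir: Path) -> dict:
    """Run one ETL command and record its outcome in last-run.json."""
    status = {"command": command, "started": time.time(), "ok": False}
    if not test_connections():
        status["error"] = "connection test failed"
    elif command == "build-catalog":
        build_catalog()
        status["ok"] = True
    elif command == "test-connections":
        status["ok"] = True
    else:
        status["error"] = f"unknown command: {command}"
    marker_dir.mkdir(parents=True, exist_ok=True)
    (marker_dir / "last-run.json").write_text(json.dumps(status))
    return status
```

Writing the marker on both success and failure is what lets the Phase 4 monitoring script report status without parsing logs.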
### ✅ Task 4.5: Create Connection Testing Module
**Files**: `vehicle-etl/etl/connections.py`
- [ ] Implement `test_mssql_connection()`
- [ ] Implement `test_postgres_connection()`
- [ ] Implement `test_redis_connection()`
- [ ] Implement `test_connections()` wrapper
- [ ] Add proper error logging
**Verification**:
```bash
grep "def test_connections" vehicle-etl/etl/connections.py
```
### ✅ Task 4.6: Create ETL Monitoring Script
**Files**: `vehicle-etl/scripts/check-etl-status.sh`
- [ ] Check last run status file
- [ ] Report success/failure status
- [ ] Show recent log entries
- [ ] Return appropriate exit codes
**Verification**:
```bash
test -x vehicle-etl/scripts/check-etl-status.sh && echo "Executable ✅"
```
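The script itself is shell, but the status logic it implements can be sketched like this (marker path and JSON shape are assumptions matching the completion markers written by the ETL run):

```python
# Map the last-run marker file to the script's exit codes.
import json
from pathlib import Path

def etl_status(marker: Path) -> int:
    """0 = last run succeeded, 1 = it failed, 2 = never ran / unreadable marker."""
    try:
        status = json.loads(marker.read_text())
    except (OSError, ValueError):
        return 2
    return 0 if status.get("ok") else 1
```

Distinct exit codes let cron-driven alerting distinguish "ETL failed" from "ETL never ran".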
### ✅ Task 4.7: Create Requirements File
**Files**: `vehicle-etl/requirements-etl.txt`
- [ ] Add database connectivity packages
- [ ] Add data processing packages
- [ ] Add logging and monitoring packages
- [ ] Add testing packages
**Verification**:
```bash
grep "pyodbc" vehicle-etl/requirements-etl.txt
```
### ✅ Phase 4 Validation
- [ ] **ETL Container Builds**: `docker-compose build etl-scheduler` succeeds
- [ ] **Connection Tests**: ETL can connect to all databases
- [ ] **Manual ETL Run**: ETL completes successfully
- [ ] **Cron Configuration**: Cron job properly configured
- [ ] **Health Checks**: ETL health monitoring functional
**Critical Check**:
```bash
docker-compose exec etl-scheduler python -m etl.main test-connections
```
---
## Phase 5: Testing & Validation
### ✅ Task 5.1: Run API Functionality Tests
**Script**: `test-api-formats.sh`
- [ ] Test dropdown API response formats
- [ ] Validate data counts and structure
- [ ] Verify error handling
- [ ] Check all endpoint availability
**Verification**: All API format tests pass
### ✅ Task 5.2: Run Authentication Tests
**Script**: `test-authentication.sh`
- [ ] Test dropdown endpoints are unauthenticated
- [ ] Test CRUD endpoints require authentication
- [ ] Verify security model unchanged
**Verification**: All authentication tests pass
### ✅ Task 5.3: Run Performance Tests
**Script**: `test-performance.sh`, `test-cache-performance.sh`
- [ ] Measure response times for all endpoints
- [ ] Verify < 100ms requirement met
- [ ] Test cache performance improvement
- [ ] Validate under load
**Verification**: All performance tests pass
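The measurement behind these tests amounts to timing repeated requests and comparing a high percentile, not the mean, against the 100 ms budget. A sketch (the helper name is hypothetical):

```python
# Time a request callable repeatedly and return the 95th-percentile latency
# in milliseconds, which is what should be held under the 100 ms budget.
import time

def p95_latency_ms(request, samples: int = 50) -> float:
    """Call request() `samples` times and return the p95 latency in ms."""
    timings = []
    for _ in range(samples):
        start = time.perf_counter()
        request()
        timings.append((time.perf_counter() - start) * 1000.0)
    timings.sort()
    return timings[int(0.95 * (len(timings) - 1))]
```

In the real test the callable would wrap an HTTP GET against each dropdown endpoint, e.g. `p95_latency_ms(lambda: fetch("/api/vehicles/dropdown/makes")) < 100`.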
### ✅ Task 5.4: Run Data Accuracy Tests
**Script**: `test-vin-accuracy.sh`, `test-data-completeness.sh`
- [ ] Test VIN decoding accuracy
- [ ] Verify data completeness
- [ ] Check data quality metrics
- [ ] Validate against known test cases
**Verification**: All accuracy tests pass
### ✅ Task 5.5: Run ETL Process Tests
**Script**: `test-etl-execution.sh`, `test-etl-scheduling.sh`
- [ ] Test ETL execution
- [ ] Verify scheduling configuration
- [ ] Check error handling
- [ ] Validate monitoring
**Verification**: All ETL tests pass
### ✅ Task 5.6: Run Error Handling Tests
**Script**: `test-error-handling.sh`
- [ ] Test database unavailability scenarios
- [ ] Verify graceful degradation
- [ ] Test recovery mechanisms
- [ ] Check error responses
**Verification**: All error handling tests pass
### ✅ Task 5.7: Run Load Tests
**Script**: `test-load.sh`
- [ ] Test concurrent request handling
- [ ] Measure performance under load
- [ ] Verify system stability
- [ ] Check resource usage
**Verification**: All load tests pass
### ✅ Task 5.8: Run Security Tests
**Script**: `test-security.sh`
- [ ] Test SQL injection prevention
- [ ] Verify input validation
- [ ] Check authentication bypasses
- [ ] Test parameter tampering
**Verification**: All security tests pass
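What the SQL-injection tests ultimately verify is that user input is bound as a parameter, never interpolated into SQL text. A sketch using sqlite3 for self-containment (the backend applies the same pattern with parameterized Postgres queries):

```python
# Parameterized query: the driver binds `make` as data, so injection payloads
# are matched literally instead of altering the SQL.
import sqlite3

def models_for_make(conn: sqlite3.Connection, make: str) -> list:
    """Bind user input as a parameter; never format it into the SQL string."""
    cur = conn.execute(
        "SELECT model FROM models WHERE make = ? ORDER BY model", (make,)
    )
    return [row[0] for row in cur.fetchall()]
```

A payload like `HONDA' OR '1'='1` simply matches no make, which is exactly the behavior the security tests should assert.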
### ✅ Phase 5 Validation
- [ ] **Master Test Script**: `test-all.sh` passes completely
- [ ] **Zero Breaking Changes**: All existing functionality preserved
- [ ] **Performance Requirements**: < 100ms response times achieved
- [ ] **Data Accuracy**: 99.9%+ VIN decoding accuracy maintained
- [ ] **ETL Reliability**: Weekly ETL process functional
**Critical Check**:
```bash
./test-all.sh && echo "ALL TESTS PASSED ✅"
```
---
## Final Implementation Checklist
### ✅ Pre-Production Validation
- [ ] **All Phases Complete**: Phases 1-5 successfully implemented
- [ ] **All Tests Pass**: Master test script shows 100% pass rate
- [ ] **Documentation Updated**: All documentation reflects current state
- [ ] **Environment Variables**: All required environment variables configured
- [ ] **Backup Validated**: Can restore to pre-implementation state if needed
### ✅ Production Readiness
- [ ] **Monitoring Configured**: ETL success/failure alerting set up
- [ ] **Log Rotation**: Log file rotation configured for ETL processes
- [ ] **Database Maintenance**: MVP Platform database backup scheduled
- [ ] **Performance Baseline**: Response time baselines established
- [ ] **Error Alerting**: API error rate monitoring configured
### ✅ Deployment
- [ ] **Staging Deployment**: Changes deployed and tested in staging
- [ ] **Production Deployment**: Changes deployed to production
- [ ] **Post-Deployment Tests**: All tests pass in production
- [ ] **Performance Monitoring**: Response times within acceptable range
- [ ] **ETL Schedule Active**: First scheduled ETL run successful
### ✅ Post-Deployment
- [ ] **Documentation Complete**: All documentation updated and accurate
- [ ] **Team Handover**: Development team trained on new architecture
- [ ] **Monitoring Active**: All monitoring and alerting operational
- [ ] **Support Runbook**: Troubleshooting procedures documented
- [ ] **MVP Platform Foundation**: Architecture pattern ready for next services
---
## Success Criteria Validation
### ✅ **Zero Breaking Changes**
- [ ] All existing vehicle endpoints work identically
- [ ] Frontend requires no changes
- [ ] User experience unchanged
- [ ] API response formats preserved exactly
### ✅ **Performance Requirements**
- [ ] Dropdown APIs consistently < 100ms
- [ ] VIN decoding < 200ms
- [ ] Cache hit rates > 90%
- [ ] No performance degradation under load
### ✅ **Data Accuracy**
- [ ] VIN decoding accuracy ≥ 99.9%
- [ ] All makes/models/trims available
- [ ] Data completeness maintained
- [ ] No data quality regressions
### ✅ **Reliability Requirements**
- [ ] Weekly ETL completes successfully
- [ ] Error handling and recovery functional
- [ ] Health checks operational
- [ ] Monitoring and alerting active
### ✅ **MVP Platform Foundation**
- [ ] Standardized naming conventions established
- [ ] Service isolation pattern implemented
- [ ] Scheduled processing framework operational
- [ ] Ready for additional platform services
---
## Emergency Rollback Plan
If critical issues arise during implementation:
### ✅ Immediate Rollback Steps
1. **Stop New Services**:
```bash
docker-compose stop mvp-platform-database mssql-source etl-scheduler
```
2. **Restore Backend Code**:
```bash
git checkout HEAD~1 -- backend/src/features/vehicles/
git checkout HEAD~1 -- backend/src/core/config/
```
3. **Restore Docker Configuration**:
```bash
git checkout HEAD~1 -- docker-compose.yml
git checkout HEAD~1 -- .env.example
```
4. **Restart Application**:
```bash
docker-compose restart backend
```
5. **Validate Rollback**:
```bash
curl -s http://localhost:3001/api/vehicles/dropdown/makes | jq '. | length'
```
### ✅ Rollback Validation
- [ ] **External API Working**: vPIC API endpoints functional
- [ ] **All Tests Pass**: Original functionality restored
- [ ] **No Data Loss**: No existing data affected
- [ ] **Performance Restored**: Response times back to baseline
---
## Implementation Notes
### Dependencies Between Phases
- **Phase 2** requires **Phase 1** infrastructure
- **Phase 3** requires **Phase 2** backend changes
- **Phase 4** requires **Phase 1** infrastructure
- **Phase 5** requires **Phases 1-4** complete
### Critical Success Factors
1. **Database Connectivity**: All database connections must be stable
2. **Data Population**: MVP Platform database must have comprehensive data
3. **Performance Optimization**: Database queries must be optimized for speed
4. **Error Handling**: Graceful degradation when services unavailable
5. **Cache Strategy**: Proper caching for performance requirements
### AI Assistant Guidance
This checklist is designed for efficient execution by AI assistants:
- Each task has clear file locations and verification steps
- Dependencies are explicitly stated
- Validation commands provided for each step
- Rollback procedures documented for safety
- Critical checks identified for each phase
**For any implementation questions, refer to the detailed phase documentation in the same directory.**