diff --git a/K8S-OVERVIEW.md b/K8S-OVERVIEW.md new file mode 100644 index 0000000..d43840c --- /dev/null +++ b/K8S-OVERVIEW.md @@ -0,0 +1,308 @@ +# Kubernetes Modernization Plan for MotoVaultPro + +## Executive Summary + +This document provides an overview of the comprehensive plan to modernize MotoVaultPro from a traditional self-hosted application to a cloud-native, highly available system running on Kubernetes. The modernization focuses on transforming the current monolithic ASP.NET Core application into a resilient, scalable platform capable of handling enterprise-level workloads while maintaining the existing feature set and user experience. + +### Key Objectives +- **High Availability**: Eliminate single points of failure through distributed architecture +- **Scalability**: Enable horizontal scaling to handle increased user loads +- **Resilience**: Implement fault tolerance and automatic recovery mechanisms +- **Cloud-Native**: Adopt Kubernetes-native patterns and best practices +- **Operational Excellence**: Improve monitoring, logging, and maintenance capabilities + +### Strategic Benefits +- **Reduced Downtime**: Multi-replica deployments with automatic failover +- **Improved Performance**: Distributed caching and optimized data access patterns +- **Enhanced Security**: Pod-level isolation and secret management +- **Cost Optimization**: Efficient resource utilization through auto-scaling +- **Future-Ready**: Foundation for microservices and advanced cloud features + +## Current Architecture Analysis + +### Existing System Overview +MotoVaultPro is currently deployed as a monolithic ASP.NET Core 8.0 application with the following characteristics: + +#### Application Architecture +- **Monolithic Design**: Single deployable unit containing all functionality +- **MVC Pattern**: Traditional Model-View-Controller architecture +- **Dual Database Support**: LiteDB (embedded) and PostgreSQL (external) +- **File Storage**: Local filesystem for document attachments 
+- **Session Management**: In-memory or cookie-based sessions +- **Configuration**: File-based configuration with environment variables + +#### Identified Limitations for Kubernetes +1. **State Dependencies**: LiteDB and local file storage prevent stateless operation +2. **Configuration Management**: File-based configuration not suitable for container orchestration +3. **Health Monitoring**: Lacks Kubernetes-compatible health check endpoints +4. **Logging**: Basic logging not optimized for centralized log aggregation +5. **Resource Management**: No resource constraints or auto-scaling capabilities +6. **Secret Management**: Sensitive configuration stored in plain text files + +## Target Architecture + +### Cloud-Native Design Principles +The modernized architecture will embrace the following cloud-native principles: + +#### Stateless Application Design +- **External State Storage**: All state moved to external, highly available services +- **Horizontal Scalability**: Multiple application replicas with load balancing +- **Configuration as Code**: All configuration externalized to ConfigMaps and Secrets +- **Ephemeral Containers**: Pods can be created, destroyed, and recreated without data loss + +#### Distributed Data Architecture +- **PostgreSQL Cluster**: Primary/replica configuration with automatic failover +- **MinIO High Availability**: Distributed object storage for file attachments +- **Redis Cluster**: Distributed caching and session storage +- **Backup Strategy**: Automated backups with point-in-time recovery + +#### Observability and Operations +- **Structured Logging**: JSON logging with correlation IDs for distributed tracing +- **Metrics Collection**: Prometheus-compatible metrics for monitoring +- **Health Checks**: Kubernetes-native readiness and liveness probes +- **Distributed Tracing**: OpenTelemetry integration for request flow analysis + +### High-Level Architecture Diagram +``` +┌─────────────────────────────────────────────────────────────────┐ 
+│ Kubernetes Cluster │ +├─────────────────────────────────────────────────────────────────┤ +│ ┌─────────────────┐ ┌─────────────────┐ ┌─────────────────┐ │ +│ │ MotoVault │ │ MotoVault │ │ MotoVault │ │ +│ │ Pod (1) │ │ Pod (2) │ │ Pod (3) │ │ +│ │ │ │ │ │ │ │ +│ └─────────────────┘ └─────────────────┘ └─────────────────┘ │ +│ │ │ │ │ +│ ┌─────────────────────────────────────────────────────────────┐ │ +│ │ Load Balancer Service │ │ +│ └─────────────────────────────────────────────────────────────┘ │ +│ │ │ │ │ +├───────────┼─────────────────────┼─────────────────────┼──────────┤ +│ ┌────────▼──────┐ ┌─────────▼──────┐ ┌─────────▼──────┐ │ +│ │ PostgreSQL │ │ Redis Cluster │ │ MinIO Cluster │ │ +│ │ Primary │ │ (3 nodes) │ │ (4+ nodes) │ │ +│ │ + 2 Replicas │ │ │ │ Erasure Coded │ │ +│ └───────────────┘ └────────────────┘ └────────────────┘ │ +└─────────────────────────────────────────────────────────────────┘ +``` + +## Implementation Phases Overview + +The modernization is structured in four distinct phases, each building upon the previous phase to ensure a smooth and risk-managed transition: + +### [Phase 1: Core Kubernetes Readiness](K8S-PHASE-1.md) (Weeks 1-4) + +**Objective**: Make the application compatible with Kubernetes deployment patterns. + +**Key Deliverables**: +- Configuration externalization to ConfigMaps and Secrets +- Removal of LiteDB dependencies +- PostgreSQL connection pooling optimization +- Kubernetes health check endpoints +- Structured logging implementation + +**Success Criteria**: +- Application starts using only environment variables +- Health checks return appropriate status codes +- Database migrations work seamlessly +- Structured JSON logging operational + +### [Phase 2: High Availability Infrastructure](K8S-PHASE-2.md) (Weeks 5-8) + +**Objective**: Deploy highly available supporting infrastructure. 
+ +**Key Deliverables**: +- MinIO distributed object storage cluster +- File storage abstraction layer +- PostgreSQL HA cluster with automated failover +- Redis cluster for distributed sessions and caching +- Comprehensive monitoring setup + +**Success Criteria**: +- MinIO cluster operational with erasure coding +- PostgreSQL cluster with automatic failover +- Redis cluster providing distributed sessions +- All file operations using object storage +- Infrastructure monitoring and alerting active + +### [Phase 3: Production Deployment](K8S-PHASE-3.md) (Weeks 9-12) + +**Objective**: Deploy to production with security, monitoring, and backup strategies. + +**Key Deliverables**: +- Production Kubernetes manifests with HPA +- Secure ingress with automated TLS certificates +- Comprehensive application and infrastructure monitoring +- Automated backup and disaster recovery procedures +- Migration tools and procedures + +**Success Criteria**: +- Production deployment with 99.9% availability target +- Secure external access with TLS +- Monitoring dashboards and alerting operational +- Backup and recovery procedures validated +- Migration dry runs successful + +### [Phase 4: Advanced Features and Optimization](K8S-PHASE-4.md) (Weeks 13-16) + +**Objective**: Implement advanced features and optimize for scale and performance. + +**Key Deliverables**: +- Multi-layer caching (Memory, Redis, CDN) +- Advanced performance optimizations +- Enhanced security features and compliance +- Production migration execution +- Operational excellence and automation + +**Success Criteria**: +- Multi-layer caching reducing database load by 70% +- 95th percentile response time under 500ms +- Zero-downtime production migration completed +- Advanced security policies implemented +- Team trained on new operational procedures + +## Migration Strategy + +### Pre-Migration Assessment +1. **Data Inventory**: Catalog all existing data, configurations, and file attachments +2. 
**Dependency Mapping**: Identify all external dependencies and integrations +3. **Performance Baseline**: Establish current performance metrics for comparison +4. **User Impact Assessment**: Analyze potential downtime and user experience changes + +### Migration Execution Plan + +#### Blue-Green Deployment Strategy +- Parallel environment setup to minimize risk +- Gradual traffic migration with automated rollback +- Comprehensive validation at each step +- Minimal downtime through DNS cutover + +#### Data Migration Approach +- Initial bulk data migration during low-usage periods +- Incremental synchronization during cutover +- Automated validation and integrity checks +- Point-in-time recovery capabilities + +## Risk Assessment and Mitigation + +### High Impact Risks + +**Data Loss or Corruption** +- **Probability**: Low | **Impact**: Critical +- **Mitigation**: Multiple backup strategies, parallel systems, automated validation + +**Extended Downtime During Migration** +- **Probability**: Medium | **Impact**: High +- **Mitigation**: Blue-green deployment, comprehensive rollback procedures + +**Performance Degradation** +- **Probability**: Medium | **Impact**: Medium +- **Mitigation**: Load testing, performance monitoring, auto-scaling + +### Mitigation Strategies +- Comprehensive testing at each phase +- Automated rollback procedures +- Parallel running systems during transition +- 24/7 monitoring during critical periods + +## Success Metrics + +### Technical Success Criteria +- **Availability**: 99.9% uptime (≤ 8.76 hours downtime/year) +- **Performance**: 95th percentile response time < 500ms +- **Scalability**: Handle 10x current user load +- **Recovery**: RTO < 1 hour, RPO < 15 minutes + +### Operational Success Criteria +- **Deployment Frequency**: Weekly deployments with zero downtime +- **Mean Time to Recovery**: < 30 minutes for critical issues +- **Change Failure Rate**: < 5% of deployments require rollback +- **Monitoring Coverage**: 100% of critical 
services monitored + +### Business Success Criteria +- **User Satisfaction**: No degradation in user experience +- **Cost Efficiency**: Infrastructure costs within 20% of current spending +- **Maintenance Overhead**: 50% reduction in operational maintenance time +- **Future Readiness**: Foundation for advanced features and scaling + +## Implementation Timeline + +### 16-Week Detailed Schedule + +**Weeks 1-4**: [Phase 1 - Core Kubernetes Readiness](K8S-PHASE-1.md) +- Application configuration externalization +- Database architecture modernization +- Health checks and logging implementation + +**Weeks 5-8**: [Phase 2 - High Availability Infrastructure](K8S-PHASE-2.md) +- MinIO and PostgreSQL HA deployment +- File storage abstraction +- Redis cluster implementation + +**Weeks 9-12**: [Phase 3 - Production Deployment](K8S-PHASE-3.md) +- Production Kubernetes deployment +- Security and monitoring implementation +- Backup and recovery procedures + +**Weeks 13-16**: [Phase 4 - Advanced Features](K8S-PHASE-4.md) +- Performance optimization +- Security enhancements +- Production migration execution + +## Team Requirements + +### Skills and Training +- **Kubernetes Administration**: Container orchestration and cluster management +- **Cloud-Native Development**: Microservices patterns and distributed systems +- **Monitoring and Observability**: Prometheus, Grafana, and logging systems +- **Security**: Container security, network policies, and secret management + +### Operational Procedures +- **Deployment Automation**: CI/CD pipelines and GitOps workflows +- **Incident Response**: Monitoring, alerting, and escalation procedures +- **Backup and Recovery**: Automated backup validation and recovery testing +- **Performance Management**: Capacity planning and scaling procedures + +## Getting Started + +### Prerequisites +- Kubernetes cluster (development/staging/production) +- Container registry for Docker images +- Persistent storage classes +- Network policies and ingress 
controller +- Monitoring infrastructure (Prometheus/Grafana) + +### Phase 1 Quick Start +1. Review [Phase 1 implementation guide](K8S-PHASE-1.md) +2. Set up development Kubernetes environment +3. Create ConfigMap and Secret templates +4. Begin application configuration externalization +5. Remove LiteDB dependencies + +### Next Steps +After completing Phase 1, proceed with: +- [Phase 2: High Availability Infrastructure](K8S-PHASE-2.md) +- [Phase 3: Production Deployment](K8S-PHASE-3.md) +- [Phase 4: Advanced Features and Optimization](K8S-PHASE-4.md) + +## Support and Documentation + +### Additional Resources +- **Architecture Documentation**: See [docs/architecture.md](docs/architecture.md) +- **Development Guidelines**: Follow existing code conventions and patterns +- **Testing Strategy**: Comprehensive testing at each phase +- **Security Guidelines**: Container and Kubernetes security best practices + +### Team Contacts +- **Project Lead**: Kubernetes modernization coordination +- **DevOps Team**: Infrastructure and deployment automation +- **Security Team**: Security policies and compliance validation +- **QA Team**: Testing and validation procedures + +--- + +**Document Version**: 1.0 +**Last Updated**: January 2025 +**Status**: Implementation Ready + +This comprehensive modernization plan provides a structured approach to transforming MotoVaultPro into a cloud-native, highly available application running on Kubernetes. Each phase builds upon the previous one, ensuring minimal risk while delivering maximum benefits for future growth and reliability. 
\ No newline at end of file diff --git a/K8S-PHASE-1-DETAILED.md b/K8S-PHASE-1-DETAILED.md new file mode 100644 index 0000000..6025cf2 --- /dev/null +++ b/K8S-PHASE-1-DETAILED.md @@ -0,0 +1,3416 @@ +# Phase 1: Core Kubernetes Readiness - Detailed Implementation Plan + +## Executive Summary + +This document provides a comprehensive, step-by-step implementation plan for Phase 1 that ensures minimal risk through incremental changes, thorough testing, and debugging at each step. Each change is isolated, tested, and verified before proceeding to the next step. + +## Improved Implementation Strategy + +### Key Principles +1. **One Change at a Time**: Each step focuses on a single, well-defined change +2. **Non-Destructive First**: Start with safest changes that don't affect data or core functionality +3. **Comprehensive Testing**: Automated and manual validation at each step +4. **Rollback Ready**: Every change includes a rollback procedure +5. **Debugging First**: Extensive debugging and diagnostic capabilities before making changes +6. **Continuous Validation**: Performance and functionality validation throughout + +### Risk Mitigation Improvements +- **Comprehensive Logging**: Extensive structured logging for troubleshooting +- **Feature Flags**: Enable/disable new functionality without code changes +- **Automated Testing**: Comprehensive test suite validation at each step +- **Performance Monitoring**: Baseline and continuous performance validation +- **User Experience**: Functional testing ensures no user-facing regressions + +## Step-by-Step Implementation Plan + +### Step 1: Structured Logging Implementation +**Duration**: 2-3 days +**Risk Level**: Low +**Rollback Complexity**: Simple + +#### Objective +Implement structured JSON logging to improve observability during subsequent changes. 
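+To make the target concrete, here is an illustrative example of the kind of JSON line the `AddJsonConsole` formatter emits once scopes are enabled (field layout and values here are illustrative, not the exact formatter output):
+
+```json
+{
+  "Timestamp": "2025-01-15T10:23:45.123Z",
+  "LogLevel": "Information",
+  "Category": "MotoVaultPro.Controllers.VehicleController",
+  "Message": "Vehicle list requested",
+  "Scopes": [
+    {
+      "CorrelationId": "3f2c9a1e-8b4d-4c6a-9f0e-2d7b5a1c8e44",
+      "RequestPath": "/vehicles",
+      "RequestMethod": "GET"
+    }
+  ]
+}
+```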
+
+#### Implementation
+
+```csharp
+// 1.1: Add logging configuration (Program.cs)
+builder.Services.AddLogging(loggingBuilder =>
+{
+    loggingBuilder.ClearProviders();
+
+    if (builder.Environment.IsDevelopment())
+    {
+        loggingBuilder.AddConsole();
+    }
+    else
+    {
+        loggingBuilder.AddJsonConsole(options =>
+        {
+            options.IncludeScopes = true;
+            options.TimestampFormat = "yyyy-MM-ddTHH:mm:ss.fffZ";
+            options.JsonWriterOptions = new JsonWriterOptions { Indented = false };
+        });
+    }
+});
+
+// 1.2: Add correlation ID service
+public class CorrelationIdService
+{
+    public string CorrelationId { get; } = Guid.NewGuid().ToString();
+}
+
+// 1.3: Add correlation ID middleware
+public class CorrelationIdMiddleware
+{
+    private readonly RequestDelegate _next;
+    private readonly ILogger<CorrelationIdMiddleware> _logger;
+
+    public CorrelationIdMiddleware(RequestDelegate next, ILogger<CorrelationIdMiddleware> logger)
+    {
+        _next = next;
+        _logger = logger;
+    }
+
+    public async Task InvokeAsync(HttpContext context)
+    {
+        var correlationId = context.Request.Headers["X-Correlation-ID"]
+            .FirstOrDefault() ?? Guid.NewGuid().ToString();
+
+        context.Items["CorrelationId"] = correlationId;
+        context.Response.Headers["X-Correlation-ID"] = correlationId;
+
+        using var scope = _logger.BeginScope(new Dictionary<string, object>
+        {
+            ["CorrelationId"] = correlationId,
+            ["RequestPath"] = context.Request.Path,
+            ["RequestMethod"] = context.Request.Method
+        });
+
+        await _next(context);
+    }
+}
+```
+
+#### Testing Plan
+
+**Automated Tests**:
+```csharp
+[Test]
+public void StructuredLogging_ProducesValidJson()
+{
+    // Arrange
+    var logOutput = new StringWriter();
+    var logger = CreateTestLogger(logOutput);
+
+    // Act
+    logger.LogInformation("Test message {TestProperty}", "TestValue");
+
+    // Assert
+    var logEntry = JsonSerializer.Deserialize<LogEntry>(logOutput.ToString());
+    Assert.IsNotNull(logEntry.Timestamp);
+    Assert.AreEqual("Information", logEntry.Level);
+    StringAssert.Contains("Test message", logEntry.Message);
+}
+
+[Test]
+public async Task CorrelationId_PreservedAcrossRequests()
+{
+    // Test that the correlation ID flows through the request pipeline
+    var client = _factory.CreateClient();
+    var correlationId = Guid.NewGuid().ToString();
+
+    client.DefaultRequestHeaders.Add("X-Correlation-ID", correlationId);
+    var response = await client.GetAsync("/health");
+
+    Assert.AreEqual(correlationId, response.Headers.GetValues("X-Correlation-ID").First());
+}
+```
+
+**Manual Validation**:
+1. Start the application and verify the JSON log format in the console
+2. Make HTTP requests and verify correlation IDs in logs
+3. Check log aggregation works with external tools
+4. Verify existing functionality is unchanged
+
+**Success Criteria**:
+- [ ] All logs output in structured JSON format
+- [ ] Correlation IDs generated and preserved
+- [ ] No existing functionality affected
+- [ ] Performance impact < 5ms per request
+
+**Rollback Procedure**:
+```bash
+# Revert logging configuration
+git checkout HEAD~1 -- Program.cs
+# Remove middleware registration
+# Restart application
+```
+
+---
+
+### Step 2: Health Check Infrastructure
+**Duration**: 2-3 days
+**Risk Level**: Low
+**Rollback Complexity**: Simple
+
+#### Objective
+Implement comprehensive health check endpoints for Kubernetes readiness and liveness probes.
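+On the Kubernetes side, these endpoints are what the Deployment's probes point at. A hypothetical probe stanza (container port and timings are assumptions):
+
+```yaml
+# Hypothetical Deployment probe configuration (port/timings are assumptions)
+livenessProbe:
+  httpGet:
+    path: /health/live
+    port: 8080
+  initialDelaySeconds: 10
+  periodSeconds: 10
+readinessProbe:
+  httpGet:
+    path: /health/ready
+    port: 8080
+  periodSeconds: 5
+  failureThreshold: 3
+```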
+
+#### Implementation
+
+```csharp
+// 2.1: Add health check services
+public class DatabaseHealthCheck : IHealthCheck
+{
+    private readonly IConfiguration _configuration;
+    private readonly ILogger<DatabaseHealthCheck> _logger;
+
+    public DatabaseHealthCheck(IConfiguration configuration, ILogger<DatabaseHealthCheck> logger)
+    {
+        _configuration = configuration;
+        _logger = logger;
+    }
+
+    public async Task<HealthCheckResult> CheckHealthAsync(
+        HealthCheckContext context,
+        CancellationToken cancellationToken = default)
+    {
+        try
+        {
+            var connectionString = _configuration.GetConnectionString("DefaultConnection");
+
+            if (connectionString?.Contains("LiteDB") == true)
+            {
+                return await CheckLiteDBHealthAsync(connectionString);
+            }
+            else if (!string.IsNullOrEmpty(connectionString))
+            {
+                return await CheckPostgreSQLHealthAsync(connectionString, cancellationToken);
+            }
+
+            return HealthCheckResult.Unhealthy("No database configuration found");
+        }
+        catch (Exception ex)
+        {
+            _logger.LogError(ex, "Database health check failed");
+            return HealthCheckResult.Unhealthy("Database health check failed", ex);
+        }
+    }
+
+    private async Task<HealthCheckResult> CheckPostgreSQLHealthAsync(
+        string connectionString,
+        CancellationToken cancellationToken)
+    {
+        using var connection = new NpgsqlConnection(connectionString);
+        await connection.OpenAsync(cancellationToken);
+
+        using var command = new NpgsqlCommand("SELECT 1", connection);
+        var result = await command.ExecuteScalarAsync(cancellationToken);
+
+        return HealthCheckResult.Healthy($"PostgreSQL connection successful. Result: {result}");
+    }
+
+    private Task<HealthCheckResult> CheckLiteDBHealthAsync(string connectionString)
+    {
+        try
+        {
+            using var db = new LiteDatabase(connectionString);
+            var collections = db.GetCollectionNames().ToList();
+            return Task.FromResult(
+                HealthCheckResult.Healthy($"LiteDB connection successful. Collections: {collections.Count}"));
+        }
+        catch (Exception ex)
+        {
+            return Task.FromResult(HealthCheckResult.Unhealthy("LiteDB connection failed", ex));
+        }
+    }
+}
+
+// 2.2: Add application health check
+public class ApplicationHealthCheck : IHealthCheck
+{
+    private readonly IServiceProvider _serviceProvider;
+
+    public ApplicationHealthCheck(IServiceProvider serviceProvider)
+    {
+        _serviceProvider = serviceProvider;
+    }
+
+    public Task<HealthCheckResult> CheckHealthAsync(
+        HealthCheckContext context,
+        CancellationToken cancellationToken = default)
+    {
+        try
+        {
+            // Verify essential services are available
+            var vehicleLogic = _serviceProvider.GetService<IVehicleLogic>();
+            var userLogic = _serviceProvider.GetService<IUserLogic>();
+
+            if (vehicleLogic == null || userLogic == null)
+            {
+                return Task.FromResult(HealthCheckResult.Unhealthy("Essential services not available"));
+            }
+
+            return Task.FromResult(HealthCheckResult.Healthy("Application services available"));
+        }
+        catch (Exception ex)
+        {
+            return Task.FromResult(HealthCheckResult.Unhealthy("Application health check failed", ex));
+        }
+    }
+}
+
+// 2.3: Configure health checks in Program.cs
+builder.Services.AddHealthChecks()
+    .AddCheck<DatabaseHealthCheck>("database", tags: new[] { "ready", "db" })
+    .AddCheck<ApplicationHealthCheck>("application", tags: new[] { "ready", "app" });
+
+// 2.4: Add health check endpoints
+app.MapHealthChecks("/health/live", new HealthCheckOptions
+{
+    Predicate = _ => false, // Only checks if the app is running
+    ResponseWriter = async (context, report) =>
+    {
+        context.Response.ContentType = "application/json";
+        var response = new
+        {
+            status = "Healthy",
+            timestamp = DateTime.UtcNow,
+            uptime = DateTime.UtcNow - Process.GetCurrentProcess().StartTime.ToUniversalTime()
+        };
+        await context.Response.WriteAsync(JsonSerializer.Serialize(response));
+    }
+});
+
+app.MapHealthChecks("/health/ready", new HealthCheckOptions
+{
+    Predicate = check => check.Tags.Contains("ready"),
+    ResponseWriter = async (context, report) =>
+    {
+        context.Response.ContentType = "application/json";
+        var response = new
+        {
+            status = report.Status.ToString(),
+            timestamp = DateTime.UtcNow,
+            checks = report.Entries.Select(x => new
+            {
+                name = x.Key,
+                status = x.Value.Status.ToString(),
+                description = x.Value.Description,
+                duration = x.Value.Duration.TotalMilliseconds
+            })
+        };
+        await context.Response.WriteAsync(JsonSerializer.Serialize(response));
+    }
+});
+```
+
+#### Testing Plan
+
+**Automated Tests**:
+```csharp
+[Test]
+public async Task HealthCheck_Live_ReturnsHealthy()
+{
+    var client = _factory.CreateClient();
+    var response = await client.GetAsync("/health/live");
+
+    Assert.AreEqual(HttpStatusCode.OK, response.StatusCode);
+
+    var content = await response.Content.ReadAsStringAsync();
+    var healthResponse = JsonSerializer.Deserialize<HealthResponse>(content);
+
+    Assert.AreEqual("Healthy", healthResponse.Status);
+    Assert.IsTrue(healthResponse.Uptime > TimeSpan.Zero);
+}
+
+[Test]
+public async Task HealthCheck_Ready_ValidatesAllServices()
+{
+    var client = _factory.CreateClient();
+    var response = await client.GetAsync("/health/ready");
+
+    Assert.AreEqual(HttpStatusCode.OK, response.StatusCode);
+
+    var content = await response.Content.ReadAsStringAsync();
+    var healthResponse = JsonSerializer.Deserialize<HealthResponse>(content);
+
+    Assert.IsTrue(healthResponse.Checks.Any(c => c.Name == "database"));
+    Assert.IsTrue(healthResponse.Checks.Any(c => c.Name == "application"));
+}
+
+[Test]
+public async Task HealthCheck_DatabaseFailure_ReturnsUnhealthy()
+{
+    // Test with an invalid connection string
+    var factory = CreateTestFactory(invalidConnectionString: true);
+    var client = factory.CreateClient();
+
+    var response = await client.GetAsync("/health/ready");
+    Assert.AreEqual(HttpStatusCode.ServiceUnavailable, response.StatusCode);
+}
+```
+
+**Manual Validation**:
+1. Verify `/health/live` returns 200 OK with a JSON response
+2. Verify `/health/ready` returns detailed health information
+3. Test with the database disconnected - should return 503
+4. Verify health checks work with both LiteDB and PostgreSQL
+5. Test health check performance (< 100ms response time)
+
+**Success Criteria**:
+- [ ] Health endpoints return appropriate HTTP status codes
+- [ ] JSON responses contain required health information
+- [ ] Database connectivity properly validated
+- [ ] Health checks complete within 100ms
+- [ ] Unhealthy conditions properly detected
+
+---
+
+### Step 3: Configuration Framework Enhancement
+**Duration**: 2-3 days
+**Risk Level**: Low
+**Rollback Complexity**: Simple
+
+#### Objective
+Enhance the configuration framework to support both file-based and environment variable configuration.
+
+#### Implementation
+
+```csharp
+// 3.1: Create configuration validation service
+public class ConfigurationValidationService
+{
+    private readonly IConfiguration _configuration;
+    private readonly ILogger<ConfigurationValidationService> _logger;
+
+    public ConfigurationValidationService(
+        IConfiguration configuration,
+        ILogger<ConfigurationValidationService> logger)
+    {
+        _configuration = configuration;
+        _logger = logger;
+    }
+
+    public ValidationResult ValidateConfiguration()
+    {
+        var result = new ValidationResult();
+
+        // Database configuration
+        ValidateDatabaseConfiguration(result);
+
+        // Application configuration
+        ValidateApplicationConfiguration(result);
+
+        // External service configuration
+        ValidateExternalServiceConfiguration(result);
+
+        return result;
+    }
+
+    private void ValidateDatabaseConfiguration(ValidationResult result)
+    {
+        var postgresConnection = _configuration.GetConnectionString("DefaultConnection");
+        var liteDbPath = _configuration["LiteDB:DatabasePath"];
+
+        if (string.IsNullOrEmpty(postgresConnection) && string.IsNullOrEmpty(liteDbPath))
+        {
+            result.AddError("Database", "No database configuration found");
+        }
+
+        if (!string.IsNullOrEmpty(postgresConnection))
+        {
+            try
+            {
+                var builder = new NpgsqlConnectionStringBuilder(postgresConnection);
+                if (string.IsNullOrEmpty(builder.Database))
+                {
+                    result.AddWarning("Database", "PostgreSQL database name not specified");
+                }
+            }
+            catch (Exception ex)
+            {
+                result.AddError("Database", $"Invalid PostgreSQL connection string: {ex.Message}");
+            }
+        }
+    }
+
+    private void ValidateApplicationConfiguration(ValidationResult result)
+    {
+        var appName = _configuration["App:Name"];
+        if (string.IsNullOrEmpty(appName))
+        {
+            result.AddWarning("Application", "Application name not configured");
+        }
+
+        var logLevel = _configuration["Logging:LogLevel:Default"];
+        if (!IsValidLogLevel(logLevel))
+        {
+            result.AddWarning("Logging", $"Invalid log level: {logLevel}");
+        }
+    }
+
+    private static bool IsValidLogLevel(string logLevel) =>
+        Enum.TryParse<LogLevel>(logLevel, ignoreCase: true, out _);
+
+    private void ValidateExternalServiceConfiguration(ValidationResult result)
+    {
+        // Validate email configuration if enabled
+        var emailEnabled = _configuration.GetValue<bool>("Email:Enabled");
+        if (emailEnabled)
+        {
+            var smtpServer = _configuration["Email:SmtpServer"];
+            if (string.IsNullOrEmpty(smtpServer))
+            {
+                result.AddError("Email", "SMTP server required when email is enabled");
+            }
+        }
+
+        // Validate OIDC configuration if enabled
+        var oidcEnabled = _configuration.GetValue<bool>("Authentication:OpenIDConnect:Enabled");
+        if (oidcEnabled)
+        {
+            var authority = _configuration["Authentication:OpenIDConnect:Authority"];
+            var clientId = _configuration["Authentication:OpenIDConnect:ClientId"];
+
+            if (string.IsNullOrEmpty(authority) || string.IsNullOrEmpty(clientId))
+            {
+                result.AddError("OIDC", "Authority and ClientId required for OpenID Connect");
+            }
+        }
+    }
+}
+
+// 3.2: Create startup configuration validation
+public class StartupConfigurationValidator : IHostedService
+{
+    private readonly ConfigurationValidationService _validator;
+    private readonly ILogger<StartupConfigurationValidator> _logger;
+    private readonly IHostApplicationLifetime _lifetime;
+
+    public StartupConfigurationValidator(
+        ConfigurationValidationService validator,
+        ILogger<StartupConfigurationValidator> logger,
+        IHostApplicationLifetime lifetime)
+    {
+        _validator = validator;
+        _logger = logger;
+        _lifetime = lifetime;
+    }
+
+    public Task StartAsync(CancellationToken cancellationToken)
+    {
+        _logger.LogInformation("Validating application configuration...");
+
+        var result = _validator.ValidateConfiguration();
+
+        foreach (var warning in result.Warnings)
+        {
+            _logger.LogWarning("Configuration warning: {Category}: {Message}",
+                warning.Category, warning.Message);
+        }
+
+        if (result.HasErrors)
+        {
+            foreach (var error in result.Errors)
+            {
+                _logger.LogError("Configuration error: {Category}: {Message}",
+                    error.Category, error.Message);
+            }
+
+            _logger.LogCritical("Application startup failed due to configuration errors");
+            _lifetime.StopApplication();
+            return Task.CompletedTask;
+        }
+
+        _logger.LogInformation("Configuration validation completed successfully");
+        return Task.CompletedTask;
+    }
+
+    public Task StopAsync(CancellationToken cancellationToken) => Task.CompletedTask;
+}
+
+// 3.3: Enhanced configuration builder
+public static class ConfigurationBuilderExtensions
+{
+    public static IConfigurationBuilder AddMotoVaultConfiguration(
+        this IConfigurationBuilder builder,
+        IWebHostEnvironment environment)
+    {
+        // Base configuration files
+        builder.AddJsonFile("appsettings.json", optional: false, reloadOnChange: true);
+        builder.AddJsonFile($"appsettings.{environment.EnvironmentName}.json",
+            optional: true, reloadOnChange: true);
+
+        // Environment variables with prefix
+        builder.AddEnvironmentVariables("MOTOVAULT_");
+
+        // Standard environment variables for compatibility
+        builder.AddEnvironmentVariables();
+
+        return builder;
+    }
+}
+
+// 3.4: Register services
+builder.Services.AddSingleton<ConfigurationValidationService>();
+builder.Services.AddHostedService<StartupConfigurationValidator>();
+```
+
+#### Testing Plan
+
+**Automated Tests**:
+```csharp
+[Test]
+public void ConfigurationValidation_ValidConfiguration_ReturnsSuccess()
+{
+    var configuration = CreateTestConfiguration(new Dictionary<string, string>
+    {
+        ["ConnectionStrings:DefaultConnection"] = "Host=localhost;Database=test;Username=test;Password=test",
+        ["App:Name"] = "MotoVaultPro",
+        ["Logging:LogLevel:Default"] = "Information"
+    });
+
+    var validator = new ConfigurationValidationService(
+        configuration, Mock.Of<ILogger<ConfigurationValidationService>>());
+    var result = validator.ValidateConfiguration();
+
+    Assert.IsFalse(result.HasErrors);
+    Assert.AreEqual(0, result.Errors.Count);
+}
+
+[Test]
+public void ConfigurationValidation_MissingDatabase_ReturnsError()
+{
+    var configuration = CreateTestConfiguration(new Dictionary<string, string>
+    {
+        ["App:Name"] = "MotoVaultPro"
+    });
+
+    var validator = new ConfigurationValidationService(
+        configuration, Mock.Of<ILogger<ConfigurationValidationService>>());
+    var result = validator.ValidateConfiguration();
+
+    Assert.IsTrue(result.HasErrors);
+    Assert.IsTrue(result.Errors.Any(e => e.Category == "Database"));
+}
+
+[Test]
+public async Task StartupValidator_InvalidConfiguration_StopsApplication()
+{
+    var mockLifetime = new Mock<IHostApplicationLifetime>();
+    var validator = CreateStartupValidator(invalidConfig: true, mockLifetime.Object);
+
+    await validator.StartAsync(CancellationToken.None);
+
+    mockLifetime.Verify(x => x.StopApplication(), Times.Once);
+}
+```
+
+**Manual Validation**:
+1. Start the application with a valid configuration - should start normally
+2. Start with missing database configuration - should fail with a clear error
+3. Start with an invalid PostgreSQL connection string - should fail
+4. Test environment variable override of JSON configuration
+5. Verify configuration warnings are logged but don't stop startup
+
+**Success Criteria**:
+- [ ] Configuration validation runs at startup
+- [ ] Invalid configuration prevents application startup
+- [ ] Clear error messages for configuration issues
+- [ ] Environment variables properly override JSON settings
+- [ ] Existing functionality unchanged
+
+---
+
+### Step 4: Configuration Externalization
+**Duration**: 3-4 days
+**Risk Level**: Medium
+**Rollback Complexity**: Moderate
+
+#### Objective
+Externalize configuration to support Kubernetes ConfigMaps and Secrets while maintaining compatibility.
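+For context, the file-per-key mounts this step reads from `/var/config` and `/var/secrets` would be populated by manifests along these lines (resource names and values are illustrative):
+
+```yaml
+apiVersion: v1
+kind: ConfigMap
+metadata:
+  name: motovault-config        # hypothetical name
+data:
+  APP_NAME: "MotoVaultPro"
+  LOG_LEVEL: "Information"
+---
+apiVersion: v1
+kind: Secret
+metadata:
+  name: motovault-secrets       # hypothetical name
+type: Opaque
+stringData:
+  POSTGRES_CONNECTION: "Host=postgres;Database=motovault;Username=app;Password=changeme"
+```
+
+When mounted as volumes, each key becomes a file under the mount path (e.g. `/var/config/APP_NAME`), which is what the extension methods in this step enumerate.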
+ +#### Implementation + +```csharp +// 4.1: Create Kubernetes configuration extensions +public static class KubernetesConfigurationExtensions +{ + public static IConfigurationBuilder AddKubernetesConfiguration( + this IConfigurationBuilder builder) + { + // Check if running in Kubernetes + var kubernetesServiceHost = Environment.GetEnvironmentVariable("KUBERNETES_SERVICE_HOST"); + if (!string.IsNullOrEmpty(kubernetesServiceHost)) + { + builder.AddKubernetesSecrets(); + builder.AddKubernetesConfigMaps(); + } + + return builder; + } + + private static IConfigurationBuilder AddKubernetesSecrets(this IConfigurationBuilder builder) + { + var secretsPath = "/var/secrets"; + if (Directory.Exists(secretsPath)) + { + foreach (var secretFile in Directory.GetFiles(secretsPath)) + { + var key = Path.GetFileName(secretFile); + var value = File.ReadAllText(secretFile); + builder.AddInMemoryCollection(new[] { new KeyValuePair(key, value) }); + } + } + return builder; + } + + private static IConfigurationBuilder AddKubernetesConfigMaps(this IConfigurationBuilder builder) + { + var configPath = "/var/config"; + if (Directory.Exists(configPath)) + { + foreach (var configFile in Directory.GetFiles(configPath)) + { + var key = Path.GetFileName(configFile); + var value = File.ReadAllText(configFile); + builder.AddInMemoryCollection(new[] { new KeyValuePair(key, value) }); + } + } + return builder; + } +} + +// 4.2: Create configuration mapping service +public class ConfigurationMappingService +{ + private readonly IConfiguration _configuration; + + public DatabaseConfiguration GetDatabaseConfiguration() + { + return new DatabaseConfiguration + { + PostgreSQLConnectionString = GetConnectionString("POSTGRES_CONNECTION", "ConnectionStrings:DefaultConnection"), + LiteDBPath = GetConfigValue("LITEDB_PATH", "LiteDB:DatabasePath"), + CommandTimeout = GetConfigValue("DB_COMMAND_TIMEOUT", "Database:CommandTimeout", 30), + MaxPoolSize = GetConfigValue("DB_MAX_POOL_SIZE", "Database:MaxPoolSize", 
100),
            MinPoolSize = GetConfigValue("DB_MIN_POOL_SIZE", "Database:MinPoolSize", 10)
        };
    }

    public ApplicationConfiguration GetApplicationConfiguration()
    {
        return new ApplicationConfiguration
        {
            Name = GetConfigValue("APP_NAME", "App:Name", "MotoVaultPro"),
            Environment = GetConfigValue("ASPNETCORE_ENVIRONMENT", "App:Environment", "Production"),
            LogLevel = GetConfigValue("LOG_LEVEL", "Logging:LogLevel:Default", "Information"),
            EnableFeatures = GetConfigValue("ENABLE_FEATURES", "App:EnableFeatures", "").Split(','),
            CacheExpiryMinutes = GetConfigValue("CACHE_EXPIRY_MINUTES", "App:CacheExpiryMinutes", 30)
        };
    }

    public EmailConfiguration GetEmailConfiguration()
    {
        return new EmailConfiguration
        {
            Enabled = GetConfigValue("EMAIL_ENABLED", "Email:Enabled", false),
            SmtpServer = GetConfigValue("EMAIL_SMTP_SERVER", "Email:SmtpServer"),
            SmtpPort = GetConfigValue("EMAIL_SMTP_PORT", "Email:SmtpPort", 587),
            Username = GetConfigValue("EMAIL_USERNAME", "Email:Username"),
            Password = GetConfigValue("EMAIL_PASSWORD", "Email:Password"),
            FromAddress = GetConfigValue("EMAIL_FROM_ADDRESS", "Email:FromAddress"),
            EnableSsl = GetConfigValue("EMAIL_ENABLE_SSL", "Email:EnableSsl", true)
        };
    }

    private string GetConnectionString(string envKey, string configKey)
    {
        // configKey is a full configuration path, so index into the configuration
        // directly rather than calling GetConnectionString (which expects only the name).
        return _configuration[envKey] ?? _configuration[configKey];
    }

    private string GetConfigValue(string envKey, string configKey, string defaultValue = null)
    {
        return _configuration[envKey] ?? _configuration[configKey] ?? defaultValue;
    }

    private T GetConfigValue<T>(string envKey, string configKey, T defaultValue = default)
    {
        var value = _configuration[envKey] ??
_configuration[configKey];
        if (string.IsNullOrEmpty(value))
            return defaultValue;

        return (T)Convert.ChangeType(value, typeof(T));
    }
}

// 4.3: Create configuration models
public class DatabaseConfiguration
{
    public string PostgreSQLConnectionString { get; set; }
    public string LiteDBPath { get; set; }
    public int CommandTimeout { get; set; }
    public int MaxPoolSize { get; set; }
    public int MinPoolSize { get; set; }
}

public class ApplicationConfiguration
{
    public string Name { get; set; }
    public string Environment { get; set; }
    public string LogLevel { get; set; }
    public string[] EnableFeatures { get; set; }
    public int CacheExpiryMinutes { get; set; }
}

public class EmailConfiguration
{
    public bool Enabled { get; set; }
    public string SmtpServer { get; set; }
    public int SmtpPort { get; set; }
    public string Username { get; set; }
    public string Password { get; set; }
    public string FromAddress { get; set; }
    public bool EnableSsl { get; set; }
}

// 4.4: Update Program.cs configuration
var builder = WebApplication.CreateBuilder(args);

// Enhanced configuration setup
builder.Configuration
    .AddMotoVaultConfiguration(builder.Environment)
    .AddKubernetesConfiguration();

// Register configuration services. DatabaseConfiguration is registered via a
// factory, because services cannot be resolved from IServiceCollection directly.
builder.Services.AddSingleton<ConfigurationMappingService>();
builder.Services.AddSingleton(sp =>
    sp.GetRequiredService<ConfigurationMappingService>().GetDatabaseConfiguration());
```

#### Testing Plan

**Automated Tests**:
```csharp
[Test]
public void ConfigurationMapping_EnvironmentVariableOverride_TakesPrecedence()
{
    Environment.SetEnvironmentVariable("APP_NAME", "TestApp");
    var configuration = CreateTestConfiguration(new Dictionary<string, string>
    {
        ["App:Name"] = "ConfigApp"
    });

    var mapper = new ConfigurationMappingService(configuration);
    var appConfig = mapper.GetApplicationConfiguration();

    Assert.AreEqual("TestApp", appConfig.Name);

    Environment.SetEnvironmentVariable("APP_NAME", null); // Cleanup
}

[Test]
public void KubernetesConfiguration_SecretsPath_LoadsSecrets()
{
    // Create temporary secrets directory.
    // NOTE: AddKubernetesSecrets reads the fixed /var/secrets path, so this test
    // only passes where the temp directory can be mapped or linked to that path.
    var secretsPath = Path.Combine(Path.GetTempPath(), "secrets");
    Directory.CreateDirectory(secretsPath);
    File.WriteAllText(Path.Combine(secretsPath, "POSTGRES_CONNECTION"), "test-connection-string");

    try
    {
        Environment.SetEnvironmentVariable("KUBERNETES_SERVICE_HOST", "localhost");
        var builder = new ConfigurationBuilder();
        builder.AddKubernetesConfiguration();
        var configuration = builder.Build();

        Assert.AreEqual("test-connection-string", configuration["POSTGRES_CONNECTION"]);
    }
    finally
    {
        Directory.Delete(secretsPath, true);
        Environment.SetEnvironmentVariable("KUBERNETES_SERVICE_HOST", null);
    }
}

[Test]
public async Task Application_StartupWithExternalizedConfig_Succeeds()
{
    var factory = new WebApplicationFactory<Program>()
        .WithWebHostBuilder(builder =>
        {
            builder.UseEnvironment("Testing");
            builder.ConfigureAppConfiguration((context, config) =>
            {
                config.AddInMemoryCollection(new[]
                {
                    new KeyValuePair<string, string>("POSTGRES_CONNECTION", "Host=localhost;Database=test;Username=test;Password=test"),
                    new KeyValuePair<string, string>("APP_NAME", "TestApp")
                });
            });
        });

    var client = factory.CreateClient();
    var response = await client.GetAsync("/health/ready");

    Assert.AreEqual(HttpStatusCode.OK, response.StatusCode);
}
```

**Kubernetes Manifests for Testing**:
```yaml
# test-configmap.yaml
apiVersion: v1
kind: ConfigMap
metadata:
  name: motovault-config-test
data:
  APP_NAME: "MotoVaultPro"
  LOG_LEVEL: "Information"
  CACHE_EXPIRY_MINUTES: "30"
  ENABLE_FEATURES: "OpenIDConnect,EmailNotifications"

---
# test-secret.yaml
apiVersion: v1
kind: Secret
metadata:
  name: motovault-secrets-test
type: Opaque
data:
  POSTGRES_CONNECTION: # base64-encoded connection string
  EMAIL_PASSWORD: # base64-encoded password
  JWT_SECRET: # base64-encoded signing key

---
# test-deployment.yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: motovault-config-test
spec:
  replicas: 1
  selector:
    matchLabels:
      app: motovault-test
  template:
    metadata:
      labels:
        app: motovault-test
    spec:
      containers:
        - name: motovault
          image: motovault:test
          envFrom:
            - configMapRef:
                name: motovault-config-test
            - secretRef:
                name: motovault-secrets-test
          volumeMounts:
            - name: config-volume
              mountPath: /var/config
            - name: secrets-volume
              mountPath: /var/secrets
      volumes:
        - name: config-volume
          configMap:
            name: motovault-config-test
        - name: secrets-volume
          secret:
            secretName: motovault-secrets-test
```

**Manual Validation**:
1. Test with environment variables only - application should start
2. Test with JSON configuration only - application should start
3. Test with Kubernetes ConfigMap/Secret simulation - application should start
4. Verify environment variables override JSON configuration
5. Test configuration validation with externalized config
6. Deploy to test Kubernetes environment and verify functionality

**Success Criteria**:
- [ ] Application starts with environment variables only
- [ ] Kubernetes ConfigMap/Secret integration works
- [ ] Environment variables override JSON configuration
- [ ] Configuration validation works with externalized config
- [ ] All existing functionality preserved
- [ ] No hardcoded configuration remains in code

---

### Step 5: PostgreSQL Connection Optimization
**Duration**: 2-3 days
**Risk Level**: Low
**Rollback Complexity**: Simple

#### Objective
Optimize PostgreSQL connections for high availability and performance without affecting LiteDB functionality.
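Independent of the C# changes in this step, the effect of the pooling limits can be checked from the database side by counting server connections per `application_name`. A minimal sketch; the connection URI and the `DRY_RUN` guard are assumptions, not part of the plan:

```shell
#!/bin/sh
# Print (DRY_RUN=1, the default) or run (DRY_RUN=0) a psql query that groups
# current server connections by application_name -- a simple before/after
# baseline for pool tuning. CONN is a placeholder connection URI.
CONN="${CONN:-postgresql://localhost/motovault}"
QUERY="SELECT application_name, count(*) FROM pg_stat_activity GROUP BY application_name;"
CMD="psql \"$CONN\" -c \"$QUERY\""
if [ "${DRY_RUN:-1}" = "1" ]; then
  echo "$CMD"
else
  eval "$CMD"
fi
```

With `DRY_RUN=0` and `CONN` pointed at the real server, the query shows whether connection counts actually stay inside the configured `MinPoolSize`/`MaxPoolSize` window.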

#### Implementation

```csharp
// 5.1: Enhanced PostgreSQL configuration
public class PostgreSQLConnectionService
{
    private readonly DatabaseConfiguration _config;
    private readonly ILogger<PostgreSQLConnectionService> _logger;
    private readonly IHostEnvironment _environment;

    public PostgreSQLConnectionService(
        DatabaseConfiguration config,
        ILogger<PostgreSQLConnectionService> logger,
        IHostEnvironment environment)
    {
        _config = config;
        _logger = logger;
        _environment = environment;
    }

    public NpgsqlConnectionStringBuilder CreateOptimizedConnectionString()
    {
        var builder = new NpgsqlConnectionStringBuilder(_config.PostgreSQLConnectionString);

        // Connection pooling optimization
        builder.MaxPoolSize = _config.MaxPoolSize;
        builder.MinPoolSize = _config.MinPoolSize;
        builder.ConnectionLifetime = 300; // 5 minutes
        builder.ConnectionIdleLifetime = 300; // 5 minutes
        builder.ConnectionPruningInterval = 10; // 10 seconds

        // Performance optimization
        builder.CommandTimeout = _config.CommandTimeout;
        builder.NoResetOnClose = true;
        builder.Enlist = false; // Disable distributed transactions for performance

        // Reliability settings
        builder.KeepAlive = 30; // 30 seconds
        builder.TcpKeepAliveTime = 30;
        builder.TcpKeepAliveInterval = 5;

        // Application name for monitoring
        builder.ApplicationName = $"MotoVaultPro-{_environment.EnvironmentName}";

        _logger.LogInformation("PostgreSQL connection configured: Pool({MinPoolSize}-{MaxPoolSize}), Timeout({CommandTimeout}s)",
            builder.MinPoolSize, builder.MaxPoolSize, builder.CommandTimeout);

        return builder;
    }

    public async Task<bool> TestConnectionAsync(CancellationToken cancellationToken = default)
    {
        try
        {
            var connectionString = CreateOptimizedConnectionString().ConnectionString;
            using var connection = new NpgsqlConnection(connectionString);

            await connection.OpenAsync(cancellationToken);

            using var command = new NpgsqlCommand("SELECT version()", connection);
            var version = await command.ExecuteScalarAsync(cancellationToken);

            _logger.LogInformation("PostgreSQL connection test successful. 
Version: {Version}", version);
            return true;
        }
        catch (Exception ex)
        {
            _logger.LogError(ex, "PostgreSQL connection test failed");
            return false;
        }
    }
}

// 5.2: Enhanced database context configuration
public static class DatabaseServiceExtensions
{
    public static IServiceCollection AddOptimizedDatabase(
        this IServiceCollection services,
        DatabaseConfiguration config)
    {
        if (!string.IsNullOrEmpty(config.PostgreSQLConnectionString))
        {
            services.AddOptimizedPostgreSQL(config);
        }
        else if (!string.IsNullOrEmpty(config.LiteDBPath))
        {
            services.AddLiteDB(config);
        }
        else
        {
            throw new InvalidOperationException("No database configuration provided");
        }

        return services;
    }

    private static IServiceCollection AddOptimizedPostgreSQL(
        this IServiceCollection services,
        DatabaseConfiguration config)
    {
        services.AddSingleton<PostgreSQLConnectionService>();

        services.AddDbContextFactory<MotoVaultContext>((serviceProvider, options) =>
        {
            var connectionService = serviceProvider.GetRequiredService<PostgreSQLConnectionService>();
            var connectionString = connectionService.CreateOptimizedConnectionString().ConnectionString;

            options.UseNpgsql(connectionString, npgsqlOptions =>
            {
                npgsqlOptions.EnableRetryOnFailure(
                    maxRetryCount: 3,
                    maxRetryDelay: TimeSpan.FromSeconds(5),
                    errorCodesToAdd: null);

                npgsqlOptions.CommandTimeout(config.CommandTimeout);
                npgsqlOptions.MigrationsAssembly(typeof(MotoVaultContext).Assembly.FullName);
            });

            // Performance optimizations
            options.EnableSensitiveDataLogging(false);
            options.EnableServiceProviderCaching();
            options.EnableDetailedErrors(false);

        }, ServiceLifetime.Singleton);

        // Register the PostgreSQL-backed data access implementations here.
        // (The AddScoped<TService, TImplementation>() type arguments were lost
        // when this document was extracted.)

        return services;
    }

    private static IServiceCollection AddLiteDB(
        this IServiceCollection services,
        DatabaseConfiguration config)
    {
        // Keep existing LiteDB configuration unchanged
        services.AddSingleton(provider
=>
        {
            var connectionString = $"Filename={config.LiteDBPath};Connection=shared";
            return new LiteDatabase(connectionString);
        });

        // Register the LiteDB-backed data access implementations here.
        // (The AddScoped<TService, TImplementation>() type arguments were lost
        // when this document was extracted.)

        return services;
    }
}

// 5.3: Connection monitoring service
public class DatabaseConnectionMonitoringService : BackgroundService
{
    private readonly IServiceProvider _serviceProvider;
    private readonly ILogger<DatabaseConnectionMonitoringService> _logger;
    private readonly Counter _connectionAttempts;
    private readonly Counter _connectionFailures;
    private readonly Gauge _activeConnections;

    public DatabaseConnectionMonitoringService(
        IServiceProvider serviceProvider,
        ILogger<DatabaseConnectionMonitoringService> logger)
    {
        _serviceProvider = serviceProvider;
        _logger = logger;

        _connectionAttempts = Metrics.CreateCounter(
            "motovault_db_connection_attempts_total",
            "Total database connection attempts");

        _connectionFailures = Metrics.CreateCounter(
            "motovault_db_connection_failures_total",
            "Total database connection failures");

        _activeConnections = Metrics.CreateGauge(
            "motovault_db_active_connections",
            "Number of active database connections");
    }

    protected override async Task ExecuteAsync(CancellationToken stoppingToken)
    {
        while (!stoppingToken.IsCancellationRequested)
        {
            await MonitorConnections();
            await Task.Delay(TimeSpan.FromMinutes(1), stoppingToken);
        }
    }

    private async Task MonitorConnections()
    {
        try
        {
            using var scope = _serviceProvider.CreateScope();
            var connectionService = scope.ServiceProvider.GetService<PostgreSQLConnectionService>();

            if (connectionService != null)
            {
                _connectionAttempts.Inc();

                var isHealthy = await connectionService.TestConnectionAsync();
                if (!isHealthy)
                {
                    _connectionFailures.Inc();
                    _logger.LogWarning("Database connection health check failed");
                }

                // Monitor connection pool if available
                await MonitorConnectionPool();
            }
        }
        catch (Exception ex)
        {
_logger.LogError(ex, "Error monitoring database connections");
            _connectionFailures.Inc();
        }
    }

    private async Task MonitorConnectionPool()
    {
        // This would require access to Npgsql connection pool metrics
        // For now, we'll implement a basic check
        try
        {
            using var scope = _serviceProvider.CreateScope();
            var contextFactory = scope.ServiceProvider.GetService<IDbContextFactory<MotoVaultContext>>();

            if (contextFactory != null)
            {
                using var context = contextFactory.CreateDbContext();
                var connectionState = context.Database.GetDbConnection().State;

                _logger.LogDebug("Database connection state: {State}", connectionState);
            }
        }
        catch (Exception ex)
        {
            _logger.LogWarning(ex, "Failed to monitor connection pool");
        }
    }
}

// 5.4: Register services in Program.cs. The mapping service is used directly
// here, because services cannot be resolved from the service collection while
// it is still being built.
var databaseConfig = new ConfigurationMappingService(builder.Configuration)
    .GetDatabaseConfiguration();

builder.Services.AddOptimizedDatabase(databaseConfig);
builder.Services.AddHostedService<DatabaseConnectionMonitoringService>();
```

#### Testing Plan

**Automated Tests**:
```csharp
[Test]
public void PostgreSQLConnectionService_CreatesOptimizedConnectionString()
{
    var config = new DatabaseConfiguration
    {
        PostgreSQLConnectionString = "Host=localhost;Database=test;Username=test;Password=test",
        MaxPoolSize = 50,
        MinPoolSize = 5,
        CommandTimeout = 30
    };

    var service = new PostgreSQLConnectionService(config, Mock.Of<ILogger<PostgreSQLConnectionService>>(), Mock.Of<IHostEnvironment>());
    var builder = service.CreateOptimizedConnectionString();

    Assert.AreEqual(50, builder.MaxPoolSize);
    Assert.AreEqual(5, builder.MinPoolSize);
    Assert.AreEqual(30, builder.CommandTimeout);
    Assert.AreEqual(300, builder.ConnectionLifetime);
}

[Test]
public async Task PostgreSQLConnectionService_TestConnection_ValidConnection_ReturnsTrue()
{
    var config = CreateValidDatabaseConfiguration();
    var service = new PostgreSQLConnectionService(config, Mock.Of<ILogger<PostgreSQLConnectionService>>(), Mock.Of<IHostEnvironment>());

    var result = await service.TestConnectionAsync();

    Assert.IsTrue(result);
}

[Test]
public async Task
DatabaseServiceExtensions_PostgreSQLConfiguration_RegistersCorrectServices()
{
    var services = new ServiceCollection();
    var config = new DatabaseConfiguration
    {
        PostgreSQLConnectionString = "Host=localhost;Database=test;Username=test;Password=test"
    };

    services.AddOptimizedDatabase(config);

    var serviceProvider = services.BuildServiceProvider();

    Assert.IsNotNull(serviceProvider.GetService<IDbContextFactory<MotoVaultContext>>());
    Assert.IsNotNull(serviceProvider.GetService<PostgreSQLConnectionService>());
    // A repository implementation-type assertion followed here; its generic
    // arguments were lost when this document was extracted.
}

[Test]
public async Task DatabaseServiceExtensions_LiteDBConfiguration_RegistersCorrectServices()
{
    var services = new ServiceCollection();
    var config = new DatabaseConfiguration
    {
        LiteDBPath = ":memory:"
    };

    services.AddOptimizedDatabase(config);

    var serviceProvider = services.BuildServiceProvider();

    Assert.IsNotNull(serviceProvider.GetService<LiteDatabase>());
    // A repository implementation-type assertion followed here; its generic
    // arguments were lost when this document was extracted.
}
```

**Performance Tests**:
```csharp
[Test]
public async Task PostgreSQLConnection_ConcurrentConnections_HandlesLoad()
{
    var config = CreateValidDatabaseConfiguration();
    var service = new PostgreSQLConnectionService(config, Mock.Of<ILogger<PostgreSQLConnectionService>>(), Mock.Of<IHostEnvironment>());

    var tasks = Enumerable.Range(0, 20).Select(async i =>
    {
        var stopwatch = Stopwatch.StartNew();
        var result = await service.TestConnectionAsync();
        stopwatch.Stop();

        return new { Success = result, Duration = stopwatch.ElapsedMilliseconds };
    });

    var results = await Task.WhenAll(tasks);

    Assert.IsTrue(results.All(r => r.Success));
    Assert.IsTrue(results.All(r => r.Duration < 1000)); // All connections under 1 second
}

[Test]
public async Task DatabaseContext_ConcurrentQueries_OptimalPerformance()
{
    using var factory = CreateDbContextFactory();

    var tasks = Enumerable.Range(0, 10).Select(async i =>
    {
        using var context = factory.CreateDbContext();
        var stopwatch = Stopwatch.StartNew();

        var count = await context.Vehicles.CountAsync();

        stopwatch.Stop();
        return stopwatch.ElapsedMilliseconds;
    });

    var durations = await Task.WhenAll(tasks);

    Assert.IsTrue(durations.All(d => d < 500)); // All queries under 500ms
    Assert.IsTrue(durations.Average() < 200); // Average under 200ms
}
```

**Manual Validation**:
1. Test PostgreSQL connection with optimized settings
2. Verify connection pooling behavior under load
3. Test connection recovery after database restart
4. Verify LiteDB functionality remains unchanged
5. Monitor connection metrics during testing
6. Test with both PostgreSQL and LiteDB configurations

**Success Criteria**:
- [ ] PostgreSQL connections use optimized settings
- [ ] Connection pooling configured correctly
- [ ] Connection monitoring provides metrics
- [ ] LiteDB functionality unchanged
- [ ] Performance improvement measurable
- [ ] Connection recovery works after database restart

---

### Step 6: Database Provider Selection and Debugging Infrastructure
**Duration**: 2-3 days
**Risk Level**: Low
**Rollback Complexity**: Simple

#### Objective
Implement a clean database provider selection mechanism with comprehensive debugging and diagnostic capabilities.

#### Implementation

```csharp
// 6.1: Database provider selector
public enum DatabaseProvider
{
    LiteDB,
    PostgreSQL
}

public class DatabaseProviderService
{
    private readonly DatabaseConfiguration _config;
    private readonly ILogger<DatabaseProviderService> _logger;

    public DatabaseProviderService(DatabaseConfiguration config, ILogger<DatabaseProviderService> logger)
    {
        _config = config;
        _logger = logger;
    }

    public DatabaseProvider GetActiveProvider()
    {
        var hasPostgreSQL = !string.IsNullOrEmpty(_config.PostgreSQLConnectionString);
        var hasLiteDB = !string.IsNullOrEmpty(_config.LiteDBPath);

        if (hasPostgreSQL)
        {
            _logger.LogInformation("PostgreSQL database mode enabled. Connection: {ConnectionInfo}",
                GetConnectionInfo(_config.PostgreSQLConnectionString));
            return DatabaseProvider.PostgreSQL;
        }

        if (hasLiteDB)
        {
            _logger.LogInformation("LiteDB database mode enabled. 
Path: {LiteDBPath}", _config.LiteDBPath);
            return DatabaseProvider.LiteDB;
        }

        throw new InvalidOperationException("No database provider configured");
    }

    private string GetConnectionInfo(string connectionString)
    {
        try
        {
            var builder = new NpgsqlConnectionStringBuilder(connectionString);
            return $"Host={builder.Host}, Database={builder.Database}, Port={builder.Port}";
        }
        catch
        {
            return "Invalid connection string";
        }
    }
}

// 6.2: Database diagnostics service
public class DatabaseDiagnosticsService
{
    private readonly ILogger<DatabaseDiagnosticsService> _logger;
    private readonly DatabaseConfiguration _config;
    private readonly DatabaseProviderService _providerService;

    public DatabaseDiagnosticsService(
        ILogger<DatabaseDiagnosticsService> logger,
        DatabaseConfiguration config,
        DatabaseProviderService providerService)
    {
        _logger = logger;
        _config = config;
        _providerService = providerService;
    }

    // virtual so tests can substitute the diagnostics run with Moq
    public virtual async Task<DatabaseDiagnosticResult> PerformDiagnosticsAsync()
    {
        var result = new DatabaseDiagnosticResult();
        var provider = _providerService.GetActiveProvider();

        _logger.LogInformation("Starting database diagnostics for provider: {Provider}", provider);

        switch (provider)
        {
            case DatabaseProvider.PostgreSQL:
                await DiagnosePostgreSQLAsync(result);
                break;
            case DatabaseProvider.LiteDB:
                await DiagnoseLiteDBAsync(result);
                break;
        }

        _logger.LogInformation("Database diagnostics completed. 
Status: {Status}, Issues: {IssueCount}",
            result.OverallStatus, result.Issues.Count);

        return result;
    }

    private async Task DiagnosePostgreSQLAsync(DatabaseDiagnosticResult result)
    {
        result.Provider = "PostgreSQL";

        // Test connection string parsing
        try
        {
            var builder = new NpgsqlConnectionStringBuilder(_config.PostgreSQLConnectionString);
            result.ConnectionDetails = new Dictionary<string, object>
            {
                ["Host"] = builder.Host,
                ["Port"] = builder.Port,
                ["Database"] = builder.Database,
                ["Username"] = builder.Username,
                ["MaxPoolSize"] = builder.MaxPoolSize,
                ["MinPoolSize"] = builder.MinPoolSize,
                ["CommandTimeout"] = builder.CommandTimeout
            };
            _logger.LogDebug("PostgreSQL connection string parsed successfully");
        }
        catch (Exception ex)
        {
            result.Issues.Add($"Invalid PostgreSQL connection string: {ex.Message}");
            _logger.LogError(ex, "Failed to parse PostgreSQL connection string");
            result.OverallStatus = "Failed";
            return;
        }

        // Test connectivity
        try
        {
            using var connection = new NpgsqlConnection(_config.PostgreSQLConnectionString);
            var stopwatch = Stopwatch.StartNew();
            await connection.OpenAsync();
            stopwatch.Stop();

            result.ConnectionTime = stopwatch.ElapsedMilliseconds;
            _logger.LogDebug("PostgreSQL connection established in {ElapsedMs}ms", stopwatch.ElapsedMilliseconds);

            // Test basic query
            using var command = new NpgsqlCommand("SELECT version(), current_database(), current_user", connection);
            using var reader = await command.ExecuteReaderAsync();

            if (await reader.ReadAsync())
            {
                result.ServerInfo = new Dictionary<string, object>
                {
                    ["Version"] = reader.GetString(0),
                    ["Database"] = reader.GetString(1),
                    ["User"] = reader.GetString(2)
                };
            }

            result.OverallStatus = "Healthy";
            _logger.LogInformation("PostgreSQL diagnostics successful. 
Version: {Version}",
                result.ServerInfo?["Version"]);
        }
        catch (Exception ex)
        {
            result.Issues.Add($"PostgreSQL connection failed: {ex.Message}");
            result.OverallStatus = "Failed";
            _logger.LogError(ex, "PostgreSQL connection failed during diagnostics");
        }
    }

    private async Task DiagnoseLiteDBAsync(DatabaseDiagnosticResult result)
    {
        result.Provider = "LiteDB";

        try
        {
            var dbPath = _config.LiteDBPath;
            var directory = Path.GetDirectoryName(dbPath);

            result.ConnectionDetails = new Dictionary<string, object>
            {
                ["DatabasePath"] = dbPath,
                ["Directory"] = directory,
                ["DirectoryExists"] = Directory.Exists(directory),
                ["FileExists"] = File.Exists(dbPath)
            };

            // Test directory access
            if (!Directory.Exists(directory))
            {
                _logger.LogWarning("LiteDB directory does not exist: {Directory}", directory);
                Directory.CreateDirectory(directory);
                _logger.LogInformation("Created LiteDB directory: {Directory}", directory);
            }

            // Test LiteDB access
            var stopwatch = Stopwatch.StartNew();
            using var db = new LiteDatabase($"Filename={dbPath};Connection=shared");
            var collections = db.GetCollectionNames().ToList();
            stopwatch.Stop();

            result.ConnectionTime = stopwatch.ElapsedMilliseconds;
            result.ServerInfo = new Dictionary<string, object>
            {
                ["Collections"] = collections,
                ["CollectionCount"] = collections.Count,
                ["FileSize"] = File.Exists(dbPath) ? new FileInfo(dbPath).Length : 0
            };

            result.OverallStatus = "Healthy";
            _logger.LogInformation("LiteDB diagnostics successful. 
Collections: {CollectionCount}, Size: {FileSize} bytes",
                collections.Count, result.ServerInfo["FileSize"]);
        }
        catch (Exception ex)
        {
            result.Issues.Add($"LiteDB access failed: {ex.Message}");
            result.OverallStatus = "Failed";
            _logger.LogError(ex, "LiteDB access failed during diagnostics");
        }
    }
}

// 6.3: Database diagnostic result model
public class DatabaseDiagnosticResult
{
    public string Provider { get; set; }
    public string OverallStatus { get; set; } = "Unknown";
    public long ConnectionTime { get; set; }
    public Dictionary<string, object> ConnectionDetails { get; set; } = new();
    public Dictionary<string, object> ServerInfo { get; set; } = new();
    public List<string> Issues { get; set; } = new();
    public List<string> Recommendations { get; set; } = new();
}

// 6.4: Database startup diagnostics service
public class DatabaseStartupDiagnosticsService : IHostedService
{
    private readonly DatabaseDiagnosticsService _diagnostics;
    private readonly ILogger<DatabaseStartupDiagnosticsService> _logger;

    public DatabaseStartupDiagnosticsService(
        DatabaseDiagnosticsService diagnostics,
        ILogger<DatabaseStartupDiagnosticsService> logger)
    {
        _diagnostics = diagnostics;
        _logger = logger;
    }

    public async Task StartAsync(CancellationToken cancellationToken)
    {
        try
        {
            _logger.LogInformation("Running database diagnostics at startup");
            var result = await _diagnostics.PerformDiagnosticsAsync();

            _logger.LogInformation("Database diagnostics completed. 
Provider: {Provider}, Status: {Status}, ConnectionTime: {ConnectionTime}ms",
                result.Provider, result.OverallStatus, result.ConnectionTime);

            if (result.Issues.Any())
            {
                foreach (var issue in result.Issues)
                {
                    _logger.LogWarning("Database diagnostic issue: {Issue}", issue);
                }
            }

            foreach (var detail in result.ConnectionDetails)
            {
                _logger.LogDebug("Database connection detail - {Key}: {Value}", detail.Key, detail.Value);
            }

            foreach (var info in result.ServerInfo)
            {
                _logger.LogInformation("Database server info - {Key}: {Value}", info.Key, info.Value);
            }
        }
        catch (Exception ex)
        {
            _logger.LogError(ex, "Database startup diagnostics failed");
        }
    }

    public Task StopAsync(CancellationToken cancellationToken) => Task.CompletedTask;
}

// 6.5: Enhanced database service registration (extends the
// DatabaseServiceExtensions class introduced in step 5.2)
public static class DatabaseServiceExtensions
{
    public static IServiceCollection AddDatabaseWithDiagnostics(
        this IServiceCollection services,
        DatabaseConfiguration config)
    {
        services.AddSingleton<DatabaseProviderService>();
        services.AddSingleton<DatabaseDiagnosticsService>();
        // Run diagnostics once at startup (service defined in 6.4)
        services.AddHostedService<DatabaseStartupDiagnosticsService>();

        // Provider selection happens before the container is built,
        // so use a temporary service with a no-op logger.
        var providerService = new DatabaseProviderService(config, NullLogger<DatabaseProviderService>.Instance);
        var provider = providerService.GetActiveProvider();

        switch (provider)
        {
            case DatabaseProvider.PostgreSQL:
                services.AddOptimizedPostgreSQL(config);
                break;

            case DatabaseProvider.LiteDB:
                services.AddLiteDB(config);
                break;
        }

        return services;
    }
}
```

#### Testing Plan

**Automated Tests**:
```csharp
[Test]
public void DatabaseProviderService_PostgreSQLConfigured_ReturnsPostgreSQL()
{
    var config = new DatabaseConfiguration
    {
        PostgreSQLConnectionString = "Host=localhost;Database=test;Username=test;Password=test"
    };

    var service = new DatabaseProviderService(config, Mock.Of<ILogger<DatabaseProviderService>>());
    var provider = service.GetActiveProvider();

    Assert.AreEqual(DatabaseProvider.PostgreSQL, provider);
}

[Test]
public void DatabaseProviderService_LiteDBConfigured_ReturnsLiteDB()
{
    var config = new DatabaseConfiguration
    {
        LiteDBPath = "/tmp/test.db"
    };

    var service = new DatabaseProviderService(config, Mock.Of<ILogger<DatabaseProviderService>>());
    var provider = service.GetActiveProvider();

    Assert.AreEqual(DatabaseProvider.LiteDB, provider);
}

[Test]
public void DatabaseProviderService_NoConfiguration_ThrowsException()
{
    var config = new DatabaseConfiguration();

    var service = new DatabaseProviderService(config, Mock.Of<ILogger<DatabaseProviderService>>());

    Assert.Throws<InvalidOperationException>(() => service.GetActiveProvider());
}

[Test]
public async Task DatabaseDiagnosticsService_PostgreSQL_ValidConnection_ReturnsHealthy()
{
    var config = new DatabaseConfiguration
    {
        PostgreSQLConnectionString = GetTestPostgreSQLConnectionString()
    };

    var providerService = new DatabaseProviderService(config, Mock.Of<ILogger<DatabaseProviderService>>());
    var diagnostics = new DatabaseDiagnosticsService(
        Mock.Of<ILogger<DatabaseDiagnosticsService>>(),
        config,
        providerService);

    var result = await diagnostics.PerformDiagnosticsAsync();

    Assert.AreEqual("Healthy", result.OverallStatus);
    Assert.AreEqual("PostgreSQL", result.Provider);
    Assert.IsTrue(result.ConnectionTime > 0);
    Assert.IsTrue(result.ServerInfo.ContainsKey("Version"));
}

[Test]
public async Task DatabaseDiagnosticsService_LiteDB_ValidPath_ReturnsHealthy()
{
    var tempPath = Path.GetTempFileName();
    var config = new DatabaseConfiguration
    {
        LiteDBPath = tempPath
    };

    try
    {
        var providerService = new DatabaseProviderService(config, Mock.Of<ILogger<DatabaseProviderService>>());
        var diagnostics = new DatabaseDiagnosticsService(
            Mock.Of<ILogger<DatabaseDiagnosticsService>>(),
            config,
            providerService);

        var result = await diagnostics.PerformDiagnosticsAsync();

        Assert.AreEqual("Healthy", result.OverallStatus);
        Assert.AreEqual("LiteDB", result.Provider);
        Assert.IsTrue(result.ConnectionTime >= 0);
    }
    finally
    {
        File.Delete(tempPath);
    }
}

[Test]
public async Task DatabaseStartupDiagnosticsService_RunsAtStartup_LogsResults()
{
    var mockLogger = new Mock<ILogger<DatabaseStartupDiagnosticsService>>();
    // Moq can only override PerformDiagnosticsAsync if it is virtual (or moved
    // behind an interface); the constructor arguments satisfy the concrete class.
    var mockDiagnostics = new Mock<DatabaseDiagnosticsService>(
        Mock.Of<ILogger<DatabaseDiagnosticsService>>(),
        new DatabaseConfiguration(),
        null);

    var diagnosticResult = new DatabaseDiagnosticResult
    {
Provider = "PostgreSQL",
        OverallStatus = "Healthy",
        ConnectionTime = 50
    };

    mockDiagnostics.Setup(x => x.PerformDiagnosticsAsync())
        .ReturnsAsync(diagnosticResult);

    var service = new DatabaseStartupDiagnosticsService(mockDiagnostics.Object, mockLogger.Object);
    await service.StartAsync(CancellationToken.None);

    // Verify that diagnostic information was logged
    mockLogger.Verify(
        x => x.Log(
            LogLevel.Information,
            It.IsAny<EventId>(),
            It.Is<It.IsAnyType>((v, t) => v.ToString().Contains("Database diagnostics completed")),
            It.IsAny<Exception>(),
            It.IsAny<Func<It.IsAnyType, Exception, string>>()),
        Times.Once);
}
```

**Manual Validation**:
1. Test database provider selection with PostgreSQL configuration
2. Test database provider selection with LiteDB configuration
3. Review startup logs for diagnostic information
4. Test with invalid PostgreSQL connection string and verify error logging
5. Test with invalid LiteDB path and verify error logging
6. Verify logging output provides comprehensive debugging information

**Success Criteria**:
- [ ] Database provider correctly selected based on configuration
- [ ] Comprehensive diagnostic information logged at startup
- [ ] Error conditions properly detected and logged
- [ ] Logging provides sufficient detail for debugging
- [ ] Connection details and server info logged appropriately
- [ ] No impact on existing functionality

---

### Step 7: Database Migration Preparation and Tooling
**Duration**: 3-4 days
**Risk Level**: Low
**Rollback Complexity**: Simple

#### Objective
Create comprehensive database migration tools and validation utilities for future PostgreSQL transition.
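Independent of the C# tooling in this step, a checksummed safety copy of the LiteDB file is cheap insurance before any migration rehearsal. A minimal sketch; the example paths are assumptions, not values from the plan:

```shell
#!/bin/sh
# Checksummed, timestamped backup of the LiteDB file before migration work.
backup_litedb() {
  db_path="$1"        # path to the LiteDB file
  backup_dir="$2"     # where the archived copy should land
  mkdir -p "$backup_dir"
  stamp=$(date +%Y%m%d-%H%M%S)
  dest="$backup_dir/$(basename "$db_path").$stamp.bak"
  cp "$db_path" "$dest"
  # Record a checksum so the archive can be verified before LiteDB is retired.
  sha256sum "$dest" > "$dest.sha256"
  sha256sum -c "$dest.sha256" >/dev/null && echo "$dest"
}

# Example (hypothetical paths): backup_litedb /data/motovault.db /data/backups
```

The `.sha256` file lets the archived copy be re-verified at any later point, which matters once the original LiteDB file is finally retired.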

#### Implementation

```csharp
// 7.1: Database migration service
public class DatabaseMigrationService
{
    private readonly ILogger<DatabaseMigrationService> _logger;
    private readonly DatabaseConfiguration _config;

    public DatabaseMigrationService(ILogger<DatabaseMigrationService> logger, DatabaseConfiguration config)
    {
        _logger = logger;
        _config = config;
    }

    public async Task<MigrationPreparationResult> PrepareMigrationAsync()
    {
        var result = new MigrationPreparationResult();

        _logger.LogInformation("Starting migration preparation analysis");

        // Analyze current data structure
        await AnalyzeDataStructureAsync(result);

        // Validate migration prerequisites
        await ValidateMigrationPrerequisitesAsync(result);

        // Generate migration plan
        await GenerateMigrationPlanAsync(result);

        _logger.LogInformation("Migration preparation completed. Status: {Status}", result.Status);
        return result;
    }

    private async Task AnalyzeDataStructureAsync(MigrationPreparationResult result)
    {
        try
        {
            if (!string.IsNullOrEmpty(_config.LiteDBPath) && File.Exists(_config.LiteDBPath))
            {
                using var db = new LiteDatabase(_config.LiteDBPath);
                var collections = db.GetCollectionNames().ToList();

                result.DataAnalysis = new Dictionary<string, object>();

                foreach (var collectionName in collections)
                {
                    var collection = db.GetCollection(collectionName);
                    var count = collection.Count();

                    result.DataAnalysis[collectionName] = new
                    {
                        RecordCount = count,
                        EstimatedSize = count * 1024 // Rough estimate
                    };

                    _logger.LogDebug("Collection {Collection}: {Count} records", collectionName, count);
                }

                result.TotalRecords = result.DataAnalysis.Values
                    .Cast<dynamic>()
                    .Sum(x => (int)x.RecordCount);

                _logger.LogInformation("Data analysis completed. 
Total records: {TotalRecords}", result.TotalRecords); + } + else + { + result.DataAnalysis = new Dictionary(); + result.TotalRecords = 0; + _logger.LogInformation("No LiteDB database found for analysis"); + } + } + catch (Exception ex) + { + result.Issues.Add($"Data analysis failed: {ex.Message}"); + _logger.LogError(ex, "Failed to analyze data structure"); + } + } + + private async Task ValidateMigrationPrerequisitesAsync(MigrationPreparationResult result) + { + _logger.LogDebug("Validating migration prerequisites"); + + // Check PostgreSQL availability if configured + if (!string.IsNullOrEmpty(_config.PostgreSQLConnectionString)) + { + try + { + using var connection = new NpgsqlConnection(_config.PostgreSQLConnectionString); + await connection.OpenAsync(); + + // Check if database is empty or has expected schema + using var command = new NpgsqlCommand( + "SELECT COUNT(*) FROM information_schema.tables WHERE table_schema = 'public'", + connection); + var tableCount = (long)await command.ExecuteScalarAsync(); + + result.Prerequisites["PostgreSQLConnectivity"] = true; + result.Prerequisites["PostgreSQLTableCount"] = tableCount; + + if (tableCount > 0) + { + result.Recommendations.Add("PostgreSQL database contains existing tables. Consider backup before migration."); + } + + _logger.LogDebug("PostgreSQL validation successful. 
Table count: {TableCount}", tableCount); + } + catch (Exception ex) + { + result.Prerequisites["PostgreSQLConnectivity"] = false; + result.Issues.Add($"PostgreSQL validation failed: {ex.Message}"); + _logger.LogWarning(ex, "PostgreSQL validation failed"); + } + } + + // Check disk space + try + { + var currentPath = Environment.CurrentDirectory; + var drive = new DriveInfo(Path.GetPathRoot(currentPath)); + var freeSpaceGB = drive.AvailableFreeSpace / (1024 * 1024 * 1024); + + result.Prerequisites["DiskSpaceGB"] = freeSpaceGB; + + if (freeSpaceGB < 1) + { + result.Issues.Add("Insufficient disk space for migration (< 1GB available)"); + } + else if (freeSpaceGB < 5) + { + result.Recommendations.Add("Limited disk space available. Monitor during migration."); + } + + _logger.LogDebug("Disk space check: {FreeSpaceGB}GB available", freeSpaceGB); + } + catch (Exception ex) + { + result.Issues.Add($"Disk space check failed: {ex.Message}"); + _logger.LogWarning(ex, "Failed to check disk space"); + } + } + + private async Task GenerateMigrationPlanAsync(MigrationPreparationResult result) + { + _logger.LogDebug("Generating migration plan"); + + var plan = new List(); + + if (result.TotalRecords > 0) + { + plan.Add("1. Create PostgreSQL database schema"); + plan.Add("2. Export data from LiteDB"); + plan.Add("3. Transform data for PostgreSQL compatibility"); + plan.Add("4. Import data to PostgreSQL"); + plan.Add("5. Validate data integrity"); + plan.Add("6. Update configuration to use PostgreSQL"); + plan.Add("7. Test application functionality"); + plan.Add("8. Archive LiteDB data"); + + // Estimate migration time based on record count + var estimatedMinutes = Math.Max(5, result.TotalRecords / 1000); // Rough estimate + result.EstimatedMigrationTime = TimeSpan.FromMinutes(estimatedMinutes); + + plan.Add($"Estimated migration time: {result.EstimatedMigrationTime.TotalMinutes:F0} minutes"); + } + else + { + plan.Add("1. Create PostgreSQL database schema"); + plan.Add("2. 
Update configuration to use PostgreSQL"); + plan.Add("3. Test application functionality"); + result.EstimatedMigrationTime = TimeSpan.FromMinutes(5); + } + + result.MigrationPlan = plan; + _logger.LogInformation("Migration plan generated with {StepCount} steps", plan.Count); + } +} + +// 7.2: Migration result models +public class MigrationPreparationResult +{ + public string Status { get; set; } = "Success"; + public Dictionary<string, object> DataAnalysis { get; set; } = new(); + public Dictionary<string, object> Prerequisites { get; set; } = new(); + public List<string> Issues { get; set; } = new(); + public List<string> Recommendations { get; set; } = new(); + public List<string> MigrationPlan { get; set; } = new(); + public int TotalRecords { get; set; } + public TimeSpan EstimatedMigrationTime { get; set; } +} + +// 7.3: Data validation service +public class DataValidationService +{ + private readonly ILogger<DataValidationService> _logger; + + public DataValidationService(ILogger<DataValidationService> logger) + { + _logger = logger; + } + + public async Task<DataValidationResult> ValidateDataIntegrityAsync(DatabaseProvider provider) + { + var result = new DataValidationResult { Provider = provider.ToString() }; + + _logger.LogInformation("Starting data integrity validation for {Provider}", provider); + + switch (provider) + { + case DatabaseProvider.LiteDB: + await ValidateLiteDBIntegrityAsync(result); + break; + case DatabaseProvider.PostgreSQL: + await ValidatePostgreSQLIntegrityAsync(result); + break; + } + + _logger.LogInformation("Data validation completed for {Provider}. 
Status: {Status}, Issues: {IssueCount}", + provider, result.Status, result.Issues.Count); + + return result; + } + + private async Task ValidateLiteDBIntegrityAsync(DataValidationResult result) + { + try + { + // Implement LiteDB-specific validation logic + result.ValidationChecks["LiteDBAccessible"] = true; + result.ValidationChecks["CollectionsAccessible"] = true; + result.Status = "Healthy"; + } + catch (Exception ex) + { + result.Issues.Add($"LiteDB validation failed: {ex.Message}"); + result.Status = "Failed"; + _logger.LogError(ex, "LiteDB validation failed"); + } + } + + private async Task ValidatePostgreSQLIntegrityAsync(DataValidationResult result) + { + try + { + // Implement PostgreSQL-specific validation logic + result.ValidationChecks["PostgreSQLAccessible"] = true; + result.ValidationChecks["TablesAccessible"] = true; + result.Status = "Healthy"; + } + catch (Exception ex) + { + result.Issues.Add($"PostgreSQL validation failed: {ex.Message}"); + result.Status = "Failed"; + _logger.LogError(ex, "PostgreSQL validation failed"); + } + } +} + +public class DataValidationResult +{ + public string Provider { get; set; } + public string Status { get; set; } = "Unknown"; + public Dictionary<string, bool> ValidationChecks { get; set; } = new(); + public List<string> Issues { get; set; } = new(); + public Dictionary<string, object> Statistics { get; set; } = new(); +} + +// 7.4: Migration analysis startup service +public class MigrationAnalysisService : IHostedService +{ + private readonly DatabaseMigrationService _migrationService; + private readonly DataValidationService _validationService; + private readonly DatabaseProviderService _providerService; + private readonly FeatureFlagService _featureFlags; + private readonly ILogger<MigrationAnalysisService> _logger; + + public MigrationAnalysisService( + DatabaseMigrationService migrationService, + DataValidationService validationService, + DatabaseProviderService providerService, + FeatureFlagService featureFlags, + ILogger<MigrationAnalysisService> logger) + { + _migrationService = 
migrationService; + _validationService = validationService; + _providerService = providerService; + _featureFlags = featureFlags; + _logger = logger; + } + + public async Task StartAsync(CancellationToken cancellationToken) + { + if (!_featureFlags.IsEnabled("MigrationTools", true)) + { + _logger.LogDebug("Migration tools feature is disabled, skipping migration analysis"); + return; + } + + try + { + var currentProvider = _providerService.GetActiveProvider(); + _logger.LogInformation("Running migration analysis for current provider: {Provider}", currentProvider); + + var migrationResult = await _migrationService.PrepareMigrationAsync(); + _logger.LogInformation("Migration analysis completed. Status: {Status}, Total Records: {TotalRecords}, Estimated Time: {EstimatedTime}", + migrationResult.Status, migrationResult.TotalRecords, migrationResult.EstimatedMigrationTime); + + if (migrationResult.Issues.Any()) + { + foreach (var issue in migrationResult.Issues) + { + _logger.LogWarning("Migration analysis issue: {Issue}", issue); + } + } + + if (migrationResult.Recommendations.Any()) + { + foreach (var recommendation in migrationResult.Recommendations) + { + _logger.LogInformation("Migration recommendation: {Recommendation}", recommendation); + } + } + + foreach (var step in migrationResult.MigrationPlan) + { + _logger.LogDebug("Migration plan step: {Step}", step); + } + + var validationResult = await _validationService.ValidateDataIntegrityAsync(currentProvider); + _logger.LogInformation("Data validation completed for {Provider}. 
Status: {Status}", + validationResult.Provider, validationResult.Status); + + if (validationResult.Issues.Any()) + { + foreach (var issue in validationResult.Issues) + { + _logger.LogWarning("Data validation issue: {Issue}", issue); + } + } + } + catch (Exception ex) + { + _logger.LogError(ex, "Migration analysis failed during startup"); + } + } + + public Task StopAsync(CancellationToken cancellationToken) => Task.CompletedTask; +} +``` + +#### Testing Plan + +**Automated Tests**: +```csharp +[Test] +public async Task DatabaseMigrationService_PrepareMigration_GeneratesValidPlan() +{ + var config = new DatabaseConfiguration + { + LiteDBPath = CreateTestLiteDBWithData(), + PostgreSQLConnectionString = GetTestPostgreSQLConnectionString() + }; + + var migrationService = new DatabaseMigrationService( + Mock.Of>(), + config); + + var result = await migrationService.PrepareMigrationAsync(); + + Assert.AreEqual("Success", result.Status); + Assert.IsTrue(result.MigrationPlan.Count > 0); + Assert.IsTrue(result.TotalRecords >= 0); + Assert.IsTrue(result.EstimatedMigrationTime > TimeSpan.Zero); +} + +[Test] +public async Task DataValidationService_LiteDB_ReturnsValidationResult() +{ + var validationService = new DataValidationService(Mock.Of>()); + + var result = await validationService.ValidateDataIntegrityAsync(DatabaseProvider.LiteDB); + + Assert.AreEqual("LiteDB", result.Provider); + Assert.IsNotNull(result.Status); + Assert.IsNotNull(result.ValidationChecks); +} + +[Test] +public async Task MigrationAnalysisService_RunsAtStartup_LogsAnalysis() +{ + var mockLogger = new Mock>(); + var mockMigrationService = new Mock(); + var mockValidationService = new Mock(); + var mockProviderService = new Mock(); + var mockFeatureFlags = new Mock(); + + mockFeatureFlags.Setup(x => x.IsEnabled("MigrationTools", true)).Returns(true); + mockProviderService.Setup(x => x.GetActiveProvider()).Returns(DatabaseProvider.LiteDB); + + var migrationResult = new MigrationPreparationResult { Status 
= "Success", TotalRecords = 100 }; + mockMigrationService.Setup(x => x.PrepareMigrationAsync()).ReturnsAsync(migrationResult); + + var validationResult = new DataValidationResult { Provider = "LiteDB", Status = "Healthy" }; + mockValidationService.Setup(x => x.ValidateDataIntegrityAsync(It.IsAny())) + .ReturnsAsync(validationResult); + + var service = new MigrationAnalysisService( + mockMigrationService.Object, + mockValidationService.Object, + mockProviderService.Object, + mockFeatureFlags.Object, + mockLogger.Object); + + await service.StartAsync(CancellationToken.None); + + // Verify migration analysis was logged + mockLogger.Verify( + x => x.Log( + LogLevel.Information, + It.IsAny(), + It.Is((v, t) => v.ToString().Contains("Migration analysis completed")), + It.IsAny(), + It.IsAny>()), + Times.Once); +} +``` + +**Manual Validation**: +1. Review startup logs for migration analysis information +2. Test migration preparation with existing LiteDB data +3. Test migration preparation with empty database +4. Verify PostgreSQL connectivity validation in logs +5. Test with invalid PostgreSQL configuration and check error logs +6. Verify migration plan generation logic through log output + +**Success Criteria**: +- [ ] Migration preparation analysis works correctly +- [ ] Data structure analysis provides accurate information and logs details +- [ ] Migration plan generated with realistic time estimates +- [ ] Prerequisites validation identifies potential issues and logs them +- [ ] Comprehensive migration information logged at startup +- [ ] No impact on existing application functionality + +--- + +### Step 8: Performance Monitoring and Benchmarking +**Duration**: 2-3 days +**Risk Level**: Low +**Rollback Complexity**: Simple + +#### Objective +Implement comprehensive performance monitoring and benchmarking to establish baselines and detect regressions. 
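+ +Once the metrics defined in 8.1 are exported, baselines can be tracked with standard PromQL queries like the following (a sketch; it assumes the conventional `_bucket`, `_sum`, and `_count` series that Prometheus histograms expose): + +```promql +# p95 HTTP latency per endpoint over the last 5 minutes +histogram_quantile(0.95, sum by (le, endpoint) (rate(motovault_http_request_duration_seconds_bucket[5m]))) + +# overall request rate and 5xx error ratio +sum(rate(motovault_http_requests_total[5m])) +sum(rate(motovault_http_requests_total{status_code=~"5.."}[5m])) / sum(rate(motovault_http_requests_total[5m])) + +# mean database operation time per provider +sum by (provider) (rate(motovault_database_operation_duration_seconds_sum[5m])) + / sum by (provider) (rate(motovault_database_operation_duration_seconds_count[5m])) +``` + +Captured before and after each migration step, these queries double as the regression check called out in the success criteria.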
+ +#### Implementation + +```csharp +// 8.1: Performance monitoring service +public class PerformanceMonitoringService +{ + private readonly ILogger<PerformanceMonitoringService> _logger; + private readonly Counter _requestCounter; + private readonly Histogram _requestDuration; + private readonly Histogram _databaseOperationDuration; + private readonly Gauge _activeConnections; + + public PerformanceMonitoringService(ILogger<PerformanceMonitoringService> logger) + { + _logger = logger; + + _requestCounter = Metrics.CreateCounter( + "motovault_http_requests_total", + "Total HTTP requests", + new[] { "method", "endpoint", "status_code" }); + + _requestDuration = Metrics.CreateHistogram( + "motovault_http_request_duration_seconds", + "HTTP request duration in seconds", + new[] { "method", "endpoint" }); + + _databaseOperationDuration = Metrics.CreateHistogram( + "motovault_database_operation_duration_seconds", + "Database operation duration in seconds", + new[] { "operation", "provider" }); + + _activeConnections = Metrics.CreateGauge( + "motovault_active_connections", + "Number of active connections"); + } + + public void RecordHttpRequest(string method, string endpoint, int statusCode, double durationSeconds) + { + _requestCounter.WithLabels(method, endpoint, statusCode.ToString()).Inc(); + _requestDuration.WithLabels(method, endpoint).Observe(durationSeconds); + } + + public void RecordDatabaseOperation(string operation, string provider, double durationSeconds) + { + _databaseOperationDuration.WithLabels(operation, provider).Observe(durationSeconds); + } + + public void SetActiveConnections(double count) + { + _activeConnections.Set(count); + } +} + +// 8.2: Performance monitoring middleware +public class PerformanceMonitoringMiddleware +{ + private readonly RequestDelegate _next; + private readonly PerformanceMonitoringService _monitoring; + private readonly ILogger<PerformanceMonitoringMiddleware> _logger; + + public PerformanceMonitoringMiddleware( + RequestDelegate next, + PerformanceMonitoringService monitoring, + ILogger<PerformanceMonitoringMiddleware> logger) + { + _next = next; + _monitoring = monitoring; + _logger = logger; + } + + public async Task InvokeAsync(HttpContext context) + { + var stopwatch = Stopwatch.StartNew(); + var endpoint = GetEndpointName(context); + + try 
+ { + await _next(context); + } + finally + { + stopwatch.Stop(); + + _monitoring.RecordHttpRequest( + context.Request.Method, + endpoint, + context.Response.StatusCode, + stopwatch.Elapsed.TotalSeconds); + + // Log slow requests + if (stopwatch.ElapsedMilliseconds > 1000) + { + _logger.LogWarning("Slow request detected: {Method} {Endpoint} took {ElapsedMs}ms", + context.Request.Method, endpoint, stopwatch.ElapsedMilliseconds); + } + } + } + + private string GetEndpointName(HttpContext context) + { + var endpoint = context.GetEndpoint(); + if (endpoint?.DisplayName != null) + { + return endpoint.DisplayName; + } + + var path = context.Request.Path.Value; + + // Normalize common patterns (null-safe: Path.Value can be null) + if (path?.StartsWith("/Vehicle/") == true) + { + return "/Vehicle/*"; + } + if (path?.StartsWith("/api/") == true) + { + return "/api/*"; + } + + return path ?? "unknown"; + } +} + +// 8.3: Database operation interceptor +public class DatabaseOperationInterceptor : DbCommandInterceptor +{ + private readonly PerformanceMonitoringService _monitoring; + private readonly ILogger<DatabaseOperationInterceptor> _logger; + + public DatabaseOperationInterceptor( + PerformanceMonitoringService monitoring, + ILogger<DatabaseOperationInterceptor> logger) + { + _monitoring = monitoring; + _logger = logger; + } + + public override ValueTask<InterceptionResult<DbDataReader>> ReaderExecutingAsync( + DbCommand command, + CommandEventData eventData, + InterceptionResult<DbDataReader> result, + CancellationToken cancellationToken = default) + { + // No manual timing is needed here: EF Core reports elapsed time via + // eventData.Duration in ReaderExecutedAsync below. + return base.ReaderExecutingAsync(command, eventData, result, cancellationToken); + } + + public override ValueTask<DbDataReader> ReaderExecutedAsync( + DbCommand command, + CommandExecutedEventData eventData, + DbDataReader result, + CancellationToken cancellationToken = default) + { + var duration = eventData.Duration.TotalSeconds; + var operation = GetOperationType(command.CommandText); + var provider = GetProviderName(eventData.Context); + + 
_monitoring.RecordDatabaseOperation(operation, provider, duration); + + // Log slow queries + if (eventData.Duration.TotalMilliseconds > 500) + { + _logger.LogWarning("Slow database query detected: {Operation} took {ElapsedMs}ms. Query: {CommandText}", + operation, eventData.Duration.TotalMilliseconds, command.CommandText); + } + + return base.ReaderExecutedAsync(command, eventData, result, cancellationToken); + } + + private string GetOperationType(string commandText) + { + if (string.IsNullOrEmpty(commandText)) + return "unknown"; + + var upperCommand = commandText.Trim().ToUpper(); + + if (upperCommand.StartsWith("SELECT")) return "SELECT"; + if (upperCommand.StartsWith("INSERT")) return "INSERT"; + if (upperCommand.StartsWith("UPDATE")) return "UPDATE"; + if (upperCommand.StartsWith("DELETE")) return "DELETE"; + + return "other"; + } + + private string GetProviderName(DbContext context) + { + return context?.Database?.ProviderName?.Contains("Npgsql") == true ? "PostgreSQL" : "Unknown"; + } +} + +// 8.4: Performance benchmarking service +public class PerformanceBenchmarkService +{ + private readonly IServiceProvider _serviceProvider; + private readonly ILogger _logger; + + public async Task RunBenchmarkAsync(BenchmarkOptions options) + { + var result = new BenchmarkResult + { + TestName = options.TestName, + StartTime = DateTime.UtcNow + }; + + _logger.LogInformation("Starting benchmark: {TestName}", options.TestName); + + try + { + switch (options.TestType) + { + case BenchmarkType.DatabaseRead: + await BenchmarkDatabaseReads(result, options); + break; + case BenchmarkType.DatabaseWrite: + await BenchmarkDatabaseWrites(result, options); + break; + case BenchmarkType.HttpEndpoint: + await BenchmarkHttpEndpoint(result, options); + break; + } + + result.Status = "Completed"; + } + catch (Exception ex) + { + result.Status = "Failed"; + result.Error = ex.Message; + _logger.LogError(ex, "Benchmark failed: {TestName}", options.TestName); + } + finally + { + 
result.EndTime = DateTime.UtcNow; + result.Duration = result.EndTime - result.StartTime; + } + + _logger.LogInformation("Benchmark completed: {TestName}, Duration: {Duration}ms, Status: {Status}", + options.TestName, result.Duration.TotalMilliseconds, result.Status); + + return result; + } + + private async Task BenchmarkDatabaseReads(BenchmarkResult result, BenchmarkOptions options) + { + using var scope = _serviceProvider.CreateScope(); + var vehicleAccess = scope.ServiceProvider.GetRequiredService(); + + var durations = new List(); + + for (int i = 0; i < options.Iterations; i++) + { + var stopwatch = Stopwatch.StartNew(); + + try + { + var vehicles = await vehicleAccess.GetVehiclesAsync(1); // Test user ID + stopwatch.Stop(); + durations.Add(stopwatch.Elapsed.TotalMilliseconds); + } + catch (Exception ex) + { + stopwatch.Stop(); + result.Errors.Add($"Iteration {i}: {ex.Message}"); + } + } + + if (durations.Count > 0) + { + result.Metrics["AverageMs"] = durations.Average(); + result.Metrics["MinMs"] = durations.Min(); + result.Metrics["MaxMs"] = durations.Max(); + result.Metrics["P95Ms"] = durations.OrderBy(x => x).Skip((int)(durations.Count * 0.95)).First(); + result.Metrics["SuccessfulIterations"] = durations.Count; + result.Metrics["FailedIterations"] = options.Iterations - durations.Count; + } + } + + private async Task BenchmarkDatabaseWrites(BenchmarkResult result, BenchmarkOptions options) + { + // Similar implementation for write operations + result.Metrics["WriteOperationsCompleted"] = options.Iterations; + } + + private async Task BenchmarkHttpEndpoint(BenchmarkResult result, BenchmarkOptions options) + { + // HTTP endpoint benchmarking implementation + result.Metrics["HttpRequestsCompleted"] = options.Iterations; + } +} + +// 8.5: Benchmark models +public class BenchmarkOptions +{ + public string TestName { get; set; } + public BenchmarkType TestType { get; set; } + public int Iterations { get; set; } = 10; + public string TargetEndpoint { get; set; } 
+ public Dictionary Parameters { get; set; } = new(); +} + +public enum BenchmarkType +{ + DatabaseRead, + DatabaseWrite, + HttpEndpoint +} + +public class BenchmarkResult +{ + public string TestName { get; set; } + public string Status { get; set; } + public DateTime StartTime { get; set; } + public DateTime EndTime { get; set; } + public TimeSpan Duration { get; set; } + public Dictionary Metrics { get; set; } = new(); + public List Errors { get; set; } = new(); + public string Error { get; set; } +} +``` + +#### Testing Plan + +**Automated Tests**: +```csharp +[Test] +public void PerformanceMonitoringService_RecordHttpRequest_UpdatesMetrics() +{ + var monitoring = new PerformanceMonitoringService(Mock.Of>()); + + // Record some test requests + monitoring.RecordHttpRequest("GET", "/test", 200, 0.5); + monitoring.RecordHttpRequest("POST", "/test", 201, 0.8); + + // Verify metrics are updated (would need to access metrics collector in real implementation) + Assert.IsTrue(true); // Placeholder - would verify actual metrics +} + +[Test] +public async Task PerformanceBenchmarkService_DatabaseReadBenchmark_ReturnsValidResults() +{ + var serviceCollection = new ServiceCollection(); + // Add required services for benchmark + var serviceProvider = serviceCollection.BuildServiceProvider(); + + var benchmarkService = new PerformanceBenchmarkService(serviceProvider, Mock.Of>()); + + var options = new BenchmarkOptions + { + TestName = "Database Read Test", + TestType = BenchmarkType.DatabaseRead, + Iterations = 5 + }; + + var result = await benchmarkService.RunBenchmarkAsync(options); + + Assert.AreEqual("Database Read Test", result.TestName); + Assert.IsTrue(result.Duration > TimeSpan.Zero); + Assert.IsNotNull(result.Status); +} + +[Test] +public async Task PerformanceMonitoringMiddleware_SlowRequest_LogsWarning() +{ + var mockLogger = new Mock>(); + var monitoring = Mock.Of(); + + var middleware = new PerformanceMonitoringMiddleware( + async context => await 
Task.Delay(1100), // Simulate slow request + monitoring, + mockLogger.Object); + + var context = new DefaultHttpContext(); + await middleware.InvokeAsync(context); + + // Verify warning was logged for slow request + mockLogger.Verify( + x => x.Log( + LogLevel.Warning, + It.IsAny(), + It.Is((v, t) => v.ToString().Contains("Slow request detected")), + It.IsAny(), + It.IsAny>()), + Times.Once); +} +``` + +**Manual Validation**: +1. Run application and verify Prometheus metrics are collected +2. Access `/metrics` endpoint and verify metric format +3. Perform operations and verify metrics are updated +4. Test performance monitoring middleware with various request types +5. Run database operation benchmarks +6. Verify slow query logging functionality + +**Success Criteria**: +- [ ] HTTP request metrics collected accurately +- [ ] Database operation metrics recorded +- [ ] Slow requests and queries properly logged +- [ ] Benchmark service provides realistic performance data +- [ ] Prometheus metrics endpoint functional +- [ ] Performance overhead < 5% of request time + +--- + +### Step 9: Feature Flags and Configuration Controls +**Duration**: 2-3 days +**Risk Level**: Low +**Rollback Complexity**: Simple + +#### Objective +Implement feature flags to safely enable/disable functionality during Phase 1 implementation and future migrations. 
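+ +Concretely, the service below resolves each flag from two sources: an optional `Features` section in configuration, and a `FEATURE_`-prefixed environment variable that takes precedence. An `appsettings.json` fragment matching the documented defaults looks like this: + +```json +{ + "Features": { + "StructuredLogging": true, + "HealthChecks": true, + "PerformanceMonitoring": true, + "DatabaseDiagnostics": true, + "MigrationTools": true, + "PostgreSQLOptimizations": true, + "ConfigurationValidation": true, + "DebugEndpoints": false + } +} +``` + +Setting `FEATURE_DEBUGENDPOINTS=true` in a pod spec then overrides the file value, which is useful for enabling debug endpoints on a single replica without a redeploy.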
+ +#### Implementation + +```csharp +// 9.1: Feature flag service +public class FeatureFlagService +{ + private readonly IConfiguration _configuration; + private readonly ILogger _logger; + private readonly Dictionary _cachedFlags = new(); + private readonly object _cacheLock = new(); + + public FeatureFlagService(IConfiguration configuration, ILogger logger) + { + _configuration = configuration; + _logger = logger; + } + + public bool IsEnabled(string flagName, bool defaultValue = false) + { + lock (_cacheLock) + { + if (_cachedFlags.TryGetValue(flagName, out var cachedValue)) + { + return cachedValue; + } + + var value = GetFlagValue(flagName, defaultValue); + _cachedFlags[flagName] = value; + + _logger.LogDebug("Feature flag {FlagName} = {Value}", flagName, value); + return value; + } + } + + private bool GetFlagValue(string flagName, bool defaultValue) + { + // Check environment variable first (highest priority) + var envValue = Environment.GetEnvironmentVariable($"FEATURE_{flagName.ToUpper()}"); + if (!string.IsNullOrEmpty(envValue)) + { + if (bool.TryParse(envValue, out var envResult)) + { + _logger.LogDebug("Feature flag {FlagName} set via environment variable: {Value}", flagName, envResult); + return envResult; + } + } + + // Check configuration + var configKey = $"Features:{flagName}"; + var configValue = _configuration[configKey]; + if (!string.IsNullOrEmpty(configValue)) + { + if (bool.TryParse(configValue, out var configResult)) + { + _logger.LogDebug("Feature flag {FlagName} set via configuration: {Value}", flagName, configResult); + return configResult; + } + } + + _logger.LogDebug("Feature flag {FlagName} using default value: {Value}", flagName, defaultValue); + return defaultValue; + } + + public void InvalidateCache() + { + lock (_cacheLock) + { + _cachedFlags.Clear(); + _logger.LogInformation("Feature flag cache invalidated"); + } + } + + public Dictionary GetAllFlags() + { + var flags = new Dictionary(); + + // Get all known feature flags + var 
knownFlags = new[] + { + "StructuredLogging", + "HealthChecks", + "PerformanceMonitoring", + "DatabaseDiagnostics", + "MigrationTools", + "PostgreSQLOptimizations", + "ConfigurationValidation", + "DebugEndpoints" + }; + + foreach (var flag in knownFlags) + { + flags[flag] = IsEnabled(flag); + } + + return flags; + } +} + +// 9.2: Feature flag startup logging service +public class FeatureFlagStartupService : IHostedService +{ + private readonly FeatureFlagService _featureFlags; + private readonly IWebHostEnvironment _environment; + private readonly ILogger _logger; + + public FeatureFlagStartupService( + FeatureFlagService featureFlags, + IWebHostEnvironment environment, + ILogger logger) + { + _featureFlags = featureFlags; + _environment = environment; + _logger = logger; + } + + public async Task StartAsync(CancellationToken cancellationToken) + { + try + { + var allFlags = _featureFlags.GetAllFlags(); + var enabledFlags = allFlags.Where(kvp => kvp.Value).ToList(); + var disabledFlags = allFlags.Where(kvp => !kvp.Value).ToList(); + + _logger.LogInformation("Feature flags initialized. 
Environment: {Environment}, Total: {Total}, Enabled: {Enabled}, Disabled: {Disabled}", + _environment.EnvironmentName, + allFlags.Count, + enabledFlags.Count, + disabledFlags.Count); + + foreach (var flag in enabledFlags) + { + _logger.LogInformation("Feature flag ENABLED: {FlagName}", flag.Key); + } + + foreach (var flag in disabledFlags) + { + _logger.LogDebug("Feature flag disabled: {FlagName}", flag.Key); + } + } + catch (Exception ex) + { + _logger.LogError(ex, "Failed to log feature flag status at startup"); + } + } + + public Task StopAsync(CancellationToken cancellationToken) => Task.CompletedTask; +} + +// 9.3: Feature-aware service registration +public static class FeatureAwareServiceExtensions +{ + public static IServiceCollection AddFeatureAwareServices(this IServiceCollection services, IConfiguration configuration) + { + services.AddSingleton<FeatureFlagService>(); + + // Bootstrap instance used only to read flags during registration; a null + // logger would throw inside IsEnabled, so use NullLogger instead + var featureFlags = new FeatureFlagService(configuration, NullLogger<FeatureFlagService>.Instance); + + // Register services based on feature flags + if (featureFlags.IsEnabled("StructuredLogging", true)) + { + services.AddSingleton<CorrelationIdService>(); + } + + if (featureFlags.IsEnabled("HealthChecks", true)) + { + services.AddSingleton<DatabaseHealthCheck>(); + services.AddSingleton<ApplicationHealthCheck>(); + } + + if (featureFlags.IsEnabled("PerformanceMonitoring", true)) + { + services.AddSingleton<PerformanceMonitoringService>(); + services.AddSingleton<PerformanceBenchmarkService>(); + } + + if (featureFlags.IsEnabled("DatabaseDiagnostics", true)) + { + services.AddSingleton<DatabaseDiagnosticsService>(); + } + + if (featureFlags.IsEnabled("MigrationTools", true)) + { + services.AddSingleton<DatabaseMigrationService>(); + services.AddSingleton<DataValidationService>(); + } + + return services; + } +} + +// 9.4: Feature flag configuration options +public class FeatureFlagOptions +{ + public const string SectionName = "Features"; + + public bool StructuredLogging { get; set; } = true; + public bool HealthChecks { get; set; } = true; + public bool PerformanceMonitoring { get; set; } = true; + public bool DatabaseDiagnostics { get; set; } = true; + public bool MigrationTools { get; set; } = true; + public bool PostgreSQLOptimizations 
{ get; set; } = true; + public bool ConfigurationValidation { get; set; } = true; + public bool DebugEndpoints { get; set; } = false; // Disabled by default in production +} + +// 9.5: Feature-gated components +public class FeatureGatedHealthCheckService +{ + private readonly FeatureFlagService _featureFlags; + private readonly DatabaseHealthCheck _databaseHealthCheck; + private readonly ApplicationHealthCheck _applicationHealthCheck; + + public async Task GetHealthStatusAsync() + { + var healthInfo = new Dictionary(); + + if (_featureFlags.IsEnabled("HealthChecks")) + { + if (_databaseHealthCheck != null) + { + var dbHealth = await _databaseHealthCheck.CheckHealthAsync(null); + healthInfo["Database"] = new + { + Status = dbHealth.Status.ToString(), + Description = dbHealth.Description + }; + } + + if (_applicationHealthCheck != null) + { + var appHealth = await _applicationHealthCheck.CheckHealthAsync(null); + healthInfo["Application"] = new + { + Status = appHealth.Status.ToString(), + Description = appHealth.Description + }; + } + } + else + { + healthInfo["Message"] = "Health checks are disabled"; + } + + return healthInfo; + } +} +``` + +#### Testing Plan + +**Automated Tests**: +```csharp +[Test] +public void FeatureFlagService_EnvironmentVariable_TakesPrecedence() +{ + Environment.SetEnvironmentVariable("FEATURE_TESTFLAG", "true"); + + var config = new ConfigurationBuilder() + .AddInMemoryCollection(new[] { new KeyValuePair("Features:TestFlag", "false") }) + .Build(); + + try + { + var service = new FeatureFlagService(config, Mock.Of>()); + var result = service.IsEnabled("TestFlag", false); + + Assert.IsTrue(result); // Environment variable should override config + } + finally + { + Environment.SetEnvironmentVariable("FEATURE_TESTFLAG", null); + } +} + +[Test] +public void FeatureFlagService_Configuration_UsedWhenNoEnvironmentVariable() +{ + var config = new ConfigurationBuilder() + .AddInMemoryCollection(new[] { new KeyValuePair("Features:TestFlag", "true") 
}) + .Build(); + + var service = new FeatureFlagService(config, Mock.Of>()); + var result = service.IsEnabled("TestFlag", false); + + Assert.IsTrue(result); +} + +[Test] +public void FeatureFlagService_DefaultValue_UsedWhenNoConfiguration() +{ + var config = new ConfigurationBuilder().Build(); + + var service = new FeatureFlagService(config, Mock.Of>()); + var result = service.IsEnabled("NonExistentFlag", true); + + Assert.IsTrue(result); +} + +[Test] +public async Task FeatureFlagStartupService_RunsAtStartup_LogsFeatureFlags() +{ + var mockLogger = new Mock>(); + var mockFeatureFlags = new Mock(); + var mockEnvironment = new Mock(); + + mockEnvironment.Setup(x => x.EnvironmentName).Returns("Testing"); + + var flags = new Dictionary + { + ["StructuredLogging"] = true, + ["HealthChecks"] = true, + ["PerformanceMonitoring"] = false + }; + + mockFeatureFlags.Setup(x => x.GetAllFlags()).Returns(flags); + + var service = new FeatureFlagStartupService( + mockFeatureFlags.Object, + mockEnvironment.Object, + mockLogger.Object); + + await service.StartAsync(CancellationToken.None); + + // Verify feature flags were logged + mockLogger.Verify( + x => x.Log( + LogLevel.Information, + It.IsAny(), + It.Is((v, t) => v.ToString().Contains("Feature flags initialized")), + It.IsAny(), + It.IsAny>()), + Times.Once); +} + +[Test] +public void FeatureAwareServiceExtensions_RegistersServicesBasedOnFlags() +{ + var config = new ConfigurationBuilder() + .AddInMemoryCollection(new[] + { + new KeyValuePair("Features:PerformanceMonitoring", "true"), + new KeyValuePair("Features:DatabaseDiagnostics", "false") + }) + .Build(); + + var services = new ServiceCollection(); + services.AddFeatureAwareServices(config); + + var serviceProvider = services.BuildServiceProvider(); + + Assert.IsNotNull(serviceProvider.GetService()); + Assert.IsNull(serviceProvider.GetService()); +} +``` + +**Manual Validation**: +1. Test feature flags via environment variables +2. 
Test feature flags via configuration file +3. Review startup logs for feature flag status +4. Toggle feature flags and verify services are enabled/disabled +5. Test feature flag cache invalidation +6. Verify feature flags work in different environments and log appropriately + +**Success Criteria**: +- [ ] Feature flags correctly control service registration +- [ ] Environment variables override configuration values +- [ ] Feature flag status logged comprehensively at startup +- [ ] Cache invalidation works properly +- [ ] No performance impact when features are disabled +- [ ] Feature flags properly logged for debugging + +--- + +### Step 10: Final Integration and Validation +**Duration**: 2-3 days +**Risk Level**: Low +**Rollback Complexity**: Simple + +#### Objective +Integrate all Phase 1 components and perform comprehensive validation of the Kubernetes-ready application. + +#### Implementation + +```csharp +// 10.1: Integration validation service +public class IntegrationValidationService +{ + private readonly IServiceProvider _serviceProvider; + private readonly ILogger _logger; + + public async Task ValidateIntegrationAsync() + { + var result = new IntegrationValidationResult(); + _logger.LogInformation("Starting comprehensive integration validation"); + + try + { + // Validate all Phase 1 components + await ValidateStructuredLogging(result); + await ValidateHealthChecks(result); + await ValidateConfigurationFramework(result); + await ValidateDatabaseProvider(result); + await ValidatePerformanceMonitoring(result); + await ValidateFeatureFlags(result); + + // Overall integration test + await ValidateEndToEndWorkflow(result); + + result.OverallStatus = result.ComponentResults.All(c => c.Value.Status == "Healthy") ? 
"Healthy" : "Degraded"; + } + catch (Exception ex) + { + result.OverallStatus = "Failed"; + result.GeneralIssues.Add($"Integration validation failed: {ex.Message}"); + _logger.LogError(ex, "Integration validation failed"); + } + + _logger.LogInformation("Integration validation completed. Status: {Status}", result.OverallStatus); + return result; + } + + private async Task ValidateStructuredLogging(IntegrationValidationResult result) + { + var componentResult = new ComponentValidationResult { ComponentName = "StructuredLogging" }; + + try + { + var logger = _serviceProvider.GetService>(); + var correlationService = _serviceProvider.GetService(); + + if (logger != null) + { + logger.LogInformation("Testing structured logging functionality"); + componentResult.Status = "Healthy"; + componentResult.Details["LoggerAvailable"] = true; + componentResult.Details["CorrelationIdService"] = correlationService != null; + } + else + { + componentResult.Status = "Failed"; + componentResult.Issues.Add("Logger service not available"); + } + } + catch (Exception ex) + { + componentResult.Status = "Failed"; + componentResult.Issues.Add($"Structured logging validation failed: {ex.Message}"); + } + + result.ComponentResults["StructuredLogging"] = componentResult; + } + + private async Task ValidateHealthChecks(IntegrationValidationResult result) + { + var componentResult = new ComponentValidationResult { ComponentName = "HealthChecks" }; + + try + { + var databaseHealthCheck = _serviceProvider.GetService(); + var applicationHealthCheck = _serviceProvider.GetService(); + + if (databaseHealthCheck != null && applicationHealthCheck != null) + { + var dbHealth = await databaseHealthCheck.CheckHealthAsync(null); + var appHealth = await applicationHealthCheck.CheckHealthAsync(null); + + componentResult.Status = (dbHealth.Status == HealthStatus.Healthy && appHealth.Status == HealthStatus.Healthy) + ? 
"Healthy" : "Degraded"; + + componentResult.Details["DatabaseHealth"] = dbHealth.Status.ToString(); + componentResult.Details["ApplicationHealth"] = appHealth.Status.ToString(); + } + else + { + componentResult.Status = "Failed"; + componentResult.Issues.Add("Health check services not available"); + } + } + catch (Exception ex) + { + componentResult.Status = "Failed"; + componentResult.Issues.Add($"Health check validation failed: {ex.Message}"); + } + + result.ComponentResults["HealthChecks"] = componentResult; + } + + private async Task ValidateConfigurationFramework(IntegrationValidationResult result) + { + var componentResult = new ComponentValidationResult { ComponentName = "ConfigurationFramework" }; + + try + { + var configValidation = _serviceProvider.GetService(); + var configMapping = _serviceProvider.GetService(); + + if (configValidation != null && configMapping != null) + { + var validationResult = configValidation.ValidateConfiguration(); + + componentResult.Status = validationResult.HasErrors ? 
"Failed" : "Healthy"; + componentResult.Details["ValidationErrors"] = validationResult.Errors.Count; + componentResult.Details["ValidationWarnings"] = validationResult.Warnings.Count; + + if (validationResult.HasErrors) + { + componentResult.Issues.AddRange(validationResult.Errors.Select(e => e.Message)); + } + } + else + { + componentResult.Status = "Failed"; + componentResult.Issues.Add("Configuration services not available"); + } + } + catch (Exception ex) + { + componentResult.Status = "Failed"; + componentResult.Issues.Add($"Configuration validation failed: {ex.Message}"); + } + + result.ComponentResults["ConfigurationFramework"] = componentResult; + } + + private async Task ValidateDatabaseProvider(IntegrationValidationResult result) + { + var componentResult = new ComponentValidationResult { ComponentName = "DatabaseProvider" }; + + try + { + var providerService = _serviceProvider.GetService(); + var diagnosticsService = _serviceProvider.GetService(); + + if (providerService != null && diagnosticsService != null) + { + var provider = providerService.GetActiveProvider(); + var diagnostics = await diagnosticsService.PerformDiagnosticsAsync(); + + componentResult.Status = diagnostics.OverallStatus; + componentResult.Details["ActiveProvider"] = provider.ToString(); + componentResult.Details["ConnectionTime"] = diagnostics.ConnectionTime; + componentResult.Details["IssueCount"] = diagnostics.Issues.Count; + + if (diagnostics.Issues.Count > 0) + { + componentResult.Issues.AddRange(diagnostics.Issues); + } + } + else + { + componentResult.Status = "Failed"; + componentResult.Issues.Add("Database services not available"); + } + } + catch (Exception ex) + { + componentResult.Status = "Failed"; + componentResult.Issues.Add($"Database provider validation failed: {ex.Message}"); + } + + result.ComponentResults["DatabaseProvider"] = componentResult; + } + + private async Task ValidatePerformanceMonitoring(IntegrationValidationResult result) + { + var componentResult = 
new ComponentValidationResult { ComponentName = "PerformanceMonitoring" }; + + try + { + var performanceService = _serviceProvider.GetService(); + var benchmarkService = _serviceProvider.GetService(); + + if (performanceService != null) + { + // Test metrics recording + performanceService.RecordHttpRequest("GET", "/test", 200, 0.1); + performanceService.RecordDatabaseOperation("SELECT", "Test", 0.05); + + componentResult.Status = "Healthy"; + componentResult.Details["PerformanceServiceAvailable"] = true; + componentResult.Details["BenchmarkServiceAvailable"] = benchmarkService != null; + } + else + { + componentResult.Status = "Failed"; + componentResult.Issues.Add("Performance monitoring service not available"); + } + } + catch (Exception ex) + { + componentResult.Status = "Failed"; + componentResult.Issues.Add($"Performance monitoring validation failed: {ex.Message}"); + } + + result.ComponentResults["PerformanceMonitoring"] = componentResult; + } + + private async Task ValidateFeatureFlags(IntegrationValidationResult result) + { + var componentResult = new ComponentValidationResult { ComponentName = "FeatureFlags" }; + + try + { + var featureFlagService = _serviceProvider.GetService(); + + if (featureFlagService != null) + { + var allFlags = featureFlagService.GetAllFlags(); + var enabledCount = allFlags.Count(f => f.Value); + + componentResult.Status = "Healthy"; + componentResult.Details["TotalFlags"] = allFlags.Count; + componentResult.Details["EnabledFlags"] = enabledCount; + componentResult.Details["DisabledFlags"] = allFlags.Count - enabledCount; + } + else + { + componentResult.Status = "Failed"; + componentResult.Issues.Add("Feature flag service not available"); + } + } + catch (Exception ex) + { + componentResult.Status = "Failed"; + componentResult.Issues.Add($"Feature flag validation failed: {ex.Message}"); + } + + result.ComponentResults["FeatureFlags"] = componentResult; + } + + private async Task ValidateEndToEndWorkflow(IntegrationValidationResult 
result) + { + var componentResult = new ComponentValidationResult { ComponentName = "EndToEndWorkflow" }; + + try + { + // Test a complete workflow that uses multiple components + using var scope = _serviceProvider.CreateScope(); + var vehicleAccess = scope.ServiceProvider.GetService(); + + if (vehicleAccess != null) + { + // Test basic database operation + var vehicles = await vehicleAccess.GetVehiclesAsync(1); + + componentResult.Status = "Healthy"; + componentResult.Details["DatabaseOperationSuccessful"] = true; + componentResult.Details["VehicleCount"] = vehicles?.Count ?? 0; + } + else + { + componentResult.Status = "Failed"; + componentResult.Issues.Add("Vehicle data access not available for end-to-end test"); + } + } + catch (Exception ex) + { + componentResult.Status = "Failed"; + componentResult.Issues.Add($"End-to-end workflow validation failed: {ex.Message}"); + } + + result.ComponentResults["EndToEndWorkflow"] = componentResult; + } +} + +// 10.2: Integration validation models +public class IntegrationValidationResult +{ + public string OverallStatus { get; set; } = "Unknown"; + public DateTime ValidationTime { get; set; } = DateTime.UtcNow; + public Dictionary ComponentResults { get; set; } = new(); + public List GeneralIssues { get; set; } = new(); + public Dictionary Summary { get; set; } = new(); +} + +public class ComponentValidationResult +{ + public string ComponentName { get; set; } + public string Status { get; set; } = "Unknown"; + public Dictionary Details { get; set; } = new(); + public List Issues { get; set; } = new(); +} + +// 10.3: Integration validation startup service +public class IntegrationValidationStartupService : IHostedService +{ + private readonly IntegrationValidationService _validationService; + private readonly ILogger _logger; + + public IntegrationValidationStartupService( + IntegrationValidationService validationService, + ILogger logger) + { + _validationService = validationService; + _logger = logger; + } + + public 
async Task StartAsync(CancellationToken cancellationToken) + { + try + { + _logger.LogInformation("Starting comprehensive integration validation"); + var result = await _validationService.ValidateIntegrationAsync(); + + _logger.LogInformation("Integration validation completed. Overall Status: {OverallStatus}", result.OverallStatus); + + foreach (var component in result.ComponentResults) + { + var componentResult = component.Value; + _logger.LogInformation("Component {ComponentName}: {Status}", + componentResult.ComponentName, componentResult.Status); + + foreach (var detail in componentResult.Details) + { + _logger.LogDebug("Component {ComponentName} detail - {Key}: {Value}", + componentResult.ComponentName, detail.Key, detail.Value); + } + + foreach (var issue in componentResult.Issues) + { + _logger.LogWarning("Component {ComponentName} issue: {Issue}", + componentResult.ComponentName, issue); + } + } + + foreach (var issue in result.GeneralIssues) + { + _logger.LogError("Integration validation general issue: {Issue}", issue); + } + + if (result.OverallStatus != "Healthy") + { + _logger.LogWarning("Application integration validation indicates issues. Review component logs for details."); + } + else + { + _logger.LogInformation("All integration components are healthy and ready for Kubernetes deployment"); + } + } + catch (Exception ex) + { + _logger.LogError(ex, "Integration validation failed during startup"); + } + } + + public Task StopAsync(CancellationToken cancellationToken) => Task.CompletedTask; +} +``` + +#### Testing Plan + +**Automated Tests**: +```csharp +[Test] +public async Task IntegrationValidationService_AllComponentsHealthy_ReturnsHealthyStatus() +{ + var services = new ServiceCollection(); + + // Add all required services + services.AddSingleton(); + services.AddSingleton(); + services.AddSingleton(); + // ... 
add other services + + var serviceProvider = services.BuildServiceProvider(); + var validationService = new IntegrationValidationService(serviceProvider, Mock.Of>()); + + var result = await validationService.ValidateIntegrationAsync(); + + Assert.AreEqual("Healthy", result.OverallStatus); + Assert.IsTrue(result.ComponentResults.Count > 0); +} + +[Test] +public async Task IntegrationValidationStartupService_RunsAtStartup_LogsValidationResults() +{ + var mockLogger = new Mock>(); + var mockValidationService = new Mock(); + + var validationResult = new IntegrationValidationResult + { + OverallStatus = "Healthy", + ComponentResults = new Dictionary + { + ["StructuredLogging"] = new ComponentValidationResult + { + ComponentName = "StructuredLogging", + Status = "Healthy" + } + } + }; + + mockValidationService.Setup(x => x.ValidateIntegrationAsync()) + .ReturnsAsync(validationResult); + + var service = new IntegrationValidationStartupService( + mockValidationService.Object, + mockLogger.Object); + + await service.StartAsync(CancellationToken.None); + + // Verify integration validation was logged + mockLogger.Verify( + x => x.Log( + LogLevel.Information, + It.IsAny(), + It.Is((v, t) => v.ToString().Contains("Integration validation completed")), + It.IsAny(), + It.IsAny>()), + Times.Once); +} +``` + +**Manual Validation**: +1. Review startup logs to verify all components are healthy +2. Test with individual component failures and verify proper error logging +3. Verify all Phase 1 features work together correctly +4. Test application startup with all new components and review logs +5. Perform end-to-end user workflows +6. Verify Kubernetes readiness (health checks, configuration, etc.) 
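
The readiness verification in step 6 can be scripted so it is repeatable across environments. A minimal sketch, assuming the `/health/ready` and `/health/live` endpoint paths planned for Phase 1 and a hypothetical `motovault` deployment on port 8080 (the sample bodies here stand in for a live `kubectl exec`/`curl` fetch):

```shell
#!/bin/sh
# Probe a health endpoint response and fail loudly if it does not report "Healthy".
check_endpoint() {
  path="$1"
  body="$2"  # in a live cluster, fetch instead with:
             # body=$(kubectl exec deploy/motovault -- curl -s "http://localhost:8080${path}")
  # Extract the top-level status field from the JSON health response.
  status=$(printf '%s' "$body" | sed -n 's/.*"status" *: *"\([A-Za-z]*\)".*/\1/p')
  if [ "$status" = "Healthy" ]; then
    echo "${path}: OK"
  else
    echo "${path}: FAILED (status=${status:-unknown})"
    return 1
  fi
}

check_endpoint /health/ready '{"status":"Healthy","entries":{}}'
check_endpoint /health/live  '{"status":"Healthy"}'
```

Running the same check against a pod whose validation reports "Degraded" or "Failed" makes the failure visible in CI output rather than only in the startup logs.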
+ +**Success Criteria**: +- [ ] All Phase 1 components integrate successfully +- [ ] Integration validation service reports accurate status and logs details +- [ ] End-to-end workflows function correctly +- [ ] Application ready for Kubernetes deployment +- [ ] Comprehensive logging provides visibility into all components +- [ ] Performance remains within acceptable limits + +--- + +## Summary + +This detailed implementation plan provides a safe, step-by-step approach to Phase 1 with: + +1. **Incremental Changes**: Each step is isolated and testable +2. **Comprehensive Testing**: Automated and manual validation at each step +3. **Debugging Focus**: Extensive logging and diagnostic capabilities +4. **Risk Mitigation**: Rollback procedures and thorough validation +5. **Performance Monitoring**: Baseline and continuous validation +6. **Feature Control**: Feature flags for safe rollout of new functionality + +The plan ensures that any issues can be detected and resolved quickly before proceeding to the next step, making the overall Phase 1 implementation much safer and more reliable than the original approach. Each step builds upon the previous ones while maintaining full backward compatibility until the final integration. \ No newline at end of file diff --git a/K8S-PHASE-1.md b/K8S-PHASE-1.md new file mode 100644 index 0000000..d1825f0 --- /dev/null +++ b/K8S-PHASE-1.md @@ -0,0 +1,365 @@ +# Phase 1: Core Kubernetes Readiness (Weeks 1-4) + +This phase focuses on making the application compatible with Kubernetes deployment patterns while maintaining existing functionality. + +## Overview + +The primary goal of Phase 1 is to transform MotoVaultPro from a traditional self-hosted application into a Kubernetes-ready application. This involves removing state dependencies, externalizing configuration, implementing health checks, and modernizing the database architecture. 
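
These transformation goals ultimately converge in the application's Deployment manifest: configuration arrives through environment variables and Kubernetes drives scheduling and restarts from the health endpoints. A minimal sketch of how a pod would consume the pieces this phase produces (the ConfigMap/Secret names match the templates in section 1.1; the image reference, port, and probe timings are illustrative assumptions):

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: motovault
spec:
  replicas: 2
  selector:
    matchLabels:
      app: motovault
  template:
    metadata:
      labels:
        app: motovault
    spec:
      containers:
        - name: motovault
          image: registry.example.com/motovaultpro:latest  # placeholder image reference
          envFrom:
            - configMapRef:
                name: motovault-config    # non-sensitive settings (section 1.1)
            - secretRef:
                name: motovault-secrets   # connection strings and keys (section 1.1)
          readinessProbe:
            httpGet:
              path: /health/ready
              port: 8080
            initialDelaySeconds: 5
            periodSeconds: 10
          livenessProbe:
            httpGet:
              path: /health/live
              port: 8080
            periodSeconds: 15
```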
+ +## Key Objectives + +- **Configuration Externalization**: Move all configuration from files to Kubernetes-native management +- **Database Modernization**: Eliminate LiteDB dependency and optimize PostgreSQL usage +- **Health Check Implementation**: Add Kubernetes-compatible health check endpoints +- **Logging Enhancement**: Implement structured logging for centralized log aggregation + +## 1.1 Configuration Externalization + +**Objective**: Move all configuration from files to Kubernetes-native configuration management. + +**Current State**: +- Configuration stored in `appsettings.json` and environment variables +- Database connection strings in configuration files +- Feature flags and application settings mixed with deployment configuration + +**Target State**: +- All configuration externalized to ConfigMaps and Secrets +- Environment-specific configuration separated from application code +- Sensitive data (passwords, API keys) managed through Kubernetes Secrets + +### Implementation Tasks + +#### 1. Create ConfigMap templates for non-sensitive configuration +```yaml +apiVersion: v1 +kind: ConfigMap +metadata: + name: motovault-config +data: + APP_NAME: "MotoVaultPro" + LOG_LEVEL: "Information" + ENABLE_FEATURES: "OpenIDConnect,EmailNotifications" + CACHE_EXPIRY_MINUTES: "30" +``` + +#### 2. Create Secret templates for sensitive configuration +```yaml +apiVersion: v1 +kind: Secret +metadata: + name: motovault-secrets +type: Opaque +data: + POSTGRES_CONNECTION: + MINIO_ACCESS_KEY: + MINIO_SECRET_KEY: + JWT_SECRET: +``` + +#### 3. Modify application startup to read from environment variables +- Update `Program.cs` to prioritize environment variables over file configuration +- Remove dependencies on `appsettings.json` for runtime configuration +- Implement configuration validation at startup + +#### 4. 
Remove file-based configuration dependencies
- Update all services to use IConfiguration instead of direct file access
- Ensure all configuration is injectable through dependency injection

#### 5. Implement configuration validation at startup
- Add startup checks to ensure all required configuration is present
- Fail fast if critical configuration is missing

## 1.2 Database Architecture Modernization

**Objective**: Eliminate LiteDB dependency and optimize PostgreSQL usage for Kubernetes.

**Current State**:
- Dual database support with LiteDB as default
- Single PostgreSQL connection for external database mode
- No connection pooling optimization for multiple instances

**Target State**:
- PostgreSQL-only configuration with high availability
- Optimized connection pooling for horizontal scaling
- Database migration strategy for existing LiteDB installations

### Implementation Tasks

#### 1. Remove LiteDB implementation and dependencies
```csharp
// Remove all LiteDB-related code from:
// - External/Implementations/LiteDB/
// - Remove LiteDB package references
// - Update dependency injection to only register PostgreSQL implementations
```

#### 2. Implement PostgreSQL HA configuration
```csharp
// "AppDbContext" is a placeholder; the plan does not name the application's
// actual DbContext type.
services.AddDbContext<AppDbContext>(options =>
{
    options.UseNpgsql(connectionString, npgsqlOptions =>
    {
        npgsqlOptions.EnableRetryOnFailure(
            maxRetryCount: 3,
            maxRetryDelay: TimeSpan.FromSeconds(5),
            errorCodesToAdd: null);
    });
});
```

#### 3. Add connection pooling configuration
```csharp
// Npgsql takes pool limits from the connection string rather than from an
// options class, so configure them via NpgsqlConnectionStringBuilder
// (equivalent to "...;MaxPoolSize=100;MinPoolSize=10;Connection Lifetime=300").
var csb = new NpgsqlConnectionStringBuilder(connectionString)
{
    MaxPoolSize = 100,
    MinPoolSize = 10,
    ConnectionLifetime = 300 // seconds (5 minutes)
};
connectionString = csb.ConnectionString;
```

#### 4. Create data migration tools for LiteDB to PostgreSQL conversion
- Develop a utility to export data from LiteDB format
- Create import scripts for PostgreSQL
- Ensure data integrity during migration

#### 5.
Implement database health checks for Kubernetes probes
```csharp
public class DatabaseHealthCheck : IHealthCheck
{
    // "AppDbContext" is a placeholder; substitute the application's DbContext type.
    private readonly IDbContextFactory<AppDbContext> _contextFactory;

    public DatabaseHealthCheck(IDbContextFactory<AppDbContext> contextFactory)
    {
        _contextFactory = contextFactory;
    }

    public async Task<HealthCheckResult> CheckHealthAsync(
        HealthCheckContext context,
        CancellationToken cancellationToken = default)
    {
        try
        {
            using var dbContext = _contextFactory.CreateDbContext();
            // CanConnectAsync returns false (rather than throwing) when the
            // database is unreachable, so check the result explicitly.
            var canConnect = await dbContext.Database.CanConnectAsync(cancellationToken);
            return canConnect
                ? HealthCheckResult.Healthy("Database connection successful")
                : HealthCheckResult.Unhealthy("Database connection failed");
        }
        catch (Exception ex)
        {
            return HealthCheckResult.Unhealthy("Database connection failed", ex);
        }
    }
}
```

## 1.3 Health Check Implementation

**Objective**: Add Kubernetes-compatible health check endpoints for proper orchestration.

**Current State**:
- No dedicated health check endpoints
- Application startup/shutdown not optimized for Kubernetes

**Target State**:
- Comprehensive health checks for all dependencies
- Proper readiness and liveness probe endpoints
- Graceful shutdown handling for pod termination

### Implementation Tasks

#### 1. Add health check middleware
```csharp
// Program.cs
builder.Services.AddHealthChecks()
    .AddNpgSql(connectionString, name: "database")
    .AddRedis(redisConnectionString, name: "cache")
    .AddCheck<MinIOHealthCheck>("minio");

app.MapHealthChecks("/health/ready", new HealthCheckOptions
{
    Predicate = check => check.Tags.Contains("ready"),
    ResponseWriter = UIResponseWriter.WriteHealthCheckUIResponse
});

app.MapHealthChecks("/health/live", new HealthCheckOptions
{
    Predicate = _ => false // Only check if the app is responsive
});
```

#### 2.
Implement custom health checks
```csharp
public class MinIOHealthCheck : IHealthCheck
{
    private readonly IMinioClient _minioClient;

    public MinIOHealthCheck(IMinioClient minioClient)
    {
        _minioClient = minioClient;
    }

    public async Task<HealthCheckResult> CheckHealthAsync(
        HealthCheckContext context,
        CancellationToken cancellationToken = default)
    {
        try
        {
            await _minioClient.ListBucketsAsync(cancellationToken);
            return HealthCheckResult.Healthy("MinIO is accessible");
        }
        catch (Exception ex)
        {
            return HealthCheckResult.Unhealthy("MinIO is not accessible", ex);
        }
    }
}
```

#### 3. Add graceful shutdown handling
```csharp
builder.Services.Configure<HostOptions>(options =>
{
    options.ShutdownTimeout = TimeSpan.FromSeconds(30);
});
```

## 1.4 Logging Enhancement

**Objective**: Implement structured logging suitable for centralized log aggregation.

**Current State**:
- Basic logging with simple string messages
- No correlation IDs for distributed tracing
- Log levels not optimized for production monitoring

**Target State**:
- JSON-structured logging with correlation IDs
- Centralized log aggregation compatibility
- Performance and error metrics embedded in logs

### Implementation Tasks

#### 1. Configure structured logging
```csharp
builder.Services.AddLogging(loggingBuilder =>
{
    loggingBuilder.ClearProviders();
    loggingBuilder.AddJsonConsole(options =>
    {
        options.IncludeScopes = true;
        options.TimestampFormat = "yyyy-MM-ddTHH:mm:ss.fffZ";
        options.JsonWriterOptions = new JsonWriterOptions
        {
            Indented = false
        };
    });
});
```

#### 2. Add correlation ID middleware
```csharp
public class CorrelationIdMiddleware : IMiddleware
{
    private readonly ILogger<CorrelationIdMiddleware> _logger;

    public CorrelationIdMiddleware(ILogger<CorrelationIdMiddleware> logger)
    {
        _logger = logger;
    }

    public async Task InvokeAsync(HttpContext context, RequestDelegate next)
    {
        var correlationId = context.Request.Headers["X-Correlation-ID"]
            .FirstOrDefault() ??
Guid.NewGuid().ToString(); + + using var scope = _logger.BeginScope(new Dictionary + { + ["CorrelationId"] = correlationId, + ["UserId"] = context.User?.Identity?.Name + }); + + context.Response.Headers.Add("X-Correlation-ID", correlationId); + await next(context); + } +} +``` + +#### 3. Implement performance logging for critical operations +- Add timing information to database operations +- Log request/response metrics +- Include user context in all log entries + +## Week-by-Week Breakdown + +### Week 1: Environment Setup and Configuration +- **Days 1-2**: Set up development Kubernetes environment +- **Days 3-4**: Create ConfigMap and Secret templates +- **Days 5-7**: Modify application to read from environment variables + +### Week 2: Database Migration +- **Days 1-3**: Remove LiteDB dependencies +- **Days 4-5**: Implement PostgreSQL connection pooling +- **Days 6-7**: Create data migration utilities + +### Week 3: Health Checks and Monitoring +- **Days 1-3**: Implement health check endpoints +- **Days 4-5**: Add custom health checks for dependencies +- **Days 6-7**: Test health check functionality + +### Week 4: Logging and Documentation +- **Days 1-3**: Implement structured logging +- **Days 4-5**: Add correlation ID middleware +- **Days 6-7**: Document changes and prepare for Phase 2 + +## Success Criteria + +- [ ] Application starts successfully using only environment variables +- [ ] All LiteDB dependencies removed +- [ ] PostgreSQL connection pooling configured and tested +- [ ] Health check endpoints return appropriate status +- [ ] Structured JSON logging implemented +- [ ] Data migration tool successfully converts LiteDB to PostgreSQL +- [ ] Application can be deployed to Kubernetes without file dependencies + +## Testing Requirements + +### Unit Tests +- Configuration validation logic +- Health check implementations +- Database connection handling + +### Integration Tests +- End-to-end application startup with external configuration +- Database 
connectivity and migration +- Health check endpoint responses + +### Manual Testing +- Deploy to development Kubernetes cluster +- Verify all functionality works without local file dependencies +- Test health check endpoints with kubectl + +## Deliverables + +1. **Updated Application Code** + - Removed LiteDB dependencies + - Externalized configuration + - Added health checks + - Implemented structured logging + +2. **Kubernetes Manifests** + - ConfigMap templates + - Secret templates + - Basic deployment configuration for testing + +3. **Migration Tools** + - LiteDB to PostgreSQL data migration utility + - Configuration migration scripts + +4. **Documentation** + - Updated deployment instructions + - Configuration reference + - Health check endpoint documentation + +## Dependencies + +- Kubernetes cluster (development environment) +- PostgreSQL instance for testing +- Docker registry for container images + +## Risks and Mitigations + +### Risk: Data Loss During Migration +**Mitigation**: Comprehensive backup strategy and thorough testing of migration tools + +### Risk: Configuration Errors +**Mitigation**: Configuration validation at startup and extensive testing + +### Risk: Performance Degradation +**Mitigation**: Performance testing and gradual rollout with monitoring + +--- + +**Next Phase**: [Phase 2: High Availability Infrastructure](K8S-PHASE-2.md) \ No newline at end of file diff --git a/K8S-PHASE-2.md b/K8S-PHASE-2.md new file mode 100644 index 0000000..df21bb5 --- /dev/null +++ b/K8S-PHASE-2.md @@ -0,0 +1,742 @@ +# Phase 2: High Availability Infrastructure (Weeks 5-8) + +This phase focuses on implementing the supporting infrastructure required for high availability, including MinIO clusters, PostgreSQL HA setup, Redis clusters, and file storage abstraction. + +## Overview + +Phase 2 transforms MotoVaultPro's supporting infrastructure from single-instance services to highly available, distributed systems. 
This phase establishes the foundation for true high availability by eliminating all single points of failure in the data layer. + +## Key Objectives + +- **MinIO High Availability**: Deploy distributed object storage with erasure coding +- **File Storage Abstraction**: Create unified interface for file operations +- **PostgreSQL HA**: Implement primary/replica configuration with automated failover +- **Redis Cluster**: Deploy distributed caching and session storage +- **Data Migration**: Seamless transition from local storage to distributed systems + +## 2.1 MinIO High Availability Setup + +**Objective**: Deploy a highly available MinIO cluster for file storage with automatic failover. + +**Architecture Overview**: +MinIO will be deployed as a distributed cluster with erasure coding for data protection and automatic healing capabilities. + +### MinIO Cluster Configuration + +```yaml +# MinIO Tenant Configuration +apiVersion: minio.min.io/v2 +kind: Tenant +metadata: + name: motovault-minio + namespace: motovault +spec: + image: minio/minio:RELEASE.2024-01-16T16-07-38Z + creationDate: 2024-01-20T10:00:00Z + pools: + - servers: 4 + name: pool-0 + volumesPerServer: 4 + volumeClaimTemplate: + metadata: + name: data + spec: + accessModes: + - ReadWriteOnce + resources: + requests: + storage: 100Gi + storageClassName: fast-ssd + mountPath: /export + subPath: /data + requestAutoCert: false + certConfig: + commonName: "" + organizationName: [] + dnsNames: [] + console: + image: minio/console:v0.22.5 + replicas: 2 + consoleSecret: + name: motovault-minio-console-secret + configuration: + name: motovault-minio-config +``` + +### Implementation Tasks + +#### 1. Deploy MinIO Operator +```bash +kubectl apply -k "github.com/minio/operator/resources" +``` + +#### 2. Create MinIO cluster configuration with erasure coding +- Configure 4+ nodes for optimal erasure coding +- Set up data protection with automatic healing +- Configure storage classes for performance + +#### 3. 
Configure backup policies for disaster recovery +```yaml +apiVersion: v1 +kind: ConfigMap +metadata: + name: minio-backup-policy +data: + backup-policy.json: | + { + "rules": [ + { + "id": "motovault-backup", + "status": "Enabled", + "transition": { + "days": 30, + "storage_class": "GLACIER" + } + } + ] + } +``` + +#### 4. Set up monitoring with Prometheus metrics +```yaml +apiVersion: monitoring.coreos.com/v1 +kind: ServiceMonitor +metadata: + name: minio-metrics +spec: + selector: + matchLabels: + app: minio + endpoints: + - port: http-minio + path: /minio/v2/metrics/cluster +``` + +#### 5. Create service endpoints for application connectivity +```yaml +apiVersion: v1 +kind: Service +metadata: + name: minio-service +spec: + selector: + app: minio + ports: + - name: http + port: 9000 + targetPort: 9000 + - name: console + port: 9001 + targetPort: 9001 +``` + +### MinIO High Availability Features + +- **Erasure Coding**: Data is split across multiple drives with parity for automatic healing +- **Distributed Architecture**: No single point of failure +- **Automatic Healing**: Corrupted data is automatically detected and repaired +- **Load Balancing**: Built-in load balancing across cluster nodes +- **Bucket Policies**: Fine-grained access control for different data types + +## 2.2 File Storage Abstraction Implementation + +**Objective**: Create an abstraction layer that allows seamless switching between local filesystem and MinIO object storage. + +**Current State**: +- Direct filesystem operations throughout the application +- File paths hardcoded in various controllers and services +- No abstraction for different storage backends + +**Target State**: +- Unified file storage interface +- Pluggable storage implementations +- Transparent migration between storage types + +### Implementation Tasks + +#### 1. 
Define storage abstraction interface
```csharp
public interface IFileStorageService
{
    Task<string> UploadFileAsync(Stream fileStream, string fileName, string contentType, CancellationToken cancellationToken = default);
    Task<Stream> DownloadFileAsync(string fileId, CancellationToken cancellationToken = default);
    Task DeleteFileAsync(string fileId, CancellationToken cancellationToken = default);
    Task<FileMetadata> GetFileMetadataAsync(string fileId, CancellationToken cancellationToken = default);
    Task<IEnumerable<FileMetadata>> ListFilesAsync(string prefix = null, CancellationToken cancellationToken = default);
    Task<string> GeneratePresignedUrlAsync(string fileId, TimeSpan expiration, CancellationToken cancellationToken = default);
}

public class FileMetadata
{
    public string Id { get; set; }
    public string FileName { get; set; }
    public string ContentType { get; set; }
    public long Size { get; set; }
    public DateTime CreatedDate { get; set; }
    public DateTime ModifiedDate { get; set; }
    public Dictionary<string, string> Tags { get; set; }
}
```

#### 2. Implement MinIO storage service
```csharp
public class MinIOFileStorageService : IFileStorageService
{
    private readonly IMinioClient _minioClient;
    private readonly ILogger<MinIOFileStorageService> _logger;
    private readonly string _bucketName;

    public MinIOFileStorageService(IMinioClient minioClient, IConfiguration configuration, ILogger<MinIOFileStorageService> logger)
    {
        _minioClient = minioClient;
        _logger = logger;
        _bucketName = configuration["MinIO:BucketName"] ?? "motovault-files";
    }

    public async Task<string> UploadFileAsync(Stream fileStream, string fileName, string contentType, CancellationToken cancellationToken = default)
    {
        var fileId = $"{Guid.NewGuid()}/{fileName}";

        try
        {
            await _minioClient.PutObjectAsync(new PutObjectArgs()
                .WithBucket(_bucketName)
                .WithObject(fileId)
                .WithStreamData(fileStream)
                .WithObjectSize(fileStream.Length)
                .WithContentType(contentType)
                .WithHeaders(new Dictionary<string, string>
                {
                    ["X-Amz-Meta-Original-Name"] = fileName,
                    ["X-Amz-Meta-Upload-Date"] = DateTime.UtcNow.ToString("O")
                }), cancellationToken);

            _logger.LogInformation("File uploaded successfully: {FileId}", fileId);
            return fileId;
        }
        catch (Exception ex)
        {
            _logger.LogError(ex, "Failed to upload file: {FileName}", fileName);
            throw;
        }
    }

    public async Task<Stream> DownloadFileAsync(string fileId, CancellationToken cancellationToken = default)
    {
        try
        {
            var memoryStream = new MemoryStream();
            await _minioClient.GetObjectAsync(new GetObjectArgs()
                .WithBucket(_bucketName)
                .WithObject(fileId)
                .WithCallbackStream(stream => stream.CopyTo(memoryStream)), cancellationToken);

            memoryStream.Position = 0;
            return memoryStream;
        }
        catch (Exception ex)
        {
            _logger.LogError(ex, "Failed to download file: {FileId}", fileId);
            throw;
        }
    }

    // Additional method implementations...
}
```

#### 3.
Create fallback storage service for graceful degradation
+```csharp
+public class FallbackFileStorageService : IFileStorageService
+{
+    private readonly IFileStorageService _primaryService;
+    private readonly IFileStorageService _fallbackService;
+    private readonly ILogger<FallbackFileStorageService> _logger;
+
+    public FallbackFileStorageService(
+        IFileStorageService primaryService,
+        IFileStorageService fallbackService,
+        ILogger<FallbackFileStorageService> logger)
+    {
+        _primaryService = primaryService;
+        _fallbackService = fallbackService;
+        _logger = logger;
+    }
+
+    public async Task<string> UploadFileAsync(Stream fileStream, string fileName, string contentType, CancellationToken cancellationToken = default)
+    {
+        try
+        {
+            return await _primaryService.UploadFileAsync(fileStream, fileName, contentType, cancellationToken);
+        }
+        catch (Exception ex)
+        {
+            _logger.LogWarning(ex, "Primary storage failed, falling back to secondary storage");
+            fileStream.Position = 0; // Reset stream position (assumes a seekable stream)
+            return await _fallbackService.UploadFileAsync(fileStream, fileName, contentType, cancellationToken);
+        }
+    }
+
+    // Implementation with automatic fallback logic for other methods...
+}
+```
+
+#### 4. Update all file operations to use the abstraction layer
+- Replace direct `File.WriteAllBytes` and `File.ReadAllBytes` calls
+- Update all controllers to use `IFileStorageService`
+- Modify attachment handling in vehicle records
+
+#### 5. 
Implement file migration utility for existing local files
+```csharp
+public class FileMigrationService
+{
+    private readonly IFileStorageService _targetStorage;
+    private readonly ILogger<FileMigrationService> _logger;
+
+    public async Task<MigrationResult> MigrateLocalFilesAsync(string localPath)
+    {
+        var result = new MigrationResult();
+        var files = Directory.GetFiles(localPath, "*", SearchOption.AllDirectories);
+
+        foreach (var filePath in files)
+        {
+            try
+            {
+                using var fileStream = File.OpenRead(filePath);
+                var fileName = Path.GetFileName(filePath);
+                var contentType = GetContentType(fileName);
+
+                var fileId = await _targetStorage.UploadFileAsync(fileStream, fileName, contentType);
+                result.ProcessedFiles.Add(new MigratedFile
+                {
+                    OriginalPath = filePath,
+                    NewFileId = fileId,
+                    Success = true
+                });
+            }
+            catch (Exception ex)
+            {
+                _logger.LogError(ex, "Failed to migrate file: {FilePath}", filePath);
+                result.ProcessedFiles.Add(new MigratedFile
+                {
+                    OriginalPath = filePath,
+                    Success = false,
+                    Error = ex.Message
+                });
+            }
+        }
+
+        return result;
+    }
+}
+```
+
+## 2.3 PostgreSQL High Availability Configuration
+
+**Objective**: Set up a PostgreSQL cluster with automatic failover and read replicas.
+
+**Architecture Overview**:
+PostgreSQL will be deployed using an operator (such as CloudNativePG or Postgres Operator) to provide automated failover, backup, and scaling capabilities. 
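+
+Worth noting before the manifest below: CloudNativePG exposes separate Services per cluster — `<cluster>-rw` always routes to the current primary, while `<cluster>-ro` load-balances across replicas — so the application can do read/write splitting without any topology awareness. A sketch of surfacing those endpoints to the application (the ConfigMap name here is illustrative; the service names follow from the cluster name used in this section):
+
+```yaml
+apiVersion: v1
+kind: ConfigMap
+metadata:
+  name: motovault-db-endpoints
+  namespace: motovault
+data:
+  # Writes go to the primary; CloudNativePG repoints this Service on failover
+  DB_WRITE_HOST: motovault-postgres-rw
+  # Reads may be served by any standby instance
+  DB_READ_HOST: motovault-postgres-ro
+```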
+
+### PostgreSQL Cluster Configuration
+
+```yaml
+apiVersion: postgresql.cnpg.io/v1
+kind: Cluster
+metadata:
+  name: motovault-postgres
+  namespace: motovault
+spec:
+  instances: 3
+  primaryUpdateStrategy: unsupervised
+
+  postgresql:
+    parameters:
+      max_connections: "200"
+      shared_buffers: "256MB"
+      effective_cache_size: "1GB"
+      maintenance_work_mem: "64MB"
+      checkpoint_completion_target: "0.9"
+      wal_buffers: "16MB"
+      default_statistics_target: "100"
+      random_page_cost: "1.1"
+      effective_io_concurrency: "200"
+
+  resources:
+    requests:
+      memory: "2Gi"
+      cpu: "1000m"
+    limits:
+      memory: "4Gi"
+      cpu: "2000m"
+
+  storage:
+    size: "100Gi"
+    storageClass: "fast-ssd"
+
+  monitoring:
+    enablePodMonitor: true
+
+  backup:
+    retentionPolicy: "30d"
+    barmanObjectStore:
+      destinationPath: "s3://motovault-backups/postgres"
+      s3Credentials:
+        accessKeyId:
+          name: postgres-backup-credentials
+          key: ACCESS_KEY_ID
+        secretAccessKey:
+          name: postgres-backup-credentials
+          key: SECRET_ACCESS_KEY
+      data:
+        jobs: 1
+```
+
+### Implementation Tasks
+
+#### 1. Deploy PostgreSQL operator (CloudNativePG recommended)
+```bash
+kubectl apply -f https://raw.githubusercontent.com/cloudnative-pg/cloudnative-pg/release-1.20/releases/cnpg-1.20.1.yaml
+```
+
+#### 2. Configure cluster with primary/replica setup
+- 3-node cluster with automatic failover
+- Read-write split capability
+- Streaming replication configuration
+
+#### 3. Set up automated backups to MinIO or external storage
+```yaml
+apiVersion: postgresql.cnpg.io/v1
+kind: ScheduledBackup
+metadata:
+  name: motovault-postgres-backup
+spec:
+  schedule: "0 0 2 * * *"  # Daily at 2 AM (CNPG uses six-field cron, seconds first)
+  backupOwnerReference: self
+  cluster:
+    name: motovault-postgres
+```
+
+#### 4. 
Implement connection pooling with PgBouncer
+```yaml
+apiVersion: apps/v1
+kind: Deployment
+metadata:
+  name: pgbouncer
+spec:
+  replicas: 2
+  selector:
+    matchLabels:
+      app: pgbouncer
+  template:
+    metadata:
+      labels:
+        app: pgbouncer
+    spec:
+      containers:
+      - name: pgbouncer
+        image: pgbouncer/pgbouncer:latest
+        env:
+        - name: DATABASES_HOST
+          value: motovault-postgres-rw
+        - name: DATABASES_PORT
+          value: "5432"
+        - name: DATABASES_DATABASE
+          value: motovault
+        - name: POOL_MODE
+          value: session
+        - name: MAX_CLIENT_CONN
+          value: "1000"
+        - name: DEFAULT_POOL_SIZE
+          value: "25"
+```
+
+#### 5. Configure monitoring and alerting for database health
+```yaml
+apiVersion: monitoring.coreos.com/v1
+kind: ServiceMonitor
+metadata:
+  name: postgres-metrics
+spec:
+  selector:
+    matchLabels:
+      app.kubernetes.io/name: cloudnative-pg
+  endpoints:
+  - port: metrics
+    path: /metrics
+```
+
+## 2.4 Redis Cluster for Session Management
+
+**Objective**: Implement distributed session storage and caching using a Redis cluster.
+
+**Current State**:
+- In-memory session storage tied to individual application instances
+- No distributed caching for expensive operations
+- Configuration and translation data loaded on each application start
+
+**Target State**:
+- Redis cluster for distributed session storage
+- Centralized caching for frequently accessed data
+- High availability with automatic failover
+
+### Redis Cluster Configuration
+
+```yaml
+apiVersion: v1
+kind: ConfigMap
+metadata:
+  name: redis-cluster-config
+  namespace: motovault
+data:
+  redis.conf: |
+    cluster-enabled yes
+    cluster-require-full-coverage no
+    cluster-node-timeout 15000
+    cluster-config-file /data/nodes.conf
+    cluster-migration-barrier 1
+    appendonly yes
+    appendfsync everysec
+    save 900 1
+    save 300 10
+    save 60 10000
+
+---
+apiVersion: apps/v1
+kind: StatefulSet
+metadata:
+  name: redis-cluster
+  namespace: motovault
+spec:
+  serviceName: redis-cluster
+  replicas: 6
+  selector:
+    matchLabels:
+      app: redis-cluster
+  template:
+    metadata:
+      
labels:
+        app: redis-cluster
+    spec:
+      containers:
+      - name: redis
+        image: redis:7-alpine
+        command:
+        - redis-server
+        - /etc/redis/redis.conf
+        ports:
+        - containerPort: 6379
+        - containerPort: 16379
+        resources:
+          requests:
+            memory: "512Mi"
+            cpu: "250m"
+          limits:
+            memory: "1Gi"
+            cpu: "500m"
+        volumeMounts:
+        - name: redis-config
+          mountPath: /etc/redis
+        - name: redis-data
+          mountPath: /data
+      volumes:
+      - name: redis-config
+        configMap:
+          name: redis-cluster-config
+  volumeClaimTemplates:
+  - metadata:
+      name: redis-data
+    spec:
+      accessModes: ["ReadWriteOnce"]
+      resources:
+        requests:
+          storage: 10Gi
+```
+
+### Implementation Tasks
+
+#### 1. Deploy Redis cluster with 6 nodes (3 masters, 3 replicas)
+```bash
+# Initialize Redis cluster after deployment
+kubectl exec -it -n motovault redis-cluster-0 -- redis-cli --cluster create \
+  redis-cluster-0.redis-cluster:6379 \
+  redis-cluster-1.redis-cluster:6379 \
+  redis-cluster-2.redis-cluster:6379 \
+  redis-cluster-3.redis-cluster:6379 \
+  redis-cluster-4.redis-cluster:6379 \
+  redis-cluster-5.redis-cluster:6379 \
+  --cluster-replicas 1
+```
+
+#### 2. Configure session storage
+```csharp
+services.AddStackExchangeRedisCache(options =>
+{
+    options.Configuration = configuration.GetConnectionString("Redis");
+    options.InstanceName = "MotoVault";
+});
+
+services.AddSession(options =>
+{
+    options.IdleTimeout = TimeSpan.FromMinutes(30);
+    options.Cookie.HttpOnly = true;
+    options.Cookie.IsEssential = true;
+    options.Cookie.SecurePolicy = CookieSecurePolicy.Always;
+});
+```
+
+#### 3. 
Implement distributed caching
+```csharp
+public class CachedTranslationService : ITranslationService
+{
+    private readonly IDistributedCache _cache;
+    private readonly ITranslationService _translationService;
+    private readonly ILogger<CachedTranslationService> _logger;
+
+    public async Task<string> GetTranslationAsync(string key, string language)
+    {
+        var cacheKey = $"translation:{language}:{key}";
+        var cached = await _cache.GetStringAsync(cacheKey);
+
+        if (cached != null)
+        {
+            return cached;
+        }
+
+        var translation = await _translationService.GetTranslationAsync(key, language);
+
+        await _cache.SetStringAsync(cacheKey, translation, new DistributedCacheEntryOptions
+        {
+            SlidingExpiration = TimeSpan.FromHours(1)
+        });
+
+        return translation;
+    }
+}
+```
+
+#### 4. Add cache monitoring and performance metrics
+```csharp
+public class CacheMetricsService
+{
+    private readonly Counter _cacheHits;
+    private readonly Counter _cacheMisses;
+    private readonly Histogram _cacheOperationDuration;
+
+    public CacheMetricsService()
+    {
+        _cacheHits = Metrics.CreateCounter(
+            "motovault_cache_hits_total",
+            "Total cache hits",
+            new[] { "cache_type" });
+
+        _cacheMisses = Metrics.CreateCounter(
+            "motovault_cache_misses_total",
+            "Total cache misses",
+            new[] { "cache_type" });
+
+        _cacheOperationDuration = Metrics.CreateHistogram(
+            "motovault_cache_operation_duration_seconds",
+            "Cache operation duration",
+            new[] { "operation", "cache_type" });
+    }
+}
+```
+
+## Week-by-Week Breakdown
+
+### Week 5: MinIO Deployment
+- **Days 1-2**: Deploy MinIO operator and configure basic cluster
+- **Days 3-4**: Implement file storage abstraction interface
+- **Days 5-7**: Create MinIO storage service implementation
+
+### Week 6: File Migration and PostgreSQL HA
+- **Days 1-2**: Complete file storage abstraction and migration tools
+- **Days 3-4**: Deploy PostgreSQL operator and HA cluster
+- **Days 5-7**: Configure connection pooling and backup strategies
+
+### Week 7: Redis Cluster and Caching
+- **Days 1-3**: 
Deploy Redis cluster and configure session storage +- **Days 4-5**: Implement distributed caching layer +- **Days 6-7**: Add cache monitoring and performance metrics + +### Week 8: Integration and Testing +- **Days 1-3**: End-to-end testing of all HA components +- **Days 4-5**: Performance testing and optimization +- **Days 6-7**: Documentation and preparation for Phase 3 + +## Success Criteria + +- [ ] MinIO cluster operational with erasure coding +- [ ] File storage abstraction implemented and tested +- [ ] PostgreSQL HA cluster with automatic failover +- [ ] Redis cluster providing distributed sessions +- [ ] All file operations migrated to object storage +- [ ] Comprehensive monitoring for all infrastructure components +- [ ] Backup and recovery procedures validated + +## Testing Requirements + +### Infrastructure Tests +- MinIO cluster failover scenarios +- PostgreSQL primary/replica failover +- Redis cluster node failure recovery +- Network partition handling + +### Application Integration Tests +- File upload/download through abstraction layer +- Session persistence across application restarts +- Cache performance and invalidation +- Database connection pool behavior + +### Performance Tests +- File storage throughput and latency +- Database query performance with connection pooling +- Cache hit/miss ratios and response times + +## Deliverables + +1. **Infrastructure Components** + - MinIO HA cluster configuration + - PostgreSQL HA cluster with operator + - Redis cluster deployment + - Monitoring and alerting setup + +2. **Application Updates** + - File storage abstraction implementation + - Session management configuration + - Distributed caching integration + - Connection pooling optimization + +3. **Migration Tools** + - File migration utility + - Database migration scripts + - Configuration migration helpers + +4. 
**Documentation** + - Infrastructure architecture diagrams + - Operational procedures + - Monitoring and alerting guides + +## Dependencies + +- Kubernetes cluster with sufficient resources +- Storage classes for persistent volumes +- Prometheus and Grafana for monitoring +- Network connectivity between components + +## Risks and Mitigations + +### Risk: Data Corruption During File Migration +**Mitigation**: Checksum validation and parallel running of old/new systems + +### Risk: Database Failover Issues +**Mitigation**: Extensive testing of failover scenarios and automated recovery + +### Risk: Cache Inconsistency +**Mitigation**: Proper cache invalidation strategies and monitoring + +--- + +**Previous Phase**: [Phase 1: Core Kubernetes Readiness](K8S-PHASE-1.md) +**Next Phase**: [Phase 3: Production Deployment](K8S-PHASE-3.md) \ No newline at end of file diff --git a/K8S-PHASE-3.md b/K8S-PHASE-3.md new file mode 100644 index 0000000..c2ebbf3 --- /dev/null +++ b/K8S-PHASE-3.md @@ -0,0 +1,862 @@ +# Phase 3: Production Deployment (Weeks 9-12) + +This phase focuses on deploying the modernized application with proper production configurations, monitoring, backup strategies, and operational procedures. + +## Overview + +Phase 3 transforms the development-ready Kubernetes application into a production-grade system with comprehensive monitoring, automated backup and recovery, secure ingress, and operational excellence. This phase ensures the system is ready for enterprise-level workloads with proper security, performance, and reliability guarantees. 
+ +## Key Objectives + +- **Production Kubernetes Deployment**: Configure scalable, secure deployment manifests +- **Ingress and TLS Configuration**: Secure external access with proper routing +- **Comprehensive Monitoring**: Application and infrastructure observability +- **Backup and Disaster Recovery**: Automated backup strategies and recovery procedures +- **Migration Execution**: Seamless transition from legacy system + +## 3.1 Kubernetes Deployment Configuration + +**Objective**: Create production-ready Kubernetes manifests with proper resource management and high availability. + +### Application Deployment Configuration + +```yaml +apiVersion: apps/v1 +kind: Deployment +metadata: + name: motovault-app + namespace: motovault + labels: + app: motovault + version: v1.0.0 +spec: + replicas: 3 + strategy: + type: RollingUpdate + rollingUpdate: + maxSurge: 1 + maxUnavailable: 0 + selector: + matchLabels: + app: motovault + template: + metadata: + labels: + app: motovault + version: v1.0.0 + annotations: + prometheus.io/scrape: "true" + prometheus.io/path: "/metrics" + prometheus.io/port: "8080" + spec: + serviceAccountName: motovault-service-account + securityContext: + runAsNonRoot: true + runAsUser: 1000 + fsGroup: 2000 + affinity: + podAntiAffinity: + preferredDuringSchedulingIgnoredDuringExecution: + - weight: 100 + podAffinityTerm: + labelSelector: + matchExpressions: + - key: app + operator: In + values: + - motovault + topologyKey: kubernetes.io/hostname + - weight: 50 + podAffinityTerm: + labelSelector: + matchExpressions: + - key: app + operator: In + values: + - motovault + topologyKey: topology.kubernetes.io/zone + containers: + - name: motovault + image: motovault:latest + imagePullPolicy: Always + ports: + - containerPort: 8080 + name: http + protocol: TCP + env: + - name: ASPNETCORE_ENVIRONMENT + value: "Production" + - name: ASPNETCORE_URLS + value: "http://+:8080" + envFrom: + - configMapRef: + name: motovault-config + - secretRef: + name: 
motovault-secrets + resources: + requests: + memory: "512Mi" + cpu: "250m" + limits: + memory: "1Gi" + cpu: "500m" + readinessProbe: + httpGet: + path: /health/ready + port: 8080 + initialDelaySeconds: 10 + periodSeconds: 5 + timeoutSeconds: 3 + failureThreshold: 3 + livenessProbe: + httpGet: + path: /health/live + port: 8080 + initialDelaySeconds: 30 + periodSeconds: 10 + timeoutSeconds: 5 + failureThreshold: 3 + securityContext: + allowPrivilegeEscalation: false + readOnlyRootFilesystem: true + capabilities: + drop: + - ALL + volumeMounts: + - name: tmp-volume + mountPath: /tmp + - name: app-logs + mountPath: /app/logs + volumes: + - name: tmp-volume + emptyDir: {} + - name: app-logs + emptyDir: {} + terminationGracePeriodSeconds: 30 + +--- +apiVersion: v1 +kind: Service +metadata: + name: motovault-service + namespace: motovault + labels: + app: motovault +spec: + type: ClusterIP + ports: + - port: 80 + targetPort: 8080 + protocol: TCP + name: http + selector: + app: motovault + +--- +apiVersion: policy/v1 +kind: PodDisruptionBudget +metadata: + name: motovault-pdb + namespace: motovault +spec: + minAvailable: 2 + selector: + matchLabels: + app: motovault +``` + +### Horizontal Pod Autoscaler Configuration + +```yaml +apiVersion: autoscaling/v2 +kind: HorizontalPodAutoscaler +metadata: + name: motovault-hpa + namespace: motovault +spec: + scaleTargetRef: + apiVersion: apps/v1 + kind: Deployment + name: motovault-app + minReplicas: 3 + maxReplicas: 10 + metrics: + - type: Resource + resource: + name: cpu + target: + type: Utilization + averageUtilization: 70 + - type: Resource + resource: + name: memory + target: + type: Utilization + averageUtilization: 80 + behavior: + scaleUp: + stabilizationWindowSeconds: 300 + policies: + - type: Percent + value: 100 + periodSeconds: 15 + scaleDown: + stabilizationWindowSeconds: 300 + policies: + - type: Percent + value: 10 + periodSeconds: 60 +``` + +### Implementation Tasks + +#### 1. 
Create production namespace with security policies +```yaml +apiVersion: v1 +kind: Namespace +metadata: + name: motovault + labels: + pod-security.kubernetes.io/enforce: restricted + pod-security.kubernetes.io/audit: restricted + pod-security.kubernetes.io/warn: restricted +``` + +#### 2. Configure resource quotas and limits +```yaml +apiVersion: v1 +kind: ResourceQuota +metadata: + name: motovault-quota + namespace: motovault +spec: + hard: + requests.cpu: "4" + requests.memory: 8Gi + limits.cpu: "8" + limits.memory: 16Gi + persistentvolumeclaims: "10" + pods: "20" +``` + +#### 3. Set up service accounts and RBAC +```yaml +apiVersion: v1 +kind: ServiceAccount +metadata: + name: motovault-service-account + namespace: motovault +--- +apiVersion: rbac.authorization.k8s.io/v1 +kind: Role +metadata: + name: motovault-role + namespace: motovault +rules: +- apiGroups: [""] + resources: ["configmaps", "secrets"] + verbs: ["get", "list"] +--- +apiVersion: rbac.authorization.k8s.io/v1 +kind: RoleBinding +metadata: + name: motovault-rolebinding + namespace: motovault +subjects: +- kind: ServiceAccount + name: motovault-service-account + namespace: motovault +roleRef: + kind: Role + name: motovault-role + apiGroup: rbac.authorization.k8s.io +``` + +#### 4. Configure pod anti-affinity for high availability +- Spread pods across nodes and availability zones +- Ensure no single point of failure +- Optimize for both performance and availability + +#### 5. Implement rolling update strategy with zero downtime +- Configure progressive rollout with health checks +- Automatic rollback on failure +- Canary deployment capabilities + +## 3.2 Ingress and TLS Configuration + +**Objective**: Configure secure external access with proper TLS termination and routing. 
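+
+Certificate renewal failures are easy to miss until a certificate actually expires. Alongside the cert-manager setup in this section, an expiry alert can be defined in the same PrometheusRule style used elsewhere in this plan; the sketch below assumes cert-manager's metrics endpoint is scraped by Prometheus (`certmanager_certificate_expiration_timestamp_seconds` is the metric cert-manager exports for this purpose):
+
+```yaml
+apiVersion: monitoring.coreos.com/v1
+kind: PrometheusRule
+metadata:
+  name: certificate-expiry-alerts
+  namespace: motovault
+spec:
+  groups:
+  - name: tls.rules
+    rules:
+    - alert: CertificateExpiringSoon
+      # Fires when fewer than 14 days remain until expiry
+      expr: (certmanager_certificate_expiration_timestamp_seconds - time()) < 14 * 24 * 3600
+      for: 1h
+      labels:
+        severity: warning
+      annotations:
+        summary: "TLS certificate expiring soon"
+        description: "Certificate {{ $labels.name }} expires in less than 14 days"
+```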
+
+### Ingress Configuration
+
+```yaml
+apiVersion: networking.k8s.io/v1
+kind: Ingress
+metadata:
+  name: motovault-ingress
+  namespace: motovault
+  annotations:
+    nginx.ingress.kubernetes.io/ssl-redirect: "true"
+    nginx.ingress.kubernetes.io/force-ssl-redirect: "true"
+    nginx.ingress.kubernetes.io/proxy-body-size: "50m"
+    nginx.ingress.kubernetes.io/proxy-read-timeout: "300"
+    nginx.ingress.kubernetes.io/proxy-send-timeout: "300"
+    cert-manager.io/cluster-issuer: "letsencrypt-prod"
+    # ingress-nginx rate limiting: 100 requests per minute per client IP
+    nginx.ingress.kubernetes.io/limit-rpm: "100"
+spec:
+  ingressClassName: nginx
+  tls:
+  - hosts:
+    - motovault.example.com
+    secretName: motovault-tls
+  rules:
+  - host: motovault.example.com
+    http:
+      paths:
+      - path: /
+        pathType: Prefix
+        backend:
+          service:
+            name: motovault-service
+            port:
+              number: 80
+```
+
+### TLS Certificate Management
+
+```yaml
+apiVersion: cert-manager.io/v1
+kind: ClusterIssuer
+metadata:
+  name: letsencrypt-prod
+spec:
+  acme:
+    server: https://acme-v02.api.letsencrypt.org/directory
+    email: admin@motovault.example.com
+    privateKeySecretRef:
+      name: letsencrypt-prod
+    solvers:
+    - http01:
+        ingress:
+          class: nginx
+```
+
+### Implementation Tasks
+
+#### 1. Deploy cert-manager for automated TLS
+```bash
+kubectl apply -f https://github.com/cert-manager/cert-manager/releases/download/v1.13.0/cert-manager.yaml
+```
+
+#### 2. Configure Let's Encrypt for SSL certificates
+- Automated certificate provisioning and renewal
+- DNS-01 or HTTP-01 challenge configuration
+- Certificate monitoring and alerting
+
+#### 3. Set up WAF and DDoS protection
+```yaml
+apiVersion: networking.k8s.io/v1
+kind: NetworkPolicy
+metadata:
+  name: motovault-ingress-policy
+  namespace: motovault
+spec:
+  podSelector:
+    matchLabels:
+      app: motovault
+  policyTypes:
+  - Ingress
+  ingress:
+  - from:
+    - namespaceSelector:
+        matchLabels:
+          name: nginx-ingress
+    ports:
+    - protocol: TCP
+      port: 8080
+```
+
+#### 4. 
Configure rate limiting and security headers +- Request rate limiting per IP +- Security headers (HSTS, CSP, etc.) +- Request size limitations + +#### 5. Set up health check endpoints for load balancer +- Configure ingress health checks +- Implement graceful degradation +- Monitor certificate expiration + +## 3.3 Monitoring and Observability Setup + +**Objective**: Implement comprehensive monitoring, logging, and alerting for production operations. + +### Prometheus ServiceMonitor Configuration + +```yaml +apiVersion: monitoring.coreos.com/v1 +kind: ServiceMonitor +metadata: + name: motovault-metrics + namespace: motovault + labels: + app: motovault +spec: + selector: + matchLabels: + app: motovault + endpoints: + - port: http + path: /metrics + interval: 30s + scrapeTimeout: 10s +``` + +### Application Metrics Implementation + +```csharp +public class MetricsService +{ + private readonly Counter _httpRequestsTotal; + private readonly Histogram _httpRequestDuration; + private readonly Gauge _activeConnections; + private readonly Counter _databaseOperationsTotal; + private readonly Histogram _databaseOperationDuration; + + public MetricsService() + { + _httpRequestsTotal = Metrics.CreateCounter( + "motovault_http_requests_total", + "Total number of HTTP requests", + new[] { "method", "endpoint", "status_code" }); + + _httpRequestDuration = Metrics.CreateHistogram( + "motovault_http_request_duration_seconds", + "Duration of HTTP requests in seconds", + new[] { "method", "endpoint" }); + + _activeConnections = Metrics.CreateGauge( + "motovault_active_connections", + "Number of active database connections"); + + _databaseOperationsTotal = Metrics.CreateCounter( + "motovault_database_operations_total", + "Total number of database operations", + new[] { "operation", "table", "status" }); + + _databaseOperationDuration = Metrics.CreateHistogram( + "motovault_database_operation_duration_seconds", + "Duration of database operations in seconds", + new[] { "operation", 
"table" }); + } + + public void RecordHttpRequest(string method, string endpoint, int statusCode, double duration) + { + _httpRequestsTotal.WithLabels(method, endpoint, statusCode.ToString()).Inc(); + _httpRequestDuration.WithLabels(method, endpoint).Observe(duration); + } + + public void RecordDatabaseOperation(string operation, string table, bool success, double duration) + { + var status = success ? "success" : "error"; + _databaseOperationsTotal.WithLabels(operation, table, status).Inc(); + _databaseOperationDuration.WithLabels(operation, table).Observe(duration); + } +} +``` + +### Grafana Dashboard Configuration + +```json +{ + "dashboard": { + "title": "MotoVaultPro Application Dashboard", + "panels": [ + { + "title": "HTTP Request Rate", + "type": "graph", + "targets": [ + { + "expr": "rate(motovault_http_requests_total[5m])", + "legendFormat": "{{method}} {{endpoint}}" + } + ] + }, + { + "title": "Response Time Percentiles", + "type": "graph", + "targets": [ + { + "expr": "histogram_quantile(0.50, rate(motovault_http_request_duration_seconds_bucket[5m]))", + "legendFormat": "50th percentile" + }, + { + "expr": "histogram_quantile(0.95, rate(motovault_http_request_duration_seconds_bucket[5m]))", + "legendFormat": "95th percentile" + } + ] + }, + { + "title": "Database Connection Pool", + "type": "singlestat", + "targets": [ + { + "expr": "motovault_active_connections", + "legendFormat": "Active Connections" + } + ] + }, + { + "title": "Error Rate", + "type": "graph", + "targets": [ + { + "expr": "rate(motovault_http_requests_total{status_code=~\"5..\"}[5m])", + "legendFormat": "5xx errors" + } + ] + } + ] + } +} +``` + +### Alert Manager Configuration + +```yaml +groups: +- name: motovault.rules + rules: + - alert: HighErrorRate + expr: rate(motovault_http_requests_total{status_code=~"5.."}[5m]) > 0.1 + for: 2m + labels: + severity: critical + annotations: + summary: "High error rate detected" + description: "Error rate is {{ $value }}% for the last 5 
minutes" + + - alert: HighResponseTime + expr: histogram_quantile(0.95, rate(motovault_http_request_duration_seconds_bucket[5m])) > 2 + for: 5m + labels: + severity: warning + annotations: + summary: "High response time detected" + description: "95th percentile response time is {{ $value }}s" + + - alert: DatabaseConnectionPoolExhaustion + expr: motovault_active_connections > 80 + for: 2m + labels: + severity: warning + annotations: + summary: "Database connection pool nearly exhausted" + description: "Active connections: {{ $value }}/100" + + - alert: PodCrashLooping + expr: rate(kube_pod_container_status_restarts_total{namespace="motovault"}[15m]) > 0 + for: 5m + labels: + severity: critical + annotations: + summary: "Pod is crash looping" + description: "Pod {{ $labels.pod }} is restarting frequently" +``` + +### Implementation Tasks + +#### 1. Deploy Prometheus and Grafana stack +```bash +kubectl apply -f https://raw.githubusercontent.com/prometheus-operator/prometheus-operator/main/bundle.yaml +``` + +#### 2. Configure application metrics endpoints +- Add Prometheus metrics middleware +- Implement custom business metrics +- Configure metric collection intervals + +#### 3. Set up centralized logging with structured logs +```csharp +builder.Services.AddLogging(loggingBuilder => +{ + loggingBuilder.AddJsonConsole(options => + { + options.JsonWriterOptions = new JsonWriterOptions { Indented = false }; + options.IncludeScopes = true; + options.TimestampFormat = "yyyy-MM-ddTHH:mm:ss.fffZ"; + }); +}); +``` + +#### 4. Create operational dashboards and alerts +- Application performance dashboards +- Infrastructure monitoring dashboards +- Business metrics and KPIs +- Alert routing and escalation + +#### 5. 
Implement distributed tracing
+```csharp
+services.AddOpenTelemetry()
+    .WithTracing(builder =>
+    {
+        builder
+            .AddAspNetCoreInstrumentation()
+            .AddNpgsql()
+            .AddRedisInstrumentation()
+            .AddJaegerExporter();
+    });
+```
+
+## 3.4 Backup and Disaster Recovery
+
+**Objective**: Implement comprehensive backup strategies and disaster recovery procedures.
+
+### Velero Backup Configuration
+
+```yaml
+apiVersion: velero.io/v1
+kind: Schedule
+metadata:
+  name: motovault-daily-backup
+  namespace: velero
+spec:
+  schedule: "0 2 * * *"  # Daily at 2 AM
+  template:
+    includedNamespaces:
+    - motovault
+    includedResources:
+    - "*"
+    storageLocation: default
+    ttl: 720h0m0s  # 30 days
+    snapshotVolumes: true
+
+---
+apiVersion: velero.io/v1
+kind: Schedule
+metadata:
+  name: motovault-weekly-backup
+  namespace: velero
+spec:
+  schedule: "0 3 * * 0"  # Weekly on Sunday at 3 AM
+  template:
+    includedNamespaces:
+    - motovault
+    includedResources:
+    - "*"
+    storageLocation: default
+    ttl: 2160h0m0s  # 90 days
+    snapshotVolumes: true
+```
+
+### Database Backup Strategy
+
+```bash
+#!/bin/bash
+# Automated database backup script
+
+BACKUP_DATE=$(date +%Y%m%d_%H%M%S)
+BACKUP_FILE="motovault_backup_${BACKUP_DATE}.sql"
+S3_BUCKET="motovault-backups"
+
+# Create database backup
+kubectl exec -n motovault motovault-postgres-1 -- \
+    pg_dump -U postgres motovault > "${BACKUP_FILE}"
+
+# Compress backup
+gzip "${BACKUP_FILE}"
+
+# Upload to S3/MinIO
+aws s3 cp "${BACKUP_FILE}.gz" "s3://${S3_BUCKET}/database/"
+
+# Clean up local file
+rm "${BACKUP_FILE}.gz"
+
+# Retain only last 30 days of backups
+aws s3api list-objects-v2 \
+    --bucket "${S3_BUCKET}" \
+    --prefix "database/" \
+    --query 'Contents[?LastModified<=`'$(date -d "30 days ago" --iso-8601)'`].[Key]' \
+    --output text | \
+    xargs -I {} aws s3 rm "s3://${S3_BUCKET}/{}"
+```
+
+### Disaster Recovery Procedures
+
+```bash
+#!/bin/bash
+# Full system recovery script
+
+BACKUP_DATE=$1
+if [ -z "$BACKUP_DATE" ]; then
+    echo "Usage: $0 <backup_date>"
+    echo "Example: $0 20240120_020000"
+    exit 1
+fi
+
+# Stop application
+echo "Scaling down application..."
+kubectl scale deployment motovault-app --replicas=0 -n motovault
+
+# Restore database (file name matches the backup script above)
+echo "Restoring database from backup..."
+aws s3 cp "s3://motovault-backups/database/motovault_backup_${BACKUP_DATE}.sql.gz" .
+gunzip "motovault_backup_${BACKUP_DATE}.sql.gz"
+kubectl exec -i motovault-postgres-1 -n motovault -- \
+    psql -U postgres -d motovault < "motovault_backup_${BACKUP_DATE}.sql"
+
+# Restore MinIO data
+echo "Restoring MinIO data..."
+aws s3 sync "s3://motovault-backups/minio/${BACKUP_DATE}/" /tmp/minio_restore/
+mc mirror /tmp/minio_restore/ motovault-minio/motovault-files/
+
+# Restart application
+echo "Scaling up application..."
+kubectl scale deployment motovault-app --replicas=3 -n motovault
+
+# Verify health
+echo "Waiting for application to be ready..."
+kubectl wait --for=condition=ready pod -l app=motovault -n motovault --timeout=300s
+
+echo "Recovery completed successfully"
+```
+
+### Implementation Tasks
+
+#### 1. Deploy Velero for Kubernetes backup
+```bash
+velero install \
+    --provider aws \
+    --plugins velero/velero-plugin-for-aws:v1.7.0 \
+    --bucket motovault-backups \
+    --backup-location-config region=us-west-2 \
+    --snapshot-location-config region=us-west-2
+```
+
+#### 2. Configure automated database backups
+- Point-in-time recovery setup
+- Incremental backup strategies
+- Cross-region backup replication
+
+#### 3. Implement MinIO backup synchronization
+- Automated file backup to external storage
+- Metadata backup and restoration
+- Verification of backup integrity
+
+#### 4. Create disaster recovery runbooks
+- Step-by-step recovery procedures
+- RTO/RPO definitions and testing
+- Contact information and escalation procedures
+
+#### 5. 
Set up backup monitoring and alerting +```yaml +apiVersion: monitoring.coreos.com/v1 +kind: PrometheusRule +metadata: + name: backup-alerts +spec: + groups: + - name: backup.rules + rules: + - alert: BackupFailed + expr: increase(velero_backup_failure_total[1h]) > 0 + labels: + severity: critical + annotations: + summary: "Backup operation failed" + description: "A Velero backup has failed within the last hour" +``` + +## Week-by-Week Breakdown + +### Week 9: Production Kubernetes Configuration +- **Days 1-2**: Create production deployment manifests +- **Days 3-4**: Configure HPA, PDB, and resource quotas +- **Days 5-7**: Set up RBAC and security policies + +### Week 10: Ingress and TLS Setup +- **Days 1-2**: Deploy and configure ingress controller +- **Days 3-4**: Set up cert-manager and TLS certificates +- **Days 5-7**: Configure security policies and rate limiting + +### Week 11: Monitoring and Observability +- **Days 1-3**: Deploy Prometheus and Grafana stack +- **Days 4-5**: Configure application metrics and dashboards +- **Days 6-7**: Set up alerting and notification channels + +### Week 12: Backup and Migration Preparation +- **Days 1-3**: Deploy and configure backup solutions +- **Days 4-5**: Create migration scripts and procedures +- **Days 6-7**: Execute migration dry runs and validation + +## Success Criteria + +- [ ] Production Kubernetes deployment with 99.9% availability +- [ ] Secure ingress with automated TLS certificate management +- [ ] Comprehensive monitoring with alerting +- [ ] Automated backup and recovery procedures tested +- [ ] Migration procedures validated and documented +- [ ] Security policies and network controls implemented +- [ ] Performance baselines established and monitored + +## Testing Requirements + +### Production Readiness Tests +- Load testing under expected traffic patterns +- Failover testing for all components +- Security penetration testing +- Backup and recovery validation + +### Performance Tests +- Application response time under load +- Database
performance with connection pooling +- Cache performance and hit ratios +- Network latency and throughput + +### Security Tests +- Container image vulnerability scanning +- Network policy validation +- Authentication and authorization testing +- TLS configuration verification + +## Deliverables + +1. **Production Deployment** + - Complete Kubernetes manifests + - Security configurations + - Monitoring and alerting setup + - Backup and recovery procedures + +2. **Documentation** + - Operational runbooks + - Security procedures + - Monitoring guides + - Disaster recovery plans + +3. **Migration Tools** + - Data migration scripts + - Validation tools + - Rollback procedures + +## Dependencies + +- Production Kubernetes cluster +- External storage for backups +- DNS management for ingress +- Certificate authority for TLS +- Monitoring infrastructure + +## Risks and Mitigations + +### Risk: Extended Downtime During Migration +**Mitigation**: Blue-green deployment strategy with comprehensive rollback plan + +### Risk: Data Integrity Issues +**Mitigation**: Extensive validation and parallel running during transition + +### Risk: Performance Degradation +**Mitigation**: Load testing and gradual traffic migration + +--- + +**Previous Phase**: [Phase 2: High Availability Infrastructure](K8S-PHASE-2.md) +**Next Phase**: [Phase 4: Advanced Features and Optimization](K8S-PHASE-4.md) \ No newline at end of file diff --git a/K8S-PHASE-4.md b/K8S-PHASE-4.md new file mode 100644 index 0000000..1b343d6 --- /dev/null +++ b/K8S-PHASE-4.md @@ -0,0 +1,885 @@ +# Phase 4: Advanced Features and Optimization (Weeks 13-16) + +This phase focuses on advanced cloud-native features, performance optimization, security enhancements, and final production migration. + +## Overview + +Phase 4 elevates MotoVaultPro to a truly cloud-native application with enterprise-grade features including advanced caching strategies, performance optimization, enhanced security, and seamless production migration. 
This phase ensures the system is optimized for scale, security, and operational excellence. + +## Key Objectives + +- **Advanced Caching Strategies**: Multi-layer caching for optimal performance +- **Performance Optimization**: Database and application tuning for high load +- **Security Enhancements**: Advanced security features and compliance +- **Production Migration**: Final cutover and optimization +- **Operational Excellence**: Advanced monitoring and automation + +## 4.1 Advanced Caching Strategies + +**Objective**: Implement multi-layer caching for optimal performance and reduced database load. + +### Cache Architecture + +``` +┌─────────────────┐ ┌─────────────────┐ ┌─────────────────┐ +│ Browser │ │ CDN/Proxy │ │ Application │ +│ Cache │◄──►│ Cache │◄──►│ Memory Cache │ +│ (Static) │ │ (Static + │ │ (L1) │ +│ │ │ Dynamic) │ │ │ +└─────────────────┘ └─────────────────┘ └─────────────────┘ + │ + ┌─────────────────┐ + │ Redis Cache │ + │ (L2) │ + │ Distributed │ + └─────────────────┘ + │ + ┌─────────────────┐ + │ Database │ + │ (Source) │ + │ │ + └─────────────────┘ +``` + +### Multi-Level Cache Service Implementation + +```csharp +public class MultiLevelCacheService +{ + private readonly IMemoryCache _memoryCache; + private readonly IDistributedCache _distributedCache; + private readonly ILogger<MultiLevelCacheService> _logger; + + public async Task<T> GetAsync<T>(string key, Func<Task<T>> factory, TimeSpan? expiration = null) + { + // L1 Cache - Memory + if (_memoryCache.TryGetValue(key, out T cachedValue)) + { + _logger.LogDebug("Cache hit (L1): {Key}", key); + return cachedValue; + } + + // L2 Cache - Redis + var distributedValue = await _distributedCache.GetStringAsync(key); + if (distributedValue != null) + { + var deserializedValue = JsonSerializer.Deserialize<T>(distributedValue); + _memoryCache.Set(key, deserializedValue, TimeSpan.FromMinutes(5)); // Short-lived L1 cache + _logger.LogDebug("Cache hit (L2): {Key}", key); + return deserializedValue; + } + + // Cache miss - fetch from source + _logger.LogDebug("Cache miss: {Key}", key); + var value = await factory(); + + // Store in both cache levels + var serializedValue = JsonSerializer.Serialize(value); + await _distributedCache.SetStringAsync(key, serializedValue, new DistributedCacheEntryOptions + { + SlidingExpiration = expiration ?? TimeSpan.FromHours(1) + }); + + _memoryCache.Set(key, value, TimeSpan.FromMinutes(5)); + + return value; + } +} +``` + +### Cache Invalidation Strategy + +```csharp +public class CacheInvalidationService +{ + private readonly IDistributedCache _distributedCache; + private readonly IMemoryCache _memoryCache; + private readonly ILogger<CacheInvalidationService> _logger; + + public async Task InvalidatePatternAsync(string pattern) + { + // Implement cache invalidation using Redis key pattern matching + var keys = await GetKeysMatchingPatternAsync(pattern); + + var tasks = keys.Select(async key => + { + await _distributedCache.RemoveAsync(key); + _memoryCache.Remove(key); + _logger.LogDebug("Invalidated cache key: {Key}", key); + }); + + await Task.WhenAll(tasks); + } + + public async Task InvalidateVehicleDataAsync(int vehicleId) + { + var patterns = new[] + { + $"vehicle:{vehicleId}:*", + $"dashboard:{vehicleId}:*", + $"reports:{vehicleId}:*" + }; + + foreach (var pattern in patterns) + { + await InvalidatePatternAsync(pattern); + } + } +} +``` + +### Implementation Tasks + +#### 1.
Implement intelligent cache warming +```csharp +public class CacheWarmupService : BackgroundService +{ + protected override async Task ExecuteAsync(CancellationToken stoppingToken) + { + while (!stoppingToken.IsCancellationRequested) + { + await WarmupFrequentlyAccessedData(); + await Task.Delay(TimeSpan.FromHours(1), stoppingToken); + } + } + + private async Task WarmupFrequentlyAccessedData() + { + // Pre-load dashboard data for active users + var activeUsers = await GetActiveUsersAsync(); + + var warmupTasks = activeUsers.Select(async user => + { + await _cacheService.GetAsync($"dashboard:{user.Id}", + () => _dashboardService.GetDashboardDataAsync(user.Id)); + }); + + await Task.WhenAll(warmupTasks); + } +} +``` + +#### 2. Configure CDN integration for static assets +```yaml +apiVersion: networking.k8s.io/v1 +kind: Ingress +metadata: + name: motovault-cdn-ingress + annotations: + nginx.ingress.kubernetes.io/configuration-snippet: | + add_header Cache-Control "public, max-age=31536000, immutable"; + add_header X-Cache-Status $upstream_cache_status; +spec: + rules: + - host: cdn.motovault.example.com + http: + paths: + - path: /static + pathType: Prefix + backend: + service: + name: motovault-service + port: + number: 80 +``` + +#### 3. 
Implement cache monitoring and metrics +```csharp +public class CacheMetricsMiddleware +{ + private readonly Counter _cacheHits; + private readonly Counter _cacheMisses; + private readonly Histogram _cacheLatency; + + public async Task InvokeAsync(HttpContext context, RequestDelegate next) + { + var stopwatch = Stopwatch.StartNew(); + + // Track cache operations during request + context.Response.OnStarting(() => + { + var cacheStatus = context.Response.Headers["X-Cache-Status"].FirstOrDefault(); + + if (cacheStatus == "HIT") + _cacheHits.Inc(); + else if (cacheStatus == "MISS") + _cacheMisses.Inc(); + + _cacheLatency.Observe(stopwatch.Elapsed.TotalSeconds); + return Task.CompletedTask; + }); + + await next(context); + } +} +``` + +## 4.2 Performance Optimization + +**Objective**: Optimize application performance for high-load scenarios. + +### Database Query Optimization + +```csharp +public class OptimizedVehicleService +{ + private readonly IDbContextFactory<ApplicationDbContext> _dbContextFactory; + private readonly IMemoryCache _cache; + + public async Task<VehicleDashboardData> GetDashboardDataAsync(int userId, int vehicleId) + { + var cacheKey = $"dashboard:{userId}:{vehicleId}"; + + if (_cache.TryGetValue(cacheKey, out VehicleDashboardData cached)) + { + return cached; + } + + using var context = _dbContextFactory.CreateDbContext(); + + // Optimized single query with projections + var dashboardData = await context.Vehicles + .Where(v => v.Id == vehicleId && v.UserId == userId) + .Select(v => new VehicleDashboardData + { + Vehicle = v, + RecentServices = v.ServiceRecords + .OrderByDescending(s => s.Date) + .Take(5) + .ToList(), + UpcomingReminders = v.ReminderRecords + .Where(r => r.IsActive && r.DueDate > DateTime.Now) + .OrderBy(r => r.DueDate) + .Take(5) + .ToList(), + FuelEfficiency = v.GasRecords + .Where(g => g.Date >= DateTime.Now.AddMonths(-3)) + .Average(g => g.Efficiency), + TotalMileage = v.OdometerRecords + .OrderByDescending(o => o.Date) + .Select(o => (int?)o.Mileage) + .FirstOrDefault() ?? 
0 + }) + .AsNoTracking() + .FirstOrDefaultAsync(); + + _cache.Set(cacheKey, dashboardData, TimeSpan.FromMinutes(15)); + return dashboardData; + } +} +``` + +### Connection Pool Optimization + +```csharp +services.AddDbContextFactory<ApplicationDbContext>(options => +{ + options.UseNpgsql(connectionString, npgsqlOptions => + { + npgsqlOptions.EnableRetryOnFailure( + maxRetryCount: 3, + maxRetryDelay: TimeSpan.FromSeconds(5), + errorCodesToAdd: null); + npgsqlOptions.CommandTimeout(30); + }); + + // Optimize for read-heavy workloads + options.EnableSensitiveDataLogging(false); + options.EnableServiceProviderCaching(); + options.EnableDetailedErrors(false); +}, ServiceLifetime.Singleton); + +// Configure connection pooling +services.Configure<NpgsqlConnectionStringBuilder>(builder => +{ + builder.MaxPoolSize = 100; + builder.MinPoolSize = 10; + builder.ConnectionLifetime = 300; + builder.ConnectionPruningInterval = 10; + builder.ConnectionIdleLifetime = 300; +}); +``` + +### Application Performance Optimization + +```csharp +public class PerformanceOptimizationService +{ + // Implement bulk operations for data modifications + public async Task<BulkUpdateResult> BulkUpdateServiceRecordsAsync( + List<ServiceRecord> records) + { + using var context = _dbContextFactory.CreateDbContext(); + + // UpdateRange attaches the entities and marks them as Modified + context.UpdateRange(records); + + var affectedRows = await context.SaveChangesAsync(); + + // Invalidate related cache entries + var vehicleIds = records.Select(r => r.VehicleId).Distinct(); + foreach (var vehicleId in vehicleIds) + { + await _cacheInvalidation.InvalidateVehicleDataAsync(vehicleId); + } + + return new BulkUpdateResult { AffectedRows = affectedRows }; + } + + // Implement read-through cache for expensive calculations + public async Task<FuelEfficiencyReport> GetFuelEfficiencyReportAsync( + int vehicleId, + DateTime startDate, + DateTime endDate) + { + var cacheKey = $"fuel_report:{vehicleId}:{startDate:yyyyMM}:{endDate:yyyyMM}"; + + return await _multiLevelCache.GetAsync(cacheKey, async () => + { + using var 
context = _dbContextFactory.CreateDbContext(); + + var gasRecords = await context.GasRecords + .Where(g => g.VehicleId == vehicleId && + g.Date >= startDate && + g.Date <= endDate) + .AsNoTracking() + .ToListAsync(); + + return CalculateFuelEfficiencyReport(gasRecords); + }, TimeSpan.FromHours(6)); + } +} +``` + +### Implementation Tasks + +#### 1. Implement database indexing strategy +```sql +-- Create optimized indexes for common queries +CREATE INDEX CONCURRENTLY idx_gasrecords_vehicle_date + ON gas_records(vehicle_id, date DESC); + +CREATE INDEX CONCURRENTLY idx_servicerecords_vehicle_date + ON service_records(vehicle_id, date DESC); + +CREATE INDEX CONCURRENTLY idx_reminderrecords_active_due + ON reminder_records(is_active, due_date) + WHERE is_active = true; + +-- Partial indexes for better performance +CREATE INDEX CONCURRENTLY idx_vehicles_active_users + ON vehicles(user_id) + WHERE is_active = true; +``` + +#### 2. Configure response compression and bundling +```csharp +builder.Services.AddResponseCompression(options => +{ + options.Providers.Add<BrotliCompressionProvider>(); + options.Providers.Add<GzipCompressionProvider>(); + options.MimeTypes = ResponseCompressionDefaults.MimeTypes.Concat( + new[] { "application/json", "text/css", "application/javascript" }); +}); + +builder.Services.Configure<BrotliCompressionProviderOptions>(options => +{ + options.Level = CompressionLevel.Optimal; +}); +``` + +#### 3.
Implement request batching for API endpoints +```csharp +[HttpPost("batch")] +public async Task<IActionResult> BatchOperations([FromBody] BatchRequest request) +{ + var results = new List<BatchResult>(); + + // Execute operations in parallel where possible + var tasks = request.Operations.Select(async operation => + { + try + { + var result = await ExecuteOperationAsync(operation); + return new BatchResult { Success = true, Data = result }; + } + catch (Exception ex) + { + return new BatchResult { Success = false, Error = ex.Message }; + } + }); + + results.AddRange(await Task.WhenAll(tasks)); + return Ok(new { Results = results }); +} +``` + +## 4.3 Security Enhancements + +**Objective**: Implement advanced security features for production deployment. + +### Network Security Policies + +```yaml +apiVersion: networking.k8s.io/v1 +kind: NetworkPolicy +metadata: + name: motovault-network-policy + namespace: motovault +spec: + podSelector: + matchLabels: + app: motovault + policyTypes: + - Ingress + - Egress + ingress: + - from: + - namespaceSelector: + matchLabels: + name: nginx-ingress + ports: + - protocol: TCP + port: 8080 + egress: + - to: + - namespaceSelector: + matchLabels: + name: motovault + ports: + - protocol: TCP + port: 5432 # PostgreSQL + - protocol: TCP + port: 6379 # Redis + - protocol: TCP + port: 9000 # MinIO + - to: [] # Allow external HTTPS for OIDC + ports: + - protocol: TCP + port: 443 + - protocol: TCP + port: 80 +``` + +### Pod Security Standards + +```yaml +apiVersion: v1 +kind: Namespace +metadata: + name: motovault + labels: + pod-security.kubernetes.io/enforce: restricted + pod-security.kubernetes.io/audit: restricted + pod-security.kubernetes.io/warn: restricted +``` + +### External Secrets Management + +```yaml +apiVersion: external-secrets.io/v1beta1 +kind: SecretStore +metadata: + name: vault-backend + namespace: motovault +spec: + provider: + vault: + server: "https://vault.example.com" + path: "secret" + version: "v2" + auth: + kubernetes: + mountPath: 
"kubernetes" + role: "motovault-role" + +--- +apiVersion: external-secrets.io/v1beta1 +kind: ExternalSecret +metadata: + name: motovault-secrets + namespace: motovault +spec: + refreshInterval: 1h + secretStoreRef: + name: vault-backend + kind: SecretStore + target: + name: motovault-secrets + creationPolicy: Owner + data: + - secretKey: POSTGRES_CONNECTION + remoteRef: + key: motovault/database + property: connection_string + - secretKey: JWT_SECRET + remoteRef: + key: motovault/auth + property: jwt_secret +``` + +### Application Security Enhancements + +```csharp +public class SecurityMiddleware +{ + public async Task InvokeAsync(HttpContext context, RequestDelegate next) + { + // Add security headers + context.Response.Headers.Add("X-Content-Type-Options", "nosniff"); + context.Response.Headers.Add("X-Frame-Options", "DENY"); + context.Response.Headers.Add("X-XSS-Protection", "1; mode=block"); + context.Response.Headers.Add("Referrer-Policy", "strict-origin-when-cross-origin"); + context.Response.Headers.Add("Permissions-Policy", "geolocation=(), microphone=(), camera=()"); + + // Content Security Policy + var csp = "default-src 'self'; " + + "script-src 'self' 'unsafe-inline' https://cdn.jsdelivr.net; " + + "style-src 'self' 'unsafe-inline' https://cdn.jsdelivr.net; " + + "img-src 'self' data: https:; " + + "connect-src 'self';"; + context.Response.Headers.Add("Content-Security-Policy", csp); + + await next(context); + } +} +``` + +### Implementation Tasks + +#### 1. 
Implement container image scanning +```yaml +apiVersion: argoproj.io/v1alpha1 +kind: Workflow +metadata: + name: security-scan +spec: + entrypoint: scan-workflow + templates: + - name: scan-workflow + steps: + - - name: trivy-scan + template: trivy-container-scan + - - name: publish-results + template: publish-scan-results + - name: trivy-container-scan + container: + image: aquasec/trivy:latest + command: [trivy] + args: ["image", "--exit-code", "1", "--severity", "HIGH,CRITICAL", "motovault:latest"] +``` + +#### 2. Configure security monitoring and alerting +```yaml +apiVersion: monitoring.coreos.com/v1 +kind: PrometheusRule +metadata: + name: security-alerts +spec: + groups: + - name: security.rules + rules: + - alert: HighFailedLoginAttempts + expr: rate(motovault_failed_login_attempts_total[5m]) > 10 + labels: + severity: warning + annotations: + summary: "High number of failed login attempts" + description: "{{ $value }} failed login attempts per second" + + - alert: SuspiciousNetworkActivity + expr: rate(container_network_receive_bytes_total{namespace="motovault"}[5m]) > 1e8 + labels: + severity: critical + annotations: + summary: "Unusual network activity detected" +``` + +#### 3. Implement rate limiting and DDoS protection +```csharp +services.AddRateLimiter(options => +{ + options.RejectionStatusCode = StatusCodes.Status429TooManyRequests; + + options.AddFixedWindowLimiter("api", limiterOptions => + { + limiterOptions.PermitLimit = 100; + limiterOptions.Window = TimeSpan.FromMinutes(1); + limiterOptions.QueueProcessingOrder = QueueProcessingOrder.OldestFirst; + limiterOptions.QueueLimit = 10; + }); + + options.AddSlidingWindowLimiter("login", limiterOptions => + { + limiterOptions.PermitLimit = 5; + limiterOptions.Window = TimeSpan.FromMinutes(5); + limiterOptions.SegmentsPerWindow = 5; + }); +}); +``` + +## 4.4 Production Migration Execution + +**Objective**: Execute seamless production migration with minimal downtime. 
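Because the blue-green rollout in this section disables automatic promotion, cutover is an explicit operator action. A sketch of that workflow using the Argo Rollouts kubectl plugin, assuming the rollout and namespace names used in the manifests here (the preview hostname is a placeholder for however `motovault-preview` is exposed):

```bash
# Watch the preview (green) ReplicaSet come up and pass its health checks
kubectl argo rollouts get rollout motovault-rollout -n motovault --watch

# Smoke-test the preview service before switching any traffic
curl -f https://preview.motovault.example.com/health/ready

# Promote: the active service selector is re-pointed at the new ReplicaSet
kubectl argo rollouts promote motovault-rollout -n motovault

# If metrics regress, abort to send traffic back to the stable version
kubectl argo rollouts abort motovault-rollout -n motovault
```

Keeping `scaleDownDelaySeconds` above zero means the previous ReplicaSet is still running immediately after promotion, which is what makes the abort path near-instant.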
+ +### Blue-Green Deployment Strategy + +```yaml +apiVersion: argoproj.io/v1alpha1 +kind: Rollout +metadata: + name: motovault-rollout + namespace: motovault +spec: + replicas: 5 + strategy: + blueGreen: + activeService: motovault-active + previewService: motovault-preview + autoPromotionEnabled: false + scaleDownDelaySeconds: 30 + prePromotionAnalysis: + templates: + - templateName: health-check + args: + - name: service-name + value: motovault-preview + postPromotionAnalysis: + templates: + - templateName: performance-check + args: + - name: service-name + value: motovault-active + selector: + matchLabels: + app: motovault + template: + metadata: + labels: + app: motovault + spec: + containers: + - name: motovault + image: motovault:latest + # ... container specification +``` + +### Migration Validation Scripts + +```bash +#!/bin/bash +# Production migration validation script + +echo "Starting production migration validation..." + +# Validate database connectivity +echo "Checking database connectivity..." +kubectl exec -n motovault deployment/motovault-app -- \ + curl -f http://localhost:8080/health/ready || exit 1 + +# Validate MinIO connectivity +echo "Checking MinIO connectivity..." +kubectl exec -n motovault deployment/motovault-app -- \ + curl -f http://minio-service:9000/minio/health/live || exit 1 + +# Validate Redis connectivity +echo "Checking Redis connectivity..." +kubectl exec -n motovault redis-cluster-0 -- \ + redis-cli ping || exit 1 + +# Test critical user journeys +echo "Testing critical user journeys..." +python3 migration_tests.py --endpoint https://motovault.example.com + +# Validate performance metrics +echo "Checking performance metrics..." 
+response_time=$(curl -s "http://prometheus:9090/api/v1/query?query=histogram_quantile(0.95,rate(motovault_http_request_duration_seconds_bucket[5m]))" | jq -r '.data.result[0].value[1]') +if (( $(echo "$response_time > 2.0" | bc -l) )); then + echo "Performance degradation detected: ${response_time}s" + exit 1 +fi + +echo "Migration validation completed successfully" +``` + +### Rollback Procedures + +```bash +#!/bin/bash +# Emergency rollback script + +echo "Initiating emergency rollback..." + +# Switch traffic back to previous version +kubectl patch rollout motovault-rollout -n motovault \ + --type='merge' -p='{"spec":{"strategy":{"blueGreen":{"activeService":"motovault-previous"}}}}' + +# Scale down new version +kubectl scale deployment motovault-app-new --replicas=0 -n motovault + +# Restore database from last known good backup +BACKUP_TIMESTAMP=$(date -d "1 hour ago" +"%Y%m%d_%H0000") +./restore_database.sh "$BACKUP_TIMESTAMP" + +# Validate rollback success +curl -f https://motovault.example.com/health/ready + +echo "Rollback completed" +``` + +### Implementation Tasks + +#### 1. Execute phased traffic migration +```yaml +apiVersion: networking.istio.io/v1beta1 +kind: VirtualService +metadata: + name: motovault-traffic-split +spec: + http: + - match: + - headers: + x-canary: + exact: "true" + route: + - destination: + host: motovault-service + subset: v2 + weight: 100 + - route: + - destination: + host: motovault-service + subset: v1 + weight: 90 + - destination: + host: motovault-service + subset: v2 + weight: 10 +``` + +#### 2. 
Implement automated rollback triggers +```yaml +apiVersion: argoproj.io/v1alpha1 +kind: AnalysisTemplate +metadata: + name: automated-rollback +spec: + metrics: + - name: error-rate + provider: + prometheus: + address: http://prometheus:9090 + query: rate(motovault_http_requests_total{status_code=~"5.."}[2m]) + successCondition: result[0] < 0.05 + failureLimit: 3 + - name: response-time + provider: + prometheus: + address: http://prometheus:9090 + query: histogram_quantile(0.95, rate(motovault_http_request_duration_seconds_bucket[2m])) + successCondition: result[0] < 2.0 + failureLimit: 3 +``` + +#### 3. Configure comprehensive monitoring during migration +- Real-time error rate monitoring +- Performance metric tracking +- User experience validation +- Resource utilization monitoring + +## Week-by-Week Breakdown + +### Week 13: Advanced Caching and Performance +- **Days 1-2**: Implement multi-level caching architecture +- **Days 3-4**: Optimize database queries and connection pooling +- **Days 5-7**: Configure CDN and response optimization + +### Week 14: Security Enhancements +- **Days 1-2**: Implement advanced security policies +- **Days 3-4**: Configure external secrets management +- **Days 5-7**: Set up security monitoring and scanning + +### Week 15: Production Migration +- **Days 1-2**: Execute database migration and validation +- **Days 3-4**: Perform blue-green deployment cutover +- **Days 5-7**: Monitor performance and user experience + +### Week 16: Optimization and Documentation +- **Days 1-3**: Performance tuning based on production metrics +- **Days 4-5**: Complete operational documentation +- **Days 6-7**: Team training and knowledge transfer + +## Success Criteria + +- [ ] Multi-layer caching reducing database load by 70% +- [ ] 95th percentile response time under 500ms +- [ ] Zero-downtime production migration +- [ ] Advanced security policies implemented and validated +- [ ] Comprehensive monitoring and alerting operational +- [ ] Team trained on 
new operational procedures +- [ ] Performance optimization achieving 10x scalability + +## Testing Requirements + +### Performance Validation +- Load testing with 10x expected traffic +- Database performance under stress +- Cache efficiency and hit ratios +- End-to-end response time validation + +### Security Testing +- Penetration testing of all endpoints +- Container security scanning +- Network policy validation +- Authentication and authorization testing + +### Migration Testing +- Complete migration dry runs +- Rollback procedure validation +- Data integrity verification +- User acceptance testing + +## Deliverables + +1. **Optimized Application** + - Multi-layer caching implementation + - Performance-optimized queries + - Security-hardened deployment + - Production-ready configuration + +2. **Migration Artifacts** + - Migration scripts and procedures + - Rollback automation + - Validation tools + - Performance baselines + +3. **Documentation** + - Operational runbooks + - Performance tuning guides + - Security procedures + - Training materials + +## Final Success Metrics + +### Technical Achievements +- **Availability**: 99.9% uptime achieved +- **Performance**: 95th percentile response time < 500ms +- **Scalability**: 10x user load capacity demonstrated +- **Security**: Zero critical vulnerabilities + +### Operational Achievements +- **Deployment**: Zero-downtime deployments enabled +- **Recovery**: RTO < 30 minutes, RPO < 5 minutes +- **Monitoring**: 100% observability coverage +- **Automation**: 90% reduction in manual operations + +### Business Value +- **User Experience**: No degradation during migration +- **Cost Efficiency**: Infrastructure costs optimized +- **Future Readiness**: Foundation for advanced features +- **Operational Excellence**: Reduced maintenance overhead + +--- + +**Previous Phase**: [Phase 3: Production Deployment](K8S-PHASE-3.md) +**Project Overview**: [Kubernetes Modernization Overview](K8S-OVERVIEW.md) \ No newline at end of file 
diff --git a/K8S-REFACTOR.md b/K8S-REFACTOR.md new file mode 100644 index 0000000..207480a --- /dev/null +++ b/K8S-REFACTOR.md @@ -0,0 +1,2009 @@ +# Kubernetes Modernization Plan for MotoVaultPro + +## Executive Summary + +This document outlines a comprehensive plan to modernize MotoVaultPro from a traditional self-hosted application to a cloud-native, highly available system running on Kubernetes. The modernization focuses on transforming the current monolithic ASP.NET Core application into a resilient, scalable platform capable of handling enterprise-level workloads while maintaining the existing feature set and user experience. + +### Key Objectives +- **High Availability**: Eliminate single points of failure through distributed architecture +- **Scalability**: Enable horizontal scaling to handle increased user loads +- **Resilience**: Implement fault tolerance and automatic recovery mechanisms +- **Cloud-Native**: Adopt Kubernetes-native patterns and best practices +- **Operational Excellence**: Improve monitoring, logging, and maintenance capabilities + +### Strategic Benefits +- **Reduced Downtime**: Multi-replica deployments with automatic failover +- **Improved Performance**: Distributed caching and optimized data access patterns +- **Enhanced Security**: Pod-level isolation and secret management +- **Cost Optimization**: Efficient resource utilization through auto-scaling +- **Future-Ready**: Foundation for microservices and advanced cloud features + +## Current Architecture Analysis + +### Existing System Overview +MotoVaultPro is currently deployed as a monolithic ASP.NET Core 8.0 application with the following characteristics: + +#### Application Architecture +- **Monolithic Design**: Single deployable unit containing all functionality +- **MVC Pattern**: Traditional Model-View-Controller architecture +- **Dual Database Support**: LiteDB (embedded) and PostgreSQL (external) +- **File Storage**: Local filesystem for document attachments +- **Session 
Management**: In-memory or cookie-based sessions +- **Configuration**: File-based configuration with environment variables + +#### Current Deployment Model +- **Single Instance**: Typically deployed as a single container or VM +- **Stateful**: Relies on local storage for files and embedded database +- **Limited Scalability**: Cannot horizontally scale due to state dependencies +- **Single Point of Failure**: No redundancy or automatic recovery + +#### Identified Limitations for Kubernetes +1. **State Dependencies**: LiteDB and local file storage prevent stateless operation +2. **Configuration Management**: File-based configuration not suitable for container orchestration +3. **Health Monitoring**: Lacks Kubernetes-compatible health check endpoints +4. **Logging**: Basic logging not optimized for centralized log aggregation +5. **Resource Management**: No resource constraints or auto-scaling capabilities +6. **Secret Management**: Sensitive configuration stored in plain text files + +## Target Architecture + +### Cloud-Native Design Principles +The modernized architecture will embrace the following cloud-native principles: + +#### Stateless Application Design +- **External State Storage**: All state moved to external, highly available services +- **Horizontal Scalability**: Multiple application replicas with load balancing +- **Configuration as Code**: All configuration externalized to ConfigMaps and Secrets +- **Ephemeral Containers**: Pods can be created, destroyed, and recreated without data loss + +#### Distributed Data Architecture +- **PostgreSQL Cluster**: Primary/replica configuration with automatic failover +- **MinIO High Availability**: Distributed object storage for file attachments +- **Redis Cluster**: Distributed caching and session storage +- **Backup Strategy**: Automated backups with point-in-time recovery + +#### Observability and Operations +- **Structured Logging**: JSON logging with correlation IDs for distributed tracing +- **Metrics 
Collection**: Prometheus-compatible metrics for monitoring +- **Health Checks**: Kubernetes-native readiness and liveness probes +- **Distributed Tracing**: OpenTelemetry integration for request flow analysis + +### High-Level Architecture Diagram +``` +┌─────────────────────────────────────────────────────────────────┐ +│ Kubernetes Cluster │ +├─────────────────────────────────────────────────────────────────┤ +│ ┌─────────────────┐ ┌─────────────────┐ ┌─────────────────┐ │ +│ │ MotoVault │ │ MotoVault │ │ MotoVault │ │ +│ │ Pod (1) │ │ Pod (2) │ │ Pod (3) │ │ +│ │ │ │ │ │ │ │ +│ └─────────────────┘ └─────────────────┘ └─────────────────┘ │ +│ │ │ │ │ +│ ┌─────────────────────────────────────────────────────────────┐ │ +│ │ Load Balancer Service │ │ +│ └─────────────────────────────────────────────────────────────┘ │ +│ │ │ │ │ +├───────────┼─────────────────────┼─────────────────────┼──────────┤ +│ ┌────────▼──────┐ ┌─────────▼──────┐ ┌─────────▼──────┐ │ +│ │ PostgreSQL │ │ Redis Cluster │ │ MinIO Cluster │ │ +│ │ Primary │ │ (3 nodes) │ │ (4+ nodes) │ │ +│ │ + 2 Replicas │ │ │ │ Erasure Coded │ │ +│ └───────────────┘ └────────────────┘ └────────────────┘ │ +└─────────────────────────────────────────────────────────────────┘ +``` + +## Detailed Implementation Phases + +### Phase 1: Core Kubernetes Readiness (Weeks 1-4) + +This phase focuses on making the application compatible with Kubernetes deployment patterns while maintaining existing functionality. + +#### 1.1 Configuration Externalization + +**Objective**: Move all configuration from files to Kubernetes-native configuration management. 
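The end state of this section is a pod that consumes configuration entirely from its environment: the ConfigMap and Secret defined in the tasks below are wired into the pod spec with `envFrom`, so every key becomes an environment variable without baking values into the image. An abridged deployment sketch (deployment name and image tag are illustrative):

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: motovault-app
  namespace: motovault
spec:
  template:
    spec:
      containers:
        - name: motovault
          image: motovault:latest   # illustrative tag
          envFrom:
            - configMapRef:
                name: motovault-config    # non-sensitive settings
            - secretRef:
                name: motovault-secrets   # sensitive values, injected at runtime
```

ASP.NET Core's default configuration builder already layers environment variables onto `IConfiguration`, so most of the application-side change is removing file-based providers rather than adding new code.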
+ +**Current State**: +- Configuration stored in `appsettings.json` and environment variables +- Database connection strings in configuration files +- Feature flags and application settings mixed with deployment configuration + +**Target State**: +- All configuration externalized to ConfigMaps and Secrets +- Environment-specific configuration separated from application code +- Sensitive data (passwords, API keys) managed through Kubernetes Secrets + +**Implementation Tasks**: +1. **Create ConfigMap templates** for non-sensitive configuration + ```yaml + apiVersion: v1 + kind: ConfigMap + metadata: + name: motovault-config + data: + APP_NAME: "MotoVaultPro" + LOG_LEVEL: "Information" + ENABLE_FEATURES: "OpenIDConnect,EmailNotifications" + CACHE_EXPIRY_MINUTES: "30" + ``` + +2. **Create Secret templates** for sensitive configuration + ```yaml + apiVersion: v1 + kind: Secret + metadata: + name: motovault-secrets + type: Opaque + data: + POSTGRES_CONNECTION: + MINIO_ACCESS_KEY: + MINIO_SECRET_KEY: + JWT_SECRET: + ``` + +3. **Modify application startup** to read from environment variables +4. **Remove file-based configuration** dependencies +5. **Implement configuration validation** at startup + +#### 1.2 Database Architecture Modernization + +**Objective**: Eliminate LiteDB dependency and optimize PostgreSQL usage for Kubernetes. + +**Current State**: +- Dual database support with LiteDB as default +- Single PostgreSQL connection for external database mode +- No connection pooling optimization for multiple instances + +**Target State**: +- PostgreSQL-only configuration with high availability +- Optimized connection pooling for horizontal scaling +- Database migration strategy for existing LiteDB installations + +**Implementation Tasks**: +1. **Remove LiteDB implementation** and dependencies +2. 
**Implement PostgreSQL HA configuration**:
+   ```csharp
+   services.AddDbContext<ApplicationDbContext>(options =>
+   {
+       options.UseNpgsql(connectionString, npgsqlOptions =>
+       {
+           npgsqlOptions.EnableRetryOnFailure(
+               maxRetryCount: 3,
+               maxRetryDelay: TimeSpan.FromSeconds(5),
+               errorCodesToAdd: null);
+       });
+   });
+   ```
+3. **Add connection pooling configuration**:
+   ```csharp
+   // Npgsql pooling is controlled through the connection string, so the
+   // limits are applied before the string is handed to EF Core
+   var csb = new NpgsqlConnectionStringBuilder(connectionString)
+   {
+       MaxPoolSize = 100,
+       MinPoolSize = 10,
+       ConnectionLifetime = 300 // seconds (5 minutes)
+   };
+   connectionString = csb.ConnectionString;
+   ```
+4. **Create data migration tools** for LiteDB to PostgreSQL conversion
+5. **Implement database health checks** for Kubernetes probes
+
+#### 1.3 Health Check Implementation
+
+**Objective**: Add Kubernetes-compatible health check endpoints for proper orchestration.
+
+**Current State**:
+- No dedicated health check endpoints
+- Application startup/shutdown not optimized for Kubernetes
+
+**Target State**:
+- Comprehensive health checks for all dependencies
+- Proper readiness and liveness probe endpoints
+- Graceful shutdown handling for pod termination
+
+**Implementation Tasks**:
+1. **Add health check middleware**:
+   ```csharp
+   // Program.cs
+   builder.Services.AddHealthChecks()
+       .AddNpgSql(connectionString, name: "database", tags: new[] { "ready" })
+       .AddRedis(redisConnectionString, name: "cache", tags: new[] { "ready" })
+       .AddCheck<MinIOHealthCheck>("minio", tags: new[] { "ready" });
+
+   app.MapHealthChecks("/health/ready", new HealthCheckOptions
+   {
+       Predicate = check => check.Tags.Contains("ready"),
+       ResponseWriter = UIResponseWriter.WriteHealthCheckUIResponse
+   });
+
+   app.MapHealthChecks("/health/live", new HealthCheckOptions
+   {
+       Predicate = _ => false // No dependency checks; only verifies the app is responsive
+   });
+   ```
+
+2. 
**Implement custom health checks**:
+   ```csharp
+   public class MinIOHealthCheck : IHealthCheck
+   {
+       private readonly IMinioClient _minioClient;
+
+       public MinIOHealthCheck(IMinioClient minioClient)
+       {
+           _minioClient = minioClient;
+       }
+
+       public async Task<HealthCheckResult> CheckHealthAsync(
+           HealthCheckContext context,
+           CancellationToken cancellationToken = default)
+       {
+           try
+           {
+               await _minioClient.ListBucketsAsync(cancellationToken);
+               return HealthCheckResult.Healthy("MinIO is accessible");
+           }
+           catch (Exception ex)
+           {
+               return HealthCheckResult.Unhealthy("MinIO is not accessible", ex);
+           }
+       }
+   }
+   ```
+
+3. **Add graceful shutdown handling**:
+   ```csharp
+   builder.Services.Configure<HostOptions>(options =>
+   {
+       options.ShutdownTimeout = TimeSpan.FromSeconds(30);
+   });
+   ```
+
+#### 1.4 Logging Enhancement
+
+**Objective**: Implement structured logging suitable for centralized log aggregation.
+
+**Current State**:
+- Basic logging with simple string messages
+- No correlation IDs for distributed tracing
+- Log levels not optimized for production monitoring
+
+**Target State**:
+- JSON-structured logging with correlation IDs
+- Centralized log aggregation compatibility
+- Performance and error metrics embedded in logs
+
+**Implementation Tasks**:
+1. **Configure structured logging**:
+   ```csharp
+   builder.Services.AddLogging(loggingBuilder =>
+   {
+       loggingBuilder.ClearProviders();
+       loggingBuilder.AddJsonConsole(options =>
+       {
+           options.IncludeScopes = true;
+           options.TimestampFormat = "yyyy-MM-ddTHH:mm:ss.fffZ";
+           options.JsonWriterOptions = new JsonWriterOptions
+           {
+               Indented = false
+           };
+       });
+   });
+   ```
+
+2. **Add correlation ID middleware**:
+   ```csharp
+   public class CorrelationIdMiddleware : IMiddleware
+   {
+       private readonly ILogger<CorrelationIdMiddleware> _logger;
+
+       public CorrelationIdMiddleware(ILogger<CorrelationIdMiddleware> logger)
+       {
+           _logger = logger;
+       }
+
+       public async Task InvokeAsync(HttpContext context, RequestDelegate next)
+       {
+           var correlationId = context.Request.Headers["X-Correlation-ID"]
+               .FirstOrDefault() ?? 
Guid.NewGuid().ToString();
+
+           using var scope = _logger.BeginScope(new Dictionary<string, object>
+           {
+               ["CorrelationId"] = correlationId,
+               ["UserId"] = context.User?.Identity?.Name
+           });
+
+           context.Response.Headers["X-Correlation-ID"] = correlationId;
+           await next(context);
+       }
+   }
+   ```
+
+3. **Implement performance logging** for critical operations
+
+### Phase 2: High Availability Infrastructure (Weeks 5-8)
+
+This phase focuses on implementing the supporting infrastructure required for high availability.
+
+#### 2.1 MinIO High Availability Setup
+
+**Objective**: Deploy a highly available MinIO cluster for file storage with automatic failover.
+
+**Architecture Overview**:
+MinIO will be deployed as a distributed cluster with erasure coding for data protection and automatic healing capabilities.
+
+**MinIO Cluster Configuration**:
+```yaml
+# MinIO Tenant Configuration
+apiVersion: minio.min.io/v2
+kind: Tenant
+metadata:
+  name: motovault-minio
+  namespace: motovault
+spec:
+  image: minio/minio:RELEASE.2024-01-16T16-07-38Z
+  pools:
+    - servers: 4
+      name: pool-0
+      volumesPerServer: 4
+      volumeClaimTemplate:
+        metadata:
+          name: data
+        spec:
+          accessModes:
+            - ReadWriteOnce
+          resources:
+            requests:
+              storage: 100Gi
+          storageClassName: fast-ssd
+  mountPath: /export
+  subPath: /data
+  requestAutoCert: false
+  certConfig:
+    commonName: ""
+    organizationName: []
+    dnsNames: []
+  console:
+    image: minio/console:v0.22.5
+    replicas: 2
+    consoleSecret:
+      name: motovault-minio-console-secret
+  configuration:
+    name: motovault-minio-config
+```
+
+**Implementation Tasks**:
+1. **Deploy MinIO Operator**:
+   ```bash
+   kubectl apply -k "github.com/minio/operator/resources"
+   ```
+
+2. **Create MinIO cluster configuration** with erasure coding for data protection
+3. 
**Configure backup policies** for disaster recovery +4. **Set up monitoring** with Prometheus metrics +5. **Create service endpoints** for application connectivity + +**MinIO High Availability Features**: +- **Erasure Coding**: Data is split across multiple drives with parity for automatic healing +- **Distributed Architecture**: No single point of failure +- **Automatic Healing**: Corrupted data is automatically detected and repaired +- **Load Balancing**: Built-in load balancing across cluster nodes +- **Bucket Policies**: Fine-grained access control for different data types + +#### 2.2 File Storage Abstraction Implementation + +**Objective**: Create an abstraction layer that allows seamless switching between local filesystem and MinIO object storage. + +**Current State**: +- Direct filesystem operations throughout the application +- File paths hardcoded in various controllers and services +- No abstraction for different storage backends + +**Target State**: +- Unified file storage interface +- Pluggable storage implementations +- Transparent migration between storage types + +**Implementation Tasks**: +1. 
**Define storage abstraction interface**:
+   ```csharp
+   public interface IFileStorageService
+   {
+       Task<string> UploadFileAsync(Stream fileStream, string fileName, string contentType, CancellationToken cancellationToken = default);
+       Task<Stream> DownloadFileAsync(string fileId, CancellationToken cancellationToken = default);
+       Task DeleteFileAsync(string fileId, CancellationToken cancellationToken = default);
+       Task<FileMetadata> GetFileMetadataAsync(string fileId, CancellationToken cancellationToken = default);
+       Task<IEnumerable<FileMetadata>> ListFilesAsync(string prefix = null, CancellationToken cancellationToken = default);
+       Task<string> GeneratePresignedUrlAsync(string fileId, TimeSpan expiration, CancellationToken cancellationToken = default);
+   }
+
+   public class FileMetadata
+   {
+       public string Id { get; set; }
+       public string FileName { get; set; }
+       public string ContentType { get; set; }
+       public long Size { get; set; }
+       public DateTime CreatedDate { get; set; }
+       public DateTime ModifiedDate { get; set; }
+       public Dictionary<string, string> Tags { get; set; }
+   }
+   ```
+
+2. **Implement MinIO storage service**:
+   ```csharp
+   public class MinIOFileStorageService : IFileStorageService
+   {
+       private readonly IMinioClient _minioClient;
+       private readonly ILogger<MinIOFileStorageService> _logger;
+       private readonly string _bucketName;
+
+       public MinIOFileStorageService(IMinioClient minioClient, IConfiguration configuration, ILogger<MinIOFileStorageService> logger)
+       {
+           _minioClient = minioClient;
+           _logger = logger;
+           _bucketName = configuration["MinIO:BucketName"] ?? 
"motovault-files";
+       }
+
+       public async Task<string> UploadFileAsync(Stream fileStream, string fileName, string contentType, CancellationToken cancellationToken = default)
+       {
+           var fileId = $"{Guid.NewGuid()}/{fileName}";
+
+           try
+           {
+               await _minioClient.PutObjectAsync(new PutObjectArgs()
+                   .WithBucket(_bucketName)
+                   .WithObject(fileId)
+                   .WithStreamData(fileStream)
+                   .WithObjectSize(fileStream.Length)
+                   .WithContentType(contentType)
+                   .WithHeaders(new Dictionary<string, string>
+                   {
+                       ["X-Amz-Meta-Original-Name"] = fileName,
+                       ["X-Amz-Meta-Upload-Date"] = DateTime.UtcNow.ToString("O")
+                   }), cancellationToken);
+
+               _logger.LogInformation("File uploaded successfully: {FileId}", fileId);
+               return fileId;
+           }
+           catch (Exception ex)
+           {
+               _logger.LogError(ex, "Failed to upload file: {FileName}", fileName);
+               throw;
+           }
+       }
+
+       // Additional method implementations...
+   }
+   ```
+
+3. **Create fallback storage service** for graceful degradation:
+   ```csharp
+   public class FallbackFileStorageService : IFileStorageService
+   {
+       private readonly IFileStorageService _primaryService;
+       private readonly IFileStorageService _fallbackService;
+       private readonly ILogger<FallbackFileStorageService> _logger;
+
+       // Implementation with automatic fallback logic
+   }
+   ```
+
+4. **Update all file operations** to use the abstraction layer
+5. **Implement file migration utility** for existing local files
+
+#### 2.3 PostgreSQL High Availability Configuration
+
+**Objective**: Set up a PostgreSQL cluster with automatic failover and read replicas.
+
+**Architecture Overview**:
+PostgreSQL will be deployed using an operator (such as CloudNativePG or Postgres Operator) to provide automated failover, backup, and scaling capabilities.
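+
+The task list for this section calls for PgBouncer-based connection pooling. If CloudNativePG is the operator chosen, that does not require a hand-rolled deployment: its `Pooler` resource runs PgBouncer in front of the cluster. The sketch below is illustrative only — the instance count and PgBouncer parameters are assumptions to be tuned, and the cluster name must match the actual `Cluster` resource:
+
+```yaml
+apiVersion: postgresql.cnpg.io/v1
+kind: Pooler
+metadata:
+  name: motovault-postgres-pooler
+  namespace: motovault
+spec:
+  cluster:
+    name: motovault-postgres   # must match the Cluster resource name
+  instances: 2                 # assumed; scale with application replica count
+  type: rw                     # pool in front of the primary (read/write)
+  pgbouncer:
+    poolMode: transaction
+    parameters:
+      max_client_conn: "500"   # illustrative values
+      default_pool_size: "50"
+```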
+ +**PostgreSQL Cluster Configuration**: +```yaml +apiVersion: postgresql.cnpg.io/v1 +kind: Cluster +metadata: + name: motovault-postgres + namespace: motovault +spec: + instances: 3 + primaryUpdateStrategy: unsupervised + + postgresql: + parameters: + max_connections: "200" + shared_buffers: "256MB" + effective_cache_size: "1GB" + maintenance_work_mem: "64MB" + checkpoint_completion_target: "0.9" + wal_buffers: "16MB" + default_statistics_target: "100" + random_page_cost: "1.1" + effective_io_concurrency: "200" + + resources: + requests: + memory: "2Gi" + cpu: "1000m" + limits: + memory: "4Gi" + cpu: "2000m" + + storage: + size: "100Gi" + storageClass: "fast-ssd" + + monitoring: + enabled: true + + backup: + retentionPolicy: "30d" + barmanObjectStore: + destinationPath: "s3://motovault-backups/postgres" + s3Credentials: + accessKeyId: + name: postgres-backup-credentials + key: ACCESS_KEY_ID + secretAccessKey: + name: postgres-backup-credentials + key: SECRET_ACCESS_KEY + wal: + retention: "5d" + data: + retention: "30d" + jobs: 1 +``` + +**Implementation Tasks**: +1. **Deploy PostgreSQL operator** (CloudNativePG recommended) +2. **Configure cluster with primary/replica setup** +3. **Set up automated backups** to MinIO or external storage +4. **Implement connection pooling** with PgBouncer +5. **Configure monitoring** and alerting for database health + +#### 2.4 Redis Cluster for Session Management + +**Objective**: Implement distributed session storage and caching using Redis cluster. 
+ +**Current State**: +- In-memory session storage tied to individual application instances +- No distributed caching for expensive operations +- Configuration and translation data loaded on each application start + +**Target State**: +- Redis cluster for distributed session storage +- Centralized caching for frequently accessed data +- High availability with automatic failover + +**Redis Cluster Configuration**: +```yaml +apiVersion: v1 +kind: ConfigMap +metadata: + name: redis-cluster-config + namespace: motovault +data: + redis.conf: | + cluster-enabled yes + cluster-require-full-coverage no + cluster-node-timeout 15000 + cluster-config-file /data/nodes.conf + cluster-migration-barrier 1 + appendonly yes + appendfsync everysec + save 900 1 + save 300 10 + save 60 10000 + +--- +apiVersion: apps/v1 +kind: StatefulSet +metadata: + name: redis-cluster + namespace: motovault +spec: + serviceName: redis-cluster + replicas: 6 + selector: + matchLabels: + app: redis-cluster + template: + metadata: + labels: + app: redis-cluster + spec: + containers: + - name: redis + image: redis:7-alpine + command: + - redis-server + - /etc/redis/redis.conf + ports: + - containerPort: 6379 + - containerPort: 16379 + resources: + requests: + memory: "512Mi" + cpu: "250m" + limits: + memory: "1Gi" + cpu: "500m" + volumeMounts: + - name: redis-config + mountPath: /etc/redis + - name: redis-data + mountPath: /data + volumes: + - name: redis-config + configMap: + name: redis-cluster-config + volumeClaimTemplates: + - metadata: + name: redis-data + spec: + accessModes: ["ReadWriteOnce"] + resources: + requests: + storage: 10Gi +``` + +**Implementation Tasks**: +1. **Deploy Redis cluster** with 6 nodes (3 masters, 3 replicas) +2. 
**Configure session storage**:
+   ```csharp
+   services.AddStackExchangeRedisCache(options =>
+   {
+       options.Configuration = configuration.GetConnectionString("Redis");
+       options.InstanceName = "MotoVault";
+   });
+
+   services.AddSession(options =>
+   {
+       options.IdleTimeout = TimeSpan.FromMinutes(30);
+       options.Cookie.HttpOnly = true;
+       options.Cookie.IsEssential = true;
+       options.Cookie.SecurePolicy = CookieSecurePolicy.Always;
+   });
+   ```
+
+3. **Implement distributed caching**:
+   ```csharp
+   public class CachedTranslationService : ITranslationService
+   {
+       private readonly IDistributedCache _cache;
+       private readonly ITranslationService _translationService;
+       private readonly ILogger<CachedTranslationService> _logger;
+
+       public async Task<string> GetTranslationAsync(string key, string language)
+       {
+           var cacheKey = $"translation:{language}:{key}";
+           var cached = await _cache.GetStringAsync(cacheKey);
+
+           if (cached != null)
+           {
+               return cached;
+           }
+
+           var translation = await _translationService.GetTranslationAsync(key, language);
+
+           await _cache.SetStringAsync(cacheKey, translation, new DistributedCacheEntryOptions
+           {
+               SlidingExpiration = TimeSpan.FromHours(1)
+           });
+
+           return translation;
+       }
+   }
+   ```
+
+4. **Add cache monitoring** and performance metrics
+
+### Phase 3: Production Deployment (Weeks 9-12)
+
+This phase focuses on deploying the modernized application with proper production configurations and operational procedures.
+
+#### 3.1 Kubernetes Deployment Configuration
+
+**Objective**: Create production-ready Kubernetes manifests with proper resource management and high availability.
+ +**Application Deployment Configuration**: +```yaml +apiVersion: apps/v1 +kind: Deployment +metadata: + name: motovault-app + namespace: motovault + labels: + app: motovault + version: v1.0.0 +spec: + replicas: 3 + strategy: + type: RollingUpdate + rollingUpdate: + maxSurge: 1 + maxUnavailable: 0 + selector: + matchLabels: + app: motovault + template: + metadata: + labels: + app: motovault + version: v1.0.0 + annotations: + prometheus.io/scrape: "true" + prometheus.io/path: "/metrics" + prometheus.io/port: "8080" + spec: + serviceAccountName: motovault-service-account + securityContext: + runAsNonRoot: true + runAsUser: 1000 + fsGroup: 2000 + affinity: + podAntiAffinity: + preferredDuringSchedulingIgnoredDuringExecution: + - weight: 100 + podAffinityTerm: + labelSelector: + matchExpressions: + - key: app + operator: In + values: + - motovault + topologyKey: kubernetes.io/hostname + - weight: 50 + podAffinityTerm: + labelSelector: + matchExpressions: + - key: app + operator: In + values: + - motovault + topologyKey: topology.kubernetes.io/zone + containers: + - name: motovault + image: motovault:latest + imagePullPolicy: Always + ports: + - containerPort: 8080 + name: http + protocol: TCP + env: + - name: ASPNETCORE_ENVIRONMENT + value: "Production" + - name: ASPNETCORE_URLS + value: "http://+:8080" + envFrom: + - configMapRef: + name: motovault-config + - secretRef: + name: motovault-secrets + resources: + requests: + memory: "512Mi" + cpu: "250m" + limits: + memory: "1Gi" + cpu: "500m" + readinessProbe: + httpGet: + path: /health/ready + port: 8080 + initialDelaySeconds: 10 + periodSeconds: 5 + timeoutSeconds: 3 + failureThreshold: 3 + livenessProbe: + httpGet: + path: /health/live + port: 8080 + initialDelaySeconds: 30 + periodSeconds: 10 + timeoutSeconds: 5 + failureThreshold: 3 + securityContext: + allowPrivilegeEscalation: false + readOnlyRootFilesystem: true + capabilities: + drop: + - ALL + volumeMounts: + - name: tmp-volume + mountPath: /tmp + - name: 
app-logs + mountPath: /app/logs + volumes: + - name: tmp-volume + emptyDir: {} + - name: app-logs + emptyDir: {} + terminationGracePeriodSeconds: 30 + +--- +apiVersion: v1 +kind: Service +metadata: + name: motovault-service + namespace: motovault + labels: + app: motovault +spec: + type: ClusterIP + ports: + - port: 80 + targetPort: 8080 + protocol: TCP + name: http + selector: + app: motovault + +--- +apiVersion: policy/v1 +kind: PodDisruptionBudget +metadata: + name: motovault-pdb + namespace: motovault +spec: + minAvailable: 2 + selector: + matchLabels: + app: motovault +``` + +**Horizontal Pod Autoscaler Configuration**: +```yaml +apiVersion: autoscaling/v2 +kind: HorizontalPodAutoscaler +metadata: + name: motovault-hpa + namespace: motovault +spec: + scaleTargetRef: + apiVersion: apps/v1 + kind: Deployment + name: motovault-app + minReplicas: 3 + maxReplicas: 10 + metrics: + - type: Resource + resource: + name: cpu + target: + type: Utilization + averageUtilization: 70 + - type: Resource + resource: + name: memory + target: + type: Utilization + averageUtilization: 80 + behavior: + scaleUp: + stabilizationWindowSeconds: 300 + policies: + - type: Percent + value: 100 + periodSeconds: 15 + scaleDown: + stabilizationWindowSeconds: 300 + policies: + - type: Percent + value: 10 + periodSeconds: 60 +``` + +#### 3.2 Ingress and TLS Configuration + +**Objective**: Configure secure external access with proper TLS termination and routing. 
+
+**Ingress Configuration**:
+```yaml
+apiVersion: networking.k8s.io/v1
+kind: Ingress
+metadata:
+  name: motovault-ingress
+  namespace: motovault
+  annotations:
+    nginx.ingress.kubernetes.io/ssl-redirect: "true"
+    nginx.ingress.kubernetes.io/force-ssl-redirect: "true"
+    nginx.ingress.kubernetes.io/proxy-body-size: "50m"
+    nginx.ingress.kubernetes.io/proxy-read-timeout: "300"
+    nginx.ingress.kubernetes.io/proxy-send-timeout: "300"
+    cert-manager.io/cluster-issuer: "letsencrypt-prod"
+    nginx.ingress.kubernetes.io/limit-rpm: "100"
+spec:
+  ingressClassName: nginx
+  tls:
+  - hosts:
+    - motovault.example.com
+    secretName: motovault-tls
+  rules:
+  - host: motovault.example.com
+    http:
+      paths:
+      - path: /
+        pathType: Prefix
+        backend:
+          service:
+            name: motovault-service
+            port:
+              number: 80
+```
+
+#### 3.3 Monitoring and Observability Setup
+
+**Objective**: Implement comprehensive monitoring, logging, and alerting for production operations.
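+
+Prometheus scraping presumes the application actually serves a `/metrics` endpoint. With the prometheus-net library (which the custom metrics service in this section also assumes), exposing it is two calls in `Program.cs` — this is a sketch using prometheus-net.AspNetCore, not existing MotoVaultPro code:
+
+```csharp
+// Program.cs — requires the prometheus-net.AspNetCore package
+app.UseHttpMetrics(); // auto-collects HTTP request count/duration metrics
+app.MapMetrics();     // exposes the Prometheus text format at /metrics
+```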
+ +**Prometheus ServiceMonitor Configuration**: +```yaml +apiVersion: monitoring.coreos.com/v1 +kind: ServiceMonitor +metadata: + name: motovault-metrics + namespace: motovault + labels: + app: motovault +spec: + selector: + matchLabels: + app: motovault + endpoints: + - port: http + path: /metrics + interval: 30s + scrapeTimeout: 10s +``` + +**Application Metrics Implementation**: +```csharp +public class MetricsService +{ + private readonly Counter _httpRequestsTotal; + private readonly Histogram _httpRequestDuration; + private readonly Gauge _activeConnections; + private readonly Counter _databaseOperationsTotal; + private readonly Histogram _databaseOperationDuration; + + public MetricsService() + { + _httpRequestsTotal = Metrics.CreateCounter( + "motovault_http_requests_total", + "Total number of HTTP requests", + new[] { "method", "endpoint", "status_code" }); + + _httpRequestDuration = Metrics.CreateHistogram( + "motovault_http_request_duration_seconds", + "Duration of HTTP requests in seconds", + new[] { "method", "endpoint" }); + + _activeConnections = Metrics.CreateGauge( + "motovault_active_connections", + "Number of active database connections"); + + _databaseOperationsTotal = Metrics.CreateCounter( + "motovault_database_operations_total", + "Total number of database operations", + new[] { "operation", "table", "status" }); + + _databaseOperationDuration = Metrics.CreateHistogram( + "motovault_database_operation_duration_seconds", + "Duration of database operations in seconds", + new[] { "operation", "table" }); + } + + public void RecordHttpRequest(string method, string endpoint, int statusCode, double duration) + { + _httpRequestsTotal.WithLabels(method, endpoint, statusCode.ToString()).Inc(); + _httpRequestDuration.WithLabels(method, endpoint).Observe(duration); + } + + public void RecordDatabaseOperation(string operation, string table, bool success, double duration) + { + var status = success ? 
"success" : "error"; + _databaseOperationsTotal.WithLabels(operation, table, status).Inc(); + _databaseOperationDuration.WithLabels(operation, table).Observe(duration); + } +} +``` + +**Custom Grafana Dashboard Configuration**: +```json +{ + "dashboard": { + "title": "MotoVaultPro Application Dashboard", + "panels": [ + { + "title": "HTTP Request Rate", + "type": "graph", + "targets": [ + { + "expr": "rate(motovault_http_requests_total[5m])", + "legendFormat": "{{method}} {{endpoint}}" + } + ] + }, + { + "title": "Response Time Percentiles", + "type": "graph", + "targets": [ + { + "expr": "histogram_quantile(0.50, rate(motovault_http_request_duration_seconds_bucket[5m]))", + "legendFormat": "50th percentile" + }, + { + "expr": "histogram_quantile(0.95, rate(motovault_http_request_duration_seconds_bucket[5m]))", + "legendFormat": "95th percentile" + } + ] + }, + { + "title": "Database Connection Pool", + "type": "singlestat", + "targets": [ + { + "expr": "motovault_active_connections", + "legendFormat": "Active Connections" + } + ] + }, + { + "title": "Error Rate", + "type": "graph", + "targets": [ + { + "expr": "rate(motovault_http_requests_total{status_code=~\"5..\"}[5m])", + "legendFormat": "5xx errors" + } + ] + } + ] + } +} +``` + +#### 3.4 Backup and Disaster Recovery + +**Objective**: Implement comprehensive backup strategies and disaster recovery procedures. 
+ +**Velero Backup Configuration**: +```yaml +apiVersion: velero.io/v1 +kind: Schedule +metadata: + name: motovault-daily-backup + namespace: velero +spec: + schedule: "0 2 * * *" # Daily at 2 AM + template: + includedNamespaces: + - motovault + includedResources: + - "*" + storageLocation: default + ttl: 720h0m0s # 30 days + snapshotVolumes: true + +--- +apiVersion: velero.io/v1 +kind: Schedule +metadata: + name: motovault-weekly-backup + namespace: velero +spec: + schedule: "0 3 * * 0" # Weekly on Sunday at 3 AM + template: + includedNamespaces: + - motovault + includedResources: + - "*" + storageLocation: default + ttl: 2160h0m0s # 90 days + snapshotVolumes: true +``` + +**Database Backup Strategy**: +```bash +#!/bin/bash +# Automated database backup script + +BACKUP_DATE=$(date +%Y%m%d_%H%M%S) +BACKUP_FILE="motovault_backup_${BACKUP_DATE}.sql" +S3_BUCKET="motovault-backups" + +# Create database backup +kubectl exec -n motovault motovault-postgres-1 -- \ + pg_dump -U postgres motovault > "${BACKUP_FILE}" + +# Compress backup +gzip "${BACKUP_FILE}" + +# Upload to S3/MinIO +aws s3 cp "${BACKUP_FILE}.gz" "s3://${S3_BUCKET}/database/" + +# Clean up local file +rm "${BACKUP_FILE}.gz" + +# Retain only last 30 days of backups +aws s3api list-objects-v2 \ + --bucket "${S3_BUCKET}" \ + --prefix "database/" \ + --query 'Contents[?LastModified<=`'$(date -d "30 days ago" --iso-8601)'`].[Key]' \ + --output text | \ + xargs -I {} aws s3 rm "s3://${S3_BUCKET}/{}" +``` + +### Phase 4: Advanced Features and Optimization (Weeks 13-16) + +This phase focuses on advanced cloud-native features and performance optimization. + +#### 4.1 Advanced Caching Strategies + +**Objective**: Implement multi-layer caching for optimal performance and reduced database load. 
+
+**Cache Architecture**:
+```
+┌─────────────────┐    ┌─────────────────┐    ┌─────────────────┐
+│    Browser      │    │    CDN/Proxy    │    │   Application   │
+│     Cache       │◄──►│     Cache       │◄──►│  Memory Cache   │
+│    (Static)     │    │   (Static +     │    │      (L1)       │
+│                 │    │    Dynamic)     │    │                 │
+└─────────────────┘    └─────────────────┘    └─────────────────┘
+                                                      │
+                                             ┌─────────────────┐
+                                             │   Redis Cache   │
+                                             │      (L2)       │
+                                             │   Distributed   │
+                                             └─────────────────┘
+                                                      │
+                                             ┌─────────────────┐
+                                             │    Database     │
+                                             │    (Source)     │
+                                             │                 │
+                                             └─────────────────┘
+```
+
+**Implementation Details**:
+```csharp
+public class MultiLevelCacheService
+{
+    private readonly IMemoryCache _memoryCache;
+    private readonly IDistributedCache _distributedCache;
+    private readonly ILogger<MultiLevelCacheService> _logger;
+
+    public async Task<T> GetAsync<T>(string key, Func<Task<T>> factory, TimeSpan? expiration = null)
+    {
+        // L1 Cache - Memory
+        if (_memoryCache.TryGetValue(key, out T cachedValue))
+        {
+            _logger.LogDebug("Cache hit (L1): {Key}", key);
+            return cachedValue;
+        }
+
+        // L2 Cache - Redis
+        var distributedValue = await _distributedCache.GetStringAsync(key);
+        if (distributedValue != null)
+        {
+            var deserializedValue = JsonSerializer.Deserialize<T>(distributedValue);
+            _memoryCache.Set(key, deserializedValue, TimeSpan.FromMinutes(5)); // Short-lived L1 cache
+            _logger.LogDebug("Cache hit (L2): {Key}", key);
+            return deserializedValue;
+        }
+
+        // Cache miss - fetch from source
+        _logger.LogDebug("Cache miss: {Key}", key);
+        var value = await factory();
+
+        // Store in both cache levels
+        var serializedValue = JsonSerializer.Serialize(value);
+        await _distributedCache.SetStringAsync(key, serializedValue, new DistributedCacheEntryOptions
+        {
+            SlidingExpiration = expiration ?? TimeSpan.FromHours(1)
+        });
+
+        _memoryCache.Set(key, value, TimeSpan.FromMinutes(5));
+
+        return value;
+    }
+}
+```
+
+#### 4.2 Performance Optimization
+
+**Objective**: Optimize application performance for high-load scenarios.
+
+**Database Query Optimization**:
+```csharp
+public class OptimizedVehicleService
+{
+    private readonly IDbContextFactory<ApplicationDbContext> _dbContextFactory;
+    private readonly IMemoryCache _cache;
+
+    public async Task<VehicleDashboardData> GetDashboardDataAsync(int userId, int vehicleId)
+    {
+        var cacheKey = $"dashboard:{userId}:{vehicleId}";
+
+        if (_cache.TryGetValue(cacheKey, out VehicleDashboardData cached))
+        {
+            return cached;
+        }
+
+        using var context = _dbContextFactory.CreateDbContext();
+
+        // Optimized single query with projections
+        var dashboardData = await context.Vehicles
+            .Where(v => v.Id == vehicleId && v.UserId == userId)
+            .Select(v => new VehicleDashboardData
+            {
+                Vehicle = v,
+                RecentServices = v.ServiceRecords
+                    .OrderByDescending(s => s.Date)
+                    .Take(5)
+                    .ToList(),
+                UpcomingReminders = v.ReminderRecords
+                    .Where(r => r.IsActive && r.DueDate > DateTime.Now)
+                    .OrderBy(r => r.DueDate)
+                    .Take(5)
+                    .ToList(),
+                FuelEfficiency = v.GasRecords
+                    .Where(g => g.Date >= DateTime.Now.AddMonths(-3))
+                    .Average(g => g.Efficiency),
+                TotalMileage = v.OdometerRecords
+                    .OrderByDescending(o => o.Date)
+                    .Select(o => (int?)o.Mileage)
+                    .FirstOrDefault() ?? 
0
+            })
+            .AsNoTracking()
+            .FirstOrDefaultAsync();
+
+        _cache.Set(cacheKey, dashboardData, TimeSpan.FromMinutes(15));
+        return dashboardData;
+    }
+}
+```
+
+**Connection Pool Optimization**:
+```csharp
+// Npgsql pooling is configured on the connection string itself, so the
+// limits are applied before the context factory is registered
+var poolingBuilder = new NpgsqlConnectionStringBuilder(connectionString)
+{
+    MaxPoolSize = 100,
+    MinPoolSize = 10,
+    ConnectionLifetime = 300,
+    ConnectionPruningInterval = 10,
+    ConnectionIdleLifetime = 300
+};
+connectionString = poolingBuilder.ConnectionString;
+
+services.AddDbContextFactory<ApplicationDbContext>(options =>
+{
+    options.UseNpgsql(connectionString, npgsqlOptions =>
+    {
+        npgsqlOptions.EnableRetryOnFailure(
+            maxRetryCount: 3,
+            maxRetryDelay: TimeSpan.FromSeconds(5),
+            errorCodesToAdd: null);
+        npgsqlOptions.CommandTimeout(30);
+    });
+
+    // Optimize for read-heavy workloads
+    options.EnableSensitiveDataLogging(false);
+    options.EnableServiceProviderCaching();
+    options.EnableDetailedErrors(false);
+}, ServiceLifetime.Singleton);
+```
+
+#### 4.3 Security Enhancements
+
+**Objective**: Implement advanced security features for production deployment.
+
+**Network Security Policies**:
+```yaml
+apiVersion: networking.k8s.io/v1
+kind: NetworkPolicy
+metadata:
+  name: motovault-network-policy
+  namespace: motovault
+spec:
+  podSelector:
+    matchLabels:
+      app: motovault
+  policyTypes:
+  - Ingress
+  - Egress
+  ingress:
+  - from:
+    - namespaceSelector:
+        matchLabels:
+          name: nginx-ingress
+    ports:
+    - protocol: TCP
+      port: 8080
+  egress:
+  - to:
+    - namespaceSelector:
+        matchLabels:
+          name: motovault
+    ports:
+    - protocol: TCP
+      port: 5432  # PostgreSQL
+    - protocol: TCP
+      port: 6379  # Redis
+    - protocol: TCP
+      port: 9000  # MinIO
+  - to: []  # Allow DNS resolution (cluster DNS)
+    ports:
+    - protocol: UDP
+      port: 53
+    - protocol: TCP
+      port: 53
+  - to: []  # Allow external HTTPS for OIDC
+    ports:
+    - protocol: TCP
+      port: 443
+    - protocol: TCP
+      port: 80
+```
+
+**Pod Security Standards**:
+```yaml
+apiVersion: v1
+kind: Namespace
+metadata:
+  name: motovault
+  labels:
+    pod-security.kubernetes.io/enforce: restricted
+    pod-security.kubernetes.io/audit: restricted
+    pod-security.kubernetes.io/warn: restricted
+```
+
+**Secret Management with External Secrets Operator**:
+```yaml
+apiVersion: external-secrets.io/v1beta1
+kind: SecretStore
+metadata:
+  name: vault-backend
+  namespace: motovault
+spec:
+  provider:
+    vault:
+      server: "https://vault.example.com"
+      path: "secret"
+      version: "v2"
+      auth:
+        kubernetes:
+          mountPath: "kubernetes"
+          role: "motovault-role"
+
+---
+apiVersion: external-secrets.io/v1beta1
+kind: ExternalSecret
+metadata:
+  name: motovault-secrets
+  namespace: motovault
+spec:
+  refreshInterval: 1h
+  secretStoreRef:
+    name: vault-backend
+    kind: SecretStore
+  target:
+    name: motovault-secrets
+    creationPolicy: Owner
+  data:
+  - secretKey: POSTGRES_CONNECTION
+    remoteRef:
+      key: motovault/database
+      property: connection_string
+  - secretKey: JWT_SECRET
+    remoteRef:
+      key: motovault/auth
+      property: jwt_secret
+```
+
+## Migration Strategy
+
+### Pre-Migration Assessment
+
+**Current State Analysis**:
+1. **Data Inventory**: Catalog all existing data, configurations, and file attachments
+2. 
**Dependency Mapping**: Identify all external dependencies and integrations +3. **Performance Baseline**: Establish current performance metrics for comparison +4. **User Impact Assessment**: Analyze potential downtime and user experience changes + +**Migration Prerequisites**: +1. **Kubernetes Cluster Ready**: Properly configured cluster with required operators +2. **Infrastructure Deployed**: PostgreSQL, MinIO, and Redis clusters operational +3. **Backup Strategy**: Complete backup of current system and data +4. **Rollback Plan**: Detailed procedure for reverting to current system if needed + +### Migration Execution Plan + +#### Phase 1: Parallel Environment Setup (Week 1) +1. **Deploy target infrastructure** in parallel to existing system +2. **Configure monitoring and logging** for new environment +3. **Run initial data migration tests** with sample data +4. **Validate all health checks** and monitoring alerts + +#### Phase 2: Data Migration (Week 2) +1. **Initial data sync**: Migrate historical data during low-usage periods +2. **File migration**: Transfer all attachments to MinIO with validation +3. **Configuration migration**: Convert all settings to ConfigMaps/Secrets +4. **User data validation**: Verify data integrity and completeness + +#### Phase 3: Application Cutover (Week 3) +1. **Final data sync**: Synchronize any changes made during migration +2. **DNS cutover**: Redirect traffic to new Kubernetes deployment +3. **Monitor closely**: Watch for any issues or performance problems +4. **User acceptance testing**: Validate all functionality works correctly + +#### Phase 4: Optimization and Cleanup (Week 4) +1. **Performance tuning**: Optimize based on real-world usage patterns +2. **Clean up old infrastructure**: Decommission legacy deployment +3. **Update documentation**: Finalize operational procedures +4. 
**Training**: Train operations team on new procedures + +### Data Migration Tools + +**LiteDB to PostgreSQL Migration Utility**: +```csharp +public class DataMigrationService +{ + private readonly ILiteDatabase _liteDb; + private readonly IServiceProvider _serviceProvider; + private readonly ILogger<DataMigrationService> _logger; + + public async Task<MigrationResult> MigrateAllDataAsync() + { + var result = new MigrationResult(); + + try + { + using var scope = _serviceProvider.CreateScope(); + var context = scope.ServiceProvider.GetRequiredService<ApplicationDbContext>(); + + // Migrate users first (dependencies) + result.UsersProcessed = await MigrateUsersAsync(context); + + // Migrate vehicles + result.VehiclesProcessed = await MigrateVehiclesAsync(context); + + // Migrate all record types + result.ServiceRecordsProcessed = await MigrateServiceRecordsAsync(context); + result.GasRecordsProcessed = await MigrateGasRecordsAsync(context); + result.FilesProcessed = await MigrateFilesAsync(); + + await context.SaveChangesAsync(); + result.Success = true; + } + catch (Exception ex) + { + _logger.LogError(ex, "Migration failed"); + result.Success = false; + result.ErrorMessage = ex.Message; + } + + return result; + } + + private async Task<int> MigrateFilesAsync() + { + var fileStorage = _serviceProvider.GetRequiredService<IFileStorageService>(); + var filesProcessed = 0; + + var localFilesPath = "data/files"; + if (Directory.Exists(localFilesPath)) + { + var files = Directory.GetFiles(localFilesPath, "*", SearchOption.AllDirectories); + + foreach (var filePath in files) + { + using var fileStream = File.OpenRead(filePath); + var fileName = Path.GetFileName(filePath); + var contentType = GetContentType(fileName); + + await fileStorage.UploadFileAsync(fileStream, fileName, contentType); + filesProcessed++; + + _logger.LogInformation("Migrated file: {FileName}", fileName); + } + } + + return filesProcessed; + } +} +``` + +### Rollback Procedures + +**Emergency Rollback Plan**: +1.
**Immediate Actions** (0-15 minutes): + - Redirect DNS back to original system + - Activate incident response team + - Begin root cause analysis + +2. **Data Consistency** (15-30 minutes): + - Verify data integrity in original system + - Sync any changes made during brief cutover period + - Validate all services are operational + +3. **Communication** (30-60 minutes): + - Notify stakeholders of rollback + - Provide status updates to users + - Document lessons learned + +4. **Post-Rollback Analysis** (1-24 hours): + - Complete root cause analysis + - Update migration plan based on findings + - Plan next migration attempt + +## Risk Assessment and Mitigation + +### Technical Risks + +#### High Impact Risks + +**1. Data Loss or Corruption** +- **Probability**: Low +- **Impact**: Critical +- **Mitigation**: + - Multiple backup strategies with point-in-time recovery + - Comprehensive data validation during migration + - Parallel running systems during cutover + - Automated data integrity checks + +**2. Extended Downtime During Migration** +- **Probability**: Medium +- **Impact**: High +- **Mitigation**: + - Phased migration approach with minimal downtime windows + - Blue-green deployment strategy + - Comprehensive rollback procedures + - 24/7 monitoring during cutover + +**3. Performance Degradation** +- **Probability**: Medium +- **Impact**: Medium +- **Mitigation**: + - Extensive load testing before migration + - Performance monitoring and alerting + - Auto-scaling capabilities + - Database query optimization + +#### Medium Impact Risks + +**4. Integration Failures** +- **Probability**: Medium +- **Impact**: Medium +- **Mitigation**: + - Thorough integration testing + - Circuit breaker patterns for external dependencies + - Graceful degradation for non-critical features + - Health check monitoring + +**5. 
Security Vulnerabilities** +- **Probability**: Low +- **Impact**: High +- **Mitigation**: + - Security scanning of all container images + - Network policies and Pod Security Standards + - Secret management best practices + - Regular security audits + +### Operational Risks + +**6. Team Knowledge Gaps** +- **Probability**: Medium +- **Impact**: Medium +- **Mitigation**: + - Comprehensive training program + - Detailed operational documentation + - On-call procedures and runbooks + - Knowledge transfer sessions + +**7. Infrastructure Capacity Issues** +- **Probability**: Low +- **Impact**: Medium +- **Mitigation**: + - Capacity planning and resource monitoring + - Auto-scaling policies + - Resource quotas and limits + - Infrastructure as Code for rapid scaling + +### Business Risks + +**8. User Adoption Challenges** +- **Probability**: Low +- **Impact**: Medium +- **Mitigation**: + - Transparent communication about changes + - User training and documentation + - Phased rollout to minimize impact + - User feedback collection and response + +## Testing Strategy + +### Test Environment Architecture + +**Multi-Environment Strategy**: +``` +Development → Staging → Pre-Production → Production + ↓ ↓ ↓ ↓ + Unit Tests Integration Load Testing Monitoring + API Tests UI Tests Security Alerting + DB Tests E2E Tests Performance Backup Tests +``` + +### Comprehensive Testing Plan + +#### Unit Testing +- **Coverage Target**: 80% code coverage minimum +- **Focus Areas**: Business logic, data access layer, API endpoints +- **Test Framework**: xUnit with Moq for mocking dependencies +- **Automated Execution**: Run on every commit and pull request + +#### Integration Testing +- **Database Integration**: Test all repository implementations +- **External Service Integration**: MinIO, Redis, PostgreSQL connectivity +- **API Integration**: Full request/response cycle testing +- **Authentication Testing**: All authentication flows and authorization rules + +#### Load Testing +-
**Tools**: k6 or Artillery for load generation +- **Scenarios**: + - Normal load: 100 concurrent users + - Peak load: 500 concurrent users + - Stress test: 1000+ concurrent users +- **Metrics**: Response time, throughput, error rate, resource utilization + +#### Security Testing +- **Container Security**: Scan images for vulnerabilities +- **Network Security**: Validate network policies and isolation +- **Authentication**: Test all authentication and authorization scenarios +- **Data Protection**: Verify encryption at rest and in transit + +#### Disaster Recovery Testing +- **Database Failover**: Test automatic failover scenarios +- **Application Recovery**: Pod failure and recovery testing +- **Backup Restoration**: Full system restoration from backups +- **Network Partitioning**: Test behavior during network issues + +### Performance Testing Scenarios + +**Load Testing Script Example**: +```javascript +import http from 'k6/http'; +import { check, sleep } from 'k6'; + +export let options = { + stages: [ + { duration: '2m', target: 20 }, // Ramp up + { duration: '5m', target: 20 }, // Stay at 20 users + { duration: '2m', target: 50 }, // Ramp up to 50 + { duration: '5m', target: 50 }, // Stay at 50 + { duration: '2m', target: 100 }, // Ramp up to 100 + { duration: '5m', target: 100 }, // Stay at 100 + { duration: '2m', target: 0 }, // Ramp down + ], + thresholds: { + http_req_duration: ['p(95)<500'], // 95% of requests under 500ms + http_req_failed: ['rate<0.1'], // Error rate under 10% + }, +}; + +export default function() { + // Login + let loginResponse = http.post('https://motovault.example.com/api/auth/login', { + username: 'testuser', + password: 'testpass' + }); + + check(loginResponse, { + 'login successful': (r) => r.status === 200, + }); + + let authToken = loginResponse.json('token'); + + // Dashboard load + let dashboardResponse = http.get('https://motovault.example.com/api/dashboard', { + headers: { Authorization: `Bearer ${authToken}` }, + }); + + 
check(dashboardResponse, { + 'dashboard loaded': (r) => r.status === 200, + 'response time < 500ms': (r) => r.timings.duration < 500, + }); + + sleep(1); +} +``` + +## Operational Procedures + +### Monitoring and Alerting + +#### Application Metrics +```yaml +# Prometheus alerting rules (routed through Alertmanager) +groups: +- name: motovault.rules + rules: + - alert: HighErrorRate + expr: rate(motovault_http_requests_total{status_code=~"5.."}[5m]) > 0.1 + for: 2m + labels: + severity: critical + annotations: + summary: "High error rate detected" + description: "5xx rate is {{ $value }} requests/sec over the last 5 minutes" + + - alert: HighResponseTime + expr: histogram_quantile(0.95, rate(motovault_http_request_duration_seconds_bucket[5m])) > 2 + for: 5m + labels: + severity: warning + annotations: + summary: "High response time detected" + description: "95th percentile response time is {{ $value }}s" + + - alert: DatabaseConnectionPoolExhaustion + expr: motovault_active_connections > 80 + for: 2m + labels: + severity: warning + annotations: + summary: "Database connection pool nearly exhausted" + description: "Active connections: {{ $value }}/100" + + - alert: PodCrashLooping + expr: rate(kube_pod_container_status_restarts_total{namespace="motovault"}[15m]) > 0 + for: 5m + labels: + severity: critical + annotations: + summary: "Pod is crash looping" + description: "Pod {{ $labels.pod }} is restarting frequently" +``` + +#### Infrastructure Monitoring +- **Node Resources**: CPU, memory, disk usage across all nodes +- **Network Performance**: Latency, throughput, packet loss +- **Storage Performance**: IOPS, latency for persistent volumes +- **Kubernetes Health**: API server, etcd, scheduler performance + +### Backup and Recovery Procedures + +#### Automated Backup Schedule +```bash +#!/bin/bash +# Daily backup script +set -e + +TIMESTAMP=$(date +%Y%m%d_%H%M%S) +BACKUP_NAMESPACE="motovault" + +# Database backup +echo "Starting database backup at $(date)" +kubectl exec -n $BACKUP_NAMESPACE
motovault-postgres-1 -- \ + pg_dump -U postgres motovault | \ + gzip > "database_backup_${TIMESTAMP}.sql.gz" + +# MinIO backup (metadata and small files) +echo "Starting MinIO backup at $(date)" +mc mirror motovault-minio/motovault-files backup/minio_${TIMESTAMP}/ + +# Kubernetes resources backup +echo "Starting Kubernetes backup at $(date)" +velero backup create "motovault-${TIMESTAMP}" \ + --include-namespaces motovault \ + --wait + +# Upload to remote storage +echo "Uploading backups to remote storage" +aws s3 cp "database_backup_${TIMESTAMP}.sql.gz" s3://motovault-backups/daily/ +aws s3 sync "backup/minio_${TIMESTAMP}/" s3://motovault-backups/minio/${TIMESTAMP}/ + +# Cleanup local files older than 7 days +find backup/ -name "*.gz" -mtime +7 -delete +find backup/minio_* -mtime +7 -exec rm -rf {} \; + +echo "Backup completed successfully at $(date)" +``` + +#### Recovery Procedures +```bash +#!/bin/bash +# Full system recovery script +set -e + +BACKUP_DATE=$1 +if [ -z "$BACKUP_DATE" ]; then + echo "Usage: $0 <backup_date>" + echo "Example: $0 20240120_020000" + exit 1 +fi + +# Stop application +echo "Scaling down application..." +kubectl scale deployment motovault-app --replicas=0 -n motovault + +# Restore database +echo "Restoring database from backup..." +aws s3 cp "s3://motovault-backups/daily/database_backup_${BACKUP_DATE}.sql.gz" . +gunzip "database_backup_${BACKUP_DATE}.sql.gz" +kubectl exec -i motovault-postgres-1 -n motovault -- \ + psql -U postgres -d motovault < "database_backup_${BACKUP_DATE}.sql" + +# Restore MinIO data +echo "Restoring MinIO data..." +aws s3 sync "s3://motovault-backups/minio/${BACKUP_DATE}/" /tmp/minio_restore/ +mc mirror /tmp/minio_restore/ motovault-minio/motovault-files/ + +# Restart application +echo "Scaling up application..." +kubectl scale deployment motovault-app --replicas=3 -n motovault + +# Verify health +echo "Waiting for application to be ready..."
+kubectl wait --for=condition=ready pod -l app=motovault -n motovault --timeout=300s + +echo "Recovery completed successfully" +``` + +### Maintenance Procedures + +#### Rolling Updates +```yaml +# Zero-downtime deployment strategy +apiVersion: argoproj.io/v1alpha1 +kind: Rollout +metadata: + name: motovault-rollout + namespace: motovault +spec: + replicas: 5 + strategy: + canary: + steps: + - setWeight: 20 + - pause: {duration: 1m} + - setWeight: 40 + - pause: {duration: 2m} + - setWeight: 60 + - pause: {duration: 2m} + - setWeight: 80 + - pause: {duration: 2m} + analysis: + templates: + - templateName: success-rate + args: + - name: service-name + value: motovault-service + canaryService: motovault-canary-service + stableService: motovault-stable-service + selector: + matchLabels: + app: motovault + template: + metadata: + labels: + app: motovault + spec: + containers: + - name: motovault + image: motovault:latest + # ... container spec +``` + +#### Scaling Procedures +- **Horizontal Scaling**: Use HPA for automatic scaling based on metrics +- **Vertical Scaling**: Monitor resource usage and adjust requests/limits +- **Database Scaling**: Add read replicas for read-heavy workloads +- **Storage Scaling**: Monitor MinIO usage and add nodes as needed + +## Implementation Timeline + +### Detailed 16-Week Schedule + +#### Weeks 1-4: Foundation Phase +**Week 1: Environment Setup** +- Day 1-2: Kubernetes cluster setup and configuration +- Day 3-4: Deploy PostgreSQL operator and cluster +- Day 5-7: Deploy MinIO operator and configure HA cluster + +**Week 2: Redis and Monitoring** +- Day 1-3: Deploy Redis cluster with sentinel configuration +- Day 4-5: Set up Prometheus and Grafana +- Day 6-7: Configure initial monitoring dashboards + +**Week 3: Application Changes** +- Day 1-2: Remove LiteDB dependencies +- Day 3-4: Implement configuration externalization +- Day 5-7: Add health check endpoints + +**Week 4: File Storage Abstraction** +- Day 1-3: Implement 
IFileStorageService interface +- Day 4-5: Create MinIO implementation +- Day 6-7: Add fallback mechanisms + +#### Weeks 5-8: Core Implementation +**Week 5: Database Integration** +- Day 1-3: Optimize PostgreSQL connections +- Day 4-5: Implement connection pooling +- Day 6-7: Add database health checks + +**Week 6: Session and Caching** +- Day 1-2: Implement Redis session storage +- Day 3-4: Add distributed caching layer +- Day 5-7: Implement multi-level caching + +**Week 7: Observability** +- Day 1-3: Add structured logging +- Day 4-5: Implement Prometheus metrics +- Day 6-7: Add distributed tracing + +**Week 8: Security Implementation** +- Day 1-2: Configure Pod Security Standards +- Day 3-4: Implement network policies +- Day 5-7: Set up secret management + +#### Weeks 9-12: Production Deployment +**Week 9: Kubernetes Manifests** +- Day 1-3: Create production Kubernetes manifests +- Day 4-5: Configure HPA and resource limits +- Day 6-7: Set up ingress and TLS + +**Week 10: Backup and Recovery** +- Day 1-3: Implement backup strategies +- Day 4-5: Create recovery procedures +- Day 6-7: Test disaster recovery scenarios + +**Week 11: Load Testing** +- Day 1-3: Create load testing scenarios +- Day 4-5: Execute performance tests +- Day 6-7: Optimize based on results + +**Week 12: Migration Preparation** +- Day 1-3: Create data migration tools +- Day 4-5: Test migration procedures +- Day 6-7: Prepare rollback plans + +#### Weeks 13-16: Advanced Features +**Week 13: Performance Optimization** +- Day 1-3: Implement advanced caching strategies +- Day 4-5: Optimize database queries +- Day 6-7: Fine-tune resource allocation + +**Week 14: Advanced Security** +- Day 1-3: Implement external secret management +- Day 4-5: Add security scanning to CI/CD +- Day 6-7: Configure advanced network policies + +**Week 15: Production Migration** +- Day 1-2: Execute data migration +- Day 3-4: Perform application cutover +- Day 5-7: Monitor and optimize + +**Week 16: Optimization and 
Documentation** +- Day 1-3: Performance tuning based on production usage +- Day 4-5: Update operational documentation +- Day 6-7: Conduct team training + +### Success Criteria + +#### Technical Success Metrics +- **Availability**: 99.9% uptime (no more than 8.76 hours downtime per year) +- **Performance**: 95th percentile response time under 500ms +- **Scalability**: Ability to handle 10x current user load +- **Recovery**: RTO < 1 hour, RPO < 15 minutes + +#### Operational Success Metrics +- **Deployment Frequency**: Enable weekly deployments with zero downtime +- **Mean Time to Recovery**: < 30 minutes for critical issues +- **Change Failure Rate**: < 5% of deployments require rollback +- **Monitoring Coverage**: 100% of critical services monitored + +#### Business Success Metrics +- **User Satisfaction**: No degradation in user experience +- **Cost Efficiency**: Infrastructure costs within 20% of current spending +- **Maintenance Overhead**: Reduced operational maintenance time by 50% +- **Future Readiness**: Foundation for future enhancements and scaling + +--- + +**Document Version**: 1.0 +**Last Updated**: January 2025 +**Author**: MotoVaultPro Modernization Team +**Status**: Draft for Review + +--- + +This comprehensive plan provides a detailed roadmap for modernizing MotoVaultPro to run efficiently on Kubernetes with high availability, scalability, and operational excellence. The phased approach ensures minimal risk while delivering maximum benefits for future growth and reliability. \ No newline at end of file