Fixed Dark Mode

Eric Gullickson
2025-07-28 09:39:17 -05:00
parent 4391cf11ed
commit 01a03263c9
455 changed files with 143757 additions and 0 deletions

308
docs/K8S-OVERVIEW.md Normal file

@@ -0,0 +1,308 @@
# Kubernetes Modernization Plan for MotoVaultPro
## Executive Summary
This document provides an overview of the comprehensive plan to modernize MotoVaultPro from a traditional self-hosted application to a cloud-native, highly available system running on Kubernetes. The modernization focuses on transforming the current monolithic ASP.NET Core application into a resilient, scalable platform capable of handling enterprise-level workloads while maintaining the existing feature set and user experience.
### Key Objectives
- **High Availability**: Eliminate single points of failure through distributed architecture
- **Scalability**: Enable horizontal scaling to handle increased user loads
- **Resilience**: Implement fault tolerance and automatic recovery mechanisms
- **Cloud-Native**: Adopt Kubernetes-native patterns and best practices
- **Operational Excellence**: Improve monitoring, logging, and maintenance capabilities
### Strategic Benefits
- **Reduced Downtime**: Multi-replica deployments with automatic failover
- **Improved Performance**: Distributed caching and optimized data access patterns
- **Enhanced Security**: Pod-level isolation and secret management
- **Cost Optimization**: Efficient resource utilization through auto-scaling
- **Future-Ready**: Foundation for microservices and advanced cloud features
## Current Architecture Analysis
### Existing System Overview
MotoVaultPro is currently deployed as a monolithic ASP.NET Core 8.0 application with the following characteristics:
#### Application Architecture
- **Monolithic Design**: Single deployable unit containing all functionality
- **MVC Pattern**: Traditional Model-View-Controller architecture
- **Dual Database Support**: LiteDB (embedded) and PostgreSQL (external)
- **File Storage**: Local filesystem for document attachments
- **Session Management**: In-memory or cookie-based sessions
- **Configuration**: File-based configuration with environment variables
#### Identified Limitations for Kubernetes
1. **State Dependencies**: LiteDB and local file storage prevent stateless operation
2. **Configuration Management**: File-based configuration not suitable for container orchestration
3. **Health Monitoring**: Lacks Kubernetes-compatible health check endpoints
4. **Logging**: Basic logging not optimized for centralized log aggregation
5. **Resource Management**: No resource constraints or auto-scaling capabilities
6. **Secret Management**: Sensitive configuration stored in plain text files
## Target Architecture
### Cloud-Native Design Principles
The modernized architecture will embrace the following cloud-native principles:
#### Stateless Application Design
- **External State Storage**: All state moved to external, highly available services
- **Horizontal Scalability**: Multiple application replicas with load balancing
- **Configuration as Code**: All configuration externalized to ConfigMaps and Secrets
- **Ephemeral Containers**: Pods can be created, destroyed, and recreated without data loss
#### Distributed Data Architecture
- **PostgreSQL Cluster**: Primary/replica configuration with automatic failover
- **MinIO High Availability**: Distributed object storage for file attachments
- **Redis Cluster**: Distributed caching and session storage
- **Backup Strategy**: Automated backups with point-in-time recovery
#### Observability and Operations
- **Structured Logging**: JSON logging with correlation IDs for distributed tracing
- **Metrics Collection**: Prometheus-compatible metrics for monitoring
- **Health Checks**: Kubernetes-native readiness and liveness probes
- **Distributed Tracing**: OpenTelemetry integration for request flow analysis
### High-Level Architecture Diagram
```
┌─────────────────────────────────────────────────────────────────┐
│                       Kubernetes Cluster                        │
├─────────────────────────────────────────────────────────────────┤
│  ┌─────────────────┐  ┌─────────────────┐  ┌─────────────────┐  │
│  │    MotoVault    │  │    MotoVault    │  │    MotoVault    │  │
│  │     Pod (1)     │  │     Pod (2)     │  │     Pod (3)     │  │
│  │                 │  │                 │  │                 │  │
│  └─────────────────┘  └─────────────────┘  └─────────────────┘  │
│           │                    │                    │           │
│  ┌───────────────────────────────────────────────────────────┐  │
│  │                   Load Balancer Service                   │  │
│  └───────────────────────────────────────────────────────────┘  │
│           │                    │                    │           │
├───────────┼────────────────────┼────────────────────┼───────────┤
│  ┌────────▼────────┐  ┌────────▼────────┐  ┌────────▼────────┐  │
│  │   PostgreSQL    │  │  Redis Cluster  │  │  MinIO Cluster  │  │
│  │     Primary     │  │    (3 nodes)    │  │   (4+ nodes)    │  │
│  │  + 2 Replicas   │  │                 │  │  Erasure Coded  │  │
│  └─────────────────┘  └─────────────────┘  └─────────────────┘  │
└─────────────────────────────────────────────────────────────────┘
```
## Implementation Phases Overview
The modernization is structured in four distinct phases, each building upon the previous phase to ensure a smooth and risk-managed transition:
### [Phase 1: Core Kubernetes Readiness](K8S-PHASE-1.md) (Weeks 1-4)
**Objective**: Make the application compatible with Kubernetes deployment patterns.
**Key Deliverables**:
- Configuration externalization to ConfigMaps and Secrets
- Removal of LiteDB dependencies
- PostgreSQL connection pooling optimization
- Kubernetes health check endpoints
- Structured logging implementation
**Success Criteria**:
- Application starts using only environment variables
- Health checks return appropriate status codes
- Database migrations work seamlessly
- Structured JSON logging operational
### [Phase 2: High Availability Infrastructure](K8S-PHASE-2.md) (Weeks 5-8)
**Objective**: Deploy highly available supporting infrastructure.
**Key Deliverables**:
- MinIO distributed object storage cluster
- File storage abstraction layer
- PostgreSQL HA cluster with automated failover
- Redis cluster for distributed sessions and caching
- Comprehensive monitoring setup
**Success Criteria**:
- MinIO cluster operational with erasure coding
- PostgreSQL cluster with automatic failover
- Redis cluster providing distributed sessions
- All file operations using object storage
- Infrastructure monitoring and alerting active
### [Phase 3: Production Deployment](K8S-PHASE-3.md) (Weeks 9-12)
**Objective**: Deploy to production with security, monitoring, and backup strategies.
**Key Deliverables**:
- Production Kubernetes manifests with HPA
- Secure ingress with automated TLS certificates
- Comprehensive application and infrastructure monitoring
- Automated backup and disaster recovery procedures
- Migration tools and procedures
**Success Criteria**:
- Production deployment with 99.9% availability target
- Secure external access with TLS
- Monitoring dashboards and alerting operational
- Backup and recovery procedures validated
- Migration dry runs successful
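The "Production Kubernetes manifests with HPA" deliverable could take roughly the following shape. This is a sketch only: the HPA name, the target Deployment name `motovault`, the replica ceiling, and the 70% CPU target are all assumptions for illustration and do not come from the Phase 3 guide.

```yaml
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: motovault-hpa            # hypothetical name
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: motovault              # hypothetical Deployment name
  minReplicas: 3                 # matches the three pods in the architecture diagram
  maxReplicas: 10                # assumed ceiling
  metrics:
    - type: Resource
      resource:
        name: cpu
        target:
          type: Utilization
          averageUtilization: 70 # assumed target
```

A CPU utilization target only works if the application containers declare CPU requests and the cluster runs a metrics source such as metrics-server.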
### [Phase 4: Advanced Features and Optimization](K8S-PHASE-4.md) (Weeks 13-16)
**Objective**: Implement advanced features and optimize for scale and performance.
**Key Deliverables**:
- Multi-layer caching (Memory, Redis, CDN)
- Advanced performance optimizations
- Enhanced security features and compliance
- Production migration execution
- Operational excellence and automation
**Success Criteria**:
- Multi-layer caching reducing database load by 70%
- 95th percentile response time under 500ms
- Zero-downtime production migration completed
- Advanced security policies implemented
- Team trained on new operational procedures
## Migration Strategy
### Pre-Migration Assessment
1. **Data Inventory**: Catalog all existing data, configurations, and file attachments
2. **Dependency Mapping**: Identify all external dependencies and integrations
3. **Performance Baseline**: Establish current performance metrics for comparison
4. **User Impact Assessment**: Analyze potential downtime and user experience changes
### Migration Execution Plan
#### Blue-Green Deployment Strategy
- Parallel environment setup to minimize risk
- Gradual traffic migration with automated rollback
- Comprehensive validation at each step
- Minimal downtime through DNS cutover
#### Data Migration Approach
- Initial bulk data migration during low-usage periods
- Incremental synchronization during cutover
- Automated validation and integrity checks
- Point-in-time recovery capabilities
## Risk Assessment and Mitigation
### High Impact Risks
**Data Loss or Corruption**
- **Probability**: Low | **Impact**: Critical
- **Mitigation**: Multiple backup strategies, parallel systems, automated validation
**Extended Downtime During Migration**
- **Probability**: Medium | **Impact**: High
- **Mitigation**: Blue-green deployment, comprehensive rollback procedures
**Performance Degradation**
- **Probability**: Medium | **Impact**: Medium
- **Mitigation**: Load testing, performance monitoring, auto-scaling
### Mitigation Strategies
- Comprehensive testing at each phase
- Automated rollback procedures
- Parallel running systems during transition
- 24/7 monitoring during critical periods
## Success Metrics
### Technical Success Criteria
- **Availability**: 99.9% uptime (≤ 8.76 hours downtime/year)
- **Performance**: 95th percentile response time < 500ms
- **Scalability**: Handle 10x current user load
- **Recovery**: RTO < 1 hour, RPO < 15 minutes
### Operational Success Criteria
- **Deployment Frequency**: Weekly deployments with zero downtime
- **Mean Time to Recovery**: < 30 minutes for critical issues
- **Change Failure Rate**: < 5% of deployments require rollback
- **Monitoring Coverage**: 100% of critical services monitored
### Business Success Criteria
- **User Satisfaction**: No degradation in user experience
- **Cost Efficiency**: Infrastructure costs within 20% of current spending
- **Maintenance Overhead**: 50% reduction in operational maintenance time
- **Future Readiness**: Foundation for advanced features and scaling
## Implementation Timeline
### 16-Week Detailed Schedule
**Weeks 1-4**: [Phase 1 - Core Kubernetes Readiness](K8S-PHASE-1.md)
- Application configuration externalization
- Database architecture modernization
- Health checks and logging implementation
**Weeks 5-8**: [Phase 2 - High Availability Infrastructure](K8S-PHASE-2.md)
- MinIO and PostgreSQL HA deployment
- File storage abstraction
- Redis cluster implementation
**Weeks 9-12**: [Phase 3 - Production Deployment](K8S-PHASE-3.md)
- Production Kubernetes deployment
- Security and monitoring implementation
- Backup and recovery procedures
**Weeks 13-16**: [Phase 4 - Advanced Features](K8S-PHASE-4.md)
- Performance optimization
- Security enhancements
- Production migration execution
## Team Requirements
### Skills and Training
- **Kubernetes Administration**: Container orchestration and cluster management
- **Cloud-Native Development**: Microservices patterns and distributed systems
- **Monitoring and Observability**: Prometheus, Grafana, and logging systems
- **Security**: Container security, network policies, and secret management
### Operational Procedures
- **Deployment Automation**: CI/CD pipelines and GitOps workflows
- **Incident Response**: Monitoring, alerting, and escalation procedures
- **Backup and Recovery**: Automated backup validation and recovery testing
- **Performance Management**: Capacity planning and scaling procedures
## Getting Started
### Prerequisites
- Kubernetes cluster (development/staging/production)
- Container registry for Docker images
- Persistent storage classes
- Network policies and ingress controller
- Monitoring infrastructure (Prometheus/Grafana)
### Phase 1 Quick Start
1. Review [Phase 1 implementation guide](K8S-PHASE-1.md)
2. Set up development Kubernetes environment
3. Create ConfigMap and Secret templates
4. Begin application configuration externalization
5. Remove LiteDB dependencies
### Next Steps
After completing Phase 1, proceed with:
- [Phase 2: High Availability Infrastructure](K8S-PHASE-2.md)
- [Phase 3: Production Deployment](K8S-PHASE-3.md)
- [Phase 4: Advanced Features and Optimization](K8S-PHASE-4.md)
## Support and Documentation
### Additional Resources
- **Architecture Documentation**: See [docs/architecture.md](architecture.md)
- **Development Guidelines**: Follow existing code conventions and patterns
- **Testing Strategy**: Comprehensive testing at each phase
- **Security Guidelines**: Container and Kubernetes security best practices
### Team Contacts
- **Project Lead**: Kubernetes modernization coordination
- **DevOps Team**: Infrastructure and deployment automation
- **Security Team**: Security policies and compliance validation
- **QA Team**: Testing and validation procedures
---
**Document Version**: 1.0
**Last Updated**: January 2025
**Status**: Implementation Ready
This comprehensive modernization plan provides a structured approach to transforming MotoVaultPro into a cloud-native, highly available application running on Kubernetes. Each phase builds upon the previous one, ensuring minimal risk while delivering maximum benefits for future growth and reliability.

3416
docs/K8S-PHASE-1-DETAILED.md Normal file

File diff suppressed because it is too large

365
docs/K8S-PHASE-1.md Normal file

@@ -0,0 +1,365 @@
# Phase 1: Core Kubernetes Readiness (Weeks 1-4)
This phase focuses on making the application compatible with Kubernetes deployment patterns while maintaining existing functionality.
## Overview
The primary goal of Phase 1 is to transform MotoVaultPro from a traditional self-hosted application into a Kubernetes-ready application. This involves removing state dependencies, externalizing configuration, implementing health checks, and modernizing the database architecture.
## Key Objectives
- **Configuration Externalization**: Move all configuration from files to Kubernetes-native management
- **Database Modernization**: Eliminate LiteDB dependency and optimize PostgreSQL usage
- **Health Check Implementation**: Add Kubernetes-compatible health check endpoints
- **Logging Enhancement**: Implement structured logging for centralized log aggregation
## 1.1 Configuration Externalization
**Objective**: Move all configuration from files to Kubernetes-native configuration management.
**Current State**:
- Configuration stored in `appsettings.json` and environment variables
- Database connection strings in configuration files
- Feature flags and application settings mixed with deployment configuration
**Target State**:
- All configuration externalized to ConfigMaps and Secrets
- Environment-specific configuration separated from application code
- Sensitive data (passwords, API keys) managed through Kubernetes Secrets
### Implementation Tasks
#### 1. Create ConfigMap templates for non-sensitive configuration
```yaml
apiVersion: v1
kind: ConfigMap
metadata:
  name: motovault-config
data:
  APP_NAME: "MotoVaultPro"
  LOG_LEVEL: "Information"
  ENABLE_FEATURES: "OpenIDConnect,EmailNotifications"
  CACHE_EXPIRY_MINUTES: "30"
```
#### 2. Create Secret templates for sensitive configuration
```yaml
apiVersion: v1
kind: Secret
metadata:
  name: motovault-secrets
type: Opaque
data:
  POSTGRES_CONNECTION: <base64-encoded-connection-string>
  MINIO_ACCESS_KEY: <base64-encoded-access-key>
  MINIO_SECRET_KEY: <base64-encoded-secret-key>
  JWT_SECRET: <base64-encoded-jwt-secret>
```
#### 3. Modify application startup to read from environment variables
- Update `Program.cs` to prioritize environment variables over file configuration
- Remove dependencies on `appsettings.json` for runtime configuration
- Implement configuration validation at startup
#### 4. Remove file-based configuration dependencies
- Update all services to use IConfiguration instead of direct file access
- Ensure all configuration is injectable through dependency injection
#### 5. Implement configuration validation at startup
- Add startup checks to ensure all required configuration is present
- Fail fast if critical configuration is missing
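One way to deliver the ConfigMap and Secret from tasks 1 and 2 to the application is `envFrom` on the container, which exposes every key as an environment variable. The manifest below is a minimal sketch: the Deployment name, the `app: motovault` label, the image reference, and the port are assumptions for illustration; only `motovault-config` and `motovault-secrets` come from the templates above.

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: motovault                  # hypothetical name
spec:
  replicas: 3
  selector:
    matchLabels:
      app: motovault               # hypothetical label
  template:
    metadata:
      labels:
        app: motovault
    spec:
      containers:
        - name: motovault
          image: registry.example.com/motovault:latest   # placeholder image reference
          ports:
            - containerPort: 8080  # assumed; default HTTP port of the .NET 8 container images
          envFrom:
            - configMapRef:
                name: motovault-config
            - secretRef:
                name: motovault-secrets
```

ASP.NET Core's environment variable configuration provider treats a double underscore as a section separator, so a variable named `MinIO__BucketName` is read as the configuration key `MinIO:BucketName`.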
## 1.2 Database Architecture Modernization
**Objective**: Eliminate LiteDB dependency and optimize PostgreSQL usage for Kubernetes.
**Current State**:
- Dual database support with LiteDB as default
- Single PostgreSQL connection for external database mode
- No connection pooling optimization for multiple instances
**Target State**:
- PostgreSQL-only configuration with high availability
- Optimized connection pooling for horizontal scaling
- Database migration strategy for existing LiteDB installations
### Implementation Tasks
#### 1. Remove LiteDB implementation and dependencies
```csharp
// Remove all LiteDB-related code:
// - Delete External/Implementations/LiteDB/
// - Remove LiteDB package references
// - Update dependency injection to register only the PostgreSQL implementations
```
#### 2. Implement PostgreSQL HA configuration
```csharp
services.AddDbContext<MotoVaultContext>(options =>
{
    options.UseNpgsql(connectionString, npgsqlOptions =>
    {
        npgsqlOptions.EnableRetryOnFailure(
            maxRetryCount: 3,
            maxRetryDelay: TimeSpan.FromSeconds(5),
            errorCodesToAdd: null);
    });
});
```
```
#### 3. Add connection pooling configuration
```csharp
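// Configure connection pooling for multiple instances.
// Npgsql reads pool settings from the connection string, so set them there.
// The pool is per pod: replicas x MaxPoolSize must stay below PostgreSQL's max_connections.
var pooledConnectionString = new NpgsqlConnectionStringBuilder(connectionString)
{
    MaxPoolSize = 100,
    MinPoolSize = 10,
    ConnectionLifetime = 300 // seconds (5 minutes)
}.ConnectionString;

// Pass pooledConnectionString to UseNpgsql(...) in the AddDbContext registration.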
```
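Because each pod keeps its own pool, the pool size has to be budgeted against the server-wide connection limit. The check below is a rough worked example: it uses the three replicas from the architecture diagram and the `max_connections: "200"` value from the Phase 2 cluster configuration, while the reserve $R$ of 20 connections for replication, backups, and administration is an assumption.

$$
N_{\text{pods}} \times \text{MaxPoolSize} + R \le \text{max\_connections}
$$

$$
3 \times 100 + 20 = 320 > 200
\qquad\Rightarrow\qquad
\text{MaxPoolSize} \le \left\lfloor \frac{200 - 20}{3} \right\rfloor = 60
$$

Under these assumptions a `MaxPoolSize` of 100 would let three fully loaded pods exceed the server limit, so either the per-pod pool or `max_connections` would need adjusting, and the budget shrinks further if auto-scaling adds replicas.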
#### 4. Create data migration tools for LiteDB to PostgreSQL conversion
- Develop utility to export data from LiteDB format
- Create import scripts for PostgreSQL
- Ensure data integrity during migration
#### 5. Implement database health checks for Kubernetes probes
```csharp
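public class DatabaseHealthCheck : IHealthCheck
{
    private readonly IDbContextFactory<MotoVaultContext> _contextFactory;

    public DatabaseHealthCheck(IDbContextFactory<MotoVaultContext> contextFactory)
    {
        _contextFactory = contextFactory;
    }

    public async Task<HealthCheckResult> CheckHealthAsync(
        HealthCheckContext context,
        CancellationToken cancellationToken = default)
    {
        try
        {
            await using var dbContext = await _contextFactory.CreateDbContextAsync(cancellationToken);
            // CanConnectAsync returns false instead of throwing when the database is unreachable
            var canConnect = await dbContext.Database.CanConnectAsync(cancellationToken);
            return canConnect
                ? HealthCheckResult.Healthy("Database connection successful")
                : HealthCheckResult.Unhealthy("Database connection failed");
        }
        catch (Exception ex)
        {
            return HealthCheckResult.Unhealthy("Database connection failed", ex);
        }
    }
}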
```
## 1.3 Health Check Implementation
**Objective**: Add Kubernetes-compatible health check endpoints for proper orchestration.
**Current State**:
- No dedicated health check endpoints
- Application startup/shutdown not optimized for Kubernetes
**Target State**:
- Comprehensive health checks for all dependencies
- Proper readiness and liveness probe endpoints
- Graceful shutdown handling for pod termination
### Implementation Tasks
#### 1. Add health check middleware
```csharp
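// Program.cs
using HealthChecks.UI.Client;
using Microsoft.AspNetCore.Diagnostics.HealthChecks;

// Tag every dependency check "ready" so the readiness predicate below selects it
builder.Services.AddHealthChecks()
    .AddNpgSql(connectionString, name: "database", tags: new[] { "ready" })
    .AddRedis(redisConnectionString, name: "cache", tags: new[] { "ready" })
    .AddCheck<MinIOHealthCheck>("minio", tags: new[] { "ready" });

app.MapHealthChecks("/health/ready", new HealthCheckOptions
{
    Predicate = check => check.Tags.Contains("ready"),
    ResponseWriter = UIResponseWriter.WriteHealthCheckUIResponse
});

app.MapHealthChecks("/health/live", new HealthCheckOptions
{
    // Run no dependency checks: liveness only confirms the process answers HTTP
    Predicate = _ => false
});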
```
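The two endpoints above map onto Kubernetes probes. The fragment below is a sketch of the container-level probe settings; the port and all timing values are assumptions to be tuned against real startup and response times.

```yaml
# Container-level fragment of a Deployment pod spec
livenessProbe:
  httpGet:
    path: /health/live
    port: 8080             # assumed container port
  initialDelaySeconds: 10  # assumed startup allowance
  periodSeconds: 10
  failureThreshold: 3
readinessProbe:
  httpGet:
    path: /health/ready
    port: 8080             # assumed container port
  periodSeconds: 5
  timeoutSeconds: 3
  failureThreshold: 3
```

Keeping dependency checks out of the liveness probe is deliberate: a failing liveness probe restarts the container, which does not help when the database or cache is the component that is down, whereas a failing readiness probe only removes the pod from Service endpoints until the dependency recovers.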
#### 2. Implement custom health checks
```csharp
public class MinIOHealthCheck : IHealthCheck
{
    private readonly IMinioClient _minioClient;

    public MinIOHealthCheck(IMinioClient minioClient)
    {
        _minioClient = minioClient;
    }

    public async Task<HealthCheckResult> CheckHealthAsync(
        HealthCheckContext context,
        CancellationToken cancellationToken = default)
    {
        try
        {
            // Listing buckets is a cheap round trip that exercises connectivity and credentials
            await _minioClient.ListBucketsAsync(cancellationToken);
            return HealthCheckResult.Healthy("MinIO is accessible");
        }
        catch (Exception ex)
        {
            return HealthCheckResult.Unhealthy("MinIO is not accessible", ex);
        }
    }
}
```
#### 3. Add graceful shutdown handling
```csharp
builder.Services.Configure<HostOptions>(options =>
{
    options.ShutdownTimeout = TimeSpan.FromSeconds(30);
});
```
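The 30-second `ShutdownTimeout` only helps if Kubernetes waits at least that long before killing the container. The pod spec fragment below is a sketch of how the two could be aligned; the 45-second grace period, the 5-second `preStop` delay, and the container name are assumptions.

```yaml
# Pod spec fragment
spec:
  terminationGracePeriodSeconds: 45   # assumed; covers the preStop delay plus the 30 s ShutdownTimeout
  containers:
    - name: motovault                 # hypothetical container name
      lifecycle:
        preStop:
          exec:
            # Assumed delay so the pod is removed from Service endpoints before SIGTERM arrives;
            # requires a sleep binary in the image
            command: ["sleep", "5"]
```

The grace period countdown starts when termination begins, so it includes the time spent in the `preStop` hook; the container is force-killed once it expires.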
## 1.4 Logging Enhancement
**Objective**: Implement structured logging suitable for centralized log aggregation.
**Current State**:
- Basic logging with simple string messages
- No correlation IDs for distributed tracing
- Log levels not optimized for production monitoring
**Target State**:
- JSON-structured logging with correlation IDs
- Centralized log aggregation compatibility
- Performance and error metrics embedded in logs
### Implementation Tasks
#### 1. Configure structured logging
```csharp
builder.Services.AddLogging(loggingBuilder =>
{
    loggingBuilder.ClearProviders();
    loggingBuilder.AddJsonConsole(options =>
    {
        options.IncludeScopes = true;
        // The trailing "Z" in the format is a literal, so timestamps must be emitted in UTC
        options.UseUtcTimestamp = true;
        options.TimestampFormat = "yyyy-MM-ddTHH:mm:ss.fffZ";
        options.JsonWriterOptions = new JsonWriterOptions
        {
            Indented = false
        };
    });
});
```
#### 2. Add correlation ID middleware
```csharp
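public class CorrelationIdMiddleware
{
    private readonly RequestDelegate _next;
    private readonly ILogger<CorrelationIdMiddleware> _logger;

    public CorrelationIdMiddleware(RequestDelegate next, ILogger<CorrelationIdMiddleware> logger)
    {
        _next = next;
        _logger = logger;
    }

    public async Task InvokeAsync(HttpContext context)
    {
        var correlationId = context.Request.Headers["X-Correlation-ID"]
            .FirstOrDefault() ?? Guid.NewGuid().ToString();

        using var scope = _logger.BeginScope(new Dictionary<string, object?>
        {
            ["CorrelationId"] = correlationId,
            ["UserId"] = context.User?.Identity?.Name
        });

        // Use the indexer: Headers.Add throws if the header is already present
        context.Response.Headers["X-Correlation-ID"] = correlationId;
        await _next(context);
    }
}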
```
#### 3. Implement performance logging for critical operations
- Add timing information to database operations
- Log request/response metrics
- Include user context in all log entries
## Week-by-Week Breakdown
### Week 1: Environment Setup and Configuration
- **Days 1-2**: Set up development Kubernetes environment
- **Days 3-4**: Create ConfigMap and Secret templates
- **Days 5-7**: Modify application to read from environment variables
### Week 2: Database Migration
- **Days 1-3**: Remove LiteDB dependencies
- **Days 4-5**: Implement PostgreSQL connection pooling
- **Days 6-7**: Create data migration utilities
### Week 3: Health Checks and Monitoring
- **Days 1-3**: Implement health check endpoints
- **Days 4-5**: Add custom health checks for dependencies
- **Days 6-7**: Test health check functionality
### Week 4: Logging and Documentation
- **Days 1-3**: Implement structured logging
- **Days 4-5**: Add correlation ID middleware
- **Days 6-7**: Document changes and prepare for Phase 2
## Success Criteria
- [ ] Application starts successfully using only environment variables
- [ ] All LiteDB dependencies removed
- [ ] PostgreSQL connection pooling configured and tested
- [ ] Health check endpoints return appropriate status
- [ ] Structured JSON logging implemented
- [ ] Data migration tool successfully converts LiteDB to PostgreSQL
- [ ] Application can be deployed to Kubernetes without file dependencies
## Testing Requirements
### Unit Tests
- Configuration validation logic
- Health check implementations
- Database connection handling
### Integration Tests
- End-to-end application startup with external configuration
- Database connectivity and migration
- Health check endpoint responses
### Manual Testing
- Deploy to development Kubernetes cluster
- Verify all functionality works without local file dependencies
- Test health check endpoints with kubectl
## Deliverables
1. **Updated Application Code**
- Removed LiteDB dependencies
- Externalized configuration
- Added health checks
- Implemented structured logging
2. **Kubernetes Manifests**
- ConfigMap templates
- Secret templates
- Basic deployment configuration for testing
3. **Migration Tools**
- LiteDB to PostgreSQL data migration utility
- Configuration migration scripts
4. **Documentation**
- Updated deployment instructions
- Configuration reference
- Health check endpoint documentation
## Dependencies
- Kubernetes cluster (development environment)
- PostgreSQL instance for testing
- Docker registry for container images
## Risks and Mitigations
### Risk: Data Loss During Migration
**Mitigation**: Comprehensive backup strategy and thorough testing of migration tools
### Risk: Configuration Errors
**Mitigation**: Configuration validation at startup and extensive testing
### Risk: Performance Degradation
**Mitigation**: Performance testing and gradual rollout with monitoring
---
**Next Phase**: [Phase 2: High Availability Infrastructure](K8S-PHASE-2.md)

742
docs/K8S-PHASE-2.md Normal file

@@ -0,0 +1,742 @@
# Phase 2: High Availability Infrastructure (Weeks 5-8)
This phase focuses on implementing the supporting infrastructure required for high availability, including MinIO clusters, PostgreSQL HA setup, Redis clusters, and file storage abstraction.
## Overview
Phase 2 transforms MotoVaultPro's supporting infrastructure from single-instance services to highly available, distributed systems. This phase establishes the foundation for true high availability by eliminating all single points of failure in the data layer.
## Key Objectives
- **MinIO High Availability**: Deploy distributed object storage with erasure coding
- **File Storage Abstraction**: Create unified interface for file operations
- **PostgreSQL HA**: Implement primary/replica configuration with automated failover
- **Redis Cluster**: Deploy distributed caching and session storage
- **Data Migration**: Seamless transition from local storage to distributed systems
## 2.1 MinIO High Availability Setup
**Objective**: Deploy a highly available MinIO cluster for file storage with automatic failover.
**Architecture Overview**:
MinIO will be deployed as a distributed cluster with erasure coding for data protection and automatic healing capabilities.
### MinIO Cluster Configuration
```yaml
# MinIO Tenant Configuration
apiVersion: minio.min.io/v2
kind: Tenant
metadata:
  name: motovault-minio
  namespace: motovault
spec:
  image: minio/minio:RELEASE.2024-01-16T16-07-38Z
  creationDate: 2024-01-20T10:00:00Z
  pools:
    - servers: 4
      name: pool-0
      volumesPerServer: 4
      volumeClaimTemplate:
        metadata:
          name: data
        spec:
          accessModes:
            - ReadWriteOnce
          resources:
            requests:
              storage: 100Gi
          storageClassName: fast-ssd
  mountPath: /export
  subPath: /data
  requestAutoCert: false
  certConfig:
    commonName: ""
    organizationName: []
    dnsNames: []
  console:
    image: minio/console:v0.22.5
    replicas: 2
    consoleSecret:
      name: motovault-minio-console-secret
  configuration:
    name: motovault-minio-config
```
### Implementation Tasks
#### 1. Deploy MinIO Operator
```bash
kubectl apply -k "github.com/minio/operator/resources"
```
#### 2. Create MinIO cluster configuration with erasure coding
- Configure 4+ nodes for optimal erasure coding
- Set up data protection with automatic healing
- Configure storage classes for performance
#### 3. Configure backup policies for disaster recovery
```yaml
apiVersion: v1
kind: ConfigMap
metadata:
  name: minio-backup-policy
data:
  backup-policy.json: |
    {
      "rules": [
        {
          "id": "motovault-backup",
          "status": "Enabled",
          "transition": {
            "days": 30,
            "storage_class": "GLACIER"
          }
        }
      ]
    }
```
#### 4. Set up monitoring with Prometheus metrics
```yaml
apiVersion: monitoring.coreos.com/v1
kind: ServiceMonitor
metadata:
  name: minio-metrics
spec:
  selector:
    matchLabels:
      app: minio
  endpoints:
    - port: http-minio
      path: /minio/v2/metrics/cluster
```
#### 5. Create service endpoints for application connectivity
```yaml
apiVersion: v1
kind: Service
metadata:
  name: minio-service
spec:
  selector:
    app: minio
  ports:
    - name: http
      port: 9000
      targetPort: 9000
    - name: console
      port: 9001
      targetPort: 9001
```
### MinIO High Availability Features
- **Erasure Coding**: Data is split across multiple drives with parity for automatic healing
- **Distributed Architecture**: No single point of failure
- **Automatic Healing**: Corrupted data is automatically detected and repaired
- **Load Balancing**: Built-in load balancing across cluster nodes
- **Bucket Policies**: Fine-grained access control for different data types
## 2.2 File Storage Abstraction Implementation
**Objective**: Create an abstraction layer that allows seamless switching between local filesystem and MinIO object storage.
**Current State**:
- Direct filesystem operations throughout the application
- File paths hardcoded in various controllers and services
- No abstraction for different storage backends
**Target State**:
- Unified file storage interface
- Pluggable storage implementations
- Transparent migration between storage types
### Implementation Tasks
#### 1. Define storage abstraction interface
```csharp
public interface IFileStorageService
{
    Task<string> UploadFileAsync(Stream fileStream, string fileName, string contentType, CancellationToken cancellationToken = default);
    Task<Stream> DownloadFileAsync(string fileId, CancellationToken cancellationToken = default);
    Task<bool> DeleteFileAsync(string fileId, CancellationToken cancellationToken = default);
    Task<FileMetadata> GetFileMetadataAsync(string fileId, CancellationToken cancellationToken = default);
    Task<IEnumerable<FileMetadata>> ListFilesAsync(string prefix = null, CancellationToken cancellationToken = default);
    Task<string> GeneratePresignedUrlAsync(string fileId, TimeSpan expiration, CancellationToken cancellationToken = default);
}

public class FileMetadata
{
    public string Id { get; set; }
    public string FileName { get; set; }
    public string ContentType { get; set; }
    public long Size { get; set; }
    public DateTime CreatedDate { get; set; }
    public DateTime ModifiedDate { get; set; }
    public Dictionary<string, string> Tags { get; set; }
}
```
#### 2. Implement MinIO storage service
```csharp
public class MinIOFileStorageService : IFileStorageService
{
    private readonly IMinioClient _minioClient;
    private readonly ILogger<MinIOFileStorageService> _logger;
    private readonly string _bucketName;

    public MinIOFileStorageService(IMinioClient minioClient, IConfiguration configuration, ILogger<MinIOFileStorageService> logger)
    {
        _minioClient = minioClient;
        _logger = logger;
        _bucketName = configuration["MinIO:BucketName"] ?? "motovault-files";
    }

    public async Task<string> UploadFileAsync(Stream fileStream, string fileName, string contentType, CancellationToken cancellationToken = default)
    {
        var fileId = $"{Guid.NewGuid()}/{fileName}";
        try
        {
            await _minioClient.PutObjectAsync(new PutObjectArgs()
                .WithBucket(_bucketName)
                .WithObject(fileId)
                .WithStreamData(fileStream)
                .WithObjectSize(fileStream.Length)
                .WithContentType(contentType)
                .WithHeaders(new Dictionary<string, string>
                {
                    ["X-Amz-Meta-Original-Name"] = fileName,
                    ["X-Amz-Meta-Upload-Date"] = DateTime.UtcNow.ToString("O")
                }), cancellationToken);

            _logger.LogInformation("File uploaded successfully: {FileId}", fileId);
            return fileId;
        }
        catch (Exception ex)
        {
            _logger.LogError(ex, "Failed to upload file: {FileName}", fileName);
            throw;
        }
    }

    public async Task<Stream> DownloadFileAsync(string fileId, CancellationToken cancellationToken = default)
    {
        try
        {
            var memoryStream = new MemoryStream();
            await _minioClient.GetObjectAsync(new GetObjectArgs()
                .WithBucket(_bucketName)
                .WithObject(fileId)
                .WithCallbackStream(stream => stream.CopyTo(memoryStream)), cancellationToken);

            memoryStream.Position = 0;
            return memoryStream;
        }
        catch (Exception ex)
        {
            _logger.LogError(ex, "Failed to download file: {FileId}", fileId);
            throw;
        }
    }

    // Additional method implementations...
}
```
#### 3. Create fallback storage service for graceful degradation
```csharp
public class FallbackFileStorageService : IFileStorageService
{
    private readonly IFileStorageService _primaryService;
    private readonly IFileStorageService _fallbackService;
    private readonly ILogger<FallbackFileStorageService> _logger;

    public FallbackFileStorageService(
        IFileStorageService primaryService,
        IFileStorageService fallbackService,
        ILogger<FallbackFileStorageService> logger)
    {
        _primaryService = primaryService;
        _fallbackService = fallbackService;
        _logger = logger;
    }

    public async Task<string> UploadFileAsync(Stream fileStream, string fileName, string contentType, CancellationToken cancellationToken = default)
    {
        try
        {
            return await _primaryService.UploadFileAsync(fileStream, fileName, contentType, cancellationToken);
        }
        catch (Exception ex)
        {
            // A non-seekable stream cannot be replayed, so the fallback is only safe after a rewind
            if (!fileStream.CanSeek) throw;

            _logger.LogWarning(ex, "Primary storage failed, falling back to secondary storage");
            fileStream.Position = 0; // Reset stream position
            return await _fallbackService.UploadFileAsync(fileStream, fileName, contentType, cancellationToken);
        }
    }

    // Implementation with automatic fallback logic for other methods...
}
```
#### 4. Update all file operations to use the abstraction layer
- Replace direct File.WriteAllBytes, File.ReadAllBytes calls
- Update all controllers to use IFileStorageService
- Modify attachment handling in vehicle records
#### 5. Implement file migration utility for existing local files
```csharp
public class FileMigrationService
{
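private readonly IFileStorageService _targetStorage;
private readonly ILogger<FileMigrationService> _logger;
public FileMigrationService(IFileStorageService targetStorage, ILogger<FileMigrationService> logger)
{
_targetStorage = targetStorage;
_logger = logger;
}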
public async Task<MigrationResult> MigrateLocalFilesAsync(string localPath)
{
var result = new MigrationResult();
var files = Directory.GetFiles(localPath, "*", SearchOption.AllDirectories);
foreach (var filePath in files)
{
try
{
using var fileStream = File.OpenRead(filePath);
var fileName = Path.GetFileName(filePath);
var contentType = GetContentType(fileName);
var fileId = await _targetStorage.UploadFileAsync(fileStream, fileName, contentType);
result.ProcessedFiles.Add(new MigratedFile
{
OriginalPath = filePath,
NewFileId = fileId,
Success = true
});
}
catch (Exception ex)
{
_logger.LogError(ex, "Failed to migrate file: {FilePath}", filePath);
result.ProcessedFiles.Add(new MigratedFile
{
OriginalPath = filePath,
Success = false,
Error = ex.Message
});
}
}
return result;
}
}
```
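The Risks section of this phase names checksum validation as the mitigation for corruption during migration. One way to apply it is to fetch each migrated object back and compare SHA-256 digests with the local original. The sketch below assumes GNU `sha256sum` is available; `file_sha256` and `verify_migrated_file` are hypothetical helper names, not part of MotoVaultPro.

```bash
# Hypothetical helpers for post-migration verification.

# Print the SHA-256 hex digest of a file's content.
file_sha256() {
  sha256sum < "$1" | awk '{print $1}'
}

# Return 0 when both files have identical content, non-zero otherwise.
verify_migrated_file() {
  local original="$1" downloaded="$2"
  [[ "$(file_sha256 "$original")" == "$(file_sha256 "$downloaded")" ]]
}
```

In a migration run, the second argument would be a temporary copy downloaded from the target bucket. Files that fail the comparison can be re-uploaded before the local originals are removed.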
## 2.3 PostgreSQL High Availability Configuration
**Objective**: Set up a PostgreSQL cluster with automatic failover and read replicas.
**Architecture Overview**:
PostgreSQL will be deployed using an operator (like CloudNativePG or Postgres Operator) to provide automated failover, backup, and scaling capabilities.
### PostgreSQL Cluster Configuration
```yaml
apiVersion: postgresql.cnpg.io/v1
kind: Cluster
metadata:
  name: motovault-postgres
  namespace: motovault
spec:
  instances: 3
  primaryUpdateStrategy: unsupervised
  postgresql:
    parameters:
      max_connections: "200"
      shared_buffers: "256MB"
      effective_cache_size: "1GB"
      maintenance_work_mem: "64MB"
      checkpoint_completion_target: "0.9"
      wal_buffers: "16MB"
      default_statistics_target: "100"
      random_page_cost: "1.1"
      effective_io_concurrency: "200"
  resources:
    requests:
      memory: "2Gi"
      cpu: "1000m"
    limits:
      memory: "4Gi"
      cpu: "2000m"
  storage:
    size: "100Gi"
    storageClass: "fast-ssd"
  monitoring:
    enablePodMonitor: true
  backup:
    # retentionPolicy covers base backups and the WAL needed to restore them;
    # CloudNativePG has no separate per-WAL or per-data retention setting.
    retentionPolicy: "30d"
    barmanObjectStore:
      destinationPath: "s3://motovault-backups/postgres"
      s3Credentials:
        accessKeyId:
          name: postgres-backup-credentials
          key: ACCESS_KEY_ID
        secretAccessKey:
          name: postgres-backup-credentials
          key: SECRET_ACCESS_KEY
      data:
        jobs: 1
```
### Implementation Tasks
#### 1. Deploy PostgreSQL operator (CloudNativePG recommended)
```bash
kubectl apply -f https://raw.githubusercontent.com/cloudnative-pg/cloudnative-pg/release-1.20/releases/cnpg-1.20.1.yaml
```
#### 2. Configure cluster with primary/replica setup
- 3-node cluster with automatic failover
- Read-write split capability
- Streaming replication configuration
#### 3. Set up automated backups to MinIO or external storage
```yaml
apiVersion: postgresql.cnpg.io/v1
kind: ScheduledBackup
metadata:
  name: motovault-postgres-backup
  namespace: motovault
spec:
  schedule: "0 0 2 * * *" # Daily at 2 AM; CloudNativePG schedules start with a seconds field
  backupOwnerReference: self
  cluster:
    name: motovault-postgres
```
#### 4. Implement connection pooling with PgBouncer
```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: pgbouncer
spec:
  replicas: 2
  selector:
    matchLabels:
      app: pgbouncer
  template:
    metadata:
      labels:
        app: pgbouncer # Must match spec.selector, or the Deployment is rejected
    spec:
      containers:
        - name: pgbouncer
          image: pgbouncer/pgbouncer:latest
          env:
            - name: DATABASES_HOST
              value: motovault-postgres-rw
            - name: DATABASES_PORT
              value: "5432"
            - name: DATABASES_DATABASE
              value: motovault
            - name: POOL_MODE
              value: session
            - name: MAX_CLIENT_CONN
              value: "1000"
            - name: DEFAULT_POOL_SIZE
              value: "25"
```
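The pool settings are worth checking against `max_connections` on the PostgreSQL side. PgBouncer's `DEFAULT_POOL_SIZE` applies per database/user pair, and each PgBouncer replica keeps its own pools. The sketch below works through the worst case under the assumption that the application uses a single database/user pair; `pgbouncer_server_connections` is a hypothetical helper, and reserve pools are ignored.

```bash
# Worst-case PostgreSQL connections opened by the PgBouncer tier.
# Arguments: PgBouncer replicas, DEFAULT_POOL_SIZE, number of (database, user) pairs.
pgbouncer_server_connections() {
  local replicas="$1" pool_size="$2" pairs="$3"
  echo $(( replicas * pool_size * pairs ))
}

max_connections=200 # max_connections from the Cluster manifest
needed=$(pgbouncer_server_connections 2 25 1)
echo "PgBouncer may open up to ${needed} of ${max_connections} PostgreSQL connections"
```

With the values in this plan the pools use 50 of the 200 available connections, which leaves room for replication, monitoring, and administrative sessions.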
#### 5. Configure monitoring and alerting for database health
```yaml
apiVersion: monitoring.coreos.com/v1
kind: ServiceMonitor
metadata:
name: postgres-metrics
spec:
selector:
matchLabels:
app.kubernetes.io/name: cloudnative-pg
endpoints:
- port: metrics
path: /metrics
```
## 2.4 Redis Cluster for Session Management
**Objective**: Implement distributed session storage and caching using Redis cluster.
**Current State**:
- In-memory session storage tied to individual application instances
- No distributed caching for expensive operations
- Configuration and translation data loaded on each application start
**Target State**:
- Redis cluster for distributed session storage
- Centralized caching for frequently accessed data
- High availability with automatic failover
### Redis Cluster Configuration
```yaml
apiVersion: v1
kind: ConfigMap
metadata:
name: redis-cluster-config
namespace: motovault
data:
redis.conf: |
cluster-enabled yes
cluster-require-full-coverage no
cluster-node-timeout 15000
cluster-config-file /data/nodes.conf
cluster-migration-barrier 1
appendonly yes
appendfsync everysec
save 900 1
save 300 10
save 60 10000
---
apiVersion: apps/v1
kind: StatefulSet
metadata:
name: redis-cluster
namespace: motovault
spec:
serviceName: redis-cluster
replicas: 6
selector:
matchLabels:
app: redis-cluster
template:
metadata:
labels:
app: redis-cluster
spec:
containers:
- name: redis
image: redis:7-alpine
command:
- redis-server
- /etc/redis/redis.conf
ports:
- containerPort: 6379
- containerPort: 16379
resources:
requests:
memory: "512Mi"
cpu: "250m"
limits:
memory: "1Gi"
cpu: "500m"
volumeMounts:
- name: redis-config
mountPath: /etc/redis
- name: redis-data
mountPath: /data
volumes:
- name: redis-config
configMap:
name: redis-cluster-config
volumeClaimTemplates:
- metadata:
name: redis-data
spec:
accessModes: ["ReadWriteOnce"]
resources:
requests:
storage: 10Gi
```
### Implementation Tasks
#### 1. Deploy Redis cluster with 6 nodes (3 masters, 3 replicas)
```bash
# Initialize Redis cluster after deployment
kubectl exec -it -n motovault redis-cluster-0 -- redis-cli --cluster create \
redis-cluster-0.redis-cluster:6379 \
redis-cluster-1.redis-cluster:6379 \
redis-cluster-2.redis-cluster:6379 \
redis-cluster-3.redis-cluster:6379 \
redis-cluster-4.redis-cluster:6379 \
redis-cluster-5.redis-cluster:6379 \
--cluster-replicas 1
```
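The node count follows from how Redis Cluster handles failover: a replica is promoted only when a majority of masters agrees, so three masters is the usual minimum. The sketch below is a small arithmetic check of the 6-node layout; the function names are illustrative only.

```bash
# Masters created by `redis-cli --cluster create` for a given node count
# and --cluster-replicas value.
cluster_masters() {
  local nodes="$1" replicas_per_master="$2"
  echo $(( nodes / (replicas_per_master + 1) ))
}

# Smallest number of masters that forms a majority.
master_majority() {
  echo $(( $1 / 2 + 1 ))
}

masters=$(cluster_masters 6 1)
majority=$(master_majority "$masters")
echo "masters=${masters} majority=${majority} tolerated_master_failures=$(( masters - majority ))"
```

For 6 nodes with one replica per master this gives 3 masters, a majority of 2, and tolerance for one master being unreachable at a time.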
#### 2. Configure session storage
```csharp
services.AddStackExchangeRedisCache(options =>
{
options.Configuration = configuration.GetConnectionString("Redis");
options.InstanceName = "MotoVault";
});
services.AddSession(options =>
{
options.IdleTimeout = TimeSpan.FromMinutes(30);
options.Cookie.HttpOnly = true;
options.Cookie.IsEssential = true;
options.Cookie.SecurePolicy = CookieSecurePolicy.Always;
});
```
#### 3. Implement distributed caching
```csharp
public class CachedTranslationService : ITranslationService
{
    private readonly IDistributedCache _cache;
    private readonly ITranslationService _translationService; // The undecorated service being wrapped
    private readonly ILogger<CachedTranslationService> _logger;

    public CachedTranslationService(
        IDistributedCache cache,
        ITranslationService translationService,
        ILogger<CachedTranslationService> logger)
    {
        _cache = cache;
        _translationService = translationService;
        _logger = logger;
    }

    public async Task<string> GetTranslationAsync(string key, string language)
    {
        var cacheKey = $"translation:{language}:{key}";
        var cached = await _cache.GetStringAsync(cacheKey);
        if (cached != null)
        {
            return cached;
        }

        var translation = await _translationService.GetTranslationAsync(key, language);
        if (translation != null) // SetStringAsync throws on null values
        {
            await _cache.SetStringAsync(cacheKey, translation, new DistributedCacheEntryOptions
            {
                SlidingExpiration = TimeSpan.FromHours(1)
            });
        }
        return translation;
    }
}
```
#### 4. Add cache monitoring and performance metrics
```csharp
public class CacheMetricsService
{
private readonly Counter _cacheHits;
private readonly Counter _cacheMisses;
private readonly Histogram _cacheOperationDuration;
public CacheMetricsService()
{
_cacheHits = Metrics.CreateCounter(
"motovault_cache_hits_total",
"Total cache hits",
new[] { "cache_type" });
_cacheMisses = Metrics.CreateCounter(
"motovault_cache_misses_total",
"Total cache misses",
new[] { "cache_type" });
_cacheOperationDuration = Metrics.CreateHistogram(
"motovault_cache_operation_duration_seconds",
"Cache operation duration",
new[] { "operation", "cache_type" });
}
}
```
## Week-by-Week Breakdown
### Week 5: MinIO Deployment
- **Days 1-2**: Deploy MinIO operator and configure basic cluster
- **Days 3-4**: Implement file storage abstraction interface
- **Days 5-7**: Create MinIO storage service implementation
### Week 6: File Migration and PostgreSQL HA
- **Days 1-2**: Complete file storage abstraction and migration tools
- **Days 3-4**: Deploy PostgreSQL operator and HA cluster
- **Days 5-7**: Configure connection pooling and backup strategies
### Week 7: Redis Cluster and Caching
- **Days 1-3**: Deploy Redis cluster and configure session storage
- **Days 4-5**: Implement distributed caching layer
- **Days 6-7**: Add cache monitoring and performance metrics
### Week 8: Integration and Testing
- **Days 1-3**: End-to-end testing of all HA components
- **Days 4-5**: Performance testing and optimization
- **Days 6-7**: Documentation and preparation for Phase 3
## Success Criteria
- [ ] MinIO cluster operational with erasure coding
- [ ] File storage abstraction implemented and tested
- [ ] PostgreSQL HA cluster with automatic failover
- [ ] Redis cluster providing distributed sessions
- [ ] All file operations migrated to object storage
- [ ] Comprehensive monitoring for all infrastructure components
- [ ] Backup and recovery procedures validated
## Testing Requirements
### Infrastructure Tests
- MinIO cluster failover scenarios
- PostgreSQL primary/replica failover
- Redis cluster node failure recovery
- Network partition handling
### Application Integration Tests
- File upload/download through abstraction layer
- Session persistence across application restarts
- Cache performance and invalidation
- Database connection pool behavior
### Performance Tests
- File storage throughput and latency
- Database query performance with connection pooling
- Cache hit/miss ratios and response times
## Deliverables
1. **Infrastructure Components**
- MinIO HA cluster configuration
- PostgreSQL HA cluster with operator
- Redis cluster deployment
- Monitoring and alerting setup
2. **Application Updates**
- File storage abstraction implementation
- Session management configuration
- Distributed caching integration
- Connection pooling optimization
3. **Migration Tools**
- File migration utility
- Database migration scripts
- Configuration migration helpers
4. **Documentation**
- Infrastructure architecture diagrams
- Operational procedures
- Monitoring and alerting guides
## Dependencies
- Kubernetes cluster with sufficient resources
- Storage classes for persistent volumes
- Prometheus and Grafana for monitoring
- Network connectivity between components
## Risks and Mitigations
### Risk: Data Corruption During File Migration
**Mitigation**: Checksum validation and parallel running of old/new systems
### Risk: Database Failover Issues
**Mitigation**: Extensive testing of failover scenarios and automated recovery
### Risk: Cache Inconsistency
**Mitigation**: Proper cache invalidation strategies and monitoring
---
**Previous Phase**: [Phase 1: Core Kubernetes Readiness](K8S-PHASE-1.md)
**Next Phase**: [Phase 3: Production Deployment](K8S-PHASE-3.md)

862
docs/K8S-PHASE-3.md Normal file

@@ -0,0 +1,862 @@
# Phase 3: Production Deployment (Weeks 9-12)
This phase focuses on deploying the modernized application with proper production configurations, monitoring, backup strategies, and operational procedures.
## Overview
Phase 3 transforms the development-ready Kubernetes application into a production-grade system with comprehensive monitoring, automated backup and recovery, secure ingress, and operational excellence. This phase ensures the system is ready for enterprise-level workloads with proper security, performance, and reliability guarantees.
## Key Objectives
- **Production Kubernetes Deployment**: Configure scalable, secure deployment manifests
- **Ingress and TLS Configuration**: Secure external access with proper routing
- **Comprehensive Monitoring**: Application and infrastructure observability
- **Backup and Disaster Recovery**: Automated backup strategies and recovery procedures
- **Migration Execution**: Seamless transition from legacy system
## 3.1 Kubernetes Deployment Configuration
**Objective**: Create production-ready Kubernetes manifests with proper resource management and high availability.
### Application Deployment Configuration
```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
name: motovault-app
namespace: motovault
labels:
app: motovault
version: v1.0.0
spec:
replicas: 3
strategy:
type: RollingUpdate
rollingUpdate:
maxSurge: 1
maxUnavailable: 0
selector:
matchLabels:
app: motovault
template:
metadata:
labels:
app: motovault
version: v1.0.0
annotations:
prometheus.io/scrape: "true"
prometheus.io/path: "/metrics"
prometheus.io/port: "8080"
spec:
serviceAccountName: motovault-service-account
securityContext:
runAsNonRoot: true
runAsUser: 1000
fsGroup: 2000
affinity:
podAntiAffinity:
preferredDuringSchedulingIgnoredDuringExecution:
- weight: 100
podAffinityTerm:
labelSelector:
matchExpressions:
- key: app
operator: In
values:
- motovault
topologyKey: kubernetes.io/hostname
- weight: 50
podAffinityTerm:
labelSelector:
matchExpressions:
- key: app
operator: In
values:
- motovault
topologyKey: topology.kubernetes.io/zone
containers:
- name: motovault
image: motovault:v1.0.0 # Pinned to the version label; a ":latest" tag makes rollbacks ambiguous
imagePullPolicy: Always
ports:
- containerPort: 8080
name: http
protocol: TCP
env:
- name: ASPNETCORE_ENVIRONMENT
value: "Production"
- name: ASPNETCORE_URLS
value: "http://+:8080"
envFrom:
- configMapRef:
name: motovault-config
- secretRef:
name: motovault-secrets
resources:
requests:
memory: "512Mi"
cpu: "250m"
limits:
memory: "1Gi"
cpu: "500m"
readinessProbe:
httpGet:
path: /health/ready
port: 8080
initialDelaySeconds: 10
periodSeconds: 5
timeoutSeconds: 3
failureThreshold: 3
livenessProbe:
httpGet:
path: /health/live
port: 8080
initialDelaySeconds: 30
periodSeconds: 10
timeoutSeconds: 5
failureThreshold: 3
securityContext:
allowPrivilegeEscalation: false
readOnlyRootFilesystem: true
capabilities:
drop:
- ALL
volumeMounts:
- name: tmp-volume
mountPath: /tmp
- name: app-logs
mountPath: /app/logs
volumes:
- name: tmp-volume
emptyDir: {}
- name: app-logs
emptyDir: {}
terminationGracePeriodSeconds: 30
---
apiVersion: v1
kind: Service
metadata:
name: motovault-service
namespace: motovault
labels:
app: motovault
spec:
type: ClusterIP
ports:
- port: 80
targetPort: 8080
protocol: TCP
name: http
selector:
app: motovault
---
apiVersion: policy/v1
kind: PodDisruptionBudget
metadata:
name: motovault-pdb
namespace: motovault
spec:
minAvailable: 2
selector:
matchLabels:
app: motovault
```
### Horizontal Pod Autoscaler Configuration
```yaml
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
name: motovault-hpa
namespace: motovault
spec:
scaleTargetRef:
apiVersion: apps/v1
kind: Deployment
name: motovault-app
minReplicas: 3
maxReplicas: 10
metrics:
- type: Resource
resource:
name: cpu
target:
type: Utilization
averageUtilization: 70
- type: Resource
resource:
name: memory
target:
type: Utilization
averageUtilization: 80
behavior:
scaleUp:
stabilizationWindowSeconds: 300
policies:
- type: Percent
value: 100
periodSeconds: 15
scaleDown:
stabilizationWindowSeconds: 300
policies:
- type: Percent
value: 10
periodSeconds: 60
```
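To reason about these thresholds it helps to know how the autoscaler picks a replica count: for each metric it computes `ceil(currentReplicas * currentValue / targetValue)`, takes the largest result across metrics, and clamps it to the `minReplicas`/`maxReplicas` range. The sketch below reproduces that calculation for a single metric in integer arithmetic. It is a simplification that ignores the controller's tolerance band and the `behavior` policies, and `hpa_desired_replicas` is a hypothetical helper name.

```bash
# Arguments: current replicas, observed utilization (%), target utilization (%),
# minReplicas, maxReplicas.
hpa_desired_replicas() {
  local current="$1" utilization="$2" target="$3" min="$4" max="$5"
  # Integer ceiling of current * utilization / target.
  local desired=$(( (current * utilization + target - 1) / target ))
  if (( desired < min )); then desired="$min"; fi
  if (( desired > max )); then desired="$max"; fi
  echo "$desired"
}

# Three pods averaging 90% CPU against the 70% target.
hpa_desired_replicas 3 90 70 3 10
```

With three pods at 90% CPU the result is 4 replicas. Low utilization never drops the count below the floor of 3, and a large spike is capped at 10.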
### Implementation Tasks
#### 1. Create production namespace with security policies
```yaml
apiVersion: v1
kind: Namespace
metadata:
name: motovault
labels:
pod-security.kubernetes.io/enforce: restricted
pod-security.kubernetes.io/audit: restricted
pod-security.kubernetes.io/warn: restricted
```
#### 2. Configure resource quotas and limits
```yaml
apiVersion: v1
kind: ResourceQuota
metadata:
name: motovault-quota
namespace: motovault
spec:
hard:
requests.cpu: "4"
requests.memory: 8Gi
limits.cpu: "8"
limits.memory: 16Gi
persistentvolumeclaims: "10"
pods: "20"
```
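The quota has to cover every workload this plan places in the `motovault` namespace, not only the application pods. The sketch below adds up the CPU requests declared in the Phase 2 and Phase 3 manifests (PostgreSQL 3 x 1000m, Redis 6 x 250m, application 3 x 250m) and compares the total with `requests.cpu`. It leaves out workloads whose requests are not stated in these manifests, and `total_millicores` is a hypothetical helper.

```bash
# Sum "replicas, per-pod CPU request in millicores" pairs.
# Usage: total_millicores 3 1000 6 250 3 250
total_millicores() {
  local total=0
  while (( $# >= 2 )); do
    total=$(( total + $1 * $2 ))
    shift 2
  done
  echo "$total"
}

quota_millicores=4000                            # requests.cpu: "4"
requested=$(total_millicores 3 1000 6 250 3 250) # PostgreSQL, Redis, application
echo "requested=${requested}m quota=${quota_millicores}m"
if (( requested > quota_millicores )); then
  echo "CPU requests exceed the namespace quota"
fi
```

With these figures the declared requests come to 5250m against a 4000m quota, so either the quota or the per-pod requests need adjusting before all components can be scheduled in one namespace. The same check applies to memory, limits, pods, and PersistentVolumeClaims.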
#### 3. Set up service accounts and RBAC
```yaml
apiVersion: v1
kind: ServiceAccount
metadata:
name: motovault-service-account
namespace: motovault
---
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
name: motovault-role
namespace: motovault
rules:
- apiGroups: [""]
resources: ["configmaps", "secrets"]
verbs: ["get", "list"]
---
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
name: motovault-rolebinding
namespace: motovault
subjects:
- kind: ServiceAccount
name: motovault-service-account
namespace: motovault
roleRef:
kind: Role
name: motovault-role
apiGroup: rbac.authorization.k8s.io
```
#### 4. Configure pod anti-affinity for high availability
- Spread pods across nodes and availability zones
- Ensure no single point of failure
- Optimize for both performance and availability
#### 5. Implement rolling update strategy with zero downtime
- Configure progressive rollout with health checks
- Automatic rollback on failure
- Canary deployment capabilities
## 3.2 Ingress and TLS Configuration
**Objective**: Configure secure external access with proper TLS termination and routing.
### Ingress Configuration
```yaml
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
name: motovault-ingress
namespace: motovault
annotations:
nginx.ingress.kubernetes.io/ssl-redirect: "true"
nginx.ingress.kubernetes.io/force-ssl-redirect: "true"
nginx.ingress.kubernetes.io/proxy-body-size: "50m"
nginx.ingress.kubernetes.io/proxy-read-timeout: "300"
nginx.ingress.kubernetes.io/proxy-send-timeout: "300"
cert-manager.io/cluster-issuer: "letsencrypt-prod"
nginx.ingress.kubernetes.io/limit-rpm: "100"
spec:
ingressClassName: nginx
tls:
- hosts:
- motovault.example.com
secretName: motovault-tls
rules:
- host: motovault.example.com
http:
paths:
- path: /
pathType: Prefix
backend:
service:
name: motovault-service
port:
number: 80
```
### TLS Certificate Management
```yaml
apiVersion: cert-manager.io/v1
kind: ClusterIssuer
metadata:
name: letsencrypt-prod
spec:
acme:
server: https://acme-v02.api.letsencrypt.org/directory
email: admin@motovault.example.com
privateKeySecretRef:
name: letsencrypt-prod
solvers:
- http01:
ingress:
class: nginx
```
### Implementation Tasks
#### 1. Deploy cert-manager for automated TLS
```bash
kubectl apply -f https://github.com/cert-manager/cert-manager/releases/download/v1.13.0/cert-manager.yaml
```
#### 2. Configure Let's Encrypt for SSL certificates
- Automated certificate provisioning and renewal
- DNS-01 or HTTP-01 challenge configuration
- Certificate monitoring and alerting
#### 3. Set up WAF and DDoS protection, starting with a NetworkPolicy that admits only ingress-controller traffic
```yaml
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
name: motovault-ingress-policy
namespace: motovault
spec:
podSelector:
matchLabels:
app: motovault
policyTypes:
- Ingress
ingress:
- from:
- namespaceSelector:
matchLabels:
name: nginx-ingress
ports:
- protocol: TCP
port: 8080
```
#### 4. Configure rate limiting and security headers
- Request rate limiting per IP
- Security headers (HSTS, CSP, etc.)
- Request size limitations
#### 5. Set up health check endpoints for load balancer
- Configure ingress health checks
- Implement graceful degradation
- Monitor certificate expiration
## 3.3 Monitoring and Observability Setup
**Objective**: Implement comprehensive monitoring, logging, and alerting for production operations.
### Prometheus ServiceMonitor Configuration
```yaml
apiVersion: monitoring.coreos.com/v1
kind: ServiceMonitor
metadata:
name: motovault-metrics
namespace: motovault
labels:
app: motovault
spec:
selector:
matchLabels:
app: motovault
endpoints:
- port: http
path: /metrics
interval: 30s
scrapeTimeout: 10s
```
### Application Metrics Implementation
```csharp
public class MetricsService
{
private readonly Counter _httpRequestsTotal;
private readonly Histogram _httpRequestDuration;
private readonly Gauge _activeConnections;
private readonly Counter _databaseOperationsTotal;
private readonly Histogram _databaseOperationDuration;
public MetricsService()
{
_httpRequestsTotal = Metrics.CreateCounter(
"motovault_http_requests_total",
"Total number of HTTP requests",
new[] { "method", "endpoint", "status_code" });
_httpRequestDuration = Metrics.CreateHistogram(
"motovault_http_request_duration_seconds",
"Duration of HTTP requests in seconds",
new[] { "method", "endpoint" });
_activeConnections = Metrics.CreateGauge(
"motovault_active_connections",
"Number of active database connections");
_databaseOperationsTotal = Metrics.CreateCounter(
"motovault_database_operations_total",
"Total number of database operations",
new[] { "operation", "table", "status" });
_databaseOperationDuration = Metrics.CreateHistogram(
"motovault_database_operation_duration_seconds",
"Duration of database operations in seconds",
new[] { "operation", "table" });
}
public void RecordHttpRequest(string method, string endpoint, int statusCode, double duration)
{
_httpRequestsTotal.WithLabels(method, endpoint, statusCode.ToString()).Inc();
_httpRequestDuration.WithLabels(method, endpoint).Observe(duration);
}
public void RecordDatabaseOperation(string operation, string table, bool success, double duration)
{
var status = success ? "success" : "error";
_databaseOperationsTotal.WithLabels(operation, table, status).Inc();
_databaseOperationDuration.WithLabels(operation, table).Observe(duration);
}
}
```
### Grafana Dashboard Configuration
```json
{
"dashboard": {
"title": "MotoVaultPro Application Dashboard",
"panels": [
{
"title": "HTTP Request Rate",
"type": "graph",
"targets": [
{
"expr": "rate(motovault_http_requests_total[5m])",
"legendFormat": "{{method}} {{endpoint}}"
}
]
},
{
"title": "Response Time Percentiles",
"type": "graph",
"targets": [
{
"expr": "histogram_quantile(0.50, rate(motovault_http_request_duration_seconds_bucket[5m]))",
"legendFormat": "50th percentile"
},
{
"expr": "histogram_quantile(0.95, rate(motovault_http_request_duration_seconds_bucket[5m]))",
"legendFormat": "95th percentile"
}
]
},
{
"title": "Database Connection Pool",
"type": "singlestat",
"targets": [
{
"expr": "motovault_active_connections",
"legendFormat": "Active Connections"
}
]
},
{
"title": "Error Rate",
"type": "graph",
"targets": [
{
"expr": "rate(motovault_http_requests_total{status_code=~\"5..\"}[5m])",
"legendFormat": "5xx errors"
}
]
}
]
}
}
```
### Prometheus Alerting Rules
```yaml
groups:
- name: motovault.rules
rules:
- alert: HighErrorRate
expr: rate(motovault_http_requests_total{status_code=~"5.."}[5m]) > 0.1
for: 2m
labels:
severity: critical
annotations:
summary: "High error rate detected"
description: "5xx responses are occurring at {{ $value }} requests/s over the last 5 minutes"
- alert: HighResponseTime
expr: histogram_quantile(0.95, rate(motovault_http_request_duration_seconds_bucket[5m])) > 2
for: 5m
labels:
severity: warning
annotations:
summary: "High response time detected"
description: "95th percentile response time is {{ $value }}s"
- alert: DatabaseConnectionPoolExhaustion
expr: motovault_active_connections > 80
for: 2m
labels:
severity: warning
annotations:
summary: "Database connection pool nearly exhausted"
description: "Active connections: {{ $value }}/100"
- alert: PodCrashLooping
expr: rate(kube_pod_container_status_restarts_total{namespace="motovault"}[15m]) > 0
for: 5m
labels:
severity: critical
annotations:
summary: "Pod is crash looping"
description: "Pod {{ $labels.pod }} is restarting frequently"
```
### Implementation Tasks
#### 1. Deploy Prometheus and Grafana stack
```bash
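# Installs the Prometheus Operator only; Prometheus, Alertmanager, and Grafana instances are deployed separately.
# Server-side apply avoids the annotation size limit that client-side apply hits on the operator's large CRDs.
kubectl apply --server-side -f https://raw.githubusercontent.com/prometheus-operator/prometheus-operator/main/bundle.yaml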
```
#### 2. Configure application metrics endpoints
- Add Prometheus metrics middleware
- Implement custom business metrics
- Configure metric collection intervals
#### 3. Set up centralized logging with structured logs
```csharp
builder.Services.AddLogging(loggingBuilder =>
{
loggingBuilder.AddJsonConsole(options =>
{
options.JsonWriterOptions = new JsonWriterOptions { Indented = false };
options.IncludeScopes = true;
options.TimestampFormat = "yyyy-MM-ddTHH:mm:ss.fffZ";
options.UseUtcTimestamp = true; // The trailing "Z" is only correct for UTC timestamps
});
});
```
#### 4. Create operational dashboards and alerts
- Application performance dashboards
- Infrastructure monitoring dashboards
- Business metrics and KPIs
- Alert routing and escalation
#### 5. Implement distributed tracing
```csharp
services.AddOpenTelemetry()
.WithTracing(builder =>
{
builder
.AddAspNetCoreInstrumentation()
.AddNpgsql()
.AddRedisInstrumentation()
.AddJaegerExporter();
});
```
## 3.4 Backup and Disaster Recovery
**Objective**: Implement comprehensive backup strategies and disaster recovery procedures.
### Velero Backup Configuration
```yaml
apiVersion: velero.io/v1
kind: Schedule
metadata:
name: motovault-daily-backup
namespace: velero
spec:
schedule: "0 2 * * *" # Daily at 2 AM
template:
includedNamespaces:
- motovault
includedResources:
- "*"
storageLocation: default
ttl: 720h0m0s # 30 days
snapshotVolumes: true
---
apiVersion: velero.io/v1
kind: Schedule
metadata:
name: motovault-weekly-backup
namespace: velero
spec:
schedule: "0 3 * * 0" # Weekly on Sunday at 3 AM
template:
includedNamespaces:
- motovault
includedResources:
- "*"
storageLocation: default
ttl: 2160h0m0s # 90 days
snapshotVolumes: true
```
### Database Backup Strategy
```bash
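#!/bin/bash
# Automated database backup script
set -euo pipefail # Stop on the first failed step so a failed dump is never compressed and uploaded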
BACKUP_DATE=$(date +%Y%m%d_%H%M%S)
BACKUP_FILE="motovault_backup_${BACKUP_DATE}.sql"
S3_BUCKET="motovault-backups"
# Create database backup
kubectl exec -n motovault motovault-postgres-1 -- \
pg_dump -U postgres motovault > "${BACKUP_FILE}"
# Compress backup
gzip "${BACKUP_FILE}"
# Upload to S3/MinIO
aws s3 cp "${BACKUP_FILE}.gz" "s3://${S3_BUCKET}/database/"
# Clean up local file
rm "${BACKUP_FILE}.gz"
# Retain only last 30 days of backups
aws s3api list-objects-v2 \
--bucket "${S3_BUCKET}" \
--prefix "database/" \
--query 'Contents[?LastModified<=`'$(date -d "30 days ago" --iso-8601)'`].[Key]' \
--output text | \
xargs -I {} aws s3 rm "s3://${S3_BUCKET}/{}"
```
### Disaster Recovery Procedures
```bash
#!/bin/bash
# Full system recovery script
set -euo pipefail # Abort on the first failed step instead of reporting success
BACKUP_DATE="${1:-}"
if [ -z "$BACKUP_DATE" ]; then
echo "Usage: $0 <backup_date>"
echo "Example: $0 20240120_020000"
exit 1
fi
# Stop application
echo "Scaling down application..."
kubectl scale deployment motovault-app --replicas=0 -n motovault
# Restore database
echo "Restoring database from backup..."
aws s3 cp "s3://motovault-backups/database/motovault_backup_${BACKUP_DATE}.sql.gz" .
gunzip "motovault_backup_${BACKUP_DATE}.sql.gz"
kubectl exec -i motovault-postgres-1 -n motovault -- \
psql -U postgres -d motovault < "motovault_backup_${BACKUP_DATE}.sql"
# Restore MinIO data
echo "Restoring MinIO data..."
aws s3 sync "s3://motovault-backups/minio/${BACKUP_DATE}/" /tmp/minio_restore/
mc mirror /tmp/minio_restore/ motovault-minio/motovault-files/
# Restart application
echo "Scaling up application..."
kubectl scale deployment motovault-app --replicas=3 -n motovault
# Verify health
echo "Waiting for application to be ready..."
kubectl wait --for=condition=ready pod -l app=motovault -n motovault --timeout=300s
echo "Recovery completed successfully"
```
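The backup script uploads each dump as `motovault_backup_<timestamp>.sql.gz` under the `database/` prefix, so a restore has to request exactly that key. Deriving the key in one shared function keeps the two scripts from drifting apart. The sketch below is one way to do that; `backup_object_key` and `is_valid_backup_date` are hypothetical helpers.

```bash
# Object key for a database dump taken at the given timestamp
# (format YYYYMMDD_HHMMSS, as produced by `date +%Y%m%d_%H%M%S`).
backup_object_key() {
  local backup_date="$1"
  echo "database/motovault_backup_${backup_date}.sql.gz"
}

# Return 0 only for a well-formed YYYYMMDD_HHMMSS timestamp.
is_valid_backup_date() {
  [[ "$1" =~ ^[0-9]{8}_[0-9]{6}$ ]]
}
```

A recovery script could call `is_valid_backup_date "$BACKUP_DATE"` before scaling the application down, so a mistyped date fails fast rather than after the outage has started.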
### Implementation Tasks
#### 1. Deploy Velero for Kubernetes backup
```bash
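# ./credentials-velero is a placeholder path to an AWS-style credentials file;
# velero install requires either --secret-file or --no-secret.
velero install \
--provider aws \
--plugins velero/velero-plugin-for-aws:v1.7.0 \
--bucket motovault-backups \
--secret-file ./credentials-velero \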
--backup-location-config region=us-west-2 \
--snapshot-location-config region=us-west-2
```
#### 2. Configure automated database backups
- Point-in-time recovery setup
- Incremental backup strategies
- Cross-region backup replication
#### 3. Implement MinIO backup synchronization
- Automated file backup to external storage
- Metadata backup and restoration
- Verification of backup integrity
#### 4. Create disaster recovery runbooks
- Step-by-step recovery procedures
- RTO/RPO definitions and testing
- Contact information and escalation procedures
#### 5. Set up backup monitoring and alerting
```yaml
apiVersion: monitoring.coreos.com/v1
kind: PrometheusRule
metadata:
name: backup-alerts
spec:
groups:
- name: backup.rules
rules:
- alert: BackupFailed
expr: increase(velero_backup_failure_total[24h]) > 0
labels:
severity: critical
annotations:
summary: "Backup operation failed"
description: "Velero backup has failed"
```
## Week-by-Week Breakdown
### Week 9: Production Kubernetes Configuration
- **Days 1-2**: Create production deployment manifests
- **Days 3-4**: Configure HPA, PDB, and resource quotas
- **Days 5-7**: Set up RBAC and security policies
### Week 10: Ingress and TLS Setup
- **Days 1-2**: Deploy and configure ingress controller
- **Days 3-4**: Set up cert-manager and TLS certificates
- **Days 5-7**: Configure security policies and rate limiting
### Week 11: Monitoring and Observability
- **Days 1-3**: Deploy Prometheus and Grafana stack
- **Days 4-5**: Configure application metrics and dashboards
- **Days 6-7**: Set up alerting and notification channels
### Week 12: Backup and Migration Preparation
- **Days 1-3**: Deploy and configure backup solutions
- **Days 4-5**: Create migration scripts and procedures
- **Days 6-7**: Execute migration dry runs and validation
## Success Criteria
- [ ] Production Kubernetes deployment with 99.9% availability
- [ ] Secure ingress with automated TLS certificate management
- [ ] Comprehensive monitoring with alerting
- [ ] Automated backup and recovery procedures tested
- [ ] Migration procedures validated and documented
- [ ] Security policies and network controls implemented
- [ ] Performance baselines established and monitored
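
The 99.9% availability target translates into a concrete downtime budget, which is useful when sizing maintenance windows and recovery time objectives. The sketch below computes it in integer arithmetic; the 30-day period and the `downtime_budget_seconds` helper are assumptions for illustration.

```bash
# Allowed downtime in seconds for a period, with availability given in
# thousandths of a percent (99.9% -> 99900) to stay in integer arithmetic.
downtime_budget_seconds() {
  local days="$1" availability_milli_percent="$2"
  local period_seconds=$(( days * 86400 ))
  echo $(( period_seconds * (100000 - availability_milli_percent) / 100000 ))
}

budget=$(downtime_budget_seconds 30 99900)
echo "99.9% over 30 days allows ${budget}s (about $(( budget / 60 )) minutes) of downtime"
```

At 99.9% the budget is 2592 seconds, roughly 43 minutes per 30 days, so a single recovery that takes longer than that uses up the whole month's allowance.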
## Testing Requirements
### Production Readiness Tests
- Load testing under expected traffic patterns
- Failover testing for all components
- Security penetration testing
- Backup and recovery validation
### Performance Tests
- Application response time under load
- Database performance with connection pooling
- Cache performance and hit ratios
- Network latency and throughput
### Security Tests
- Container image vulnerability scanning
- Network policy validation
- Authentication and authorization testing
- TLS configuration verification
## Deliverables
1. **Production Deployment**
- Complete Kubernetes manifests
- Security configurations
- Monitoring and alerting setup
- Backup and recovery procedures
2. **Documentation**
- Operational runbooks
- Security procedures
- Monitoring guides
- Disaster recovery plans
3. **Migration Tools**
- Data migration scripts
- Validation tools
- Rollback procedures
## Dependencies
- Production Kubernetes cluster
- External storage for backups
- DNS management for ingress
- Certificate authority for TLS
- Monitoring infrastructure
## Risks and Mitigations
### Risk: Extended Downtime During Migration
**Mitigation**: Blue-green deployment strategy with comprehensive rollback plan
### Risk: Data Integrity Issues
**Mitigation**: Extensive validation and parallel running during transition
### Risk: Performance Degradation
**Mitigation**: Load testing and gradual traffic migration
---
**Previous Phase**: [Phase 2: High Availability Infrastructure](K8S-PHASE-2.md)
**Next Phase**: [Phase 4: Advanced Features and Optimization](K8S-PHASE-4.md)

885
docs/K8S-PHASE-4.md Normal file

@@ -0,0 +1,885 @@
# Phase 4: Advanced Features and Optimization (Weeks 13-16)
This phase focuses on advanced cloud-native features, performance optimization, security enhancements, and final production migration.
## Overview
Phase 4 elevates MotoVaultPro to a truly cloud-native application with enterprise-grade features including advanced caching strategies, performance optimization, enhanced security, and seamless production migration. This phase ensures the system is optimized for scale, security, and operational excellence.
## Key Objectives
- **Advanced Caching Strategies**: Multi-layer caching for optimal performance
- **Performance Optimization**: Database and application tuning for high load
- **Security Enhancements**: Advanced security features and compliance
- **Production Migration**: Final cutover and optimization
- **Operational Excellence**: Advanced monitoring and automation
## 4.1 Advanced Caching Strategies
**Objective**: Implement multi-layer caching for optimal performance and reduced database load.
### Cache Architecture
```
┌─────────────────┐    ┌─────────────────┐    ┌─────────────────┐
│     Browser     │    │    CDN/Proxy    │    │   Application   │
│      Cache      │◄──►│      Cache      │◄──►│  Memory Cache   │
│    (Static)     │    │    (Static +    │    │      (L1)       │
│                 │    │    Dynamic)     │    │                 │
└─────────────────┘    └─────────────────┘    └─────────────────┘
                                              ┌─────────────────┐
                                              │   Redis Cache   │
                                              │      (L2)       │
                                              │   Distributed   │
                                              └─────────────────┘
                                              ┌─────────────────┐
                                              │    Database     │
                                              │    (Source)     │
                                              │                 │
                                              └─────────────────┘
```
### Multi-Level Cache Service Implementation
```csharp
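using System.Text.Json;
using Microsoft.Extensions.Caching.Distributed;
using Microsoft.Extensions.Caching.Memory;
using Microsoft.Extensions.Logging;

public class MultiLevelCacheService
{
    // A short L1 lifetime bounds how long one pod can serve data that another pod has invalidated
    private static readonly TimeSpan L1Expiration = TimeSpan.FromMinutes(5);

    private readonly IMemoryCache _memoryCache;
    private readonly IDistributedCache _distributedCache;
    private readonly ILogger<MultiLevelCacheService> _logger;

    public MultiLevelCacheService(
        IMemoryCache memoryCache,
        IDistributedCache distributedCache,
        ILogger<MultiLevelCacheService> logger)
    {
        _memoryCache = memoryCache;
        _distributedCache = distributedCache;
        _logger = logger;
    }

    public async Task<T> GetAsync<T>(string key, Func<Task<T>> factory, TimeSpan? expiration = null)
    {
        // L1 Cache - Memory
        if (_memoryCache.TryGetValue(key, out T cachedValue))
        {
            _logger.LogDebug("Cache hit (L1): {Key}", key);
            return cachedValue;
        }

        // L2 Cache - Redis
        var distributedValue = await _distributedCache.GetStringAsync(key);
        if (distributedValue != null)
        {
            var deserializedValue = JsonSerializer.Deserialize<T>(distributedValue);
            _memoryCache.Set(key, deserializedValue, L1Expiration);
            _logger.LogDebug("Cache hit (L2): {Key}", key);
            return deserializedValue;
        }

        // Cache miss - fetch from source
        _logger.LogDebug("Cache miss: {Key}", key);
        var value = await factory();

        // Store in both cache levels. An absolute expiration is used so that
        // frequently read keys are still refreshed from the source eventually;
        // a sliding expiration would keep a hot key alive indefinitely.
        var serializedValue = JsonSerializer.Serialize(value);
        await _distributedCache.SetStringAsync(key, serializedValue, new DistributedCacheEntryOptions
        {
            AbsoluteExpirationRelativeToNow = expiration ?? TimeSpan.FromHours(1)
        });
        _memoryCache.Set(key, value, L1Expiration);
        return value;
    }
}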
```
### Cache Invalidation Strategy
```csharp
public class CacheInvalidationService
{
    private readonly IDistributedCache _distributedCache;
    private readonly IMemoryCache _memoryCache;
    private readonly ILogger<CacheInvalidationService> _logger;

    public async Task InvalidatePatternAsync(string pattern)
    {
        // IDistributedCache has no pattern lookup; GetKeysMatchingPatternAsync (not shown)
        // has to query Redis directly, for example with SCAN and a MATCH pattern
        var keys = await GetKeysMatchingPatternAsync(pattern);
        var tasks = keys.Select(async key =>
        {
            await _distributedCache.RemoveAsync(key);
            // Clears only this pod's L1 entry; other replicas keep theirs until the L1 TTL expires
            _memoryCache.Remove(key);
            _logger.LogDebug("Invalidated cache key: {Key}", key);
        });
        await Task.WhenAll(tasks);
    }

    public async Task InvalidateVehicleDataAsync(int vehicleId)
    {
        var patterns = new[]
        {
            $"vehicle:{vehicleId}:*",
            $"dashboard:*:{vehicleId}",    // dashboard keys are "dashboard:{userId}:{vehicleId}"
            $"reports:{vehicleId}:*",
            $"fuel_report:{vehicleId}:*"   // cached fuel efficiency reports
        };

        foreach (var pattern in patterns)
        {
            await InvalidatePatternAsync(pattern);
        }
    }
}
```
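`IDistributedCache` offers no pattern lookup, so the `GetKeysMatchingPatternAsync` helper used above has to talk to Redis directly. As a rough operational sketch (not the application's implementation), the same kind of pattern can be checked or cleared from a shell. The function name `invalidation_commands` is hypothetical, and it only prints the commands (a dry run) rather than running them. It assumes GNU `xargs` for the `-r` flag and Redis 4 or later for `UNLINK`.

```bash
#!/usr/bin/env bash
# Dry-run helper: print the redis-cli pipeline that would remove every key
# matching each pattern. --scan iterates with SCAN, so it does not block
# the server the way KEYS does. Keys are unlinked one per call so that
# keys living in different cluster slots do not cause CROSSSLOT errors.
invalidation_commands() {
  local pattern
  for pattern in "$@"; do
    printf "redis-cli --scan --pattern '%s' | xargs -r -n 1 redis-cli unlink\n" "$pattern"
  done
}

# Example: show the command for one vehicle's cache entries
invalidation_commands "vehicle:42:*"
```

If the distributed cache is registered with an instance name, the stored keys carry that prefix, so the patterns would need the same prefix. In a Redis Cluster, `--scan` only walks the node it is connected to, so the command would have to be repeated per node.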
### Implementation Tasks
#### 1. Implement intelligent cache warming
```csharp
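public class CacheWarmupService : BackgroundService
{
    // Dependencies are injected through the constructor (omitted here);
    // IDashboardService stands in for the application's dashboard service
    private readonly MultiLevelCacheService _cacheService;
    private readonly IDashboardService _dashboardService;
    private readonly ILogger<CacheWarmupService> _logger;

    protected override async Task ExecuteAsync(CancellationToken stoppingToken)
    {
        while (!stoppingToken.IsCancellationRequested)
        {
            try
            {
                await WarmupFrequentlyAccessedData();
            }
            catch (Exception ex)
            {
                // A failed warm-up must not stop the background service
                _logger.LogWarning(ex, "Cache warm-up failed");
            }
            await Task.Delay(TimeSpan.FromHours(1), stoppingToken);
        }
    }

    private async Task WarmupFrequentlyAccessedData()
    {
        // Pre-load dashboard data for active users (GetActiveUsersAsync is not shown)
        var activeUsers = await GetActiveUsersAsync();
        var warmupTasks = activeUsers.Select(async user =>
        {
            await _cacheService.GetAsync($"dashboard:{user.Id}",
                () => _dashboardService.GetDashboardDataAsync(user.Id));
        });
        await Task.WhenAll(warmupTasks);
    }
}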
```
#### 2. Configure CDN integration for static assets
```yaml
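apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: motovault-cdn-ingress
  annotations:
    # Snippet annotations must be allowed on the ingress-nginx controller
    # (allow-snippet-annotations); newer releases disable them by default.
    # One-year immutable caching is only safe for fingerprinted asset URLs.
    nginx.ingress.kubernetes.io/configuration-snippet: |
      add_header Cache-Control "public, max-age=31536000, immutable";
      add_header X-Cache-Status $upstream_cache_status;
spec:
  rules:
    - host: cdn.motovault.example.com
      http:
        paths:
          - path: /static
            pathType: Prefix
            backend:
              service:
                name: motovault-service
                port:
                  number: 80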
```
#### 3. Implement cache monitoring and metrics
```csharp
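// Assumes the prometheus-net package; metric names are illustrative.
// As an IMiddleware implementation the class must also be registered in DI.
public class CacheMetricsMiddleware : IMiddleware
{
    private static readonly Counter _cacheHits =
        Metrics.CreateCounter("motovault_cache_hits_total", "Number of cache hits");
    private static readonly Counter _cacheMisses =
        Metrics.CreateCounter("motovault_cache_misses_total", "Number of cache misses");
    private static readonly Histogram _cacheLatency =
        Metrics.CreateHistogram("motovault_cache_request_duration_seconds", "Time until the response starts");

    public async Task InvokeAsync(HttpContext context, RequestDelegate next)
    {
        var stopwatch = Stopwatch.StartNew();

        // Track cache operations during request.
        // X-Cache-Status has to be set by application code earlier in the pipeline:
        // a header added by the NGINX ingress is applied after the response has left the pod.
        context.Response.OnStarting(() =>
        {
            var cacheStatus = context.Response.Headers["X-Cache-Status"].FirstOrDefault();
            if (cacheStatus == "HIT")
                _cacheHits.Inc();
            else if (cacheStatus == "MISS")
                _cacheMisses.Inc();

            _cacheLatency.Observe(stopwatch.Elapsed.TotalSeconds);
            return Task.CompletedTask;
        });

        await next(context);
    }
}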
```
## 4.2 Performance Optimization
**Objective**: Optimize application performance for high-load scenarios.
### Database Query Optimization
```csharp
public class OptimizedVehicleService
{
    private readonly IDbContextFactory<MotoVaultContext> _dbContextFactory;
    private readonly IMemoryCache _cache;

    public async Task<VehicleDashboardData> GetDashboardDataAsync(int userId, int vehicleId)
    {
        var cacheKey = $"dashboard:{userId}:{vehicleId}";
        if (_cache.TryGetValue(cacheKey, out VehicleDashboardData cached))
        {
            return cached;
        }

        using var context = _dbContextFactory.CreateDbContext();

        // Optimized single query with projections
        var dashboardData = await context.Vehicles
            .AsNoTracking()
            .Where(v => v.Id == vehicleId && v.UserId == userId)
            .Select(v => new VehicleDashboardData
            {
                Vehicle = v,
                RecentServices = v.ServiceRecords
                    .OrderByDescending(s => s.Date)
                    .Take(5)
                    .ToList(),
                UpcomingReminders = v.ReminderRecords
                    .Where(r => r.IsActive && r.DueDate > DateTime.Now)
                    .OrderBy(r => r.DueDate)
                    .Take(5)
                    .ToList(),
                // Nullable casts make an empty record set yield null instead of throwing
                // (assumes Efficiency converts to double and Mileage to int)
                FuelEfficiency = v.GasRecords
                    .Where(g => g.Date >= DateTime.Now.AddMonths(-3))
                    .Average(g => (double?)g.Efficiency) ?? 0,
                TotalMileage = v.OdometerRecords
                    .OrderByDescending(o => o.Date)
                    .Select(o => (int?)o.Mileage)
                    .FirstOrDefault() ?? 0
            })
            .FirstOrDefaultAsync();

        // Do not cache a miss (unknown vehicle or wrong owner)
        if (dashboardData != null)
        {
            _cache.Set(cacheKey, dashboardData, TimeSpan.FromMinutes(15));
        }
        return dashboardData;
    }
}
```
### Connection Pool Optimization
```csharp
// Configure connection pooling. Pool settings are part of the connection string;
// Npgsql does not read them from IOptions, so they are applied before UseNpgsql.
var pooledConnectionString = new NpgsqlConnectionStringBuilder(connectionString)
{
    MaxPoolSize = 100,
    MinPoolSize = 10,
    ConnectionLifetime = 300,        // seconds
    ConnectionPruningInterval = 10,  // seconds
    ConnectionIdleLifetime = 300     // seconds
}.ConnectionString;

services.AddDbContextFactory<MotoVaultContext>(options =>
{
    options.UseNpgsql(pooledConnectionString, npgsqlOptions =>
    {
        npgsqlOptions.EnableRetryOnFailure(
            maxRetryCount: 3,
            maxRetryDelay: TimeSpan.FromSeconds(5),
            errorCodesToAdd: null);
        npgsqlOptions.CommandTimeout(30);
    });

    // Keep diagnostics that expose data or add overhead switched off in production
    options.EnableSensitiveDataLogging(false);
    options.EnableServiceProviderCaching();
    options.EnableDetailedErrors(false);
}, ServiceLifetime.Singleton);
```
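The pool size applies per application instance, so the worst-case number of database connections is the replica count multiplied by `MaxPoolSize`. The sketch below works through that arithmetic; `pool_budget` is a hypothetical helper, the replica count of 5 is taken from the Rollout manifest later in this phase, and 100 is PostgreSQL's default `max_connections` (the real limit of the target server is an assumption to verify).

```bash
#!/usr/bin/env bash
# pool_budget <replicas> <max-pool-size> <postgres-max-connections>
# Prints the worst-case connection count and returns non-zero
# when that count exceeds the server limit.
pool_budget() {
  local replicas="$1" pool="$2" limit="$3"
  local total=$((replicas * pool))
  echo "worst case: ${total} connections (server limit: ${limit})"
  [ "$total" -le "$limit" ]
}

# 5 replicas x 100 pooled connections against a limit of 100
pool_budget 5 100 100 || echo "pool size exceeds the server limit"
```

With these numbers the pods could request far more connections than the server accepts, which suggests either a smaller per-pod pool, a higher `max_connections`, or a connection pooler in front of PostgreSQL.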
### Application Performance Optimization
```csharp
public class PerformanceOptimizationService
{
    // Dependencies are injected through the constructor (omitted here)
    private readonly IDbContextFactory<MotoVaultContext> _dbContextFactory;
    private readonly CacheInvalidationService _cacheInvalidation;
    private readonly MultiLevelCacheService _multiLevelCache;

    // Implement bulk operations for data modifications
    public async Task<BulkUpdateResult> BulkUpdateServiceRecordsAsync(
        List<ServiceRecord> records)
    {
        using var context = _dbContextFactory.CreateDbContext();

        // UpdateRange attaches the entities and marks them as modified;
        // SaveChangesAsync then sends the UPDATE statements in batches
        context.UpdateRange(records);
        var affectedRows = await context.SaveChangesAsync();

        // Invalidate related cache entries
        var vehicleIds = records.Select(r => r.VehicleId).Distinct();
        foreach (var vehicleId in vehicleIds)
        {
            await _cacheInvalidation.InvalidateVehicleDataAsync(vehicleId);
        }
        return new BulkUpdateResult { AffectedRows = affectedRows };
    }

    // Implement read-through cache for expensive calculations
    public async Task<FuelEfficiencyReport> GetFuelEfficiencyReportAsync(
        int vehicleId,
        DateTime startDate,
        DateTime endDate)
    {
        // Day-level precision: a month-only key would let different
        // date ranges within the same months share one cache entry
        var cacheKey = $"fuel_report:{vehicleId}:{startDate:yyyyMMdd}:{endDate:yyyyMMdd}";
        return await _multiLevelCache.GetAsync(cacheKey, async () =>
        {
            using var context = _dbContextFactory.CreateDbContext();
            var gasRecords = await context.GasRecords
                .Where(g => g.VehicleId == vehicleId &&
                            g.Date >= startDate &&
                            g.Date <= endDate)
                .AsNoTracking()
                .ToListAsync();
            return CalculateFuelEfficiencyReport(gasRecords);
        }, TimeSpan.FromHours(6));
    }
}
```
### Implementation Tasks
#### 1. Implement database indexing strategy
```sql
-- Create optimized indexes for common queries.
-- CREATE INDEX CONCURRENTLY cannot run inside a transaction block,
-- so run these statements outside of a transactional migration.
CREATE INDEX CONCURRENTLY idx_gasrecords_vehicle_date
    ON gas_records(vehicle_id, date DESC);
CREATE INDEX CONCURRENTLY idx_servicerecords_vehicle_date
    ON service_records(vehicle_id, date DESC);

-- Partial indexes for better performance.
-- is_active is already fixed by the WHERE clause, so it is not repeated as a key column.
CREATE INDEX CONCURRENTLY idx_reminderrecords_active_due
    ON reminder_records(due_date)
    WHERE is_active = true;
CREATE INDEX CONCURRENTLY idx_vehicles_active_users
    ON vehicles(user_id)
    WHERE is_active = true;
```
#### 2. Configure response compression and bundling
```csharp
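builder.Services.AddResponseCompression(options =>
{
    // Brotli is registered first so it is preferred when a client accepts both encodings
    options.Providers.Add<BrotliCompressionProvider>();
    options.Providers.Add<GzipCompressionProvider>();
    options.MimeTypes = ResponseCompressionDefaults.MimeTypes.Concat(
        new[] { "application/json", "text/css", "application/javascript" });
});

builder.Services.Configure<GzipCompressionProviderOptions>(options =>
{
    options.Level = CompressionLevel.Optimal;
});

// The services above have no effect until the middleware is added to the pipeline,
// before any middleware that writes response bodies
app.UseResponseCompression();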
```
#### 3. Implement request batching for API endpoints
```csharp
[HttpPost("batch")]
public async Task<IActionResult> BatchOperations([FromBody] BatchRequest request)
{
    // Execute operations in parallel where possible.
    // Each operation needs its own DbContext: a context instance is not thread-safe.
    var tasks = request.Operations.Select(async operation =>
    {
        try
        {
            var result = await ExecuteOperationAsync(operation);
            return new BatchResult { Success = true, Data = result };
        }
        catch (Exception ex)
        {
            // A failed operation is reported in its own result and does not fail the batch
            return new BatchResult { Success = false, Error = ex.Message };
        }
    });

    var results = await Task.WhenAll(tasks);
    return Ok(new { Results = results });
}
```
## 4.3 Security Enhancements
**Objective**: Implement advanced security features for production deployment.
### Network Security Policies
```yaml
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: motovault-network-policy
  namespace: motovault
spec:
  podSelector:
    matchLabels:
      app: motovault
  policyTypes:
    - Ingress
    - Egress
  ingress:
    - from:
        - namespaceSelector:
            matchLabels:
              # Label that Kubernetes sets automatically on every namespace
              kubernetes.io/metadata.name: nginx-ingress
      ports:
        - protocol: TCP
          port: 8080
  egress:
    - to:
        - namespaceSelector:
            matchLabels:
              kubernetes.io/metadata.name: motovault
      ports:
        - protocol: TCP
          port: 5432 # PostgreSQL
        - protocol: TCP
          port: 6379 # Redis
        - protocol: TCP
          port: 9000 # MinIO
    # DNS lookups; without this rule the pods cannot resolve service names
    - ports:
        - protocol: UDP
          port: 53
        - protocol: TCP
          port: 53
    - to: [] # Allow external HTTP(S) for OIDC
      ports:
        - protocol: TCP
          port: 443
        - protocol: TCP
          port: 80
```
### Pod Security Standards
```yaml
apiVersion: v1
kind: Namespace
metadata:
  name: motovault
  labels:
    pod-security.kubernetes.io/enforce: restricted
    pod-security.kubernetes.io/audit: restricted
    pod-security.kubernetes.io/warn: restricted
```
### External Secrets Management
```yaml
apiVersion: external-secrets.io/v1beta1
kind: SecretStore
metadata:
  name: vault-backend
  namespace: motovault
spec:
  provider:
    vault:
      server: "https://vault.example.com"
      path: "secret"
      version: "v2"
      auth:
        kubernetes:
          mountPath: "kubernetes"
          role: "motovault-role"
---
apiVersion: external-secrets.io/v1beta1
kind: ExternalSecret
metadata:
  name: motovault-secrets
  namespace: motovault
spec:
  refreshInterval: 1h
  secretStoreRef:
    name: vault-backend
    kind: SecretStore
  target:
    name: motovault-secrets
    creationPolicy: Owner
  data:
    - secretKey: POSTGRES_CONNECTION
      remoteRef:
        key: motovault/database
        property: connection_string
    - secretKey: JWT_SECRET
      remoteRef:
        key: motovault/auth
        property: jwt_secret
```
### Application Security Enhancements
```csharp
// As an IMiddleware implementation the class must also be registered in DI
public class SecurityMiddleware : IMiddleware
{
    public async Task InvokeAsync(HttpContext context, RequestDelegate next)
    {
        // Add security headers. The indexer overwrites an existing value,
        // whereas Headers.Add throws if the header is already present.
        var headers = context.Response.Headers;
        headers["X-Content-Type-Options"] = "nosniff";
        headers["X-Frame-Options"] = "DENY";
        // The legacy browser XSS filter is deprecated; "0" switches it off where it still exists
        headers["X-XSS-Protection"] = "0";
        headers["Referrer-Policy"] = "strict-origin-when-cross-origin";
        headers["Permissions-Policy"] = "geolocation=(), microphone=(), camera=()";

        // Content Security Policy
        // ('unsafe-inline' weakens script protection; remove it once inline scripts are gone)
        var csp = "default-src 'self'; " +
                  "script-src 'self' 'unsafe-inline' https://cdn.jsdelivr.net; " +
                  "style-src 'self' 'unsafe-inline' https://cdn.jsdelivr.net; " +
                  "img-src 'self' data: https:; " +
                  "connect-src 'self';";
        headers["Content-Security-Policy"] = csp;

        await next(context);
    }
}
```
### Implementation Tasks
#### 1. Implement container image scanning
```yaml
apiVersion: argoproj.io/v1alpha1
kind: Workflow
metadata:
  name: security-scan
spec:
  entrypoint: scan-workflow
  templates:
    - name: scan-workflow
      steps:
        - - name: trivy-scan
            template: trivy-container-scan
        - - name: publish-results
            template: publish-scan-results # template definition not shown here
    - name: trivy-container-scan
      container:
        image: aquasec/trivy:latest
        command: [trivy]
        args: ["image", "--exit-code", "1", "--severity", "HIGH,CRITICAL", "motovault:latest"]
```
#### 2. Configure security monitoring and alerting
```yaml
apiVersion: monitoring.coreos.com/v1
kind: PrometheusRule
metadata:
  name: security-alerts
spec:
  groups:
    - name: security.rules
      rules:
        - alert: HighFailedLoginAttempts
          expr: rate(motovault_failed_login_attempts_total[5m]) > 10
          labels:
            severity: warning
          annotations:
            summary: "High number of failed login attempts"
            description: "{{ $value }} failed login attempts per second"
        - alert: SuspiciousNetworkActivity
          expr: rate(container_network_receive_bytes_total{namespace="motovault"}[5m]) > 1e8
          labels:
            severity: critical
          annotations:
            summary: "Unusual network activity detected"
```
#### 3. Implement rate limiting and DDoS protection
```csharp
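services.AddRateLimiter(options =>
{
    options.RejectionStatusCode = StatusCodes.Status429TooManyRequests;

    // A named limiter is a single shared bucket: this caps the API as a whole, not each client
    options.AddFixedWindowLimiter("api", limiterOptions =>
    {
        limiterOptions.PermitLimit = 100;
        limiterOptions.Window = TimeSpan.FromMinutes(1);
        limiterOptions.QueueProcessingOrder = QueueProcessingOrder.OldestFirst;
        limiterOptions.QueueLimit = 10;
    });

    // Login attempts are limited per client address so that one abusive client
    // cannot use up the budget for everyone else. Behind an ingress, forwarded
    // headers must be processed for RemoteIpAddress to be the real client address.
    options.AddPolicy("login", httpContext =>
        RateLimitPartition.GetSlidingWindowLimiter(
            partitionKey: httpContext.Connection.RemoteIpAddress?.ToString() ?? "unknown",
            factory: _ => new SlidingWindowRateLimiterOptions
            {
                PermitLimit = 5,
                Window = TimeSpan.FromMinutes(5),
                SegmentsPerWindow = 5
            }));
});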
```
## 4.4 Production Migration Execution
**Objective**: Execute seamless production migration with minimal downtime.
### Blue-Green Deployment Strategy
```yaml
apiVersion: argoproj.io/v1alpha1
kind: Rollout
metadata:
  name: motovault-rollout
  namespace: motovault
spec:
  replicas: 5
  strategy:
    blueGreen:
      activeService: motovault-active
      previewService: motovault-preview
      autoPromotionEnabled: false
      scaleDownDelaySeconds: 30
      prePromotionAnalysis:
        templates:
          - templateName: health-check
        args:
          - name: service-name
            value: motovault-preview
      postPromotionAnalysis:
        templates:
          - templateName: performance-check
        args:
          - name: service-name
            value: motovault-active
  selector:
    matchLabels:
      app: motovault
  template:
    metadata:
      labels:
        app: motovault
    spec:
      containers:
        - name: motovault
          image: motovault:latest
          # ... container specification
```
### Migration Validation Scripts
```bash
#!/bin/bash
# Production migration validation script
set -euo pipefail

echo "Starting production migration validation..."

# Validate database connectivity through the application's readiness endpoint
# (requires curl inside the application image)
echo "Checking database connectivity..."
kubectl exec -n motovault deployment/motovault-app -- \
  curl -f http://localhost:8080/health/ready || exit 1

# Validate MinIO connectivity
echo "Checking MinIO connectivity..."
kubectl exec -n motovault deployment/motovault-app -- \
  curl -f http://minio-service:9000/minio/health/live || exit 1

# Validate Redis connectivity
echo "Checking Redis connectivity..."
kubectl exec -n motovault redis-cluster-0 -- \
  redis-cli ping || exit 1

# Test critical user journeys
echo "Testing critical user journeys..."
python3 migration_tests.py --endpoint https://motovault.example.com || exit 1

# Validate performance metrics
echo "Checking performance metrics..."
query='histogram_quantile(0.95,rate(motovault_http_request_duration_seconds_bucket[5m]))'
# --data-urlencode keeps curl from treating the brackets in the query as a URL glob
response_time=$(curl -sG "http://prometheus:9090/api/v1/query" \
  --data-urlencode "query=${query}" | jq -r '.data.result[0].value[1] // empty')
# An empty or NaN result means there is no data to judge; treat it as a failure
if [[ -z "$response_time" || "$response_time" == "NaN" ]]; then
  echo "No latency data returned by Prometheus"
  exit 1
fi
if (( $(echo "$response_time > 2.0" | bc -l) )); then
  echo "Performance degradation detected: ${response_time}s"
  exit 1
fi

echo "Migration validation completed successfully"
```
### Rollback Procedures
```bash
#!/bin/bash
# Emergency rollback script
set -euo pipefail

echo "Initiating emergency rollback..."

# Switch traffic back to previous version
kubectl patch rollout motovault-rollout -n motovault \
  --type='merge' -p='{"spec":{"strategy":{"blueGreen":{"activeService":"motovault-previous"}}}}'

# Scale down new version
kubectl scale deployment motovault-app-new --replicas=0 -n motovault

# Restore database from last known good backup.
# Warning: this discards every write made since that backup was taken.
# (date -d is GNU date syntax)
BACKUP_TIMESTAMP=$(date -d "1 hour ago" +"%Y%m%d_%H0000")
./restore_database.sh "$BACKUP_TIMESTAMP"

# Validate rollback success
if ! curl -f https://motovault.example.com/health/ready; then
  echo "Rollback validation failed"
  exit 1
fi

echo "Rollback completed"
```
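Right after traffic is switched back, pods may still be starting, so a single health probe can fail even though the rollback is on track. A small retry wrapper is one way to handle that; the sketch below is illustrative, with `retry` as a hypothetical helper and the attempt count and delay in the usage comment chosen arbitrarily.

```bash
#!/usr/bin/env bash
# retry <attempts> <delay-seconds> <command...>
# Runs the command until it succeeds or the attempts are used up.
retry() {
  local attempts="$1" delay="$2" n=1
  shift 2
  until "$@"; do
    if [ "$n" -ge "$attempts" ]; then
      echo "giving up after $n attempts: $*" >&2
      return 1
    fi
    n=$((n + 1))
    sleep "$delay"
  done
}

# Stand-in command for demonstration. In the rollback script this would wrap
# the health probe, for example:
#   retry 30 10 curl -fsS https://motovault.example.com/health/ready
retry 3 0 true && echo "healthy"
```

Keeping the retry logic in one function means the same wrapper can guard other checks that depend on pods becoming ready.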
### Implementation Tasks
#### 1. Execute phased traffic migration
```yaml
apiVersion: networking.istio.io/v1beta1
kind: VirtualService
metadata:
  name: motovault-traffic-split
spec:
  # hosts is a required field; subsets v1 and v2 must be defined in a DestinationRule
  hosts:
    - motovault-service
  http:
    - match:
        - headers:
            x-canary:
              exact: "true"
      route:
        - destination:
            host: motovault-service
            subset: v2
          weight: 100
    - route:
        - destination:
            host: motovault-service
            subset: v1
          weight: 90
        - destination:
            host: motovault-service
            subset: v2
          weight: 10
```
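The manifest above captures a single 90/10 step. A phased migration repeats that step with increasing weights for the new version. As a sketch only, the helper below prints the JSON patch for a given canary percentage; `weight_patch` and the 10, 25, 50, 100 schedule are assumptions, and the patch paths rely on the default route being the second entry under `spec.http` as in the manifest.

```bash
#!/usr/bin/env bash
# weight_patch <canary-percent>
# Prints a JSON patch that sets the v1/v2 weights of the default route
# (index 1 in spec.http, after the x-canary match rule).
weight_patch() {
  local canary="$1"
  if ! [ "$canary" -ge 0 ] 2>/dev/null || [ "$canary" -gt 100 ]; then
    echo "canary weight must be an integer between 0 and 100" >&2
    return 1
  fi
  printf '[{"op":"replace","path":"/spec/http/1/route/0/weight","value":%d},' "$((100 - canary))"
  printf '{"op":"replace","path":"/spec/http/1/route/1/weight","value":%d}]\n' "$canary"
}

# Dry run: show the patch for each step of the assumed schedule
for step in 10 25 50 100; do
  weight_patch "$step"
done
```

Each printed patch could then be applied with `kubectl patch virtualservice motovault-traffic-split --type=json -p "$(weight_patch 25)"`, pausing between steps to review error rates and latency before moving on.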
#### 2. Implement automated rollback triggers
```yaml
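apiVersion: argoproj.io/v1alpha1
kind: AnalysisTemplate
metadata:
  name: automated-rollback
spec:
  metrics:
    - name: error-rate
      # Measure repeatedly; without an interval the metric is evaluated only once
      interval: 1m
      provider:
        prometheus:
          address: http://prometheus:9090
          # Fraction of requests answered with a 5xx status, so that the
          # 0.05 threshold below means 5% of requests
          query: |
            sum(rate(motovault_http_requests_total{status_code=~"5.."}[2m]))
            /
            sum(rate(motovault_http_requests_total[2m]))
      successCondition: result[0] < 0.05
      failureLimit: 3
    - name: response-time
      interval: 1m
      provider:
        prometheus:
          address: http://prometheus:9090
          # Aggregate by bucket so the query returns a single series
          query: histogram_quantile(0.95, sum by (le) (rate(motovault_http_request_duration_seconds_bucket[2m])))
      successCondition: result[0] < 2.0
      failureLimit: 3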
```
#### 3. Configure comprehensive monitoring during migration
- Real-time error rate monitoring
- Performance metric tracking
- User experience validation
- Resource utilization monitoring
## Week-by-Week Breakdown
### Week 13: Advanced Caching and Performance
- **Days 1-2**: Implement multi-level caching architecture
- **Days 3-4**: Optimize database queries and connection pooling
- **Days 5-7**: Configure CDN and response optimization
### Week 14: Security Enhancements
- **Days 1-2**: Implement advanced security policies
- **Days 3-4**: Configure external secrets management
- **Days 5-7**: Set up security monitoring and scanning
### Week 15: Production Migration
- **Days 1-2**: Execute database migration and validation
- **Days 3-4**: Perform blue-green deployment cutover
- **Days 5-7**: Monitor performance and user experience
### Week 16: Optimization and Documentation
- **Days 1-3**: Performance tuning based on production metrics
- **Days 4-5**: Complete operational documentation
- **Days 6-7**: Team training and knowledge transfer
## Success Criteria
- [ ] Multi-layer caching reduces database load by 70% relative to the pre-caching baseline
- [ ] 95th percentile response time under 500ms
- [ ] Zero-downtime production migration
- [ ] Advanced security policies implemented and validated
- [ ] Comprehensive monitoring and alerting operational
- [ ] Team trained on new operational procedures
- [ ] Load tests demonstrate capacity for 10x the expected traffic
## Testing Requirements
### Performance Validation
- Load testing with 10x expected traffic
- Database performance under stress
- Cache efficiency and hit ratios
- End-to-end response time validation
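For the cache efficiency item, the hit ratio is hits divided by the sum of hits and misses. A minimal sketch of that calculation follows; `hit_ratio` is a hypothetical helper, and the sample counts are made up for illustration. For the Redis layer, the two inputs could be read from the `keyspace_hits` and `keyspace_misses` fields of `redis-cli info stats`.

```bash
#!/usr/bin/env bash
# hit_ratio <hits> <misses>
# Prints the cache hit ratio as a percentage with one decimal place,
# or "n/a" when there have been no lookups yet.
hit_ratio() {
  awk -v h="$1" -v m="$2" 'BEGIN {
    total = h + m
    if (total == 0) { print "n/a"; exit }
    printf "%.1f\n", 100 * h / total
  }'
}

hit_ratio 700 300   # prints 70.0
```

The Redis figure only covers lookups that reach the L2 cache. Requests served from the in-memory L1 cache never touch Redis, so the overall reduction in database load has to be measured at the database itself.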
### Security Testing
- Penetration testing of all endpoints
- Container security scanning
- Network policy validation
- Authentication and authorization testing
### Migration Testing
- Complete migration dry runs
- Rollback procedure validation
- Data integrity verification
- User acceptance testing
## Deliverables
1. **Optimized Application**
- Multi-layer caching implementation
- Performance-optimized queries
- Security-hardened deployment
- Production-ready configuration
2. **Migration Artifacts**
- Migration scripts and procedures
- Rollback automation
- Validation tools
- Performance baselines
3. **Documentation**
- Operational runbooks
- Performance tuning guides
- Security procedures
- Training materials
## Final Success Metrics
### Technical Achievements
- **Availability**: 99.9% uptime achieved
- **Performance**: 95th percentile response time < 500ms
- **Scalability**: 10x user load capacity demonstrated
- **Security**: Zero critical vulnerabilities
### Operational Achievements
- **Deployment**: Zero-downtime deployments enabled
- **Recovery**: RTO < 30 minutes, RPO < 5 minutes
- **Monitoring**: 100% observability coverage
- **Automation**: 90% reduction in manual operations
### Business Value
- **User Experience**: No degradation during migration
- **Cost Efficiency**: Infrastructure costs optimized
- **Future Readiness**: Foundation for advanced features
- **Operational Excellence**: Reduced maintenance overhead
---
**Previous Phase**: [Phase 3: Production Deployment](K8S-PHASE-3.md)
**Project Overview**: [Kubernetes Modernization Overview](K8S-OVERVIEW.md)

2009
docs/K8S-REFACTOR.md Normal file

File diff suppressed because it is too large

185
docs/MOBILE.md Normal file

@@ -0,0 +1,185 @@
# Mobile Experience Improvement Plan for Add Fuel Record Screen
## Analysis Summary
The current add fuel record screen has significant UX issues on mobile devices. The interface feels like a shrunken desktop version rather than a mobile-first experience.
## Critical Mobile UX Issues Identified
### 1. Modal Size and Viewport Problems
- Uses Bootstrap's default modal sizing without mobile optimization
- No mobile-specific modal sizing classes or responsive adjustments
- **File Location**: `/Views/Vehicle/Gas/_GasModal.cshtml`
### 2. Touch Target Size Issues
- Small "+" button for odometer increment (44px minimum not met)
- Small close button in header
- Form switch toggles too small for reliable touch interaction
- **File Locations**:
- `/Views/Vehicle/Gas/_GasModal.cshtml` (lines 69, 99, 51, 48, 106, 110)
### 3. Dense Two-Column Layout Problems
- Advanced mode uses `col-md-6` layout creating cramped display
- Fields become too narrow for comfortable text input
- Second column with file upload becomes nearly unusable
- **File Location**: `/Views/Vehicle/Gas/_GasModal.cshtml` (lines 59, 139)
### 4. Complex Header Layout on Mobile
- Modal header contains multiple elements in cramped flex layout
- Toggle labels may wrap or get cut off
- Mode switch becomes hard to understand and use
- **File Location**: `/Views/Vehicle/Gas/_GasModal.cshtml` (lines 44-53)
### 5. Input Field Accessibility Issues
- Decimal inputs with custom key interceptors interfere with mobile keyboards
- Multi-select dropdown for tags difficult on mobile
- File upload interface unusable in narrow mobile view
- **File Locations**:
- `/Views/Vehicle/Gas/_GasModal.cshtml` (lines 74, 103, 117, 127, 130-135)
- `/wwwroot/js/gasrecord.js`
### 6. Modal Footer Button Layout
- Multiple buttons including conditional "Delete" button create touch conflicts
- Risk of accidental deletion or difficulty reaching primary action
- **File Location**: `/Views/Vehicle/Gas/_GasModal.cshtml` (line 155)
### 7. Form Mode Switching UX
- Simple/Advanced mode toggle jarring on mobile
- Content suddenly appears/disappears
- Users might not understand mode switching capability
- **File Location**: `/wwwroot/js/gasrecord.js` (lines 509-536)
### 8. Keyboard and Input Mode Issues
- Mixed input types with custom JavaScript key handlers
- Mobile keyboards may not behave predictably
- **File Locations**:
- `/Views/Vehicle/Gas/_GasModal.cshtml`
- `/wwwroot/js/gasrecord.js`
### 9. Date Picker Mobile Issues
- Bootstrap datepicker doesn't provide optimal mobile experience
- Native mobile date pickers would be better
- **File Location**: `/wwwroot/js/gasrecord.js` (lines 6, 29)
### 10. No Progressive Enhancement for Mobile
- No mobile-specific CSS classes or touch-friendly spacing
- No mobile-optimized layouts
- **File Locations**:
- `/wwwroot/css/site.css`
- `/Views/Vehicle/Gas/_GasModal.cshtml`
## Mobile Experience Improvement Plan
### Priority 1: Critical Mobile UX Fixes
#### 1. Mobile-First Modal Design
- Implement full-screen modal on mobile devices
- Add slide-up animation for native app feel
- Create mobile-specific modal header with simplified layout
- **Files to Modify**:
- `/Views/Vehicle/Gas/_GasModal.cshtml`
- `/wwwroot/css/site.css`
- `/wwwroot/js/gasrecord.js`
#### 2. Touch Target Optimization
- Increase all interactive elements to minimum 44px
- Add larger padding around buttons and form controls
- Implement touch-friendly spacing between elements
- **Files to Modify**:
- `/Views/Vehicle/Gas/_GasModal.cshtml`
- `/wwwroot/css/site.css`
#### 3. Single-Column Mobile Layout
- Force single-column layout on mobile regardless of mode
- Stack all form fields vertically with proper spacing
- Move file upload and notes to dedicated sections
- **Files to Modify**:
- `/Views/Vehicle/Gas/_GasModal.cshtml`
- `/wwwroot/css/site.css`
### Priority 2: Input and Interaction Improvements
#### 4. Mobile-Optimized Inputs
- Replace Bootstrap datepicker with native HTML5 date input on mobile
- Simplify tag selection with mobile-friendly chip input
- Improve number input keyboards with proper `inputmode` attributes
- **Files to Modify**:
- `/Views/Vehicle/Gas/_GasModal.cshtml`
- `/wwwroot/js/gasrecord.js`
#### 5. Form Mode Simplification
- Default to Simple mode on mobile
- Make mode toggle more prominent and clear
- Add smooth transitions between modes
- **Files to Modify**:
- `/Views/Vehicle/Gas/_GasModal.cshtml`
- `/wwwroot/js/gasrecord.js`
- `/Controllers/Vehicle/GasController.cs`
### Priority 3: Enhanced Mobile Features
#### 6. Bottom Sheet Pattern
- Implement native-style bottom sheet for mobile
- Add swipe-to-dismiss gesture
- Include pull handle for better UX
- **Files to Modify**:
- `/Views/Vehicle/Gas/_GasModal.cshtml`
- `/wwwroot/css/site.css`
- `/wwwroot/js/gasrecord.js`
#### 7. Mobile-Specific CSS Improvements
- Add mobile breakpoint styles
- Implement proper touch feedback
- Optimize form field sizing for mobile keyboards
- **Files to Modify**:
- `/wwwroot/css/site.css`
#### 8. Progressive Enhancement
- Add mobile detection for conditional features
- Implement haptic feedback where supported
- Add mobile-specific validation styling
- **Files to Modify**:
- `/wwwroot/js/gasrecord.js`
- `/wwwroot/js/shared.js`
- `/Views/Shared/_Layout.cshtml`
## Implementation Strategy
### Phase 1: Modal and Layout Fixes (Priority 1 items)
- Focus on making the most impactful changes first
- Ensure mobile modal feels native and intuitive
- Implement proper touch targets and single-column layout
### Phase 2: Input Optimizations (Priority 2 items)
- Optimize form inputs for mobile interaction
- Simplify complex form elements
- Improve mode switching experience
### Phase 3: Advanced Mobile Features (Priority 3 items)
- Add sophisticated mobile interaction patterns
- Implement progressive enhancement
- Add mobile-specific features and feedback
## Key Files for Mobile Improvements
### Primary Files:
- `/Views/Vehicle/Gas/_GasModal.cshtml` - Main modal template
- `/wwwroot/js/gasrecord.js` - Modal behavior and form handling
- `/wwwroot/css/site.css` - Styling and responsive design
### Supporting Files:
- `/Controllers/Vehicle/GasController.cs` - Server-side logic
- `/Views/Shared/_Layout.cshtml` - Global mobile configuration
- `/wwwroot/js/shared.js` - Shared JavaScript utilities
## Success Metrics
- Touch target compliance (minimum 44px)
- Single-column layout on mobile breakpoints
- Native mobile input patterns
- Improved task completion rates on mobile
- Reduced user friction and abandonment
## Notes
This plan maintains existing functionality while moving the add fuel record screen from a desktop-centric interface to a mobile-first, touch-optimized experience.