# Phase 3: Production Deployment (Weeks 9-12)

This phase focuses on deploying the modernized application with proper production configurations, monitoring, backup strategies, and operational procedures.

## Overview

Phase 3 turns the development-ready Kubernetes application into a production system: hardened deployment manifests, secure ingress with automated TLS, metrics-based monitoring and alerting, and automated backups with tested recovery procedures.

## Key Objectives

- **Production Kubernetes Deployment**: Configure scalable, secure deployment manifests
- **Ingress and TLS Configuration**: Secure external access with proper routing
- **Comprehensive Monitoring**: Application and infrastructure observability
- **Backup and Disaster Recovery**: Automated backup strategies and recovery procedures
- **Migration Execution**: Seamless transition from legacy system

## 3.1 Kubernetes Deployment Configuration

**Objective**: Create production-ready Kubernetes manifests with proper resource management and high availability.

### Application Deployment Configuration

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: motovault-app
  namespace: motovault
  labels:
    app: motovault
    version: v1.0.0
spec:
  replicas: 3
  strategy:
    type: RollingUpdate
    rollingUpdate:
      maxSurge: 1
      maxUnavailable: 0
  selector:
    matchLabels:
      app: motovault
  template:
    metadata:
      labels:
        app: motovault
        version: v1.0.0
      annotations:
        prometheus.io/scrape: "true"
        prometheus.io/path: "/metrics"
        prometheus.io/port: "8080"
    spec:
      serviceAccountName: motovault-service-account
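      securityContext:
        runAsNonRoot: true
        runAsUser: 1000
        fsGroup: 2000
        seccompProfile:
          type: RuntimeDefault  # required by the "restricted" Pod Security Standard enforced on the namespace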
      affinity:
        podAntiAffinity:
          preferredDuringSchedulingIgnoredDuringExecution:
          - weight: 100
            podAffinityTerm:
              labelSelector:
                matchExpressions:
                - key: app
                  operator: In
                  values:
                  - motovault
              topologyKey: kubernetes.io/hostname
          - weight: 50
            podAffinityTerm:
              labelSelector:
                matchExpressions:
                - key: app
                  operator: In
                  values:
                  - motovault
              topologyKey: topology.kubernetes.io/zone
      containers:
      - name: motovault
        image: motovault:v1.0.0  # pin a version tag; ":latest" makes rollouts and rollbacks non-reproducible
        imagePullPolicy: IfNotPresent
        ports:
        - containerPort: 8080
          name: http
          protocol: TCP
        env:
        - name: ASPNETCORE_ENVIRONMENT
          value: "Production"
        - name: ASPNETCORE_URLS
          value: "http://+:8080"
        envFrom:
        - configMapRef:
            name: motovault-config
        - secretRef:
            name: motovault-secrets
        resources:
          requests:
            memory: "512Mi"
            cpu: "250m"
          limits:
            memory: "1Gi"
            cpu: "500m"
        readinessProbe:
          httpGet:
            path: /health/ready
            port: 8080
          initialDelaySeconds: 10
          periodSeconds: 5
          timeoutSeconds: 3
          failureThreshold: 3
        livenessProbe:
          httpGet:
            path: /health/live
            port: 8080
          initialDelaySeconds: 30
          periodSeconds: 10
          timeoutSeconds: 5
          failureThreshold: 3
        securityContext:
          allowPrivilegeEscalation: false
          readOnlyRootFilesystem: true
          capabilities:
            drop:
            - ALL
        volumeMounts:
        - name: tmp-volume
          mountPath: /tmp
        - name: app-logs
          mountPath: /app/logs
      volumes:
      - name: tmp-volume
        emptyDir: {}
      - name: app-logs
        emptyDir: {}
      terminationGracePeriodSeconds: 30

---
apiVersion: v1
kind: Service
metadata:
  name: motovault-service
  namespace: motovault
  labels:
    app: motovault
spec:
  type: ClusterIP
  ports:
  - port: 80
    targetPort: 8080
    protocol: TCP
    name: http
  selector:
    app: motovault

---
apiVersion: policy/v1
kind: PodDisruptionBudget
metadata:
  name: motovault-pdb
  namespace: motovault
spec:
  minAvailable: 2
  selector:
    matchLabels:
      app: motovault
```

### Horizontal Pod Autoscaler Configuration

```yaml
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: motovault-hpa
  namespace: motovault
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: motovault-app
  minReplicas: 3
  maxReplicas: 10
  metrics:
  - type: Resource
    resource:
      name: cpu
      target:
        type: Utilization
        averageUtilization: 70
  - type: Resource
    resource:
      name: memory
      target:
        type: Utilization
        averageUtilization: 80
  behavior:
    scaleUp:
      stabilizationWindowSeconds: 0  # react to load immediately; a window here would delay every scale-up
      policies:
      - type: Percent
        value: 100
        periodSeconds: 15
    scaleDown:
      stabilizationWindowSeconds: 300
      policies:
      - type: Percent
        value: 10
        periodSeconds: 60
```

### Implementation Tasks

#### 1. Create production namespace with security policies
```yaml
apiVersion: v1
kind: Namespace
metadata:
  name: motovault
  labels:
    pod-security.kubernetes.io/enforce: restricted
    pod-security.kubernetes.io/audit: restricted
    pod-security.kubernetes.io/warn: restricted
```

#### 2. Configure resource quotas and limits
```yaml
apiVersion: v1
kind: ResourceQuota
metadata:
  name: motovault-quota
  namespace: motovault
spec:
  hard:
    requests.cpu: "4"
    requests.memory: 8Gi
    limits.cpu: "8"
    limits.memory: 16Gi
    persistentvolumeclaims: "10"
    pods: "20"
```
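
A ResourceQuota on `requests.*` and `limits.*` causes the API server to reject any pod whose containers omit those fields. A LimitRange can supply defaults so that auxiliary pods (jobs, one-off tasks) are still admitted. The sketch below is illustrative only: the object name `motovault-limits` and the default values are assumptions, not figures sized for this workload.

```yaml
apiVersion: v1
kind: LimitRange
metadata:
  name: motovault-limits   # hypothetical name
  namespace: motovault
spec:
  limits:
  - type: Container
    # Applied as limits when a container declares none (assumed values)
    default:
      cpu: 500m
      memory: 512Mi
    # Applied as requests when a container declares none (assumed values)
    defaultRequest:
      cpu: 100m
      memory: 128Mi
```

Containers that set their own requests and limits, such as the application Deployment, are unaffected by these defaults.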

#### 3. Set up service accounts and RBAC
```yaml
apiVersion: v1
kind: ServiceAccount
metadata:
  name: motovault-service-account
  namespace: motovault
---
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  name: motovault-role
  namespace: motovault
rules:
- apiGroups: [""]
  resources: ["configmaps", "secrets"]
  verbs: ["get", "list"]
---
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  name: motovault-rolebinding
  namespace: motovault
subjects:
- kind: ServiceAccount
  name: motovault-service-account
  namespace: motovault
roleRef:
  kind: Role
  name: motovault-role
  apiGroup: rbac.authorization.k8s.io
```

#### 4. Configure pod anti-affinity for high availability
- Spread pods across nodes and availability zones
- Ensure no single point of failure
- Optimize for both performance and availability
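
As a possible complement to the preferred anti-affinity rules in the Deployment, topology spread constraints express the same goal as an explicit skew limit. The fragment below is a sketch meant for `spec.template.spec` of the `motovault-app` Deployment; `ScheduleAnyway` keeps the constraint soft, matching the "preferred" semantics already used, and the `maxSkew` value is an assumption.

```yaml
# Fragment for spec.template.spec of the motovault-app Deployment
topologySpreadConstraints:
# Keep the per-zone pod count within 1 of every other zone where possible
- maxSkew: 1
  topologyKey: topology.kubernetes.io/zone
  whenUnsatisfiable: ScheduleAnyway
  labelSelector:
    matchLabels:
      app: motovault
# Same rule at node level
- maxSkew: 1
  topologyKey: kubernetes.io/hostname
  whenUnsatisfiable: ScheduleAnyway
  labelSelector:
    matchLabels:
      app: motovault
```

Switching `whenUnsatisfiable` to `DoNotSchedule` would turn the spread into a hard requirement, at the cost of pods staying Pending when a zone or node is unavailable.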

#### 5. Implement rolling update strategy with zero downtime
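- Configure progressive rollout gated by readiness probes (`maxSurge: 1`, `maxUnavailable: 0`)
- Detect stalled rollouts via `progressDeadlineSeconds` and revert with `kubectl rollout undo`; a Deployment does not roll back on its own, so automatic rollback needs CI/CD logic or a rollout controller
- Canary deployments require additional tooling on top of a plain Deployment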
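
The rollout behaviour described above is controlled by a few Deployment-level fields that sit next to `strategy`. The fragment below is a sketch for the `spec` of the `motovault-app` Deployment; all three values are assumptions to be tuned against real startup times.

```yaml
# Fragment for spec of the motovault-app Deployment (values are assumptions)
spec:
  # A new pod must stay Ready this long before the rollout moves on
  minReadySeconds: 10
  # Mark the rollout as failed (Progressing=False) if it stalls for 5 minutes
  progressDeadlineSeconds: 300
  # Old ReplicaSets kept around as targets for "kubectl rollout undo"
  revisionHistoryLimit: 5
```

Exceeding `progressDeadlineSeconds` only reports the failure in the Deployment status; the revert itself still has to be triggered, for example by a pipeline step that runs `kubectl rollout status` and then `kubectl rollout undo` on a non-zero exit code.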

## 3.2 Ingress and TLS Configuration

**Objective**: Configure secure external access with proper TLS termination and routing.

### Ingress Configuration

```yaml
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: motovault-ingress
  namespace: motovault
  annotations:
    nginx.ingress.kubernetes.io/ssl-redirect: "true"
    nginx.ingress.kubernetes.io/force-ssl-redirect: "true"
    nginx.ingress.kubernetes.io/proxy-body-size: "50m"
    nginx.ingress.kubernetes.io/proxy-read-timeout: "300"
    nginx.ingress.kubernetes.io/proxy-send-timeout: "300"
    cert-manager.io/cluster-issuer: "letsencrypt-prod"
    nginx.ingress.kubernetes.io/limit-rpm: "100"  # requests per minute per client IP
spec:
  ingressClassName: nginx
  tls:
  - hosts:
    - motovault.example.com
    secretName: motovault-tls
  rules:
  - host: motovault.example.com
    http:
      paths:
      - path: /
        pathType: Prefix
        backend:
          service:
            name: motovault-service
            port:
              number: 80
```

### TLS Certificate Management

```yaml
apiVersion: cert-manager.io/v1
kind: ClusterIssuer
metadata:
  name: letsencrypt-prod
spec:
  acme:
    server: https://acme-v02.api.letsencrypt.org/directory
    email: admin@motovault.example.com
    privateKeySecretRef:
      name: letsencrypt-prod
    solvers:
    - http01:
        ingress:
          class: nginx
```

### Implementation Tasks

#### 1. Deploy cert-manager for automated TLS
```bash
kubectl apply -f https://github.com/cert-manager/cert-manager/releases/download/v1.13.0/cert-manager.yaml
```

#### 2. Configure Let's Encrypt for SSL certificates
- Automated certificate provisioning and renewal
- DNS-01 or HTTP-01 challenge configuration
- Certificate monitoring and alerting
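
Let's Encrypt applies rate limits to its production endpoint, so it can help to validate the HTTP-01 flow against the staging endpoint first. The sketch below mirrors the production ClusterIssuer; the name `letsencrypt-staging` is an assumption, and staging certificates are not trusted by browsers.

```yaml
apiVersion: cert-manager.io/v1
kind: ClusterIssuer
metadata:
  name: letsencrypt-staging   # hypothetical name
spec:
  acme:
    # Let's Encrypt staging directory: relaxed rate limits, untrusted certificates
    server: https://acme-staging-v02.api.letsencrypt.org/directory
    email: admin@motovault.example.com
    privateKeySecretRef:
      name: letsencrypt-staging
    solvers:
    - http01:
        ingress:
          class: nginx
```

To use it, point the `cert-manager.io/cluster-issuer` annotation on the Ingress at `letsencrypt-staging`, confirm that a certificate is issued, then switch back to `letsencrypt-prod`.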

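#### 3. Set up WAF and DDoS protection
WAF and DDoS protection are normally provided in front of the cluster (cloud load balancer, CDN, or a WAF module in the ingress controller). Inside the cluster, a NetworkPolicy limits the application pods to traffic from the ingress controller: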
```yaml
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: motovault-ingress-policy
  namespace: motovault
spec:
  podSelector:
    matchLabels:
      app: motovault
  policyTypes:
  - Ingress
  ingress:
  - from:
    - namespaceSelector:
        matchLabels:
          kubernetes.io/metadata.name: ingress-nginx  # automatic namespace label; assumes the controller runs in "ingress-nginx"
    ports:
    - protocol: TCP
      port: 8080
```
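
NetworkPolicies are additive allow-lists: once the policy above selects the application pods, traffic from any other source is dropped, including Prometheus scrapes of port 8080. A second policy can re-allow them. This is a sketch that assumes Prometheus runs in a namespace named `monitoring` and that the cluster's CNI plugin enforces NetworkPolicy; the policy name is hypothetical.

```yaml
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: motovault-allow-metrics   # hypothetical name
  namespace: motovault
spec:
  podSelector:
    matchLabels:
      app: motovault
  policyTypes:
  - Ingress
  ingress:
  - from:
    # Assumes Prometheus is deployed in the "monitoring" namespace
    - namespaceSelector:
        matchLabels:
          kubernetes.io/metadata.name: monitoring
    ports:
    - protocol: TCP
      port: 8080
```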

#### 4. Configure rate limiting and security headers
- Request rate limiting per IP
- Security headers (HSTS, CSP, etc.)
- Request size limitations
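
With ingress-nginx, HSTS and static response headers are set in the controller's ConfigMap rather than per Ingress. The sketch below assumes the controller was installed from the standard manifests, so that its ConfigMap is `ingress-nginx-controller` in the `ingress-nginx` namespace; the `custom-headers` name and the header values are illustrative assumptions.

```yaml
# Headers added to every response served through the controller
apiVersion: v1
kind: ConfigMap
metadata:
  name: custom-headers        # hypothetical name
  namespace: ingress-nginx    # assumed controller namespace
data:
  X-Content-Type-Options: "nosniff"
  X-Frame-Options: "DENY"
  Referrer-Policy: "strict-origin-when-cross-origin"
---
# Controller-wide settings (assumed ConfigMap name from the standard install)
apiVersion: v1
kind: ConfigMap
metadata:
  name: ingress-nginx-controller
  namespace: ingress-nginx
data:
  add-headers: "ingress-nginx/custom-headers"   # <namespace>/<name> of the ConfigMap above
  hsts: "true"
  hsts-max-age: "31536000"
  hsts-include-subdomains: "true"
```

A Content-Security-Policy header depends on which scripts and styles the application loads, so it is left out here and is usually easier to emit from the application itself.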

#### 5. Set up health check endpoints for load balancer
- Configure ingress health checks
- Implement graceful degradation
- Monitor certificate expiration
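
Certificate expiration can be watched through the metrics cert-manager exports. The rule below is a sketch that assumes cert-manager's metrics endpoint is scraped by Prometheus and that the Certificate created for the Ingress is named after its TLS secret, `motovault-tls`; the rule name and the 14-day threshold are assumptions.

```yaml
apiVersion: monitoring.coreos.com/v1
kind: PrometheusRule
metadata:
  name: certificate-alerts   # hypothetical name
  namespace: motovault
spec:
  groups:
  - name: certificate.rules
    rules:
    - alert: CertificateExpiringSoon
      # Seconds until expiry, compared against 14 days
      expr: certmanager_certificate_expiration_timestamp_seconds{name="motovault-tls"} - time() < 14 * 24 * 3600
      for: 1h
      labels:
        severity: warning
      annotations:
        summary: "TLS certificate expires in under 14 days"
        description: "Certificate {{ $labels.name }} has not been renewed and is close to expiry"
```

cert-manager normally renews well before this threshold, so the alert firing suggests that renewal itself is failing.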

## 3.3 Monitoring and Observability Setup

**Objective**: Implement comprehensive monitoring, logging, and alerting for production operations.

### Prometheus ServiceMonitor Configuration

```yaml
apiVersion: monitoring.coreos.com/v1
kind: ServiceMonitor
metadata:
  name: motovault-metrics
  namespace: motovault
  labels:
    app: motovault
spec:
  selector:
    matchLabels:
      app: motovault
  endpoints:
  - port: http
    path: /metrics
    interval: 30s
    scrapeTimeout: 10s
```

### Application Metrics Implementation

```csharp
using Prometheus;  // prometheus-net: Metrics, Counter, Histogram, Gauge

public class MetricsService
{
    private readonly Counter _httpRequestsTotal;
    private readonly Histogram _httpRequestDuration;
    private readonly Gauge _activeConnections;
    private readonly Counter _databaseOperationsTotal;
    private readonly Histogram _databaseOperationDuration;

    public MetricsService()
    {
        _httpRequestsTotal = Metrics.CreateCounter(
            "motovault_http_requests_total",
            "Total number of HTTP requests",
            new[] { "method", "endpoint", "status_code" });

        _httpRequestDuration = Metrics.CreateHistogram(
            "motovault_http_request_duration_seconds",
            "Duration of HTTP requests in seconds",
            new[] { "method", "endpoint" });

        _activeConnections = Metrics.CreateGauge(
            "motovault_active_connections",
            "Number of active database connections");

        _databaseOperationsTotal = Metrics.CreateCounter(
            "motovault_database_operations_total",
            "Total number of database operations",
            new[] { "operation", "table", "status" });

        _databaseOperationDuration = Metrics.CreateHistogram(
            "motovault_database_operation_duration_seconds",
            "Duration of database operations in seconds",
            new[] { "operation", "table" });
    }

    public void RecordHttpRequest(string method, string endpoint, int statusCode, double duration)
    {
        _httpRequestsTotal.WithLabels(method, endpoint, statusCode.ToString()).Inc();
        _httpRequestDuration.WithLabels(method, endpoint).Observe(duration);
    }

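    public void RecordDatabaseOperation(string operation, string table, bool success, double duration)
    {
        var status = success ? "success" : "error";
        _databaseOperationsTotal.WithLabels(operation, table, status).Inc();
        _databaseOperationDuration.WithLabels(operation, table).Observe(duration);
    }

    // Keeps the connection gauge current; without a caller updating it, the gauge always reports 0.
    public void SetActiveConnections(int count) => _activeConnections.Set(count);
}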
```

### Grafana Dashboard Configuration

```json
{
  "dashboard": {
    "title": "MotoVaultPro Application Dashboard",
    "panels": [
      {
        "title": "HTTP Request Rate",
        "type": "timeseries",
        "targets": [
          {
            "expr": "sum by (method, endpoint) (rate(motovault_http_requests_total[5m]))",
            "legendFormat": "{{method}} {{endpoint}}"
          }
        ]
      },
      {
        "title": "Response Time Percentiles",
        "type": "timeseries",
        "targets": [
          {
            "expr": "histogram_quantile(0.50, sum by (le) (rate(motovault_http_request_duration_seconds_bucket[5m])))",
            "legendFormat": "50th percentile"
          },
          {
            "expr": "histogram_quantile(0.95, sum by (le) (rate(motovault_http_request_duration_seconds_bucket[5m])))",
            "legendFormat": "95th percentile"
          }
        ]
      },
      {
        "title": "Database Connection Pool",
        "type": "stat",
        "targets": [
          {
            "expr": "sum(motovault_active_connections)",
            "legendFormat": "Active Connections"
          }
        ]
      },
      {
        "title": "Error Rate",
        "type": "timeseries",
        "targets": [
          {
            "expr": "sum(rate(motovault_http_requests_total{status_code=~\"5..\"}[5m]))",
            "legendFormat": "5xx errors"
          }
        ]
      }
    ]
  }
}
```

### Prometheus Alerting Rules

```yaml
groups:
- name: motovault.rules
  rules:
  - alert: HighErrorRate
    expr: rate(motovault_http_requests_total{status_code=~"5.."}[5m]) > 0.1
    for: 2m
    labels:
      severity: critical
    annotations:
      summary: "High error rate detected"
      description: "5xx responses at {{ $value | humanize }} requests/s over the last 5 minutes"

  - alert: HighResponseTime
    expr: histogram_quantile(0.95, sum by (le) (rate(motovault_http_request_duration_seconds_bucket[5m]))) > 2
    for: 5m
    labels:
      severity: warning
    annotations:
      summary: "High response time detected"
      description: "95th percentile response time is {{ $value }}s"

  - alert: DatabaseConnectionPoolExhaustion
    expr: motovault_active_connections > 80
    for: 2m
    labels:
      severity: warning
    annotations:
      summary: "Database connection pool nearly exhausted"
      description: "Active connections: {{ $value }}/100"

  - alert: PodCrashLooping
    expr: rate(kube_pod_container_status_restarts_total{namespace="motovault"}[15m]) > 0
    for: 5m
    labels:
      severity: critical
    annotations:
      summary: "Pod is crash looping"
      description: "Pod {{ $labels.pod }} is restarting frequently"
```

### Implementation Tasks

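#### 1. Deploy Prometheus and Grafana stack
```bash
# Use "create": client-side "apply" fails on these CRDs because they exceed the annotation size limit
kubectl create -f https://raw.githubusercontent.com/prometheus-operator/prometheus-operator/main/bundle.yaml
```
The bundle installs only the Prometheus Operator and its CRDs. Prometheus, Alertmanager, and Grafana instances, plus kube-state-metrics for the `kube_pod_*` metrics, must be deployed separately, for example with the kube-prometheus-stack Helm chart.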

#### 2. Configure application metrics endpoints
- Add Prometheus metrics middleware
- Implement custom business metrics
- Configure metric collection intervals

#### 3. Set up centralized logging with structured logs
```csharp
using System.Text.Json;

builder.Services.AddLogging(loggingBuilder =>
{
    loggingBuilder.AddJsonConsole(options =>
    {
        options.JsonWriterOptions = new JsonWriterOptions { Indented = false };
        options.IncludeScopes = true;
        options.UseUtcTimestamp = true;  // the format below ends in a literal "Z", so timestamps must be UTC
        options.TimestampFormat = "yyyy-MM-ddTHH:mm:ss.fffZ";
    });
});
```

#### 4. Create operational dashboards and alerts
- Application performance dashboards
- Infrastructure monitoring dashboards
- Business metrics and KPIs
- Alert routing and escalation
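
Alert routing and escalation are configured in Alertmanager rather than in the Prometheus rules. The following is a minimal sketch of an Alertmanager configuration that sends `severity: critical` alerts to a separate receiver; the receiver names, webhook URLs, and timing values are all placeholder assumptions.

```yaml
route:
  # Fallback receiver for anything not matched below
  receiver: team-default
  group_by: ["alertname", "namespace"]
  group_wait: 30s
  group_interval: 5m
  repeat_interval: 4h
  routes:
  # Critical alerts escalate to the on-call receiver and repeat more often
  - matchers:
    - severity="critical"
    receiver: oncall-escalation
    repeat_interval: 1h
receivers:
- name: team-default
  webhook_configs:
  - url: "https://alerts.example.com/hooks/motovault"          # placeholder URL
- name: oncall-escalation
  webhook_configs:
  - url: "https://alerts.example.com/hooks/motovault-oncall"   # placeholder URL
```

The `severity` values match the labels already set on the alert rules in this section (`critical` and `warning`).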

#### 5. Implement distributed tracing
```csharp
services.AddOpenTelemetry()
    .WithTracing(builder =>
    {
        builder
            .AddAspNetCoreInstrumentation()
            .AddNpgsql()
            .AddRedisInstrumentation()
            .AddOtlpExporter();  // the dedicated Jaeger exporter is deprecated; Jaeger ingests OTLP natively
    });
```

## 3.4 Backup and Disaster Recovery

**Objective**: Implement comprehensive backup strategies and disaster recovery procedures.

### Velero Backup Configuration

```yaml
apiVersion: velero.io/v1
kind: Schedule
metadata:
  name: motovault-daily-backup
  namespace: velero
spec:
  schedule: "0 2 * * *"  # Daily at 2 AM
  template:
    includedNamespaces:
    - motovault
    includedResources:
    - "*"
    storageLocation: default
    ttl: 720h0m0s  # 30 days
    snapshotVolumes: true

---
apiVersion: velero.io/v1
kind: Schedule
metadata:
  name: motovault-weekly-backup
  namespace: velero
spec:
  schedule: "0 3 * * 0"  # Weekly on Sunday at 3 AM
  template:
    includedNamespaces:
    - motovault
    includedResources:
    - "*"
    storageLocation: default
    ttl: 2160h0m0s  # 90 days
    snapshotVolumes: true
```
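
Restores are declared the same way as backups. The sketch below restores the `motovault` namespace from the most recent backup produced by the daily schedule; the object name `motovault-restore` is an assumption.

```yaml
apiVersion: velero.io/v1
kind: Restore
metadata:
  name: motovault-restore   # hypothetical name
  namespace: velero
spec:
  # Use the latest successful backup created by this Schedule
  scheduleName: motovault-daily-backup
  includedNamespaces:
  - motovault
  # Recreate persistent volumes from the snapshots taken with the backup
  restorePVs: true
```

To restore a specific point in time instead, replace `scheduleName` with `backupName` set to one of the names listed by `velero backup get`.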

### Database Backup Strategy

```bash
#!/bin/bash
# Automated database backup script
set -euo pipefail  # abort on the first failed step so a failed dump is never uploaded

BACKUP_DATE=$(date +%Y%m%d_%H%M%S)
BACKUP_FILE="motovault_backup_${BACKUP_DATE}.sql"
S3_BUCKET="motovault-backups"

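# Create database backup (--clean --if-exists adds DROP statements so the dump restores over an existing database)
kubectl exec -n motovault motovault-postgres-1 -- \
  pg_dump -U postgres --clean --if-exists motovault > "${BACKUP_FILE}"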

# Compress backup
gzip "${BACKUP_FILE}"

# Upload to S3/MinIO
aws s3 cp "${BACKUP_FILE}.gz" "s3://${S3_BUCKET}/database/"

# Clean up local file
rm "${BACKUP_FILE}.gz"

# Retain only last 30 days of backups
aws s3api list-objects-v2 \
  --bucket "${S3_BUCKET}" \
  --prefix "database/" \
  --query 'Contents[?LastModified<=`'$(date -d "30 days ago" --iso-8601)'`].[Key]' \
  --output text | \
  xargs -I {} aws s3 rm "s3://${S3_BUCKET}/{}"
```

### Disaster Recovery Procedures

```bash
#!/bin/bash
# Full system recovery script
set -euo pipefail  # stop at the first failed step instead of restarting on a partial restore

BACKUP_DATE=${1:-}
if [ -z "$BACKUP_DATE" ]; then
  echo "Usage: $0 <backup_date>"
  echo "Example: $0 20240120_020000"
  exit 1
fi

# Stop application
echo "Scaling down application..."
kubectl scale deployment motovault-app --replicas=0 -n motovault

# Restore database
echo "Restoring database from backup..."
# File name must match what the backup script uploads: motovault_backup_<date>.sql.gz
aws s3 cp "s3://motovault-backups/database/motovault_backup_${BACKUP_DATE}.sql.gz" .
gunzip "motovault_backup_${BACKUP_DATE}.sql.gz"
kubectl exec -i motovault-postgres-1 -n motovault -- \
  psql -U postgres -d motovault < "motovault_backup_${BACKUP_DATE}.sql"

# Restore MinIO data
echo "Restoring MinIO data..."
aws s3 sync "s3://motovault-backups/minio/${BACKUP_DATE}/" /tmp/minio_restore/
mc mirror /tmp/minio_restore/ motovault-minio/motovault-files/

# Restart application
echo "Scaling up application..."
kubectl scale deployment motovault-app --replicas=3 -n motovault

# Verify health ("rollout status" also covers pods that have not been created yet)
echo "Waiting for application to be ready..."
kubectl rollout status deployment/motovault-app -n motovault --timeout=300s

echo "Recovery completed successfully"
```

### Implementation Tasks

#### 1. Deploy Velero for Kubernetes backup
```bash
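# ./credentials-velero is a placeholder path to the AWS credentials file;
# use --no-secret instead when the cluster relies on IAM roles
velero install \
  --provider aws \
  --plugins velero/velero-plugin-for-aws:v1.7.0 \
  --bucket motovault-backups \
  --secret-file ./credentials-velero \
  --backup-location-config region=us-west-2 \
  --snapshot-location-config region=us-west-2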
```

#### 2. Configure automated database backups
- Point-in-time recovery setup
- Incremental backup strategies
- Cross-region backup replication
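
One way to automate the database backup is to run it as a CronJob inside the cluster. The manifest below is a sketch under several assumptions: the image, the script path, and the credentials Secret are hypothetical, and the script would connect to PostgreSQL over the network instead of using `kubectl exec`. The security context and resource settings are included because the namespace enforces the `restricted` Pod Security Standard and a ResourceQuota.

```yaml
apiVersion: batch/v1
kind: CronJob
metadata:
  name: motovault-db-backup   # hypothetical name
  namespace: motovault
spec:
  schedule: "30 1 * * *"      # assumed time, ahead of the 2 AM Velero backup
  concurrencyPolicy: Forbid   # never run two backups at once
  successfulJobsHistoryLimit: 3
  failedJobsHistoryLimit: 3
  jobTemplate:
    spec:
      backoffLimit: 2
      template:
        spec:
          restartPolicy: OnFailure
          securityContext:
            runAsNonRoot: true
            runAsUser: 1000
            seccompProfile:
              type: RuntimeDefault
          containers:
          - name: backup
            # Hypothetical image bundling pg_dump, the AWS CLI, and the backup script
            image: registry.example.com/motovault/db-backup:1.0.0
            command: ["/scripts/backup.sh"]   # hypothetical path
            envFrom:
            - secretRef:
                name: motovault-backup-credentials   # hypothetical Secret
            securityContext:
              allowPrivilegeEscalation: false
              capabilities:
                drop: ["ALL"]
            resources:
              requests:
                cpu: 100m
                memory: 128Mi
              limits:
                cpu: 500m
                memory: 512Mi
```

Logical dumps like this do not provide point-in-time recovery; that requires continuous WAL archiving, which depends on how PostgreSQL is operated in the cluster.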

#### 3. Implement MinIO backup synchronization
- Automated file backup to external storage
- Metadata backup and restoration
- Verification of backup integrity

#### 4. Create disaster recovery runbooks
- Step-by-step recovery procedures
- RTO/RPO definitions and testing
- Contact information and escalation procedures

#### 5. Set up backup monitoring and alerting
```yaml
apiVersion: monitoring.coreos.com/v1
kind: PrometheusRule
metadata:
  name: backup-alerts
spec:
  groups:
  - name: backup.rules
    rules:
    - alert: BackupFailed
      expr: increase(velero_backup_failure_total[24h]) > 0  # the raw counter never resets, so "> 0" alone would fire forever
      labels:
        severity: critical
      annotations:
        summary: "Backup operation failed"
        description: "Velero backup has failed"
```

## Week-by-Week Breakdown

### Week 9: Production Kubernetes Configuration
- **Days 1-2**: Create production deployment manifests
- **Days 3-4**: Configure HPA, PDB, and resource quotas
- **Days 5-7**: Set up RBAC and security policies

### Week 10: Ingress and TLS Setup
- **Days 1-2**: Deploy and configure ingress controller
- **Days 3-4**: Set up cert-manager and TLS certificates
- **Days 5-7**: Configure security policies and rate limiting

### Week 11: Monitoring and Observability
- **Days 1-3**: Deploy Prometheus and Grafana stack
- **Days 4-5**: Configure application metrics and dashboards
- **Days 6-7**: Set up alerting and notification channels

### Week 12: Backup and Migration Preparation
- **Days 1-3**: Deploy and configure backup solutions
- **Days 4-5**: Create migration scripts and procedures
- **Days 6-7**: Execute migration dry runs and validation

## Success Criteria

- [ ] Production Kubernetes deployment with 99.9% availability
- [ ] Secure ingress with automated TLS certificate management
- [ ] Comprehensive monitoring with alerting
- [ ] Automated backup and recovery procedures tested
- [ ] Migration procedures validated and documented
- [ ] Security policies and network controls implemented
- [ ] Performance baselines established and monitored

## Testing Requirements

### Production Readiness Tests
- Load testing under expected traffic patterns
- Failover testing for all components
- Security penetration testing
- Backup and recovery validation

### Performance Tests
- Application response time under load
- Database performance with connection pooling
- Cache performance and hit ratios
- Network latency and throughput

### Security Tests
- Container image vulnerability scanning
- Network policy validation
- Authentication and authorization testing
- TLS configuration verification

## Deliverables

1. **Production Deployment**
   - Complete Kubernetes manifests
   - Security configurations
   - Monitoring and alerting setup
   - Backup and recovery procedures

2. **Documentation**
   - Operational runbooks
   - Security procedures
   - Monitoring guides
   - Disaster recovery plans

3. **Migration Tools**
   - Data migration scripts
   - Validation tools
   - Rollback procedures

## Dependencies

- Production Kubernetes cluster
- External storage for backups
- DNS management for ingress
- Certificate authority for TLS
- Monitoring infrastructure

## Risks and Mitigations

### Risk: Extended Downtime During Migration
**Mitigation**: Blue-green deployment strategy with comprehensive rollback plan

### Risk: Data Integrity Issues
**Mitigation**: Extensive validation and parallel running during transition

### Risk: Performance Degradation
**Mitigation**: Load testing and gradual traffic migration

---

**Previous Phase**: [Phase 2: High Availability Infrastructure](K8S-PHASE-2.md)

**Next Phase**: [Phase 4: Advanced Features and Optimization](K8S-PHASE-4.md)