chore: update OCR tests and documentation (refs #121)

Add engine abstraction tests and update docs to reflect PaddleOCR primary architecture with optional Google Vision cloud fallback.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
fix: workflow contract
2026-02-07 11:42:51 -06:00 · 2026-02-07 11:32:36 -06:00 · 2026-02-07 11:29:16 -06:00 · 2026-02-07 11:17:44 -06:00 · 2026-02-07 11:12:08 -06:00 · 2026-02-07 10:56:27 -06:00
24 changed files with 1478 additions and 220 deletions
--- a/.ai/context.json
+++ b/.ai/context.json
@@ -108,7 +108,7 @@
    },
    "mvp-ocr": {
      "type": "ocr_service",
-      "description": "Python-based OCR for document text extraction",
+      "description": "Python OCR service with pluggable engine abstraction (PaddleOCR PP-OCRv4 primary, optional Google Vision cloud fallback, Tesseract backward compat)",
      "port": 8000
    },
    "mvp-loki": {
--- a/.ai/workflow-contract.json
+++ b/.ai/workflow-contract.json
@@ -45,7 +45,7 @@
    "parent_issue": "The original feature issue. Tracks overall status. Only the parent gets status label transitions.",
    "sub_issue_title_format": "{type}: {summary} (#{parent_index})",
    "sub_issue_body": "First line must be 'Relates to #{parent_index}'. Each sub-issue is a self-contained unit of work.",
-    "sub_issue_labels": "status/backlog + same type/* as parent. Sub-issues stay in backlog; parent issue tracks status.",
+    "sub_issue_labels": "status/in-progress + same type/* as parent. Sub-issues move to in-progress as they are worked on.",
    "sub_issue_milestone": "Same sprint milestone as parent.",
    "rules": [
      "ONE branch for the parent issue. Never create branches per sub-issue.",
--- a/docker-compose.prod.yml
+++ b/docker-compose.prod.yml
@@ -38,13 +38,17 @@ services:
      STRIPE_ENTERPRISE_MONTHLY_PRICE_ID: prod_Toj8xGEui9jl6j
      STRIPE_ENTERPRISE_YEARLY_PRICE_ID: prod_Toj9A7A773xrdn
-  # OCR - Production log level
+  # OCR - Production log level + engine config
  mvp-ocr:
    environment:
      LOG_LEVEL: error
      REDIS_HOST: mvp-redis
      REDIS_PORT: 6379
      REDIS_DB: 1
+      OCR_PRIMARY_ENGINE: paddleocr
+      OCR_FALLBACK_ENGINE: ${OCR_FALLBACK_ENGINE:-none}
+      OCR_FALLBACK_THRESHOLD: ${OCR_FALLBACK_THRESHOLD:-0.6}
+      GOOGLE_VISION_KEY_PATH: /run/secrets/google-vision-key.json
  # PostgreSQL - Remove dev ports, production log level
  mvp-postgres:
--- a/docker-compose.staging.yml
+++ b/docker-compose.staging.yml
@@ -63,6 +63,15 @@ services:
  mvp-ocr:
    image: ${OCR_IMAGE:-git.motovaultpro.com/egullickson/ocr:latest}
    container_name: mvp-ocr-staging
+    environment:
+      LOG_LEVEL: debug
+      REDIS_HOST: mvp-redis
+      REDIS_PORT: 6379
+      REDIS_DB: 1
+      OCR_PRIMARY_ENGINE: paddleocr
+      OCR_FALLBACK_ENGINE: ${OCR_FALLBACK_ENGINE:-none}
+      OCR_FALLBACK_THRESHOLD: ${OCR_FALLBACK_THRESHOLD:-0.6}
+      GOOGLE_VISION_KEY_PATH: /run/secrets/google-vision-key.json
  # ========================================
  # PostgreSQL (Staging - Separate Database)
--- a/docker-compose.yml
+++ b/docker-compose.yml
@@ -193,8 +193,16 @@ services:
      REDIS_HOST: mvp-redis
      REDIS_PORT: 6379
      REDIS_DB: 1
+      # OCR engine configuration (PaddleOCR primary, cloud fallback optional)
+      OCR_PRIMARY_ENGINE: paddleocr
+      OCR_FALLBACK_ENGINE: ${OCR_FALLBACK_ENGINE:-none}
+      OCR_FALLBACK_THRESHOLD: ${OCR_FALLBACK_THRESHOLD:-0.6}
+      GOOGLE_VISION_KEY_PATH: /run/secrets/google-vision-key.json
    volumes:
      - /tmp/vin-debug:/tmp/vin-debug
+      # Optional: Uncomment to enable Google Vision cloud fallback.
+      # Requires: secrets/app/google-vision-key.json and OCR_FALLBACK_ENGINE=google_vision
+      # - ./secrets/app/google-vision-key.json:/run/secrets/google-vision-key.json:ro
    networks:
      - backend
      - database
--- a/docs/CLAUDE.md
+++ b/docs/CLAUDE.md
@@ -18,5 +18,5 @@
 | `AUDIT.md` | Audit documentation | Security audits, compliance |
 | `MVP-COLOR-SCHEME.md` | Color scheme reference | UI styling decisions |
 | `LOGGING.md` | Unified logging system | Log levels, correlation IDs, Grafana |
-| `ocr-pipeline-tech-stack.md` | OCR pipeline technology decisions | OCR architecture, Tesseract setup |
+| `ocr-pipeline-tech-stack.md` | OCR pipeline technology decisions | OCR architecture, PaddleOCR engine abstraction |
 | `TIER-GATING.md` | Subscription tier gating rules | Feature access by tier, vehicle limits |
--- a/docs/ocr-pipeline-tech-stack.md
+++ b/docs/ocr-pipeline-tech-stack.md
@@ -118,35 +118,48 @@
        │       ├─────────────────────────────────────────────────────────┤
        │       │                                                         │
        │       │   ┌─────────────────────────────────────────────────┐   │
-        │       │   │  5a. Primary OCR: Tesseract 5.x                 │   │
+        │       │   │  5a. Engine Abstraction Layer                    │   │
-        │       │   │                                                 │   │
+        │       │   │                                                  │   │
-        │       │   │  • Engine: LSTM (--oem 1)                       │   │
+        │       │   │  OcrEngine ABC -> PaddleOcrEngine (primary)      │   │
-        │       │   │  • Page segmentation: Auto (--psm 3)            │   │
+        │       │   │                -> CloudEngine (optional fallback) │   │
-        │       │   │  • Output: hOCR with word confidence            │   │
+        │       │   │                -> TesseractEngine (backward compat)│  │
        │       │   │                -> HybridEngine (primary+fallback) │   │
        │       │   └─────────────────────────────────────────────────┘   │
        │       │                         │                               │
        │       │                         ▼                               │
        │       │   ┌─────────────────────────────────────────────────┐   │
        │       │   │  5b. Primary OCR: PaddleOCR PP-OCRv4             │   │
        │       │   │                                                  │   │
        │       │   │  • Scene text detection + angle classification   │   │
        │       │   │  • CPU-only, models baked into Docker image      │   │
        │       │   │  • Normalized output: text, confidence, word boxes│  │
        │       │   └─────────────────────────────────────────────────┘   │
        │       │                         │                               │
        │       │                         ▼                               │
        │       │                 ┌───────────────┐                       │
        │       │                 │  Confidence   │                       │
-        │       │                 │    > 80% ?    │                       │
+        │       │                 │   >= 60% ?    │                       │
        │       │                 └───────────────┘                       │
        │       │                    │         │                          │
-        │       │              YES ──┘         └── NO                     │
+        │       │              YES ──┘         └── NO (and cloud enabled) │
        │       │               │                   │                     │
        │       │               │                   ▼                     │
        │       │               │   ┌─────────────────────────────────┐   │
-        │       │               │   │  5b. Fallback: PaddleOCR        │   │
+        │       │               │   │  5c. Optional Cloud Fallback     │   │
-        │       │               │   │                                 │   │
+        │       │               │   │      (Google Vision API)         │   │
-        │       │               │   │  • Better for degraded images   │   │
+        │       │               │   │                                  │   │
-        │       │               │   │  • Better table detection       │   │
+        │       │               │   │  • Disabled by default           │   │
-        │       │               │   │  • Slower but more accurate     │   │
+        │       │               │   │  • 5-second timeout guard        │   │
        │       │               │   │  • Returns higher-confidence     │   │
        │       │               │   │    result of primary vs fallback │   │
        │       │               │   └─────────────────────────────────┘   │
        │       │               │                   │                     │
        │       │               ▼                   ▼                     │
        │       │         ┌─────────────────────────────────┐             │
-        │       │         │  5c. Result Merging             │             │
+        │       │         │  5d. HybridEngine Result        │             │
-        │       │         │  • Merge by bounding box        │             │
+        │       │         │  • Compare confidences          │             │
        │       │         │  • Keep highest confidence      │             │
        │       │         │  • Graceful fallback on error   │             │
        │       │         └─────────────────────────────────┘             │
        │       │                                                         │
        │       └─────────────────────────────────────────────────────────┘
@@ -257,10 +270,10 @@
 | Component              | Tool                  | Purpose                              |
 |------------------------|-----------------------|--------------------------------------|
-| **Primary OCR**        | Tesseract 5.x         | Fast, reliable text extraction       |
+| **Primary OCR**        | PaddleOCR PP-OCRv4    | Highest accuracy scene text, CPU-only |
-| **Python Binding**     | pytesseract           | Tesseract Python wrapper             |
+| **Cloud Fallback**     | Google Vision API     | Optional cloud fallback (disabled by default) |
-| **Fallback OCR**       | PaddleOCR             | Higher accuracy, better tables       |
+| **Backward Compat**    | Tesseract 5.x / pytesseract | Legacy engine, configurable via env var |
-| **Layout Analysis**    | PaddleOCR / LayoutParser | Document structure detection      |
+| **Engine Abstraction** | `OcrEngine` ABC       | Pluggable engine interface in `ocr/app/engines/` |
 ### Data Extraction
@@ -291,85 +304,93 @@
 fastapi>=0.100.0
 uvicorn[standard]>=0.23.0
 python-multipart>=0.0.6
-
+pydantic>=2.0.0
 # Task Queue
 celery>=5.3.0
 redis>=4.6.0
 # File Detection & Handling
 python-magic>=0.4.27
 pillow>=10.0.0
 pillow-heif>=0.13.0
 # PDF Processing
 pymupdf>=1.23.0
 # Image Preprocessing
 opencv-python-headless>=4.8.0
 deskew>=1.4.0
 scikit-image>=0.21.0
 numpy>=1.24.0
 # OCR Engines
 pytesseract>=0.3.10
-paddlepaddle>=2.5.0
+paddlepaddle>=2.6.0
-paddleocr>=2.7.0
+paddleocr>=2.8.0
 google-cloud-vision>=3.7.0
-# Table Extraction
+# PDF Processing
-img2table>=1.2.0
+PyMuPDF>=1.23.0
 camelot-py[cv]>=0.11.0
-# NLP & Data
+# Redis for job queue
-spacy>=3.6.0
+redis>=5.0.0
 pandas>=2.0.0
-# Storage & Database
+# HTTP client for callbacks
-boto3>=1.28.0
+httpx>=0.24.0
-psycopg2-binary>=2.9.0
+
-sqlalchemy>=2.0.0
+# Testing
 pytest>=7.4.0
 pytest-asyncio>=0.21.0
 ```
 ### System Package Requirements (Ubuntu/Debian)
 ```bash
-# Tesseract OCR
+# Tesseract OCR (backward compatibility engine)
-apt-get install tesseract-ocr tesseract-ocr-eng libtesseract-dev
+apt-get install tesseract-ocr tesseract-ocr-eng
 # PaddlePaddle OpenMP runtime
 apt-get install libgomp1
 # HEIC Support
-apt-get install libheif-examples libheif-dev
+apt-get install libheif1 libheif-dev
-# OpenCV dependencies
+# GLib (OpenCV dependency)
-apt-get install libgl1-mesa-glx libglib2.0-0
+apt-get install libglib2.0-0
-# PDF rendering dependencies
+# File type detection
-apt-get install libmupdf-dev mupdf-tools
+apt-get install libmagic1
 # Image processing
 apt-get install libmagic1 ghostscript
 # Camelot dependencies
 apt-get install ghostscript python3-tk
 ```
 ### Environment Variables
 | Variable | Default | Description |
 |----------|---------|-------------|
 | `OCR_PRIMARY_ENGINE` | `paddleocr` | Primary OCR engine (`paddleocr`, `tesseract`) |
 | `OCR_CONFIDENCE_THRESHOLD` | `0.6` | Minimum confidence threshold |
 | `OCR_FALLBACK_ENGINE` | `none` | Fallback engine (`google_vision`, `none`) |
 | `OCR_FALLBACK_THRESHOLD` | `0.6` | Confidence below this triggers fallback |
 | `GOOGLE_VISION_KEY_PATH` | `/run/secrets/google-vision-key.json` | Path to Google Vision service account key |
 ---
 ## DOCKERFILE
 ```dockerfile
-FROM python:3.11-slim
+# Primary engine: PaddleOCR PP-OCRv4 (models baked into image)
 # Backward compat: Tesseract 5.x (optional, via TesseractEngine)
 # Cloud fallback: Google Vision (optional, requires API key at runtime)
 FROM python:3.13-slim
 # System dependencies
 # - tesseract-ocr/eng: Backward-compatible OCR engine
 # - libgomp1: OpenMP runtime required by PaddlePaddle
 # - libheif1/libheif-dev: HEIF image support (iPhone photos)
 # - libglib2.0-0: GLib shared library (OpenCV dependency)
 # - libmagic1: File type detection
 # - curl: Health check endpoint
 RUN apt-get update && apt-get install -y --no-install-recommends \
    tesseract-ocr \
    tesseract-ocr-eng \
-    libtesseract-dev \
+    libgomp1 \
-    libheif-examples \
+    libheif1 \
    libheif-dev \
    libgl1-mesa-glx \
    libglib2.0-0 \
    libmagic1 \
-    ghostscript \
+    curl \
    poppler-utils \
    && rm -rf /var/lib/apt/lists/*
 # Python dependencies
@@ -377,11 +398,9 @@ WORKDIR /app
 COPY requirements.txt .
 RUN pip install --no-cache-dir -r requirements.txt
-# Download spaCy model
+# Pre-download PaddleOCR PP-OCRv4 models during build (not at runtime)
-RUN python -m spacy download en_core_web_sm
+RUN python -c "from paddleocr import PaddleOCR; PaddleOCR(use_angle_cls=True, lang='en', use_gpu=False, show_log=False)" \
-
+    && echo "PaddleOCR PP-OCRv4 models downloaded and verified"
 # Download PaddleOCR models (cached in image)
 RUN python -c "from paddleocr import PaddleOCR; PaddleOCR(use_angle_cls=True, lang='en')"
 COPY . .
--- a/frontend/.claude/tdd-guard/data/test.json
+++ b/frontend/.claude/tdd-guard/data/test.json
@@ -1,5 +1,116 @@
 {
-  "testModules": [],
+  "testModules": [
    {
      "moduleId": "/Users/egullickson/Documents/Technology/coding/motovaultpro/frontend/src/shared/components/CameraCapture/CameraCapture.test.tsx",
      "tests": [
        {
          "name": "shows loading state while requesting permission",
          "fullName": "CameraCapture Permission handling shows loading state while requesting permission",
          "state": "passed"
        },
        {
          "name": "shows error when permission denied",
          "fullName": "CameraCapture Permission handling shows error when permission denied",
          "state": "passed"
        },
        {
          "name": "shows error when camera unavailable",
          "fullName": "CameraCapture Permission handling shows error when camera unavailable",
          "state": "passed"
        },
        {
          "name": "shows viewfinder when camera access granted",
          "fullName": "CameraCapture Viewfinder shows viewfinder when camera access granted",
          "state": "passed"
        },
        {
          "name": "shows cancel button in viewfinder",
          "fullName": "CameraCapture Viewfinder shows cancel button in viewfinder",
          "state": "passed"
        },
        {
          "name": "calls onCancel when cancel button clicked",
          "fullName": "CameraCapture Viewfinder calls onCancel when cancel button clicked",
          "state": "passed"
        },
        {
          "name": "shows VIN guidance when guidanceType is vin",
          "fullName": "CameraCapture Guidance overlay shows VIN guidance when guidanceType is vin",
          "state": "passed"
        },
        {
          "name": "shows receipt guidance when guidanceType is receipt",
          "fullName": "CameraCapture Guidance overlay shows receipt guidance when guidanceType is receipt",
          "state": "passed"
        },
        {
          "name": "shows upload file button in viewfinder",
          "fullName": "CameraCapture File fallback shows upload file button in viewfinder",
          "state": "passed"
        },
        {
          "name": "switches to file fallback when upload file clicked",
          "fullName": "CameraCapture File fallback switches to file fallback when upload file clicked",
          "state": "passed"
        },
        {
          "name": "renders upload area",
          "fullName": "FileInputFallback renders upload area",
          "state": "passed"
        },
        {
          "name": "shows accepted formats",
          "fullName": "FileInputFallback shows accepted formats",
          "state": "passed"
        },
        {
          "name": "shows max file size",
          "fullName": "FileInputFallback shows max file size",
          "state": "passed"
        },
        {
          "name": "calls onCancel when cancel clicked",
          "fullName": "FileInputFallback calls onCancel when cancel clicked",
          "state": "passed"
        },
        {
          "name": "shows error for invalid file type",
          "fullName": "FileInputFallback shows error for invalid file type",
          "state": "passed"
        },
        {
          "name": "shows error for file too large",
          "fullName": "FileInputFallback shows error for file too large",
          "state": "passed"
        },
        {
          "name": "calls onFileSelect with valid file",
          "fullName": "FileInputFallback calls onFileSelect with valid file",
          "state": "passed"
        },
        {
          "name": "renders nothing when type is none",
          "fullName": "GuidanceOverlay renders nothing when type is none",
          "state": "passed"
        },
        {
          "name": "renders VIN guidance with correct description",
          "fullName": "GuidanceOverlay renders VIN guidance with correct description",
          "state": "passed"
        },
        {
          "name": "renders receipt guidance with correct description",
          "fullName": "GuidanceOverlay renders receipt guidance with correct description",
          "state": "passed"
        },
        {
          "name": "renders document guidance with correct description",
          "fullName": "GuidanceOverlay renders document guidance with correct description",
          "state": "passed"
        }
      ]
    }
  ],
  "unhandledErrors": [],
-  "reason": "failed"
+  "reason": "passed"
 }
--- a/frontend/src/shared/components/CameraCapture/useImageCrop.ts
+++ b/frontend/src/shared/components/CameraCapture/useImageCrop.ts
@@ -95,10 +95,6 @@ export function useImageCrop(options: UseImageCropOptions = {}): UseImageCropRet
  const drawOriginRef = useRef({ x: 0, y: 0 });
  const cropAreaRef = useRef(cropArea);
-  useEffect(() => {
-    cropAreaRef.current = cropArea;
-  }, [cropArea]);
  const setCropArea = useCallback(
    (area: CropArea) => {
      setCropAreaState(getAspectRatioAdjustedCrop(area));
@@ -177,7 +173,9 @@ export function useImageCrop(options: UseImageCropOptions = {}): UseImageCropRet
      startPosRef.current = { x: clientX, y: clientY };
      drawOriginRef.current = { x, y };
-      setCropAreaState({ x, y, width: 0, height: 0 });
+      const initial = { x, y, width: 0, height: 0 };
+      setCropAreaState(initial);
+      cropAreaRef.current = initial;
      isDrawingRef.current = true;
      activeHandleRef.current = null;
@@ -203,18 +201,24 @@ export function useImageCrop(options: UseImageCropOptions = {}): UseImageCropRet
        const originX = drawOriginRef.current.x;
        const originY = drawOriginRef.current.y;
-        let newCrop: CropArea = {
+        const drawnWidth = Math.abs(currentX - originX);
+        const drawnHeight = aspectRatio
+          ? drawnWidth / aspectRatio
+          : Math.abs(currentY - originY);
+        let drawnY = Math.min(originY, currentY);
+        // Clamp so crop doesn't exceed container bounds when aspect ratio forces height
+        if (aspectRatio && drawnY + drawnHeight > 100) {
+          drawnY = Math.max(0, 100 - drawnHeight);
+        }
+        const newCrop: CropArea = {
          x: Math.min(originX, currentX),
-          y: Math.min(originY, currentY),
+          y: drawnY,
-          width: Math.abs(currentX - originX),
+          width: drawnWidth,
-          height: Math.abs(currentY - originY),
+          height: drawnHeight,
        };
-        if (aspectRatio) {
-          newCrop.height = newCrop.width / aspectRatio;
-        }
        setCropAreaState(newCrop);
+        cropAreaRef.current = newCrop;
        return;
      }
@@ -303,7 +307,9 @@ export function useImageCrop(options: UseImageCropOptions = {}): UseImageCropRet
          break;
      }
-      setCropAreaState(constrainCrop(newCrop));
+      const constrained = constrainCrop(newCrop);
+      setCropAreaState(constrained);
+      cropAreaRef.current = constrained;
    },
    [isDragging, constrainCrop, aspectRatio]
  );
@@ -312,13 +318,17 @@ export function useImageCrop(options: UseImageCropOptions = {}): UseImageCropRet
    if (isDrawingRef.current) {
      isDrawingRef.current = false;
      const area = cropAreaRef.current;
-      if (area.width >= minSize && area.height >= minSize) {
+      // When aspect ratio constrains one dimension, only check the free dimension
+      const meetsMinSize = aspectRatio
+        ? area.width >= minSize
+        : area.width >= minSize && area.height >= minSize;
+      if (meetsMinSize) {
        setCropDrawn(true);
      }
    }
    activeHandleRef.current = null;
    setIsDragging(false);
-  }, [minSize]);
+  }, [minSize, aspectRatio]);
  // Add global event listeners for drag
  useEffect(() => {
--- a/ocr/CLAUDE.md
+++ b/ocr/CLAUDE.md
@@ -1,10 +1,12 @@
 # ocr/
 Python OCR microservice. Primary engine: PaddleOCR PP-OCRv4 with optional Google Vision cloud fallback. Pluggable engine abstraction in `app/engines/`.
 ## Files
 | File | What | When to read |
 | ---- | ---- | ------------ |
-| `Dockerfile` | Container build definition | Docker builds, deployment |
+| `Dockerfile` | Container build (PaddleOCR models baked in) | Docker builds, deployment |
 | `requirements.txt` | Python dependencies | Adding dependencies |
 ## Subdirectories
@@ -12,4 +14,5 @@
 | Directory | What | When to read |
 | --------- | ---- | ------------ |
 | `app/` | FastAPI application source | OCR endpoint development |
 | `app/engines/` | Engine abstraction layer (OcrEngine ABC, factory, hybrid) | Adding or changing OCR engines |
 | `tests/` | Test suite | Adding or modifying tests |
--- a/ocr/Dockerfile
+++ b/ocr/Dockerfile
@@ -1,5 +1,9 @@
 # Production Dockerfile for MotoVaultPro OCR Service
 # Uses mirrored base images from Gitea Package Registry
 #
 # Primary engine: PaddleOCR PP-OCRv4 (models baked into image)
 # Backward compat: Tesseract 5.x (optional, via TesseractEngine)
 # Cloud fallback: Google Vision (optional, requires API key at runtime)
 # Build argument for registry (defaults to Gitea mirrors, falls back to Docker Hub)
 ARG REGISTRY_MIRRORS=git.motovaultpro.com/egullickson/mirrors
@@ -7,10 +11,16 @@ ARG REGISTRY_MIRRORS=git.motovaultpro.com/egullickson/mirrors
 FROM ${REGISTRY_MIRRORS}/python:3.13-slim
 # System dependencies
 # - tesseract-ocr/eng: Backward-compatible OCR engine (used by TesseractEngine)
 # - libgomp1: OpenMP runtime required by PaddlePaddle
 # - libheif1/libheif-dev: HEIF image support (iPhone photos)
 # - libglib2.0-0: GLib shared library (OpenCV dependency)
 # - libmagic1: File type detection
 # - curl: Health check endpoint
 RUN apt-get update && apt-get install -y --no-install-recommends \
    tesseract-ocr \
    tesseract-ocr-eng \
-    libtesseract-dev \
+    libgomp1 \
    libheif1 \
    libheif-dev \
    libglib2.0-0 \
@@ -23,6 +33,12 @@ WORKDIR /app
 COPY requirements.txt .
 RUN pip install --no-cache-dir -r requirements.txt
 # Pre-download PaddleOCR PP-OCRv4 models during build (not at runtime).
 # Models are baked into the image so container starts are fast and
 # no network access is needed at runtime for model download.
 RUN python -c "from paddleocr import PaddleOCR; PaddleOCR(use_angle_cls=True, lang='en', use_gpu=False, show_log=False)" \
    && echo "PaddleOCR PP-OCRv4 models downloaded and verified"
 COPY . .
 EXPOSE 8000
--- a/ocr/app/CLAUDE.md
+++ b/ocr/app/CLAUDE.md
@@ -12,6 +12,7 @@
 | Directory | What | When to read |
 | --------- | ---- | ------------ |
 | `engines/` | OCR engine abstraction (PaddleOCR primary, Google Vision fallback, Tesseract compat) | Engine changes, adding new engines |
 | `extractors/` | Data extraction logic | Adding new extraction types |
 | `models/` | Data models and schemas | Request/response types |
 | `patterns/` | Regex and parsing patterns | Pattern matching rules |
--- a/ocr/app/config.py
+++ b/ocr/app/config.py
@@ -17,6 +17,15 @@ class Settings:
            os.getenv("OCR_CONFIDENCE_THRESHOLD", "0.6")
        )
+        # Cloud fallback configuration (disabled by default)
+        self.ocr_fallback_engine: str = os.getenv("OCR_FALLBACK_ENGINE", "none")
+        self.ocr_fallback_threshold: float = float(
+            os.getenv("OCR_FALLBACK_THRESHOLD", "0.6")
+        )
+        self.google_vision_key_path: str = os.getenv(
+            "GOOGLE_VISION_KEY_PATH", "/run/secrets/google-vision-key.json"
+        )
        # Redis configuration for job queue
        self.redis_host: str = os.getenv("REDIS_HOST", "mvp-redis")
        self.redis_port: int = int(os.getenv("REDIS_PORT", "6379"))
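Review note: the fallback settings above feed the hybrid flow described in the `docs/ocr-pipeline-tech-stack.md` diagram (run the primary engine, consult the fallback only when confidence is below the threshold, guard the call with a timeout, keep the higher-confidence result, and fall back gracefully on error). The actual `HybridEngine` is not shown in this section, so the following is only a minimal sketch of that decision logic under stated assumptions: `Result` and `hybrid_recognize` are hypothetical names, engines are modelled as plain callables, and the 5-second guard is implemented here with a worker thread, which may differ from the real code.

```python
from __future__ import annotations

from concurrent.futures import ThreadPoolExecutor
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class Result:
    """Hypothetical stand-in for OcrEngineResult (only the fields used here)."""
    text: str
    confidence: float
    engine_name: str


# Hypothetical engine signature: image bytes in, Result out.
Recognizer = Callable[[bytes], Result]


def hybrid_recognize(
    image_bytes: bytes,
    primary: Recognizer,
    fallback: Optional[Recognizer] = None,
    threshold: float = 0.6,   # mirrors the OCR_FALLBACK_THRESHOLD default
    timeout_s: float = 5.0,   # mirrors the documented 5-second guard
) -> Result:
    primary_result = primary(image_bytes)

    # No fallback configured, or primary is confident enough: done.
    if fallback is None or primary_result.confidence >= threshold:
        return primary_result

    executor = ThreadPoolExecutor(max_workers=1)
    try:
        future = executor.submit(fallback, image_bytes)
        fallback_result = future.result(timeout=timeout_s)
    except Exception:
        # Timeout or engine failure: keep the primary result.
        return primary_result
    finally:
        # Do not block on a fallback call that is still running.
        executor.shutdown(wait=False)

    # Keep whichever result reports the higher confidence.
    if fallback_result.confidence > primary_result.confidence:
        return fallback_result
    return primary_result
```

One consequence worth noting for review: because `CloudEngine` assigns a fixed 0.95 confidence to every word, a fallback result that contains any text will normally win this comparison whenever the fallback is triggered.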
--- a/ocr/app/engines/__init__.py
+++ b/ocr/app/engines/__init__.py
@@ -2,6 +2,12 @@
 Provides a pluggable engine interface for OCR processing,
 decoupling extractors from specific OCR libraries.
+Engines:
+  - PaddleOcrEngine: PaddleOCR PP-OCRv4 (primary, CPU-only)
+  - TesseractEngine: pytesseract wrapper (backward compatibility)
+  - CloudEngine: Google Vision TEXT_DETECTION (optional cloud fallback)
+  - HybridEngine: Primary + fallback with confidence threshold
 """
 from app.engines.base_engine import (
--- a/ocr/app/engines/cloud_engine.py
+++ b/ocr/app/engines/cloud_engine.py
@@ -0,0 +1,166 @@
 """Google Vision cloud OCR engine with lazy initialization."""
 import logging
 import os
 from typing import Any
 from app.engines.base_engine import (
    EngineProcessingError,
    EngineUnavailableError,
    OcrConfig,
    OcrEngine,
    OcrEngineResult,
    WordBox,
 )
 logger = logging.getLogger(__name__)
 # Default path for Google Vision service account key (Docker secret mount)
 _DEFAULT_KEY_PATH = "/run/secrets/google-vision-key.json"
 class CloudEngine(OcrEngine):
    """Google Vision TEXT_DETECTION wrapper with lazy initialization.
    The client is not created until the first ``recognize()`` call,
    so the container starts normally even when the secret file is
    missing or the dependency is not installed.
    """
    def __init__(self, key_path: str | None = None) -> None:
        self._key_path = key_path or os.getenv(
            "GOOGLE_VISION_KEY_PATH", _DEFAULT_KEY_PATH
        )
        self._client: Any | None = None
    @property
    def name(self) -> str:
        return "google_vision"
    # ------------------------------------------------------------------
    # Lazy init
    # ------------------------------------------------------------------
    def _get_client(self) -> Any:
        """Create the Vision client on first use."""
        if self._client is not None:
            return self._client
        # Verify credentials file exists
        if not os.path.isfile(self._key_path):
            raise EngineUnavailableError(
                f"Google Vision key not found at {self._key_path}. "
                "Set GOOGLE_VISION_KEY_PATH or mount the secret."
            )
        try:
            from google.cloud import vision  # type: ignore[import-untyped]
            # Point the SDK at the service account key
            os.environ["GOOGLE_APPLICATION_CREDENTIALS"] = self._key_path
            self._client = vision.ImageAnnotatorClient()
            logger.info(
                "Google Vision client initialized (key: %s)", self._key_path
            )
            return self._client
        except ImportError as exc:
            raise EngineUnavailableError(
                "google-cloud-vision is not installed. "
                "Install with: pip install google-cloud-vision"
            ) from exc
        except Exception as exc:
            raise EngineUnavailableError(
                f"Failed to initialize Google Vision client: {exc}"
            ) from exc
    # ------------------------------------------------------------------
    # OCR
    # ------------------------------------------------------------------
    def recognize(self, image_bytes: bytes, config: OcrConfig) -> OcrEngineResult:
        """Run Google Vision TEXT_DETECTION on image bytes."""
        client = self._get_client()
        try:
            from google.cloud import vision  # type: ignore[import-untyped]
            image = vision.Image(content=image_bytes)
            response = client.text_detection(image=image)
            if response.error.message:
                raise EngineProcessingError(
                    f"Google Vision API error: {response.error.message}"
                )
            annotations = response.text_annotations
            if not annotations:
                return OcrEngineResult(
                    text="",
                    confidence=0.0,
                    word_boxes=[],
                    engine_name=self.name,
                )
            # First annotation is the full-page text; the rest are words
            full_text = annotations[0].description.strip()
            word_boxes: list[WordBox] = []
            confidences: list[float] = []
            for annotation in annotations[1:]:
                text = annotation.description
                vertices = annotation.bounding_poly.vertices
                # Apply character whitelist filter if configured
                if config.char_whitelist:
                    allowed = set(config.char_whitelist)
                    text = "".join(ch for ch in text if ch in allowed)
                if not text.strip():
                    continue
                xs = [v.x for v in vertices]
                ys = [v.y for v in vertices]
                x_min, y_min = min(xs), min(ys)
                x_max, y_max = max(xs), max(ys)
                # Google Vision TEXT_DETECTION does not return per-word
                # confidence in annotations.  Use 0.95 as the documented
                # typical accuracy for clear images so comparisons with
                # PaddleOCR are meaningful.
                word_conf = 0.95
                word_boxes.append(
                    WordBox(
                        text=text.strip(),
                        confidence=word_conf,
                        x=x_min,
                        y=y_min,
                        width=x_max - x_min,
                        height=y_max - y_min,
                    )
                )
                confidences.append(word_conf)
            # Apply whitelist to full text too
            if config.char_whitelist:
                allowed = set(config.char_whitelist)
                full_text = "".join(
                    ch for ch in full_text if ch in allowed or ch in " \n"
                )
            avg_confidence = (
                sum(confidences) / len(confidences) if confidences else 0.0
            )
            return OcrEngineResult(
                text=full_text,
                confidence=avg_confidence,
                word_boxes=word_boxes,
                engine_name=self.name,
            )
        except (EngineUnavailableError, EngineProcessingError):
            raise
        except Exception as exc:
            raise EngineProcessingError(
                f"Google Vision recognition failed: {exc}"
            ) from exc
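Review note: two small transformations inside `recognize()` are easy to check in isolation: collapsing a bounding polygon into an axis-aligned `x, y, width, height` box, and filtering text through the character whitelist (the full-page text additionally keeps spaces and newlines). The sketch below restates them as standalone helpers; `Vertex`, `polygon_to_box`, and `apply_whitelist` are hypothetical names used only for illustration and are not part of the service.

```python
from dataclasses import dataclass
from typing import Iterable, Tuple


@dataclass
class Vertex:
    """Hypothetical stand-in for a bounding-polygon vertex with x/y fields."""
    x: int
    y: int


def polygon_to_box(vertices: Iterable[Vertex]) -> Tuple[int, int, int, int]:
    """Return (x, y, width, height) of the axis-aligned box around a polygon."""
    points = list(vertices)
    xs = [v.x for v in points]
    ys = [v.y for v in points]
    x_min, y_min = min(xs), min(ys)
    return x_min, y_min, max(xs) - x_min, max(ys) - y_min


def apply_whitelist(text: str, whitelist: str, always_keep: str = "") -> str:
    """Drop every character that is in neither the whitelist nor always_keep."""
    allowed = set(whitelist) | set(always_keep)
    return "".join(ch for ch in text if ch in allowed)


# A slightly rotated word: the box spans the extreme vertex coordinates.
box = polygon_to_box([Vertex(10, 20), Vertex(110, 22), Vertex(108, 52), Vertex(8, 50)])

# Per-word filtering drops separators; full-text filtering keeps spaces/newlines.
word = apply_whitelist("AB-12 cd", "AB12")
full = apply_whitelist("AB-12 cd", "AB12", always_keep=" \n")
```

Because the polygon is reduced to its extremes, rotated text yields a box larger than the glyphs themselves, which matters if downstream extractors rely on tight word boxes.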
--- a/ocr/app/engines/engine_factory.py
+++ b/ocr/app/engines/engine_factory.py
@@ -1,5 +1,6 @@
 """Factory function for creating OCR engine instances from configuration."""
+import importlib
 import logging
 from app.config import settings
@@ -7,28 +8,16 @@ from app.engines.base_engine import EngineUnavailableError, OcrEngine
 logger = logging.getLogger(__name__)
-# Valid engine identifiers
+# Valid engine identifiers (primary engines only; hybrid is constructed separately)
 _ENGINE_REGISTRY: dict[str, str] = {
    "paddleocr": "app.engines.paddle_engine.PaddleOcrEngine",
    "tesseract": "app.engines.tesseract_engine.TesseractEngine",
    "google_vision": "app.engines.cloud_engine.CloudEngine",
 }
-def create_engine(engine_name: str | None = None) -> OcrEngine:
+def _create_single_engine(name: str) -> OcrEngine:
-    """Instantiate an OCR engine by name (defaults to config value).
+    """Instantiate a single engine by registry name."""
-    Args:
-        engine_name: Engine identifier ("paddleocr", "tesseract").
-                     Falls back to ``settings.ocr_primary_engine``.
-    Returns:
-        Initialized OcrEngine instance.
-    Raises:
-        EngineUnavailableError: If the engine cannot be loaded or initialized.
-    """
-    name = (engine_name or settings.ocr_primary_engine).lower().strip()
    if name not in _ENGINE_REGISTRY:
        raise EngineUnavailableError(
            f"Unknown engine '{name}'. Available: {list(_ENGINE_REGISTRY.keys())}"
@@ -37,8 +26,6 @@ def create_engine(engine_name: str | None = None) -> OcrEngine:
    module_path, class_name = _ENGINE_REGISTRY[name].rsplit(".", 1)
    try:
-        import importlib
        module = importlib.import_module(module_path)
        engine_cls = getattr(module, class_name)
        engine: OcrEngine = engine_cls()
@@ -50,3 +37,51 @@ def create_engine(engine_name: str | None = None) -> OcrEngine:
        raise EngineUnavailableError(
            f"Failed to create engine '{name}': {exc}"
        ) from exc
 def create_engine(engine_name: str | None = None) -> OcrEngine:
    """Instantiate an OCR engine by name (defaults to config value).
    When a fallback engine is configured (``OCR_FALLBACK_ENGINE != "none"``),
    returns a ``HybridEngine`` that wraps the primary with the fallback.
    Args:
        engine_name: Engine identifier ("paddleocr", "tesseract",
                     "google_vision").
                     Falls back to ``settings.ocr_primary_engine``.
    Returns:
        Initialized OcrEngine instance (possibly a HybridEngine wrapper).
    Raises:
        EngineUnavailableError: If the primary engine cannot be loaded.
    """
    name = (engine_name or settings.ocr_primary_engine).lower().strip()
    primary = _create_single_engine(name)
    # Check for cloud fallback configuration
    fallback_name = settings.ocr_fallback_engine.lower().strip()
    if fallback_name == "none" or not fallback_name:
        return primary
    # Create fallback engine (failure is non-fatal -- log and return primary only)
    try:
        fallback = _create_single_engine(fallback_name)
    except EngineUnavailableError as exc:
        logger.warning(
            "Fallback engine '%s' unavailable, proceeding without fallback: %s",
            fallback_name,
            exc,
        )
        return primary
    from app.engines.hybrid_engine import HybridEngine
    threshold = settings.ocr_fallback_threshold
    hybrid = HybridEngine(primary=primary, fallback=fallback, threshold=threshold)
    logger.info(
        "Created hybrid engine: primary=%s, fallback=%s, threshold=%.2f",
        name,
        fallback_name,
        threshold,
    )
    return hybrid
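The registry in this file maps short names to dotted class paths and resolves them lazily, so an optional dependency is only imported when its engine is requested. A minimal sketch of that pattern follows, using standard-library classes as stand-ins. `REGISTRY`, `load_class`, and `create` are hypothetical names for illustration, and `RuntimeError` stands in for the project's `EngineUnavailableError`.

```python
import importlib
from typing import Any, Dict

# Hypothetical registry: short name -> "package.module.ClassName".
REGISTRY: Dict[str, str] = {
    "ordered": "collections.OrderedDict",
    "missing": "no_such_module_xyz.Thing",  # simulates an uninstalled extra
}


def load_class(dotted_path: str) -> type:
    """Split "pkg.mod.Class" at the last dot, import the module, fetch the class."""
    module_path, class_name = dotted_path.rsplit(".", 1)
    module = importlib.import_module(module_path)
    return getattr(module, class_name)


def create(name: str) -> Any:
    """Instantiate a registered class, wrapping import failures in one error type."""
    key = name.lower().strip()
    if key not in REGISTRY:
        raise RuntimeError(f"Unknown entry '{key}'. Available: {list(REGISTRY)}")
    try:
        return load_class(REGISTRY[key])()
    except (ImportError, AttributeError) as exc:
        raise RuntimeError(f"Failed to create '{key}': {exc}") from exc


print(type(create(" Ordered ")).__name__)  # OrderedDict
try:
    create("missing")
except RuntimeError as exc:
    print("unavailable:", exc)
```

Because a missing module surfaces as a single error type, the caller can treat an unavailable fallback as non-fatal, which is what `create_engine` does when it logs a warning and returns the primary engine alone.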
--- a/ocr/app/engines/hybrid_engine.py
+++ b/ocr/app/engines/hybrid_engine.py
@@ -0,0 +1,116 @@
 """Hybrid OCR engine: primary engine with optional cloud fallback."""
 import logging
 import time
 from app.engines.base_engine import (
    EngineError,
    EngineProcessingError,
    OcrConfig,
    OcrEngine,
    OcrEngineResult,
 )
 logger = logging.getLogger(__name__)
 # Latency budget (seconds) for the cloud fallback.  The call itself is not
 # interrupted; a result that arrives later than this is discarded.
 _CLOUD_TIMEOUT_SECONDS = 5.0
 class HybridEngine(OcrEngine):
    """Runs a primary engine and falls back to a cloud engine when
    the primary result confidence is below the configured threshold.
    If the fallback is ``None`` (default), this engine behaves identically
    to the primary engine.  Cloud failures are handled gracefully -- the
    primary result is returned whenever the fallback is unavailable,
    times out, or errors.
    """
    def __init__(
        self,
        primary: OcrEngine,
        fallback: OcrEngine | None = None,
        threshold: float = 0.6,
    ) -> None:
        self._primary = primary
        self._fallback = fallback
        self._threshold = threshold
    @property
    def name(self) -> str:
        fallback_name = self._fallback.name if self._fallback else "none"
        return f"hybrid({self._primary.name}+{fallback_name})"
    def recognize(self, image_bytes: bytes, config: OcrConfig) -> OcrEngineResult:
        """Run primary OCR, optionally falling back to cloud engine."""
        primary_result = self._primary.recognize(image_bytes, config)
        # Happy path: primary confidence meets threshold
        if primary_result.confidence >= self._threshold:
            logger.debug(
                "Primary engine confidence %.2f >= threshold %.2f, no fallback",
                primary_result.confidence,
                self._threshold,
            )
            return primary_result
        # No fallback configured -- return primary result as-is
        if self._fallback is None:
            logger.debug(
                "Primary confidence %.2f < threshold %.2f but no fallback configured",
                primary_result.confidence,
                self._threshold,
            )
            return primary_result
        # Attempt cloud fallback; elapsed time is checked after the call returns
        logger.info(
            "Primary confidence %.2f < threshold %.2f, trying fallback (%s)",
            primary_result.confidence,
            self._threshold,
            self._fallback.name,
        )
        try:
            start = time.monotonic()
            fallback_result = self._fallback.recognize(image_bytes, config)
            elapsed = time.monotonic() - start
            if elapsed > _CLOUD_TIMEOUT_SECONDS:
                logger.warning(
                    "Cloud fallback took %.1fs (> %.1fs limit), using primary result",
                    elapsed,
                    _CLOUD_TIMEOUT_SECONDS,
                )
                return primary_result
            # Return whichever result has higher confidence
            if fallback_result.confidence > primary_result.confidence:
                logger.info(
                    "Fallback confidence %.2f > primary %.2f, using fallback result",
                    fallback_result.confidence,
                    primary_result.confidence,
                )
                return fallback_result
            logger.info(
                "Primary confidence %.2f >= fallback %.2f, keeping primary result",
                primary_result.confidence,
                fallback_result.confidence,
            )
            return primary_result
        except EngineError as exc:
            logger.warning(
                "Cloud fallback failed (%s), returning primary result: %s",
                self._fallback.name,
                exc,
            )
            return primary_result
        except Exception as exc:
            logger.warning(
                "Unexpected cloud fallback error, returning primary result: %s",
                exc,
            )
            return primary_result
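Note that `HybridEngine.recognize` measures elapsed time only after the fallback call has returned, so a hanging cloud request would still block the caller. If a hard deadline is wanted, one option is to run the fallback in a worker thread and stop waiting when the budget is spent. The sketch below is an assumption about how that could look, not part of this change; `call_with_deadline` and `slow_fallback` are hypothetical names.

```python
import time
from concurrent.futures import ThreadPoolExecutor
from concurrent.futures import TimeoutError as FutureTimeout
from typing import Callable, TypeVar

T = TypeVar("T")


def call_with_deadline(fn: Callable[[], T], timeout_s: float, default: T) -> T:
    """Run `fn` in a worker thread; return `default` if it misses the deadline.

    Exceptions raised by `fn` propagate to the caller unchanged.
    """
    executor = ThreadPoolExecutor(max_workers=1)
    future = executor.submit(fn)
    try:
        return future.result(timeout=timeout_s)
    except FutureTimeout:
        # The worker thread keeps running in the background; only the wait ends.
        return default
    finally:
        # wait=False so a slow worker does not block the caller here.
        executor.shutdown(wait=False)


def slow_fallback() -> str:
    time.sleep(0.3)  # stands in for a slow cloud call
    return "fallback"


print(call_with_deadline(lambda: "fallback", 1.0, "primary"))  # fallback
print(call_with_deadline(slow_fallback, 0.05, "primary"))      # primary
```

The trade-off is that the abandoned call still consumes a thread and, for a real cloud engine, may still be billed. Setting a request timeout on the client itself, where the library supports one, avoids that.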
--- a/ocr/app/extractors/receipt_extractor.py
+++ b/ocr/app/extractors/receipt_extractor.py
@@ -1,16 +1,13 @@
 """Receipt-specific OCR extractor with field extraction."""
 import io
 import logging
 import time
 from dataclasses import dataclass, field
 from typing import Any, Optional
 import magic
 import pytesseract
 from PIL import Image
 from pillow_heif import register_heif_opener
-from app.config import settings
+from app.engines import OcrConfig, create_engine
 from app.extractors.base import BaseExtractor
 from app.preprocessors.receipt_preprocessor import receipt_preprocessor
 from app.patterns import currency_matcher, date_matcher, fuel_matcher
@@ -53,8 +50,8 @@ class ReceiptExtractor(BaseExtractor):
    }
    def __init__(self) -> None:
-        """Initialize receipt extractor."""
+        """Initialize receipt extractor with engine from factory."""
-        pytesseract.pytesseract.tesseract_cmd = settings.tesseract_cmd
+        self._engine = create_engine()
    def extract(
        self,
@@ -150,26 +147,19 @@ class ReceiptExtractor(BaseExtractor):
        detected = mime.from_buffer(file_bytes)
        return detected or "application/octet-stream"
-    def _perform_ocr(self, image_bytes: bytes, psm: int = 6) -> str:
+    def _perform_ocr(self, image_bytes: bytes) -> str:
        """
-        Perform OCR on preprocessed image.
+        Perform OCR on preprocessed image via engine abstraction.
        Args:
            image_bytes: Preprocessed image bytes
-            psm: Tesseract page segmentation mode
-                 4 = Assume single column of text
-                 6 = Uniform block of text
        Returns:
            Raw OCR text
        """
-        image = Image.open(io.BytesIO(image_bytes))
+        config = OcrConfig()
-
+        result = self._engine.recognize(image_bytes, config)
-        # Configure Tesseract for receipt OCR
+        return result.text
-        # PSM 4 works well for columnar receipt text
-        config = f"--psm {psm}"
-        return pytesseract.image_to_string(image, config=config)
    def _detect_receipt_type(self, text: str) -> str:
        """
--- a/ocr/app/extractors/vin_extractor.py
+++ b/ocr/app/extractors/vin_extractor.py
@@ -1,5 +1,4 @@
 """VIN-specific OCR extractor with preprocessing and validation."""
 import io
 import logging
 import os
 import time
@@ -8,11 +7,10 @@ from datetime import datetime
 from typing import Optional
 import magic
 import pytesseract
 from PIL import Image
 from pillow_heif import register_heif_opener
 from app.config import settings
 from app.engines import OcrConfig, create_engine
 from app.extractors.base import BaseExtractor
 from app.preprocessors.vin_preprocessor import vin_preprocessor, BoundingBox
 from app.validators.vin_validator import vin_validator
@@ -56,15 +54,15 @@ class VinExtractor(BaseExtractor):
        "image/heif",
    }
-    # VIN character whitelist for Tesseract
+    # VIN character whitelist (passed to engine for post-OCR filtering)
    VIN_WHITELIST = "ABCDEFGHJKLMNPRSTUVWXYZ0123456789"
    # Fixed debug output directory (inside container)
    DEBUG_DIR = "/tmp/vin-debug"
    def __init__(self) -> None:
-        """Initialize VIN extractor."""
+        """Initialize VIN extractor with engine from factory."""
-        pytesseract.pytesseract.tesseract_cmd = settings.tesseract_cmd
+        self._engine = create_engine()
        self._debug = settings.log_level.upper() == "DEBUG"
    def _save_debug_image(self, session_dir: str, name: str, data: bytes) -> None:
@@ -135,21 +133,21 @@ class VinExtractor(BaseExtractor):
            # Perform OCR with VIN-optimized settings
            raw_text, word_confidences = self._perform_ocr(preprocessed_bytes)
-            logger.debug("PSM 6 raw text: '%s'", raw_text)
+            logger.debug("Primary OCR raw text: '%s'", raw_text)
-            logger.debug("PSM 6 word confidences: %s", word_confidences)
+            logger.debug("Primary OCR word confidences: %s", word_confidences)
            # Extract VIN candidates from raw text
            candidates = vin_validator.extract_candidates(raw_text)
-            logger.debug("PSM 6 candidates: %s", candidates)
+            logger.debug("Primary OCR candidates: %s", candidates)
            if not candidates:
-                # No VIN candidates found - try with different PSM modes
+                # No VIN candidates found - try alternate OCR configurations
                candidates = self._try_alternate_ocr(preprocessed_bytes)
            if not candidates:
-                # Try grayscale-only (no thresholding) — the Tesseract
+                # Try grayscale-only (no thresholding) — OCR engines often
-                # LSTM engine often performs better on non-binarized input
+                # perform better on non-binarized input because they do
-                # because it does its own internal preprocessing.
+                # their own internal preprocessing.
                gray_result = vin_preprocessor.preprocess(
                    image_bytes, apply_threshold=False
                )
@@ -166,9 +164,9 @@ class VinExtractor(BaseExtractor):
                raw_text, word_confidences = self._perform_ocr(
                    gray_result.image_bytes
                )
-                logger.debug("Gray PSM 6 raw text: '%s'", raw_text)
+                logger.debug("Gray primary raw text: '%s'", raw_text)
                candidates = vin_validator.extract_candidates(raw_text)
-                logger.debug("Gray PSM 6 candidates: %s", candidates)
+                logger.debug("Gray primary candidates: %s", candidates)
                if not candidates:
                    candidates = self._try_alternate_ocr(
                        gray_result.image_bytes, prefix="Gray"
@@ -188,9 +186,9 @@ class VinExtractor(BaseExtractor):
                    )
                raw_text, word_confidences = self._perform_ocr(otsu_result.image_bytes)
-                logger.debug("Otsu PSM 6 raw text: '%s'", raw_text)
+                logger.debug("Otsu primary raw text: '%s'", raw_text)
                candidates = vin_validator.extract_candidates(raw_text)
-                logger.debug("Otsu PSM 6 candidates: %s", candidates)
+                logger.debug("Otsu primary candidates: %s", candidates)
                if not candidates:
                    candidates = self._try_alternate_ocr(
                        otsu_result.image_bytes, prefix="Otsu"
@@ -280,52 +278,31 @@ class VinExtractor(BaseExtractor):
        return detected or "application/octet-stream"
    def _perform_ocr(
-        self, image_bytes: bytes, psm: int = 6
+        self,
+        image_bytes: bytes,
+        single_line: bool = False,
+        single_word: bool = False,
    ) -> tuple[str, list[float]]:
        """
-        Perform OCR with VIN-optimized settings.
+        Perform OCR with VIN-optimized settings via engine abstraction.
        Args:
            image_bytes: Preprocessed image bytes
-            psm: Tesseract page segmentation mode
+            single_line: Treat image as a single text line
-                 6 = Uniform block of text
+            single_word: Treat image as a single word
-                 7 = Single text line
-                 8 = Single word
        Returns:
            Tuple of (raw_text, word_confidences)
        """
-        image = Image.open(io.BytesIO(image_bytes))
+        config = OcrConfig(
-
+            char_whitelist=self.VIN_WHITELIST,
-        # Configure Tesseract for VIN extraction
+            single_line=single_line,
-        # OEM 1 = LSTM neural network engine (best accuracy)
+            single_word=single_word,
-        # NOTE: tessedit_char_whitelist does NOT work with OEM 1 (LSTM).
+            use_angle_cls=True,
-        # Using it causes empty/erratic output.  Character filtering is
-        # handled post-OCR by vin_validator.correct_ocr_errors() instead.
-        config = (
-            f"--psm {psm} "
-            f"--oem 1 "
-            f"-c load_system_dawg=false "
-            f"-c load_freq_dawg=false"
        )
-
+        result = self._engine.recognize(image_bytes, config)
-        # Get detailed OCR data
+        word_confidences = [wb.confidence for wb in result.word_boxes]
-        ocr_data = pytesseract.image_to_data(
+        return result.text, word_confidences
-            image, config=config, output_type=pytesseract.Output.DICT
-        )
-        # Extract words and confidences
-        words = []
-        confidences = []
-        for i, text in enumerate(ocr_data["text"]):
-            conf = int(ocr_data["conf"][i])
-            if text.strip() and conf > 0:
-                words.append(text.strip())
-                confidences.append(conf / 100.0)
-        raw_text = " ".join(words)
-        return raw_text, confidences
    def _try_alternate_ocr(
        self,
@@ -335,21 +312,25 @@ class VinExtractor(BaseExtractor):
        """
        Try alternate OCR configurations when initial extraction fails.
-        PSM modes tried in order:
+        Modes tried:
-            7  - Single text line
+            single-line - Treat as a single text line
-            8  - Single word
+            single-word - Treat as a single word
-            11 - Sparse text (finds text in any order, good for angled photos)
+
-            13 - Raw line (no Tesseract heuristics, good for clean VIN plates)
+        For PaddleOCR, angle classification handles rotated/angled text
+        inherently, replacing the need for Tesseract PSM mode fallbacks.
        Returns:
            List of VIN candidates
        """
        tag = f"{prefix} " if prefix else ""
-        for psm in (7, 8, 11, 13):
+        for mode_name, kwargs in [
-            raw_text, _ = self._perform_ocr(image_bytes, psm=psm)
+            ("single-line", {"single_line": True}),
-            logger.debug("%sPSM %d raw text: '%s'", tag, psm, raw_text)
+            ("single-word", {"single_word": True}),
+        ]:
+            raw_text, _ = self._perform_ocr(image_bytes, **kwargs)
+            logger.debug("%s%s raw text: '%s'", tag, mode_name, raw_text)
            candidates = vin_validator.extract_candidates(raw_text)
-            logger.debug("%sPSM %d candidates: %s", tag, psm, candidates)
+            logger.debug("%s%s candidates: %s", tag, mode_name, candidates)
            if candidates:
                return candidates
--- a/ocr/app/services/ocr_service.py
+++ b/ocr/app/services/ocr_service.py
@@ -1,15 +1,14 @@
-"""Core OCR service using Tesseract with HEIC support."""
+"""Core OCR service with HEIC support, using pluggable engine abstraction."""
 import io
 import logging
 import time
 from typing import Optional
 import magic
 import pytesseract
 from PIL import Image
 from pillow_heif import register_heif_opener
-from app.config import settings
+from app.engines import OcrConfig, create_engine
 from app.models import DocumentType, ExtractedField, OcrResponse
 from app.services.preprocessor import preprocessor
@@ -32,8 +31,8 @@ class OcrService:
    }
    def __init__(self) -> None:
-        """Initialize OCR service."""
+        """Initialize OCR service with engine from factory."""
-        pytesseract.pytesseract.tesseract_cmd = settings.tesseract_cmd
+        self._engine = create_engine()
    def extract(
        self,
@@ -86,14 +85,11 @@ class OcrService:
                    file_bytes, deskew=True, denoise=True
                )
-            # Perform OCR
+            # Perform OCR via engine abstraction
-            image = Image.open(io.BytesIO(file_bytes))
+            config = OcrConfig()
-            ocr_data = pytesseract.image_to_data(
+            result = self._engine.recognize(file_bytes, config)
-                image, output_type=pytesseract.Output.DICT
+            raw_text = result.text
-            )
+            confidence = result.confidence
-            # Extract text and calculate confidence
-            raw_text, confidence = self._process_ocr_data(ocr_data)
            # Detect document type from content
            document_type = self._detect_document_type(raw_text)
@@ -160,26 +156,6 @@ class OcrService:
        return b""
-    def _process_ocr_data(
-        self, ocr_data: dict
-    ) -> tuple[str, float]:
-        """Process Tesseract output to extract text and confidence."""
-        words = []
-        confidences = []
-        for i, text in enumerate(ocr_data["text"]):
-            # Filter out empty strings and low-confidence results
-            conf = int(ocr_data["conf"][i])
-            if text.strip() and conf > 0:
-                words.append(text)
-                confidences.append(conf)
-        raw_text = " ".join(words)
-        avg_confidence = sum(confidences) / len(confidences) if confidences else 0.0
-        # Normalize confidence to 0-1 range (Tesseract returns 0-100)
-        return raw_text, avg_confidence / 100.0
    def _detect_document_type(self, text: str) -> DocumentType:
        """Detect document type from extracted text content."""
        text_lower = text.lower()
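With `_process_ocr_data` removed from the service, normalizing Tesseract's 0-100 word confidences to the 0.0-1.0 range becomes the engine layer's job. The sketch below shows that normalization on a plain dictionary shaped like the `image_to_data` output used in the removed code. `summarize_word_data` is a hypothetical name, and accepting string confidences via `float()` is an assumption made for robustness, not something this diff confirms.

```python
from typing import Dict, List, Tuple


def summarize_word_data(ocr_data: Dict[str, List]) -> Tuple[str, float]:
    """Join non-empty words and average their confidences on a 0.0-1.0 scale."""
    words = []
    confidences = []
    for text, conf in zip(ocr_data["text"], ocr_data["conf"]):
        score = float(conf)  # assumption: values may arrive as int, float, or str
        # Skip empty strings and the -1 / 0 scores attached to non-word entries.
        if text.strip() and score > 0:
            words.append(text.strip())
            confidences.append(score / 100.0)  # Tesseract reports 0-100
    average = sum(confidences) / len(confidences) if confidences else 0.0
    return " ".join(words), average


sample = {"text": ["HELLO", "", "WORLD"], "conf": [92, -1, "88"]}
text, confidence = summarize_word_data(sample)
print(text, round(confidence, 2))
```

Keeping every engine on the same 0.0-1.0 scale is what makes the threshold comparison in `HybridEngine` meaningful across engines.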
--- a/ocr/requirements.txt
+++ b/ocr/requirements.txt
@@ -17,6 +17,7 @@ numpy>=1.24.0
 pytesseract>=0.3.10
 paddlepaddle>=2.6.0
 paddleocr>=2.8.0
 google-cloud-vision>=3.7.0
 # PDF Processing
 PyMuPDF>=1.23.0
--- a/ocr/tests/test_engine_abstraction.py
+++ b/ocr/tests/test_engine_abstraction.py
@@ -0,0 +1,675 @@
 """Tests for OCR engine abstraction layer.
 Covers: base types, exception hierarchy, PaddleOcrEngine,
 TesseractEngine, CloudEngine, HybridEngine, and engine_factory.
 """
 import io
 from unittest.mock import MagicMock, patch
 import pytest
 from PIL import Image
 from app.engines.base_engine import (
    EngineError,
    EngineProcessingError,
    EngineUnavailableError,
    OcrConfig,
    OcrEngine,
    OcrEngineResult,
    WordBox,
 )
 # --- Helpers ---
 def _create_test_image_bytes() -> bytes:
    """Create minimal PNG image bytes for engine testing."""
    img = Image.new("RGB", (100, 50), (255, 255, 255))
    buf = io.BytesIO()
    img.save(buf, format="PNG")
    return buf.getvalue()
 def _make_result(
    text: str, confidence: float, engine_name: str
 ) -> OcrEngineResult:
    """Create a minimal OcrEngineResult for testing."""
    return OcrEngineResult(
        text=text, confidence=confidence, word_boxes=[], engine_name=engine_name
    )
 # ---------------------------------------------------------------------------
 # Exception hierarchy
 # ---------------------------------------------------------------------------
 class TestExceptionHierarchy:
    """Engine errors form a proper hierarchy under EngineError."""
    def test_unavailable_is_engine_error(self) -> None:
        assert issubclass(EngineUnavailableError, EngineError)
    def test_processing_is_engine_error(self) -> None:
        assert issubclass(EngineProcessingError, EngineError)
    def test_engine_error_is_exception(self) -> None:
        assert issubclass(EngineError, Exception)
    def test_catch_base_catches_subtypes(self) -> None:
        with pytest.raises(EngineError):
            raise EngineUnavailableError("not installed")
        with pytest.raises(EngineError):
            raise EngineProcessingError("OCR failed")
 # ---------------------------------------------------------------------------
 # Data types
 # ---------------------------------------------------------------------------
 class TestWordBox:
    def test_default_positions(self) -> None:
        wb = WordBox(text="VIN", confidence=0.95)
        assert wb.x == 0
        assert wb.y == 0
        assert wb.width == 0
        assert wb.height == 0
    def test_all_fields(self) -> None:
        wb = WordBox(text="ABC", confidence=0.88, x=10, y=20, width=100, height=30)
        assert wb.text == "ABC"
        assert wb.confidence == 0.88
        assert wb.x == 10
        assert wb.width == 100
 class TestOcrConfig:
    def test_defaults(self) -> None:
        config = OcrConfig()
        assert config.char_whitelist is None
        assert config.single_line is False
        assert config.single_word is False
        assert config.use_angle_cls is True
        assert config.hints == {}
    def test_vin_whitelist_excludes_ioq(self) -> None:
        whitelist = "ABCDEFGHJKLMNPRSTUVWXYZ0123456789"
        config = OcrConfig(char_whitelist=whitelist)
        assert "I" not in config.char_whitelist
        assert "O" not in config.char_whitelist
        assert "Q" not in config.char_whitelist
    def test_hints_are_independent_across_instances(self) -> None:
        c1 = OcrConfig()
        c2 = OcrConfig()
        c1.hints["psm"] = 7
        assert "psm" not in c2.hints
 class TestOcrEngineResult:
    def test_construction(self) -> None:
        result = OcrEngineResult(
            text="1HGBH41JXMN109186",
            confidence=0.94,
            word_boxes=[WordBox(text="1HGBH41JXMN109186", confidence=0.94)],
            engine_name="paddleocr",
        )
        assert result.text == "1HGBH41JXMN109186"
        assert result.confidence == 0.94
        assert len(result.word_boxes) == 1
        assert result.engine_name == "paddleocr"
    def test_empty_result(self) -> None:
        result = OcrEngineResult(
            text="", confidence=0.0, word_boxes=[], engine_name="tesseract"
        )
        assert result.text == ""
        assert result.word_boxes == []
 # ---------------------------------------------------------------------------
 # OcrEngine ABC
 # ---------------------------------------------------------------------------
 class TestOcrEngineABC:
    def test_cannot_instantiate_directly(self) -> None:
        with pytest.raises(TypeError):
            OcrEngine()  # type: ignore[abstract]
    def test_concrete_subclass_works(self) -> None:
        class StubEngine(OcrEngine):
            @property
            def name(self) -> str:
                return "stub"
            def recognize(
                self, image_bytes: bytes, config: OcrConfig
            ) -> OcrEngineResult:
                return OcrEngineResult(
                    text="ok", confidence=1.0, word_boxes=[], engine_name="stub"
                )
        engine = StubEngine()
        assert engine.name == "stub"
        result = engine.recognize(b"", OcrConfig())
        assert result.text == "ok"
 # ---------------------------------------------------------------------------
 # PaddleOcrEngine
 # ---------------------------------------------------------------------------
 class TestPaddleOcrEngine:
    def test_name(self) -> None:
        from app.engines.paddle_engine import PaddleOcrEngine
        engine = PaddleOcrEngine()
        assert engine.name == "paddleocr"
    def test_lazy_init_not_loaded_at_construction(self) -> None:
        from app.engines.paddle_engine import PaddleOcrEngine
        engine = PaddleOcrEngine()
        assert engine._ocr is None
    def test_recognize_empty_results(self) -> None:
        from app.engines.paddle_engine import PaddleOcrEngine
        engine = PaddleOcrEngine()
        mock_ocr = MagicMock()
        mock_ocr.ocr.return_value = [None]
        engine._ocr = mock_ocr
        result = engine.recognize(_create_test_image_bytes(), OcrConfig())
        assert result.text == ""
        assert result.confidence == 0.0
        assert result.word_boxes == []
        assert result.engine_name == "paddleocr"
    def test_recognize_with_results(self) -> None:
        from app.engines.paddle_engine import PaddleOcrEngine
        engine = PaddleOcrEngine()
        mock_ocr = MagicMock()
        mock_ocr.ocr.return_value = [
            [
                [[[10, 20], [110, 20], [110, 50], [10, 50]], ("HELLO", 0.95)],
                [[[10, 60], [110, 60], [110, 90], [10, 90]], ("WORLD", 0.88)],
            ]
        ]
        engine._ocr = mock_ocr
        result = engine.recognize(_create_test_image_bytes(), OcrConfig())
        assert result.text == "HELLO WORLD"
        assert abs(result.confidence - 0.915) < 0.01
        assert len(result.word_boxes) == 2
        assert result.word_boxes[0].text == "HELLO"
        assert result.word_boxes[0].confidence == 0.95
        assert result.word_boxes[1].text == "WORLD"
        assert result.engine_name == "paddleocr"
    def test_recognize_whitelist_filters_characters(self) -> None:
        from app.engines.paddle_engine import PaddleOcrEngine
        engine = PaddleOcrEngine()
        mock_ocr = MagicMock()
        mock_ocr.ocr.return_value = [
            [
                [[[0, 0], [100, 0], [100, 30], [0, 30]], ("1HG-BH4!", 0.9)],
            ]
        ]
        engine._ocr = mock_ocr
        config = OcrConfig(char_whitelist="ABCDEFGHJKLMNPRSTUVWXYZ0123456789")
        result = engine.recognize(_create_test_image_bytes(), config)
        assert "-" not in result.text
        assert "!" not in result.text
        assert result.word_boxes[0].text == "1HGBH4"
    def test_recognize_quadrilateral_to_bounding_box(self) -> None:
        from app.engines.paddle_engine import PaddleOcrEngine
        engine = PaddleOcrEngine()
        mock_ocr = MagicMock()
        # Slightly rotated quad: min x=8, min y=20, max x=110, max y=55
        mock_ocr.ocr.return_value = [
            [
                [[[10, 20], [110, 25], [108, 55], [8, 50]], ("TEXT", 0.9)],
            ]
        ]
        engine._ocr = mock_ocr
        result = engine.recognize(_create_test_image_bytes(), OcrConfig())
        wb = result.word_boxes[0]
        assert wb.x == 8
        assert wb.y == 20
        assert wb.width == 102  # 110 - 8
        assert wb.height == 35  # 55 - 20
    def test_recognize_skips_empty_after_whitelist(self) -> None:
        """Text consisting only of non-whitelisted characters is skipped."""
        from app.engines.paddle_engine import PaddleOcrEngine
        engine = PaddleOcrEngine()
        mock_ocr = MagicMock()
        mock_ocr.ocr.return_value = [
            [
                [[[0, 0], [50, 0], [50, 20], [0, 20]], ("---", 0.9)],
            ]
        ]
        engine._ocr = mock_ocr
        config = OcrConfig(char_whitelist="ABC")
        result = engine.recognize(_create_test_image_bytes(), config)
        assert result.text == ""
        assert result.word_boxes == []
        assert result.confidence == 0.0
    def test_import_error_raises_unavailable(self) -> None:
        from app.engines.paddle_engine import PaddleOcrEngine
        engine = PaddleOcrEngine()
        engine._ocr = None
        with patch.dict("sys.modules", {"paddleocr": None}):
            with patch(
                "app.engines.paddle_engine.importlib.import_module",
                side_effect=ImportError("No module"),
            ):
                # Force re-import by removing cached paddleocr
                original_import = (
                    __builtins__.__import__
                    if hasattr(__builtins__, "__import__")
                    else __import__
                )
                def mock_import(name, *args, **kwargs):
                    if name == "paddleocr":
                        raise ImportError("No module named 'paddleocr'")
                    return original_import(name, *args, **kwargs)
                with patch("builtins.__import__", side_effect=mock_import):
                    with pytest.raises(EngineUnavailableError, match="paddleocr"):
                        engine._get_ocr()
    def test_processing_error_on_exception(self) -> None:
        from app.engines.paddle_engine import PaddleOcrEngine
        engine = PaddleOcrEngine()
        mock_ocr = MagicMock()
        mock_ocr.ocr.side_effect = RuntimeError("OCR crashed")
        engine._ocr = mock_ocr
        with pytest.raises(EngineProcessingError, match="PaddleOCR recognition failed"):
            engine.recognize(_create_test_image_bytes(), OcrConfig())
 # ---------------------------------------------------------------------------
 # TesseractEngine
 # ---------------------------------------------------------------------------
 class TestTesseractEngine:
    """Tests for TesseractEngine using mocked pytesseract."""
    @pytest.fixture()
    def engine(self) -> "TesseractEngine":  # type: ignore[name-defined]
        """Create a TesseractEngine with mocked pytesseract dependency."""
        mock_pytesseract = MagicMock()
        mock_pytesseract.Output.DICT = "dict"
        with patch.dict("sys.modules", {"pytesseract": mock_pytesseract}):
            with patch("app.engines.tesseract_engine.settings") as mock_settings:
                mock_settings.tesseract_cmd = "/usr/bin/tesseract"
                from app.engines.tesseract_engine import TesseractEngine
                eng = TesseractEngine()
                eng._mock_pytesseract = mock_pytesseract  # type: ignore[attr-defined]
                return eng
    def test_name(self, engine: "TesseractEngine") -> None:  # type: ignore[name-defined]
        assert engine.name == "tesseract"
    def test_build_config_default_psm(self, engine: "TesseractEngine") -> None:  # type: ignore[name-defined]
        config_str = engine._build_config(OcrConfig())
        assert "--psm 6" in config_str
    def test_build_config_single_line(self, engine: "TesseractEngine") -> None:  # type: ignore[name-defined]
        config_str = engine._build_config(OcrConfig(single_line=True))
        assert "--psm 7" in config_str
    def test_build_config_single_word(self, engine: "TesseractEngine") -> None:  # type: ignore[name-defined]
        config_str = engine._build_config(OcrConfig(single_word=True))
        assert "--psm 8" in config_str
    def test_build_config_whitelist(self, engine: "TesseractEngine") -> None:  # type: ignore[name-defined]
        config_str = engine._build_config(OcrConfig(char_whitelist="ABC123"))
        assert "-c tessedit_char_whitelist=ABC123" in config_str
    def test_build_config_psm_hint(self, engine: "TesseractEngine") -> None:  # type: ignore[name-defined]
        config_str = engine._build_config(OcrConfig(hints={"psm": 11}))
        assert "--psm 11" in config_str
    def test_recognize_normalizes_confidence(self, engine: "TesseractEngine") -> None:  # type: ignore[name-defined]
        """Tesseract returns 0-100 confidence; engine normalizes to 0.0-1.0."""
        engine._pytesseract.image_to_data.return_value = {
            "text": ["HELLO", ""],
            "conf": [92, -1],
            "left": [10, 0],
            "top": [20, 0],
            "width": [100, 0],
            "height": [30, 0],
        }
        result = engine.recognize(_create_test_image_bytes(), OcrConfig())
        assert result.text == "HELLO"
        assert abs(result.confidence - 0.92) < 0.01
        assert result.engine_name == "tesseract"
    def test_import_error_raises_unavailable(self) -> None:
        with patch.dict("sys.modules", {"pytesseract": None}):
            with patch("app.engines.tesseract_engine.settings") as mock_settings:
                mock_settings.tesseract_cmd = "/usr/bin/tesseract"
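                # Capture the real import hook first; calling __import__ inside
                # mock_import after patching would re-enter the mock and recurse.
                real_import = __import__
                def mock_import(name, *args, **kwargs):
                    if name == "pytesseract":
                        raise ImportError("No module named 'pytesseract'")
                    return real_import(name, *args, **kwargs)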
                with patch("builtins.__import__", side_effect=mock_import):
                    from app.engines.tesseract_engine import TesseractEngine
                    with pytest.raises(EngineUnavailableError, match="pytesseract"):
                        TesseractEngine()
 # ---------------------------------------------------------------------------
 # CloudEngine
 # ---------------------------------------------------------------------------
 class TestCloudEngine:
    def test_name(self) -> None:
        from app.engines.cloud_engine import CloudEngine
        engine = CloudEngine(key_path="/fake/path.json")
        assert engine.name == "google_vision"
    def test_lazy_init_not_loaded_at_construction(self) -> None:
        from app.engines.cloud_engine import CloudEngine
        engine = CloudEngine(key_path="/fake/path.json")
        assert engine._client is None
    def test_missing_key_file_raises_unavailable(self) -> None:
        from app.engines.cloud_engine import CloudEngine
        engine = CloudEngine(key_path="/nonexistent/key.json")
        with pytest.raises(EngineUnavailableError, match="key not found"):
            engine._get_client()
    @patch("os.path.isfile", return_value=True)
    def test_missing_library_raises_unavailable(self, _mock_isfile: MagicMock) -> None:
        from app.engines.cloud_engine import CloudEngine
        engine = CloudEngine(key_path="/fake/key.json")
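        # Capture the real import hook first; calling __import__ inside
        # mock_import after patching would re-enter the mock and recurse.
        real_import = __import__
        def mock_import(name, *args, **kwargs):
            if "google.cloud" in name:
                raise ImportError("No module named 'google.cloud'")
            return real_import(name, *args, **kwargs)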
        with patch("builtins.__import__", side_effect=mock_import):
            with pytest.raises(EngineUnavailableError, match="google-cloud-vision"):
                engine._get_client()
    def test_recognize_empty_annotations(self) -> None:
        from app.engines.cloud_engine import CloudEngine
        engine = CloudEngine(key_path="/fake/key.json")
        mock_client = MagicMock()
        mock_response = MagicMock()
        mock_response.error.message = ""
        mock_response.text_annotations = []
        mock_client.text_detection.return_value = mock_response
        engine._client = mock_client
        # Mock the google.cloud.vision import inside recognize()
        mock_vision = MagicMock()
        with patch.dict("sys.modules", {"google.cloud.vision": mock_vision, "google.cloud": MagicMock(), "google": MagicMock()}):
            result = engine.recognize(b"fake_image", OcrConfig())
        assert result.text == ""
        assert result.confidence == 0.0
        assert result.engine_name == "google_vision"
    def test_recognize_api_error_raises_processing_error(self) -> None:
        from app.engines.cloud_engine import CloudEngine
        engine = CloudEngine(key_path="/fake/key.json")
        mock_client = MagicMock()
        mock_response = MagicMock()
        mock_response.error.message = "API quota exceeded"
        mock_client.text_detection.return_value = mock_response
        engine._client = mock_client
        mock_vision = MagicMock()
        with patch.dict("sys.modules", {"google.cloud.vision": mock_vision, "google.cloud": MagicMock(), "google": MagicMock()}):
            with pytest.raises(EngineProcessingError, match="API quota exceeded"):
                engine.recognize(b"fake_image", OcrConfig())
 # ---------------------------------------------------------------------------
 # HybridEngine
 # ---------------------------------------------------------------------------
 class TestHybridEngine:
    def test_name_with_fallback(self) -> None:
        from app.engines.hybrid_engine import HybridEngine
        primary = MagicMock(spec=OcrEngine)
        primary.name = "paddleocr"
        fallback = MagicMock(spec=OcrEngine)
        fallback.name = "google_vision"
        engine = HybridEngine(primary=primary, fallback=fallback)
        assert engine.name == "hybrid(paddleocr+google_vision)"
    def test_name_without_fallback(self) -> None:
        from app.engines.hybrid_engine import HybridEngine
        primary = MagicMock(spec=OcrEngine)
        primary.name = "paddleocr"
        engine = HybridEngine(primary=primary)
        assert engine.name == "hybrid(paddleocr+none)"
    def test_high_confidence_skips_fallback(self) -> None:
        from app.engines.hybrid_engine import HybridEngine
        primary = MagicMock(spec=OcrEngine)
        fallback = MagicMock(spec=OcrEngine)
        primary.name = "paddleocr"
        fallback.name = "cloud"
        primary.recognize.return_value = _make_result("VIN123", 0.95, "paddleocr")
        engine = HybridEngine(primary=primary, fallback=fallback, threshold=0.6)
        result = engine.recognize(b"img", OcrConfig())
        assert result.text == "VIN123"
        assert result.engine_name == "paddleocr"
        fallback.recognize.assert_not_called()
    def test_low_confidence_triggers_fallback(self) -> None:
        from app.engines.hybrid_engine import HybridEngine
        primary = MagicMock(spec=OcrEngine)
        fallback = MagicMock(spec=OcrEngine)
        primary.name = "paddleocr"
        fallback.name = "google_vision"
        primary.recognize.return_value = _make_result("VIN123", 0.3, "paddleocr")
        fallback.recognize.return_value = _make_result("VIN456", 0.92, "google_vision")
        engine = HybridEngine(primary=primary, fallback=fallback, threshold=0.6)
        result = engine.recognize(b"img", OcrConfig())
        assert result.text == "VIN456"
        assert result.engine_name == "google_vision"
        fallback.recognize.assert_called_once()
    def test_low_confidence_no_fallback_returns_primary(self) -> None:
        from app.engines.hybrid_engine import HybridEngine
        primary = MagicMock(spec=OcrEngine)
        primary.name = "paddleocr"
        primary.recognize.return_value = _make_result("VIN123", 0.3, "paddleocr")
        engine = HybridEngine(primary=primary, fallback=None, threshold=0.6)
        result = engine.recognize(b"img", OcrConfig())
        assert result.text == "VIN123"
    def test_fallback_lower_confidence_returns_primary(self) -> None:
        from app.engines.hybrid_engine import HybridEngine
        primary = MagicMock(spec=OcrEngine)
        fallback = MagicMock(spec=OcrEngine)
        primary.name = "paddleocr"
        fallback.name = "google_vision"
        primary.recognize.return_value = _make_result("VIN123", 0.4, "paddleocr")
        fallback.recognize.return_value = _make_result("VIN456", 0.3, "google_vision")
        engine = HybridEngine(primary=primary, fallback=fallback, threshold=0.6)
        result = engine.recognize(b"img", OcrConfig())
        assert result.text == "VIN123"
    def test_fallback_engine_error_returns_primary(self) -> None:
        from app.engines.hybrid_engine import HybridEngine
        primary = MagicMock(spec=OcrEngine)
        fallback = MagicMock(spec=OcrEngine)
        primary.name = "paddleocr"
        fallback.name = "google_vision"
        primary.recognize.return_value = _make_result("VIN123", 0.3, "paddleocr")
        fallback.recognize.side_effect = EngineUnavailableError("key missing")
        engine = HybridEngine(primary=primary, fallback=fallback, threshold=0.6)
        result = engine.recognize(b"img", OcrConfig())
        assert result.text == "VIN123"
    def test_fallback_unexpected_error_returns_primary(self) -> None:
        from app.engines.hybrid_engine import HybridEngine
        primary = MagicMock(spec=OcrEngine)
        fallback = MagicMock(spec=OcrEngine)
        primary.name = "paddleocr"
        fallback.name = "google_vision"
        primary.recognize.return_value = _make_result("VIN123", 0.3, "paddleocr")
        fallback.recognize.side_effect = RuntimeError("network error")
        engine = HybridEngine(primary=primary, fallback=fallback, threshold=0.6)
        result = engine.recognize(b"img", OcrConfig())
        assert result.text == "VIN123"
    @patch("app.engines.hybrid_engine.time")
    def test_fallback_timeout_returns_primary(self, mock_time: MagicMock) -> None:
        from app.engines.hybrid_engine import HybridEngine
        primary = MagicMock(spec=OcrEngine)
        fallback = MagicMock(spec=OcrEngine)
        primary.name = "paddleocr"
        fallback.name = "google_vision"
        primary.recognize.return_value = _make_result("VIN123", 0.3, "paddleocr")
        fallback.recognize.return_value = _make_result("VIN456", 0.92, "google_vision")
        # Simulate 6-second delay (exceeds 5s limit)
        mock_time.monotonic.side_effect = [0.0, 6.0]
        engine = HybridEngine(primary=primary, fallback=fallback, threshold=0.6)
        result = engine.recognize(b"img", OcrConfig())
        assert result.text == "VIN123"  # timeout -> use primary
    def test_exact_threshold_skips_fallback(self) -> None:
        """When confidence == threshold, no fallback needed (>= check)."""
        from app.engines.hybrid_engine import HybridEngine
        primary = MagicMock(spec=OcrEngine)
        fallback = MagicMock(spec=OcrEngine)
        primary.name = "paddleocr"
        fallback.name = "cloud"
        primary.recognize.return_value = _make_result("VIN", 0.6, "paddleocr")
        engine = HybridEngine(primary=primary, fallback=fallback, threshold=0.6)
        result = engine.recognize(b"img", OcrConfig())
        assert result.engine_name == "paddleocr"
        fallback.recognize.assert_not_called()
 # ---------------------------------------------------------------------------
 # Engine factory
 # ---------------------------------------------------------------------------
 class TestEngineFactory:
    def test_unknown_engine_raises(self) -> None:
        from app.engines.engine_factory import _create_single_engine
        with pytest.raises(EngineUnavailableError, match="Unknown engine"):
            _create_single_engine("nonexistent")
    @patch("app.engines.engine_factory.settings")
    @patch("app.engines.engine_factory._create_single_engine")
    def test_defaults_to_settings_primary(
        self, mock_create: MagicMock, mock_settings: MagicMock
    ) -> None:
        mock_settings.ocr_primary_engine = "paddleocr"
        mock_settings.ocr_fallback_engine = "none"
        mock_engine = MagicMock(spec=OcrEngine)
        mock_create.return_value = mock_engine
        from app.engines.engine_factory import create_engine
        result = create_engine()
        mock_create.assert_called_once_with("paddleocr")
        assert result == mock_engine
    @patch("app.engines.engine_factory.settings")
    @patch("app.engines.engine_factory._create_single_engine")
    def test_explicit_name_overrides_settings(
        self, mock_create: MagicMock, mock_settings: MagicMock
    ) -> None:
        mock_settings.ocr_fallback_engine = "none"
        mock_engine = MagicMock(spec=OcrEngine)
        mock_create.return_value = mock_engine
        from app.engines.engine_factory import create_engine
        create_engine("tesseract")
        mock_create.assert_called_once_with("tesseract")
    @patch("app.engines.engine_factory.settings")
    @patch("app.engines.engine_factory._create_single_engine")
    def test_creates_hybrid_when_fallback_configured(
        self, mock_create: MagicMock, mock_settings: MagicMock
    ) -> None:
        mock_settings.ocr_primary_engine = "paddleocr"
        mock_settings.ocr_fallback_engine = "google_vision"
        mock_settings.ocr_fallback_threshold = 0.7
        mock_primary = MagicMock(spec=OcrEngine)
        mock_fallback = MagicMock(spec=OcrEngine)
        mock_create.side_effect = [mock_primary, mock_fallback]
        from app.engines.engine_factory import create_engine
        from app.engines.hybrid_engine import HybridEngine
        result = create_engine()
        assert isinstance(result, HybridEngine)
    @patch("app.engines.engine_factory.settings")
    @patch("app.engines.engine_factory._create_single_engine")
    def test_fallback_failure_returns_primary_only(
        self, mock_create: MagicMock, mock_settings: MagicMock
    ) -> None:
        mock_settings.ocr_primary_engine = "paddleocr"
        mock_settings.ocr_fallback_engine = "google_vision"
        mock_settings.ocr_fallback_threshold = 0.6
        mock_primary = MagicMock(spec=OcrEngine)
        mock_create.side_effect = [mock_primary, EngineUnavailableError("no key")]
        from app.engines.engine_factory import create_engine
        result = create_engine()
        assert result == mock_primary
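The factory tests above pin down three behaviours: no fallback configured returns the primary engine, a configured fallback yields a hybrid wrapper, and a fallback that fails to load degrades to primary-only. A minimal sketch of that selection logic follows. Every name in it (`build_engine`, `make_single`, `FakeEngine`, `Hybrid`) is a hypothetical stand-in; the real `create_engine` in `app/engines/engine_factory.py` is not shown in this diff and may differ.

```python
from dataclasses import dataclass
from typing import Callable, Union


class EngineUnavailableError(Exception):
    """Stand-in for the project's exception of the same name."""


@dataclass
class FakeEngine:
    """Hypothetical placeholder for a concrete OCR engine."""
    name: str


@dataclass
class Hybrid:
    """Hypothetical placeholder for a primary+fallback wrapper."""
    primary: FakeEngine
    fallback: FakeEngine
    threshold: float


def build_engine(
    make_single: Callable[[str], FakeEngine],
    primary_name: str,
    fallback_name: str = "none",
    threshold: float = 0.6,
) -> Union[FakeEngine, Hybrid]:
    """Return the primary engine, wrapped in Hybrid only if a fallback loads."""
    primary = make_single(primary_name)
    if fallback_name == "none":
        return primary
    try:
        fallback = make_single(fallback_name)
    except EngineUnavailableError:
        # The fallback is optional: degrade to primary-only instead of failing.
        return primary
    return Hybrid(primary=primary, fallback=fallback, threshold=threshold)


def make_single(name: str) -> FakeEngine:
    """Toy engine constructor: 'google_vision' simulates a missing key."""
    if name == "google_vision":
        raise EngineUnavailableError("no key")
    if name not in {"paddleocr", "tesseract"}:
        raise EngineUnavailableError(f"Unknown engine: {name}")
    return FakeEngine(name)
```

Passing the constructor in as `make_single` keeps the selection logic testable without `patch`, which is one alternative to mocking `_create_single_engine` as the tests do.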
--- a/ocr/tests/test_vin_extraction.py
+++ b/ocr/tests/test_vin_extraction.py
@@ -1,11 +1,12 @@
-"""Integration tests for VIN extraction endpoint."""
+"""Integration tests for VIN extraction endpoint and engine integration."""
 import io
 from unittest.mock import patch, MagicMock
 import pytest
 from fastapi.testclient import TestClient
-from PIL import Image, ImageDraw, ImageFont
+from PIL import Image, ImageDraw
 from app.engines.base_engine import OcrConfig, OcrEngineResult, WordBox
 from app.main import app
@@ -240,3 +241,106 @@ class TestVinExtractionContentTypes:
        )
        assert response.status_code == 200
 # ---------------------------------------------------------------------------
 # VIN extractor engine integration tests
 # ---------------------------------------------------------------------------
 class TestVinExtractorEngineIntegration:
    """Tests verifying VinExtractor integrates correctly with engine abstraction."""
    @patch("app.extractors.vin_extractor.create_engine")
    def test_perform_ocr_calls_engine_with_vin_config(
        self, mock_create_engine: MagicMock
    ) -> None:
        """_perform_ocr passes VIN whitelist and angle_cls to engine."""
        from app.extractors.vin_extractor import VinExtractor
        mock_engine = MagicMock()
        mock_engine.recognize.return_value = OcrEngineResult(
            text="1HGBH41JXMN109186",
            confidence=0.94,
            word_boxes=[WordBox(text="1HGBH41JXMN109186", confidence=0.94)],
            engine_name="paddleocr",
        )
        mock_create_engine.return_value = mock_engine
        extractor = VinExtractor()
        text, confidences = extractor._perform_ocr(b"fake_image")
        mock_engine.recognize.assert_called_once()
        call_config = mock_engine.recognize.call_args[0][1]
        assert isinstance(call_config, OcrConfig)
        assert call_config.char_whitelist == VinExtractor.VIN_WHITELIST
        assert call_config.use_angle_cls is True
        assert call_config.single_line is False
        assert call_config.single_word is False
        assert text == "1HGBH41JXMN109186"
        assert confidences == [0.94]
    @patch("app.extractors.vin_extractor.create_engine")
    def test_perform_ocr_single_line_mode(
        self, mock_create_engine: MagicMock
    ) -> None:
        """_perform_ocr passes single_line flag to engine config."""
        from app.extractors.vin_extractor import VinExtractor
        mock_engine = MagicMock()
        mock_engine.recognize.return_value = OcrEngineResult(
            text="VIN123", confidence=0.9, word_boxes=[], engine_name="paddleocr"
        )
        mock_create_engine.return_value = mock_engine
        extractor = VinExtractor()
        extractor._perform_ocr(b"img", single_line=True)
        call_config = mock_engine.recognize.call_args[0][1]
        assert call_config.single_line is True
        assert call_config.single_word is False
    @patch("app.extractors.vin_extractor.create_engine")
    def test_perform_ocr_single_word_mode(
        self, mock_create_engine: MagicMock
    ) -> None:
        """_perform_ocr passes single_word flag to engine config."""
        from app.extractors.vin_extractor import VinExtractor
        mock_engine = MagicMock()
        mock_engine.recognize.return_value = OcrEngineResult(
            text="VIN123", confidence=0.9, word_boxes=[], engine_name="paddleocr"
        )
        mock_create_engine.return_value = mock_engine
        extractor = VinExtractor()
        extractor._perform_ocr(b"img", single_word=True)
        call_config = mock_engine.recognize.call_args[0][1]
        assert call_config.single_word is True
        assert call_config.single_line is False
    def test_calculate_base_confidence_empty_returns_default(self) -> None:
        """Empty word confidences return 0.5 default."""
        from app.extractors.vin_extractor import VinExtractor
        extractor = VinExtractor.__new__(VinExtractor)
        assert extractor._calculate_base_confidence([]) == 0.5
    def test_calculate_base_confidence_weighted_blend(self) -> None:
        """Confidence = 70% average + 30% minimum."""
        from app.extractors.vin_extractor import VinExtractor
        extractor = VinExtractor.__new__(VinExtractor)
        # avg = (0.9 + 0.8) / 2 = 0.85, min = 0.8
        # result = 0.7 * 0.85 + 0.3 * 0.8 = 0.595 + 0.24 = 0.835
        result = extractor._calculate_base_confidence([0.9, 0.8])
        assert abs(result - 0.835) < 0.001
    def test_calculate_base_confidence_single_value(self) -> None:
        """Single confidence value: avg == min, so result equals that value."""
        from app.extractors.vin_extractor import VinExtractor
        extractor = VinExtractor.__new__(VinExtractor)
        result = extractor._calculate_base_confidence([0.92])
        assert abs(result - 0.92) < 0.001
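The three `_calculate_base_confidence` tests describe a weighted blend: 70% of the mean word confidence plus 30% of the minimum, with 0.5 returned for an empty list. A small sketch of that arithmetic, assuming nothing beyond what the test docstrings state; `blend_confidence` is a hypothetical name, and the real method body is not part of this diff.

```python
from typing import Sequence


def blend_confidence(word_confidences: Sequence[float], default: float = 0.5) -> float:
    """Blend 70% of the mean with 30% of the minimum word confidence."""
    if not word_confidences:
        # No per-word scores available: fall back to a neutral default.
        return default
    mean = sum(word_confidences) / len(word_confidences)
    return 0.7 * mean + 0.3 * min(word_confidences)
```

Weighting in the minimum penalizes a result where one word was read poorly, which matters for a VIN because a single wrong character invalidates the whole value.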
--- a/secrets/app/google-vision-key.json.example
+++ b/secrets/app/google-vision-key.json.example
@@ -0,0 +1,18 @@
 {
  "_comment": "Google Vision API service account key for OCR cloud fallback",
  "_instructions": [
    "1. Create a Google Cloud service account with Vision API access",
    "2. Download the JSON key file",
    "3. Save it as secrets/app/google-vision-key.json (gitignored)",
    "4. Uncomment the volume mount in docker-compose.yml",
    "5. Set OCR_FALLBACK_ENGINE=google_vision"
  ],
  "type": "service_account",
  "project_id": "your-project-id",
  "private_key_id": "",
  "private_key": "",
  "client_email": "your-sa@your-project-id.iam.gserviceaccount.com",
  "client_id": "",
  "auth_uri": "https://accounts.google.com/o/oauth2/auth",
  "token_uri": "https://oauth2.googleapis.com/token"
 }
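Because the example file ships with empty `private_key` and `private_key_id` values, copying it unchanged produces a file that exists but cannot authenticate. A hedged sketch of a pre-flight check using only the standard library follows. `key_file_problems` and `REQUIRED_FIELDS` are hypothetical names, and the field list is an assumption based on the keys shown in the example above, not something the OCR service is known to enforce.

```python
import json
from pathlib import Path
from typing import List

# Assumed minimum set of fields, taken from the example file above.
REQUIRED_FIELDS = ("type", "project_id", "private_key", "client_email")


def key_file_problems(path: str) -> List[str]:
    """Return human-readable problems; an empty list means the file looks usable."""
    key_path = Path(path)
    if not key_path.is_file():
        return [f"key not found: {path}"]
    try:
        data = json.loads(key_path.read_text(encoding="utf-8"))
    except json.JSONDecodeError as exc:
        return [f"not valid JSON: {exc}"]
    if not isinstance(data, dict):
        return ["top-level JSON value is not an object"]
    # Empty strings count as missing, which catches an unedited example file.
    return [
        f"missing or empty field: {name}"
        for name in REQUIRED_FIELDS
        if not data.get(name)
    ]
```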
Author	SHA1	Message	Date
Eric Gullickson	47c5676498	chore: update OCR tests and documentation (refs #121 ) Add engine abstraction tests and update docs to reflect PaddleOCR primary architecture with optional Google Vision cloud fallback. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>	2026-02-07 11:42:51 -06:00
Eric Gullickson	1e96baca6f	fix: workflow contract	2026-02-07 11:32:36 -06:00
Eric Gullickson	3c1a090ae3	fix: resolve crop tool regression with stale ref and aspect ratio minSize (refs #120 ) Three bugs fixed in the draw-first crop tool introduced by PR #114: 1. Stale cropAreaRef: replaced useEffect-based ref sync with direct synchronous updates in handleMove and handleDrawStart. The useEffect ran after browser paint, so handleDragEnd read stale values (often {width:0, height:0}), preventing cropDrawn from being set. 2. Aspect ratio minSize: when aspectRatio=6 (VIN mode), height=width/6 required width>=60% to pass the height>=10% check. Now only checks width>=minSize when aspect ratio constrains height. 3. Bounds clamping: aspect-ratio-forced height could push crop area past 100% of container. Now clamps y position to keep within bounds. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>	2026-02-07 11:29:16 -06:00
Eric Gullickson	9b6417379b	chore: update Docker and compose files for PaddleOCR engine (refs #119 ) - Replace libtesseract-dev with libgomp1 (OpenMP for PaddlePaddle) - Pre-download PP-OCRv4 models during Docker build - Add OCR engine env vars to all compose files (base, staging, prod) - Add optional Google Vision secret mount (commented, enable on demand) - Create google-vision-key.json.example placeholder Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>	2026-02-07 11:17:44 -06:00
Eric Gullickson	4ef942cb9d	feat: add optional Google Vision cloud fallback engine (refs #118 ) CloudEngine wraps Google Vision TEXT_DETECTION with lazy init. HybridEngine runs primary engine, falls back to cloud when confidence is below threshold. Disabled by default (OCR_FALLBACK_ENGINE=none). Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>	2026-02-07 11:12:08 -06:00
Eric Gullickson	013fb0c67a	feat: migrate VIN/receipt extractors and OCR service to engine abstraction (refs #117 ) Replace direct pytesseract calls with OcrEngine interface in vin_extractor.py, receipt_extractor.py, and ocr_service.py. PSM mode fallbacks replaced with engine-agnostic single-line/single-word configs. Dead _process_ocr_data removed. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>	2026-02-07 10:56:27 -06:00
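Commit 4ef942cb9d describes the hybrid behaviour as "runs primary engine, falls back to cloud when confidence is below threshold". A minimal sketch of that decision path follows, inferred from the commit message and the `TestHybridEngine` cases. `recognize_with_fallback`, `Result`, and the injectable `clock` parameter are hypothetical names, and the 5-second limit is taken from a test comment; this is not the `HybridEngine` source.

```python
import time
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class Result:
    """Hypothetical stand-in for the project's OCR result type."""
    text: str
    confidence: float
    engine_name: str


Recognizer = Callable[[bytes], Result]


def recognize_with_fallback(
    image: bytes,
    primary: Recognizer,
    fallback: Optional[Recognizer] = None,
    threshold: float = 0.6,
    max_fallback_seconds: float = 5.0,
    clock: Callable[[], float] = time.monotonic,
) -> Result:
    """Use the primary result unless a timely fallback is strictly more confident."""
    first = primary(image)
    # ">=" means a result exactly at the threshold does not trigger the fallback.
    if first.confidence >= threshold or fallback is None:
        return first
    started = clock()
    try:
        second = fallback(image)
    except Exception:
        # Any fallback failure (missing key, network error) keeps the primary result.
        return first
    if clock() - started > max_fallback_seconds:
        # The fallback answered too late to be trusted for this request.
        return first
    return second if second.confidence > first.confidence else first
```

Injecting `clock` lets a test simulate a slow fallback with two fixed readings such as `0.0` and `6.0`, the same idea as patching `time.monotonic` in `test_fallback_timeout_returns_primary`.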