# ocr/app/engines/

OCR engine abstraction layer. Two categories of engines:

1. **OcrEngine subclasses** (image-to-text): PaddleOCR, Google Vision, Hybrid. Accept image bytes, return text + confidence + word boxes.
2. **GeminiEngine** (PDF-to-structured-data and VIN decode): Standalone module for maintenance schedule extraction and VIN decoding via google-genai SDK. Accepts PDF bytes or VIN strings, returns structured JSON. Not an OcrEngine subclass because the interface signatures differ.

## Files

| File | What | When to read |
| ---- | ---- | ------------ |
| `__init__.py` | Public engine API exports (OcrEngine, create_engine, exceptions) | Importing engine interfaces |
| `base_engine.py` | OcrEngine ABC, OcrConfig, OcrEngineResult, WordBox, exception hierarchy | Engine interface contract, adding new engines |
| `paddle_engine.py` | PaddleOCR PP-OCRv4 primary engine | Local OCR debugging, accuracy tuning |
| `cloud_engine.py` | Google Vision TEXT_DETECTION fallback engine (WIF authentication) | Cloud OCR configuration, API quota |
| `hybrid_engine.py` | Combines primary + fallback engine with confidence threshold switching | Engine selection logic, fallback behavior |
| `engine_factory.py` | Factory function and engine registry for instantiation | Adding new engine types |
| `gemini_engine.py` | Gemini 2.5 Flash integration for maintenance schedule extraction and VIN decoding (google-genai SDK, 20MB PDF limit, structured JSON output, Google Search grounding for VIN decode) | Manual extraction debugging, VIN decode, Gemini configuration |

## Engine Selection

```
create_engine(config)
    |
    +-- Primary: PaddleOCR (local, fast, no API limits)
    |
    +-- Fallback: Google Vision (cloud, 1000/month limit)
    |
    v
HybridEngine (tries primary, falls back if confidence < threshold)
```

GeminiEngine is created independently by ManualExtractor and the VIN decode router, not through the engine factory.
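
The confidence-threshold fallback described for HybridEngine can be sketched roughly as follows. This is a minimal illustration, not the code in `hybrid_engine.py`: `SketchResult`, `hybrid_recognize`, the stub engines, and the `0.6` default threshold are all hypothetical names and values chosen for the example. Keeping the primary result when the fallback raises, and preferring whichever result has higher confidence, are also assumptions.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class SketchResult:
    """Hypothetical stand-in for OcrEngineResult (word boxes omitted)."""
    text: str
    confidence: float
    engine: str


# Assumption: an engine is modeled as a callable from image bytes to a result.
Engine = Callable[[bytes], SketchResult]


def hybrid_recognize(
    image: bytes,
    primary: Engine,
    fallback: Engine,
    threshold: float = 0.6,  # assumed value, not taken from the real config
) -> SketchResult:
    """Run the primary engine; consult the fallback only below the threshold."""
    first = primary(image)
    if first.confidence >= threshold:
        return first
    try:
        second = fallback(image)
    except Exception:
        # Assumption: a failing fallback (e.g. quota exhausted) keeps the primary result.
        return first
    # Assumption: keep whichever result reports the higher confidence.
    return second if second.confidence > first.confidence else first


# Stub engines standing in for PaddleOCR and Google Vision.
def stub_local(image: bytes) -> SketchResult:
    return SketchResult(text="1HGCM826", confidence=0.42, engine="local")


def stub_cloud(image: bytes) -> SketchResult:
    return SketchResult(text="1HGCM82633A004352", confidence=0.97, engine="cloud")


if __name__ == "__main__":
    result = hybrid_recognize(b"fake-image-bytes", stub_local, stub_cloud)
    print(result.engine, result.confidence)
```

Modeling engines as plain callables keeps the sketch short; in this package the same decision would sit behind the OcrEngine interface so that callers of `create_engine` never need to know which engine produced the text.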