motovaultpro/ocr/app/engines/CLAUDE.md
Eric Gullickson 96e1dde7b2
docs: update CLAUDE.md references from Vertex AI to google-genai (refs #231)
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-28 11:21:58 -06:00

ocr/app/engines/

OCR engine abstraction layer. Two categories of engines:

  1. OcrEngine subclasses (image-to-text): PaddleOCR, Google Vision, Hybrid. Accept image bytes, return text + confidence + word boxes.
  2. GeminiEngine (PDF-to-structured-data and VIN decode): Standalone module for maintenance schedule extraction and VIN decoding via the google-genai SDK. Accepts PDF bytes or VIN strings and returns structured JSON. It is not an OcrEngine subclass because its interface signatures differ from the image-to-text contract.
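
The image-to-text contract described in item 1 can be sketched roughly as follows. This is only an illustration, not the contents of base_engine.py: the class names OcrEngine, OcrEngineResult, and WordBox come from this document, but the method name `recognize`, every field name, and the `EchoEngine` stand-in are assumptions.

```python
from abc import ABC, abstractmethod
from dataclasses import dataclass, field
from typing import List


@dataclass
class WordBox:
    # Field names are assumed for illustration; the real dataclass may differ.
    text: str
    confidence: float
    x: int
    y: int
    width: int
    height: int


@dataclass
class OcrEngineResult:
    # Text + confidence + word boxes, as described in the list above.
    text: str
    confidence: float
    word_boxes: List[WordBox] = field(default_factory=list)


class OcrEngine(ABC):
    """Image-to-text contract: image bytes in, OcrEngineResult out."""

    @abstractmethod
    def recognize(self, image_bytes: bytes) -> OcrEngineResult:
        """Hypothetical method name; the real ABC may use a different one."""


class EchoEngine(OcrEngine):
    """Hypothetical stand-in that treats the input bytes as ASCII text."""

    def recognize(self, image_bytes: bytes) -> OcrEngineResult:
        text = image_bytes.decode("ascii", errors="replace")
        box = WordBox(text=text, confidence=1.0, x=0, y=0,
                      width=len(text), height=1)
        return OcrEngineResult(text=text, confidence=1.0, word_boxes=[box])


result = EchoEngine().recognize(b"1HGCM82633A004352")
```

A PDF-or-VIN-in, JSON-out interface does not fit this single `bytes -> OcrEngineResult` shape, which is consistent with GeminiEngine living outside the hierarchy.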

Files

| File | What | When to read |
| --- | --- | --- |
| `__init__.py` | Public engine API exports (OcrEngine, create_engine, exceptions) | Importing engine interfaces |
| `base_engine.py` | OcrEngine ABC, OcrConfig, OcrEngineResult, WordBox, exception hierarchy | Engine interface contract, adding new engines |
| `paddle_engine.py` | PaddleOCR PP-OCRv4 primary engine | Local OCR debugging, accuracy tuning |
| `cloud_engine.py` | Google Vision TEXT_DETECTION fallback engine (WIF authentication) | Cloud OCR configuration, API quota |
| `hybrid_engine.py` | Combines primary + fallback engine with confidence threshold switching | Engine selection logic, fallback behavior |
| `engine_factory.py` | Factory function and engine registry for instantiation | Adding new engine types |
| `gemini_engine.py` | Gemini 2.5 Flash integration for maintenance schedule extraction and VIN decoding (google-genai SDK, 20MB PDF limit, structured JSON output, Google Search grounding for VIN decode) | Manual extraction debugging, VIN decode, Gemini configuration |
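
For readers adding a new engine type, the "factory function and engine registry" pattern named for engine_factory.py might look something like the sketch below. It is a generic registry, not the project's code: `register_engine`, `create_engine_sketch`, the registry keys, and the stub classes are all hypothetical, and the real create_engine takes a config object rather than a bare name.

```python
from typing import Callable, Dict

# Hypothetical registry mapping an engine name to a zero-argument constructor.
_ENGINE_REGISTRY: Dict[str, Callable[[], object]] = {}


def register_engine(name: str):
    """Class decorator that records an engine class under `name`."""
    def decorator(cls):
        _ENGINE_REGISTRY[name] = cls
        return cls
    return decorator


@register_engine("paddle")
class PaddleStub:
    """Placeholder for the local engine; does no real OCR."""


@register_engine("cloud")
class CloudStub:
    """Placeholder for the cloud engine; does no real OCR."""


def create_engine_sketch(name: str) -> object:
    """Look up a registered engine by name and instantiate it."""
    try:
        factory = _ENGINE_REGISTRY[name]
    except KeyError:
        known = ", ".join(sorted(_ENGINE_REGISTRY))
        raise ValueError(f"unknown engine {name!r}; known: {known}") from None
    return factory()


engine = create_engine_sketch("paddle")
```

With a registry like this, adding an engine means registering one more class instead of editing the factory body.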

Engine Selection

create_engine(config)
    |
    +-- Primary: PaddleOCR (local, fast, no API limits)
    |
    +-- Fallback: Google Vision (cloud, 1000/month limit)
    |
    v
HybridEngine (tries primary, falls back if confidence < threshold)
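
The confidence-threshold switch in the diagram above can be sketched as follows. This is a minimal illustration under stated assumptions, not the code in hybrid_engine.py: `HybridSketch`, `Result`, the callable-based engine type, and the default threshold of 0.6 are hypothetical. How the real engine handles fallback errors or a fallback with lower confidence is not specified here.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class Result:
    text: str
    confidence: float


# An engine is modeled as a plain callable: image bytes in, Result out.
Engine = Callable[[bytes], Result]


class HybridSketch:
    """Try the primary engine; use the fallback if confidence is too low."""

    def __init__(self, primary: Engine, fallback: Engine,
                 threshold: float = 0.6) -> None:
        # The 0.6 default is an assumption, not the project's setting.
        self.primary = primary
        self.fallback = fallback
        self.threshold = threshold

    def recognize(self, image_bytes: bytes) -> Result:
        primary_result = self.primary(image_bytes)
        if primary_result.confidence >= self.threshold:
            return primary_result
        # Below threshold: spend a (quota-limited) fallback call instead.
        return self.fallback(image_bytes)


def blurry_local(_: bytes) -> Result:
    return Result(text="1HGCM8?633A0O4352", confidence=0.41)


def sharp_local(_: bytes) -> Result:
    return Result(text="1HGCM82633A004352", confidence=0.97)


def cloud(_: bytes) -> Result:
    return Result(text="1HGCM82633A004352", confidence=0.93)


low = HybridSketch(blurry_local, cloud).recognize(b"image")
high = HybridSketch(sharp_local, cloud).recognize(b"image")
```

Here `low` comes from the fallback engine and `high` from the primary, so the quota-limited cloud call is only made when the local result is weak.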

GeminiEngine is created independently by ManualExtractor and the VIN decode router, not through the engine factory.