ocr/app/engines/
OCR engine abstraction layer. Two categories of engines:
- OcrEngine subclasses (image-to-text): PaddleOCR, Google Vision, Hybrid. Accept image bytes, return text + confidence + word boxes.
- GeminiEngine (PDF-to-structured-data and VIN decode): Standalone module for maintenance schedule extraction and VIN decoding via Vertex AI. Accepts PDF bytes or VIN strings, returns structured JSON. Not an OcrEngine subclass because the interface signatures differ.
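The image-to-text contract described above can be pictured roughly as follows. This is a minimal sketch, not the contents of base_engine.py: the class names OcrEngine, OcrEngineResult, and WordBox come from this README, but the method name `recognize`, all field names, and the toy `EchoEngine` are assumptions made for illustration.

```python
from abc import ABC, abstractmethod
from dataclasses import dataclass, field


@dataclass
class WordBox:
    # Hypothetical fields: one recognized word and its pixel bounding box.
    text: str
    x: int
    y: int
    width: int
    height: int
    confidence: float


@dataclass
class OcrEngineResult:
    # Hypothetical fields mirroring "text + confidence + word boxes".
    text: str
    confidence: float
    word_boxes: list[WordBox] = field(default_factory=list)


class OcrEngine(ABC):
    """Image-to-text contract: image bytes in, OcrEngineResult out."""

    @abstractmethod
    def recognize(self, image_bytes: bytes) -> OcrEngineResult:
        """Run OCR on raw image bytes (method name is an assumption)."""


class EchoEngine(OcrEngine):
    """Toy engine that only demonstrates the contract; it performs no real OCR."""

    def recognize(self, image_bytes: bytes) -> OcrEngineResult:
        # Pretend the bytes are already ASCII text and split them into words.
        text = image_bytes.decode("ascii", errors="ignore")
        boxes = [WordBox(word, 0, 0, 0, 0, 1.0) for word in text.split()]
        return OcrEngineResult(text=text, confidence=1.0, word_boxes=boxes)
```

A GeminiEngine that takes PDF bytes or a VIN string and returns structured JSON would not fit a single `recognize(image_bytes)` signature like this one, which is consistent with it living outside the OcrEngine hierarchy.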
Files
| File | What | When to read |
|---|---|---|
| __init__.py | Public engine API exports (OcrEngine, create_engine, exceptions) | Importing engine interfaces |
| base_engine.py | OcrEngine ABC, OcrConfig, OcrEngineResult, WordBox, exception hierarchy | Engine interface contract, adding new engines |
| paddle_engine.py | PaddleOCR PP-OCRv4 primary engine | Local OCR debugging, accuracy tuning |
| cloud_engine.py | Google Vision TEXT_DETECTION fallback engine (WIF authentication) | Cloud OCR configuration, API quota |
| hybrid_engine.py | Combines primary + fallback engine with confidence threshold switching | Engine selection logic, fallback behavior |
| engine_factory.py | Factory function and engine registry for instantiation | Adding new engine types |
| gemini_engine.py | Gemini 2.5 Flash integration for maintenance schedule extraction and VIN decoding (Vertex AI SDK, 20MB PDF limit, structured JSON output) | Manual extraction debugging, VIN decode, Gemini configuration |
Engine Selection
```
create_engine(config)
    |
    +-- Primary: PaddleOCR (local, fast, no API limits)
    |
    +-- Fallback: Google Vision (cloud, 1000/month limit)
    |
    v
HybridEngine (tries primary, falls back if confidence < threshold)
```
GeminiEngine is created independently by ManualExtractor and the VIN decode router, not through the engine factory.
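The primary/fallback switch in the diagram can be sketched as below. This is an illustrative outline under stated assumptions, not the code in hybrid_engine.py: engines are modeled as plain callables, and the `Result` type, the `hybrid_recognize` function, and the default threshold of 0.6 are all hypothetical.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class Result:
    # Hypothetical minimal result: recognized text plus a 0..1 confidence.
    text: str
    confidence: float


# Assumption: each engine is modeled as a callable from image bytes to Result.
Engine = Callable[[bytes], Result]


def hybrid_recognize(
    image: bytes,
    primary: Engine,
    fallback: Engine,
    threshold: float = 0.6,  # hypothetical default, not taken from the project
) -> Result:
    """Try the primary engine; call the fallback only if confidence < threshold."""
    first = primary(image)
    if first.confidence >= threshold:
        # Confident enough: the cloud engine is never called, saving quota.
        return first
    return fallback(image)
```

Calling the fallback only on low-confidence results keeps most traffic on the local engine, which matters when the cloud engine has a monthly request limit.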