Add/update documentation across backend, Python OCR service, and frontend for receipt scanning, manual extraction, and Gemini integration. Create new CLAUDE.md files for engines/, fuel-logs/, documents/, and maintenance/ features. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
ocr/app/engines/
OCR engine abstraction layer. Two categories of engines:
- OcrEngine subclasses (image-to-text): PaddleOCR, Google Vision, Hybrid. Accept image bytes, return text + confidence + word boxes.
- GeminiEngine (PDF-to-structured-data): Standalone module for maintenance schedule extraction via Vertex AI. Accepts PDF bytes, returns structured JSON. Not an OcrEngine subclass because the interface signatures differ.
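
The difference between the two categories can be sketched as follows. This is a minimal illustration, not the code in base_engine.py: the method name `recognize`, the dataclass fields, the toy `EchoEngine`, and the `extract_schedule` function are all assumptions made for the example. Only the class names OcrEngine, OcrEngineResult, and WordBox come from this directory.

```python
from abc import ABC, abstractmethod
from dataclasses import dataclass, field
from typing import List, Tuple


@dataclass
class WordBox:
    text: str
    confidence: float
    # Bounding box as (x, y, width, height); this layout is an assumption.
    box: Tuple[int, int, int, int]


@dataclass
class OcrEngineResult:
    text: str
    confidence: float
    word_boxes: List[WordBox] = field(default_factory=list)


class OcrEngine(ABC):
    """Image-to-text contract: image bytes in, text + confidence + boxes out."""

    @abstractmethod
    def recognize(self, image_bytes: bytes) -> OcrEngineResult:
        ...


class EchoEngine(OcrEngine):
    """Toy engine for illustration only: treats the input bytes as UTF-8 text."""

    def recognize(self, image_bytes: bytes) -> OcrEngineResult:
        text = image_bytes.decode("utf-8")
        words = [WordBox(w, 1.0, (0, 0, 0, 0)) for w in text.split()]
        return OcrEngineResult(text=text, confidence=1.0, word_boxes=words)


def extract_schedule(pdf_bytes: bytes) -> dict:
    """Stand-in for a PDF-to-structured-data signature: PDF bytes in, dict out."""
    return {"pdf_size_bytes": len(pdf_bytes), "schedules": []}
```

The two signatures take different inputs and return different shapes, so a shared base class would bring no benefit.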
Files
| File | What | When to read |
|---|---|---|
| `__init__.py` | Public engine API exports (OcrEngine, create_engine, exceptions) | Importing engine interfaces |
| `base_engine.py` | OcrEngine ABC, OcrConfig, OcrEngineResult, WordBox, exception hierarchy | Engine interface contract, adding new engines |
| `paddle_engine.py` | PaddleOCR PP-OCRv4 primary engine | Local OCR debugging, accuracy tuning |
| `cloud_engine.py` | Google Vision TEXT_DETECTION fallback engine (WIF authentication) | Cloud OCR configuration, API quota |
| `hybrid_engine.py` | Combines primary + fallback engine with confidence threshold switching | Engine selection logic, fallback behavior |
| `engine_factory.py` | Factory function and engine registry for instantiation | Adding new engine types |
| `gemini_engine.py` | Gemini 2.5 Flash integration for maintenance schedule extraction (Vertex AI SDK, 20MB PDF limit, structured JSON output) | Manual extraction debugging, Gemini configuration |
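
For readers adding a new engine type, a factory-plus-registry pattern might look roughly like the sketch below. It does not reproduce engine_factory.py: `register_engine`, `create_engine_by_name`, and the `DummyEngine` class are hypothetical names, and the real `create_engine` takes a config object rather than a bare name.

```python
from typing import Callable, Dict

# Hypothetical registry: maps an engine name to a zero-argument constructor.
_ENGINE_REGISTRY: Dict[str, Callable[[], object]] = {}


def register_engine(name: str, constructor: Callable[[], object]) -> None:
    """Make an engine available under a name; re-registering overwrites."""
    _ENGINE_REGISTRY[name] = constructor


def create_engine_by_name(name: str) -> object:
    """Instantiate a registered engine, or fail with the list of known names."""
    try:
        constructor = _ENGINE_REGISTRY[name]
    except KeyError:
        known = ", ".join(sorted(_ENGINE_REGISTRY)) or "(none)"
        raise ValueError(f"Unknown engine {name!r}; known engines: {known}") from None
    return constructor()


class DummyEngine:
    """Placeholder engine used only to demonstrate registration."""


register_engine("dummy", DummyEngine)
engine = create_engine_by_name("dummy")
```

With this shape, adding an engine means registering one constructor, and the factory function itself does not change.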
Engine Selection

    create_engine(config)
        |
        +-- Primary: PaddleOCR (local, fast, no API limits)
        |
        +-- Fallback: Google Vision (cloud, 1000/month limit)
        |
        v
    HybridEngine (tries primary, falls back if confidence < threshold)

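
The fallback rule in the diagram can be sketched as a small function. This is an illustration under assumptions, not the logic in hybrid_engine.py: the `Result` type, the callable-based engine signature, and the default threshold of 0.6 are invented for the example.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class Result:
    text: str
    confidence: float


# An engine is modelled here as a plain callable: image bytes -> Result.
Engine = Callable[[bytes], Result]


def hybrid_recognize(
    image_bytes: bytes,
    primary: Engine,
    fallback: Engine,
    threshold: float = 0.6,  # assumed value; the real threshold comes from config
) -> Result:
    """Try the primary engine; call the fallback only if confidence is too low."""
    primary_result = primary(image_bytes)
    if primary_result.confidence >= threshold:
        return primary_result
    # Confidence below threshold: spend a (quota-limited) cloud call.
    return fallback(image_bytes)


def confident_local(image_bytes: bytes) -> Result:
    return Result("TOTAL 42.10", 0.93)


def unsure_local(image_bytes: bytes) -> Result:
    return Result("T0TAL 4Z.1O", 0.31)


def cloud(image_bytes: bytes) -> Result:
    return Result("TOTAL 42.10", 0.98)
```

Calling the fallback lazily matters because the cloud engine has a monthly request limit. A confident primary result never consumes quota.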
GeminiEngine is created independently by ManualExtractor, not through the engine factory.
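
Because GeminiEngine sits outside the factory, its caller has to enforce input limits such as the 20MB PDF limit noted in the Files table. A pre-flight check might look like the sketch below. The names `MAX_PDF_BYTES`, `PdfTooLargeError`, and `validate_pdf` are hypothetical, and reading 20MB as 20 * 1024 * 1024 bytes is an assumption.

```python
# Assumed interpretation of the documented "20MB" limit (binary megabytes).
MAX_PDF_BYTES = 20 * 1024 * 1024


class PdfTooLargeError(ValueError):
    """Raised when a PDF exceeds the size the extraction step accepts."""


def validate_pdf(pdf_bytes: bytes) -> int:
    """Reject inputs that are not PDFs or exceed the size limit; return the size."""
    # Every PDF file starts with the "%PDF-" header marker.
    if not pdf_bytes.startswith(b"%PDF-"):
        raise ValueError("Input does not look like a PDF (missing %PDF- header)")
    size = len(pdf_bytes)
    if size > MAX_PDF_BYTES:
        raise PdfTooLargeError(f"PDF is {size} bytes; limit is {MAX_PDF_BYTES}")
    return size
```

Checking size and header before any network call lets an oversized or mistyped upload fail fast, without a request to the model.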