motovaultpro/ocr/app/CLAUDE.md
Eric Gullickson ab0d8463be docs: update CLAUDE.md indexes and README for OCR expansion (refs #137)
Add/update documentation across backend, Python OCR service, and frontend
for receipt scanning, manual extraction, and Gemini integration. Create
new CLAUDE.md files for engines/, fuel-logs/, documents/, and maintenance/
features.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-11 11:04:19 -06:00

ocr/app/

Python OCR microservice (FastAPI). Primary engine: PaddleOCR PP-OCRv4, with an optional Google Vision cloud fallback. Gemini 2.5 Flash handles maintenance manual PDF extraction as a standalone module (not an OcrEngine subclass).
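
The primary-plus-fallback arrangement described above can be sketched roughly as follows. This is a minimal illustration, not the code in engines/: only the `OcrEngine` name comes from this document, while `OcrResult`, `recognize`, `HybridEngine`, `StubEngine`, and the `min_confidence` threshold are assumptions made for the example. Stub engines stand in for PaddleOCR and Google Vision so the sketch runs without either dependency.

```python
from abc import ABC, abstractmethod
from dataclasses import dataclass
from typing import Optional


@dataclass
class OcrResult:
    text: str
    confidence: float  # assumed to be normalized to 0.0-1.0
    engine: str        # name of the engine that produced the text


class OcrEngine(ABC):
    """Hypothetical base interface; the real class in engines/ may differ."""

    name = "base"

    @abstractmethod
    def recognize(self, image_bytes: bytes) -> OcrResult:
        ...


class StubEngine(OcrEngine):
    """Stand-in for a real engine; always returns a fixed result."""

    def __init__(self, name: str, text: str, confidence: float) -> None:
        self.name = name
        self._text = text
        self._confidence = confidence

    def recognize(self, image_bytes: bytes) -> OcrResult:
        return OcrResult(self._text, self._confidence, self.name)


class HybridEngine(OcrEngine):
    """Try the primary engine first; consult the fallback only when needed."""

    name = "hybrid"

    def __init__(self, primary: OcrEngine, fallback: Optional[OcrEngine] = None,
                 min_confidence: float = 0.6) -> None:
        self.primary = primary
        self.fallback = fallback
        self.min_confidence = min_confidence  # assumed threshold, not from the source

    def recognize(self, image_bytes: bytes) -> OcrResult:
        try:
            result = self.primary.recognize(image_bytes)
        except Exception:
            result = None  # treat a primary failure like a low-confidence result

        if result is not None and result.confidence >= self.min_confidence:
            return result
        if self.fallback is None:
            if result is None:
                raise RuntimeError("primary engine failed and no fallback is configured")
            return result

        fallback_result = self.fallback.recognize(image_bytes)
        # Keep whichever engine reported the higher confidence.
        if result is None or fallback_result.confidence > result.confidence:
            return fallback_result
        return result


local = StubEngine("paddle-stub", "T0TAL 42.10", 0.41)
cloud = StubEngine("vision-stub", "TOTAL 42.10", 0.93)
print(HybridEngine(local, cloud).recognize(b"").engine)  # vision-stub
```

Keeping the fallback optional means the service can still run with only the local engine; the cloud call happens only when the local result is weak or missing.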

Files

| File | What | When to read |
| --- | --- | --- |
| `main.py` | FastAPI application entry point | Route registration, app setup |
| `config.py` | Configuration settings (OCR engines, Vertex AI, Redis, Vision API limits) | Environment variables, settings |
| `__init__.py` | Package init | Package structure |

Subdirectories

| Directory | What | When to read |
| --- | --- | --- |
| `engines/` | OCR engine abstraction (PaddleOCR, Google Vision, Hybrid) and Gemini module | Engine changes, adding new engines |
| `extractors/` | Domain-specific data extraction (receipts, fuel receipts, maintenance manuals) | Adding new extraction types, modifying extraction logic |
| `models/` | Data models and schemas | Request/response types |
| `patterns/` | Regex patterns and service name mapping (27 maintenance subtypes) | Pattern matching rules, service categorization |
| `preprocessors/` | Image preprocessing pipeline | Image preparation before OCR |
| `routers/` | FastAPI route handlers (`/extract`, `/extract/receipt`, `/extract/manual`, `/jobs`) | API endpoint changes |
| `services/` | Business logic services (job queue with Redis) | Core OCR processing, async job management |
| `table_extraction/` | Table detection and parsing | Structured data extraction from images |
| `validators/` | Input validation | Validation rules |
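
For orientation on the async job management mentioned for services/, here is a minimal sketch of a submit-then-poll job lifecycle. It is an illustration under stated assumptions, not the service's implementation: the real job queue is Redis-backed, whereas this sketch uses an in-memory dict as a stand-in, and the class name `InMemoryJobStore`, its methods, and the status strings are all hypothetical.

```python
import uuid
from typing import Callable, Dict, Optional


class InMemoryJobStore:
    """Stand-in for a Redis-backed store; a plain dict plays the role of Redis."""

    def __init__(self) -> None:
        self._jobs: Dict[str, dict] = {}

    def submit(self) -> str:
        """Register a new job and return the ID a client would poll with."""
        job_id = uuid.uuid4().hex
        self._jobs[job_id] = {"status": "pending", "result": None, "error": None}
        return job_id

    def run(self, job_id: str, work: Callable[[], dict]) -> None:
        """Execute the work and record either its result or its error."""
        job = self._jobs[job_id]
        job["status"] = "processing"
        try:
            job["result"] = work()
            job["status"] = "completed"
        except Exception as exc:
            job["error"] = str(exc)
            job["status"] = "failed"

    def get(self, job_id: str) -> Optional[dict]:
        """Return the job record, or None for an unknown ID."""
        return self._jobs.get(job_id)


store = InMemoryJobStore()
job_id = store.submit()
store.run(job_id, lambda: {"pages": 3})
print(store.get(job_id)["status"])  # completed
```

Separating `submit` from `run` mirrors the usual pattern for long extractions: the request returns a job ID immediately, and the client checks status later rather than holding the connection open.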