Add standalone GeminiEngine class for maintenance schedule extraction
from PDF owners manuals using Vertex AI Gemini 2.5 Flash with structured
JSON output enforcement, 20MB size limit, and lazy initialization.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
CloudEngine wraps Google Vision TEXT_DETECTION with lazy init.
HybridEngine runs primary engine, falls back to cloud when confidence
is below threshold. Disabled by default (OCR_FALLBACK_ENGINE=none).
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Introduce pluggable OcrEngine ABC with PaddleOCR PP-OCRv4 as primary
engine and Tesseract wrapper for backward compatibility. Engine factory
reads OCR_PRIMARY_ENGINE config to instantiate the correct engine.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Implement async PDF processing for owner's manuals with maintenance
schedule extraction:
- Add PDF preprocessor with PyMuPDF for text/scanned PDF handling
- Add maintenance pattern matching (mileage, time, fluid specs)
- Add service name mapping to maintenance subtypes
- Add table detection and parsing for schedule tables
- Add manual extractor orchestrating the complete pipeline
- Add POST /extract/manual endpoint for async job submission
- Add Redis job queue support for manual extraction jobs
- Add progress tracking during processing
Processing pipeline:
1. Analyze PDF structure (text layer vs scanned)
2. Find maintenance schedule sections
3. Extract text or OCR scanned pages at 300 DPI
4. Detect and parse maintenance tables
5. Normalize service names and extract intervals
6. Return structured maintenance schedules with confidence scores
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
OCR Service (Python/FastAPI):
- POST /extract for synchronous OCR extraction
- POST /jobs and GET /jobs/{job_id} for async processing
- Image preprocessing (deskew, denoise) for accuracy
- HEIC conversion via pillow-heif
- Redis job queue for async processing
Backend (Fastify):
- POST /api/ocr/extract - authenticated proxy to OCR
- POST /api/ocr/jobs - async job submission
- GET /api/ocr/jobs/:jobId - job polling
- Multipart file upload handling
- JWT authentication required
File size limits: 10MB sync, 200MB async
Processing time target: <3 seconds for typical photos
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Add Python-based OCR service container (mvp-ocr) as the 6th service:
- Python 3.11-slim with FastAPI/uvicorn
- Tesseract OCR with English language pack
- pillow-heif for HEIC image support
- opencv-python-headless for image preprocessing
- Health endpoint at /health
- Unit tests for health, HEIC support, and Tesseract availability
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>