Commit Graph

6 Commits

Author SHA1 Message Date
Eric Gullickson
4abd7d8d5b feat: add Vision monthly cap, WIF auth, and cloud-primary hybrid engine (refs #127)
- Add VISION_MONTHLY_LIMIT config setting (default 1000)
- Update CloudEngine to use WIF credential config via ADC
- Rewrite HybridEngine to support cloud-primary with Redis counter
- Pass monthly_limit through engine factory

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-09 20:50:02 -06:00
Eric Gullickson
b9fe222f12 fix: Build errors and tesseract removal
Some checks failed
Deploy to Staging / Build Images (pull_request) Failing after 4m14s
Deploy to Staging / Deploy to Staging (pull_request) Has been skipped
Deploy to Staging / Verify Staging (pull_request) Has been skipped
Deploy to Staging / Notify Staging Ready (pull_request) Has been skipped
Deploy to Staging / Notify Staging Failure (pull_request) Successful in 8s
2026-02-07 12:12:04 -06:00
Eric Gullickson
4ef942cb9d feat: add optional Google Vision cloud fallback engine (refs #118)
CloudEngine wraps Google Vision TEXT_DETECTION with lazy init.
HybridEngine runs primary engine, falls back to cloud when confidence
is below threshold. Disabled by default (OCR_FALLBACK_ENGINE=none).

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-07 11:12:08 -06:00
Eric Gullickson
ebc633fb36 feat: add OCR engine abstraction layer (refs #116)
Introduce pluggable OcrEngine ABC with PaddleOCR PP-OCRv4 as primary
engine and Tesseract wrapper for backward compatibility. Engine factory
reads OCR_PRIMARY_ENGINE config to instantiate the correct engine.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-07 10:47:40 -06:00
Eric Gullickson
852c9013b5 feat: add core OCR API integration (refs #65)
All checks were successful
Deploy to Staging / Build Images (pull_request) Successful in 5m59s
Deploy to Staging / Deploy to Staging (pull_request) Successful in 31s
Deploy to Staging / Verify Staging (pull_request) Successful in 2m19s
Deploy to Staging / Notify Staging Ready (pull_request) Successful in 7s
Deploy to Staging / Notify Staging Failure (pull_request) Has been skipped
OCR Service (Python/FastAPI):
- POST /extract for synchronous OCR extraction
- POST /jobs and GET /jobs/{job_id} for async processing
- Image preprocessing (deskew, denoise) for accuracy
- HEIC conversion via pillow-heif
- Redis job queue for async processing

Backend (Fastify):
- POST /api/ocr/extract - authenticated proxy to OCR
- POST /api/ocr/jobs - async job submission
- GET /api/ocr/jobs/:jobId - job polling
- Multipart file upload handling
- JWT authentication required

File size limits: 10MB sync, 200MB async
Processing time target: <3 seconds for typical photos

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-02-01 16:02:11 -06:00
Eric Gullickson
1ba491144b feat: add OCR service container (refs #64)
Some checks failed
Deploy to Staging / Build Images (pull_request) Successful in 7m41s
Deploy to Staging / Deploy to Staging (pull_request) Failing after 13s
Deploy to Staging / Verify Staging (pull_request) Has been skipped
Deploy to Staging / Notify Staging Ready (pull_request) Has been skipped
Deploy to Staging / Notify Staging Failure (pull_request) Successful in 8s
Add Python-based OCR service container (mvp-ocr) as the 6th service:
- Python 3.11-slim with FastAPI/uvicorn
- Tesseract OCR with English language pack
- pillow-heif for HEIC image support
- opencv-python-headless for image preprocessing
- Health endpoint at /health
- Unit tests for health, HEIC support, and Tesseract availability

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-02-01 13:06:16 -06:00