feat: add core OCR API integration (refs #65)

OCR Service (Python/FastAPI):
- POST /extract for synchronous OCR extraction
- POST /jobs and GET /jobs/{job_id} for async processing
- Image preprocessing (deskew, denoise) for accuracy
- HEIC conversion via pillow-heif
- Redis job queue for async processing

Backend (Fastify):
- POST /api/ocr/extract - authenticated proxy to OCR
- POST /api/ocr/jobs - async job submission
- GET /api/ocr/jobs/:jobId - job polling
- Multipart file upload handling
- JWT authentication required

File size limits: 10MB sync, 200MB async
Processing time target: <3 seconds for typical photos

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-02-01 16:02:11 -06:00
parent 94e49306dc
commit 852c9013b5
25 changed files with 1931 additions and 3 deletions
--- a/ocr/app/config.py
+++ b/ocr/app/config.py
@@ -11,5 +11,10 @@ class Settings:
        self.port: int = int(os.getenv("PORT", "8000"))
        self.tesseract_cmd: str = os.getenv("TESSERACT_CMD", "/usr/bin/tesseract")

+        # Redis configuration for job queue
+        self.redis_host: str = os.getenv("REDIS_HOST", "mvp-redis")
+        self.redis_port: int = int(os.getenv("REDIS_PORT", "6379"))
+        self.redis_db: int = int(os.getenv("REDIS_DB", "1"))
+

 settings = Settings()
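
For illustration, the three settings added in this hunk could be combined into a single connection URL for the job queue. The sketch below uses only the standard library; the helper name `redis_url` and its `env` parameter are hypothetical and not part of this commit, and the defaults simply mirror the values in the diff.

```python
import os


def redis_url(env=None):
    """Build a redis:// URL from REDIS_HOST, REDIS_PORT and REDIS_DB.

    Hypothetical helper (not in the commit). Defaults mirror the diff:
    host "mvp-redis", port 6379, database 1.
    """
    # Fall back to the real process environment when no mapping is given.
    env = os.environ if env is None else env
    host = env.get("REDIS_HOST", "mvp-redis")
    port = int(env.get("REDIS_PORT", "6379"))  # raises ValueError if not numeric
    db = int(env.get("REDIS_DB", "1"))
    return f"redis://{host}:{port}/{db}"


if __name__ == "__main__":
    # An empty mapping exercises the defaults.
    print(redis_url({}))
```

A URL of this shape is accepted by redis-py's `Redis.from_url`; whether the OCR service uses redis-py or builds its client this way is an assumption, since the client code is not shown in this hunk.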