feat: Gemini engine module and configuration (#129) #142
Reference in New Issue
Block a user
Delete Branch "%!s()"
Deleting a branch is permanent. Although the deleted branch may continue to exist for a short time before it actually gets removed, it CANNOT be undone in most cases. Continue?
Relates to #129
Milestone 4: Gemini Engine Module and Configuration
Files
ocr/app/engines/gemini_engine.py(NEW)ocr/app/config.pyocr/requirements.txtdocker-compose.ymldocker-compose.staging.ymldocker-compose.prod.ymlRequirements
GeminiEngineclass (standalone, NOT extending OcrEngine -- different interface: OcrEngine.recognize() accepts image bytes and returns text+confidence; GeminiEngine.extract_maintenance() accepts PDF bytes and returns structured JSON)GenerativeModel("gemini-2.5-flash")withgenerate_content()GeminiEngine.extract_maintenance(pdf_bytes)validates PDF size BEFORE processing. Reject if raw bytes > 20MB with clear error message including file size. Vertex AI SDK handles base64 encoding internally.response_mime_type="application/json"andresponse_schemafor guaranteed JSON structuremaintenanceSchedule[]withserviceName(required),intervalMiles(nullable),intervalMonths(nullable),details(nullable)GeminiEngine._get_client()wraps Vertex AI client initialization in try/except. Catch all exceptions, log full traceback, raiseEngineUnavailableErrorwith message: "Vertex AI authentication failed: {exc}"GOOGLE_APPLICATION_CREDENTIALSenv var (same as Vision API). Vertex AI SDK ADC supportsexternal_account(WIF) credential type -- no code changes needed, environment setup onlyVERTEX_AI_PROJECT(required),VERTEX_AI_LOCATION(default: us-central1),GEMINI_MODEL(default: gemini-2.5-flash)google-cloud-aiplatform>=1.40.0to requirements.txtAcceptance Criteria
GeminiEngine.extract_maintenance(pdf_bytes)returns structured JSON withmaintenanceSchedulearrayEngineUnavailableErrorwith diagnostic messageTests
ocr/tests/test_gemini_engine.py(NEW)Milestone: Gemini Engine Module and Configuration
Phase: Execution | Agent: Developer | Status: PASS
Summary
Audited existing GeminiEngine implementation (committed under refs #133) against all #142 acceptance criteria. Found one gap and fixed it.
Acceptance Criteria Verification
GenerativeModelwithgenerate_content()response_mime_type="application/json"andresponse_schemaenforcementmaintenanceSchedule[]with serviceName required, nullable fields_get_model()wraps init with traceback logging, raises GeminiUnavailableErrorGOOGLE_APPLICATION_CREDENTIALSenv var for WIF authgoogle-cloud-aiplatform>=1.40.0in requirements.txtGap Fixed
logger.exception()calls for full traceback logging in_get_model()exception handlersTest Results
All 18 Gemini engine tests pass:
Commit
f9a650afeat: add traceback logging and spec-aligned error message to GeminiEngine (refs #142)Verdict: PASS | Next: Close issue or continue with parent #129 milestones