feat: Gemini engine module and configuration (#129) #142

Closed
opened 2026-02-11 03:49:44 +00:00 by egullickson · 1 comment
Owner

Relates to #129

Milestone 4: Gemini Engine Module and Configuration

Files

  • ocr/app/engines/gemini_engine.py (NEW)
  • ocr/app/config.py
  • ocr/requirements.txt
  • docker-compose.yml
  • docker-compose.staging.yml
  • docker-compose.prod.yml

Requirements

  • Create GeminiEngine class (standalone, NOT extending OcrEngine -- different interface: OcrEngine.recognize() accepts image bytes and returns text+confidence; GeminiEngine.extract_maintenance() accepts PDF bytes and returns structured JSON)
  • GeminiEngine uses Vertex AI SDK: GenerativeModel("gemini-2.5-flash") with generate_content()
  • GeminiEngine.extract_maintenance(pdf_bytes) validates PDF size BEFORE processing. Reject if raw bytes > 20MB with clear error message including file size. Vertex AI SDK handles base64 encoding internally.
  • Use response_mime_type="application/json" and response_schema for guaranteed JSON structure
  • Gemini response schema uses camelCase field names (serviceName, intervalMiles, intervalMonths, details) to match backend API convention
  • Use prompt from issue #129 for maintenance schedule extraction
  • Response schema enforces maintenanceSchedule[] with serviceName (required), intervalMiles (nullable), intervalMonths (nullable), details (nullable)
  • GeminiEngine._get_client() wraps Vertex AI client initialization in try/except. Catch all exceptions, log full traceback, raise EngineUnavailableError with message: "Vertex AI authentication failed: {exc}"
  • Gemini reads GOOGLE_APPLICATION_CREDENTIALS env var (same as Vision API). Vertex AI SDK ADC supports external_account (WIF) credential type -- no code changes needed, environment setup only
  • Add config settings: VERTEX_AI_PROJECT (required), VERTEX_AI_LOCATION (default: us-central1), GEMINI_MODEL (default: gemini-2.5-flash)
  • Add google-cloud-aiplatform>=1.40.0 to requirements.txt
  • Add environment variables to all docker-compose files for OCR service
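A minimal sketch of how these requirements might fit together. This is illustrative only: `PROMPT` is a placeholder (the real extraction prompt is specified in issue #129), and the Vertex AI calls assume a recent `google-cloud-aiplatform` where `GenerationConfig(response_schema=...)` is available.

```python
import json
import logging

logger = logging.getLogger(__name__)

MAX_PDF_BYTES = 20 * 1024 * 1024  # 20MB raw-bytes limit from the spec

# Placeholder only -- the real extraction prompt is specified in issue #129.
PROMPT = "Extract the vehicle maintenance schedule from this PDF."

# camelCase response schema matching the backend API convention.
MAINTENANCE_SCHEMA = {
    "type": "object",
    "properties": {
        "maintenanceSchedule": {
            "type": "array",
            "items": {
                "type": "object",
                "properties": {
                    "serviceName": {"type": "string"},
                    "intervalMiles": {"type": "integer", "nullable": True},
                    "intervalMonths": {"type": "integer", "nullable": True},
                    "details": {"type": "string", "nullable": True},
                },
                "required": ["serviceName"],
            },
        }
    },
    "required": ["maintenanceSchedule"],
}


class EngineUnavailableError(RuntimeError):
    """Raised when the Vertex AI client cannot be initialized."""


class GeminiEngine:
    """Standalone PDF-extraction engine; deliberately does NOT extend OcrEngine."""

    def __init__(self, project, location="us-central1", model_name="gemini-2.5-flash"):
        self._project = project
        self._location = location
        self._model_name = model_name
        self._model = None  # lazily initialized

    def _get_client(self):
        if self._model is None:
            try:
                # Imported lazily so the module loads even without the SDK installed.
                import vertexai
                from vertexai.generative_models import GenerativeModel

                vertexai.init(project=self._project, location=self._location)
                self._model = GenerativeModel(self._model_name)
            except Exception as exc:
                logger.exception("Vertex AI initialization failed")
                raise EngineUnavailableError(
                    f"Vertex AI authentication failed: {exc}"
                ) from exc
        return self._model

    def extract_maintenance(self, pdf_bytes: bytes) -> dict:
        # Validate raw size BEFORE any SDK work; the SDK base64-encodes internally.
        if len(pdf_bytes) > MAX_PDF_BYTES:
            raise ValueError(
                f"PDF is {len(pdf_bytes)} bytes; maximum is {MAX_PDF_BYTES} bytes (20MB)"
            )
        from vertexai.generative_models import GenerationConfig, Part

        model = self._get_client()
        response = model.generate_content(
            [Part.from_data(data=pdf_bytes, mime_type="application/pdf"), PROMPT],
            generation_config=GenerationConfig(
                response_mime_type="application/json",
                response_schema=MAINTENANCE_SCHEMA,
            ),
        )
        return json.loads(response.text)
```

Because the size check runs before any SDK import or call, oversize PDFs are rejected even when Vertex AI is unreachable.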

Acceptance Criteria

  • GeminiEngine.extract_maintenance(pdf_bytes) returns structured JSON with maintenanceSchedule array
  • Each schedule item has serviceName (required), intervalMiles (nullable), intervalMonths (nullable), details (nullable)
  • PDFs > 20MB (raw bytes) rejected with clear error message including file size
  • WIF authentication failure raises EngineUnavailableError with diagnostic message
  • Configuration reads from environment variables with sensible defaults
  • All field names in response use camelCase
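For illustration, a result satisfying the criteria above might look like the following (all values are invented; only the shape and the camelCase keys are per spec):

```python
# Hypothetical well-formed extract_maintenance() result.
sample = {
    "maintenanceSchedule": [
        {"serviceName": "Oil change", "intervalMiles": 4000,
         "intervalMonths": 6, "details": "Full synthetic recommended"},
        {"serviceName": "Valve clearance check", "intervalMiles": 16000,
         "intervalMonths": None, "details": None},
    ]
}

# serviceName is mandatory on every item; the interval fields may be null (None).
for item in sample["maintenanceSchedule"]:
    assert "serviceName" in item
```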

Tests

  • Test files: ocr/tests/test_gemini_engine.py (NEW)
  • Test type: unit (mock Vertex AI SDK)
  • Scenarios:
    • Normal: Valid PDF returns structured maintenance schedules with camelCase fields
    • Edge: PDF with no maintenance content returns empty array
    • Error: PDF > 20MB rejected with size error message
    • Error: Vertex AI authentication failure raises EngineUnavailableError with diagnostic info
    • Error: Gemini API call failure handled gracefully
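The mocked-SDK scenarios above could be sketched roughly as below. The inline `GeminiEngine` here is a minimal stand-in so the snippet is self-contained; the real tests would import the class from `ocr/app/engines/gemini_engine.py`.

```python
import json
from unittest import mock


class GeminiEngine:
    """Minimal stand-in for the real engine, for sketch purposes only."""

    MAX_BYTES = 20 * 1024 * 1024

    def _get_client(self):
        raise NotImplementedError("patched in tests")

    def extract_maintenance(self, pdf_bytes: bytes) -> dict:
        if len(pdf_bytes) > self.MAX_BYTES:
            raise ValueError(f"PDF is {len(pdf_bytes)} bytes; limit is 20MB")
        response = self._get_client().generate_content(pdf_bytes)
        return json.loads(response.text)


def test_no_maintenance_content_returns_empty_array():
    # Edge case: a PDF with no maintenance content yields an empty array.
    engine = GeminiEngine()
    fake_model = mock.Mock()
    fake_model.generate_content.return_value = mock.Mock(
        text='{"maintenanceSchedule": []}'
    )
    with mock.patch.object(engine, "_get_client", return_value=fake_model):
        assert engine.extract_maintenance(b"%PDF-1.4") == {"maintenanceSchedule": []}


def test_oversize_pdf_rejected():
    # Error case: >20MB PDFs are rejected with the size in the message.
    engine = GeminiEngine()
    try:
        engine.extract_maintenance(b"x" * (GeminiEngine.MAX_BYTES + 1))
        assert False, "expected ValueError"
    except ValueError as exc:
        assert "bytes" in str(exc)
```

Patching `_get_client` on the instance keeps the tests fully offline: no Vertex AI credentials, network, or SDK install is needed.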

egullickson added the status/backlog and type/feature labels 2026-02-11 03:51:15 +00:00
egullickson added the status/in-progress label and removed the status/backlog label 2026-02-11 20:31:02 +00:00
Author
Owner

Milestone: Gemini Engine Module and Configuration

Phase: Execution | Agent: Developer | Status: PASS

Summary

Audited existing GeminiEngine implementation (committed under refs #133) against all #142 acceptance criteria. Found one gap and fixed it.

Acceptance Criteria Verification

| # | Requirement | Status |
|---|---|---|
| 1 | GeminiEngine class, standalone (NOT extending OcrEngine) | PASS |
| 2 | Vertex AI SDK GenerativeModel with generate_content() | PASS |
| 3 | PDF size validation before processing, >20MB rejected with file size in error | PASS |
| 4 | response_mime_type="application/json" and response_schema enforcement | PASS |
| 5 | camelCase field names (serviceName, intervalMiles, intervalMonths, details) | PASS |
| 6 | Extraction prompt matches #129 spec | PASS |
| 7 | Response schema: maintenanceSchedule[] with serviceName required, nullable fields | PASS |
| 8 | _get_model() wraps init with traceback logging, raises GeminiUnavailableError | PASS (fixed) |
| 9 | Reads GOOGLE_APPLICATION_CREDENTIALS env var for WIF auth | PASS |
| 10 | Config: VERTEX_AI_PROJECT, VERTEX_AI_LOCATION (default us-central1), GEMINI_MODEL (default gemini-2.5-flash) | PASS |
| 11 | google-cloud-aiplatform>=1.40.0 in requirements.txt | PASS |
| 12 | Environment variables in all 3 docker-compose files | PASS |

Gap Fixed

  • Added logger.exception() calls for full traceback logging in _get_model() exception handlers
  • Updated catch-all error message to "Vertex AI authentication failed: {exc}" per spec
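The fixed handler could look roughly like this sketch (the `init` callable stands in for the actual Vertex AI initialization code, which is not shown in this comment):

```python
import logging

logger = logging.getLogger(__name__)


class GeminiUnavailableError(RuntimeError):
    """Raised when the Gemini model cannot be initialized."""


def get_model(init):
    """Sketch of the fixed _get_model() error handling."""
    try:
        return init()
    except Exception as exc:
        # logger.exception records the full traceback, per the gap fix.
        logger.exception("Vertex AI model initialization failed")
        # Catch-all message wording aligned with the #142 spec.
        raise GeminiUnavailableError(f"Vertex AI authentication failed: {exc}") from exc
```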

Test Results

All 18 Gemini engine tests pass:

  • 3 exception hierarchy tests
  • 4 data type tests
  • 2 PDF size validation tests
  • 3 successful extraction tests
  • 4 error handling tests
  • 2 lazy initialization tests

Commit

f9a650a feat: add traceback logging and spec-aligned error message to GeminiEngine (refs #142)

Verdict: PASS | Next: Close issue or continue with parent #129 milestones

Reference: egullickson/motovaultpro#142