feat: Gemini engine module and configuration (#129) #133


egullickson · 2026-02-11T03:04:50Z

egullickson commented

2026-02-11 03:04:50 +00:00

Relates to #129

Milestone 4: Gemini Engine Module and Configuration

Create standalone Gemini 2.5 Flash module in the Python OCR service for maintenance schedule extraction.

Files

ocr/app/engines/gemini_engine.py (NEW)
ocr/app/config.py
ocr/requirements.txt
docker-compose.yml
docker-compose.staging.yml
docker-compose.prod.yml

Requirements

Create GeminiEngine class (standalone, NOT extending OcrEngine) with extract_maintenance() method
Use the Vertex AI SDK: GenerativeModel("gemini-2.5-flash") with generate_content()
Accept PDF bytes (sent inline as base64 for files up to 20MB); reject PDFs over 20MB with a clear error
Use response_mime_type="application/json" and response_schema to constrain the response to the expected JSON structure
Use prompt from issue #129 for maintenance schedule extraction
Authentication via same WIF credential path (GOOGLE_APPLICATION_CREDENTIALS)
Add config settings: VERTEX_AI_PROJECT, VERTEX_AI_LOCATION, GEMINI_MODEL
Add google-cloud-aiplatform>=1.40.0 to requirements.txt
Add environment variables to docker-compose files
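
The configuration and size-limit requirements above could be sketched roughly as follows. This is an illustration only, not the code in ocr/app/config.py: `GeminiSettings`, `load_gemini_settings()`, `PdfTooLargeError`, and `check_pdf_size()` are hypothetical names, the `us-central1` default location is an assumption, and "20MB" is read here as 20 MiB.

```python
import os
from dataclasses import dataclass
from typing import Mapping, Optional

# Assumption: "20MB" is interpreted as 20 MiB; the issue does not say which.
MAX_PDF_BYTES = 20 * 1024 * 1024


@dataclass(frozen=True)
class GeminiSettings:
    """Hypothetical container for the three settings named in the issue."""
    vertex_ai_project: str
    vertex_ai_location: str
    gemini_model: str


def load_gemini_settings(env: Optional[Mapping[str, str]] = None) -> GeminiSettings:
    """Read settings from environment variables, falling back to defaults."""
    env = os.environ if env is None else env
    return GeminiSettings(
        # No sensible default exists for a project ID, so fall back to empty.
        vertex_ai_project=env.get("VERTEX_AI_PROJECT", ""),
        # Assumed default region; the issue does not name one.
        vertex_ai_location=env.get("VERTEX_AI_LOCATION", "us-central1"),
        # Model name taken from the issue text.
        gemini_model=env.get("GEMINI_MODEL", "gemini-2.5-flash"),
    )


class PdfTooLargeError(ValueError):
    """Hypothetical error type for PDFs over the inline size limit."""


def check_pdf_size(pdf_bytes: bytes) -> None:
    """Reject PDFs larger than the limit; a PDF exactly at the limit passes."""
    if len(pdf_bytes) > MAX_PDF_BYTES:
        size_mb = len(pdf_bytes) / (1024 * 1024)
        raise PdfTooLargeError(
            f"PDF is {size_mb:.1f} MB; the maximum supported size is 20 MB"
        )
```

Passing a plain dict as `env` keeps the loader easy to exercise in unit tests without touching the real process environment.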

Acceptance Criteria

GeminiEngine.extract_maintenance(pdf_bytes) returns structured JSON with maintenanceSchedule array
Each schedule item has serviceName (required), intervalMiles (nullable), intervalMonths (nullable), details (nullable)
PDFs > 20MB rejected with clear error message
Authentication works via WIF in container environment
Configuration reads from environment variables with sensible defaults
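
To make the expected response shape concrete, here is one way the schema could be written as an OpenAPI-style dictionary, together with a small hand-rolled shape check. The exact schema used by the implementation is not shown in this issue, so treat this as a sketch: the integer type for the interval fields is an assumption, and `MAINTENANCE_RESPONSE_SCHEMA` and `conforms()` are hypothetical names.

```python
from typing import Any

# Assumption: intervals are integers and the top-level key is required.
MAINTENANCE_RESPONSE_SCHEMA: dict[str, Any] = {
    "type": "object",
    "properties": {
        "maintenanceSchedule": {
            "type": "array",
            "items": {
                "type": "object",
                "properties": {
                    "serviceName": {"type": "string"},
                    "intervalMiles": {"type": "integer", "nullable": True},
                    "intervalMonths": {"type": "integer", "nullable": True},
                    "details": {"type": "string", "nullable": True},
                },
                "required": ["serviceName"],
            },
        },
    },
    "required": ["maintenanceSchedule"],
}


def conforms(payload: Any) -> bool:
    """Check that a parsed response matches the shape described above."""
    if not isinstance(payload, dict):
        return False
    schedule = payload.get("maintenanceSchedule")
    if not isinstance(schedule, list):
        return False
    nullable_fields = (
        ("intervalMiles", int),
        ("intervalMonths", int),
        ("details", str),
    )
    for item in schedule:
        # serviceName is the only required field on each item.
        if not isinstance(item, dict) or not isinstance(item.get("serviceName"), str):
            return False
        # Nullable fields may be absent or None, but must have the right type if set.
        for key, kind in nullable_fields:
            value = item.get(key)
            if value is not None and not isinstance(value, kind):
                return False
    return True
```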

Tests

Test files: ocr/tests/test_gemini_engine.py (NEW)
Test type: unit (mock Vertex AI SDK)
Scenarios:
- Normal: Valid PDF returns structured maintenance schedules
- Edge: PDF with no maintenance content returns empty array
- Error: PDF > 20MB rejected with size error
- Error: Vertex AI authentication failure handled gracefully
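
The scenarios above could be exercised without network access by replacing the model with a `unittest.mock.MagicMock`. The sketch below is not the contents of ocr/tests/test_gemini_engine.py: `extract_schedule()` is a hypothetical stand-in for the engine (it passes raw bytes straight to the model, which the real engine would not do), and the only thing assumed about the model is that `generate_content()` returns an object with a `.text` attribute holding JSON.

```python
import json
from unittest.mock import MagicMock


def extract_schedule(model, pdf_bytes: bytes) -> list:
    """Hypothetical stand-in: call the model and return the parsed schedule."""
    response = model.generate_content(pdf_bytes)
    return json.loads(response.text)["maintenanceSchedule"]


def make_model(payload: dict) -> MagicMock:
    """Build a mock model whose generate_content() returns the payload as JSON text."""
    model = MagicMock()
    model.generate_content.return_value.text = json.dumps(payload)
    return model


def test_valid_pdf_returns_schedule():
    model = make_model(
        {"maintenanceSchedule": [{"serviceName": "Oil change", "intervalMiles": 5000}]}
    )
    items = extract_schedule(model, b"%PDF-1.7 fake")
    assert items[0]["serviceName"] == "Oil change"
    model.generate_content.assert_called_once()


def test_no_maintenance_content_returns_empty_list():
    model = make_model({"maintenanceSchedule": []})
    assert extract_schedule(model, b"%PDF-1.7 fake") == []


def test_api_failure_is_observable():
    # side_effect makes the mocked call raise, simulating an auth failure.
    model = MagicMock()
    model.generate_content.side_effect = RuntimeError("auth failed")
    try:
        extract_schedule(model, b"%PDF-1.7 fake")
    except RuntimeError as exc:
        assert "auth failed" in str(exc)
    else:
        raise AssertionError("expected RuntimeError")
```

Under pytest these functions would be collected automatically. The stand-in lets the injected failure propagate unchanged, whereas an engine that handles it gracefully would translate it into its own error type.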


egullickson added labels 2026-02-11 03:12:39 +00:00

egullickson added and removed labels 2026-02-11 15:53:28 +00:00

egullickson commented

2026-02-11 16:01:07 +00:00

Milestone: Gemini Engine Module and Configuration

Phase: Execution | Agent: Feature Agent | Status: PASS

Changes Made

New Files:

ocr/app/engines/gemini_engine.py - Standalone GeminiEngine class with extract_maintenance(pdf_bytes) method
ocr/tests/test_gemini_engine.py - 18 unit tests (all passing)

Modified Files:

ocr/app/config.py - Added vertex_ai_project, vertex_ai_location, gemini_model settings
ocr/requirements.txt - Added google-cloud-aiplatform>=1.40.0
docker-compose.yml - Added VERTEX_AI_PROJECT, VERTEX_AI_LOCATION, GEMINI_MODEL env vars to mvp-ocr
docker-compose.staging.yml - Same env vars added
docker-compose.prod.yml - Same env vars added

Implementation Details

GeminiEngine is standalone (NOT extending OcrEngine ABC) since Gemini does semantic document understanding, not traditional OCR
Uses the Vertex AI SDK: GenerativeModel.generate_content() is called with response_mime_type="application/json" and a response_schema to constrain the response to the expected JSON structure
PDFs >20MB rejected with clear error message
Lazy initialization: model not created until first extract_maintenance() call
Authentication via same WIF credential path as Google Vision (GOOGLE_APPLICATION_CREDENTIALS)
Returns MaintenanceExtractionResult with list of MaintenanceItem dataclasses (camelCase from API mapped to snake_case Python)
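
The camelCase-to-snake_case mapping described above might look something like the following. The class names `MaintenanceItem` and `MaintenanceExtractionResult` come from this comment, but their field names, the `items` attribute, and the `parse_result()` helper are assumptions made for illustration.

```python
from dataclasses import dataclass, field
from typing import Any, Optional


@dataclass
class MaintenanceItem:
    """One schedule entry; only service_name is required."""
    service_name: str
    interval_miles: Optional[int] = None
    interval_months: Optional[int] = None
    details: Optional[str] = None


@dataclass
class MaintenanceExtractionResult:
    """Assumed shape: a simple wrapper around the list of items."""
    items: list[MaintenanceItem] = field(default_factory=list)


def parse_result(payload: dict[str, Any]) -> MaintenanceExtractionResult:
    """Map the camelCase API payload onto snake_case dataclass fields."""
    items = [
        MaintenanceItem(
            service_name=raw["serviceName"],
            # .get() returns None for absent keys, matching the nullable fields.
            interval_miles=raw.get("intervalMiles"),
            interval_months=raw.get("intervalMonths"),
            details=raw.get("details"),
        )
        for raw in payload.get("maintenanceSchedule", [])
    ]
    return MaintenanceExtractionResult(items=items)
```

For example, `parse_result({"maintenanceSchedule": []})` yields a result whose `items` list is empty, which corresponds to the "no maintenance content" case.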

Test Results

18 passed in 4.81s

Test scenarios covered:

Exception hierarchy validation
Data type construction (required-only and all fields)
PDF >20MB rejection
PDF exactly at 20MB limit passes size check
Valid PDF returns structured maintenance schedules
PDF with no maintenance content returns empty array
Nullable fields handled correctly
Missing credential file raises GeminiUnavailableError
Missing SDK raises GeminiUnavailableError
API runtime error wrapped as GeminiProcessingError
Invalid JSON response wrapped as GeminiProcessingError
Lazy initialization verified (model is None after construction)
Model reused on subsequent calls
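
The exception hierarchy and lazy-initialization behavior covered by these tests could be structured along the lines below. This is a sketch under stated assumptions, not the code in gemini_engine.py: the base class `GeminiEngineError`, the `LazyGeminiEngine` class, and the injected `model_factory` callable are hypothetical, and the choice of which low-level exceptions map to `GeminiUnavailableError` is a guess.

```python
from typing import Any, Callable, Optional


class GeminiEngineError(Exception):
    """Hypothetical common base; the issue names only the two subclasses."""


class GeminiUnavailableError(GeminiEngineError):
    """Raised when the engine cannot be set up (missing SDK or credentials)."""


class GeminiProcessingError(GeminiEngineError):
    """Raised when a request to the model fails at runtime."""


class LazyGeminiEngine:
    """Creates the model on first use and reuses it afterwards."""

    def __init__(self, model_factory: Callable[[], Any]) -> None:
        self._model_factory = model_factory
        self._model: Optional[Any] = None  # stays None until the first call

    def _get_model(self) -> Any:
        if self._model is None:
            try:
                self._model = self._model_factory()
            except ImportError as exc:
                raise GeminiUnavailableError("Vertex AI SDK is not installed") from exc
            except OSError as exc:
                # FileNotFoundError is a subclass of OSError.
                raise GeminiUnavailableError(f"credentials unavailable: {exc}") from exc
        return self._model

    def extract_maintenance(self, pdf_bytes: bytes) -> str:
        """Return the raw response text; parsing is left out of this sketch."""
        model = self._get_model()
        try:
            return model.generate_content(pdf_bytes).text
        except Exception as exc:
            raise GeminiProcessingError(f"Gemini request failed: {exc}") from exc
```

Injecting the factory keeps construction cheap and lets tests count how many times the model is actually created.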

Acceptance Criteria Status

- [x] GeminiEngine.extract_maintenance(pdf_bytes) returns structured JSON with maintenanceSchedule array
- [x] Each schedule item has serviceName (required), intervalMiles (nullable), intervalMonths (nullable), details (nullable)
- [x] PDFs >20MB rejected with clear error message
- [x] Authentication works via WIF in container environment (same path as Google Vision)
- [x] Configuration reads from environment variables with sensible defaults

Verdict: PASS | Next: Ready for PR / Quality Review


egullickson referenced this issue

2026-02-11 20:35:24 +00:00

feat: Gemini engine module and configuration (#129) #142

egullickson referenced this issue from a commit

2026-02-11 21:27:47 +00:00

feat: add Gemini engine module and configuration (refs #133)

egullickson referenced a pull request that will close this issue

2026-02-11 21:28:11 +00:00

feat: Expand OCR with fuel receipt scanning and maintenance extraction (#129) #147

egullickson closed this issue

2026-02-13 02:25:56 +00:00
