feat: Migrate GeminiEngine to google-genai with Google Search grounding (#231) #233

Closed
opened 2026-02-20 15:11:05 +00:00 by egullickson · 2 comments
Owner

Relates to #231

Migrate ocr/app/engines/gemini_engine.py from vertexai.generative_models to google.genai:

  • Replace _get_model() with _get_client() using genai.Client(vertexai=True, project, location)
  • Store self._client (genai.Client) and self._model_name (str) instead of self._model and self._generation_config
  • Migrate extract_maintenance(): Part.from_data() -> types.Part.from_bytes(), use client.models.generate_content(model=..., contents=..., config=GenerateContentConfig(...))
  • Migrate decode_vin(): same pattern + add tools=[types.Tool(google_search=types.GoogleSearch())]
  • Update imports and error messages

Acceptance Criteria

  • No imports from vertexai or google.cloud.aiplatform
  • Uses genai.Client(vertexai=True, ...) for initialization
  • decode_vin() includes Google Search grounding tool
  • extract_maintenance() works with new SDK (no search grounding)
  • Error handling preserved

File

ocr/app/engines/gemini_engine.py

egullickson added the
status
in-progress
type
feature
labels 2026-02-20 15:11:19 +00:00
egullickson added this to the Sprint 2026-02-02 milestone 2026-02-20 15:11:22 +00:00
Author
Owner

Plan: M2 -- Migrate GeminiEngine (#233)

Phase: Planning | Agent: Planner | Status: APPROVED
Parent: #231 | Revision: v4


Context

The OCR service uses the deprecated vertexai.generative_models SDK from google-cloud-aiplatform in gemini_engine.py. The replacement google-genai package provides a single client with per-call model selection and first-class Google Search grounding for VIN decode accuracy.

Codebase Analysis

| File | SDK References | Action |
|------|----------------|--------|
| `ocr/app/engines/gemini_engine.py` | 4 import sites: `aiplatform` (L232), `GenerationConfig` + `GenerativeModel` (L233-236), `Part` (L299), `GenerationConfig` (L374) | Full migration + Google Search grounding |
| `ocr/app/config.py` | `vertex_ai_project`, `vertex_ai_location`, `gemini_model` + inline comment | Settings unchanged; update inline comment to reference google-genai |
| `docker-compose.yml` | `VERTEX_AI_PROJECT`, `VERTEX_AI_LOCATION`, `GEMINI_MODEL` env vars | No changes -- `genai.Client(vertexai=True)` consumes `project` and `location` as constructor keyword args, which map to these env vars via config.py |

API Migration Map

| Old (`vertexai.generative_models`) | New (`google.genai`) |
|------------------------------------|----------------------|
| `from google.cloud import aiplatform` | `from google import genai` |
| `from vertexai.generative_models import GenerativeModel, GenerationConfig, Part` | `from google.genai import types` |
| `aiplatform.init(project=..., location=...)` | `genai.Client(vertexai=True, project=..., location=...)` |
| `GenerativeModel(model_name)` | Client handles model per-call via `model=` kwarg |
| `model.generate_content([...], generation_config=config)` | `client.models.generate_content(model=name, contents=[...], config=config)` |
| `GenerationConfig(response_mime_type=..., response_schema=...)` | `types.GenerateContentConfig(response_mime_type=..., response_schema=...)` |
| `Part.from_data(data=bytes, mime_type=...)` | `types.Part.from_bytes(data=bytes, mime_type=...)` |
| N/A | `tools=[types.Tool(google_search=types.GoogleSearch())]` |
| Schema type `"string"`, `"object"`, etc. | Schema type `"STRING"`, `"OBJECT"`, etc. (uppercase per Vertex AI Schema spec) |
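The rows above combine into one concrete sketch of a new-SDK call. This is an illustrative shape only: the model name, project, location, and prompt text are placeholders, not values from this repo.

```python
# Sketch of the new-SDK call shape from the migration map. Imports live
# inside the function so the sketch can be read without the SDK installed;
# all literal values are placeholders.
def extract_with_new_sdk(pdf_bytes: bytes) -> str:
    from google import genai        # replaces `from google.cloud import aiplatform`
    from google.genai import types  # replaces vertexai.generative_models imports

    # One client, constructed once; the model is chosen per call.
    client = genai.Client(vertexai=True, project="my-project", location="us-central1")

    response = client.models.generate_content(
        model="gemini-2.0-flash",  # placeholder; real name comes from settings.gemini_model
        contents=[
            types.Part.from_bytes(data=pdf_bytes, mime_type="application/pdf"),
            "Extract the maintenance records as JSON.",
        ],
        config=types.GenerateContentConfig(response_mime_type="application/json"),
    )
    return response.text
```
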

Internal State Changes

GeminiEngine changes from:

self._model: Any | None = None          # GenerativeModel instance (set in _get_model)
# self._generation_config also set in _get_model() -- will be removed

To:

self._client: Any | None = None         # genai.Client instance (set in _get_client)
self._model_name: str = ""              # Model name string for per-call use

Note: manual_extractor.py uses GeminiEngine.extract_maintenance() transitively -- its maintenance extraction flow is covered by this migration.

Authentication

Authentication unchanged: GOOGLE_APPLICATION_CREDENTIALS env var pointing to WIF credential config. The new SDK uses the same ADC (Application Default Credentials) chain.

CRITICAL: In _get_client(), os.environ["GOOGLE_APPLICATION_CREDENTIALS"] and os.environ["GOOGLE_EXTERNAL_ACCOUNT_ALLOW_EXECUTABLES"] MUST be set BEFORE genai.Client() construction (preserving the existing pattern from _get_model()).

Config settings vertex_ai_project and vertex_ai_location remain in use -- passed as keyword args to genai.Client(vertexai=True, project=settings.vertex_ai_project, location=settings.vertex_ai_location).
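A minimal sketch of this ordering constraint, assuming a settings object carrying the attributes named above; the credential path and class body are abridged illustrations, not the real engine code.

```python
import os


class GeminiEngine:  # abridged sketch; the real class lives in gemini_engine.py
    def __init__(self, settings):
        self._settings = settings
        self._client = None

    def _get_client(self):
        if self._client is None:
            # Credentials MUST be exported before genai.Client() is built,
            # preserving the ordering from the old _get_model().
            os.environ["GOOGLE_APPLICATION_CREDENTIALS"] = "/secrets/wif.json"  # illustrative path
            os.environ["GOOGLE_EXTERNAL_ACCOUNT_ALLOW_EXECUTABLES"] = "1"

            from google import genai
            self._client = genai.Client(
                vertexai=True,
                project=self._settings.vertex_ai_project,
                location=self._settings.vertex_ai_location,
            )
        return self._client
```
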

Error Handling Strategy

The existing error handling pattern is preserved: try/except ImportError for missing SDK, try/except Exception as catch-all for auth and API failures. The generic except Exception clause intentionally catches new google-genai SDK exceptions (APIError, ClientError, ServerError) and wraps them as GeminiUnavailableError or GeminiProcessingError. This is acceptable because:

  • The custom exception hierarchy provides a stable interface for callers
  • Exception type and message are logged before wrapping, preserving diagnostic info
  • No silent failures: all errors are logged and re-raised
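A minimal sketch of the preserved wrapping pattern: the two exception classes stand in for the custom hierarchy named above, the logger for the service's, and the wrapper function for the inline try/except in each engine method.

```python
import logging

logger = logging.getLogger("gemini_engine")


class GeminiUnavailableError(RuntimeError): ...
class GeminiProcessingError(RuntimeError): ...


def call_gemini(fn):
    """Run an SDK call and wrap failures in the custom hierarchy."""
    try:
        return fn()
    except ImportError as exc:
        logger.error("google-genai is not installed: %s", exc)
        raise GeminiUnavailableError("google-genai is not installed") from exc
    except Exception as exc:
        # Intentionally broad: catches the new SDK's APIError/ClientError/
        # ServerError, logging type and message before wrapping.
        logger.error("Gemini call failed (%s): %s", type(exc).__name__, exc)
        raise GeminiProcessingError(str(exc)) from exc
```
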

Implementation

  • File: ocr/app/engines/gemini_engine.py
  • Replace _get_model() with _get_client(): set ADC env vars first, then create genai.Client(vertexai=True, project=..., location=...), store as self._client
  • Store self._model_name = settings.gemini_model for per-call use
  • Remove self._generation_config assignment entirely (configs built per-call)
  • Convert _RESPONSE_SCHEMA and _VIN_DECODE_SCHEMA type values to uppercase ("object" -> "OBJECT", "string" -> "STRING", "number" -> "NUMBER", "integer" -> "INTEGER", "array" -> "ARRAY")
  • extract_maintenance(): Part.from_data() -> types.Part.from_bytes(), call self._client.models.generate_content(model=self._model_name, contents=[...], config=types.GenerateContentConfig(...))
  • decode_vin(): same migration + add tools=[types.Tool(google_search=types.GoogleSearch())] in GenerateContentConfig
  • Update ImportError message from "google-cloud-aiplatform" to "google-genai"
  • Update module docstring (L5): replace "Uses Vertex AI SDK" with "Uses google-genai SDK"
  • Update class docstring (L209): replace "the Vertex AI client" with "the Gemini client"
  • Also update ocr/app/config.py inline comment from "Vertex AI / Gemini configuration" to "Google GenAI / Gemini configuration"
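The grounded decode_vin() config from the steps above can be sketched as follows; the helper name is hypothetical, and the schema is passed in as a parameter rather than referencing the module-level _VIN_DECODE_SCHEMA.

```python
def build_vin_decode_config(vin_schema: dict):
    """Illustrative decode_vin() config: structured JSON output plus the
    Google Search grounding tool added by this migration."""
    from google.genai import types

    return types.GenerateContentConfig(
        response_mime_type="application/json",
        response_schema=vin_schema,
        tools=[types.Tool(google_search=types.GoogleSearch())],
    )
```
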

Risk Assessment

Google Search + response_schema compatibility: The VIN decode currently uses response_schema for structured JSON output. Google Search grounding may conflict with structured output mode. Fallback: if incompatible, remove response_schema from VIN decode config and parse JSON from the text response (the prompt already requests JSON format). Low risk -- documentation indicates both can coexist.

VERTEX_AI_LOCATION=global: The docker-compose.yml sets this to "global". The new genai.Client(vertexai=True, location=...) should accept this value. If "global" is not supported, fall back to "us-central1".

Schema type casing: All response_schema dicts will be converted to uppercase types per Vertex AI Schema specification. If the new SDK auto-normalizes lowercase, the uppercase conversion is harmless. If it does not, this prevents runtime failures.
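The conversion can be done with a hypothetical recursive helper like the one below; the actual migration may simply hand-edit the schema dicts instead.

```python
def uppercase_schema_types(schema):
    """Recursively uppercase every "type" value in a response_schema dict."""
    if isinstance(schema, dict):
        return {
            k: (v.upper() if k == "type" and isinstance(v, str)
                else uppercase_schema_types(v))
            for k, v in schema.items()
        }
    if isinstance(schema, list):
        return [uppercase_schema_types(item) for item in schema]
    return schema


example = {"type": "object", "properties": {"vin": {"type": "string"}}}
print(uppercase_schema_types(example))
# -> {'type': 'OBJECT', 'properties': {'vin': {'type': 'STRING'}}}
```
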

Review Findings

QR plan-completeness:

  • [RULE 1] HIGH: Import count is 4, not 3 -- corrected in plan v4
  • [RULE 1] MEDIUM: Config settings confirmation -- confirmed settings.vertex_ai_project and settings.vertex_ai_location still consumed
  • [RULE 1] MEDIUM: Error messages reference old package name at gemini_engine.py:265-266 -- included in scope

TW plan-scrub:

  • PRECISION: API Migration Map uses keyword arg syntax genai.Client(vertexai=True, project=..., location=...) -- corrected
  • COMPLETENESS: self._generation_config assignment removal explicitly stated
  • CONSISTENCY: MaintenanceExtractionResult.model field (model name string) is unaffected -- disambiguated

QR plan-code:

  • [RULE 0] CRITICAL: Schema type values must be uppercase -- added to implementation
  • [RULE 0] HIGH: New SDK exception types caught by generic except Exception -- documented in error handling strategy
  • [RULE 0] MEDIUM: ADC env var setup ordering -- explicitly stated as CRITICAL
  • [RULE 0] MEDIUM: Google Search grounding adds new external failure mode -- log specific error type

QR plan-docs:

  • [RULE 2] SHOULD_FIX: Class docstring update (L209) -- included in implementation
  • [RULE 2] SHOULD_FIX: config.py inline comment -- included in implementation

Verdict: APPROVED | Next: Execute (depends on M1 #232)

Author
Owner

Milestone: M2 Complete -- Migrate GeminiEngine

Phase: Execution | Agent: Developer | Status: PASS


Changes

  • ocr/app/engines/gemini_engine.py: Full SDK migration from vertexai.generative_models to google.genai
    • _get_model() -> _get_client() using genai.Client(vertexai=True, ...)
    • self._model + self._generation_config -> self._client + self._model_name
    • Part.from_data() -> types.Part.from_bytes()
    • model.generate_content(...) -> client.models.generate_content(model=..., ...)
    • GenerationConfig(...) -> types.GenerateContentConfig(...)
    • decode_vin() now includes tools=[types.Tool(google_search=types.GoogleSearch())]
    • Schema type values converted to uppercase (STRING, OBJECT, NUMBER, etc.)
    • Updated module docstring, class docstring, and error messages
  • ocr/app/config.py: Updated inline comment from "Vertex AI" to "Google GenAI"

Acceptance Criteria

  • No imports from vertexai or google.cloud.aiplatform
  • Uses genai.Client(vertexai=True, ...) for initialization
  • decode_vin() includes Google Search grounding tool
  • extract_maintenance() works with new SDK (no search grounding)
  • Error handling preserved

Verdict: PASS | Next: M3 -- Migrate MaintenanceReceiptExtractor (#234)


Reference: egullickson/motovaultpro#233