feat: Migrate GeminiEngine to google-genai with Google Search grounding (#231) #233
Relates to #231
Migrate `ocr/app/engines/gemini_engine.py` from `vertexai.generative_models` to `google.genai`:

- Replace `_get_model()` with `_get_client()` using `genai.Client(vertexai=True, project, location)`
- Store `self._client` (`genai.Client`) and `self._model_name` (str) instead of `self._model` and `self._generation_config`
- `extract_maintenance()`: `Part.from_data()` -> `types.Part.from_bytes()`; use `client.models.generate_content(model=..., contents=..., config=GenerateContentConfig(...))`
- `decode_vin()`: same pattern, plus `tools=[types.Tool(google_search=types.GoogleSearch())]`

Acceptance Criteria
- No remaining references to `vertexai` or `google.cloud.aiplatform`
- `genai.Client(vertexai=True, ...)` used for initialization
- `decode_vin()` includes Google Search grounding tool
- `extract_maintenance()` works with new SDK (no search grounding)

File
- `ocr/app/engines/gemini_engine.py`

Plan: M2 -- Migrate GeminiEngine (#233)
Phase: Planning | Agent: Planner | Status: APPROVED
Parent: #231 | Revision: v4
Context
The OCR service uses the deprecated `vertexai.generative_models` SDK from `google-cloud-aiplatform` in `gemini_engine.py`. The replacement `google-genai` package provides a client-per-call pattern and first-class Google Search grounding for VIN decode accuracy.

Codebase Analysis
- `ocr/app/engines/gemini_engine.py`: `aiplatform` (L232), `GenerationConfig` + `GenerativeModel` (L233-236), `Part` (L299), `GenerationConfig` (L374)
- `ocr/app/config.py`: `vertex_ai_project`, `vertex_ai_location`, `gemini_model` + inline comment
- `docker-compose.yml`: `VERTEX_AI_PROJECT`, `VERTEX_AI_LOCATION`, `GEMINI_MODEL` env vars
- `genai.Client(vertexai=True)` consumes `project` and `location` as constructor keyword args, which map to these env vars via `config.py`

API Migration Map
| Old (`vertexai.generative_models`) | New (`google.genai`) |
| --- | --- |
| `from google.cloud import aiplatform` | `from google import genai` |
| `from vertexai.generative_models import GenerativeModel, GenerationConfig, Part` | `from google.genai import types` |
| `aiplatform.init(project=..., location=...)` | `genai.Client(vertexai=True, project=..., location=...)` |
| `GenerativeModel(model_name)` | `model=` kwarg per call |
| `model.generate_content([...], generation_config=config)` | `client.models.generate_content(model=name, contents=[...], config=config)` |
| `GenerationConfig(response_mime_type=..., response_schema=...)` | `types.GenerateContentConfig(response_mime_type=..., response_schema=...)` |
| `Part.from_data(data=bytes, mime_type=...)` | `types.Part.from_bytes(data=bytes, mime_type=...)` |
| (no grounding support) | `tools=[types.Tool(google_search=types.GoogleSearch())]` |
| Schema types `"string"`, `"object"`, etc. | `"STRING"`, `"OBJECT"`, etc. (uppercase per Vertex AI Schema spec) |

Internal State Changes
GeminiEngine internal state changes from `self._model` (`GenerativeModel`) plus `self._generation_config` to `self._client` (`genai.Client`) plus `self._model_name` (str).
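A minimal sketch of this state change, with SDK objects stubbed as `Any` placeholders — only the attribute names come from the plan; the classes and constructor signatures here are illustrative:

```python
from typing import Any

class GeminiEngineOld:
    """Old shape: a long-lived model object plus a shared generation config."""
    def __init__(self, model: Any, generation_config: Any) -> None:
        self._model = model                      # vertexai GenerativeModel
        self._generation_config = generation_config

class GeminiEngineNew:
    """New shape: a client plus a model-name string; configs are built per call."""
    def __init__(self, client: Any, model_name: str) -> None:
        self._client = client                    # genai.Client(vertexai=True, ...)
        self._model_name = model_name            # from settings.gemini_model
```

The point of the new shape is that nothing call-specific is cached: each `generate_content` call names the model and builds its own `GenerateContentConfig`.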
Note: `manual_extractor.py` uses `GeminiEngine.extract_maintenance()` transitively -- its maintenance extraction flow is covered by this migration.

Authentication
Authentication unchanged:
`GOOGLE_APPLICATION_CREDENTIALS` env var pointing to WIF credential config. The new SDK uses the same ADC (Application Default Credentials) chain.

CRITICAL: In `_get_client()`, `os.environ["GOOGLE_APPLICATION_CREDENTIALS"]` and `os.environ["GOOGLE_EXTERNAL_ACCOUNT_ALLOW_EXECUTABLES"]` MUST be set BEFORE `genai.Client()` construction (preserving the existing pattern from `_get_model()`).

Config settings
`vertex_ai_project` and `vertex_ai_location` remain in use -- passed as keyword args to `genai.Client(vertexai=True, project=settings.vertex_ai_project, location=settings.vertex_ai_location)`.

Error Handling Strategy
The existing error handling pattern is preserved:
`try/except ImportError` for missing SDK, `try/except Exception` as catch-all for auth and API failures. The generic `except Exception` clause intentionally catches the new `google-genai` SDK exceptions (`APIError`, `ClientError`, `ServerError`) and wraps them as `GeminiUnavailableError` or `GeminiProcessingError`. This is acceptable because:

Implementation
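The per-call pattern the steps below describe can be sketched with the SDK import guarded in the same try/except style. The exception classes here are simplified stubs for the engine's real ones, the `client_factory` parameter and PDF mime type are illustrative assumptions, and the SDK calls are the ones named in the migration map:

```python
# Sketch only: the preserved error-handling pattern wrapped around the
# new per-call google-genai API. Exception class names come from the
# plan; their definitions here are stubs.
class GeminiUnavailableError(Exception): ...
class GeminiProcessingError(Exception): ...

def extract_maintenance_sketch(client_factory, model_name: str, pdf_bytes: bytes) -> str:
    try:
        from google.genai import types         # new SDK (may be absent)
    except ImportError as exc:
        raise GeminiUnavailableError("google-genai not installed") from exc
    try:
        client = client_factory()              # e.g. genai.Client(vertexai=True, ...)
        response = client.models.generate_content(
            model=model_name,
            contents=[types.Part.from_bytes(data=pdf_bytes, mime_type="application/pdf")],
            config=types.GenerateContentConfig(response_mime_type="application/json"),
        )
        return response.text
    except Exception as exc:                   # catch-all, wrapped per the strategy
        raise GeminiProcessingError(str(exc)) from exc
```

`decode_vin()` follows the same shape with `tools=[types.Tool(google_search=types.GoogleSearch())]` added to the config.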
- `ocr/app/engines/gemini_engine.py`:
  - Replace `_get_model()` with `_get_client()`: set ADC env vars first, then create `genai.Client(vertexai=True, project=..., location=...)`, store as `self._client`
  - Store `self._model_name = settings.gemini_model` for per-call use
  - Drop the `self._generation_config` assignment entirely (configs built per-call)
  - Convert `_RESPONSE_SCHEMA` and `_VIN_DECODE_SCHEMA` type values to uppercase (`"object"` -> `"OBJECT"`, `"string"` -> `"STRING"`, `"number"` -> `"NUMBER"`, `"integer"` -> `"INTEGER"`, `"array"` -> `"ARRAY"`)
  - `extract_maintenance()`: `Part.from_data()` -> `types.Part.from_bytes()`; call `self._client.models.generate_content(model=self._model_name, contents=[...], config=types.GenerateContentConfig(...))`
  - `decode_vin()`: same migration, plus `tools=[types.Tool(google_search=types.GoogleSearch())]` in `GenerateContentConfig`
- `ocr/app/config.py`: update inline comment from "Vertex AI / Gemini configuration" to "Google GenAI / Gemini configuration"

Risk Assessment
- Google Search + response_schema compatibility: The VIN decode currently uses `response_schema` for structured JSON output. Google Search grounding may conflict with structured output mode. Fallback: if incompatible, remove `response_schema` from the VIN decode config and parse JSON from the text response (the prompt already requests JSON format). Low risk -- documentation indicates both can coexist.
- `VERTEX_AI_LOCATION=global`: The docker-compose.yml sets this to "global". The new `genai.Client(vertexai=True, location=...)` should accept this value. If "global" is not supported, fall back to "us-central1".
- Schema type casing: All `response_schema` dicts will be converted to uppercase types per the Vertex AI Schema specification. If the new SDK auto-normalizes lowercase, the uppercase conversion is harmless; if it does not, this prevents runtime failures.
Review Findings
QR plan-completeness:
`settings.vertex_ai_project` and `settings.vertex_ai_location` still consumed at `gemini_engine.py:265-266` -- included in scope

TW plan-scrub:
- `genai.Client(vertexai=True, project=..., location=...)` -- corrected
- `self._generation_config` assignment removal explicitly stated
- `MaintenanceExtractionResult.model` field (model name string) is unaffected -- disambiguated

QR plan-code:
`except Exception` -- documented in error handling strategy

QR plan-docs:
Verdict: APPROVED | Next: Execute (depends on M1 #232)
Milestone: M2 Complete -- Migrate GeminiEngine
Phase: Execution | Agent: Developer | Status: PASS
Changes
- `ocr/app/engines/gemini_engine.py`: Full SDK migration from `vertexai.generative_models` to `google.genai`
  - `_get_model()` -> `_get_client()` using `genai.Client(vertexai=True, ...)`
  - `self._model` + `self._generation_config` -> `self._client` + `self._model_name`
  - `Part.from_data()` -> `types.Part.from_bytes()`
  - `model.generate_content(...)` -> `client.models.generate_content(model=..., ...)`
  - `GenerationConfig(...)` -> `types.GenerateContentConfig(...)`
  - `decode_vin()` now includes `tools=[types.Tool(google_search=types.GoogleSearch())]`
- `ocr/app/config.py`: Updated inline comment from "Vertex AI" to "Google GenAI"

Acceptance Criteria
- No remaining references to `vertexai` or `google.cloud.aiplatform`
- `genai.Client(vertexai=True, ...)` used for initialization
- `decode_vin()` includes Google Search grounding tool
- `extract_maintenance()` works with new SDK (no search grounding)

Verdict: PASS | Next: M3 -- Migrate MaintenanceReceiptExtractor (#234)