feat: Migrate MaintenanceReceiptExtractor to google-genai (#231) #234
Relates to #231
Migrate ocr/app/extractors/maintenance_receipt_extractor.py from vertexai.generative_models to google.genai:
- Replace _get_model() with _get_client() using genai.Client(vertexai=True, project, location)
- Store self._client and self._model_name instead of self._model and self._generation_config
- Update _extract_with_gemini(): use client.models.generate_content(model=..., contents=..., config=GenerateContentConfig(...))
Acceptance Criteria
- No imports from vertexai or google.cloud.aiplatform
- Uses genai.Client(vertexai=True, ...) for initialization
File
ocr/app/extractors/maintenance_receipt_extractor.py
Plan: M3 -- Migrate MaintenanceReceiptExtractor (#234)
Phase: Planning | Agent: Planner | Status: APPROVED
Parent: #231 | Revision: v4
Context
The OCR service uses the deprecated vertexai.generative_models SDK in maintenance_receipt_extractor.py. This file follows the same SDK pattern as gemini_engine.py but processes text-only receipts (no image parts, no Google Search grounding).
Codebase Analysis
ocr/app/extractors/maintenance_receipt_extractor.py: aiplatform + GenerationConfig + GenerativeModel (L187-191)
API Migration Map
| vertexai.generative_models | google.genai |
| --- | --- |
| from google.cloud import aiplatform | from google import genai |
| from vertexai.generative_models import GenerativeModel, GenerationConfig, Part | from google.genai import types |
| aiplatform.init(project=..., location=...) | genai.Client(vertexai=True, project=..., location=...) |
| GenerativeModel(model_name) | model= kwarg |
| model.generate_content([...], generation_config=config) | client.models.generate_content(model=name, contents=[...], config=config) |
| GenerationConfig(response_mime_type=..., response_schema=...) | types.GenerateContentConfig(response_mime_type=..., response_schema=...) |
| "string", "object", etc. | "STRING", "OBJECT", etc. (uppercase per Vertex AI Schema spec) |
Internal State Changes
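The schema row of the migration map is a mechanical rewrite, so a small helper can illustrate it. The sketch below is hypothetical: the function name uppercase_schema_types and the example fields are assumptions for illustration, not code from the repository, and the migration may just as well edit the _RECEIPT_RESPONSE_SCHEMA literal by hand.

```python
from typing import Any


def uppercase_schema_types(schema: Any) -> Any:
    """Return a copy of a schema-like structure with every "type" string uppercased.

    Hypothetical helper: only string values stored under a "type" key are changed,
    so a property that happens to be named "type" (whose value is a dict) is
    recursed into rather than uppercased.
    """
    if isinstance(schema, dict):
        return {
            key: (
                value.upper()
                if key == "type" and isinstance(value, str)
                else uppercase_schema_types(value)
            )
            for key, value in schema.items()
        }
    if isinstance(schema, list):
        return [uppercase_schema_types(item) for item in schema]
    return schema


# Made-up receipt schema in the old lowercase style (field names are assumptions).
example_schema = {
    "type": "object",
    "properties": {
        "vendor": {"type": "string"},
        "line_items": {
            "type": "array",
            "items": {
                "type": "object",
                "properties": {"cost": {"type": "number"}},
            },
        },
    },
}

converted = uppercase_schema_types(example_schema)
```

The helper builds a new structure instead of mutating its input, so the original lowercase schema stays available for comparison.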
MaintenanceReceiptExtractor changes from holding self._model and self._generation_config to holding self._client and self._model_name.
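As a rough illustration of this state change, the sketch below keeps only a lazily created client and a model-name string, and passes the model name on each call. It is not the extractor's actual code: the class name ExtractorStateSketch, the client_factory parameter, and the fake client are assumptions so the example runs without credentials. In the real extractor the factory role would presumably be filled by genai.Client(vertexai=True, project=..., location=...).

```python
from types import SimpleNamespace
from typing import Any, Callable, Optional


class ExtractorStateSketch:
    """Hypothetical stand-in showing the post-migration instance state."""

    def __init__(self, model_name: str, client_factory: Callable[[], Any]) -> None:
        # Pre-migration state was self._model and self._generation_config.
        # Post-migration state is a cached client plus a plain model-name string.
        self._client: Optional[Any] = None
        self._model_name = model_name
        self._client_factory = client_factory  # assumption: injected for testability

    def _get_client(self) -> Any:
        # Lazy initialization: build the client on first use, then reuse it.
        if self._client is None:
            self._client = self._client_factory()
        return self._client

    def _extract_with_gemini(self, text: str, config: Any) -> Any:
        # The model is now named per call instead of being bound to a model object.
        return self._get_client().models.generate_content(
            model=self._model_name,
            contents=[text],
            config=config,
        )


# Fake client that records calls, so the sketch needs no network or credentials.
calls = []


def _fake_generate_content(**kwargs: Any) -> Any:
    calls.append(kwargs)
    return SimpleNamespace(text="{}")


fake_client = SimpleNamespace(
    models=SimpleNamespace(generate_content=_fake_generate_content)
)
extractor = ExtractorStateSketch("gemini-2.5-flash", lambda: fake_client)
response = extractor._extract_with_gemini("receipt text", config=None)
```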
Note: MaintenanceExtractionResult.model (the model name string field, e.g., "gemini-2.5-flash") is unaffected by this migration; it is populated from settings.gemini_model and has no relation to the self._model instance attribute.
Authentication
Same as GeminiEngine: GOOGLE_APPLICATION_CREDENTIALS env var pointing to WIF credential config. CRITICAL: os.environ["GOOGLE_APPLICATION_CREDENTIALS"] and os.environ["GOOGLE_EXTERNAL_ACCOUNT_ALLOW_EXECUTABLES"] MUST be set BEFORE genai.Client() construction.
Implementation
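The ordering requirement above, together with the planned error handling, could be sketched roughly as follows. This is an assumption-laden illustration, not the GeminiEngine or extractor code: build_client and its parameters are hypothetical, GeminiUnavailableError is redefined locally as a stand-in for the project's own class, and checking that the credential file exists is only one possible reading of "missing credentials".

```python
import os
from typing import Any


class GeminiUnavailableError(Exception):
    """Local stand-in; the project defines its own GeminiUnavailableError."""


def build_client(credentials_path: str, project: str, location: str) -> Any:
    """Hypothetical helper: set the auth env vars first, then build the client."""
    # Assumption: "missing credentials" means the config file is absent.
    if not credentials_path or not os.path.isfile(credentials_path):
        raise GeminiUnavailableError(
            f"credential config not found: {credentials_path!r}"
        )

    # Both variables are set BEFORE the client is constructed.
    os.environ["GOOGLE_APPLICATION_CREDENTIALS"] = credentials_path
    os.environ["GOOGLE_EXTERNAL_ACCOUNT_ALLOW_EXECUTABLES"] = "1"

    try:
        # Imported lazily so a missing SDK surfaces as a service-level error.
        from google import genai
    except ImportError as exc:
        raise GeminiUnavailableError("google-genai is not installed") from exc

    try:
        return genai.Client(vertexai=True, project=project, location=location)
    except Exception as exc:
        raise GeminiUnavailableError(
            f"failed to initialize Gemini client: {exc}"
        ) from exc
```

Raising one error type for all three failure modes lets callers treat "Gemini is not usable right now" uniformly, which appears to be the point of replacing the bare RuntimeError.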
ocr/app/extractors/maintenance_receipt_extractor.py
- Same _get_model() -> _get_client() pattern as M2 (ADC env vars set first)
- Remove self._model and self._generation_config from __init__; replace with self._client and self._model_name
- Convert _RECEIPT_RESPONSE_SCHEMA type values to uppercase (same as M2)
- Update _extract_with_gemini(): call self._client.models.generate_content(model=self._model_name, contents=[...], config=types.GenerateContentConfig(...))
- Update _get_client() to raise GeminiUnavailableError for missing credentials (currently raises bare RuntimeError); add try/except ImportError and try/except Exception blocks matching GeminiEngine._get_client() pattern
- Update _get_model() docstring (L173): replace "Lazy-initialize Vertex AI Gemini model" with "Lazy-initialize google-genai Gemini client"
- No Part usage (text input only)
Review Findings
QR plan-code:
- MaintenanceReceiptExtractor raises bare RuntimeError instead of GeminiUnavailableError -- fix during migration
QR plan-docs:
- _get_model() docstring (L173) -- included in implementation
TW plan-scrub:
- MaintenanceExtractionResult.model field disambiguated from self._model
Verdict: APPROVED | Next: Execute (depends on M1 #232)
Milestone: M3 Complete -- Migrate MaintenanceReceiptExtractor
Phase: Execution | Agent: Developer | Status: PASS
Changes
ocr/app/extractors/maintenance_receipt_extractor.py: Full SDK migration
- Same _get_model() -> _get_client() pattern as GeminiEngine
- self._model + self._generation_config -> self._client + self._model_name
- _extract_with_gemini() uses client.models.generate_content(model=..., ...)
- Changed RuntimeError to GeminiUnavailableError for missing credentials
- Added try/except ImportError and try/except Exception blocks matching GeminiEngine pattern
- Updated _get_model() docstring
Acceptance Criteria
- No imports from vertexai or google.cloud.aiplatform
- Uses genai.Client(vertexai=True, ...) for initialization
Verdict: PASS | Next: M4 -- Update test mocks (#235)