15 Commits

Author SHA1 Message Date
7e2bb9ef36 Merge pull request 'feat: Migrate Gemini SDK to google-genai (#231)' (#236) from issue-231-migrate-gemini-sdk-google-genai into main
All checks were successful
Deploy to Staging / Build Images (push) Successful in 37s
Deploy to Staging / Deploy to Staging (push) Successful in 51s
Deploy to Staging / Verify Staging (push) Successful in 9s
Deploy to Staging / Notify Staging Ready (push) Successful in 8s
Deploy to Staging / Notify Staging Failure (push) Has been skipped
Reviewed-on: #236
2026-03-01 04:08:09 +00:00
Eric Gullickson
56df5d48f3 fix: revert unsupported AFC config and add diagnostic logging for VIN decode (refs #231)
All checks were successful
Deploy to Staging / Build Images (pull_request) Successful in 12m33s
Deploy to Staging / Deploy to Staging (pull_request) Successful in 52s
Deploy to Staging / Verify Staging (pull_request) Successful in 9s
Deploy to Staging / Notify Staging Ready (pull_request) Successful in 8s
Deploy to Staging / Notify Staging Failure (pull_request) Has been skipped
- Remove AutomaticFunctionCallingConfig(max_remote_calls=3) which caused
  pydantic validation error on the installed google-genai version
- Log full Gemini raw JSON response in OCR engine for debugging
- Add engine/transmission to backend raw values log
- Add hasTrim/hasEngine/hasTransmission to decode success log

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-28 21:16:56 -06:00
Eric Gullickson
1add6c8240 fix: remove unsupported AutomaticFunctionCallingConfig parameter (refs #231)
All checks were successful
Deploy to Staging / Build Images (pull_request) Successful in 39s
Deploy to Staging / Deploy to Staging (pull_request) Successful in 53s
Deploy to Staging / Verify Staging (pull_request) Successful in 9s
Deploy to Staging / Notify Staging Ready (pull_request) Successful in 8s
Deploy to Staging / Notify Staging Failure (pull_request) Has been skipped
The installed google-genai version does not support max_remote_calls on
AutomaticFunctionCallingConfig, causing a pydantic validation error that
broke VIN decode on staging.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-28 12:59:04 -06:00
Eric Gullickson
936753fac2 fix: VIN Decoding timeouts and logic errors
All checks were successful
Deploy to Staging / Build Images (pull_request) Successful in 3m33s
Deploy to Staging / Deploy to Staging (pull_request) Successful in 52s
Deploy to Staging / Verify Staging (pull_request) Successful in 9s
Deploy to Staging / Notify Staging Ready (pull_request) Successful in 8s
Deploy to Staging / Notify Staging Failure (pull_request) Has been skipped
2026-02-28 12:02:26 -06:00
Eric Gullickson
96e1dde7b2 docs: update CLAUDE.md references from Vertex AI to google-genai (refs #231)
All checks were successful
Deploy to Staging / Build Images (pull_request) Successful in 8m4s
Deploy to Staging / Deploy to Staging (pull_request) Successful in 24s
Deploy to Staging / Verify Staging (pull_request) Successful in 9s
Deploy to Staging / Notify Staging Ready (pull_request) Successful in 9s
Deploy to Staging / Notify Staging Failure (pull_request) Has been skipped
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-28 11:21:58 -06:00
Eric Gullickson
1464a0e1af feat: update test mocks for google-genai SDK (refs #235)
Replace engine._model/engine._generation_config mocks with
engine._client/engine._model_name. Update sys.modules patches
from vertexai to google.genai. Remove dead if-False branch.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-28 11:21:10 -06:00
Eric Gullickson
9f51e62b94 feat: migrate MaintenanceReceiptExtractor to google-genai SDK (refs #234)
Replace vertexai.generative_models with google.genai client pattern.
Fix pre-existing bug: raise GeminiUnavailableError instead of bare
RuntimeError for missing credentials. Add proper try/except blocks
matching GeminiEngine error handling pattern.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-28 11:17:14 -06:00
Eric Gullickson
b7f472b3e8 feat: migrate GeminiEngine to google-genai SDK with Google Search grounding (refs #233)
Replace vertexai.generative_models with google.genai client pattern.
Add Google Search grounding tool to VIN decode for improved accuracy.
Convert response schema types to uppercase per Vertex AI Schema spec.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-28 11:16:18 -06:00
Eric Gullickson
398d67304f feat: replace google-cloud-aiplatform with google-genai dependency (refs #232)
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-28 11:13:54 -06:00
Eric Gullickson
0055d9f0f3 fix: VIN decoding year fixes
All checks were successful
Deploy to Staging / Build Images (push) Successful in 35s
Deploy to Staging / Deploy to Staging (push) Successful in 53s
Deploy to Staging / Verify Staging (push) Successful in 9s
Deploy to Staging / Notify Staging Ready (push) Successful in 9s
Deploy to Staging / Notify Staging Failure (push) Has been skipped
Mirror Base Images / Mirror Base Images (push) Successful in 1m2s
2026-02-28 11:09:46 -06:00
Eric Gullickson
9dc56a3773 fix: distribute plan storage to sub-issues for context efficiency
Some checks failed
Deploy to Staging / Build Images (push) Successful in 38s
Deploy to Staging / Verify Staging (push) Has been cancelled
Deploy to Staging / Notify Staging Ready (push) Has been cancelled
Deploy to Staging / Notify Staging Failure (push) Has been cancelled
Deploy to Staging / Deploy to Staging (push) Has been cancelled
Split monolithic parent-issue plan into per-sub-issue comments.
Updated workflow contract to enforce self-contained sub-issue plans.
2026-02-28 11:08:49 -06:00
Eric Gullickson
283ba6b108 fix: Remove VIN Cache
All checks were successful
Deploy to Staging / Build Images (push) Successful in 3m36s
Deploy to Staging / Deploy to Staging (push) Successful in 52s
Deploy to Staging / Verify Staging (push) Successful in 8s
Deploy to Staging / Notify Staging Ready (push) Successful in 8s
Deploy to Staging / Notify Staging Failure (push) Has been skipped
Mirror Base Images / Mirror Base Images (push) Successful in 42s
2026-02-20 08:26:39 -06:00
Eric Gullickson
7d90f4b25a fix: add VIN year code table to Gemini decode prompt (refs #229)
All checks were successful
Deploy to Staging / Build Images (push) Successful in 37s
Deploy to Staging / Deploy to Staging (push) Successful in 51s
Deploy to Staging / Verify Staging (push) Successful in 8s
Deploy to Staging / Notify Staging Ready (push) Successful in 7s
Deploy to Staging / Notify Staging Failure (push) Has been skipped
gemini-3-flash-preview was hallucinating the model year (e.g., returning
1993 instead of 2023 for position-10 code P). The prompt now includes the
full 1980-2039 year code table and a position-7 disambiguation rule.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-19 21:55:21 -06:00
e2e6471c5e Merge pull request 'fix: increase VIN decode timeout for Gemini cold start' (#230) from issue-223-replace-nhtsa-vin-decode-gemini into main
All checks were successful
Deploy to Staging / Build Images (push) Successful in 37s
Deploy to Staging / Deploy to Staging (push) Successful in 52s
Deploy to Staging / Verify Staging (push) Successful in 10s
Deploy to Staging / Notify Staging Ready (push) Successful in 8s
Deploy to Staging / Notify Staging Failure (push) Has been skipped
Reviewed-on: #230
2026-02-20 03:37:49 +00:00
Eric Gullickson
3b5b84729f fix: increase VIN decode timeout to 60s for Gemini cold start (refs #229)
All checks were successful
Deploy to Staging / Build Images (pull_request) Successful in 3m31s
Deploy to Staging / Deploy to Staging (pull_request) Successful in 51s
Deploy to Staging / Verify Staging (pull_request) Successful in 8s
Deploy to Staging / Notify Staging Ready (pull_request) Successful in 8s
Deploy to Staging / Notify Staging Failure (pull_request) Has been skipped
The default 10s API client timeout caused frontend "Failed to decode" errors
when the Gemini engine cold-starts (34s+ on the first call).

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-19 21:30:31 -06:00
14 changed files with 407 additions and 259 deletions

View File

@@ -52,7 +52,8 @@
"ONE PR for the parent issue. The PR closes the parent and all sub-issues.",
"Commits reference the specific sub-issue index they implement.",
"Sub-issues should be small enough to fit in a single AI context window.",
"Plan milestones map 1:1 to sub-issues."
"Plan milestones map 1:1 to sub-issues.",
"Each sub-issue receives its own plan comment with duplicated shared context. An agent must be able to execute from the sub-issue alone."
],
"examples": {
"parent": "#105 'feat: Add Grafana dashboards and alerting'",
@@ -103,8 +104,9 @@
"[SKILL] Problem Analysis if complex problem.",
"[SKILL] Decision Critic if uncertain approach.",
"If multi-file feature (3+ files): decompose into sub-issues per sub_issues rules. Each sub-issue = one plan milestone.",
"[SKILL] Planner writes plan as parent issue comment. Plan milestones map 1:1 to sub-issues.",
"[SKILL] Plan review cycle: QR plan-completeness -> TW plan-scrub -> QR plan-code -> QR plan-docs.",
"[SKILL] Planner writes plan summary as parent issue comment: shared context + milestone index linking each milestone to its sub-issue. M5 (doc-sync) stays on parent if no sub-issue exists.",
"[SKILL] Planner posts each milestone's self-contained implementation plan as a comment on the corresponding sub-issue. Each sub-issue plan duplicates relevant shared context (API maps, state changes, auth, error handling, risk) so an agent can execute from the sub-issue alone without reading the parent.",
"[SKILL] Plan review cycle: QR plan-completeness -> TW plan-scrub -> QR plan-code -> QR plan-docs. Distribute milestone-specific review findings to sub-issue plan comments.",
"Create ONE branch issue-{parent_index}-{slug} from main.",
"[SKILL] Planner executes plan, delegates to Developer per milestone/sub-issue.",
"[SKILL] QR post-implementation per milestone (results in parent issue comment).",
@@ -123,7 +125,7 @@
"execution_review": ["QR post-implementation per milestone"],
"final_review": ["Quality Agent RULE 0/1/2"]
},
"plan_storage": "gitea_issue_comments",
"plan_storage": "gitea_issue_comments: summary on parent issue, milestone detail on sub-issues",
"tracking_storage": "gitea_issue_comments",
"issue_comment_operations": {
"create_comment": "mcp__gitea-mcp__create_issue_comment",

View File

@@ -406,20 +406,9 @@ export class VehiclesController {
logger.info('VIN decode requested', { userId, vin: sanitizedVin.substring(0, 6) + '...' });
// Check cache first
const cached = await this.vehiclesService.getVinCached(sanitizedVin);
if (cached) {
logger.info('VIN decode cache hit', { userId });
const decodedData = await this.vehiclesService.mapVinDecodeResponse(cached);
return reply.code(200).send(decodedData);
}
// Call OCR service for VIN decode
const response = await ocrClient.decodeVin(sanitizedVin);
// Cache the response
await this.vehiclesService.saveVinCache(sanitizedVin, response);
// Map response to decoded vehicle data with dropdown matching
const decodedData = await this.vehiclesService.mapVinDecodeResponse(response);
@@ -427,7 +416,10 @@ export class VehiclesController {
userId,
hasYear: !!decodedData.year.value,
hasMake: !!decodedData.make.value,
hasModel: !!decodedData.model.value
hasModel: !!decodedData.model.value,
hasTrim: !!decodedData.trimLevel.value,
hasEngine: !!decodedData.engine.value,
hasTransmission: !!decodedData.transmission.value,
});
return reply.code(200).send(decodedData);

View File

@@ -1,6 +1,6 @@
/**
* @ai-summary Business logic for vehicles feature
* @ai-context Handles VIN decoding, caching, and business rules
* @ai-context Handles VIN decoding and business rules
*/
import { Pool } from 'pg';
@@ -594,72 +594,6 @@ export class VehiclesService {
await cacheService.del(cacheKey);
}
/**
* Check vin_cache for existing VIN data.
* Format-aware: validates raw_data has `success` field (Gemini format).
* Old NHTSA-format entries are treated as cache misses and expire via TTL.
*/
async getVinCached(vin: string): Promise<VinDecodeResponse | null> {
try {
const result = await this.pool.query<{
raw_data: any;
cached_at: Date;
}>(
`SELECT raw_data, cached_at
FROM vin_cache
WHERE vin = $1
AND cached_at > NOW() - INTERVAL '365 days'`,
[vin]
);
if (result.rows.length === 0) {
return null;
}
const rawData = result.rows[0].raw_data;
// Format-aware check: Gemini responses have `success` field,
// old NHTSA responses do not. Treat old format as cache miss.
if (!rawData || typeof rawData !== 'object' || !('success' in rawData)) {
logger.debug('VIN cache format mismatch (legacy NHTSA entry), treating as miss', { vin });
return null;
}
logger.debug('VIN cache hit', { vin });
return rawData as VinDecodeResponse;
} catch (error) {
logger.error('Failed to check VIN cache', { vin, error });
return null;
}
}
/**
* Save VIN decode response to cache with ON CONFLICT upsert.
*/
async saveVinCache(vin: string, response: VinDecodeResponse): Promise<void> {
try {
await this.pool.query(
`INSERT INTO vin_cache (vin, make, model, year, engine_type, body_type, raw_data, cached_at)
VALUES ($1, $2, $3, $4, $5, $6, $7, NOW())
ON CONFLICT (vin) DO UPDATE SET
make = EXCLUDED.make,
model = EXCLUDED.model,
year = EXCLUDED.year,
engine_type = EXCLUDED.engine_type,
body_type = EXCLUDED.body_type,
raw_data = EXCLUDED.raw_data,
cached_at = NOW()
WHERE (vin_cache.raw_data->>'confidence')::float <= $8`,
[vin, response.make, response.model, response.year, response.engine, response.bodyType, JSON.stringify(response), response.confidence ?? 1]
);
logger.debug('VIN cached', { vin, confidence: response.confidence });
} catch (error) {
logger.error('Failed to cache VIN data', { vin, error });
// Don't throw - caching failure shouldn't break the decode flow
}
}
async getDropdownMakes(year: number): Promise<string[]> {
const vehicleDataService = getVehicleDataService();
const pool = getPool();
@@ -745,7 +679,8 @@ export class VehiclesService {
logger.debug('VIN decode raw values', {
vin: response.vin,
year: sourceYear, make: sourceMake, model: sourceModel,
trim: sourceTrim, confidence: response.confidence
trim: sourceTrim, engine: sourceEngine, transmission: sourceTransmission,
confidence: response.confidence
});
// Year is always high confidence if present (exact numeric match)

View File

@@ -242,18 +242,6 @@ export interface DecodedVehicleData {
transmission: MatchedField<string>;
}
/** Cached VIN data from vin_cache table */
export interface VinCacheEntry {
vin: string;
make: string | null;
model: string | null;
year: number | null;
engineType: string | null;
bodyType: string | null;
rawData: import('../../ocr/domain/ocr.types').VinDecodeResponse;
cachedAt: Date;
}
/** VIN decode request body */
export interface DecodeVinRequest {
vin: string;

View File

@@ -36,15 +36,13 @@ describe('Vehicles Integration Tests', () => {
afterAll(async () => {
// Clean up test database
await pool.query('DROP TABLE IF EXISTS vehicles CASCADE');
await pool.query('DROP TABLE IF EXISTS vin_cache CASCADE');
await pool.query('DROP FUNCTION IF EXISTS update_updated_at_column() CASCADE');
await pool.end();
});
beforeEach(async () => {
// Clean up test data before each test - more thorough cleanup
// Clean up test data before each test
await pool.query('DELETE FROM vehicles WHERE user_id = $1', ['test-user-123']);
await pool.query('DELETE FROM vin_cache WHERE vin LIKE $1', ['1HGBH41JXMN%']);
// Clear Redis cache for the test user
try {

View File

@@ -86,7 +86,9 @@ export const vehiclesApi = {
* Requires Pro or Enterprise tier
*/
decodeVin: async (vin: string): Promise<DecodedVehicleData> => {
const response = await apiClient.post('/vehicles/decode-vin', { vin });
const response = await apiClient.post('/vehicles/decode-vin', { vin }, {
timeout: 120000 // 120 seconds for Gemini + Google Search grounding
});
return response.data;
}
};

View File

@@ -7,7 +7,7 @@ Python OCR microservice (FastAPI). Primary engine: PaddleOCR PP-OCRv4 with optio
| File | What | When to read |
| ---- | ---- | ------------ |
| `main.py` | FastAPI application entry point | Route registration, app setup |
| `config.py` | Configuration settings (OCR engines, Vertex AI, Redis, Vision API limits) | Environment variables, settings |
| `config.py` | Configuration settings (OCR engines, Google GenAI, Redis, Vision API limits) | Environment variables, settings |
| `__init__.py` | Package init | Package structure |
## Subdirectories

View File

@@ -29,7 +29,7 @@ class Settings:
os.getenv("VISION_MONTHLY_LIMIT", "1000")
)
# Vertex AI / Gemini configuration
# Google GenAI / Gemini configuration
self.vertex_ai_project: str = os.getenv("VERTEX_AI_PROJECT", "")
self.vertex_ai_location: str = os.getenv(
"VERTEX_AI_LOCATION", "global"

View File

@@ -3,7 +3,7 @@
OCR engine abstraction layer. Two categories of engines:
1. **OcrEngine subclasses** (image-to-text): PaddleOCR, Google Vision, Hybrid. Accept image bytes, return text + confidence + word boxes.
2. **GeminiEngine** (PDF-to-structured-data and VIN decode): Standalone module for maintenance schedule extraction and VIN decoding via Vertex AI. Accepts PDF bytes or VIN strings, returns structured JSON. Not an OcrEngine subclass because the interface signatures differ.
2. **GeminiEngine** (PDF-to-structured-data and VIN decode): Standalone module for maintenance schedule extraction and VIN decoding via google-genai SDK. Accepts PDF bytes or VIN strings, returns structured JSON. Not an OcrEngine subclass because the interface signatures differ.
## Files
@@ -15,7 +15,7 @@ OCR engine abstraction layer. Two categories of engines:
| `cloud_engine.py` | Google Vision TEXT_DETECTION fallback engine (WIF authentication) | Cloud OCR configuration, API quota |
| `hybrid_engine.py` | Combines primary + fallback engine with confidence threshold switching | Engine selection logic, fallback behavior |
| `engine_factory.py` | Factory function and engine registry for instantiation | Adding new engine types |
| `gemini_engine.py` | Gemini 2.5 Flash integration for maintenance schedule extraction and VIN decoding (Vertex AI SDK, 20MB PDF limit, structured JSON output) | Manual extraction debugging, VIN decode, Gemini configuration |
| `gemini_engine.py` | Gemini 2.5 Flash integration for maintenance schedule extraction and VIN decoding (google-genai SDK, 20MB PDF limit, structured JSON output, Google Search grounding for VIN decode) | Manual extraction debugging, VIN decode, Gemini configuration |
## Engine Selection

View File

@@ -2,13 +2,14 @@
Standalone module (does NOT extend OcrEngine) because Gemini performs
semantic document understanding, not traditional OCR word-box extraction.
Uses Vertex AI SDK with structured JSON output enforcement.
Uses google-genai SDK with structured JSON output enforcement.
"""
import json
import logging
import os
from dataclasses import dataclass
from datetime import datetime
from typing import Any
from app.config import settings
@@ -37,43 +38,114 @@ Do not include one-time procedures, troubleshooting steps, or warranty informati
Return the results as a JSON object with a single "maintenanceSchedule" array.\
"""
# VIN year code lookup: position 10 character -> base year (first cycle, 1980-2009).
# The 30-year cycle repeats: +30 for 2010-2039, +60 for 2040-2069.
# Disambiguation uses position 7: alphabetic -> 2010+ cycle, numeric -> 1980s cycle.
# Per NHTSA FMVSS No. 115: MY2010+ vehicles must use alphabetic position 7.
# For the 2040+ cycle (when position 7 is numeric again), we pick the most
# recent plausible year (not more than 2 years in the future).
_VIN_YEAR_CODES: dict[str, int] = {
"A": 1980, "B": 1981, "C": 1982, "D": 1983, "E": 1984,
"F": 1985, "G": 1986, "H": 1987, "J": 1988, "K": 1989,
"L": 1990, "M": 1991, "N": 1992, "P": 1993, "R": 1994,
"S": 1995, "T": 1996, "V": 1997, "W": 1998, "X": 1999,
"Y": 2000,
"1": 2001, "2": 2002, "3": 2003, "4": 2004, "5": 2005,
"6": 2006, "7": 2007, "8": 2008, "9": 2009,
}
def resolve_vin_year(vin: str) -> int | None:
"""Deterministically resolve model year from VIN positions 7 and 10.
VIN year codes repeat on a 30-year cycle. Position 7 disambiguates:
- Alphabetic position 7 -> 2010-2039 cycle (NHTSA MY2010+ requirement)
- Numeric position 7 -> 1980-2009 or 2040-2069 cycle
For the numeric case with two possible cycles, picks the most recent
year that is not more than 2 years in the future.
Returns None if the VIN is too short or position 10 is not a valid year code.
"""
if len(vin) < 17:
return None
code = vin[9].upper() # position 10 (0-indexed)
pos7 = vin[6].upper() # position 7 (0-indexed)
base_year = _VIN_YEAR_CODES.get(code)
if base_year is None:
return None
if pos7.isalpha():
# Alphabetic position 7 -> second cycle (2010-2039)
return base_year + 30
# Numeric position 7 -> first cycle (1980-2009) or third cycle (2040-2069)
# Pick the most recent plausible year
max_plausible = datetime.now().year + 2
third_cycle = base_year + 60 # 2040-2069
if third_cycle <= max_plausible:
return third_cycle
return base_year
_VIN_DECODE_PROMPT = """\
Given the VIN (Vehicle Identification Number) below, decode it and return the vehicle specifications.
Decode the following VIN (Vehicle Identification Number) using standard VIN structure rules.
VIN: {vin}
Model year: {year} (determined from position 10 code '{year_code}')
Return the vehicle's year, make, model, trim level, body type, drive type, fuel type, engine description, and transmission type. If a field cannot be determined from the VIN, return null for that field. Return a confidence score (0.0-1.0) indicating overall decode reliability.\
The model year has already been resolved deterministically. Use {year} as the year.
VIN position reference:
- Positions 1-3 (WMI): World Manufacturer Identifier (country + manufacturer)
- Positions 4-8 (VDS): Vehicle attributes (model, body, engine, etc.)
- Position 9: Check digit
- Position 10: Model year code (30-year cycle, extended through 2050):
A=1980/2010/2040 B=1981/2011/2041 C=1982/2012/2042 D=1983/2013/2043 E=1984/2014/2044
F=1985/2015/2045 G=1986/2016/2046 H=1987/2017/2047 J=1988/2018/2048 K=1989/2019/2049
L=1990/2020/2050 M=1991/2021 N=1992/2022 P=1993/2023 R=1994/2024
S=1995/2025 T=1996/2026 V=1997/2027 W=1998/2028 X=1999/2029
Y=2000/2030 1=2001/2031 2=2002/2032 3=2003/2033 4=2004/2034
5=2005/2035 6=2006/2036 7=2007/2037 8=2008/2038 9=2009/2039
- Position 11: Assembly plant
- Positions 12-17: Sequential production number
Return the vehicle's make, model, trim level, body type, drive type, fuel type, engine description, and transmission type. If a field cannot be determined from the VIN, return null for that field. Return a confidence score (0.0-1.0) indicating overall decode reliability.\
"""
_VIN_DECODE_SCHEMA: dict[str, Any] = {
"type": "object",
"type": "OBJECT",
"properties": {
"year": {"type": "integer", "nullable": True},
"make": {"type": "string", "nullable": True},
"model": {"type": "string", "nullable": True},
"trimLevel": {"type": "string", "nullable": True},
"bodyType": {"type": "string", "nullable": True},
"driveType": {"type": "string", "nullable": True},
"fuelType": {"type": "string", "nullable": True},
"engine": {"type": "string", "nullable": True},
"transmission": {"type": "string", "nullable": True},
"confidence": {"type": "number"},
"year": {"type": "INTEGER", "nullable": True},
"make": {"type": "STRING", "nullable": True},
"model": {"type": "STRING", "nullable": True},
"trimLevel": {"type": "STRING", "nullable": True},
"bodyType": {"type": "STRING", "nullable": True},
"driveType": {"type": "STRING", "nullable": True},
"fuelType": {"type": "STRING", "nullable": True},
"engine": {"type": "STRING", "nullable": True},
"transmission": {"type": "STRING", "nullable": True},
"confidence": {"type": "NUMBER"},
},
"required": ["confidence"],
}
_RESPONSE_SCHEMA: dict[str, Any] = {
"type": "object",
"type": "OBJECT",
"properties": {
"maintenanceSchedule": {
"type": "array",
"type": "ARRAY",
"items": {
"type": "object",
"type": "OBJECT",
"properties": {
"serviceName": {"type": "string"},
"intervalMiles": {"type": "number", "nullable": True},
"intervalMonths": {"type": "number", "nullable": True},
"details": {"type": "string", "nullable": True},
"serviceName": {"type": "STRING"},
"intervalMiles": {"type": "NUMBER", "nullable": True},
"intervalMonths": {"type": "NUMBER", "nullable": True},
"details": {"type": "STRING", "nullable": True},
},
"required": ["serviceName"],
},
@@ -135,20 +207,21 @@ class GeminiEngine:
Standalone class (not an OcrEngine subclass) because Gemini performs
semantic document understanding rather than traditional OCR.
Uses lazy initialization: the Vertex AI client is not created until
Uses lazy initialization: the Gemini client is not created until
the first call to ``extract_maintenance()`` or ``decode_vin()``.
"""
def __init__(self) -> None:
self._model: Any | None = None
self._client: Any | None = None
self._model_name: str = ""
def _get_model(self) -> Any:
"""Create the GenerativeModel on first use.
def _get_client(self) -> Any:
"""Create the genai.Client on first use.
Authentication uses the same WIF credential path as Google Vision.
"""
if self._model is not None:
return self._model
if self._client is not None:
return self._client
key_path = settings.google_vision_key_path
if not os.path.isfile(key_path):
@@ -158,46 +231,37 @@ class GeminiEngine:
)
try:
from google.cloud import aiplatform # type: ignore[import-untyped]
from vertexai.generative_models import ( # type: ignore[import-untyped]
GenerationConfig,
GenerativeModel,
)
from google import genai # type: ignore[import-untyped]
# Point ADC at the WIF credential config
# Point ADC at the WIF credential config (must be set BEFORE Client construction)
os.environ["GOOGLE_APPLICATION_CREDENTIALS"] = key_path
os.environ["GOOGLE_EXTERNAL_ACCOUNT_ALLOW_EXECUTABLES"] = "1"
aiplatform.init(
self._client = genai.Client(
vertexai=True,
project=settings.vertex_ai_project,
location=settings.vertex_ai_location,
)
model_name = settings.gemini_model
self._model = GenerativeModel(model_name)
self._generation_config = GenerationConfig(
response_mime_type="application/json",
response_schema=_RESPONSE_SCHEMA,
)
self._model_name = settings.gemini_model
logger.info(
"Gemini engine initialized (model=%s, project=%s, location=%s)",
model_name,
self._model_name,
settings.vertex_ai_project,
settings.vertex_ai_location,
)
return self._model
return self._client
except ImportError as exc:
logger.exception("Vertex AI SDK import failed")
logger.exception("google-genai SDK import failed")
raise GeminiUnavailableError(
"google-cloud-aiplatform is not installed. "
"Install with: pip install google-cloud-aiplatform"
"google-genai is not installed. "
"Install with: pip install google-genai"
) from exc
except Exception as exc:
logger.exception("Vertex AI authentication failed")
logger.exception("Gemini authentication failed: %s", type(exc).__name__)
raise GeminiUnavailableError(
f"Vertex AI authentication failed: {exc}"
f"Gemini authentication failed: {exc}"
) from exc
def extract_maintenance(
@@ -222,19 +286,23 @@ class GeminiEngine:
"inline processing. Upload to GCS and use a gs:// URI instead."
)
model = self._get_model()
client = self._get_client()
try:
from vertexai.generative_models import Part # type: ignore[import-untyped]
from google.genai import types # type: ignore[import-untyped]
pdf_part = Part.from_data(
pdf_part = types.Part.from_bytes(
data=pdf_bytes,
mime_type="application/pdf",
)
response = model.generate_content(
[pdf_part, _EXTRACTION_PROMPT],
generation_config=self._generation_config,
response = client.models.generate_content(
model=self._model_name,
contents=[pdf_part, _EXTRACTION_PROMPT],
config=types.GenerateContentConfig(
response_mime_type="application/json",
response_schema=_RESPONSE_SCHEMA,
),
)
raw = json.loads(response.text)
@@ -273,6 +341,10 @@ class GeminiEngine:
def decode_vin(self, vin: str) -> VinDecodeResult:
"""Decode a VIN string into structured vehicle data via Gemini.
The model year is resolved deterministically from VIN positions 7
and 10 -- never delegated to the LLM. Gemini handles make, model,
trim, and other fields that require manufacturer knowledge.
Args:
vin: A 17-character Vehicle Identification Number.
@@ -283,28 +355,58 @@ class GeminiEngine:
GeminiProcessingError: If Gemini fails to decode the VIN.
GeminiUnavailableError: If the engine cannot be initialized.
"""
model = self._get_model()
client = self._get_client()
try:
from vertexai.generative_models import GenerationConfig # type: ignore[import-untyped]
vin_config = GenerationConfig(
response_mime_type="application/json",
response_schema=_VIN_DECODE_SCHEMA,
# Resolve year deterministically from VIN structure
resolved_year = resolve_vin_year(vin)
year_code = vin[9].upper() if len(vin) >= 10 else "?"
logger.info(
"VIN year resolved: code=%s pos7=%s -> year=%s",
year_code,
vin[6] if len(vin) >= 7 else "?",
resolved_year,
)
prompt = _VIN_DECODE_PROMPT.format(vin=vin)
response = model.generate_content(
[prompt],
generation_config=vin_config,
try:
from google.genai import types # type: ignore[import-untyped]
prompt = _VIN_DECODE_PROMPT.format(
vin=vin,
year=resolved_year or "unknown",
year_code=year_code,
)
response = client.models.generate_content(
model=self._model_name,
contents=[prompt],
config=types.GenerateContentConfig(
response_mime_type="application/json",
response_schema=_VIN_DECODE_SCHEMA,
tools=[types.Tool(google_search=types.GoogleSearch())],
),
)
raw = json.loads(response.text)
logger.info("Gemini decoded VIN %s (confidence=%.2f)", vin, raw.get("confidence", 0))
# Override year with deterministic value -- never trust the LLM
# for a mechanical lookup
gemini_year = raw.get("year")
if resolved_year and gemini_year != resolved_year:
logger.warning(
"Gemini returned year %s but resolved year is %s for VIN %s -- overriding",
gemini_year,
resolved_year,
vin,
)
logger.info(
"Gemini decoded VIN %s (confidence=%.2f) raw=%s",
vin,
raw.get("confidence", 0),
json.dumps(raw, default=str),
)
return VinDecodeResult(
year=raw.get("year"),
year=resolved_year if resolved_year else raw.get("year"),
make=raw.get("make"),
model=raw.get("model"),
trim_level=raw.get("trimLevel"),

View File

@@ -14,6 +14,7 @@ import time
from typing import Any, Optional
from app.config import settings
from app.engines.gemini_engine import GeminiUnavailableError
from app.extractors.receipt_extractor import (
ExtractedField,
ReceiptExtractionResult,
@@ -54,16 +55,16 @@ OCR Text:
"""
_RECEIPT_RESPONSE_SCHEMA: dict[str, Any] = {
"type": "object",
"type": "OBJECT",
"properties": {
"serviceName": {"type": "string", "nullable": True},
"serviceDate": {"type": "string", "nullable": True},
"totalCost": {"type": "number", "nullable": True},
"shopName": {"type": "string", "nullable": True},
"laborCost": {"type": "number", "nullable": True},
"partsCost": {"type": "number", "nullable": True},
"odometerReading": {"type": "number", "nullable": True},
"vehicleInfo": {"type": "string", "nullable": True},
"serviceName": {"type": "STRING", "nullable": True},
"serviceDate": {"type": "STRING", "nullable": True},
"totalCost": {"type": "NUMBER", "nullable": True},
"shopName": {"type": "STRING", "nullable": True},
"laborCost": {"type": "NUMBER", "nullable": True},
"partsCost": {"type": "NUMBER", "nullable": True},
"odometerReading": {"type": "NUMBER", "nullable": True},
"vehicleInfo": {"type": "STRING", "nullable": True},
},
"required": [
"serviceName",
@@ -87,8 +88,8 @@ class MaintenanceReceiptExtractor:
"""
def __init__(self) -> None:
self._model: Any | None = None
self._generation_config: Any | None = None
self._client: Any | None = None
self._model_name: str = ""
def extract(
self,
@@ -169,47 +170,52 @@ class MaintenanceReceiptExtractor:
processing_time_ms=processing_time_ms,
)
- def _get_model(self) -> Any:
- """Lazy-initialize Vertex AI Gemini model.
+ def _get_client(self) -> Any:
+ """Lazy-initialize google-genai Gemini client.
Uses the same authentication pattern as GeminiEngine.
"""
- if self._model is not None:
- return self._model
+ if self._client is not None:
+ return self._client
key_path = settings.google_vision_key_path
if not os.path.isfile(key_path):
- raise RuntimeError(
+ raise GeminiUnavailableError(
f"Google credential config not found at {key_path}. "
"Set GOOGLE_VISION_KEY_PATH or mount the secret."
)
- from google.cloud import aiplatform  # type: ignore[import-untyped]
- from vertexai.generative_models import (  # type: ignore[import-untyped]
- GenerationConfig,
- GenerativeModel,
- )
+ try:
+ from google import genai  # type: ignore[import-untyped]
+ # Point ADC at the WIF credential config (must be set BEFORE Client construction)
os.environ["GOOGLE_APPLICATION_CREDENTIALS"] = key_path
os.environ["GOOGLE_EXTERNAL_ACCOUNT_ALLOW_EXECUTABLES"] = "1"
- aiplatform.init(
+ self._client = genai.Client(
+ vertexai=True,
project=settings.vertex_ai_project,
location=settings.vertex_ai_location,
)
- model_name = settings.gemini_model
- self._model = GenerativeModel(model_name)
- self._generation_config = GenerationConfig(
- response_mime_type="application/json",
- response_schema=_RECEIPT_RESPONSE_SCHEMA,
- )
+ self._model_name = settings.gemini_model
logger.info(
- "Maintenance receipt Gemini model initialized (model=%s)",
- model_name,
+ "Maintenance receipt Gemini client initialized (model=%s)",
+ self._model_name,
)
- return self._model
+ return self._client
+ except ImportError as exc:
+ logger.exception("google-genai SDK import failed")
+ raise GeminiUnavailableError(
+ "google-genai is not installed. "
+ "Install with: pip install google-genai"
+ ) from exc
+ except Exception as exc:
+ logger.exception("Gemini authentication failed: %s", type(exc).__name__)
+ raise GeminiUnavailableError(
+ f"Gemini authentication failed: {exc}"
+ ) from exc
def _extract_with_gemini(self, ocr_text: str) -> dict:
"""Send OCR text to Gemini for semantic field extraction.
@@ -220,13 +226,19 @@ class MaintenanceReceiptExtractor:
Returns:
Dictionary of field_name -> extracted_value from Gemini.
"""
- model = self._get_model()
+ client = self._get_client()
+ from google.genai import types  # type: ignore[import-untyped]
prompt = _RECEIPT_EXTRACTION_PROMPT.format(ocr_text=ocr_text)
- response = model.generate_content(
- [prompt],
- generation_config=self._generation_config,
+ response = client.models.generate_content(
+ model=self._model_name,
+ contents=[prompt],
+ config=types.GenerateContentConfig(
+ response_mime_type="application/json",
+ response_schema=_RECEIPT_RESPONSE_SCHEMA,
+ ),
)
raw = json.loads(response.text)
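
The hunk above moves generation from a pre-built model object (`model.generate_content(...)`) to the client-centric google-genai shape, where the model name and per-request config travel with each call. The new shape can be exercised against a mock, as the updated test suite below does; the client and field values here are illustrative stand-ins only:

```python
import json
from unittest.mock import MagicMock

# Mock client mimicking google-genai's surface: generation lives on
# client.models.generate_content, not on a model instance.
client = MagicMock()
client.models.generate_content.return_value = MagicMock(text='{"totalCost": 59.99}')

response = client.models.generate_content(
    model="gemini-2.5-flash",
    contents=["Extract fields from: OIL CHANGE $59.99"],
    config={"response_mime_type": "application/json"},  # per-request, not per-model
)
raw = json.loads(response.text)  # {"totalCost": 59.99}
```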


@@ -21,8 +21,8 @@ google-cloud-vision>=3.7.0
# PDF Processing
PyMuPDF>=1.23.0
- # Vertex AI / Gemini (maintenance schedule extraction)
- google-cloud-aiplatform>=1.40.0
+ # Google GenAI / Gemini (maintenance schedule extraction, VIN decode)
+ google-genai>=1.0.0
# Redis for job queue
redis>=5.0.0


@@ -2,11 +2,11 @@
Covers: GeminiEngine initialization, PDF size validation,
successful extraction, empty results, and error handling.
- All Vertex AI SDK calls are mocked.
+ All google-genai SDK calls are mocked.
"""
import json
- from unittest.mock import MagicMock, patch, PropertyMock
+ from unittest.mock import MagicMock, patch
import pytest
@@ -156,22 +156,16 @@ class TestExtractMaintenance:
},
]
- mock_model = MagicMock()
- mock_model.generate_content.return_value = _make_gemini_response(schedule)
+ mock_client = MagicMock()
+ mock_client.models.generate_content.return_value = _make_gemini_response(schedule)
- with (
- patch(
- "app.engines.gemini_engine.importlib_vertex_ai"
- ) if False else patch.dict("sys.modules", {
- "google.cloud": MagicMock(),
- "google.cloud.aiplatform": MagicMock(),
- "vertexai": MagicMock(),
- "vertexai.generative_models": MagicMock(),
- }),
- ):
+ with patch.dict("sys.modules", {
+ "google.genai": MagicMock(),
+ "google.genai.types": MagicMock(),
+ }):
engine = GeminiEngine()
- engine._model = mock_model
- engine._generation_config = MagicMock()
+ engine._client = mock_client
+ engine._model_name = "gemini-2.5-flash"
result = engine.extract_maintenance(_make_pdf_bytes())
@@ -200,12 +194,12 @@ class TestExtractMaintenance:
mock_settings.vertex_ai_location = "us-central1"
mock_settings.gemini_model = "gemini-2.5-flash"
- mock_model = MagicMock()
- mock_model.generate_content.return_value = _make_gemini_response([])
+ mock_client = MagicMock()
+ mock_client.models.generate_content.return_value = _make_gemini_response([])
engine = GeminiEngine()
- engine._model = mock_model
- engine._generation_config = MagicMock()
+ engine._client = mock_client
+ engine._model_name = "gemini-2.5-flash"
result = engine.extract_maintenance(_make_pdf_bytes())
@@ -223,12 +217,12 @@ class TestExtractMaintenance:
schedule = [{"serviceName": "Brake Fluid Replacement"}]
- mock_model = MagicMock()
- mock_model.generate_content.return_value = _make_gemini_response(schedule)
+ mock_client = MagicMock()
+ mock_client.models.generate_content.return_value = _make_gemini_response(schedule)
engine = GeminiEngine()
- engine._model = mock_model
- engine._generation_config = MagicMock()
+ engine._client = mock_client
+ engine._model_name = "gemini-2.5-flash"
result = engine.extract_maintenance(_make_pdf_bytes())
@@ -264,7 +258,8 @@ class TestErrorHandling:
with (
patch("app.engines.gemini_engine.settings") as mock_settings,
patch.dict("sys.modules", {
- "google.cloud.aiplatform": None,
+ "google": None,
+ "google.genai": None,
}),
):
mock_settings.google_vision_key_path = "/fake/creds.json"
@@ -283,12 +278,12 @@ class TestErrorHandling:
mock_settings.vertex_ai_location = "us-central1"
mock_settings.gemini_model = "gemini-2.5-flash"
- mock_model = MagicMock()
- mock_model.generate_content.side_effect = RuntimeError("API quota exceeded")
+ mock_client = MagicMock()
+ mock_client.models.generate_content.side_effect = RuntimeError("API quota exceeded")
engine = GeminiEngine()
- engine._model = mock_model
- engine._generation_config = MagicMock()
+ engine._client = mock_client
+ engine._model_name = "gemini-2.5-flash"
with pytest.raises(GeminiProcessingError, match="maintenance extraction failed"):
engine.extract_maintenance(_make_pdf_bytes())
@@ -307,12 +302,12 @@ class TestErrorHandling:
mock_response = MagicMock()
mock_response.text = "not valid json {{"
- mock_model = MagicMock()
- mock_model.generate_content.return_value = mock_response
+ mock_client = MagicMock()
+ mock_client.models.generate_content.return_value = mock_response
engine = GeminiEngine()
- engine._model = mock_model
- engine._generation_config = MagicMock()
+ engine._client = mock_client
+ engine._model_name = "gemini-2.5-flash"
with pytest.raises(GeminiProcessingError, match="invalid JSON"):
engine.extract_maintenance(_make_pdf_bytes())
@@ -322,32 +317,32 @@ class TestErrorHandling:
class TestLazyInitialization:
- """Verify the model is not created until first use."""
+ """Verify the client is not created until first use."""
- def test_model_is_none_after_construction(self):
- """GeminiEngine should not initialize the model in __init__."""
+ def test_client_is_none_after_construction(self):
+ """GeminiEngine should not initialize the client in __init__."""
engine = GeminiEngine()
- assert engine._model is None
+ assert engine._client is None
@patch("app.engines.gemini_engine.settings")
@patch("app.engines.gemini_engine.os.path.isfile", return_value=True)
- def test_model_reused_on_second_call(self, mock_isfile, mock_settings):
- """Once initialized, the same model instance is reused."""
+ def test_client_reused_on_second_call(self, mock_isfile, mock_settings):
+ """Once initialized, the same client instance is reused."""
mock_settings.google_vision_key_path = "/fake/creds.json"
mock_settings.vertex_ai_project = "test-project"
mock_settings.vertex_ai_location = "us-central1"
mock_settings.gemini_model = "gemini-2.5-flash"
schedule = [{"serviceName": "Oil Change", "intervalMiles": 5000}]
- mock_model = MagicMock()
- mock_model.generate_content.return_value = _make_gemini_response(schedule)
+ mock_client = MagicMock()
+ mock_client.models.generate_content.return_value = _make_gemini_response(schedule)
engine = GeminiEngine()
- engine._model = mock_model
- engine._generation_config = MagicMock()
+ engine._client = mock_client
+ engine._model_name = "gemini-2.5-flash"
engine.extract_maintenance(_make_pdf_bytes())
engine.extract_maintenance(_make_pdf_bytes())
- # Model's generate_content should have been called twice
- assert mock_model.generate_content.call_count == 2
+ # Client's generate_content should have been called twice
+ assert mock_client.models.generate_content.call_count == 2


@@ -0,0 +1,122 @@
"""Tests for deterministic VIN year resolution.
Covers: all three 30-year cycles (1980-2009, 2010-2039, 2040-2050),
position 7 disambiguation, edge cases, and invalid inputs.
"""
import pytest
from unittest.mock import patch
from datetime import datetime
from app.engines.gemini_engine import resolve_vin_year
class TestSecondCycle:
"""Position 7 alphabetic -> 2010-2039 cycle (NHTSA MY2010+ requirement)."""
def test_p_with_alpha_pos7_returns_2023(self):
"""P=2023 when position 7 is alphabetic (the bug that triggered this fix)."""
# VIN: 1G1YE2D32P5602473 -- pos7='D' (alphabetic), pos10='P'
assert resolve_vin_year("1G1YE2D32P5602473") == 2023
def test_a_with_alpha_pos7_returns_2010(self):
"""A=2010 when position 7 is alphabetic."""
assert resolve_vin_year("1G1YE2D12A5602473") == 2010
def test_l_with_alpha_pos7_returns_2020(self):
"""L=2020 when position 7 is alphabetic."""
assert resolve_vin_year("1G1YE2D12L5602473") == 2020
def test_9_with_alpha_pos7_returns_2039(self):
"""9=2039 when position 7 is alphabetic."""
assert resolve_vin_year("1G1YE2D1295602473") == 2039
def test_digit_1_with_alpha_pos7_returns_2031(self):
"""1=2031 when position 7 is alphabetic."""
assert resolve_vin_year("1G1YE2D1215602473") == 2031
def test_s_with_alpha_pos7_returns_2025(self):
"""S=2025 when position 7 is alphabetic."""
assert resolve_vin_year("1G1YE2D12S5602473") == 2025
def test_t_with_alpha_pos7_returns_2026(self):
"""T=2026 when position 7 is alphabetic."""
assert resolve_vin_year("1G1YE2D12T5602473") == 2026
class TestFirstCycle:
"""Position 7 numeric -> 1980-2009 cycle."""
def test_m_with_numeric_pos7_returns_1991(self):
"""M=1991 when position 7 is numeric."""
assert resolve_vin_year("1G1YE2132M5602473") == 1991
def test_n_with_numeric_pos7_returns_1992(self):
"""N=1992 when position 7 is numeric."""
assert resolve_vin_year("1G1YE2132N5602473") == 1992
def test_p_with_numeric_pos7_returns_1993(self):
"""P=1993 when position 7 is numeric."""
assert resolve_vin_year("1G1YE2132P5602473") == 1993
def test_y_with_numeric_pos7_returns_2000(self):
"""Y=2000 when position 7 is numeric."""
assert resolve_vin_year("1G1YE2132Y5602473") == 2000
class TestThirdCycle:
"""Position 7 numeric + third cycle year (2040-2050) is plausible."""
@patch("app.engines.gemini_engine.datetime")
def test_a_with_numeric_pos7_returns_2040_when_plausible(self, mock_dt):
"""A=2040 when position 7 is numeric and year 2040 is plausible."""
mock_dt.now.return_value = datetime(2039, 1, 1)
# 2039 + 2 = 2041 >= 2040, so third cycle is plausible
assert resolve_vin_year("1G1YE2132A5602473") == 2040
@patch("app.engines.gemini_engine.datetime")
def test_l_with_numeric_pos7_returns_2050_when_plausible(self, mock_dt):
"""L=2050 when position 7 is numeric and year 2050 is plausible."""
mock_dt.now.return_value = datetime(2049, 6, 1)
assert resolve_vin_year("1G1YE2132L5602473") == 2050
@patch("app.engines.gemini_engine.datetime")
def test_a_with_numeric_pos7_returns_1980_when_2040_not_plausible(self, mock_dt):
"""A=1980 when third cycle year (2040) exceeds max plausible."""
mock_dt.now.return_value = datetime(2026, 2, 20)
# 2026 + 2 = 2028 < 2040, so third cycle not plausible -> first cycle
assert resolve_vin_year("1G1YE2132A5602473") == 1980
@patch("app.engines.gemini_engine.datetime")
def test_k_with_numeric_pos7_returns_2049_when_plausible(self, mock_dt):
"""K=2049 when position 7 is numeric and year is plausible."""
mock_dt.now.return_value = datetime(2048, 1, 1)
assert resolve_vin_year("1G1YE2132K5602473") == 2049
class TestEdgeCases:
"""Invalid inputs and boundary conditions."""
def test_short_vin_returns_none(self):
"""VIN shorter than 17 chars returns None."""
assert resolve_vin_year("1G1YE2D32") is None
def test_empty_vin_returns_none(self):
"""Empty string returns None."""
assert resolve_vin_year("") is None
def test_invalid_year_code_returns_none(self):
"""Position 10 with invalid code (e.g., 'Z') returns None."""
# Z is not a valid year code
assert resolve_vin_year("1G1YE2D32Z5602473") is None
def test_lowercase_vin_handled(self):
"""Lowercase VIN characters are handled correctly."""
assert resolve_vin_year("1g1ye2d32p5602473") == 2023
def test_i_o_q_not_valid_year_codes(self):
"""Letters I, O, Q are not valid VIN year codes."""
# These are excluded from VINs entirely but test graceful handling
assert resolve_vin_year("1G1YE2D32I5602473") is None
assert resolve_vin_year("1G1YE2D32O5602473") is None
assert resolve_vin_year("1G1YE2D32Q5602473") is None
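
A minimal sketch of `resolve_vin_year`, reconstructed solely from the behaviors these tests assert; the actual implementation in `app.engines.gemini_engine` may differ. Position 10 carries the model-year code, and position 7 disambiguates the 30-year cycle: alphabetic means the 2010-2039 cycle, numeric means 1980-2009 unless a 2040-2050 year is plausible (within two years of now):

```python
from datetime import datetime
from typing import Optional

# First-cycle (1980-2009) year codes; I, O, Q, U, Z and 0 are never used.
_YEAR_CODES = {
    **{c: 1980 + i for i, c in enumerate("ABCDEFGH")},   # A=1980 .. H=1987
    **{c: 1988 + i for i, c in enumerate("JKLMN")},      # J=1988 .. N=1992
    "P": 1993,
    **{c: 1994 + i for i, c in enumerate("RSTVWXY")},    # R=1994 .. Y=2000
    **{str(d): 2000 + d for d in range(1, 10)},          # 1=2001 .. 9=2009
}

def resolve_vin_year(vin: str) -> Optional[int]:
    """Resolve the model year from VIN position 10, using position 7 to pick the cycle."""
    if not vin or len(vin) < 17:
        return None
    vin = vin.upper()
    base = _YEAR_CODES.get(vin[9])       # position 10: model-year code
    if base is None:
        return None                      # invalid code, e.g. Z, I, O, Q
    if vin[6].isalpha():                 # position 7 alphabetic -> 2010-2039 cycle
        return base + 30
    # Position 7 numeric -> 1980-2009, unless the 2040+ cycle is plausible.
    third = base + 60
    if third <= datetime.now().year + 2:
        return third
    return base
```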