The installed google-genai version does not support max_remote_calls on
AutomaticFunctionCallingConfig, causing a pydantic validation error that
broke VIN decode on staging.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Replace engine._model/engine._generation_config mocks with
engine._client/engine._model_name. Update sys.modules patches
from vertexai to google.genai. Remove dead if-False branch.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Replace vertexai.generative_models with google.genai client pattern.
Add Google Search grounding tool to VIN decode for improved accuracy.
Convert response schema types to uppercase per Vertex AI Schema spec.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
gemini-3-flash-preview was hallucinating year (e.g., returning 1993
instead of 2023 for position-10 code P). Prompt now includes the full
1980-2039 year code table and position-7 disambiguation rule.
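
For reference, the year lookup the prompt describes can be sketched in a few
lines. This is a minimal illustration, not the prompt text or service code. It
assumes the position-7 rule is the common one (numeric position 7 means the
1980-2009 cycle, alphabetic means 2010-2039). The names YEAR_CODES and
decode_model_year are hypothetical.

```python
# Model-year codes for VIN position 10. The letters I, O, Q, U, Z and the
# digit 0 are not used as year codes, so the cycle has 30 entries.
YEAR_CODES = "ABCDEFGHJKLMNPRSTVWXY123456789"


def decode_model_year(vin: str) -> int:
    """Resolve the position-10 year code, using position 7 to pick the cycle."""
    if len(vin) != 17:
        raise ValueError("VIN must be exactly 17 characters")
    # str.index raises ValueError if the character is not a valid year code.
    offset = YEAR_CODES.index(vin[9].upper())
    # Assumed rule: numeric position 7 -> 1980-2009, alphabetic -> 2010-2039.
    base = 1980 if vin[6].isdigit() else 2010
    return base + offset
```

With this rule, code P resolves to 1993 when position 7 is a digit and to 2023
when it is a letter.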
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Delete vehicles/external/nhtsa/ directory (3 files), remove VPICVariable
and VPICResponse from platform models. Update all documentation to
reflect Gemini VIN decode via OCR service architecture.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Add POST /decode/vin endpoint using Gemini 2.5 Flash for VIN string
decoding. Returns structured vehicle data (year, make, model, trim,
body/drive/fuel type, engine, transmission) with confidence score.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
The 5s cloud timeout was too tight for the initial WIF authentication,
which requires 3 HTTP round-trips (STS, IAM credentials, resource
manager). First call took 5.5s and was discarded, falling back to slow
CPU-based PaddleOCR. Increased to 10s to accommodate cold-start auth.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
pdf2image requires poppler-utils which is not installed in the OCR
container. PyMuPDF is already in requirements.txt and can render PDF
pages to PNG at 300 DPI natively without extra system dependencies.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
The receipt extractor only accepted image MIME types, rejecting PDFs at
the OCR layer. Added application/pdf to supported types and PDF-to-image
conversion (first page at 300 DPI) before OCR preprocessing.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
- New MaintenanceReceiptExtractor: Gemini-primary extraction with regex
cross-validation for dates, amounts, and odometer readings
- New maintenance_receipt_validation.py: cross-validation patterns for
structured field confidence adjustment
- New POST /extract/maintenance-receipt endpoint reusing
ReceiptExtractionResponse model
- Per-field confidence scores (0.0-1.0) with Gemini base 0.85,
boosted/reduced by regex agreement
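
A minimal sketch of the per-field adjustment described above. Only the 0.85
base and the 0.0-1.0 range come from this commit. The boost and penalty
amounts, the exact-match comparison, and the function name are assumptions
made for illustration.

```python
GEMINI_BASE_CONFIDENCE = 0.85  # base score stated in this commit
AGREEMENT_BOOST = 0.10         # assumed value, for illustration only
DISAGREEMENT_PENALTY = 0.25    # assumed value, for illustration only


def adjust_confidence(gemini_value: str, regex_matches: list[str]) -> float:
    """Return a 0.0-1.0 confidence for one field after regex cross-validation."""
    if not regex_matches:
        # The regex found nothing to compare against: keep the base score.
        return GEMINI_BASE_CONFIDENCE
    normalized = gemini_value.strip().lower()
    agrees = any(normalized == m.strip().lower() for m in regex_matches)
    delta = AGREEMENT_BOOST if agrees else -DISAGREEMENT_PENALTY
    # Clamp to the documented range and round away float noise.
    return round(min(1.0, max(0.0, GEMINI_BASE_CONFIDENCE + delta)), 2)
```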
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Add/update documentation across backend, Python OCR service, and frontend
for receipt scanning, manual extraction, and Gemini integration. Create
new CLAUDE.md files for engines/, fuel-logs/, documents/, and maintenance/
features.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Replace traditional OCR pipeline (table_detector, table_parser,
maintenance_patterns) with GeminiEngine for semantic PDF extraction.
Map Gemini serviceName values to 27 maintenance subtypes via
ServiceMapper fuzzy matching. Add 8 unit tests covering normal
extraction, unusual names, empty response, and error handling.
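
One way to picture the fuzzy mapping step, sketched with the standard-library
difflib. The subtype list below is a hypothetical subset (the real list has 27
entries), and map_service_name, the 0.6 cutoff, and the choice of difflib are
assumptions. The actual ServiceMapper may match differently.

```python
from difflib import get_close_matches
from typing import Optional

# Hypothetical subset of maintenance subtypes, for illustration only.
SUBTYPES = [
    "Oil Change",
    "Tire Rotation",
    "Brake Pad Replacement",
    "Air Filter",
    "Coolant Flush",
]


def map_service_name(service_name: str, cutoff: float = 0.6) -> Optional[str]:
    """Map a free-form service name to the closest known subtype, or None."""
    lookup = {s.lower(): s for s in SUBTYPES}
    matches = get_close_matches(
        service_name.strip().lower(), list(lookup), n=1, cutoff=cutoff
    )
    return lookup[matches[0]] if matches else None
```

Returning None for names below the cutoff lets the caller decide how to handle
unusual service names instead of forcing a wrong subtype.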
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Add standalone GeminiEngine class for maintenance schedule extraction
from PDF owner's manuals using Vertex AI Gemini 2.5 Flash with structured
JSON output enforcement, 20MB size limit, and lazy initialization.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Combining set -e with curl --fail-with-body inside $() caused the script to exit
with code 22 and empty stderr, hiding the actual Auth0 error. Switch to
writing the body to a temp file and checking HTTP status manually so the
error response is visible in logs.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
- Create fetch-auth0-token.sh for Auth0 M2M -> GCP WIF token exchange
- Add jq to Dockerfile system dependencies
- Ensure script is executable in container image
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
- Add VISION_MONTHLY_LIMIT config setting (default 1000)
- Update CloudEngine to use WIF credential config via ADC
- Rewrite HybridEngine to support cloud-primary with Redis counter
- Pass monthly_limit through engine factory
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Add engine abstraction tests and update docs to reflect the PaddleOCR-primary
architecture with optional Google Vision cloud fallback.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
- Replace libtesseract-dev with libgomp1 (OpenMP for PaddlePaddle)
- Pre-download PP-OCRv4 models during Docker build
- Add OCR engine env vars to all compose files (base, staging, prod)
- Add optional Google Vision secret mount (commented, enable on demand)
- Create google-vision-key.json.example placeholder
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
CloudEngine wraps Google Vision TEXT_DETECTION with lazy init.
HybridEngine runs primary engine, falls back to cloud when confidence
is below threshold. Disabled by default (OCR_FALLBACK_ENGINE=none).
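
The fallback rule can be sketched as follows. This is an illustration under
assumptions, not the HybridEngine code: OcrResult, the Engine callable type,
the 0.6 default threshold, and the policy of keeping whichever result scores
higher are all hypothetical.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class OcrResult:
    text: str
    confidence: float  # 0.0-1.0


# Hypothetical engine type: any callable from image bytes to an OcrResult.
Engine = Callable[[bytes], OcrResult]


def recognize_with_fallback(
    image: bytes,
    primary: Engine,
    fallback: Optional[Engine],
    threshold: float = 0.6,
) -> OcrResult:
    """Run the primary engine; consult the fallback only on low confidence."""
    result = primary(image)
    if fallback is None or result.confidence >= threshold:
        return result  # fallback disabled, or primary is good enough
    try:
        cloud = fallback(image)
    except Exception:
        return result  # a failing fallback must not lose the primary result
    # Assumed policy: keep whichever result reports higher confidence.
    return cloud if cloud.confidence > result.confidence else result
```

Passing fallback=None corresponds to the disabled default.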
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Replace direct pytesseract calls with OcrEngine interface in vin_extractor.py,
receipt_extractor.py, and ocr_service.py. PSM mode fallbacks replaced with
engine-agnostic single-line/single-word configs. Dead _process_ocr_data removed.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Introduce pluggable OcrEngine ABC with PaddleOCR PP-OCRv4 as primary
engine and Tesseract wrapper for backward compatibility. Engine factory
reads OCR_PRIMARY_ENGINE config to instantiate the correct engine.
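
A minimal sketch of the ABC-plus-factory pattern described above. The recognize
signature, the stub engine classes, the registry keys, and reading the setting
from an environment variable are assumptions. The real interface may differ.

```python
import os
from abc import ABC, abstractmethod
from typing import Dict, Optional, Type


class OcrEngine(ABC):
    """Minimal engine interface: image bytes in, recognized text out."""

    @abstractmethod
    def recognize(self, image: bytes) -> str:
        ...


class PaddleOcrEngine(OcrEngine):
    def recognize(self, image: bytes) -> str:
        return "<paddleocr output>"  # placeholder for a real PP-OCRv4 call


class TesseractEngine(OcrEngine):
    def recognize(self, image: bytes) -> str:
        return "<tesseract output>"  # placeholder for the compatibility wrapper


# Hypothetical registry keys; the real config values are not shown here.
_ENGINES: Dict[str, Type[OcrEngine]] = {
    "paddleocr": PaddleOcrEngine,
    "tesseract": TesseractEngine,
}


def create_engine(name: Optional[str] = None) -> OcrEngine:
    """Instantiate the engine named explicitly or by OCR_PRIMARY_ENGINE."""
    key = (name or os.environ.get("OCR_PRIMARY_ENGINE", "paddleocr")).lower()
    if key not in _ENGINES:
        raise ValueError(f"Unknown OCR engine: {key!r}")
    return _ENGINES[key]()
```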
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
When OCR reads extra characters (e.g. sticker border as 'C', spurious
'Z' insertion), the raw text exceeds 17 chars and the old first-17
trim produced wrong VINs. New strategy tries all 17-char sliding
windows and single/double character deletions, validating each via
check digit. For 'CWVGGNPE2Z4NP069500', this finds the correct VIN
'WVGGNPE24NP069500' (valid check digit) instead of 'CWVGGNPE2Z4NP0695'
(invalid).
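
The recovery strategy can be sketched as below, using the standard VIN check
digit (position 9, weighted sum mod 11). This is a simplified illustration, not
the extractor code: candidate_vins is a hypothetical name, and deletions are
only tried when the text is one or two characters too long.

```python
from itertools import combinations

# Standard VIN transliteration values and position weights.
_VALUES = {
    **{str(d): d for d in range(10)},
    **dict(zip("ABCDEFGH", range(1, 9))),
    **dict(zip("JKLMN", range(1, 6))),
    "P": 7,
    "R": 9,
    **dict(zip("STUVWXYZ", range(2, 10))),
}
_WEIGHTS = (8, 7, 6, 5, 4, 3, 2, 10, 0, 9, 8, 7, 6, 5, 4, 3, 2)


def has_valid_check_digit(vin: str) -> bool:
    """True if vin is 17 valid characters and position 9 matches the checksum."""
    if len(vin) != 17 or any(ch not in _VALUES for ch in vin):
        return False
    remainder = sum(_VALUES[ch] * w for ch, w in zip(vin, _WEIGHTS)) % 11
    return vin[8] == ("X" if remainder == 10 else str(remainder))


def candidate_vins(raw: str, max_deletions: int = 2) -> list[str]:
    """Return check-digit-valid 17-char candidates from over-long OCR text."""
    raw = raw.upper()
    seen: set[str] = set()
    found: list[str] = []

    def consider(candidate: str) -> None:
        if candidate not in seen:
            seen.add(candidate)
            if has_valid_check_digit(candidate):
                found.append(candidate)

    # 1. Every contiguous 17-character window.
    for start in range(len(raw) - 16):
        consider(raw[start:start + 17])
    # 2. Every way of deleting the surplus characters (simplification:
    #    only when the text is at most max_deletions too long).
    extra = len(raw) - 17
    if 1 <= extra <= max_deletions:
        for drop in combinations(range(len(raw)), extra):
            consider("".join(ch for i, ch in enumerate(raw) if i not in drop))
    return found
```

Because roughly one in eleven random strings passes the check digit, several
candidates can validate for a single input, so a real implementation still
needs a rule for ranking them.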
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
tessedit_char_whitelist does not work with OEM 1 (LSTM engine) and
causes empty/erratic output. This was the root cause of Tesseract
returning empty text despite clear, well-preprocessed images.
Character filtering is already handled post-OCR by the VIN validator's
correct_ocr_errors() method (I->1, O->0, Q->0, etc).
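
A sketch of the kind of post-OCR substitution this refers to, covering only the
three mappings named above. The function name here is a hypothetical stand-in.
The validator's real correct_ocr_errors() handles more cases.

```python
# I, O and Q never appear in a valid VIN, so map them to look-alike digits.
_OCR_FIXES = str.maketrans({"I": "1", "O": "0", "Q": "0"})


def replace_invalid_vin_letters(text: str) -> str:
    """Uppercase the text and replace letters that cannot occur in a VIN."""
    return text.upper().translate(_OCR_FIXES)
```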
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
The min-channel correctly extracts contrast (white text=255 vs green
sticker bg=130), but Tesseract expects dark text on light background.
Without inversion, the grayscale-only path returned empty text for
every PSM mode because Tesseract couldn't see bright-on-dark text.
Invert via bitwise_not: text becomes 0 (black), sticker bg becomes
125 (gray). Fixes all three OCR paths (adaptive, grayscale, Otsu).
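
The arithmetic can be checked on individual pixels. The sketch below uses plain
Python lists instead of image arrays, and the green sticker color is a
hypothetical BGR value chosen so that its weakest channel is 130.

```python
def min_channel_inverted(pixels_bgr):
    """Per-pixel min(B, G, R), then inverted so bright text becomes dark."""
    return [255 - min(b, g, r) for (b, g, r) in pixels_bgr]


white_text = (255, 255, 255)
green_sticker = (130, 180, 130)  # hypothetical BGR value, weakest channel 130

# White text maps to 0 (black) and the sticker background to 125 (gray).
text_value, background_value = min_channel_inverted([white_text, green_sticker])
```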
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Two fixes:
1. Always use min-channel for color images instead of the gated comparison
that was falling back to standard grayscale (which has only 23%
contrast for white-on-green VIN stickers).
2. Add grayscale-only OCR path (CLAHE + denoise, no thresholding)
between adaptive and Otsu attempts. Tesseract's LSTM engine is
designed to handle grayscale input directly and often outperforms
binarized input where thresholding creates artifacts.
Pipeline order: adaptive threshold → grayscale-only → Otsu threshold
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Replace std-based channel selection (which incorrectly picked green for
green-tinted VIN stickers) with per-pixel min(B,G,R). White text stays
255 in all channels while colored backgrounds drop to their weakest
channel value, giving a 2x contrast improvement. Add morphological
opening after thresholding to remove noise speckles from car body
surface that were confusing Tesseract's page segmentation.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
White text on green VIN stickers has only ~12% contrast in standard
grayscale conversion because the green channel dominates luminance.
The new _best_contrast_channel method evaluates each RGB channel's
standard deviation and selects the one with highest contrast, giving
~2x improvement for green-tinted VIN stickers. Falls back to standard
grayscale for neutral-colored images.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Save original, adaptive, and Otsu preprocessed images to
/tmp/vin-debug/{timestamp}/ when LOG_LEVEL is set to debug.
No images saved at info level. Volume mount added for access.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Replace filesystem-based debug system (VIN_DEBUG_DIR) with standard
logger.debug() calls that flow through Loki when LOG_LEVEL=DEBUG.
Use .env.logging variable for OCR LOG_LEVEL. Increase image capture
quality to 0.95 for better OCR accuracy.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>