feat: VIN Photo OCR Pipeline #67


egullickson · 2026-02-01T18:47:01Z


Overview

Implement VIN-specific OCR extraction in the OCR service, including image preprocessing optimized for VIN plates/stickers and 17-character pattern matching.

Parent Issue: #12 (OCR-powered smart capture)
Priority: P1 - VIN OCR
Dependencies: OCR Service Container Setup, Core OCR API Integration

Scope

VIN Extraction Endpoint

POST /extract/vin
Content-Type: multipart/form-data

Request:
  - file: image (HEIC, JPEG, PNG)

Response:
{
  "success": true,
  "vin": "1HGBH41JXMN109186",
  "confidence": 0.94,
  "boundingBox": { "x": 120, "y": 80, "width": 340, "height": 45 },
  "alternatives": [
    { "vin": "1HGBH41JXMN109186", "confidence": 0.94 },
    { "vin": "1HGBH41JXMN1091B6", "confidence": 0.72 }
  ],
  "processingTimeMs": 1250
}
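
The response shape above could be modeled with plain dataclasses, as a minimal sketch. The names `BoundingBox`, `VinCandidate`, `VinResponse`, and `to_payload` are hypothetical and not part of this issue; only the JSON keys come from the example. Returning `vin: null` on failure is also an assumption.

```python
from dataclasses import asdict, dataclass, field
from typing import Optional


@dataclass
class BoundingBox:
    x: int
    y: int
    width: int
    height: int


@dataclass
class VinCandidate:
    vin: str
    confidence: float


@dataclass
class VinResponse:
    success: bool
    vin: Optional[str] = None  # assumed to be null when no VIN is found
    confidence: float = 0.0
    bounding_box: Optional[BoundingBox] = None
    alternatives: list[VinCandidate] = field(default_factory=list)
    processing_time_ms: int = 0

    def to_payload(self) -> dict:
        """Serialize with the camelCase keys used in the example response."""
        return {
            "success": self.success,
            "vin": self.vin,
            "confidence": self.confidence,
            "boundingBox": asdict(self.bounding_box) if self.bounding_box else None,
            "alternatives": [asdict(a) for a in self.alternatives],
            "processingTimeMs": self.processing_time_ms,
        }
```

Keeping snake_case attributes internally and converting only in `to_payload` is one option; a framework-level alias would work equally well.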

Image Preprocessing Pipeline

Input Image
    ↓
HEIC Conversion (pillow-heif) if needed
    ↓
Grayscale conversion
    ↓
Deskew (correct rotation/tilt)
    ↓
Contrast enhancement (CLAHE)
    ↓
Noise reduction (fastNlMeansDenoising)
    ↓
Adaptive thresholding
    ↓
OCR with Tesseract
    ↓
VIN pattern extraction

VIN Pattern Matching

17 alphanumeric characters (modern vehicles 1981+)
Exclude I, O, Q (not used in VINs)
Validate check digit (position 9, weighted sum modulo 11)
Support relaxed matching for pre-1981 (11-17 chars)
Return confidence based on:
- Character recognition confidence
- Pattern validity
- Check digit validation

VIN Validation Rules

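import re

VIN_PATTERN = r'^[A-HJ-NPR-Z0-9]{17}$'
TRANSLITERATION = {'I': '1', 'O': '0', 'Q': '0'}  # Common OCR errors
LETTER_VALUES = dict(zip('ABCDEFGHJKLMNPRSTUVWXYZ', map(int, '12345678123457923456789')))  # Check-digit letter values
WEIGHTS = (8, 7, 6, 5, 4, 3, 2, 10, 0, 9, 8, 7, 6, 5, 4, 3, 2)  # Per-position weights

def validate_vin(vin: str) -> tuple[bool, float]:
    # Check length and character set
    if not re.fullmatch(VIN_PATTERN, vin):
        return False, 0.0
    # Validate check digit (position 9): weighted sum mod 11, where 10 maps to 'X'
    total = sum((int(c) if c.isdigit() else LETTER_VALUES[c]) * w for c, w in zip(vin, WEIGHTS))
    check_ok = '0123456789X'[total % 11] == vin[8]
    # Return (is_valid, confidence_adjustment); 0.5 is a placeholder penalty
    return check_ok, 1.0 if check_ok else 0.5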

Directory Structure

ocr/app/
├── extractors/
│   ├── __init__.py
│   ├── base.py           # Base extractor class
│   └── vin_extractor.py  # VIN-specific logic
├── preprocessors/
│   ├── __init__.py
│   └── vin_preprocessor.py  # VIN-optimized preprocessing
└── validators/
    ├── __init__.py
    └── vin_validator.py  # VIN format validation
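
One possible shape for `base.py`, assuming the extractors share a preprocess, OCR, parse flow. The class name `BaseExtractor` and its method names are hypothetical; the issue only states that a base extractor class exists.

```python
from abc import ABC, abstractmethod
from typing import Any


class BaseExtractor(ABC):
    """Shared flow: preprocess the image, run OCR, then parse the text."""

    def extract(self, image_bytes: bytes) -> dict[str, Any]:
        prepared = self.preprocess(image_bytes)
        text = self.run_ocr(prepared)
        return self.parse(text)

    @abstractmethod
    def preprocess(self, image_bytes: bytes) -> Any:
        """Decode and clean up the image for OCR."""

    @abstractmethod
    def run_ocr(self, prepared: Any) -> str:
        """Return the raw text recognized in the prepared image."""

    @abstractmethod
    def parse(self, text: str) -> dict[str, Any]:
        """Turn raw OCR text into the response payload."""
```

Under this layout, `vin_extractor.py` would subclass `BaseExtractor` and delegate to the VIN preprocessor and validator modules.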

Test Cases

Input	Expected Output
Clear VIN sticker photo	Correct VIN, confidence > 0.90
Angled VIN plate	Correct VIN after deskew, confidence > 0.80
Dashboard VIN through windshield	Correct VIN, confidence > 0.70
Low light VIN photo	Best effort, confidence may be lower
Non-VIN image	success: false, no VIN returned
Partial VIN visible	Partial match with low confidence

Acceptance Criteria

Endpoint accepts HEIC, JPEG, PNG images
HEIC conversion works correctly
Preprocessing improves OCR accuracy
VIN pattern extraction returns valid VIN
Check digit validation implemented
Common OCR errors corrected (I→1, O→0, Q→0)
Confidence scoring reflects quality
Alternatives returned when ambiguous
Processing time < 3 seconds per image
Handles edge cases gracefully (non-VIN image, partial VIN, low light)
Unit tests for validation logic
Integration tests with sample images

Technical Notes

Tesseract page segmentation mode (PSM): 7 (single text line) or 8 (single word)
Character whitelist: ABCDEFGHJKLMNPRSTUVWXYZ0123456789
Consider region-of-interest detection to crop VIN area
Log OCR attempts for debugging (redact actual VIN in production)
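
A small sketch of two of these notes: building the Tesseract options string and masking a VIN before it reaches the logs. `tesseract_config` and `redact_vin` are hypothetical helpers, and keeping the last four characters is an assumed policy, not something this issue specifies. Whether the whitelist is honored can depend on the Tesseract version and engine mode, so treat it as a hint and still validate the output.

```python
VIN_WHITELIST = "ABCDEFGHJKLMNPRSTUVWXYZ0123456789"


def tesseract_config(psm: int = 7) -> str:
    """Build Tesseract CLI options for a single line (7) or single word (8)."""
    if psm not in (7, 8):
        raise ValueError("expected PSM 7 or 8 for VIN images")
    return f"--psm {psm} -c tessedit_char_whitelist={VIN_WHITELIST}"


def redact_vin(vin: str, keep: int = 4) -> str:
    """Mask all but the last `keep` characters before writing to logs."""
    if keep <= 0 or len(vin) <= keep:
        return "*" * len(vin)
    return "*" * (len(vin) - keep) + vin[-keep:]
```

If the service calls Tesseract through pytesseract (an assumption), the string can be passed as the `config` argument of `pytesseract.image_to_string`.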

Out of Scope

Camera capture UI (see #12c)
VehicleForm integration (see #12e)
NHTSA decode (existing feature, just wire up)
PaddleOCR fallback (add if Tesseract insufficient)


egullickson added labels · 2026-02-01 18:48:36 +00:00

egullickson referenced this issue · 2026-02-01 18:49:00 +00:00 · feat: OCR-powered smart capture for VIN, receipts, and owner's manuals #12

egullickson added status in-progress