feat: VIN Photo OCR Pipeline #67

Closed
opened 2026-02-01 18:47:01 +00:00 by egullickson · 0 comments
Owner

Overview

Implement VIN-specific OCR extraction in the OCR service, including image preprocessing optimized for VIN plates/stickers and 17-character pattern matching.

Parent Issue: #12 (OCR-powered smart capture)
Priority: P1 - VIN OCR
Dependencies: OCR Service Container Setup, Core OCR API Integration

Scope

VIN Extraction Endpoint

POST /extract/vin
Content-Type: multipart/form-data

Request:
  - file: image (HEIC, JPEG, PNG)

Response:
{
  "success": true,
  "vin": "1HGBH41JXMN109186",
  "confidence": 0.94,
  "boundingBox": { "x": 120, "y": 80, "width": 340, "height": 45 },
  "alternatives": [
    { "vin": "1HGBH41JXMN109186", "confidence": 0.94 },
    { "vin": "1HGBH41JXMN1O9186", "confidence": 0.72 }
  ],
  "processingTimeMs": 1250
}
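To make the response shape concrete, here is a minimal sketch of how a caller might parse it. The class names `VinCandidate` and `VinExtractionResult` and the snake_case attribute names are hypothetical, chosen only for illustration; the JSON keys are the ones listed in the response above.

```python
import json
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class VinCandidate:
    vin: str
    confidence: float


@dataclass
class VinExtractionResult:
    # Hypothetical client-side model; field names mirror the JSON keys above.
    success: bool
    vin: Optional[str] = None
    confidence: float = 0.0
    bounding_box: Optional[dict] = None
    alternatives: list = field(default_factory=list)
    processing_time_ms: Optional[int] = None

    @classmethod
    def from_json(cls, payload: str) -> "VinExtractionResult":
        data = json.loads(payload)
        return cls(
            success=data["success"],
            vin=data.get("vin"),
            confidence=data.get("confidence", 0.0),
            bounding_box=data.get("boundingBox"),
            alternatives=[VinCandidate(**alt) for alt in data.get("alternatives", [])],
            processing_time_ms=data.get("processingTimeMs"),
        )


# The sample response from this issue, used as test input.
SAMPLE = """
{
  "success": true,
  "vin": "1HGBH41JXMN109186",
  "confidence": 0.94,
  "boundingBox": { "x": 120, "y": 80, "width": 340, "height": 45 },
  "alternatives": [
    { "vin": "1HGBH41JXMN109186", "confidence": 0.94 },
    { "vin": "1HGBH41JXMN1O9186", "confidence": 0.72 }
  ],
  "processingTimeMs": 1250
}
"""

result = VinExtractionResult.from_json(SAMPLE)
```

Using `.get()` for everything except `success` is an assumption that a failed extraction may omit the other keys; the issue does not specify the failure payload.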

Image Preprocessing Pipeline

Input Image
    ↓
HEIC Conversion (pillow-heif) if needed
    ↓
Grayscale conversion
    ↓
Deskew (correct rotation/tilt)
    ↓
Contrast enhancement (CLAHE)
    ↓
Noise reduction (fastNlMeansDenoising)
    ↓
Adaptive thresholding
    ↓
OCR with Tesseract
    ↓
VIN pattern extraction

VIN Pattern Matching

  • 17 alphanumeric characters (modern vehicles 1981+)
  • Exclude I, O, Q (not used in VINs)
  • Validate check digit (position 9)
  • Support relaxed matching for pre-1981 (11-17 chars)
  • Return confidence based on:
    • Character recognition confidence
    • Pattern validity
    • Check digit validation
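
As a rough illustration of the strict and relaxed matching rules above, the sketch below pulls VIN-shaped tokens out of raw OCR text. The names `STRICT_VIN`, `RELAXED_VIN`, and `find_vin_candidates` are hypothetical, and treating a candidate as a run bounded by non-alphanumeric characters is an assumption about how the OCR output is tokenized.

```python
import re

# A candidate must not be part of a longer alphanumeric run, hence the lookarounds.
STRICT_VIN = re.compile(r'(?<![A-Z0-9])[A-HJ-NPR-Z0-9]{17}(?![A-Z0-9])')
RELAXED_VIN = re.compile(r'(?<![A-Z0-9])[A-HJ-NPR-Z0-9]{11,17}(?![A-Z0-9])')


def find_vin_candidates(ocr_text: str, relaxed: bool = False) -> list[str]:
    """Return VIN-shaped tokens found in OCR text, in order of appearance."""
    pattern = RELAXED_VIN if relaxed else STRICT_VIN
    return pattern.findall(ocr_text.upper())


modern = find_vin_candidates("VIN: 1HGBH41JXMN109186 MFD 01/91")
older = find_vin_candidates("SERIAL 5Y86H123456 OK", relaxed=True)
```

Both patterns exclude I, O, and Q, so a token containing one of those letters is rejected unless OCR-error correction runs first.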

VIN Validation Rules

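import re

VIN_PATTERN = r'^[A-HJ-NPR-Z0-9]{17}$'
TRANSLITERATION = {'I': '1', 'O': '0', 'Q': '0'}  # Common OCR errors
# Check-digit tables: numeric value of each letter, and weight of each position
LETTER_VALUES = dict(zip('ABCDEFGHJKLMNPRSTUVWXYZ', map(int, '12345678123457923456789')))
WEIGHTS = [8, 7, 6, 5, 4, 3, 2, 10, 0, 9, 8, 7, 6, 5, 4, 3, 2]

def validate_vin(vin: str) -> tuple[bool, float]:
    """Return (is_valid, confidence_adjustment); the adjustment values are placeholders."""
    if not re.match(VIN_PATTERN, vin):  # covers both length and character set
        return False, -0.5
    total = sum((int(c) if c.isdigit() else LETTER_VALUES[c]) * w for c, w in zip(vin, WEIGHTS))
    expected = 'X' if total % 11 == 10 else str(total % 11)
    return (True, 0.1) if vin[8] == expected else (False, -0.3)  # position 9 is index 8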
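
One way the OCR-error mapping could be applied before validation is sketched below. The function name `normalize_ocr_vin` and the decision to drop spaces, hyphens, and other non-alphanumeric characters are assumptions for illustration; only the I, O, and Q substitutions come from this issue.

```python
# Same substitutions as the mapping in this issue, in str.translate form.
OCR_FIXES = str.maketrans({'I': '1', 'O': '0', 'Q': '0'})


def normalize_ocr_vin(raw: str) -> str:
    """Uppercase, drop non-alphanumeric characters, and apply the I/O/Q fixes."""
    cleaned = ''.join(ch for ch in raw.upper() if ch.isalnum())
    return cleaned.translate(OCR_FIXES)


# The second alternative from the sample response contains the letter O.
fixed = normalize_ocr_vin("1HGBH41JXMN1O9186")
```

After normalization the misread alternative collapses into the primary result, so a caller would likely deduplicate candidates before returning `alternatives`.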

Directory Structure

ocr/app/
├── extractors/
│   ├── __init__.py
│   ├── base.py           # Base extractor class
│   └── vin_extractor.py  # VIN-specific logic
├── preprocessors/
│   ├── __init__.py
│   └── vin_preprocessor.py  # VIN-optimized preprocessing
└── validators/
    ├── __init__.py
    └── vin_validator.py  # VIN format validation

Test Cases

| Input | Expected Output |
|-------|-----------------|
| Clear VIN sticker photo | Correct VIN, confidence > 90% |
| Angled VIN plate | Correct VIN after deskew, confidence > 80% |
| Dashboard VIN through windshield | Correct VIN, confidence > 70% |
| Low light VIN photo | Best effort, confidence may be lower |
| Non-VIN image | success: false, no VIN returned |
| Partial VIN visible | Partial match with low confidence |

Acceptance Criteria

  • Endpoint accepts HEIC, JPEG, PNG images
  • HEIC conversion works correctly
  • Preprocessing improves OCR accuracy
  • VIN pattern extraction returns valid VIN
  • Check digit validation implemented
  • Common OCR errors corrected (I→1, O→0, Q→0)
  • Confidence scoring reflects quality
  • Alternatives returned when ambiguous
  • Processing time < 3 seconds
  • Handles edge cases gracefully
  • Unit tests for validation logic
  • Integration tests with sample images

Technical Notes

  • Tesseract page segmentation mode (PSM): 7 (single text line) or 8 (single word)
  • Character whitelist: ABCDEFGHJKLMNPRSTUVWXYZ0123456789
  • Consider region-of-interest detection to crop VIN area
  • Log OCR attempts for debugging (redact actual VIN in production)
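
The PSM, whitelist, and log-redaction notes above might translate into helpers like these. `tesseract_config` and `redact_vin` are hypothetical names, and keeping only the last four characters visible in logs is an assumed policy, not something this issue specifies.

```python
VIN_WHITELIST = "ABCDEFGHJKLMNPRSTUVWXYZ0123456789"


def tesseract_config(psm: int = 7) -> str:
    """Build a Tesseract option string for single-line (7) or single-word (8) input."""
    if psm not in (7, 8):
        raise ValueError("expected PSM 7 (single text line) or 8 (single word)")
    return f"--psm {psm} -c tessedit_char_whitelist={VIN_WHITELIST}"


def redact_vin(vin: str, visible: int = 4) -> str:
    """Mask all but the last `visible` characters before the value reaches a log."""
    if len(vin) <= visible:
        return "*" * len(vin)
    return "*" * (len(vin) - visible) + vin[-visible:]


config = tesseract_config()
masked = redact_vin("1HGBH41JXMN109186")
```

If the service calls Tesseract through pytesseract, the string can be passed as the `config` argument of `pytesseract.image_to_string`. Whether the whitelist is honored may depend on the Tesseract version and OCR engine in use, so it is worth verifying against the deployed container.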

Out of Scope

  • Camera capture UI (see #12c)
  • VehicleForm integration (see #12e)
  • NHTSA decode (existing feature, just wire up)
  • PaddleOCR fallback (add later only if Tesseract proves insufficient)
egullickson added the status/backlog and type/feature labels 2026-02-01 18:48:36 +00:00
egullickson added status/in-progress and removed status/backlog labels 2026-02-02 01:26:05 +00:00
egullickson added status/review and removed status/in-progress labels 2026-02-02 01:32:03 +00:00
Reference: egullickson/motovaultpro#67