fix: VIN OCR scanning fails with "No VIN Pattern found" on all images (#113) #114
Fixes #113
Summary
VIN scanning from the "Add Vehicle" screen failed on all images with "No VIN Pattern found in image". Root cause: Tesseract fragments VINs into multiple words (e.g., "1HGBH 41JXMN 109186"), but candidate extraction required a single contiguous 17-character sequence, so every candidate was rejected.
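To illustrate the root cause, here is a minimal sketch of candidate extraction that tolerates fragmented OCR output. It is not the code from `vin_validator.py`; the function name `find_vin_candidates` and the `max_fragments` parameter are hypothetical. It assumes the standard VIN alphabet (digits and capital letters except I, O, and Q) and does not verify the check digit.

```python
import re

VIN_LENGTH = 17
# Runs of characters that are legal in a VIN (no I, O, or Q).
TOKEN_RE = re.compile(r"[A-HJ-NPR-Z0-9]+")


def find_vin_candidates(ocr_text, max_fragments=4):
    """Return 17-character VIN candidates found in raw OCR text.

    Hypothetical helper: tries whole tokens first, then joins
    adjacent tokens in a sliding window.
    """
    tokens = TOKEN_RE.findall(ocr_text.upper())
    candidates = []

    # Strategy 1: a single token that is already 17 characters long.
    for token in tokens:
        if len(token) == VIN_LENGTH and token not in candidates:
            candidates.append(token)

    # Strategy 2: concatenate 2..max_fragments adjacent tokens.
    for start in range(len(tokens)):
        for size in range(2, max_fragments + 1):
            joined = "".join(tokens[start:start + size])
            if len(joined) == VIN_LENGTH and joined not in candidates:
                candidates.append(joined)
            if len(joined) >= VIN_LENGTH:
                break  # adding more fragments can only overshoot

    return candidates


if __name__ == "__main__":
    print(find_vin_candidates("1HGBH 41JXMN 109186"))
```

With the fragmented example from the summary, the first strategy finds nothing, and the second recovers the VIN by joining the three adjacent fragments.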
Changes
- vin_validator.py: Two-strategy approach: continuous run matching plus sliding-window concatenation of adjacent OCR fragments
- vin_extractor.py: Disable dictionaries, set LSTM engine (OEM 1)
- vin_extractor.py: Sparse text and raw line modes for difficult images
- vin_preprocessor.py: Upscale images below 600px width for Tesseract accuracy
- vin_preprocessor.py: Alternative preprocessing fallback when adaptive thresholding fails
- vin_validator.py: Remove incorrect B->8 and S->5 mappings (B and S are valid VIN characters)
Test Plan
Save original, adaptive, and Otsu preprocessed images to /tmp/vin-debug/{timestamp}/ when LOG_LEVEL is set to debug. No images saved at info level. Volume mount added for access.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
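The debug-image behaviour described in this commit message could look roughly like the sketch below. It is an illustration only: the function name `save_debug_images`, its parameters, the `.png` suffix, and the timestamp format are assumptions, and the images are passed in as already-encoded bytes so the example needs only the standard library.

```python
import os
import time
from pathlib import Path


def save_debug_images(images, base_dir="/tmp/vin-debug", log_level=None):
    """Write preprocessing stages to disk only when LOG_LEVEL is debug.

    `images` maps a stage name (e.g. "original", "adaptive", "otsu")
    to encoded image bytes. Returns the output directory, or None
    when nothing was written. Hypothetical helper, not the PR's code.
    """
    if log_level is None:
        log_level = os.environ.get("LOG_LEVEL", "info")
    if log_level.lower() != "debug":
        return None  # info level and above: save nothing

    # Assumed timestamp format; the PR only specifies {timestamp}.
    out_dir = Path(base_dir) / time.strftime("%Y%m%d-%H%M%S")
    out_dir.mkdir(parents=True, exist_ok=True)
    for name, data in images.items():
        (out_dir / f"{name}.png").write_bytes(data)
    return out_dir
```

Returning `None` at non-debug levels keeps the call site free of log-level checks.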