motovaultpro

Author	SHA1	Message	Date
Eric Gullickson	b9fe222f12	fix: build errors and Tesseract removal (CI: some checks failed; Build Images failing after 4m14s)	2026-02-07 12:12:04 -06:00
Eric Gullickson	013fb0c67a	feat: migrate VIN/receipt extractors and OCR service to engine abstraction (refs #117). Replace direct pytesseract calls with the OcrEngine interface in vin_extractor.py, receipt_extractor.py, and ocr_service.py. Replace PSM mode fallbacks with engine-agnostic single-line/single-word configs. Remove dead _process_ocr_data. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>	2026-02-07 10:56:27 -06:00
Eric Gullickson	432b3bda36	fix: remove char whitelist incompatible with Tesseract LSTM (refs #113). tessedit_char_whitelist does not work with OEM 1 (LSTM engine) and causes empty or erratic output. This was the root cause of Tesseract returning empty text despite clear, well-preprocessed images. Character filtering is already handled post-OCR by the VIN validator's correct_ocr_errors() method (I->1, O->0, Q->0, etc.). Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> (CI: all checks successful)	2026-02-06 21:52:08 -06:00
Eric Gullickson	63c027a454	fix: always use min-channel and add grayscale-only OCR path (refs #113). Two fixes: (1) Always use min-channel for color images instead of the gated comparison, which was falling back to standard grayscale (only 23% contrast for white-on-green VIN stickers). (2) Add a grayscale-only OCR path (CLAHE + denoise, no thresholding) between the adaptive and Otsu attempts. Tesseract's LSTM engine is designed to handle grayscale input directly and often outperforms binarized input where thresholding creates artifacts. Pipeline order: adaptive threshold → grayscale-only → Otsu threshold. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> (CI: all checks successful)	2026-02-06 21:32:52 -06:00
Eric Gullickson	ff3858f750	fix: add debug image saving gated on LOG_LEVEL=debug (refs #113). Save original, adaptive, and Otsu preprocessed images to /tmp/vin-debug/{timestamp}/ when LOG_LEVEL is set to debug. No images are saved at info level. Volume mount added for access. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> (CI: all checks successful)	2026-02-06 20:26:06 -06:00
Eric Gullickson	d5696320f1	fix: align VIN OCR logging with unified logging design (refs #113). Replace the filesystem-based debug system (VIN_DEBUG_DIR) with standard logger.debug() calls that flow through Loki when LOG_LEVEL=DEBUG. Use the .env.logging variable for the OCR LOG_LEVEL. Increase image capture quality to 0.95 for better OCR accuracy. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> (CI: all checks successful)	2026-02-06 19:36:35 -06:00
Eric Gullickson	6a4c2137f7	fix: resolve VIN OCR scanning failures on all images (refs #113). Root cause: Tesseract fragments VINs into multiple words, but candidate extraction required continuous 17-char sequences and so rejected all results. Changes: fix candidate extraction to concatenate adjacent OCR fragments; disable Tesseract dictionaries (VINs are not dictionary words); set OEM 1 (LSTM engine) for better accuracy; add PSM 11 (sparse text) and PSM 13 (raw line) fallback modes; add Otsu's thresholding as an alternative preprocessing pipeline; upscale small images to meet Tesseract's 300 DPI requirement; remove incorrect B->8 and S->5 transliterations (B and S are valid VIN chars); fix pre-existing test bug in check digit expected value. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> (CI: all checks successful)	2026-02-06 15:57:14 -06:00
Eric Gullickson	54cbd49171	feat: add VIN photo OCR pipeline (refs #67). Implement VIN-specific OCR extraction with optimized preprocessing: add POST /extract/vin endpoint for VIN extraction; VIN preprocessor: CLAHE, deskew, denoise, adaptive threshold; VIN validator: check digit validation, OCR error correction (I->1, O->0); VIN extractor: PSM modes 6/7/8, character whitelist, alternatives; response includes confidence, bounding box, and alternatives; unit tests for validator and preprocessor; integration tests for VIN extraction endpoint. Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com> (CI: all checks successful)	2026-02-01 19:31:36 -06:00

8 Commits
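
Commits 6a4c2137f7 and 432b3bda36 describe joining adjacent OCR fragments into 17-character candidates, correcting I->1, O->0, Q->0 after OCR, and validating the check digit. The following is a minimal sketch of that idea, not the repository's code: the names `vin_candidates` and `check_digit_ok` are hypothetical, and the weights and letter values are assumed to be the standard North American VIN check-digit scheme.

```python
import re

# Letter values for the standard VIN check digit (I, O, Q never appear in a VIN).
_VALUES = {
    **{str(d): d for d in range(10)},
    **dict(zip("ABCDEFGH", range(1, 9))),
    **dict(zip("JKLMN", range(1, 6))),
    "P": 7,
    "R": 9,
    **dict(zip("STUVWXYZ", range(2, 10))),
}
# Position weights; position 9 (index 8) holds the check digit and has weight 0.
_WEIGHTS = (8, 7, 6, 5, 4, 3, 2, 10, 0, 9, 8, 7, 6, 5, 4, 3, 2)
# Post-OCR corrections named in the commit messages.
_FIXES = str.maketrans({"I": "1", "O": "0", "Q": "0"})


def check_digit_ok(vin: str) -> bool:
    """Return True if a 17-character VIN carries a consistent check digit."""
    if len(vin) != 17 or any(c not in _VALUES for c in vin):
        return False
    remainder = sum(_VALUES[c] * w for c, w in zip(vin, _WEIGHTS)) % 11
    expected = "X" if remainder == 10 else str(remainder)
    return vin[8] == expected


def vin_candidates(fragments: list[str]) -> list[str]:
    """Join adjacent OCR fragments and return 17-char windows that validate."""
    cleaned = [re.sub(r"[^A-Z0-9]", "", f.upper()).translate(_FIXES) for f in fragments]
    cleaned = [f for f in cleaned if f]
    seen: list[str] = []
    for start in range(len(cleaned)):
        joined = ""
        # Append following fragments until there is enough text for one VIN.
        for frag in cleaned[start:]:
            joined += frag
            if len(joined) >= 17:
                break
        # Slide a 17-character window over the joined text.
        for i in range(len(joined) - 16):
            window = joined[i:i + 17]
            if window not in seen:
                seen.append(window)
    return [v for v in seen if check_digit_ok(v)]
```

For example, `vin_candidates(["1M8GDM9A", "XKPO42788"])` joins the two fragments, corrects the misread `O` to `0`, and returns the single candidate that passes the check digit. Filtering after OCR in this way is also what lets the extractor drop the character whitelist.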