motovaultpro

Author	SHA1	Message	Date
Eric Gullickson	3eb54211cb	feat: add owner's manual OCR pipeline (refs #71). Implement async PDF processing for owner's manuals with maintenance schedule extraction: - Add PDF preprocessor with PyMuPDF for text/scanned PDF handling - Add maintenance pattern matching (mileage, time, fluid specs) - Add service name mapping to maintenance subtypes - Add table detection and parsing for schedule tables - Add manual extractor orchestrating the complete pipeline - Add POST /extract/manual endpoint for async job submission - Add Redis job queue support for manual extraction jobs - Add progress tracking during processing. Processing pipeline: 1. Analyze PDF structure (text layer vs scanned) 2. Find maintenance schedule sections 3. Extract text or OCR scanned pages at 300 DPI 4. Detect and parse maintenance tables 5. Normalize service names and extract intervals 6. Return structured maintenance schedules with confidence scores. Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>	2026-02-01 21:30:20 -06:00
Eric Gullickson	6319d50fb1	feat: add receipt OCR pipeline (refs #69). Implement receipt-specific OCR extraction for fuel receipts: - Pattern matching modules for date, currency, and fuel data extraction - Receipt-optimized image preprocessing for thermal receipts - POST /extract/receipt endpoint with field extraction - Confidence scoring per extracted field - Cross-validation of fuel receipt data - Unit tests for all pattern matchers. Extracted fields: merchantName, transactionDate, totalAmount, fuelQuantity, pricePerUnit, fuelGrade. Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>	2026-02-01 20:43:30 -06:00
Eric Gullickson	54cbd49171	feat: add VIN photo OCR pipeline (refs #67). Implement VIN-specific OCR extraction with optimized preprocessing: - Add POST /extract/vin endpoint for VIN extraction - VIN preprocessor: CLAHE, deskew, denoise, adaptive threshold - VIN validator: check digit validation, OCR error correction (I->1, O->0) - VIN extractor: PSM modes 6/7/8, character whitelist, alternatives - Response includes confidence, bounding box, and alternatives - Unit tests for validator and preprocessor - Integration tests for VIN extraction endpoint. Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>	2026-02-01 19:31:36 -06:00
Eric Gullickson	852c9013b5	feat: add core OCR API integration (refs #65). OCR Service (Python/FastAPI): - POST /extract for synchronous OCR extraction - POST /jobs and GET /jobs/{job_id} for async processing - Image preprocessing (deskew, denoise) for accuracy - HEIC conversion via pillow-heif - Redis job queue for async processing. Backend (Fastify): - POST /api/ocr/extract - authenticated proxy to OCR - POST /api/ocr/jobs - async job submission - GET /api/ocr/jobs/:jobId - job polling - Multipart file upload handling - JWT authentication required. File size limits: 10MB sync, 200MB async. Processing time target: <3 seconds for typical photos. Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>	2026-02-01 16:02:11 -06:00
Eric Gullickson	1ba491144b	feat: add OCR service container (refs #64). Add Python-based OCR service container (mvp-ocr) as the 6th service: - Python 3.11-slim with FastAPI/uvicorn - Tesseract OCR with English language pack - pillow-heif for HEIC image support - opencv-python-headless for image preprocessing - Health endpoint at /health - Unit tests for health, HEIC support, and Tesseract availability. Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>	2026-02-01 13:06:16 -06:00

5 Commits
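
Commit 54cbd49171 mentions a VIN validator with check digit validation and OCR error correction (I->1, O->0). The repository code is not shown here, so the following is only a minimal sketch of how such a validator might look, using the standard North American VIN check digit scheme (position 9, weighted sum modulo 11). The names `correct_ocr`, `check_digit`, and `is_valid_vin` are hypothetical and are not taken from the repository.

```python
# Sketch of VIN check digit validation with OCR error correction.
# All function and constant names are illustrative assumptions.

# The letters I and O never appear in a VIN, so OCR output containing
# them is mapped to the look-alike digits, as the commit message describes.
OCR_FIXES = str.maketrans({"I": "1", "O": "0"})

# Position weights for the 17 characters; position 9 (index 8) is the
# check digit itself and carries weight 0.
WEIGHTS = (8, 7, 6, 5, 4, 3, 2, 10, 0, 9, 8, 7, 6, 5, 4, 3, 2)

# Transliteration of letters to numeric values (I, O, Q are not allowed).
LETTER_VALUES = dict(zip("ABCDEFGH", range(1, 9)))
LETTER_VALUES.update(zip("JKLMN", range(1, 6)))
LETTER_VALUES.update({"P": 7, "R": 9})
LETTER_VALUES.update(zip("STUVWXYZ", range(2, 10)))


def correct_ocr(raw: str) -> str:
    """Normalize OCR text and apply the I->1, O->0 substitutions."""
    return raw.strip().upper().translate(OCR_FIXES)


def check_digit(vin: str) -> str:
    """Compute the expected check digit for a 17-character VIN."""
    if len(vin) != 17:
        raise ValueError("VIN must have exactly 17 characters")
    total = 0
    for ch, weight in zip(vin, WEIGHTS):
        if ch in "0123456789":
            value = int(ch)
        elif ch in LETTER_VALUES:
            value = LETTER_VALUES[ch]
        else:
            raise ValueError(f"invalid VIN character: {ch!r}")
        total += value * weight
    remainder = total % 11
    return "X" if remainder == 10 else str(remainder)


def is_valid_vin(raw: str) -> bool:
    """Return True if the OCR-corrected text passes the check digit test."""
    vin = correct_ocr(raw)
    try:
        return check_digit(vin) == vin[8]
    except ValueError:
        return False
```

For example, `is_valid_vin("IM8GDM9AXKP042788")` corrects the leading `I` to `1` before validating. A real validator would probably also generate the "alternatives" the commit mentions, which this sketch does not attempt.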