feat: Receipt OCR Pipeline (#69) #77

Merged
egullickson merged 1 commit from issue-69-receipt-ocr-pipeline into main 2026-02-02 02:47:52 +00:00
Owner

Summary

  • Implements receipt-specific OCR extraction for fuel receipts
  • Pattern matching modules for date, currency, and fuel data extraction
  • Receipt-optimized image preprocessing for thermal receipts
  • POST /extract/receipt endpoint with field confidence scoring
  • Cross-validation of extracted fuel receipt data

Changes

New Files

  • ocr/app/patterns/ - Pattern matching modules
    • date_patterns.py - Multiple date format support (MM/DD/YYYY, Mon DD YYYY, etc.)
    • currency_patterns.py - Total/amount extraction (TOTAL, SALE, AMOUNT DUE, etc.)
    • fuel_patterns.py - Gallons, price per unit, fuel grade extraction
  • ocr/app/preprocessors/receipt_preprocessor.py - Thermal receipt optimizations
  • ocr/app/extractors/receipt_extractor.py - Main receipt extraction logic
  • ocr/app/extractors/fuel_receipt.py - Fuel-specific validation
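
To illustrate what the pattern modules listed above might look like, here is a minimal sketch of regex-based extraction for a date, a total, and a gallon quantity. The regular expressions and the function names (`find_date`, `find_total`, `find_gallons`) are assumptions made for illustration; the PR does not show the actual contents of `date_patterns.py`, `currency_patterns.py`, or `fuel_patterns.py`.

```python
import re
from datetime import date, datetime
from decimal import Decimal
from typing import Optional

# Hypothetical patterns; the real modules may use different expressions.
_NUMERIC_DATE = re.compile(r"\b(\d{1,2})/(\d{1,2})/(\d{4})\b")        # MM/DD/YYYY
_NAMED_DATE = re.compile(r"\b([A-Za-z]{3})\s+(\d{1,2}),?\s+(\d{4})\b")  # Mon DD YYYY
_TOTAL = re.compile(
    r"\b(?:TOTAL|SALE|AMOUNT DUE)\b\s*:?\s*\$?\s*(\d+\.\d{2})", re.IGNORECASE
)
_GALLONS = re.compile(r"(\d+\.\d{1,3})\s*(?:GALLONS|GAL)\b", re.IGNORECASE)


def find_date(text: str) -> Optional[date]:
    """Return the first recognizable date in the OCR text, or None."""
    m = _NUMERIC_DATE.search(text)
    if m:
        month, day, year = (int(g) for g in m.groups())
        try:
            return date(year, month, day)
        except ValueError:
            pass  # e.g. month 13; fall through to the named-month format
    m = _NAMED_DATE.search(text)
    if m:
        try:
            return datetime.strptime(" ".join(m.groups()), "%b %d %Y").date()
        except ValueError:
            pass  # three letters that are not a month abbreviation
    return None


def find_total(text: str) -> Optional[Decimal]:
    """Return the amount following a TOTAL / SALE / AMOUNT DUE keyword."""
    m = _TOTAL.search(text)
    return Decimal(m.group(1)) if m else None


def find_gallons(text: str) -> Optional[Decimal]:
    """Return the quantity that precedes a GAL / GALLONS unit label."""
    m = _GALLONS.search(text)
    return Decimal(m.group(1)) if m else None


sample = "SHELL #1234\n02/01/2026 14:32\nREGULAR 12.345 GAL @ $3.699\nTOTAL $45.66"
print(find_date(sample), find_total(sample), find_gallons(sample))
```

Amounts are parsed as `Decimal` rather than `float` so that currency values keep their printed precision; whether the real modules do the same is not stated in this PR.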

API Changes

  • Added POST /extract/receipt endpoint
  • Accepts HEIC, JPEG, PNG images
  • Returns: merchantName, transactionDate, totalAmount, fuelQuantity, pricePerUnit, fuelGrade
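
The returned fields lend themselves to a simple consistency check. The sketch below assumes that cross-validation means comparing `totalAmount` against `fuelQuantity × pricePerUnit` within a small tolerance; the PR does not spell out the rule, and the function name `cross_validate` and the two-cent default tolerance are hypothetical.

```python
from decimal import Decimal
from typing import Optional


def cross_validate(
    total_amount: Optional[Decimal],
    fuel_quantity: Optional[Decimal],
    price_per_unit: Optional[Decimal],
    tolerance: Decimal = Decimal("0.02"),  # assumed rounding allowance
) -> Optional[bool]:
    """Check that total is approximately quantity * price.

    Returns None when any field is missing, so the caller can tell
    "could not check" apart from "checked and failed".
    """
    if total_amount is None or fuel_quantity is None or price_per_unit is None:
        return None
    expected = fuel_quantity * price_per_unit
    return abs(expected - total_amount) <= tolerance


# 12.345 gal at $3.699/gal is $45.664..., which rounds to the printed $45.66.
print(cross_validate(Decimal("45.66"), Decimal("12.345"), Decimal("3.699")))
```

A failed check could be used to lower the confidence of the affected fields rather than to reject the receipt outright, since any one of the three values may have been misread.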

Test Plan

  • Run unit tests for pattern matchers
  • Test with sample fuel receipts (clear, faded, crumpled)
  • Verify processing time <3 seconds
  • Test HEIC conversion from iPhone photos
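
The processing-time item could be checked with a small timing helper like the one below. This is only a sketch: `timed` is a hypothetical helper, and the stand-in lambda would be replaced by a call into the actual extraction code, whose entry point is not shown in this PR.

```python
import time
from typing import Callable, Tuple, TypeVar

T = TypeVar("T")


def timed(fn: Callable[[], T]) -> Tuple[T, float]:
    """Run fn once and return (result, elapsed wall-clock seconds)."""
    start = time.perf_counter()
    result = fn()
    return result, time.perf_counter() - start


# Stand-in workload; swap in the real receipt extraction call.
result, elapsed = timed(lambda: sum(range(1000)))
assert elapsed < 3.0, f"processing took {elapsed:.2f}s, expected under 3s"
```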

Closes #69

Generated with Claude Code

egullickson added 1 commit 2026-02-02 02:43:48 +00:00
feat: add receipt OCR pipeline (refs #69)
All checks were successful
Deploy to Staging / Build Images (pull_request) Successful in 32s
Deploy to Staging / Deploy to Staging (pull_request) Successful in 31s
Deploy to Staging / Verify Staging (pull_request) Successful in 2m20s
Deploy to Staging / Notify Staging Ready (pull_request) Successful in 8s
Deploy to Staging / Notify Staging Failure (pull_request) Has been skipped
6319d50fb1
Implement receipt-specific OCR extraction for fuel receipts:

- Pattern matching modules for date, currency, and fuel data extraction
- Receipt-optimized image preprocessing for thermal receipts
- POST /extract/receipt endpoint with field extraction
- Confidence scoring per extracted field
- Cross-validation of fuel receipt data
- Unit tests for all pattern matchers

Extracted fields: merchantName, transactionDate, totalAmount,
fuelQuantity, pricePerUnit, fuelGrade

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
egullickson merged commit 2b9a0608f3 into main 2026-02-02 02:47:52 +00:00
egullickson deleted branch issue-69-receipt-ocr-pipeline 2026-02-02 02:47:53 +00:00
Reference: egullickson/motovaultpro#77