feat: add maintenance receipt extraction pipeline with Gemini + regex (refs #150)

- New MaintenanceReceiptExtractor: Gemini-primary extraction with regex
  cross-validation for dates, amounts, and odometer readings
- New maintenance_receipt_validation.py: cross-validation patterns for
  structured field confidence adjustment
- New POST /extract/maintenance-receipt endpoint reusing
  ReceiptExtractionResponse model
- Per-field confidence scores (0.0-1.0): each Gemini-extracted field
  starts at a base of 0.85, raised when the regex cross-check agrees
  and lowered when it disagrees
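
The confidence adjustment described above can be sketched roughly as follows. This is an illustration only, not the code in this commit: the 0.85 base comes from the message, while the boost and penalty sizes, the amount pattern, and the names `regex_amounts` and `adjust_confidence` are assumptions made for the example.

```python
import re
from typing import Optional

GEMINI_BASE_CONFIDENCE = 0.85  # base value stated in the commit message
AGREEMENT_BOOST = 0.10         # assumed size of the boost
DISAGREEMENT_PENALTY = 0.25    # assumed size of the reduction

# Hypothetical pattern: a dollar sign followed by digits, optional
# thousands separators, and optional cents.
AMOUNT_PATTERN = re.compile(r"\$\s*(\d[\d,]*(?:\.\d{2})?)")


def regex_amounts(raw_text: str) -> list[float]:
    """Return every dollar amount the pattern finds in the OCR text."""
    return [float(m.replace(",", "")) for m in AMOUNT_PATTERN.findall(raw_text)]


def adjust_confidence(gemini_value: Optional[float], raw_text: str) -> float:
    """Score one Gemini-extracted amount against the regex candidates."""
    if gemini_value is None:
        return 0.0  # nothing was extracted, so there is nothing to trust
    candidates = regex_amounts(raw_text)
    if not candidates:
        # Regex found nothing to compare against: keep the base score.
        return GEMINI_BASE_CONFIDENCE
    if any(abs(c - gemini_value) < 0.005 for c in candidates):
        confidence = GEMINI_BASE_CONFIDENCE + AGREEMENT_BOOST
    else:
        confidence = GEMINI_BASE_CONFIDENCE - DISAGREEMENT_PENALTY
    # Clamp to the documented 0.0-1.0 range.
    return round(max(0.0, min(1.0, confidence)), 2)
```

The same shape would apply to dates and odometer readings with a different pattern per field; how the real validator treats a missing regex match is not stated in the message, so leaving the base score unchanged in that case is a guess.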

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Eric Gullickson
2026-02-12 21:14:13 -06:00
parent 0e97128a31
commit 90401dc1ba
5 changed files with 713 additions and 0 deletions


@@ -8,6 +8,10 @@ from app.extractors.receipt_extractor import (
     ExtractedField,
 )
 from app.extractors.fuel_receipt import FuelReceiptExtractor, fuel_receipt_extractor
+from app.extractors.maintenance_receipt_extractor import (
+    MaintenanceReceiptExtractor,
+    maintenance_receipt_extractor,
+)
 from app.extractors.manual_extractor import (
     ManualExtractor,
     manual_extractor,
@@ -27,6 +31,8 @@ __all__ = [
     "ExtractedField",
     "FuelReceiptExtractor",
     "fuel_receipt_extractor",
+    "MaintenanceReceiptExtractor",
+    "maintenance_receipt_extractor",
     "ManualExtractor",
     "manual_extractor",
     "ManualExtractionResult",