feat: add receipt classifier and OCR integration (refs #157)

- New ReceiptClassifier module with keyword-based classification for
  fuel vs maintenance receipts from email text and OCR raw text
- Classifier-first pipeline: classify from email subject/body keywords
  before falling back to OCR-based classification
- Fuel keywords: gas, fuel, gallons, octane, pump, diesel, unleaded,
  shell, chevron, exxon, bp
- Maintenance keywords: oil change, brake, alignment, tire, rotation,
  inspection, labor, parts, service, repair, transmission, coolant
- Confident classification (>= 2 keyword matches) routes to the specific
  OCR endpoint; unclassified receipts fall back to both endpoints plus
  rawText classification and a field-count heuristic
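
The keyword rules above can be sketched roughly as follows. This is a minimal illustration, not the code in this commit: the names `classifyText`, `chooseOcrRoute`, `KeywordClassification`, and the route labels are hypothetical, and whole-word matching and the tie-handling rule are assumptions the commit message does not specify.

```typescript
// Hypothetical sketch of keyword-based receipt classification.
// All names here are illustrative assumptions, not the commit's actual API.
type ReceiptKind = 'fuel' | 'maintenance' | 'unclassified';

interface KeywordClassification {
  kind: ReceiptKind;
  fuelMatches: number;
  maintenanceMatches: number;
}

// Keyword lists copied from the commit message.
const FUEL_KEYWORDS = [
  'gas', 'fuel', 'gallons', 'octane', 'pump', 'diesel', 'unleaded',
  'shell', 'chevron', 'exxon', 'bp',
];
const MAINTENANCE_KEYWORDS = [
  'oil change', 'brake', 'alignment', 'tire', 'rotation', 'inspection',
  'labor', 'parts', 'service', 'repair', 'transmission', 'coolant',
];

// "Confident" means at least two distinct keyword matches.
const CONFIDENCE_THRESHOLD = 2;

// Counts how many distinct keywords occur in the text.
// Assumption: case-insensitive, whole-word matching, so that short
// keywords such as "bp" or "gas" do not match inside longer words.
function countMatches(text: string, keywords: string[]): number {
  const haystack = text.toLowerCase();
  return keywords.filter((kw) => new RegExp(`\\b${kw}\\b`).test(haystack)).length;
}

function classifyText(text: string): KeywordClassification {
  const fuelMatches = countMatches(text, FUEL_KEYWORDS);
  const maintenanceMatches = countMatches(text, MAINTENANCE_KEYWORDS);

  // Assumption: a category wins only if it meets the threshold and
  // strictly outscores the other one; ties stay unclassified.
  let kind: ReceiptKind = 'unclassified';
  if (fuelMatches >= CONFIDENCE_THRESHOLD && fuelMatches > maintenanceMatches) {
    kind = 'fuel';
  } else if (
    maintenanceMatches >= CONFIDENCE_THRESHOLD &&
    maintenanceMatches > fuelMatches
  ) {
    kind = 'maintenance';
  }
  return { kind, fuelMatches, maintenanceMatches };
}

// Classifier-first routing: a confident result selects one OCR endpoint,
// anything else signals the "both endpoints" fallback path.
type OcrRoute = 'fuel-endpoint' | 'maintenance-endpoint' | 'both-endpoints';

function chooseOcrRoute(emailText: string): OcrRoute {
  const { kind } = classifyText(emailText);
  if (kind === 'fuel') return 'fuel-endpoint';
  if (kind === 'maintenance') return 'maintenance-endpoint';
  return 'both-endpoints';
}
```

Under these assumptions, an email mentioning "gallons" and "unleaded" would go straight to the fuel endpoint, while an email with no keywords would take the fallback path, where the same `classifyText` could be applied to the OCR raw text.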

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Eric Gullickson
2026-02-13 08:44:03 -06:00
parent e7f3728771
commit d9a40f7d37
4 changed files with 234 additions and 32 deletions

@@ -6,14 +6,17 @@
export { emailIngestionWebhookRoutes } from './api/email-ingestion.routes';
export { EmailIngestionService } from './domain/email-ingestion.service';
export { EmailIngestionRepository } from './data/email-ingestion.repository';
export { ReceiptClassifier } from './domain/receipt-classifier';
export { ResendInboundClient } from './external/resend-inbound.client';
export type { ParsedEmailResult, ParsedEmailAttachment } from './external/resend-inbound.client';
export type {
  ClassificationResult,
  EmailIngestionQueueRecord,
  EmailIngestionStatus,
  EmailProcessingResult,
  ExtractedReceiptData,
  PendingVehicleAssociation,
  ReceiptClassificationType,
  ResendWebhookEvent,
  ResendWebhookEventData,
} from './domain/email-ingestion.types';