docs: update CLAUDE.md indexes and README for OCR expansion (refs #137)

Add/update documentation across backend, Python OCR service, and frontend
for receipt scanning, manual extraction, and Gemini integration. Create
new CLAUDE.md files for engines/, fuel-logs/, documents/, and maintenance/
features.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Eric Gullickson
2026-02-11 11:04:19 -06:00
parent 40df5e5b58
commit ab0d8463be
11 changed files with 385 additions and 45 deletions


@@ -14,7 +14,7 @@
| `config/` | Configuration loading (env, database, redis) | Environment setup, connection pools |
| `logging/` | Winston structured logging | Log configuration, debugging |
| `middleware/` | Fastify middleware | Request processing, user extraction |
| `plugins/` | Fastify plugins (auth, error, logging) | Plugin registration, hooks |
| `plugins/` | Fastify plugins (auth, error, logging, tier guard) | Plugin registration, hooks, tier gating |
| `scheduler/` | Job scheduling infrastructure | Scheduled tasks, cron jobs |
| `storage/` | Storage abstraction and adapters | File storage, S3/filesystem |
| `user-preferences/` | User preferences data and migrations | User settings storage |


@@ -12,7 +12,7 @@
| `fuel-logs/` | Fuel consumption tracking | Fuel log CRUD, statistics |
| `maintenance/` | Maintenance record management | Service records, reminders |
| `notifications/` | Email and push notifications | Alert system, email templates |
| `ocr/` | OCR proxy to mvp-ocr service | Image text extraction, async jobs |
| `ocr/` | OCR proxy to mvp-ocr service (VIN, receipt, manual extraction) | Image text extraction, receipt scanning, manual PDF extraction, async jobs |
| `onboarding/` | User onboarding flow | First-time user setup |
| `ownership-costs/` | Ownership cost tracking and reports | Cost aggregation, expense analysis |
| `platform/` | Vehicle data and VIN decoding | Make/model lookup, VIN validation |


@@ -1,16 +1,47 @@
# ocr/
Backend proxy for the Python OCR microservice. Handles authentication, tier gating, file validation, and request forwarding for VIN extraction, fuel receipt scanning, and maintenance manual extraction.
## Files
| File | What | When to read |
| ---- | ---- | ------------ |
| `README.md` | Feature documentation | Understanding OCR proxy |
| `README.md` | Feature documentation with architecture diagrams | Understanding OCR proxy, data flows |
| `index.ts` | Feature barrel export | Importing OCR services |
## Subdirectories
| Directory | What | When to read |
| --------- | ---- | ------------ |
| `api/` | HTTP endpoints and routes | API changes |
| `domain/` | Business logic, types | Core OCR proxy logic |
| `external/` | External OCR service client | OCR service integration |
| `api/` | HTTP endpoints, routes, request validation | API changes, adding endpoints |
| `domain/` | Business logic, TypeScript types | Core OCR proxy logic, type definitions |
| `external/` | HTTP client to Python OCR service | OCR service integration, error handling |
| `tests/` | Unit tests for receipt and manual extraction | Test changes, adding test coverage |
## api/
| File | What | When to read |
| ---- | ---- | ------------ |
| `ocr.controller.ts` | Request handlers for all OCR endpoints (extract, extractVin, extractReceipt, extractManual, submitJob, getJobStatus) | Adding/modifying endpoint behavior |
| `ocr.routes.ts` | Fastify route registration with auth and tier guard preHandlers | Route configuration, middleware changes |
| `ocr.validation.ts` | Request/response type definitions for route schemas | Changing request/response shapes |
## domain/
| File | What | When to read |
| ---- | ---- | ------------ |
| `ocr.service.ts` | Business logic layer: file validation, size limits (10MB sync, 200MB async), content type checks, service delegation | Core logic changes, validation rules |
| `ocr.types.ts` | TypeScript types: OcrResponse, VinExtractionResponse, ReceiptExtractionResponse, ManualExtractionResult, JobResponse, ManualJobResponse | Type changes, adding new response shapes |
## external/
| File | What | When to read |
| ---- | ---- | ------------ |
| `ocr-client.ts` | HTTP client to mvp-ocr Python service (extract, extractVin, extractReceipt, submitJob, submitManualJob, getJobStatus, isHealthy) | OCR service communication, error handling |
## tests/
| File | What | When to read |
| ---- | ---- | ------------ |
| `unit/ocr-receipt.test.ts` | Receipt extraction tests with mock client | Receipt flow changes |
| `unit/ocr-manual.test.ts` | Manual PDF extraction tests | Manual extraction flow changes |


@@ -1,54 +1,180 @@
# OCR Feature
Backend proxy for OCR service communication. Handles authentication, validation, and file streaming to the OCR container.
Backend proxy for the Python OCR microservice. Handles authentication, tier gating, file validation, and request forwarding for three extraction types: VIN decoding, fuel receipt scanning, and maintenance manual extraction.
## API Endpoints
| Method | Endpoint | Description |
|--------|----------|-------------|
| POST | `/api/ocr/extract` | Synchronous OCR extraction (max 10MB) |
| POST | `/api/ocr/jobs` | Submit async OCR job (max 200MB) |
| GET | `/api/ocr/jobs/:jobId` | Poll async job status |
| Method | Endpoint | Description | Auth | Tier | Max Size |
|--------|----------|-------------|------|------|----------|
| POST | `/api/ocr/extract` | Synchronous general OCR extraction | Required | - | 10MB |
| POST | `/api/ocr/extract/vin` | VIN-specific extraction | Required | - | 10MB |
| POST | `/api/ocr/extract/receipt` | Fuel receipt extraction | Required | - | 10MB |
| POST | `/api/ocr/extract/manual` | Async maintenance manual extraction | Required | Pro | 200MB |
| POST | `/api/ocr/jobs` | Submit async OCR job | Required | - | 200MB |
| GET | `/api/ocr/jobs/:jobId` | Poll async job status | Required | - | - |
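The size limits in the table can be expressed as a small guard. This is a minimal sketch, not the actual `ocr.service.ts` code: `UploadKind`, `MAX_UPLOAD_BYTES`, and `isWithinSizeLimit` are hypothetical names, and it assumes 1MB means 1024 * 1024 bytes.

```typescript
// Hypothetical helper illustrating the 10MB (sync) and 200MB (async) limits.
// Assumption: 1MB = 1024 * 1024 bytes; the real service may count differently.
type UploadKind = "sync" | "async";

const MAX_UPLOAD_BYTES: Record<UploadKind, number> = {
  sync: 10 * 1024 * 1024,
  async: 200 * 1024 * 1024,
};

// Returns true when a non-empty upload fits the limit for its endpoint kind.
function isWithinSizeLimit(kind: UploadKind, sizeBytes: number): boolean {
  return sizeBytes > 0 && sizeBytes <= MAX_UPLOAD_BYTES[kind];
}
```

A request that fails this check would correspond to the 413 row in the Error Handling table.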
## Architecture
```
api/
ocr.controller.ts # Request handlers
ocr.routes.ts # Route registration
ocr.validation.ts # Request validation types
domain/
ocr.service.ts # Business logic
ocr.types.ts # TypeScript types
external/
ocr-client.ts # HTTP client to OCR service
Frontend
|
v
Backend Proxy (this feature)
|
+-- ocr.routes.ts --------> Route registration (auth + tier preHandlers)
|
+-- ocr.controller.ts ----> Request handlers (file validation, size checks)
|
+-- ocr.service.ts -------> Business logic (content type validation, delegation)
|
+-- ocr-client.ts --------> HTTP client to mvp-ocr:8000
|
v
Python OCR Service
```
## Receipt OCR Flow
```
Mobile Camera / File Upload
|
v
POST /api/ocr/extract/receipt (multipart/form-data)
|
v
OcrController.extractReceipt()
- Validates file size (<= 10MB)
- Validates content type (JPEG, PNG, HEIC)
|
v
OcrService.extractReceipt()
|
v
OcrClient.extractReceipt() --> HTTP POST --> Python /extract/receipt
| |
v v
ReceiptExtractionResponse ReceiptExtractor + HybridEngine
| (Vision API / PaddleOCR fallback)
v
Frontend receives extractedFields:
merchantName, transactionDate, totalAmount,
fuelQuantity, pricePerUnit, fuelGrade
```
After receipt extraction, the frontend calls `POST /api/stations/match` with the `merchantName` to auto-match a gas station via Google Places API. The station match is a separate request handled by the stations feature.
## Manual Extraction Flow
```
PDF Upload + "Scan for Maintenance Schedule"
|
v
POST /api/ocr/extract/manual (multipart/form-data)
- Requires Pro tier (document.scanMaintenanceSchedule)
- Validates file size (<= 200MB)
- Validates content type (application/pdf)
- Validates PDF magic bytes (%PDF header)
|
v
OcrService.submitManualJob()
|
v
OcrClient.submitManualJob() --> HTTP POST --> Python /extract/manual
| |
v v
{ jobId, status: 'pending' } GeminiEngine (Vertex AI)
Gemini 2.5 Flash
Frontend polls: (structured JSON output)
GET /api/ocr/jobs/:jobId |
(progress: 10% -> 50% -> 95% -> 100%) v
| ManualExtractionResult
v { vehicleInfo, maintenanceSchedules[] }
ManualJobResponse with result
|
v
Frontend displays MaintenanceScheduleReviewScreen
- User selects/edits items
- Batch creates maintenance schedules
```
Jobs expire after 2 hours (Redis TTL). Polling an expired job returns HTTP 410 Gone.
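The polling step above can be sketched as a pure decision function that a client runs on each poll response. This is an illustrative sketch only: `PollResponse`, `PollAction`, and `nextPollAction` are hypothetical names, and it assumes a successful poll returns HTTP 200 with a body carrying the job `status`.

```typescript
type JobStatus = "pending" | "processing" | "completed" | "failed";

// Hypothetical shape of one poll result: HTTP status plus the parsed body, if any.
interface PollResponse {
  httpStatus: number;
  body?: { status: JobStatus; error?: string };
}

type PollAction =
  | { kind: "retry" }
  | { kind: "done" }
  | { kind: "expired" }
  | { kind: "failed"; reason: string };

// Decides what the client should do after one GET /api/ocr/jobs/:jobId call.
function nextPollAction(res: PollResponse): PollAction {
  // 410 Gone: the job's TTL elapsed, so further polling is pointless.
  if (res.httpStatus === 410) return { kind: "expired" };
  // Assumption: any other non-200 response is treated as a failure.
  if (res.httpStatus !== 200 || !res.body) {
    return { kind: "failed", reason: `unexpected HTTP ${res.httpStatus}` };
  }
  switch (res.body.status) {
    case "completed":
      return { kind: "done" };
    case "failed":
      return { kind: "failed", reason: res.body.error ?? "job failed" };
    default:
      // "pending" and "processing" both mean: wait and poll again.
      return { kind: "retry" };
  }
}
```

Keeping the decision separate from the HTTP call makes the expiry and failure branches easy to unit test without a running service.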
## Supported File Types
### Sync Endpoints (extract, extractVin, extractReceipt)
- HEIC (converted server-side)
- JPEG
- PNG
- PDF (first page only)
## Response Format
### Async Endpoints (extractManual)
- PDF (validated via magic bytes)
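A magic-byte check for PDFs compares the first four bytes of the upload against the ASCII characters `%PDF`. The sketch below is illustrative; `PDF_MAGIC` and `hasPdfMagicBytes` are hypothetical names and not necessarily how the backend implements the check.

```typescript
// ASCII codes for "%PDF", the signature at the start of every PDF file.
const PDF_MAGIC = [0x25, 0x50, 0x44, 0x46];

// Returns true when the buffer starts with the "%PDF" signature.
function hasPdfMagicBytes(bytes: Uint8Array): boolean {
  if (bytes.length < PDF_MAGIC.length) return false;
  return PDF_MAGIC.every((expected, i) => bytes[i] === expected);
}
```

Checking the bytes rather than trusting the declared content type guards against a renamed or mislabeled file.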
## Response Types
### ReceiptExtractionResponse
```typescript
interface OcrResponse {
{
success: boolean;
documentType: 'vin' | 'receipt' | 'manual' | 'unknown';
receiptType: string;
extractedFields: {
merchantName: { value: string; confidence: number };
transactionDate: { value: string; confidence: number };
totalAmount: { value: string; confidence: number };
fuelQuantity: { value: string; confidence: number };
pricePerUnit: { value: string; confidence: number };
fuelGrade: { value: string; confidence: number };
};
rawText: string;
confidence: number; // 0.0 - 1.0
extractedFields: Record<string, { value: string; confidence: number }>;
processingTimeMs: number;
}
```
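Because each extracted field carries its own confidence, a consumer can flag uncertain fields for user review. A minimal sketch, assuming per-field confidence is on the same 0.0 to 1.0 scale as the top-level value; `fieldsNeedingReview` and the `0.8` cutoff are hypothetical.

```typescript
interface ExtractedField {
  value: string;
  confidence: number;
}

// Returns the names of fields whose confidence falls below the threshold.
// The default threshold of 0.8 is an illustrative assumption, not a documented value.
function fieldsNeedingReview(
  fields: Record<string, ExtractedField>,
  threshold = 0.8,
): string[] {
  return Object.entries(fields)
    .filter(([, field]) => field.confidence < threshold)
    .map(([name]) => name);
}
```

For example, a response with a confident `merchantName` but a low-confidence `totalAmount` would return only `totalAmount`, which a review UI could highlight for editing.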
## Async Job Flow
### ManualJobResponse
```typescript
{
jobId: string;
status: 'pending' | 'processing' | 'completed' | 'failed';
progress?: { percent: number; message: string };
estimatedSeconds?: number;
result?: ManualExtractionResult;
error?: string;
}
```
1. POST `/api/ocr/jobs` with file
2. Receive `{ jobId, status: 'pending' }`
3. Poll GET `/api/ocr/jobs/:jobId`
4. When `status: 'completed'`, result contains OCR data
### ManualExtractionResult
```typescript
{
success: boolean;
vehicleInfo?: { make: string; model: string; year: number };
maintenanceSchedules: Array<{
serviceName: string;
intervalMiles: number | null;
intervalMonths: number | null;
details: string;
confidence: number;
subtypes: string[];
}>;
rawTables: any[];
processingTimeMs: number;
totalPages: number;
pagesProcessed: number;
}
```
Jobs expire after 1 hour.
## Error Handling
The backend proxy maps HTTP status codes from the Python service to its own responses as follows:
| Python Status | Backend Status | Meaning |
|---------------|----------------|---------|
| 413 | 413 | File too large |
| 415 | 415 | Unsupported media type |
| 422 | 422 | Extraction failed |
| 410 | 410 | Job expired (TTL) |
| Other | 500 | Internal server error |
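The table reduces to a pass-through set plus a default. The following is a sketch of that mapping for error responses only; `PASSTHROUGH_STATUSES` and `translateOcrErrorStatus` are hypothetical names, not identifiers from `ocr-client.ts`.

```typescript
// Error statuses that the proxy forwards unchanged, per the table above.
const PASSTHROUGH_STATUSES = new Set<number>([410, 413, 415, 422]);

// Maps an error status from the Python service to the status the backend returns.
// Intended for error responses only; successful responses are not handled here.
function translateOcrErrorStatus(pythonStatus: number): number {
  return PASSTHROUGH_STATUSES.has(pythonStatus) ? pythonStatus : 500;
}
```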
## Tier Gating
Manual extraction requires Pro tier. The tier guard middleware (`requireTier` plugin) validates the user's subscription tier before processing. Free-tier users receive HTTP 403 with `TIER_REQUIRED` error code and an upgrade prompt.
Receipt and VIN extraction are available to all tiers.
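The gating rule can be illustrated with a framework-free check. This is a sketch under stated assumptions, not the `requireTier` plugin itself: the `Tier` union assumes only `free` and `pro` exist, and `checkTier`, `TIER_RANK`, and the result shape are hypothetical.

```typescript
// Assumption: only two tiers, ordered free < pro. The real system may define more.
type Tier = "free" | "pro";

const TIER_RANK: Record<Tier, number> = { free: 0, pro: 1 };

type TierCheck =
  | { allowed: true }
  | { allowed: false; statusCode: 403; error: "TIER_REQUIRED"; requiredTier: Tier };

// Allows the request when the user's tier meets or exceeds the required tier;
// otherwise returns the 403 / TIER_REQUIRED details described above.
function checkTier(userTier: Tier, requiredTier: Tier): TierCheck {
  if (TIER_RANK[userTier] >= TIER_RANK[requiredTier]) return { allowed: true };
  return { allowed: false, statusCode: 403, error: "TIER_REQUIRED", requiredTier };
}
```

In a Fastify preHandler, a denied result would typically be turned into a 403 reply before the controller runs, so the upload is never forwarded to the OCR service.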


@@ -7,9 +7,9 @@
| `admin/` | Admin panel and catalog management | Admin UI, user management |
| `auth/` | Authentication pages and components | Login, logout, auth flows |
| `dashboard/` | Dashboard and fleet overview | Home page, summary widgets |
| `documents/` | Document management UI | File upload, document viewer |
| `fuel-logs/` | Fuel log tracking UI | Fuel entry forms, statistics |
| `maintenance/` | Maintenance record UI | Service tracking, reminders |
| `documents/` | Document management UI with maintenance manual extraction | File upload, document viewer, manual OCR extraction |
| `fuel-logs/` | Fuel log tracking UI with receipt OCR scanning | Fuel entry forms, receipt scanning, statistics |
| `maintenance/` | Maintenance record and schedule UI with OCR batch creation | Service tracking, extraction review, schedule management |
| `notifications/` | Notification display | Alert UI, notification center |
| `onboarding/` | Onboarding wizard | First-time user experience |
| `ownership-costs/` | Ownership cost tracking UI | Cost displays, expense forms |


@@ -0,0 +1,49 @@
# documents/
Document management UI with maintenance manual extraction. Handles file uploads, document viewing, and PDF-based maintenance schedule extraction via Gemini.
## Subdirectories
| Directory | What | When to read |
| --------- | ---- | ------------ |
| `api/` | Document API endpoints | API integration |
| `components/` | Document forms, dialogs, preview, metadata display | UI changes |
| `hooks/` | Document CRUD, manual extraction, upload progress | Business logic |
| `mobile/` | Mobile-specific document layout | Mobile UI |
| `pages/` | DocumentsPage, DocumentDetailPage | Page layout |
| `types/` | TypeScript type definitions | Type changes |
| `utils/` | Utility functions (vehicle label formatting) | Helper logic |
## Key Files
| File | What | When to read |
| ---- | ---- | ------------ |
| `hooks/useManualExtraction.ts` | Manual extraction orchestration: submit PDF to /ocr/extract/manual, poll job status via /ocr/jobs/:jobId, return extraction results | Manual extraction flow, job polling |
| `components/DocumentForm.tsx` | Document metadata form with "Scan for Maintenance Schedule" checkbox (Pro tier) | Document upload, extraction trigger |
| `components/AddDocumentDialog.tsx` | Add document dialog integrating DocumentForm, upload progress, and manual extraction trigger | Document creation flow |
| `hooks/useDocuments.ts` | CRUD operations for documents | Document data management |
| `hooks/useUploadWithProgress.ts` | File upload with progress tracking | Upload UI |
| `components/DocumentPreview.tsx` | Document viewer/preview | Document display |
| `components/EditDocumentDialog.tsx` | Edit document metadata | Document editing |
| `types/documents.types.ts` | DocumentType, DocumentRecord, CreateDocumentRequest | Type definitions |
## Manual Extraction Flow
```
DocumentForm ("Scan for Maintenance Schedule" checkbox, Pro tier)
|
v
AddDocumentDialog -> useManualExtraction.submit(file, vehicleId)
|
v
POST /api/ocr/extract/manual (async job)
|
v
Poll GET /api/ocr/jobs/:jobId (progress: 10% -> 50% -> 95% -> 100%)
|
v
Job completed -> MaintenanceScheduleReviewScreen (in maintenance/ feature)
|
v
User selects/edits items -> Batch create maintenance schedules
```


@@ -0,0 +1,48 @@
# fuel-logs/
Fuel log tracking UI with receipt OCR scanning. Captures fuel purchases, calculates statistics, and supports camera-based receipt scanning that auto-extracts fields and matches gas stations.
## Subdirectories
| Directory | What | When to read |
| --------- | ---- | ------------ |
| `api/` | Fuel log API endpoints | API integration |
| `components/` | Form components, receipt OCR UI, stats display | UI changes |
| `hooks/` | Data fetching, receipt OCR orchestration, user settings | Business logic |
| `pages/` | FuelLogsPage | Page layout |
| `types/` | TypeScript type definitions | Type changes |
## Key Files
| File | What | When to read |
| ---- | ---- | ------------ |
| `hooks/useReceiptOcr.ts` | Receipt OCR orchestration: camera capture, OCR extraction via /ocr/extract/receipt, station matching via /stations/match, field mapping | Receipt scanning flow, OCR integration |
| `components/ReceiptOcrReviewModal.tsx` | Modal for reviewing OCR-extracted receipt fields with confidence indicators, inline editing, station match display | Receipt review UI, field editing |
| `components/ReceiptCameraButton.tsx` | Button to trigger receipt camera capture (tier-gated) | Receipt capture entry point |
| `components/FuelLogForm.tsx` | Main fuel log form with OCR integration (setValue from accepted receipt) | Form fields, OCR field mapping |
| `components/ReceiptPreview.tsx` | Receipt image preview | Receipt display |
| `components/StationPicker.tsx` | Gas station selection with search | Station selection UI |
| `components/FuelLogsList.tsx` | Fuel log list display | Log listing |
| `components/FuelStatsCard.tsx` | Fuel statistics summary | Statistics display |
| `hooks/useFuelLogs.tsx` | CRUD operations for fuel logs | Data management |
| `types/fuel-logs.types.ts` | FuelLogResponse, CreateFuelLogRequest, LocationData, UnitSystem | Type definitions |
## Receipt OCR Flow
```
ReceiptCameraButton (tier check)
|
v
useReceiptOcr.startCapture() -> CameraCapture (shared component)
|
v
useReceiptOcr.processImage() -> POST /api/ocr/extract/receipt
|
v
ReceiptOcrReviewModal (display extracted fields, confidence indicators)
|
+-- POST /api/stations/match (merchantName -> station match)
|
v
useReceiptOcr.acceptResult() -> FuelLogForm.setValue() (pre-fill form)
```


@@ -0,0 +1,51 @@
# maintenance/
Maintenance record and schedule management UI. Supports manual schedule creation and batch creation from OCR-extracted maintenance data. Three categories: routine maintenance, repair, performance upgrade.
## Subdirectories
| Directory | What | When to read |
| --------- | ---- | ------------ |
| `api/` | Maintenance API endpoints | API integration |
| `components/` | Forms, lists, review screen, subtype selection | UI changes |
| `hooks/` | Data fetching, batch schedule creation from extraction | Business logic |
| `mobile/` | Mobile-specific maintenance layout | Mobile UI |
| `pages/` | MaintenancePage (tabs: records, schedules) | Page layout |
| `types/` | TypeScript type definitions (categories, subtypes, schedules) | Type changes |
## Key Files
| File | What | When to read |
| ---- | ---- | ------------ |
| `hooks/useCreateSchedulesFromExtraction.ts` | Batch-creates maintenance schedules from OCR extraction results, maps MaintenanceScheduleItem to CreateScheduleRequest | OCR-to-schedule creation flow |
| `components/MaintenanceScheduleReviewScreen.tsx` | Dialog for reviewing OCR-extracted maintenance items: checkboxes for selection, confidence indicators, inline editing, batch create action | Extraction review UI, item editing |
| `components/MaintenanceScheduleForm.tsx` | Form for manual schedule creation | Schedule creation UI |
| `components/MaintenanceRecordForm.tsx` | Form for manual record creation | Record creation UI |
| `components/MaintenanceSchedulesList.tsx` | Schedule list with edit/delete | Schedule display |
| `components/MaintenanceRecordsList.tsx` | Record list display | Record display |
| `components/SubtypeCheckboxGroup.tsx` | Multi-select checkbox group for maintenance subtypes (27 routine, repair, performance) | Subtype selection UI |
| `hooks/useMaintenanceRecords.ts` | CRUD operations for maintenance records and schedules | Data management |
| `types/maintenance.types.ts` | MaintenanceCategory, ScheduleType, ROUTINE_MAINTENANCE_SUBTYPES, MaintenanceSchedule | Type definitions, subtype constants |
| `components/MaintenanceScheduleReviewScreen.test.tsx` | Tests for extraction review screen | Test changes |
## Extraction Review Flow
```
ManualExtractionResult (from documents/ feature useManualExtraction)
|
v
MaintenanceScheduleReviewScreen
- Displays extracted items with confidence scores
- Checkboxes for select/deselect
- Inline editing of service name, intervals, details
- Touch targets >= 44px for mobile
|
v
useCreateSchedulesFromExtraction.mutate(selectedItems)
|
v
POST /api/maintenance/schedules (batch create)
|
v
Query invalidation -> MaintenanceSchedulesList refreshes
```


@@ -1,6 +1,6 @@
# ocr/
Python OCR microservice. Primary engine: PaddleOCR PP-OCRv4 with optional Google Vision cloud fallback. Pluggable engine abstraction in `app/engines/`.
Python OCR microservice. Primary engine: PaddleOCR PP-OCRv4 with optional Google Vision cloud fallback. Gemini 2.5 Flash for maintenance manual PDF extraction. Pluggable engine abstraction in `app/engines/`.
## Files
@@ -14,5 +14,5 @@ Python OCR microservice. Primary engine: PaddleOCR PP-OCRv4 with optional Google
| Directory | What | When to read |
| --------- | ---- | ------------ |
| `app/` | FastAPI application source | OCR endpoint development |
| `app/engines/` | Engine abstraction layer (OcrEngine ABC, factory, hybrid) | Adding or changing OCR engines |
| `app/engines/` | Engine abstraction layer (OcrEngine ABC, factory, hybrid) and Gemini module | Adding or changing OCR engines, Gemini integration |
| `tests/` | Test suite | Adding or modifying tests |


@@ -1,23 +1,25 @@
# ocr/app/
Python OCR microservice (FastAPI). Primary engine: PaddleOCR PP-OCRv4 with optional Google Vision cloud fallback. Gemini 2.5 Flash for maintenance manual PDF extraction (standalone module, not an OcrEngine subclass).
## Files
| File | What | When to read |
| ---- | ---- | ------------ |
| `main.py` | FastAPI application entry point | Route registration, app setup |
| `config.py` | Configuration settings | Environment variables, settings |
| `config.py` | Configuration settings (OCR engines, Vertex AI, Redis, Vision API limits) | Environment variables, settings |
| `__init__.py` | Package init | Package structure |
## Subdirectories
| Directory | What | When to read |
| --------- | ---- | ------------ |
| `engines/` | OCR engine abstraction (PaddleOCR primary, Google Vision fallback) | Engine changes, adding new engines |
| `extractors/` | Data extraction logic | Adding new extraction types |
| `engines/` | OCR engine abstraction (PaddleOCR, Google Vision, Hybrid) and Gemini module | Engine changes, adding new engines |
| `extractors/` | Domain-specific data extraction (receipts, fuel receipts, maintenance manuals) | Adding new extraction types, modifying extraction logic |
| `models/` | Data models and schemas | Request/response types |
| `patterns/` | Regex and parsing patterns | Pattern matching rules |
| `patterns/` | Regex patterns and service name mapping (27 maintenance subtypes) | Pattern matching rules, service categorization |
| `preprocessors/` | Image preprocessing pipeline | Image preparation before OCR |
| `routers/` | FastAPI route handlers | API endpoint changes |
| `services/` | Business logic services | Core OCR processing |
| `table_extraction/` | Table detection and parsing | Structured data extraction |
| `routers/` | FastAPI route handlers (/extract, /extract/receipt, /extract/manual, /jobs) | API endpoint changes |
| `services/` | Business logic services (job queue with Redis) | Core OCR processing, async job management |
| `table_extraction/` | Table detection and parsing | Structured data extraction from images |
| `validators/` | Input validation | Validation rules |

ocr/app/engines/CLAUDE.md Normal file

@@ -0,0 +1,33 @@
# ocr/app/engines/
OCR engine abstraction layer. Two categories of engines:
1. **OcrEngine subclasses** (image-to-text): PaddleOCR, Google Vision, Hybrid. Accept image bytes, return text + confidence + word boxes.
2. **GeminiEngine** (PDF-to-structured-data): Standalone module for maintenance schedule extraction via Vertex AI. Accepts PDF bytes, returns structured JSON. Not an OcrEngine subclass because the interface signatures differ.
## Files
| File | What | When to read |
| ---- | ---- | ------------ |
| `__init__.py` | Public engine API exports (OcrEngine, create_engine, exceptions) | Importing engine interfaces |
| `base_engine.py` | OcrEngine ABC, OcrConfig, OcrEngineResult, WordBox, exception hierarchy | Engine interface contract, adding new engines |
| `paddle_engine.py` | PaddleOCR PP-OCRv4 primary engine | Local OCR debugging, accuracy tuning |
| `cloud_engine.py` | Google Vision TEXT_DETECTION fallback engine (WIF authentication) | Cloud OCR configuration, API quota |
| `hybrid_engine.py` | Combines primary + fallback engine with confidence threshold switching | Engine selection logic, fallback behavior |
| `engine_factory.py` | Factory function and engine registry for instantiation | Adding new engine types |
| `gemini_engine.py` | Gemini 2.5 Flash integration for maintenance schedule extraction (Vertex AI SDK, 20MB PDF limit, structured JSON output) | Manual extraction debugging, Gemini configuration |
## Engine Selection
```
create_engine(config)
|
+-- Primary: PaddleOCR (local, fast, no API limits)
|
+-- Fallback: Google Vision (cloud, 1000/month limit)
|
v
HybridEngine (tries primary, falls back if confidence < threshold)
```
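The fallback rule in the diagram can be sketched as follows. The real engines are Python classes; this sketch uses TypeScript only to stay consistent with the other examples in these docs, models engines as synchronous functions for brevity, and uses hypothetical names (`OcrResult`, `Engine`, `recognizeWithFallback`) and a hypothetical default threshold of `0.6`.

```typescript
interface OcrResult {
  text: string;
  confidence: number;
}

// An engine is modeled as a plain function from image bytes to a result.
type Engine = (image: Uint8Array) => OcrResult;

// Tries the primary engine first and calls the fallback only when the
// primary's confidence is below the threshold.
function recognizeWithFallback(
  image: Uint8Array,
  primary: Engine,
  fallback: Engine,
  threshold = 0.6, // illustrative value, not taken from the service config
): OcrResult {
  const first = primary(image);
  if (first.confidence >= threshold) return first;
  return fallback(image);
}
```

Calling the fallback lazily matters here because the cloud engine is quota-limited, while the local engine has no API limits.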
GeminiEngine is created independently by ManualExtractor, not through the engine factory.