feat: Optional Google Vision cloud fallback engine (#115) #118

Closed
opened 2026-02-07 16:12:59 +00:00 by egullickson · 0 comments
Owner

Relates to #115

Add an optional Google Vision API cloud fallback that is used when PaddleOCR confidence falls below a configurable threshold. Disabled by default.

Changes

  • Create ocr/app/engines/cloud_engine.py - Google Vision TEXT_DETECTION wrapper
  • Create ocr/app/engines/hybrid_engine.py - Primary + fallback with confidence threshold
  • Update ocr/app/config.py - Add OCR_FALLBACK_ENGINE, OCR_FALLBACK_THRESHOLD, GOOGLE_VISION_KEY_PATH env vars
  • Update ocr/requirements.txt - Add google-cloud-vision dependency
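
A minimal sketch of how the three env vars might be read in `ocr/app/config.py`. The `FallbackSettings` class, the `load_fallback_settings` helper, and the `"none"` / `"google_vision"` engine names are hypothetical and not taken from the repository. Only the variable names and the 0.6 default come from this issue.

```python
import os
from dataclasses import dataclass
from typing import Mapping, Optional


@dataclass(frozen=True)
class FallbackSettings:
    """Hypothetical container for the fallback-related settings."""

    engine: str              # value of OCR_FALLBACK_ENGINE ("none" is an assumed default)
    threshold: float         # value of OCR_FALLBACK_THRESHOLD
    key_path: Optional[str]  # value of GOOGLE_VISION_KEY_PATH, or None if unset

    @property
    def enabled(self) -> bool:
        # Fallback stays off unless an engine is named AND a key path is set.
        return self.engine != "none" and bool(self.key_path)


def load_fallback_settings(env: Mapping[str, str] = os.environ) -> FallbackSettings:
    """Read the fallback settings from a mapping (defaults to the process env)."""
    threshold = float(env.get("OCR_FALLBACK_THRESHOLD", "0.6"))
    if not 0.0 <= threshold <= 1.0:
        raise ValueError("OCR_FALLBACK_THRESHOLD must be between 0.0 and 1.0")
    return FallbackSettings(
        engine=env.get("OCR_FALLBACK_ENGINE", "none"),
        threshold=threshold,
        # Treat an empty string the same as an unset variable.
        key_path=env.get("GOOGLE_VISION_KEY_PATH") or None,
    )
```

Passing the environment in as a mapping keeps the loader testable without touching `os.environ`. With nothing set, `load_fallback_settings({}).enabled` is `False`, which matches the "disabled by default" requirement.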

Design Notes

  • Cloud fallback is optional (off by default) per Decision Critic analysis
  • PaddleOCR (8.3/10) scores higher than cloud APIs (8.0/10) for scene text
  • Cloud adds 2-8 seconds latency; processing target relaxed to 5-6s when fallback activates
  • Google Vision free tier (1,000 units/month) covers personal usage
  • Auth: Service account JSON mounted at /run/secrets/google-vision-key.json

Acceptance Criteria

  • CloudEngine wraps Google Vision TEXT_DETECTION
  • HybridEngine calls primary, falls back to cloud when confidence < threshold
  • Fallback is disabled by default (enabling it requires GOOGLE_VISION_KEY_PATH to be set)
  • Confidence threshold configurable via OCR_FALLBACK_THRESHOLD (default: 0.6)
  • Graceful degradation if cloud API is unavailable (returns primary result)
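
The fallback behaviour in the criteria above could look roughly like the sketch below. `OcrResult`, the `OcrEngine` protocol, and the `recognize` method name are assumptions made for illustration; the actual interfaces in `ocr/app/engines/` may differ. Returning the cloud result as-is when the fallback fires is also an assumption, since the issue does not say whether the two results are compared.

```python
from dataclasses import dataclass
from typing import Optional, Protocol


@dataclass(frozen=True)
class OcrResult:
    """Hypothetical result type: recognized text plus a 0.0-1.0 confidence."""

    text: str
    confidence: float
    engine: str


class OcrEngine(Protocol):
    """Hypothetical engine interface shared by primary and cloud engines."""

    def recognize(self, image: bytes) -> OcrResult: ...


class HybridEngine:
    """Calls the primary engine, then the fallback if confidence is too low."""

    def __init__(
        self,
        primary: OcrEngine,
        fallback: Optional[OcrEngine] = None,
        threshold: float = 0.6,
    ) -> None:
        self.primary = primary
        self.fallback = fallback  # None means the fallback is disabled
        self.threshold = threshold

    def recognize(self, image: bytes) -> OcrResult:
        primary_result = self.primary.recognize(image)

        # No fallback configured, or the primary is confident enough: done.
        if self.fallback is None or primary_result.confidence >= self.threshold:
            return primary_result

        try:
            return self.fallback.recognize(image)
        except Exception:
            # Graceful degradation: if the cloud call fails for any reason,
            # keep the primary result rather than failing the request.
            return primary_result
```

Catching a broad `Exception` is deliberate in this sketch so that network errors, auth failures, and quota errors all degrade to the primary result. A real implementation would likely log the failure before returning.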
egullickson added the status/backlog and type/feature labels 2026-02-07 16:13:24 +00:00
egullickson added this to the Sprint 2026-02-02 milestone 2026-02-07 16:13:31 +00:00
egullickson added status/in-progress and removed status/backlog labels 2026-02-07 17:10:03 +00:00
egullickson added status/backlog and removed status/in-progress labels 2026-02-07 17:12:39 +00:00
egullickson added status/in-progress and removed status/backlog labels 2026-02-07 17:30:50 +00:00
Reference: egullickson/motovaultpro#118