CloudEngine wraps Google Vision TEXT_DETECTION with lazy initialization.
HybridEngine runs the primary engine and falls back to the cloud engine
when the primary result's confidence is below the threshold. The fallback
is disabled by default (OCR_FALLBACK_ENGINE=none).

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
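The fallback behavior described in this commit message can be sketched roughly as follows. This is a minimal, self-contained illustration and not the code from this commit: `Result`, `Engine`, `FixedEngine`, `HybridSketch`, the `recognize` method, the 0.0 to 1.0 confidence scale, and the 0.6 threshold are all hypothetical stand-ins. It also assumes the fallback result simply replaces the primary one, which the commit message does not state.

```python
from dataclasses import dataclass
from typing import Callable, Optional, Protocol


@dataclass
class Result:
    """Hypothetical stand-in for OcrEngineResult."""
    text: str
    confidence: float  # assumed scale: 0.0 (no confidence) to 1.0 (certain)
    engine: str


class Engine(Protocol):
    """Hypothetical stand-in for the OcrEngine interface."""

    def recognize(self, image: bytes) -> Result: ...


class FixedEngine:
    """Toy engine that returns a canned result; used only for this demo."""

    def __init__(self, name: str, text: str, confidence: float) -> None:
        self.name = name
        self.text = text
        self.confidence = confidence

    def recognize(self, image: bytes) -> Result:
        return Result(self.text, self.confidence, self.name)


class HybridSketch:
    """Run the primary engine; use the fallback only on low confidence."""

    def __init__(
        self,
        primary: Engine,
        fallback_factory: Optional[Callable[[], Engine]] = None,
        threshold: float = 0.6,  # assumed value, not taken from the source
    ) -> None:
        self.primary = primary
        # None models the disabled default (OCR_FALLBACK_ENGINE=none).
        self._fallback_factory = fallback_factory
        self._fallback: Optional[Engine] = None  # built lazily on first use
        self.threshold = threshold

    def recognize(self, image: bytes) -> Result:
        result = self.primary.recognize(image)
        if result.confidence >= self.threshold or self._fallback_factory is None:
            return result
        if self._fallback is None:
            # Lazy init: the fallback client is created only when needed.
            self._fallback = self._fallback_factory()
        # Assumption: the fallback result replaces the primary result.
        return self._fallback.recognize(image)


if __name__ == "__main__":
    hybrid = HybridSketch(
        FixedEngine("primary", "h3llo", 0.4),
        lambda: FixedEngine("cloud", "hello", 0.95),
    )
    print(hybrid.recognize(b"").engine)
```

Passing a factory rather than a ready-made engine means a cloud client is never constructed in deployments where the primary engine is always confident enough, or where the fallback is disabled.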
34 lines
834 B
Python
"""OCR engine abstraction layer.

Provides a pluggable engine interface for OCR processing,
decoupling extractors from specific OCR libraries.

Engines:
- PaddleOcrEngine: PaddleOCR PP-OCRv4 (primary, CPU-only)
- TesseractEngine: pytesseract wrapper (backward compatibility)
- CloudEngine: Google Vision TEXT_DETECTION (optional cloud fallback)
- HybridEngine: Primary + fallback with confidence threshold
"""

from app.engines.base_engine import (
    EngineError,
    EngineProcessingError,
    EngineUnavailableError,
    OcrConfig,
    OcrEngine,
    OcrEngineResult,
    WordBox,
)
from app.engines.engine_factory import create_engine

__all__ = [
    "OcrEngine",
    "OcrConfig",
    "OcrEngineResult",
    "WordBox",
    "EngineError",
    "EngineUnavailableError",
    "EngineProcessingError",
    "create_engine",
]