# ocr/

Python OCR microservice. Primary engine: PaddleOCR PP-OCRv4, with an optional Google Vision cloud fallback. Gemini 2.5 Flash handles maintenance manual PDF extraction. Engines are pluggable through the abstraction in `app/engines/`.

## Files

| File | What | When to read |
| ---- | ---- | ------------ |
| `Dockerfile` | Container build (PaddleOCR models baked in) | Docker builds, deployment |
| `requirements.txt` | Python dependencies | Adding dependencies |

## Subdirectories

| Directory | What | When to read |
| --------- | ---- | ------------ |
| `app/` | FastAPI application source | OCR endpoint development |
| `app/engines/` | Engine abstraction layer (`OcrEngine` ABC, factory, hybrid engine) and Gemini module | Adding or changing OCR engines, Gemini integration |
| `tests/` | Test suite | Adding or modifying tests |
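
For orientation, the following is a minimal sketch of how a pluggable engine layer with a primary engine and an optional fallback could be shaped. Only the `OcrEngine` name comes from this README; `OcrResult`, `recognize`, `StubEngine`, `HybridEngine`, `create_engine`, and the `min_confidence` threshold are hypothetical names chosen for illustration, and the actual code in `app/engines/` may differ. Stub engines stand in for PaddleOCR and Google Vision so the example runs without models or credentials.

```python
from abc import ABC, abstractmethod
from dataclasses import dataclass
from typing import Optional


@dataclass
class OcrResult:
    """Hypothetical result type: recognized text plus a 0.0-1.0 confidence."""
    text: str
    confidence: float
    engine: str


class OcrEngine(ABC):
    """Common interface for engines (the method name is an assumption)."""

    name: str

    @abstractmethod
    def recognize(self, image_bytes: bytes) -> OcrResult:
        """Run OCR on raw image bytes."""


class StubEngine(OcrEngine):
    """Stand-in for a real engine; returns a fixed result."""

    def __init__(self, name: str, text: str, confidence: float) -> None:
        self.name = name
        self._text = text
        self._confidence = confidence

    def recognize(self, image_bytes: bytes) -> OcrResult:
        return OcrResult(self._text, self._confidence, self.name)


class HybridEngine(OcrEngine):
    """Tries the primary engine first; consults the fallback on low confidence."""

    name = "hybrid"

    def __init__(self, primary: OcrEngine, fallback: OcrEngine,
                 min_confidence: float = 0.6) -> None:
        self.primary = primary
        self.fallback = fallback
        self.min_confidence = min_confidence

    def recognize(self, image_bytes: bytes) -> OcrResult:
        result = self.primary.recognize(image_bytes)
        if result.confidence >= self.min_confidence:
            return result
        # Primary was unsure: ask the fallback and keep the more confident result.
        fallback_result = self.fallback.recognize(image_bytes)
        return max(result, fallback_result, key=lambda r: r.confidence)


def create_engine(primary: OcrEngine, fallback: Optional[OcrEngine] = None,
                  min_confidence: float = 0.6) -> OcrEngine:
    """Hypothetical factory: wrap in HybridEngine only when a fallback is given."""
    if fallback is None:
        return primary
    return HybridEngine(primary, fallback, min_confidence)
```

With this shape, callers depend only on `OcrEngine.recognize`, so adding an engine means adding one subclass and wiring it into the factory. For example, `create_engine(StubEngine("local", "abc", 0.3), StubEngine("cloud", "abcd", 0.9))` returns a hybrid engine that falls through to the second stub because the first reports confidence below the threshold.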