chore: update OCR tests and documentation (refs #121)
Some checks failed
Deploy to Staging / Build Images (pull_request) Failing after 7m4s
Deploy to Staging / Deploy to Staging (pull_request) Has been skipped
Deploy to Staging / Verify Staging (pull_request) Has been skipped
Deploy to Staging / Notify Staging Ready (pull_request) Has been skipped
Deploy to Staging / Notify Staging Failure (pull_request) Successful in 7s

Add engine abstraction tests and update docs to reflect PaddleOCR primary
architecture with optional Google Vision cloud fallback.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Eric Gullickson
2026-02-07 11:42:51 -06:00
parent 1e96baca6f
commit 47c5676498
7 changed files with 870 additions and 68 deletions


@@ -1,10 +1,12 @@
 # ocr/
+Python OCR microservice. Primary engine: PaddleOCR PP-OCRv4 with optional Google Vision cloud fallback. Pluggable engine abstraction in `app/engines/`.
 ## Files
 | File | What | When to read |
 | ---- | ---- | ------------ |
-| `Dockerfile` | Container build definition | Docker builds, deployment |
+| `Dockerfile` | Container build (PaddleOCR models baked in) | Docker builds, deployment |
 | `requirements.txt` | Python dependencies | Adding dependencies |
 ## Subdirectories
@@ -12,4 +14,5 @@
 | Directory | What | When to read |
 | --------- | ---- | ------------ |
 | `app/` | FastAPI application source | OCR endpoint development |
+| `app/engines/` | Engine abstraction layer (OcrEngine ABC, factory, hybrid) | Adding or changing OCR engines |
 | `tests/` | Test suite | Adding or modifying tests |
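
The `app/engines/` row above names three pieces: an `OcrEngine` ABC, a factory, and a hybrid engine. The sketch below is one possible shape for that layering, not the code in this commit. Only the name `OcrEngine` comes from the docs; `OcrResult`, `StubEngine`, `HybridEngine`, `create_engine`, the `recognize` method, and the confidence-threshold fallback rule are all hypothetical, chosen for illustration. The stubs stand in for the real PaddleOCR and Google Vision engines so the example runs without either dependency.

```python
from abc import ABC, abstractmethod
from dataclasses import dataclass
from typing import Optional


@dataclass
class OcrResult:
    """Hypothetical result type: recognized text plus a 0.0-1.0 confidence."""
    text: str
    confidence: float
    engine: str


class OcrEngine(ABC):
    """Common interface for engines. The class name is from the docs;
    the method signature is an assumption."""

    name: str = "base"

    @abstractmethod
    def recognize(self, image: bytes) -> OcrResult:
        """Run OCR on raw image bytes."""


class StubEngine(OcrEngine):
    """Stand-in for a real engine; always returns a canned result."""

    def __init__(self, name: str, text: str, confidence: float) -> None:
        self.name = name
        self._text = text
        self._confidence = confidence

    def recognize(self, image: bytes) -> OcrResult:
        return OcrResult(self._text, self._confidence, self.name)


class HybridEngine(OcrEngine):
    """Try the primary engine first; consult the fallback only when the
    primary result is below min_confidence (assumed policy)."""

    name = "hybrid"

    def __init__(self, primary: OcrEngine, fallback: OcrEngine,
                 min_confidence: float = 0.6) -> None:
        self.primary = primary
        self.fallback = fallback
        self.min_confidence = min_confidence

    def recognize(self, image: bytes) -> OcrResult:
        result = self.primary.recognize(image)
        if result.confidence >= self.min_confidence:
            return result
        alternative = self.fallback.recognize(image)
        # Keep whichever engine was more confident.
        if alternative.confidence > result.confidence:
            return alternative
        return result


def create_engine(primary: OcrEngine, fallback: Optional[OcrEngine] = None,
                  min_confidence: float = 0.6) -> OcrEngine:
    """Factory: return the primary alone, or wrap it when a fallback exists."""
    if fallback is None:
        return primary
    return HybridEngine(primary, fallback, min_confidence)


if __name__ == "__main__":
    local = StubEngine("local", "low quality read", 0.40)
    cloud = StubEngine("cloud", "clean read", 0.95)
    print(create_engine(local, cloud).recognize(b"").engine)
```

Because the hybrid is itself an `OcrEngine`, callers would not need to know whether a fallback is configured, which is presumably what makes the cloud engine optional.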