feat: Migrate Gemini SDK to google-genai (#231) #236
@@ -7,7 +7,7 @@ Python OCR microservice (FastAPI). Primary engine: PaddleOCR PP-OCRv4 with optio
| File | What | When to read |
| ---- | ---- | ------------ |
| `main.py` | FastAPI application entry point | Route registration, app setup |
-| `config.py` | Configuration settings (OCR engines, Vertex AI, Redis, Vision API limits) | Environment variables, settings |
+| `config.py` | Configuration settings (OCR engines, Google GenAI, Redis, Vision API limits) | Environment variables, settings |
| `__init__.py` | Package init | Package structure |

## Subdirectories

@@ -3,7 +3,7 @@
OCR engine abstraction layer. Two categories of engines:

1. **OcrEngine subclasses** (image-to-text): PaddleOCR, Google Vision, Hybrid. Accept image bytes, return text + confidence + word boxes.
-2. **GeminiEngine** (PDF-to-structured-data and VIN decode): Standalone module for maintenance schedule extraction and VIN decoding via Vertex AI. Accepts PDF bytes or VIN strings, returns structured JSON. Not an OcrEngine subclass because the interface signatures differ.
+2. **GeminiEngine** (PDF-to-structured-data and VIN decode): Standalone module for maintenance schedule extraction and VIN decoding via google-genai SDK. Accepts PDF bytes or VIN strings, returns structured JSON. Not an OcrEngine subclass because the interface signatures differ.

## Files

@@ -15,7 +15,7 @@ OCR engine abstraction layer. Two categories of engines:
| `cloud_engine.py` | Google Vision TEXT_DETECTION fallback engine (WIF authentication) | Cloud OCR configuration, API quota |
| `hybrid_engine.py` | Combines primary + fallback engine with confidence threshold switching | Engine selection logic, fallback behavior |
| `engine_factory.py` | Factory function and engine registry for instantiation | Adding new engine types |
-| `gemini_engine.py` | Gemini 2.5 Flash integration for maintenance schedule extraction and VIN decoding (Vertex AI SDK, 20MB PDF limit, structured JSON output) | Manual extraction debugging, VIN decode, Gemini configuration |
+| `gemini_engine.py` | Gemini 2.5 Flash integration for maintenance schedule extraction and VIN decoding (google-genai SDK, 20MB PDF limit, structured JSON output, Google Search grounding for VIN decode) | Manual extraction debugging, VIN decode, Gemini configuration |

## Engine Selection

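The "confidence threshold switching" that the table attributes to `hybrid_engine.py` might look roughly like the sketch below. It is not the repository's code: `OcrResult`, `HybridEngine`, `FixedEngine`, the `recognize` method name, and the `0.6` default threshold are all hypothetical names and values chosen for illustration, and word boxes are left out for brevity.

```python
from dataclasses import dataclass
from typing import Protocol


@dataclass
class OcrResult:
    """Hypothetical result type: recognized text plus a 0.0-1.0 confidence."""
    text: str
    confidence: float


class OcrEngine(Protocol):
    """Hypothetical engine interface: image bytes in, OcrResult out."""
    def recognize(self, image_bytes: bytes) -> OcrResult: ...


class HybridEngine:
    """Run the primary engine; consult the fallback only on low confidence."""

    def __init__(self, primary: OcrEngine, fallback: OcrEngine,
                 threshold: float = 0.6) -> None:
        # The threshold value is an assumption, not taken from the repository.
        self.primary = primary
        self.fallback = fallback
        self.threshold = threshold

    def recognize(self, image_bytes: bytes) -> OcrResult:
        result = self.primary.recognize(image_bytes)
        if result.confidence >= self.threshold:
            return result  # primary is good enough, skip the fallback call
        fallback_result = self.fallback.recognize(image_bytes)
        # Assumed policy: keep whichever result is more confident.
        return max(result, fallback_result, key=lambda r: r.confidence)


class FixedEngine:
    """Stub engine returning a canned result, for demonstration only."""

    def __init__(self, text: str, confidence: float) -> None:
        self._result = OcrResult(text, confidence)

    def recognize(self, image_bytes: bytes) -> OcrResult:
        return self._result


if __name__ == "__main__":
    hybrid = HybridEngine(FixedEngine("primary", 0.3), FixedEngine("fallback", 0.8))
    print(hybrid.recognize(b"").text)
```

Calling the fallback only when the primary falls below the threshold keeps cloud API usage, and therefore quota consumption, limited to images the primary engine struggles with.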