Initial Commit

Eric Gullickson
2025-09-17 16:09:15 -05:00
parent 0cdb9803de
commit a052040e3a
373 changed files with 437090 additions and 6773 deletions


@@ -0,0 +1,99 @@
# Vehicles Dropdown V2 - Manual JSON ETL Implementation
## Overview
This directory documents how to implement manual JSON processing in the MVP Platform Vehicles ETL system. The goal is to add the capability to process 55 JSON files containing vehicle data directly, bypassing the MSSQL source dependency.
## Quick Start for AI Instances
### Current State (As of Implementation Start)
- **55 JSON files** exist in `mvp-platform-services/vehicles/etl/sources/makes/`
- Current ETL only supports MSSQL → PostgreSQL pipeline
- Need to add JSON → PostgreSQL capability
### Key Files to Load for Context
```bash
# Load these files for complete understanding
mvp-platform-services/vehicles/etl/sources/makes/toyota.json # Large file example
mvp-platform-services/vehicles/etl/sources/makes/tesla.json # Electric vehicle example
mvp-platform-services/vehicles/etl/pipeline.py # Current pipeline
mvp-platform-services/vehicles/etl/loaders/postgres_loader.py # Current loader
mvp-platform-services/vehicles/sql/schema/001_schema.sql # Target schema
```
### Implementation Status
See [08-status-tracking.md](08-status-tracking.md) for current progress.
## Critical Requirements Discovered
### 1. Make Name Normalization
- JSON filenames: `alfa_romeo.json`, `land_rover.json`
- Database display: `"Alfa Romeo"`, `"Land Rover"` (spaces, title case)
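A minimal sketch of this filename-to-display-name conversion, assuming underscores separate words and plain title casing is enough for most makes. The function name `make_display_name` and the `DISPLAY_OVERRIDES` table are hypothetical, and the override entries are illustrative rather than taken from the source data.

```python
from pathlib import Path

# Hypothetical overrides for makes where plain title casing would be wrong.
# These entries are illustrative assumptions, not taken from the JSON files.
DISPLAY_OVERRIDES = {
    "bmw": "BMW",
    "gmc": "GMC",
}


def make_display_name(filename: str) -> str:
    """Convert a make filename such as 'alfa_romeo.json' to 'Alfa Romeo'."""
    stem = Path(filename).stem.lower()
    if stem in DISPLAY_OVERRIDES:
        return DISPLAY_OVERRIDES[stem]
    # Underscores become spaces, then each word is capitalized.
    return " ".join(word.capitalize() for word in stem.split("_") if word)


print(make_display_name("alfa_romeo.json"))  # Alfa Romeo
print(make_display_name("land_rover.json"))  # Land Rover
```

Acronyms and hyphenated makes do not survive simple title casing, which is why the sketch routes them through an explicit override table. The authoritative rules belong in [04-make-name-mapping.md](04-make-name-mapping.md).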
### 2. Engine Configuration Normalization
- **CRITICAL**: `L3` → `I3` (L-configuration treated as Inline)
- Standard format: `{displacement}L {config}{cylinders} {descriptions}`
- Examples: `"1.5L L3"` → `"1.5L I3"`; `"2.4L H4"` (Subaru Boxer)
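The normalization above could be sketched as follows, assuming every engine string starts with a displacement, a single-letter configuration, and a cylinder count. The names `EngineSpec`, `parse_engine`, and `format_engine` are hypothetical, and the regular expression is an assumption about the input shape, not a rule confirmed by the data.

```python
import re
from typing import NamedTuple, Optional


class EngineSpec(NamedTuple):
    displacement: str  # kept as text so "2.0" is not reformatted
    config: str        # single letter such as I, V, or H
    cylinders: int
    description: str   # trailing text, empty if none


# Assumed input shape: "<displacement>L <config><cylinders> <descriptions>"
ENGINE_RE = re.compile(
    r"^\s*(?P<disp>\d+(?:\.\d+)?)L\s+(?P<config>[A-Z])(?P<cyl>\d+)\s*(?P<desc>.*)$"
)


def parse_engine(raw: str) -> Optional[EngineSpec]:
    """Parse an engine string; return None if it does not match the pattern."""
    match = ENGINE_RE.match(raw)
    if match is None:
        return None
    config = match.group("config")
    if config == "L":
        config = "I"  # L-configuration is treated as Inline
    return EngineSpec(
        displacement=match.group("disp"),
        config=config,
        cylinders=int(match.group("cyl")),
        description=match.group("desc").strip(),
    )


def format_engine(spec: EngineSpec) -> str:
    """Render a spec in the standard format."""
    base = f"{spec.displacement}L {spec.config}{spec.cylinders}"
    return f"{base} {spec.description}" if spec.description else base


print(format_engine(parse_engine("1.5L L3")))  # 1.5L I3
print(format_engine(parse_engine("2.4L H4")))  # 2.4L H4
```

Returning `None` for unmatched strings leaves the caller to decide whether to skip, log, or fail. The full parsing rules are the subject of [03-engine-spec-parsing.md](03-engine-spec-parsing.md).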
### 3. Hybrid/Electric Patterns Found
- `"PLUG-IN HYBRID EV- (PHEV)"` - Plug-in hybrid
- `"FULL HYBRID EV- (FHEV)"` - Full hybrid
- `"ELECTRIC"` - Pure electric
- `"FLEX"` - Flex-fuel
- Empty engines arrays for Tesla/electric vehicles
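One way to handle these patterns is a substring lookup plus a fallback for empty engine arrays. This is a sketch under assumptions: the label strings, the `"unspecified"` fallback, and the `DEFAULT_ELECTRIC_ENGINE` placeholder are all hypothetical and not defined by the existing schema.

```python
from typing import List

# Marker strings are the patterns listed above; the labels are hypothetical.
FUEL_MARKERS = [
    ("PLUG-IN HYBRID EV- (PHEV)", "plug_in_hybrid"),
    ("FULL HYBRID EV- (FHEV)", "full_hybrid"),
    ("ELECTRIC", "electric"),
    ("FLEX", "flex_fuel"),
]

# Hypothetical placeholder used when a vehicle has no engines listed.
DEFAULT_ELECTRIC_ENGINE = "Electric Motor"


def classify_fuel(engine: str) -> str:
    """Return a fuel label for an engine string, or 'unspecified' if no marker matches."""
    upper = engine.upper()
    for marker, label in FUEL_MARKERS:
        if marker in upper:
            return label
    return "unspecified"


def engines_or_default(engines: List[str]) -> List[str]:
    """Substitute a default entry when the engines array is empty."""
    return list(engines) if engines else [DEFAULT_ELECTRIC_ENGINE]


print(classify_fuel("2.5L I4 PLUG-IN HYBRID EV- (PHEV)"))  # plug_in_hybrid
print(engines_or_default([]))  # ['Electric Motor']
```

Plain substring matching is deliberately simple. If the data contains descriptions where these markers overlap with unrelated words, a stricter match would be needed.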
### 4. Transmission Limitation
- **Manual selection only**: transmission is chosen manually as either Automatic or Manual
- **No automatic detection** from JSON data
## Document Structure
| File | Purpose | Status |
|------|---------|--------|
| [01-analysis-findings.md](01-analysis-findings.md) | JSON data patterns analysis | ⏳ Pending |
| [02-implementation-plan.md](02-implementation-plan.md) | Technical roadmap | ⏳ Pending |
| [03-engine-spec-parsing.md](03-engine-spec-parsing.md) | Engine parsing rules | ⏳ Pending |
| [04-make-name-mapping.md](04-make-name-mapping.md) | Make name normalization | ⏳ Pending |
| [05-database-schema-updates.md](05-database-schema-updates.md) | Schema change requirements | ⏳ Pending |
| [06-cli-commands.md](06-cli-commands.md) | New CLI command design | ⏳ Pending |
| [07-testing-strategy.md](07-testing-strategy.md) | Testing and validation approach | ⏳ Pending |
| [08-status-tracking.md](08-status-tracking.md) | Implementation progress tracker | ⏳ Pending |
## AI Handoff Instructions
### To Continue This Work:
1. **Read this README.md** - Current state and critical requirements
2. **Check [08-status-tracking.md](08-status-tracking.md)** - See what's completed/in-progress
3. **Review [02-implementation-plan.md](02-implementation-plan.md)** - Technical roadmap
4. **Load specific documentation** based on what you're implementing
### To Understand the Data:
1. **Load [01-analysis-findings.md](01-analysis-findings.md)** - JSON structure analysis
2. **Load [03-engine-spec-parsing.md](03-engine-spec-parsing.md)** - Engine parsing rules
3. **Load [04-make-name-mapping.md](04-make-name-mapping.md)** - Make name conversion rules
### To Start Coding:
1. **Check status tracker** - See what needs to be implemented next
2. **Load implementation plan** - Step-by-step technical guide
3. **Reference the `examples/` directory** - Code samples and patterns
## Success Criteria
- [ ] New CLI command: `python -m etl load-manual`
- [ ] Process all 55 JSON make files
- [ ] Proper make name normalization (`alfa_romeo.json` → `"Alfa Romeo"`)
- [ ] Engine spec parsing with L→I normalization
- [ ] Clear/append mode support with duplicate handling
- [ ] Electric vehicle support (default engines for empty arrays)
- [ ] Integration with existing PostgreSQL schema
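As a rough illustration of the first and fifth criteria, the `load-manual` subcommand could be wired up with `argparse` along these lines. The `--mode` and `--sources-dir` flags are hypothetical; the actual command design is deferred to [06-cli-commands.md](06-cli-commands.md).

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Build a parser exposing a 'load-manual' subcommand (flag names are assumptions)."""
    parser = argparse.ArgumentParser(prog="etl")
    subparsers = parser.add_subparsers(dest="command", required=True)

    load = subparsers.add_parser(
        "load-manual", help="Load JSON make files into PostgreSQL"
    )
    # Hypothetical flag: directory holding the per-make JSON files.
    load.add_argument(
        "--sources-dir",
        default="mvp-platform-services/vehicles/etl/sources/makes",
    )
    # Hypothetical flag: 'clear' replaces existing data, 'append' adds to it.
    load.add_argument("--mode", choices=["clear", "append"], default="append")
    return parser


args = build_parser().parse_args(["load-manual", "--mode", "clear"])
print(args.command, args.mode)  # load-manual clear
```

Restricting `--mode` to a fixed set of choices makes `argparse` reject typos before any data is touched, which matters when one of the modes is destructive.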
## Architecture Integration
This feature integrates with:
- **Existing ETL pipeline**: `mvp-platform-services/vehicles/etl/`
- **PostgreSQL schema**: `vehicles` schema with make/model/engine tables
- **Platform API**: Hierarchical dropdown endpoints remain unchanged
- **Application service**: No changes required
## Notes for Future Implementations
- Maintain compatibility with existing MSSQL pipeline
- Follow existing code patterns in `etl/` directory
- Use existing `PostgreSQLLoader` where possible
- Preserve referential integrity during data loading