
This directory documents how to implement manual JSON processing in the MVP Platform Vehicles ETL system. The goal is to process the 55 JSON files of vehicle data directly, bypassing the MSSQL source dependency.

Quick Start for AI Instances

Current State (As of Implementation Start)

55 JSON files exist in mvp-platform-services/vehicles/etl/sources/makes/
Current ETL only supports MSSQL → PostgreSQL pipeline
Need to add JSON → PostgreSQL capability

Key Files to Load for Context

# Load these files for complete understanding
mvp-platform-services/vehicles/etl/sources/makes/toyota.json  # Large file example
mvp-platform-services/vehicles/etl/sources/makes/tesla.json   # Electric vehicle example
mvp-platform-services/vehicles/etl/pipeline.py               # Current pipeline
mvp-platform-services/vehicles/etl/loaders/postgres_loader.py # Current loader
mvp-platform-services/vehicles/sql/schema/001_schema.sql     # Target schema

Implementation Status

See 08-status-tracking.md for current progress.

Critical Requirements Discovered

1. Make Name Normalization

JSON filenames: alfa_romeo.json, land_rover.json
Database display: "Alfa Romeo", "Land Rover" (spaces, title case)
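
The filename-to-display-name rule above can be sketched as follows. This is a minimal illustration, not the project's implementation: the function name `make_name_from_filename` is hypothetical, and it assumes that replacing underscores with spaces and capitalizing each word is enough. Any make whose display name is not plain title case would need an explicit override, which this sketch does not attempt.

```python
from pathlib import Path


def make_name_from_filename(filename: str) -> str:
    """Turn a make filename such as 'alfa_romeo.json' into 'Alfa Romeo'.

    Hypothetical helper: assumes underscores separate words and that
    plain title case is the desired display form.
    """
    stem = Path(filename).stem  # drops directories and the .json suffix
    words = [word for word in stem.split("_") if word]
    return " ".join(word.capitalize() for word in words)


if __name__ == "__main__":
    print(make_name_from_filename("alfa_romeo.json"))  # Alfa Romeo
    print(make_name_from_filename("land_rover.json"))  # Land Rover
```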

2. Engine Configuration Normalization

CRITICAL: an L configuration is treated as inline, so L3 is normalized to I3
Standard format: {displacement}L {config}{cylinders} {descriptions}
Examples: "1.5L L3" → "1.5L I3"; "2.4L H4" is left unchanged (Subaru boxer engine)
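
A regular expression is one way to apply the standard format and the L → I rule. The sketch below is an assumption about how such a parser might look: `ENGINE_RE` and `normalize_engine` are hypothetical names, and the pattern assumes a single uppercase configuration letter followed by the cylinder count. Strings that do not fit the format return `None` so the caller can decide how to handle them.

```python
import re
from typing import Optional

# Assumed shape: "{displacement}L {config}{cylinders} {descriptions}"
ENGINE_RE = re.compile(
    r"^\s*(?P<displacement>\d+(?:\.\d+)?)L\s+"
    r"(?P<config>[A-Z])(?P<cylinders>\d+)"
    r"(?:\s+(?P<description>.*\S))?\s*$"
)


def normalize_engine(spec: str) -> Optional[str]:
    """Return the spec in standard format, mapping an L configuration to I."""
    match = ENGINE_RE.match(spec)
    if match is None:
        return None  # not in the standard format (for example "ELECTRIC")
    config = match.group("config")
    if config == "L":
        config = "I"  # L-configuration is treated as inline
    normalized = f"{match.group('displacement')}L {config}{match.group('cylinders')}"
    description = match.group("description")
    if description:
        normalized += f" {description}"
    return normalized


if __name__ == "__main__":
    print(normalize_engine("1.5L L3"))  # 1.5L I3
    print(normalize_engine("2.4L H4"))  # 2.4L H4
```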

3. Hybrid/Electric Patterns Found

"PLUG-IN HYBRID EV- (PHEV)" - Plug-in hybrid
"FULL HYBRID EV- (FHEV)" - Full hybrid
"ELECTRIC" - Pure electric
"FLEX" - Flex-fuel
Empty engines arrays for Tesla/electric vehicles
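
The patterns listed above could be detected with simple substring checks. The following is a sketch under stated assumptions: the function names, the returned labels, and the `"Electric Motor"` placeholder are all hypothetical, and the real default for empty engine arrays is not specified here.

```python
from typing import List

# Hypothetical placeholder used when a vehicle has an empty engines array.
DEFAULT_ELECTRIC_ENGINE = "Electric Motor"


def classify_fuel_type(engine_spec: str) -> str:
    """Map an engine string to a coarse fuel-type label (labels are assumptions)."""
    text = engine_spec.upper()
    if "PHEV" in text or "PLUG-IN HYBRID" in text:
        return "plug-in hybrid"
    if "FHEV" in text or "FULL HYBRID" in text:
        return "full hybrid"
    if "ELECTRIC" in text:
        return "electric"
    if "FLEX" in text:
        return "flex-fuel"
    return "unspecified"  # none of the known patterns matched


def engines_or_default(engines: List[str]) -> List[str]:
    """Return the engines unchanged, or a one-item default list when empty."""
    return list(engines) if engines else [DEFAULT_ELECTRIC_ENGINE]


if __name__ == "__main__":
    print(classify_fuel_type("PLUG-IN HYBRID EV- (PHEV)"))  # plug-in hybrid
    print(engines_or_default([]))  # ['Electric Motor']
```

The plug-in and full hybrid checks run before the plain electric check so that the more specific label is chosen first.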

4. Transmission Limitation

Transmission is selected by the user: a choice between Automatic and Manual
It is not detected automatically from the JSON data

Document Structure

| File | Purpose | Status |
| --- | --- | --- |
| 01-analysis-findings.md | JSON data patterns analysis | ⏳ Pending |
| 02-implementation-plan.md | Technical roadmap | ⏳ Pending |
| 03-engine-spec-parsing.md | Engine parsing rules | ⏳ Pending |
| 04-make-name-mapping.md | Make name normalization | ⏳ Pending |
| 05-database-schema-updates.md | Schema change requirements | ⏳ Pending |
| 06-cli-commands.md | New CLI command design | ⏳ Pending |
| 07-testing-strategy.md | Testing and validation approach | ⏳ Pending |
| 08-status-tracking.md | Implementation progress tracker | ⏳ Pending |

AI Handoff Instructions

To Continue This Work:

Read this README.md - Current state and critical requirements
Check 08-status-tracking.md - See what's completed/in-progress
Review 02-implementation-plan.md - Technical roadmap
Load specific documentation based on what you're implementing

To Understand the Data:

Load 01-analysis-findings.md - JSON structure analysis
Load 03-engine-spec-parsing.md - Engine parsing rules
Load 04-make-name-mapping.md - Make name conversion rules

To Start Coding:

Check status tracker - See what needs to be implemented next
Load implementation plan - Step-by-step technical guide
Reference examples/ directory - Code samples and patterns

Success Criteria

New CLI command: python -m etl load-manual
Process all 55 JSON make files
Proper make name normalization (alfa_romeo.json → "Alfa Romeo")
Engine spec parsing with L→I normalization
Clear/append mode support with duplicate handling
Electric vehicle support (default engines for empty arrays)
Integration with existing PostgreSQL schema
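
One possible shape for the `load-manual` command is sketched below with `argparse`. Only the command name and the clear/append modes come from the criteria above. The `--mode` and `--sources-dir` flags, their defaults, and the `build_parser` function are assumptions for illustration; the actual design belongs in 06-cli-commands.md.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Build a parser exposing a hypothetical 'load-manual' subcommand."""
    parser = argparse.ArgumentParser(prog="etl")
    subparsers = parser.add_subparsers(dest="command", required=True)

    load_manual = subparsers.add_parser(
        "load-manual", help="Load vehicle data from JSON make files"
    )
    # Hypothetical flag: where the JSON make files live.
    load_manual.add_argument(
        "--sources-dir",
        default="mvp-platform-services/vehicles/etl/sources/makes",
    )
    # Hypothetical flag: 'clear' replaces existing rows, 'append' adds to them.
    load_manual.add_argument(
        "--mode", choices=["clear", "append"], default="append"
    )
    return parser


if __name__ == "__main__":
    args = build_parser().parse_args(["load-manual", "--mode", "clear"])
    print(args.command, args.mode)  # load-manual clear
```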

Architecture Integration

This feature integrates with:

Existing ETL pipeline: mvp-platform-services/vehicles/etl/
PostgreSQL schema: vehicles schema with make/model/engine tables
Platform API: Hierarchical dropdown endpoints remain unchanged
Application service: No changes required

Notes for Future Implementations

Maintain compatibility with existing MSSQL pipeline
Follow existing code patterns in etl/ directory
Use existing PostgreSQLLoader where possible
Preserve referential integrity during data loading

README.md

Vehicles Dropdown V2 - Manual JSON ETL Implementation

Overview