motovaultpro/docs/changes/vehicles-dropdown-v2/README.md
Eric Gullickson a052040e3a Initial Commit
2025-09-17 16:09:15 -05:00

Vehicles Dropdown V2 - Manual JSON ETL Implementation

Overview

This directory documents how to implement manual JSON processing in the MVP Platform Vehicles ETL system. The goal is to add the capability to process 55 JSON files of vehicle data directly, removing the dependency on the MSSQL source.

Quick Start for AI Instances

Current State (As of Implementation Start)

  • 55 JSON files exist in mvp-platform-services/vehicles/etl/sources/makes/
  • Current ETL only supports MSSQL → PostgreSQL pipeline
  • Need to add JSON → PostgreSQL capability
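As a starting point for the JSON path, the make files can be enumerated with the standard library. This is only a sketch: the function name `find_make_files` is hypothetical, and it assumes that every `*.json` file in the sources directory is a make file.

```python
from pathlib import Path


def find_make_files(sources_dir):
    """Return all *.json files in sources_dir, sorted by filename.

    Hypothetical helper: assumes one JSON file per make, as described above.
    """
    return sorted(Path(sources_dir).glob("*.json"))


if __name__ == "__main__":
    # Path taken from this README; adjust to the local checkout.
    files = find_make_files("mvp-platform-services/vehicles/etl/sources/makes")
    print(f"Found {len(files)} make files")
```

Sorting keeps the load order deterministic, which makes repeated runs easier to compare.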

Key Files to Load for Context

# Load these files for complete understanding
mvp-platform-services/vehicles/etl/sources/makes/toyota.json  # Large file example
mvp-platform-services/vehicles/etl/sources/makes/tesla.json   # Electric vehicle example
mvp-platform-services/vehicles/etl/pipeline.py               # Current pipeline
mvp-platform-services/vehicles/etl/loaders/postgres_loader.py # Current loader
mvp-platform-services/vehicles/sql/schema/001_schema.sql     # Target schema

Implementation Status

See 08-status-tracking.md for current progress.

Critical Requirements Discovered

1. Make Name Normalization

  • JSON filenames: alfa_romeo.json, land_rover.json
  • Database display: "Alfa Romeo", "Land Rover" (spaces, title case)
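The filename-to-display-name rule could be sketched as follows. The function name and the `DISPLAY_OVERRIDES` table are assumptions for illustration: plain title casing would turn an acronym-style make into something like "Bmw", so some override mechanism is likely needed, but the entries shown here are examples only and are not taken from the actual data set.

```python
from pathlib import Path

# Hypothetical overrides for makes that plain title casing gets wrong.
# These entries are illustrative and not confirmed against the 55 files.
DISPLAY_OVERRIDES = {
    "bmw": "BMW",
    "gmc": "GMC",
}


def make_name_from_filename(filename):
    """Convert a make filename such as 'alfa_romeo.json' to 'Alfa Romeo'."""
    stem = Path(filename).stem.lower()
    if stem in DISPLAY_OVERRIDES:
        return DISPLAY_OVERRIDES[stem]
    # Underscores become spaces and each word is capitalized.
    return " ".join(word.capitalize() for word in stem.split("_") if word)


if __name__ == "__main__":
    for name in ("alfa_romeo.json", "land_rover.json"):
        print(name, "->", make_name_from_filename(name))
```

The authoritative mapping rules belong in 04-make-name-mapping.md; this sketch covers only the two examples listed above.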

2. Engine Configuration Normalization

  • CRITICAL: L3 → I3 (L-configuration treated as Inline)
  • Standard format: {displacement}L {config}{cylinders} {descriptions}
  • Examples: "1.5L L3" → "1.5L I3"; "2.4L H4" stays as is (Subaru Boxer)
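One way to apply the L-to-I rule is a small regex-based normalizer. This is a minimal sketch, assuming that the raw engine strings begin with a displacement followed by a single configuration letter and a cylinder count; the regex and the function name `normalize_engine_spec` are hypothetical, and strings that do not match are returned unchanged rather than guessed at.

```python
import re

# Assumed shape: "<displacement>L <config letter><cylinders> <descriptions>"
_ENGINE_RE = re.compile(r"^\s*(\d+(?:\.\d+)?)L\s+([A-Za-z])(\d+)\b\s*(.*)$")


def normalize_engine_spec(raw):
    """Rewrite an engine string into the standard format, mapping L to I."""
    match = _ENGINE_RE.match(raw)
    if match is None:
        # Unknown shape: leave it for a human or a later rule to handle.
        return raw.strip()
    displacement, config, cylinders, description = match.groups()
    config = config.upper()
    if config == "L":
        config = "I"  # L-configuration is treated as Inline
    parts = [f"{displacement}L", f"{config}{cylinders}"]
    description = description.strip()
    if description:
        parts.append(description)
    return " ".join(parts)


if __name__ == "__main__":
    for spec in ("1.5L L3", "2.4L H4"):
        print(repr(spec), "->", repr(normalize_engine_spec(spec)))
```

Other configuration letters, such as H for boxer engines, pass through untouched. The full parsing rules are expected in 03-engine-spec-parsing.md.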

3. Hybrid/Electric Patterns Found

  • "PLUG-IN HYBRID EV- (PHEV)" - Plug-in hybrid
  • "FULL HYBRID EV- (FHEV)" - Full hybrid
  • "ELECTRIC" - Pure electric
  • "FLEX" - Flex-fuel
  • Empty engines arrays for Tesla/electric vehicles
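The patterns above can be detected with simple substring checks, and empty engines arrays can be replaced with a default entry. The sketch below makes several assumptions: the markers appear verbatim inside the engine text, the returned labels are arbitrary, and the default engine name "Electric Motor" is a placeholder that is not taken from the schema.

```python
# Marker strings are from the list above; labels are illustrative.
POWERTRAIN_MARKERS = (
    ("PLUG-IN HYBRID EV- (PHEV)", "plug-in hybrid"),
    ("FULL HYBRID EV- (FHEV)", "full hybrid"),
    ("ELECTRIC", "electric"),
    ("FLEX", "flex-fuel"),
)


def classify_powertrain(engine_text):
    """Return a powertrain label for the first marker found, else None."""
    upper = engine_text.upper()
    for marker, label in POWERTRAIN_MARKERS:
        if marker in upper:
            return label
    return None


def engines_or_default(engines, default="Electric Motor"):
    """Return the engines list, or a one-item default when it is empty.

    The default name is a hypothetical placeholder.
    """
    return list(engines) if engines else [default]


if __name__ == "__main__":
    print(classify_powertrain("2.5L I4 FULL HYBRID EV- (FHEV)"))
    print(engines_or_default([]))
```

Checking the longer, more specific markers first avoids a shorter marker matching by accident if the marker list grows.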

4. Transmission Limitation

  • Manual selection only: the user chooses Automatic or Manual
  • No automatic detection from JSON data

Document Structure

| File | Purpose | Status |
|------|---------|--------|
| 01-analysis-findings.md | JSON data patterns analysis | Pending |
| 02-implementation-plan.md | Technical roadmap | Pending |
| 03-engine-spec-parsing.md | Engine parsing rules | Pending |
| 04-make-name-mapping.md | Make name normalization | Pending |
| 05-database-schema-updates.md | Schema change requirements | Pending |
| 06-cli-commands.md | New CLI command design | Pending |
| 07-testing-strategy.md | Testing and validation approach | Pending |
| 08-status-tracking.md | Implementation progress tracker | Pending |

AI Handoff Instructions

To Continue This Work:

  1. Read this README.md - Current state and critical requirements
  2. Check 08-status-tracking.md - See what's completed/in-progress
  3. Review 02-implementation-plan.md - Technical roadmap
  4. Load specific documentation based on what you're implementing

To Understand the Data:

  1. Load 01-analysis-findings.md - JSON structure analysis
  2. Load 03-engine-spec-parsing.md - Engine parsing rules
  3. Load 04-make-name-mapping.md - Make name conversion rules

To Start Coding:

  1. Check status tracker - See what needs to be implemented next
  2. Load implementation plan - Step-by-step technical guide
  3. Reference examples/ directory - Code samples and patterns

Success Criteria

  • New CLI command: python -m etl load-manual
  • Process all 55 JSON make files
  • Proper make name normalization (alfa_romeo.json → "Alfa Romeo")
  • Engine spec parsing with L→I normalization
  • Clear/append mode support with duplicate handling
  • Electric vehicle support (default engines for empty arrays)
  • Integration with existing PostgreSQL schema
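The `load-manual` command and its clear/append modes might be wired up with `argparse` along these lines. Only the command name comes from this README; the `--mode` and `--sources-dir` options, their defaults, and the `build_parser` function are hypothetical placeholders until 06-cli-commands.md defines the real interface.

```python
import argparse


def build_parser():
    """Build a parser with a 'load-manual' subcommand (option names assumed)."""
    parser = argparse.ArgumentParser(prog="etl")
    subparsers = parser.add_subparsers(dest="command", required=True)

    load_manual = subparsers.add_parser(
        "load-manual", help="Load vehicle data from JSON make files"
    )
    # Hypothetical option: clear existing rows first, or append with
    # duplicate handling.
    load_manual.add_argument(
        "--mode", choices=("clear", "append"), default="append"
    )
    # Hypothetical option: where the JSON make files live.
    load_manual.add_argument(
        "--sources-dir",
        default="mvp-platform-services/vehicles/etl/sources/makes",
    )
    return parser


if __name__ == "__main__":
    args = build_parser().parse_args()
    print(f"command={args.command} mode={args.mode} sources={args.sources_dir}")
```

Defaulting to append is a conservative choice in this sketch, since a clear run is destructive and arguably should be requested explicitly.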

Architecture Integration

This feature integrates with:

  • Existing ETL pipeline: mvp-platform-services/vehicles/etl/
  • PostgreSQL schema: vehicles schema with make/model/engine tables
  • Platform API: Hierarchical dropdown endpoints remain unchanged
  • Application service: No changes required

Notes for Future Implementations

  • Maintain compatibility with existing MSSQL pipeline
  • Follow existing code patterns in etl/ directory
  • Use existing PostgreSQLLoader where possible
  • Preserve referential integrity during data loading