User Import Feature

Provides user data import functionality, allowing authenticated users to restore previously exported data or migrate data from external sources. Supports two import modes: merge (update existing, add new) and replace (complete data replacement).

Overview

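This feature processes TAR.GZ archives containing user data in JSON format plus associated files (vehicle images, document PDFs). The import validates the archive structure, detects VIN conflicts, and writes records in batches of 100. Merge mode supports partial success; replace mode is all-or-nothing. Re-importing the same archive is idempotent only in replace mode: merge mode can create duplicate fuel logs, documents, and maintenance records because they have no natural key (see Duplicate Prevention).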

Architecture

user-import/
├── domain/
│   ├── user-import.types.ts            # Type definitions and constants
│   ├── user-import.service.ts          # Main import orchestration service
│   └── user-import-archive.service.ts  # Archive extraction and validation
├── api/
│   ├── user-import.controller.ts       # HTTP handlers for multipart uploads
│   ├── user-import.routes.ts           # Route definitions
│   └── user-import.validation.ts       # Request validation schemas
└── tests/
    └── user-import.integration.test.ts # End-to-end integration tests

Data Flow

┌─────────────────┐
│  User uploads   │
│  tar.gz archive │
└────────┬────────┘
         │
         ▼
┌─────────────────────────────────────┐
│ UserImportArchiveService            │
│ - Extract to /tmp/user-import-work/ │
│ - Validate manifest.json            │
│ - Validate data files structure     │
│ - Detect VIN conflicts              │
└────────┬────────────────────────────┘
         │
         ▼
┌─────────────────────────────────────┐
│ UserImportService                   │
│ - Generate preview (optional)       │
│ - Execute merge or replace mode     │
│ - Batch operations (100 per chunk)  │
│ - Copy files to storage             │
└────────┬────────────────────────────┘
         │
         ▼
┌─────────────────────────────────────┐
│ Repositories (Batch Operations)     │
│ - VehiclesRepository.batchInsert()  │
│ - FuelLogsRepository.batchInsert()  │
│ - MaintenanceRepo.batchInsert*()    │
│ - DocumentsRepository.batchInsert() │
└─────────────────────────────────────┘

Import Modes

Merge Mode (Default)

  • UPDATE existing vehicles by VIN match
  • INSERT new vehicles without VIN match
  • INSERT all fuel logs, documents, and maintenance records (no duplicate detection; see Duplicate Prevention)
  • Partial success: continues on errors, reports in summary
  • User data preserved if import fails

Use Cases:

  • Restoring data after device migration
  • Adding records from external source
  • Merging data from multiple backups

Replace Mode

  • DELETE all existing user data
  • INSERT all records from archive
  • All-or-nothing transaction (ROLLBACK on any failure)
  • Complete data replacement

Use Cases:

  • Clean slate restore from backup
  • Testing with known dataset
  • Disaster recovery

Archive Structure

Expected structure (created by user-export feature):

motovaultpro_export_YYYY-MM-DDTHH-MM-SS.tar.gz
├── manifest.json                       # Archive metadata (version, counts)
├── data/
│   ├── vehicles.json                   # Vehicle records
│   ├── fuel-logs.json                  # Fuel log records
│   ├── documents.json                  # Document metadata
│   ├── maintenance-records.json        # Maintenance records
│   └── maintenance-schedules.json      # Maintenance schedules
└── files/                              # Optional
    ├── vehicle-images/
    │   └── {vehicleId}/
    │       └── {filename}              # Actual vehicle image files
    └── documents/
        └── {documentId}/
            └── {filename}              # Actual document files
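
The structure check can be sketched as a pure function over the list of extracted paths. This is a minimal sketch, not the service's actual code: the function name and return shape are hypothetical, and treating all five data files as required is an assumption (the tree above only marks files/ as optional).

```typescript
// Entry names are taken from the tree above. Treating every data file as
// required is an assumption; only files/ is documented as optional.
const REQUIRED_ENTRIES = [
  "manifest.json",
  "data/vehicles.json",
  "data/fuel-logs.json",
  "data/documents.json",
  "data/maintenance-records.json",
  "data/maintenance-schedules.json",
] as const;

// Hypothetical helper: returns the required entries that are absent from the
// extracted archive. Paths are compared relative to the archive root.
function findMissingEntries(extractedPaths: readonly string[]): string[] {
  const present = new Set(extractedPaths.map((p) => p.replace(/^\.\//, "")));
  return REQUIRED_ENTRIES.filter((entry) => !present.has(entry));
}
```

An empty result means the archive has the expected layout; a non-empty result can be reported as a validation error before any database work starts.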

API Endpoints

Import User Data

Uploads and imports a user data archive.

Endpoint: POST /api/user/import

Authentication: Required (JWT)

Request:

  • Content-Type: multipart/form-data
  • Body Fields:
    • file: tar.gz archive (required)
    • mode: "merge" or "replace" (optional, defaults to "merge")
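
The mode field's default can be handled with a small parser. The sketch below is hypothetical and does not claim to match the schemas in user-import.validation.ts; it only illustrates the documented rule that a missing mode means "merge".

```typescript
type ImportMode = "merge" | "replace";

// Hypothetical parser for the multipart "mode" field.
// A missing or empty value falls back to the documented default, "merge".
function parseImportMode(raw: string | undefined): ImportMode {
  if (raw === undefined || raw === "") return "merge";
  if (raw === "merge" || raw === "replace") return raw;
  throw new Error(`Invalid import mode: ${raw}`);
}
```

Rejecting unknown values instead of silently defaulting avoids running a merge when the caller mistyped "replace", or the reverse.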

Response:

{
  "success": true,
  "mode": "merge",
  "summary": {
    "imported": 150,
    "updated": 5,
    "skipped": 0,
    "errors": []
  },
  "warnings": [
    "2 vehicle images not found in archive"
  ]
}

Example:

curl -X POST \
  -H "Authorization: Bearer <token>" \
  -F "file=@motovaultpro_export_2025-01-11T10-00-00.tar.gz" \
  -F "mode=merge" \
  https://app.motovaultpro.com/api/user/import

Generate Import Preview

Analyzes archive and generates preview without executing import.

Endpoint: POST /api/user/import/preview

Authentication: Required (JWT)

Request:

  • Content-Type: multipart/form-data
  • Body Fields:
    • file: tar.gz archive (required)

Response:

{
  "manifest": {
    "version": "1.0.0",
    "createdAt": "2025-01-11T10:00:00.000Z",
    "userId": "auth0|123456",
    "contents": {
      "vehicles": { "count": 3, "withImages": 2 },
      "fuelLogs": { "count": 150 },
      "documents": { "count": 10, "withFiles": 8 },
      "maintenanceRecords": { "count": 25 },
      "maintenanceSchedules": { "count": 5 }
    },
    "files": {
      "vehicleImages": 2,
      "documentFiles": 8,
      "totalSizeBytes": 5242880
    },
    "warnings": []
  },
  "conflicts": {
    "vehicles": 2
  },
  "sampleRecords": {
    "vehicles": [ {...}, {...}, {...} ],
    "fuelLogs": [ {...}, {...}, {...} ]
  }
}
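
For client code, the preview response can be typed roughly as follows. The interfaces are inferred from the sample response above, so field optionality is an assumption, and the two helper functions are hypothetical.

```typescript
// Shapes inferred from the sample response; not copied from user-import.types.ts.
interface ManifestContents {
  vehicles: { count: number; withImages: number };
  fuelLogs: { count: number };
  documents: { count: number; withFiles: number };
  maintenanceRecords: { count: number };
  maintenanceSchedules: { count: number };
}

interface ImportManifest {
  version: string;
  createdAt: string;
  userId: string; // informational only; the server uses the authenticated user's ID
  contents: ManifestContents;
  files: { vehicleImages: number; documentFiles: number; totalSizeBytes: number };
  warnings: string[];
}

interface ImportPreview {
  manifest: ImportManifest;
  conflicts: { vehicles: number };
  sampleRecords: { vehicles: unknown[]; fuelLogs: unknown[] };
}

// Hypothetical helper: total number of records the archive claims to contain.
function totalRecordCount(contents: ManifestContents): number {
  return (
    contents.vehicles.count +
    contents.fuelLogs.count +
    contents.documents.count +
    contents.maintenanceRecords.count +
    contents.maintenanceSchedules.count
  );
}

// Hypothetical helper: one-line summary to show before the user confirms.
function describePreview(preview: ImportPreview): string {
  const total = totalRecordCount(preview.manifest.contents);
  return `${total} records, ${preview.conflicts.vehicles} VIN conflicts`;
}
```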

Batch Operations Performance

Why Batch Operations First?

The user-import feature was built on batch operations added to repositories as a prerequisite. This architectural decision provides:

  1. Performance: Single SQL INSERT for 100 records vs 100 individual INSERTs
  2. Transaction Efficiency: Reduced round-trips to database
  3. Memory Management: Chunked processing prevents memory exhaustion on large datasets
  4. Scalability: Handles 1000+ vehicles, 5000+ fuel logs efficiently
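
To make the first point concrete, a multi-row INSERT with pg-style numbered placeholders can be built like this. It is a minimal sketch, not the repositories' implementation: buildBatchInsert and the table and column names in the usage note are hypothetical. Table and column names are interpolated directly into the SQL, so they must come from code, never from the archive.

```typescript
interface BatchInsertQuery {
  text: string;      // SQL with $1..$n placeholders
  values: unknown[]; // flattened parameter list in placeholder order
}

// Hypothetical builder: one INSERT statement covering every row in the chunk.
function buildBatchInsert(
  table: string,
  columns: readonly string[],
  rows: readonly (readonly unknown[])[],
): BatchInsertQuery {
  if (rows.length === 0) throw new Error("rows must not be empty");
  const values: unknown[] = [];
  const tuples = rows.map((row) => {
    if (row.length !== columns.length) {
      throw new Error("row length does not match column count");
    }
    const placeholders = row.map((value) => {
      values.push(value);
      return `$${values.length}`;
    });
    return `(${placeholders.join(", ")})`;
  });
  const text = `INSERT INTO ${table} (${columns.join(", ")}) VALUES ${tuples.join(", ")}`;
  return { text, values };
}
```

For example, buildBatchInsert("vehicles", ["user_id", "vin"], [["u1", "A"], ["u1", "B"]]) yields the text `INSERT INTO vehicles (user_id, vin) VALUES ($1, $2), ($3, $4)` with four values. PostgreSQL allows at most 65535 bind parameters per statement, so a chunk of 100 rows stays well under the limit for any realistic column count.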

Performance Benchmarks:

  • 1000 vehicles: <10 seconds (batch) vs ~60 seconds (individual)
  • 5000 fuel logs: <10 seconds (batch) vs ~120 seconds (individual)
  • Large dataset (1000 vehicles + 5000 logs + 100 docs): <30 seconds total

Repository Batch Methods

  • VehiclesRepository.batchInsert(vehicles[], client?)
  • FuelLogsRepository.batchInsert(fuelLogs[], client?)
  • MaintenanceRepository.batchInsertRecords(records[], client?)
  • MaintenanceRepository.batchInsertSchedules(schedules[], client?)
  • DocumentsRepository.batchInsert(documents[], client?)

All batch methods accept an optional PoolClient so that replace mode can run them inside a single transaction.

Conflict Resolution

VIN Conflicts (Merge Mode Only)

When importing vehicles with VINs that already exist in the database:

  1. Detection: Query database for existing VINs before import
  2. Resolution: UPDATE existing vehicle with new data (preserves vehicle ID)
  3. Reporting: Count conflicts in preview, track updates in summary
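
The detection and resolution steps amount to splitting the incoming vehicles into an update set and an insert set. The sketch below assumes exact string matching on VIN and a hypothetical function name; whether the service normalizes VINs (case, whitespace) is not stated here.

```typescript
// Hypothetical helper: vehicles whose VIN already exists are updated in place,
// everything else (including vehicles with no VIN) is inserted as new.
function partitionByVin<T extends { vin?: string | null }>(
  incoming: readonly T[],
  existingVins: ReadonlySet<string>,
): { toUpdate: T[]; toInsert: T[] } {
  const toUpdate: T[] = [];
  const toInsert: T[] = [];
  for (const vehicle of incoming) {
    if (vehicle.vin && existingVins.has(vehicle.vin)) {
      toUpdate.push(vehicle);
    } else {
      toInsert.push(vehicle);
    }
  }
  return { toUpdate, toInsert };
}
```

The length of toUpdate is the number the preview would report as vehicle conflicts, and the number the summary would later report as updated.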

Tradeoffs:

  • Merge Mode: Preserves related data (fuel logs, documents linked to vehicle ID)
  • Replace Mode: No conflicts (all data deleted first), clean slate

Duplicate Prevention

  • Fuel logs: No natural key, duplicates may occur if archive imported multiple times
  • Documents: No natural key, duplicates may occur
  • Maintenance: No natural key, duplicates may occur

Recommendation: Use replace mode for clean imports, merge mode only for incremental updates.

Implementation Details

User Scoping

All data is strictly scoped to the authenticated user via userId. The archive manifest's userId is informational only; every imported record is written with the authenticated user's ID.
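
One way to enforce this is to overwrite the owner field on every record before it reaches a repository. The sketch below is illustrative only: scopeToUser is a hypothetical name, and the assumption that records carry a userId property comes from this section, not from the type definitions.

```typescript
// Hypothetical helper: discards whatever userId the archive carried and
// stamps each record with the authenticated user's ID instead.
function scopeToUser<T extends object>(
  records: readonly T[],
  authenticatedUserId: string,
): (T & { userId: string })[] {
  return records.map((record) => ({ ...record, userId: authenticatedUserId }));
}
```

Because the spread places userId last, a userId embedded in the archive can never win over the authenticated one.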

File Handling

  • Vehicle images: Copied from archive /files/vehicle-images/{vehicleId}/{filename} to storage
  • Document files: Copied from archive /files/documents/{documentId}/{filename} to storage
  • Missing files are logged as warnings but don't fail import
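
The warning behaviour can be sketched as a comparison between the files the metadata refers to and the files actually present in the archive. The function name and the wording of the document warning are assumptions; the vehicle image wording follows the sample import response.

```typescript
interface ExpectedFiles {
  vehicleImages: readonly string[]; // e.g. "files/vehicle-images/{vehicleId}/{filename}"
  documentFiles: readonly string[]; // e.g. "files/documents/{documentId}/{filename}"
}

// Hypothetical helper: missing files become warnings, never errors,
// so the record import itself can still succeed.
function missingFileWarnings(
  expected: ExpectedFiles,
  presentInArchive: ReadonlySet<string>,
): string[] {
  const warnings: string[] = [];
  const missingImages = expected.vehicleImages.filter((p) => !presentInArchive.has(p)).length;
  const missingDocs = expected.documentFiles.filter((p) => !presentInArchive.has(p)).length;
  if (missingImages > 0) {
    warnings.push(`${missingImages} vehicle images not found in archive`);
  }
  if (missingDocs > 0) {
    warnings.push(`${missingDocs} document files not found in archive`);
  }
  return warnings;
}
```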

Temporary Storage

  • Archive extracted to: /tmp/user-import-work/import-{userId}-{timestamp}/
  • Cleanup happens automatically after import (success or failure)
  • Upload temp files: /tmp/import-upload-{userId}-{timestamp}.tar.gz
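
Cleanup on both success and failure is usually done with try/finally. The following is a sketch under assumptions: withImportWorkDir is a hypothetical helper, and replacing characters such as "|" in the user ID before using it in a path is an assumption, not documented behaviour.

```typescript
import { mkdir, rm } from "node:fs/promises";
import * as path from "node:path";

// Hypothetical helper: creates import-{userId}-{timestamp}/ under baseDir,
// runs the work, and removes the directory whether the work resolves or throws.
async function withImportWorkDir<T>(
  baseDir: string,
  userId: string,
  work: (dir: string) => Promise<T>,
): Promise<T> {
  // Assumption: IDs like "auth0|123456" are sanitized for use in a path.
  const safeUserId = userId.replace(/[^a-zA-Z0-9_-]/g, "_");
  const dir = path.join(baseDir, `import-${safeUserId}-${Date.now()}`);
  await mkdir(dir, { recursive: true });
  try {
    return await work(dir);
  } finally {
    await rm(dir, { recursive: true, force: true });
  }
}
```

A caller would pass "/tmp/user-import-work" as baseDir and do the extraction and import inside the callback, so no code path can skip the cleanup.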

Chunking Strategy

  • Default chunk size: 100 records per batch
  • Configurable via USER_IMPORT_CONFIG.chunkSize
  • Processes all chunks sequentially (maintains order)
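
Chunking itself is a small utility. In this sketch the USER_IMPORT_CONFIG object is a local stand-in for the real constant, and chunk is a hypothetical function name.

```typescript
// Local stand-in for the real USER_IMPORT_CONFIG constant.
const USER_IMPORT_CONFIG = { chunkSize: 100 };

// Splits items into consecutive slices of at most `size` elements,
// preserving the original order.
function chunk<T>(items: readonly T[], size: number): T[][] {
  if (!Number.isInteger(size) || size <= 0) {
    throw new RangeError("chunk size must be a positive integer");
  }
  const chunks: T[][] = [];
  for (let i = 0; i < items.length; i += size) {
    chunks.push(items.slice(i, i + size));
  }
  return chunks;
}
```

With the default size, 250 records become three chunks of 100, 100, and 50, which are then inserted one after another.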

Error Handling

Merge Mode:

  • Partial success: continues on chunk errors
  • Errors collected in summary.errors[]
  • Returns success: false if any errors occurred

Replace Mode:

  • All-or-nothing: transaction ROLLBACK on any error
  • Original data preserved on failure
  • Throws error to caller
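
The two strategies can be contrasted in a short sketch. Everything here is hypothetical (importMerge, importReplace, the TxClient interface, and the callback parameters); TxClient only mirrors the query method that a pg PoolClient provides.

```typescript
interface MergeResult {
  success: boolean;
  summary: { imported: number; errors: string[] };
}

// Merge mode: a failing chunk is recorded and the loop continues.
async function importMerge<T>(
  chunks: readonly (readonly T[])[],
  insertChunk: (chunk: readonly T[]) => Promise<number>,
): Promise<MergeResult> {
  const summary = { imported: 0, errors: [] as string[] };
  let index = 0;
  for (const current of chunks) {
    try {
      const inserted = await insertChunk(current);
      summary.imported += inserted;
    } catch (err) {
      const message = err instanceof Error ? err.message : String(err);
      summary.errors.push(`chunk ${index}: ${message}`);
    }
    index += 1;
  }
  return { success: summary.errors.length === 0, summary };
}

// Minimal stand-in for the part of pg's PoolClient used below.
interface TxClient {
  query(sql: string): Promise<unknown>;
}

// Replace mode: delete and insert inside one transaction; any error rolls
// everything back and is rethrown to the caller.
async function importReplace<T>(
  client: TxClient,
  chunks: readonly (readonly T[])[],
  insertChunk: (client: TxClient, chunk: readonly T[]) => Promise<number>,
  deleteExistingData: (client: TxClient) => Promise<void> = async () => {},
): Promise<number> {
  await client.query("BEGIN");
  try {
    await deleteExistingData(client);
    let imported = 0;
    for (const current of chunks) {
      imported += await insertChunk(client, current);
    }
    await client.query("COMMIT");
    return imported;
  } catch (err) {
    await client.query("ROLLBACK");
    throw err;
  }
}
```

Because the delete runs inside the same transaction as the inserts, a rollback also restores the rows that were deleted, which is what keeps the original data intact on failure.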

Dependencies

Internal

  • VehiclesRepository - Vehicle data access and batch insert
  • FuelLogsRepository - Fuel log data access and batch insert
  • DocumentsRepository - Document metadata access and batch insert
  • MaintenanceRepository - Maintenance data access and batch insert
  • StorageService - File storage for vehicle images and documents

External

  • tar - TAR.GZ archive extraction
  • file-type - Magic byte validation for uploaded archives
  • fs/promises - File system operations
  • pg (Pool, PoolClient) - Database transactions

Testing

Unit Tests

  • Archive validation logic
  • Manifest structure validation
  • Data file parsing
  • Conflict detection

Integration Tests

See tests/user-import.integration.test.ts:

  • End-to-end: Export → Modify → Import cycle
  • Performance: 1000 vehicles in <10s, 5000 fuel logs in <10s
  • Large dataset: 1000 vehicles + 5000 logs + 100 docs without memory exhaustion
  • Conflict resolution: VIN matches update existing vehicles
  • Replace mode: Complete deletion and re-import
  • Partial failure: Valid records imported despite some errors
  • Archive validation: Version check, missing files detection
  • Preview generation: Conflict detection and sample records

Run Tests:

npm test user-import.integration.test.ts

Security Considerations

  • User authentication required (JWT)
  • Data strictly scoped to authenticated user (archive manifest userId ignored)
  • Magic byte validation prevents non-gzip uploads
  • Archive version validation prevents incompatible imports
  • Temporary files cleaned up after processing
  • No cross-user data leakage possible
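
The magic byte check relies on the fact that every gzip stream starts with the bytes 0x1f 0x8b. The feature uses the file-type package for this; the hand-rolled function below is only an illustration of the idea, with a hypothetical name.

```typescript
// Returns true when the buffer starts with the gzip magic number (0x1f 0x8b).
// This only inspects the header; it does not prove the archive is valid.
function looksLikeGzip(header: Uint8Array): boolean {
  return header.length >= 2 && header[0] === 0x1f && header[1] === 0x8b;
}
```

Checking the first bytes of the upload, instead of trusting the file name or the Content-Type header, rejects a renamed ZIP or text file before extraction is attempted.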

Performance

  • Batch operations: 100 records per INSERT
  • Streaming file extraction (no full buffer in memory)
  • Sequential chunk processing (predictable memory usage)
  • Cleanup prevents disk space accumulation
  • Parallel file copy operations where possible

Tradeoffs: Merge vs Replace

| Aspect          | Merge Mode                            | Replace Mode                           |
|-----------------|---------------------------------------|----------------------------------------|
| Data Safety     | Preserves existing data on failure    | Rollback on failure (all-or-nothing)   |
| Conflicts       | Updates existing vehicles by VIN      | No conflicts (deletes all first)       |
| Partial Success | Continues on errors, reports summary  | Fails entire transaction on any error  |
| Performance     | Slightly slower (conflict checks)     | Faster (no conflict detection)         |
| Use Case        | Incremental updates, data migration   | Clean slate restore, testing           |
| Risk            | Duplicates possible (fuel logs, docs) | Data loss if archive incomplete        |

Recommendation: Default to merge mode for safety. Use replace mode only when complete data replacement is intended.

Future Enhancements

Potential improvements:

  • Selective import (e.g., only vehicles and fuel logs)
  • Dry-run mode (simulate import, report what would happen)
  • Import progress streaming (long-running imports)
  • Duplicate detection for fuel logs and documents
  • Import history tracking (audit log of imports)
  • Scheduled imports (automated periodic imports)
  • External format support (CSV, Excel)