feat: API Performance Grafana dashboard (#105) #108

Closed
opened 2026-02-06 14:01:46 +00:00 by egullickson · 1 comment
Owner

Parent Issue

Relates to #105

Summary

Create the API Performance dashboard showing request latency, throughput, status codes, and endpoint analysis. Note: these are log-based approximations derived from backend request logs, not true metrics, which is acceptable for the current Loki-only stack.

Scope

Create config/grafana/dashboards/api-performance.json with these panels:

  1. Request Rate Over Time - Timeseries (requests per second)
    • LogQL: rate({container="mvp-backend"} | json | msg="Request processed" [1m])
  2. Response Time Distribution - Timeseries with p50/p95/p99
    • LogQL: quantile_over_time(0.50, {container="mvp-backend"} | json | msg="Request processed" | unwrap duration [5m]) by ()
    • Repeat for 0.95 and 0.99
  3. HTTP Status Code Distribution - Pie chart or bar chart
    • LogQL: sum by (status) (count_over_time({container="mvp-backend"} | json | msg="Request processed" [5m]))
  4. Slowest Endpoints - Table (top-N by duration)
    • LogQL: topk(10, avg by (path) (avg_over_time({container="mvp-backend"} | json | msg="Request processed" | unwrap duration [5m])))
  5. Request Volume by Endpoint - Bar chart
    • LogQL: sum by (path) (count_over_time({container="mvp-backend"} | json | msg="Request processed" [5m]))
  6. Status Code Breakdown by Endpoint - Table
    • LogQL: sum by (path, status) (count_over_time({container="mvp-backend"} | json | msg="Request processed" [5m]))
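
Panel 2 asks for the same query repeated at three quantiles. As a minimal sketch of that "repeat for 0.95 and 0.99" step, the queries could be generated from one template rather than copied by hand. The helper name `quantile_query` and the `PERCENTILE_QUERIES` mapping are hypothetical and not part of the issue; the query text itself is taken verbatim from panel 2 above.

```python
# Hypothetical helper (not from the issue): expands the panel 2 template
# into one LogQL query per percentile.
SELECTOR = '{container="mvp-backend"} | json | msg="Request processed"'


def quantile_query(q: float, window: str = "5m") -> str:
    """Return the quantile_over_time query for quantile q, as written in panel 2."""
    return (
        f"quantile_over_time({q:.2f}, {SELECTOR} "
        f"| unwrap duration [{window}]) by ()"
    )


# Legend name -> query, one entry per percentile requested in the issue.
PERCENTILE_QUERIES = {
    name: quantile_query(q)
    for name, q in (("p50", 0.50), ("p95", 0.95), ("p99", 0.99))
}

if __name__ == "__main__":
    for name, query in PERCENTILE_QUERIES.items():
        print(f"{name}: {query}")
```

Generating the three strings from one template keeps the selector and range window identical across the p50/p95/p99 targets, so the series stay comparable if the filter changes later.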

Files Changed

  • config/grafana/dashboards/api-performance.json (NEW)

Acceptance Criteria

  • Dashboard auto-loads in Grafana
  • Request rate visible over configurable time range
  • Percentile latency panels render (p50/p95/p99)
  • Status code distribution accurate
  • Endpoint tables populated with real paths
egullickson added the status/backlog and type/feature labels 2026-02-06 14:02:17 +00:00
egullickson added this to the Sprint 2026-02-02 milestone 2026-02-06 14:02:22 +00:00
egullickson added the status/in-progress label and removed the status/backlog label 2026-02-06 14:25:43 +00:00
Author
Owner

Milestone: API Performance Dashboard Implementation

Phase: Execution | Agent: Platform | Status: PASS

Changes

  • Created config/grafana/dashboards/api-performance.json with 6 panels:
    1. Request Rate Over Time (timeseries) - rate() of backend request logs per 1m
    2. Response Time Distribution (timeseries) - p50/p95/p99 via quantile_over_time with __error__="" filter
    3. HTTP Status Code Distribution (piechart) - donut chart grouped by status code
    4. Request Volume by Endpoint (barchart) - horizontal bar chart grouped by path
    5. Slowest Endpoints (table) - top 10 by avg duration, sorted descending
    6. Status Code Breakdown by Endpoint (table) - path x status matrix

Technical Details

  • Follows same provisioning pattern as application-overview.json
  • Uses ${datasource} template variable for Loki datasource
  • All LogQL queries filter on container="mvp-backend" and msg="Request processed"
  • quantile_over_time queries include | __error__="" after unwrap duration to filter parse failures
  • Dashboard UID: api-performance, schema version 39, 30s auto-refresh
  • Tags: api, performance, backend
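
To make the details above concrete, here is a rough sketch of how the dashboard skeleton might look when built as a Python dict and serialized to JSON. It is not the committed file: the dashboard title, the `build_dashboard` and `latency_query` names, the legend labels, and the omission of layout fields such as `gridPos` are all assumptions for illustration. Only the UID, schema version, refresh interval, tags, `${datasource}` variable, and the placement of `| __error__=""` after `unwrap duration` come from this comment.

```python
import json

SELECTOR = '{container="mvp-backend"} | json | msg="Request processed"'


def latency_query(q: float) -> str:
    """Percentile query with the __error__ filter placed after unwrap."""
    # The __error__="" filter drops lines whose duration could not be parsed.
    return (
        f"quantile_over_time({q:.2f}, {SELECTOR} "
        f'| unwrap duration | __error__="" [5m]) by ()'
    )


def build_dashboard() -> dict:
    """Assemble a one-panel skeleton; the real dashboard has 6 panels."""
    loki = {"type": "loki", "uid": "${datasource}"}
    percentiles = (("A", "p50", 0.50), ("B", "p95", 0.95), ("C", "p99", 0.99))
    return {
        "uid": "api-performance",
        "title": "API Performance",  # assumed title, not confirmed by the source
        "tags": ["api", "performance", "backend"],
        "schemaVersion": 39,
        "refresh": "30s",
        "templating": {
            "list": [{"name": "datasource", "type": "datasource", "query": "loki"}]
        },
        "panels": [
            {
                "type": "timeseries",
                "title": "Response Time Distribution",
                "datasource": loki,
                "targets": [
                    {"refId": ref, "legendFormat": name, "expr": latency_query(q)}
                    for ref, name, q in percentiles
                ],
            }
        ],
    }


if __name__ == "__main__":
    print(json.dumps(build_dashboard(), indent=2))
```

Pointing each panel's datasource at the `${datasource}` variable, rather than a fixed datasource UID, is what lets the same JSON file load against whichever Loki datasource is selected.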

Commit

9e6f130 - feat: add API Performance Grafana dashboard (refs #108)

Verdict: PASS | Next: Push branch and include in parent PR for #105

Reference: egullickson/motovaultpro#108