feat: add Grafana alerting rules and documentation (refs #111)
All checks were successful
Deploy to Staging / Build Images (pull_request) Successful in 36s
Deploy to Staging / Deploy to Staging (pull_request) Successful in 51s
Deploy to Staging / Verify Staging (pull_request) Successful in 2m36s
Deploy to Staging / Notify Staging Ready (pull_request) Successful in 8s
Deploy to Staging / Notify Staging Failure (pull_request) Has been skipped
All checks were successful
Deploy to Staging / Build Images (pull_request) Successful in 36s
Deploy to Staging / Deploy to Staging (pull_request) Successful in 51s
Deploy to Staging / Verify Staging (pull_request) Successful in 2m36s
Deploy to Staging / Notify Staging Ready (pull_request) Successful in 8s
Deploy to Staging / Notify Staging Failure (pull_request) Has been skipped
Configure Grafana Unified Alerting with file-based provisioned alert rules, contact points, and notification policies. Add stable UID to Loki datasource for alert rule references. Update LOGGING.md with dashboard descriptions, alerting rules table, and LogQL query reference. Alert rules: Error Rate Spike (critical), Container Silence for backend/postgres/redis (warning), 5xx Response Spike (critical). Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
This commit is contained in:
11
config/grafana/alerting/notification-policies.yml
Normal file
11
config/grafana/alerting/notification-policies.yml
Normal file
@@ -0,0 +1,11 @@
|
||||
apiVersion: 1
|
||||
|
||||
policies:
|
||||
- orgId: 1
|
||||
receiver: mvp-default
|
||||
group_by:
|
||||
- alertname
|
||||
- severity
|
||||
group_wait: 30s
|
||||
group_interval: 5m
|
||||
repeat_interval: 4h
|
||||
Reference in New Issue
Block a user