Files
aralez/METRICS.md
2025-06-07 15:51:23 +02:00

3.1 KiB

📈 Gazan Prometheus Metrics Reference

This document outlines Prometheus metrics for the Gazan reverse proxy. These metrics can be used for monitoring, alerting and performance analysis.

Exposed to http://config_address/metrics

By default http://127.0.0.1:3000/metrics

📊 Example Grafana dashboard during stress test :

Gazan


🛠️ Prometheus Metrics

1. gazan_requests_total

  • Type: Counter
  • Purpose: Total amount requests served by Gazan.

PromQL example:

rate(gazan_requests_total[5m])

2. gazan_errors_total

  • Type: Counter
  • Purpose: Count of requests that resulted in an error.

PromQL example:

rate(gazan_errors_total[5m])

3. gazan_responses_total{status="200"}

  • Type: CounterVec
  • Purpose: Count of responses by HTTP status code.

PromQL example:

rate(gazan_responses_total{status=~"5.."}[5m]) > 0

Useful for alerting on 5xx errors.


4. gazan_response_latency_seconds

  • Type: Histogram
  • Purpose: Tracks the latency of responses in seconds.

Example bucket output:

gazan_response_latency_seconds_bucket{le="0.01"}  15
gazan_response_latency_seconds_bucket{le="0.1"}   120
gazan_response_latency_seconds_bucket{le="0.25"}  245
gazan_response_latency_seconds_bucket{le="0.5"}   500
...
gazan_response_latency_seconds_count  1023
gazan_response_latency_seconds_sum    42.6
Metric Meaning
bucket{le="0.1"} 120 120 requests were ≤ 100ms
bucket{le="0.25"} 245 245 requests were ≤ 250ms
count Total number of observations (i.e., total responses measured)
sum Total time of all responses, in seconds

🔍 How to interpret:

  • le means “less than or equal to”.
  • count is total amount of observations.
  • sum is the total time (in seconds) of all responses.

PromQL examples:

🔹 95th percentile latency

histogram_quantile(0.95, rate(gazan_response_latency_seconds_bucket[5m]))

🔹 Average latency

rate(gazan_response_latency_seconds_sum[5m]) / rate(gazan_response_latency_seconds_count[5m])

Notes

  • Metrics are registered after the first served request.

Summary of key metrics

Metric Name Type What it Tells You
gazan_requests_total Counter Total requests served
gazan_errors_total Counter Number of failed requests
gazan_responses_total{status="200"} CounterVec Response status breakdown
gazan_response_latency_seconds Histogram How fast responses are

📘 Last updated: May 2025