# OpenTelemetry
Requirements:
- docker-compose
## Prepare setup
Create a folder and add a `docker-compose.yaml` file and an `otel-collector-config.yml` file to it, as shown below.
```yaml title="docker-compose.yaml"
name: renovate-otel-demo

services:
  # Jaeger for storing traces
  jaeger:
    image: jaegertracing/jaeger:2.6.0
    ports:
      - '16686:16686' # Web UI
      - '4317' # OTLP gRPC
      - '4318' # OTLP HTTP

  # Prometheus for storing metrics
  prometheus:
    image: prom/prometheus:v3.3.1
    ports:
      - '9090:9090' # Web UI
      - '4318' # OTLP HTTP
    command:
      - --web.enable-otlp-receiver
      # Mirror these flags from the Dockerfile, because `command` overwrites the default flags.
      # https://github.com/prometheus/prometheus/blob/5b5fee08af4c73230b2dae35964816f7b3c29351/Dockerfile#L23-L24
      - --config.file=/etc/prometheus/prometheus.yml
      - --storage.tsdb.path=/prometheus

  otel-collector:
    # Using the Contrib version to access the spanmetrics connector.
    # If you don't need the spanmetrics connector, you can use the standard version.
    image: otel/opentelemetry-collector-contrib:0.123.0
    volumes:
      - ./otel-collector-config.yml:/etc/otelcol-contrib/config.yaml
    ports:
      - '4318:4318' # OTLP HTTP (exposed to the host)
      - '4317:4317' # OTLP gRPC (exposed to the host)
    depends_on:
      - jaeger
      - prometheus
```
```yaml title="otel-collector-config.yml"
receivers:
  otlp:
    protocols:
      grpc:
        endpoint: 0.0.0.0:4317
      http:
        endpoint: 0.0.0.0:4318

exporters:
  otlp/jaeger:
    endpoint: jaeger:4317
    tls:
      insecure: true
  otlphttp/prometheus:
    endpoint: http://prometheus:9090/api/v1/otlp
  debug:
    # verbosity: normal

connectors:
  spanmetrics:
    histogram:
      exponential:
    dimensions:
      - name: http.method
        default: GET
      - name: http.status_code
      - name: http.host
    dimensions_cache_size: 1000
    aggregation_temporality: 'AGGREGATION_TEMPORALITY_CUMULATIVE'
    exemplars:
      enabled: true

processors:
  batch:

extensions:
  health_check:
  pprof:
  zpages:

service:
  extensions: [pprof, zpages, health_check]
  pipelines:
    traces:
      receivers: [otlp]
      exporters:
        - otlp/jaeger
        # Send traces to the spanmetrics connector for metrics calculation
        - spanmetrics
        # Enable the debug exporter to see traces in the logs
        # - debug
      processors: [batch]
    metrics:
      receivers:
        - otlp # Receive metrics from Renovate.
        - spanmetrics # Receive metrics calculated by the spanmetrics connector.
      processors: [batch]
      exporters:
        - otlphttp/prometheus
        # Enable the debug exporter to see metrics in the logs
        # - debug
```
Start the setup with this command, inside the folder containing the files you created in the earlier steps:
```
docker-compose up
```
This command will start:
- an [OpenTelemetry Collector](https://github.com/open-telemetry/opentelemetry-collector-contrib)
- an instance of [Jaeger for traces](https://www.jaegertracing.io/)
- and [Prometheus](https://prometheus.io/)
Jaeger will now be reachable at [http://localhost:16686](http://localhost:16686).
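Before running Renovate, you can optionally check that the collector is up. A minimal sketch, assuming the `health_check` extension's default port `13133` and the default Compose network name:
```
docker run --rm --network renovate-otel-demo_default curlimages/curl \
  -sf http://otel-collector:13133
```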
## Run Renovate with OpenTelemetry
To start Renovate with OpenTelemetry enabled, run the following command after adjusting the path to your `config.js` config file:
```
docker run \
--rm \
--network renovate-otel-demo_default \
-e OTEL_EXPORTER_OTLP_ENDPOINT=http://otel-collector:4318 \
-v "/path/to/your/config.js:/usr/src/app/config.js" \
renovate/renovate:latest
```
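If you prefer to run Renovate as part of the same Compose stack instead of a separate `docker run`, a hypothetical service entry added under `services:` in the `docker-compose.yaml` above could look like this (a sketch; it assumes your `config.js` supplies the platform credentials and repository list):
```yaml
  renovate:
    image: renovate/renovate:latest
    environment:
      # Send telemetry to the collector service defined above
      OTEL_EXPORTER_OTLP_ENDPOINT: http://otel-collector:4318
    volumes:
      - /path/to/your/config.js:/usr/src/app/config.js
    depends_on:
      - otel-collector
```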
You should now see `trace_id` and `span_id` fields in the logs.
```
INFO: Repository finished (repository=org/example)
    "durationMs": 5574,
    "trace_id": "f9a4c33852333fc2a0fbdc163100c987",
    "span_id": "4ac1323eeaee
```
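A log line's `trace_id` can also be opened directly in the Jaeger UI, assuming Jaeger's default `/trace/<trace_id>` URL scheme:
```
http://localhost:16686/trace/<trace_id>
```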
### Traces
Now open Jaeger at [http://localhost:16686](http://localhost:16686).
You should be able to pick `renovate` in the `Service` field.
![service picker](../assets/images/opentelemetry_pick_service.png)
Select `Find Traces` to search for all Renovate traces and then select one of the found traces to open the trace view.
![pick trace](../assets/images/opentelemetry_choose_trace.png)
You should now see the full trace view, which shows each HTTP request and the internal spans.
![trace view](../assets/images/opentelemetry_trace_viewer.png)
### Metrics
In addition to the received traces, some metrics are calculated.
This is achieved using the [spanmetrics connector](https://github.com/open-telemetry/opentelemetry-collector-contrib/tree/main/connector/spanmetricsconnector).
The setup described above produces metrics like the following, which are pushed to [Prometheus](http://localhost:9090):
```
### Example of internal spans
traces_span_metrics_calls_total{http_method="GET", job="renovatebot.com/renovate", service_name="renovate", span_kind="SPAN_KIND_INTERNAL", span_name="repository", status_code="STATUS_CODE_UNSET"} 2
traces_span_metrics_calls_total{http_method="GET", job="renovatebot.com/renovate", service_name="renovate", span_kind="SPAN_KIND_INTERNAL", span_name="run", status_code="STATUS_CODE_UNSET"} 2
### Example of http calls from Renovate to external services
traces_span_metrics_calls_total{http_host="api.github.com:443", http_method="POST", http_status_code="200", job="renovatebot.com/renovate", service_name="renovate", span_kind="SPAN_KIND_CLIENT", span_name="POST", status_code="STATUS_CODE_UNSET"} 4
### Example histogram metrics
traces_span_metrics_duration_milliseconds_bucket{http_method="GET", job="renovatebot.com/renovate", le="8", service_name="renovate", span_kind="SPAN_KIND_INTERNAL", span_name="repository", status_code="STATUS_CODE_UNSET"} 0
...
traces_span_metrics_duration_milliseconds_bucket{http_method="GET", job="renovatebot.com/renovate", le="2000", service_name="renovate", span_kind="SPAN_KIND_INTERNAL", span_name="repository", status_code="STATUS_CODE_UNSET"} 0
traces_span_metrics_duration_milliseconds_bucket{http_method="GET", job="renovatebot.com/renovate", le="5000", service_name="renovate", span_kind="SPAN_KIND_INTERNAL", span_name="repository", status_code="STATUS_CODE_UNSET"} 1
traces_span_metrics_duration_milliseconds_bucket{http_method="GET", job="renovatebot.com/renovate", le="15000", service_name="renovate", span_kind="SPAN_KIND_INTERNAL", span_name="repository", status_code="STATUS_CODE_UNSET"} 1
traces_span_metrics_duration_milliseconds_bucket{http_method="GET", job="renovatebot.com/renovate", le="10000", service_name="renovate", span_kind="SPAN_KIND_INTERNAL", span_name="repository", status_code="STATUS_CODE_UNSET"} 1
traces_span_metrics_duration_milliseconds_bucket{http_method="GET", job="renovatebot.com/renovate", le="+Inf", service_name="renovate", span_kind="SPAN_KIND_INTERNAL", span_name="repository", status_code="STATUS_CODE_UNSET"} 1
traces_span_metrics_duration_milliseconds_sum{http_method="GET", job="renovatebot.com/renovate", service_name="renovate", span_kind="SPAN_KIND_INTERNAL", span_name="repository", status_code="STATUS_CODE_UNSET"} 4190.694209
traces_span_metrics_duration_milliseconds_count{http_method="GET", job="renovatebot.com/renovate", service_name="renovate", span_kind="SPAN_KIND_INTERNAL", span_name="repository", status_code="STATUS_CODE_UNSET"} 1
```
The [spanmetrics connector](https://github.com/open-telemetry/opentelemetry-collector-contrib/tree/main/connector/spanmetricsconnector) creates two sets of metrics.
#### Calls metric
First, there are the `traces_span_metrics_calls_total` metrics.
These metrics show how often _specific_ trace spans have been observed.
For example:
- `traces_span_metrics_calls_total{http_method="GET", job="renovatebot.com/renovate", service_name="renovate", span_kind="SPAN_KIND_INTERNAL", span_name="repository", status_code="STATUS_CODE_UNSET"} 2` signals that 2 repositories have been renovated.
- `traces_span_metrics_calls_total{http_method="GET", job="renovatebot.com/renovate", service_name="renovate", span_kind="SPAN_KIND_INTERNAL", span_name="run", status_code="STATUS_CODE_UNSET"} 1` represents how often Renovate has been run.
If we combine these metrics using the Prometheus Query Language (PromQL), we can calculate the average number of repositories each Renovate run handles:
```
traces_span_metrics_calls_total{span_name="repository",service_name="renovate"} / traces_span_metrics_calls_total{span_name="run",service_name="renovate"}
```
These metrics are generated for HTTP call spans too:
```yaml
traces_span_metrics_calls_total{http_host="prometheus-community.github.io:443", http_method="GET", http_status_code="200", job="renovatebot.com/renovate", service_name="renovate", span_kind="SPAN_KIND_CLIENT", span_name="GET", status_code="STATUS_CODE_UNSET"} 5
```
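Because the HTTP client spans carry the `http_host` and `http_status_code` dimensions configured in the spanmetrics connector above, a PromQL query such as the following sketch would chart the rate of outgoing calls per host and status code (label names may differ depending on your collector version):
```
sum by (http_host, http_status_code) (
  rate(traces_span_metrics_calls_total{span_kind="SPAN_KIND_CLIENT", service_name="renovate"}[15m])
)
```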
#### Latency buckets
The second class of metrics exposed are the latency-focused buckets, which allow creating [heatmaps](https://grafana.com/docs/grafana/latest/basics/intro-histograms/#heatmaps).
A request is counted in a bucket if its duration is less than or equal to the bucket value (`le`): `request_duration <= le`.
For example, if we receive a request which needs `1.533s` to complete, we get the following metrics:
```
traces_span_metrics_duration_milliseconds_bucket{http_host="api.github.com:443",le="0.1"} 0
traces_span_metrics_duration_milliseconds_bucket{http_host="api.github.com:443",le="1"} 0
traces_span_metrics_duration_milliseconds_bucket{http_host="api.github.com:443",le="2"} 1
traces_span_metrics_duration_milliseconds_bucket{http_host="api.github.com:443",le="6"} 1
traces_span_metrics_duration_milliseconds_bucket{http_host="api.github.com:443",le="10"} 1
traces_span_metrics_duration_milliseconds_bucket{http_host="api.github.com:443",le="100"} 1
traces_span_metrics_duration_milliseconds_bucket{http_host="api.github.com:443",le="250"} 1
traces_span_metrics_duration_milliseconds_bucket{http_host="api.github.com:443",le="9.223372036854775e+12"} 1
traces_span_metrics_duration_milliseconds_bucket{http_host="api.github.com:443",le="+Inf"} 1
traces_span_metrics_duration_milliseconds_sum{http_host="api.github.com:443"} 1.533
traces_span_metrics_duration_milliseconds_count{http_host="api.github.com:443"} 1
```
Now we receive another request, which this time takes `10s` to complete:
```
traces_span_metrics_duration_milliseconds_bucket{http_host="api.github.com:443",le="0.1"} 0
traces_span_metrics_duration_milliseconds_bucket{http_host="api.github.com:443",le="1"} 0
traces_span_metrics_duration_milliseconds_bucket{http_host="api.github.com:443",le="2"} 1
traces_span_metrics_duration_milliseconds_bucket{http_host="api.github.com:443",le="6"} 1
traces_span_metrics_duration_milliseconds_bucket{http_host="api.github.com:443",le="10"} 2
traces_span_metrics_duration_milliseconds_bucket{http_host="api.github.com:443",le="100"} 2
traces_span_metrics_duration_milliseconds_bucket{http_host="api.github.com:443",le="250"} 2
traces_span_metrics_duration_milliseconds_bucket{http_host="api.github.com:443",le="9.223372036854775e+12"} 2
traces_span_metrics_duration_milliseconds_bucket{http_host="api.github.com:443",le="+Inf"} 2
traces_span_metrics_duration_milliseconds_sum{http_host="api.github.com:443"} 11.533
traces_span_metrics_duration_milliseconds_count{http_host="api.github.com:443"} 2
```
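To turn these buckets into a latency percentile, for example the 95th percentile per host, you could use `histogram_quantile` over the bucket rates (a sketch, assuming classic histogram buckets as shown above):
```
histogram_quantile(
  0.95,
  sum by (le, http_host) (
    rate(traces_span_metrics_duration_milliseconds_bucket{service_name="renovate"}[15m])
  )
)
```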
You can read more about how histograms work in the Prometheus documentation on [metric types](https://prometheus.io/docs/concepts/metric_types/#histogram).