Observability

Hemdall implements different observability mechanisms to support easier operation and integration in different APM products. The following sections provide insights about what can be expected.

Logging in Heimdall

Heimdall’s implementation uses zerolog - Zero Allocation JSON Logger, which can however also log in plain text. All emitted log statements include information related to distributed tracing (if tracing is enabled) so that not only log statements can be correlated to traces, but also all log statements belonging to single request/transaction can be correlated as well.

Available Logging configuration options are described in Logging Configuration.

Regular Log Events

If you configure heimdall to log in text format, you can expect output similar to the one shown below:

2022-08-03T12:51:48+02:00 INF Opentelemetry tracing initialized.
2022-08-03T12:51:48+02:00 INF Instantiating in memory cache
2022-08-03T12:51:48+02:00 DBG Creating rule set event queue.
2022-08-03T12:51:48+02:00 INF Loading pipeline definitions
2022-08-03T12:51:48+02:00 DBG Loading definitions for authenticators
2022-08-03T12:51:48+02:00 DBG Loading pipeline definition id=anonymous_authenticator type=anonymous
...
2022-08-03T12:51:52+02:00 DBG Decision endpoint called
2022-08-03T12:51:52+02:00 DBG Executing default rule
2022-08-03T12:51:52+02:00 DBG Authenticating using anonymous authenticator
2022-08-03T12:51:52+02:00 DBG Finalizing using JWT finalizer
2022-08-03T12:51:52+02:00 DBG Generating new JWT
2022-08-03T12:51:52+02:00 DBG Finalizing request
Usage of this format is not recommended for production deployments as it requires more computational resources and is hence slow.

Otherwise, if you configure it to use gelf (see GELF for format details) format, the output will look as follows:

{"_level_name": "INFO", "version":"1.1", "host": "unknown", "timestamp": 1659523288,
 "level": 6, "short_message": "Opentracing tracer initialized."}
{"_level_name": "INFO", "version": "1.1", "host": "unknown", "timestamp": 1659523288,
 "level": 6, "short_message": "Instantiating in memory cache"}
{"_level_name": "DEBUG", "version": "1.1", "host": "unknown", "timestamp": 1659523288,
 "level": 7, "short_message": "Creating rule set event queue."}
{"_level_name": "INFO", "version": "1.1", "host": "unknown", "timestamp": 1659523288,
 "level": 6, "short_message": "Loading pipeline definitions"}
{"_level_name": "DEBUG", "version": "1.1", "host": "unknown", "timestamp": 1659523288,
 "level": 7,"short_message": "Loading definitions for authenticators"}
{"_level_name": "DEBUG", "version": "1.1", "host": "unknown", "id": "anonymous_authenticator",
 "type": "anonymous","timestamp": 1659523288,
 "level": 7, "short_message": "Loading pipeline definition"}

...

{"_level_name": "DEBUG", "version": "1.1", "host": "unknown", "timestamp": 1659523295,
 "level": 7, "_parent_id": "3449bda63ed70206", "_span_id": "f57c007257fee0ed",
 "_trace_id": "00000000000000000a5af97bffe6a8a2", "short_message": "Decision endpoint called"}
{"_level_name": "DEBUG", "version": "1.1", "host": "unknown", "timestamp":1659523295,
 "level": 7, "_parent_id": "3449bda63ed70206", "_span_id": "f57c007257fee0ed",
 "_trace_id": "00000000000000000a5af97bffe6a8a2", "short_message": "Executing default rule"}
{"_level_name": "DEBUG", "version":"1.1", "host": "unknown", "timestamp":1659523295,
 "level": 7, "_parent_id": "3449bda63ed70206", "_span_id": "f57c007257fee0ed",
 "_trace_id": "00000000000000000a5af97bffe6a8a2", "short_message": "Authenticating using anonymous authenticator"}
{"_level_name": "DEBUG", "version": "1.1", "host": "unknown", "timestamp": 1659523295,
 "level": 7, "_parent_id": "3449bda63ed70206", "_span_id": "f57c007257fee0ed",
 "_trace_id": "00000000000000000a5af97bffe6a8a2", "short_message": "Finalizing using JWT finalizer"}
{"_level_name": "DEBUG", "version": "1.1", "host": "unknown", "timestamp": 1659523295,
 "level": 7, "_parent_id": "3449bda63ed70206", "_span_id": "f57c007257fee0ed",
 "_trace_id": "00000000000000000a5af97bffe6a8a2", "short_message": "Generating new JWT"}
{"_level_name": "DEBUG", "version": "1.1", "host": "unknown", "timestamp": 1659523295,
 "level": 7, "_parent_id": "3449bda63ed70206", "_span_id": "f57c007257fee0ed",
 "_trace_id": "00000000000000000a5af97bffe6a8a2", "short_message": "Finalizing request"}

Each log statement does also include the following fields in both log formats for incoming requests if tracing is enabled:

  • _trace_id - The trace id as defined by OpenTelemetry.

  • _span_id - The span id as defined by OpenTelemetry of the current transaction.

  • _parent_id - The span id of the callee, which started the given transaction. Only present if the callee set the corresponding tracing header.

Access Log Events

In addition to regular logs, heimdall emits access logs. These events are always emitted regardless the configured log level and the log level of these events in the log output is always INFO.

Each request to any of heimdall’s endpoints will result in two access events:

  • event describing the start of the transaction, and an

  • event describing the finalization of the transaction.

Following are the fields, which are always set for both events:

  • _tx_start - Timestamp in Unix epoch format, when the transaction started.

  • _client_ip - The IP of the client of the request.

If the event has been emitted for an HTTP request, following fields are set as well:

  • _http_method - The HTTP method used by the client while calling heimdall’s endpoint.

  • _http_path - The used HTTP path.

  • _http_user_agent - The agent used by the client. The value is taken from the HTTP "User-Agent" header.

  • _http_host - The host part of the URI, the client is using while communicating with Heimdall.

  • _http_scheme - The scheme part of the URI, the client is using while communicating with Heimdall.

If the event has been emitted for a GRPC request, following fields are set:

  • _grpc_method - The full GRPC method used.

If the request comes from an intermediary, like e.g. an API Gateway and heimdall is configured to trust that "proxy" (see trusted_proxies configuration of the Decision service, as well as the trusted_proxies configuration of the Proxy service), then following fields will be part of the events as well if the corresponding HTTP headers were sent.

  • _http_x_forwarded_proto - The value of the "X-Forwarded-Proto" header.

  • _http_x_forwarded_host - The value of the "X-Forwarded-Host" header.

  • _http_x_forwarded_uri - The value of the "X-Forwarded-Uri" header.

  • _http_x_forwarded_for - The value of the "X-Forwarded-For" header.

  • _http_forwarded - The value of the "Forwarded" header.

Following are the fields, which are set in the transaction finalization event in addition:

  • _body_bytes_sent - The length of the response body.

  • _tx_duration_ms - The duration of the transaction in milliseconds. If heimdall is operated in proxy mode, it will also include the time used to communicate with the upstream service.

  • _access_granted - Set either to true or false, indicating whether heimdall granted access or not.

  • _subject - The subject identifier if the access was granted.

  • _error - The information about an error, which e.g. led to the denial of the request.

If the finalization event has been emitted for an HTTP request, following fields are set as well:

  • _http_status_code - The numeric HTTP response status code

If the finalization event has been emitted for a GRPC request, following fields are set:

  • _grpc_status_code - The numeric GRPC status code.

Following are the fields, which are set if tracing is enabled:

  • _trace_id - The trace id as defined by OpenTelemetry.

  • _span_id - The span id as defined by OpenTelemetry of the current transaction.

  • _parent_id - The span id of the callee, which started the given transaction. Only present if the callee set the corresponding tracing header.

If you configure heimdall to log in text format, you can expect output as shown below:

2022-08-03T12:40:16+02:00 INF TX started _client_ip=127.0.0.1 _http_host=127.0.0.1:4468 _http_method=GET
 _http_path=/foo _http_scheme=http _http_user_agent=curl/7.74.0 _parent_id=3449bda63ed70206
 _span_id=f57c007257fee0ed _trace_id=00000000000000000a5af97bffe6a8a2 _tx_start=1659523216

....

2022-08-03T12:40:16+02:00 INF TX finished _access_granted=true _body_bytes_sent=0 _client_ip=127.0.0.1
 _http_host=127.0.0.1:4468 _http_method=GET _http_path=/foo _http_scheme=http _http_status_code=202
 _http_user_agent=curl/7.74.0 _subject=anonymous _parent_id=3449bda63ed70206 _span_id=f57c007257fee0ed
 _trace_id=00000000000000000a5af97bffe6a8a2 _tx_duration_ms=0 _tx_start=1659523216

Otherwise, if you configure it to use gelf format, the output will look as follows:

{"_level_name": "INFO", "version":"1.1", "host":"unknown", "_tx_start":1659523295,
 "_client_ip": "127.0.0.1", "_http_method": "GET", "_http_path":"/foo",
 "_http_user_agent": "curl/7.74.0", "_http_host": "127.0.0.1:4468", "_http_scheme": "http",
 "timestamp": 1659523295, "level": 6, "_parent_id": "3449bda63ed70206",
 "_span_id": "f57c007257fee0ed", "_trace_id": "00000000000000000a5af97bffe6a8a2",
 "short_message": "TX started"}

....

{"_level_name": "INFO", "version": "1.1", "host": "unknown", "_tx_start": 1659523295,
 "_client_ip": "127.0.0.1", "_http_method": "GET", "_http_path": "/foo",
 "_http_user_agent": "curl/7.74.0", "_http_host": "127.0.0.1:4468", "_http_scheme": "http",
 "_body_bytes_sent": 0, "_http_status_code":200, "_tx_duration_ms":0, "_subject": "anonymous",
 "_access_granted": true, "timestamp":1659523295, "level": 6, "_parent_id": "3449bda63ed70206",
 "_span_id": "f57c007257fee0ed", "_trace_id": "00000000000000000a5af97bffe6a8a2",
 "short_message": "TX finished"}

Tracing in Heimdall

Heimdall makes use of OpenTelemetry for distributed tracing to support recording of paths taken by requests and supports all environment variables including the defined values according to OpenTelemetry Environment Variables and OpenTelemetry SDK Configuration specifications. In addition to these environment variables, heimdall defines some additional options, which are described in Tracing Configuration and can be used to tune the behaviour.

Tracing is enabled by default.

On one hand, this chapter serves the purpose of a quick introduction to distributed tracing with heimdall and on the other hand to list the options that go beyond the standard OTEL definitions.

Tracing Context Propagation

When a request arrives at heimdall, it will create a trace context object based on the received headers, which according to OTEL are the traceparent and tracestate HTTP headers, defined in W3C Trace Context as well as baggage HTTP header, defined in W3C Baggage. The creation of that context, as well as the transformation of it into new HTTP headers, set by heimdall while communicating with further services is named propagation and the components responsible for the creation of such context object are named propagators.

Since not every service in a multi-service system may set or understand the above OTEL specific HTTP headers (as these might still be using tracing vendor specific headers), interoperability can be achieved by configuring the required propagators by making use of the OTEL_PROPAGATORS environment variable. OTEL defines the following values for this variable:

All of these are supported by heimdall. In addition, following propagators can be configured as well:

Configured propagators are used for inbound, as well as for outbound traffic.

Span Exporters

Span Exporter handle the delivery of spans to external receivers (collectors, or agents). This is the final component in the trace export pipeline and typically provided by the APM vendor, like Jaeger, Zipkin, Instana, etc. Since not every multi-service system may have an up to day telemetry receiver supporting protocols defined by OTEL, interoperability can be achieved by configuring the required exporters by making use of the OTEL_TRACES_EXPORTER environment variable. OTEL defines the following values for this variable[2]:

  • otlp - OTLP exporter. Enabled by default if OTEL_TRACES_EXPORTER is not set.

  • zipkin - Zipkin exporter to export spans in Zipkin data model.

  • none - No automatically configured exporter for traces.

All of these are supported by heimdall. In addition, following exporters can be configured as well:

  • instana - Instana exporter to export spans in Instana data model.[3]

Example Configuration

The environment variables set below configure heimdall to use Jaeger propagator and to export the spans via OTLP over grpc to the collector available under http://collector:4317.

OTEL_PROPAGATORS=jaeger
OTEL_TRACES_EXPORTER=otlp
OTEL_EXPORTER_OTLP_TRACES_PROTOCOL=grpc
OTEL_EXPORTER_OTLP_TRACES_ENDPOINT=http://collector:4317

If your environment supports OpenTelemetry and usage of defaults is ok, most probably, the only required environment variable to be set might be the OTEL_EXPORTER_OTLP_TRACES_ENDPOINT.

Metrics in Heimdall

Heimdall makes use of OpenTelemetry to emit metrics. Depending on the configuration, push or pull based metrics export is supported. As with tracing, the entire configuration happens via environment variables as defined by OpenTelemetry Environment Variables and OpenTelemetry SDK Configuration specifications. In addition to these environment variables, heimdall allows disabling collection and exposure of metrics if required, which is described under Tracing Configuration.

Metric exposure is enabled by default.

This chapter serves the purpose of a quick introduction to metrics collected and exposed by heimdall.

Metric Exporters

By default, metrics are pushed to the OTEL collector using the http/protobuf transport protocol. You can change that behavior by making use of either the OTEL_EXPORTER_OTLP_METRICS_PROTOCOL, or the OTEL_EXPORTER_OTLP_PROTOCOL environment variable.

To let heimdall know where to push the metrics to, either OTEL_EXPORTER_OTLP_METRICS_ENDPOINT, or OTEL_EXPORTER_OTLP_ENDPOINT must be defined.

To let heimdall expose metrics over a pull based service (Prometheus style), the OTEL_METRICS_EXPORTER environment variable must be set to "prometheus". In that case heimdall will expose the 127.0.0.1:9464/metrics endpoint which can be queried using the HTTP GET verb. You can change the host and port of that service by making use of the OTEL_EXPORTER_PROMETHEUS_HOST and OTEL_EXPORTER_PROMETHEUS_PORT environment variables.

Available Metrics

All, but custom metrics adhere to the OpenTelementry semantic conventions. For that reason, only the custom metrics are listed in the table below.

Metric: certificate.expiry

Number of seconds until a certificate used by a particular service (decision, proxy, management), as well as signer expires. The metric type is UpDownCounter und the unit is s.

AttributeTypeDescription

service

string

The service, the certificate is configured for.

issuer

string

Issuer DN of the certificate.

serial_nr

string

The serial number of the certificate.

subject

string

Subject DN of the certificate.

dns_names

string

DNS entries in the SAN extension

Runtime Profiling in Heimdall

If enabled, heimdall exposes a /debug/pprof HTTP endpoint on port 10251 (See also Runtime Profiling Configuration) on which runtime profiling data in the profile.proto format (also known as pprof format) can be consumed by APM tools, like Google’s pprof, Grafana Phlare, Pyroscope and many more for visualization purposes. Following information is available:

  • allocs - A sampling of all past memory allocations.

  • block - Stack traces that led to blocking on synchronization primitives.

  • cmdline - The command line invocation of the current program, with arguments separated by NUL bytes.

  • goroutine - Stack traces of all current goroutines.

  • heap - A sampling of memory allocations of live objects.

  • mutex - Stack traces of holders of contended mutexes.

  • profile - Cpu profile. Profiling lasts for duration specified in seconds parameter, or for 30 seconds if not specified

  • symbol - Looks up the program counters listed in the request, responding with a table mapping program counters to function names.

  • threadcreate - Stack traces that led to the creation of new OS threads.

  • trace - Execution trace in binary form. Tracing lasts for duration specified in seconds parameter, or for 1 second if not specified.

See also the API documentation for the documentation of the actual API.


1. Datadog supports the OTLP protokoll. For that reason, there is no exporter available.
2. jaeger exporter has been marked as deprecated and is not supported anymore
3. Instana supports the W3C header used by OTEL. For that reason, there is no propagator available.

Last updated on Nov 29, 2023