Observability

To support operations and integration with diverse APM products, heimdall implements several observability mechanisms. The following sections describe what to expect from each of them.

Logging

Heimdall uses zerolog (Zero Allocation JSON Logger), which can also log in plain text. All emitted log statements include distributed tracing information (if tracing is enabled), so log entries can be correlated with traces and grouped by request/transaction.

Configuration

Logging can be configured in heimdall’s log section and supports the following properties.

  • format: string (optional)

    Supported values are text and gelf. text is the default format. gelf emits JSON that adheres to GELF (Graylog Extended Log Format).

    Using text format (the default) is not recommended for production deployments, as it requires more computational resources and is therefore slower.
    Example 1. Configuring logging to emit logs using GELF format.
    log:
      format: gelf
  • level: string (optional)

    The following log levels are available: trace, debug, info, warn, error, fatal, and panic. By default, the level is set to error.

    debug and trace are not intended for production use. Setting either level results in high log verbosity and noticeable performance impact. Both are meant for setup analysis and debugging only. The trace level also dumps all incoming and outgoing HTTP requests and responses, as well as the contents of objects used in templates. This dump is unedited, which means sensitive data may appear in logs.
    Due to limitations in the gRPC framework, setting log level to trace will not dump gRPC requests and responses. If you need these for analysis (for example, when debugging integration with Envoy Proxy), you must also set GODEBUG as well as GRPC_GO_LOG_VERBOSITY_LEVEL and GRPC_GO_LOG_SEVERITY_LEVEL.
    Example 2. Configuring logging to emit logs in debug level.
    log:
      level: debug
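
Both properties can be set together. The fragment below is a minimal sketch that follows the recommendations above: it avoids the text format and the debug and trace levels. The choice of info is an assumption made for illustration only; heimdall's default level is error.

```yaml
# Illustrative combination of both logging properties.
log:
  format: gelf   # structured JSON output instead of the default text format
  level: info    # assumption for this sketch; heimdall defaults to error
```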

Regular Log Events

If you configure heimdall to log in text format, you can expect output similar to the example below:

2022-08-03T12:51:48+02:00 INF Opentelemetry tracing initialized.
2022-08-03T12:51:48+02:00 INF Instantiating in memory cache
2022-08-03T12:51:48+02:00 DBG Creating rule set event queue.
2022-08-03T12:51:48+02:00 INF Loading pipeline definitions
2022-08-03T12:51:48+02:00 DBG Loading definitions for authenticators
2022-08-03T12:51:48+02:00 DBG Loading pipeline definition id=anonymous_authenticator type=anonymous
...
2022-08-03T12:51:52+02:00 DBG Decision endpoint called
2022-08-03T12:51:52+02:00 DBG Executing default rule
2022-08-03T12:51:52+02:00 DBG Authenticating using anonymous authenticator
2022-08-03T12:51:52+02:00 DBG Finalizing using JWT finalizer
2022-08-03T12:51:52+02:00 DBG Generating new JWT
2022-08-03T12:51:52+02:00 DBG Finalizing request

This format is not recommended for production deployments, as it requires more computational resources and is therefore slower.

If you configure gelf format instead (see GELF for details), the output will look as follows:

{"_level_name": "INFO", "version":"1.1", "host": "unknown", "timestamp": 1659523288,
 "level": 6, "short_message": "Opentracing tracer initialized."}
{"_level_name": "INFO", "version": "1.1", "host": "unknown", "timestamp": 1659523288,
 "level": 6, "short_message": "Instantiating in memory cache"}
{"_level_name": "DEBUG", "version": "1.1", "host": "unknown", "timestamp": 1659523288,
 "level": 7, "short_message": "Creating rule set event queue."}
{"_level_name": "INFO", "version": "1.1", "host": "unknown", "timestamp": 1659523288,
 "level": 6, "short_message": "Loading pipeline definitions"}
{"_level_name": "DEBUG", "version": "1.1", "host": "unknown", "timestamp": 1659523288,
 "level": 7,"short_message": "Loading definitions for authenticators"}
{"_level_name": "DEBUG", "version": "1.1", "host": "unknown", "id": "anonymous_authenticator",
 "type": "anonymous","timestamp": 1659523288,
 "level": 7, "short_message": "Loading pipeline definition"}

...

{"_level_name": "DEBUG", "version": "1.1", "host": "unknown", "timestamp": 1659523295,
 "level": 7, "_parent_id": "3449bda63ed70206", "_span_id": "f57c007257fee0ed",
 "_trace_id": "00000000000000000a5af97bffe6a8a2", "short_message": "Decision endpoint called"}
{"_level_name": "DEBUG", "version": "1.1", "host": "unknown", "timestamp":1659523295,
 "level": 7, "_parent_id": "3449bda63ed70206", "_span_id": "f57c007257fee0ed",
 "_trace_id": "00000000000000000a5af97bffe6a8a2", "short_message": "Executing default rule"}
{"_level_name": "DEBUG", "version":"1.1", "host": "unknown", "timestamp":1659523295,
 "level": 7, "_parent_id": "3449bda63ed70206", "_span_id": "f57c007257fee0ed",
 "_trace_id": "00000000000000000a5af97bffe6a8a2", "short_message": "Authenticating using anonymous authenticator"}
{"_level_name": "DEBUG", "version": "1.1", "host": "unknown", "timestamp": 1659523295,
 "level": 7, "_parent_id": "3449bda63ed70206", "_span_id": "f57c007257fee0ed",
 "_trace_id": "00000000000000000a5af97bffe6a8a2", "short_message": "Finalizing using JWT finalizer"}
{"_level_name": "DEBUG", "version": "1.1", "host": "unknown", "timestamp": 1659523295,
 "level": 7, "_parent_id": "3449bda63ed70206", "_span_id": "f57c007257fee0ed",
 "_trace_id": "00000000000000000a5af97bffe6a8a2", "short_message": "Generating new JWT"}
{"_level_name": "DEBUG", "version": "1.1", "host": "unknown", "timestamp": 1659523295,
 "level": 7, "_parent_id": "3449bda63ed70206", "_span_id": "f57c007257fee0ed",
 "_trace_id": "00000000000000000a5af97bffe6a8a2", "short_message": "Finalizing request"}

Each log statement for incoming requests also includes the following fields in both formats when tracing is enabled:

  • _trace_id - The trace ID as defined by OpenTelemetry.

  • _span_id - The span ID of the current transaction, as defined by OpenTelemetry.

  • _parent_id - The span ID of the caller that started the transaction. Present only if the caller set the corresponding tracing header.

Access Log Events

In addition to regular logs, heimdall emits access log events. These events are always emitted, regardless of the configured log level, and always use INFO in log output.

Each request to any heimdall endpoint produces two access log events:

  • An event describing the start of the transaction.

  • An event describing the finalization of the transaction.

The following fields are always set for both events:

  • _tx_start - Timestamp in Unix epoch format when the transaction started.

  • _client_ip - The IP of the client of the request.

If the event is emitted for an HTTP request, the following fields are also set:

  • _http_method - The HTTP method used by the client while calling heimdall’s endpoint.

  • _http_path - The HTTP path.

  • _http_user_agent - The agent used by the client. The value is taken from the HTTP "User-Agent" header.

  • _http_host - The host part of the URI the client uses while communicating with heimdall.

  • _http_scheme - The scheme part of the URI the client uses while communicating with heimdall.

If the event is emitted for a gRPC request, the following field is set:

  • _grpc_method - The full gRPC method used.

If the request comes from an intermediary (for example, an API Gateway) and heimdall is configured to trust that proxy (see trusted_proxies configuration), the following fields are also included when the corresponding HTTP headers are present.

  • _http_x_forwarded_proto - The value of the "X-Forwarded-Proto" header.

  • _http_x_forwarded_host - The value of the "X-Forwarded-Host" header.

  • _http_x_forwarded_uri - The value of the "X-Forwarded-Uri" header.

  • _http_x_forwarded_for - The value of the "X-Forwarded-For" header.

  • _http_forwarded - The value of the "Forwarded" header.

The following fields are additionally set in the transaction finalization event:

  • _body_bytes_sent - The length of the response body.

  • _tx_duration_ms - The duration of the transaction in milliseconds. If heimdall is operated in proxy mode, this also includes the time used to communicate with the upstream service.

  • _access_granted - Set either to true or false, indicating whether heimdall granted access or not.

  • _subject - The subject identifier, if access was granted.

  • _error - Information about an error, for example one that led to the denial of the request.

If the finalization event is emitted for an HTTP request, the following field is also set:

  • _http_status_code - The numeric HTTP response status code.

If the finalization event is emitted for a gRPC request, the following field is set:

  • _grpc_status_code - The numeric gRPC status code.

The following fields are set when tracing is enabled:

  • _trace_id - The trace ID as defined by OpenTelemetry.

  • _span_id - The span ID of the current transaction, as defined by OpenTelemetry.

  • _parent_id - The span ID of the caller that started the transaction. Present only if the caller set the corresponding tracing header.

If you configure heimdall to log in text format, you can expect output as shown below:

2022-08-03T12:40:16+02:00 INF TX started _client_ip=127.0.0.1 _http_host=127.0.0.1:4468 _http_method=GET
 _http_path=/foo _http_scheme=http _http_user_agent=curl/7.74.0 _parent_id=3449bda63ed70206
 _span_id=f57c007257fee0ed _trace_id=00000000000000000a5af97bffe6a8a2 _tx_start=1659523216

....

2022-08-03T12:40:16+02:00 INF TX finished _access_granted=true _body_bytes_sent=0 _client_ip=127.0.0.1
 _http_host=127.0.0.1:4468 _http_method=GET _http_path=/foo _http_scheme=http _http_status_code=202
 _http_user_agent=curl/7.74.0 _subject=anonymous _parent_id=3449bda63ed70206 _span_id=f57c007257fee0ed
 _trace_id=00000000000000000a5af97bffe6a8a2 _tx_duration_ms=0 _tx_start=1659523216

Otherwise, if you configure it to use gelf format, the output will look as follows:

{"_level_name": "INFO", "version":"1.1", "host":"unknown", "_tx_start":1659523295,
 "_client_ip": "127.0.0.1", "_http_method": "GET", "_http_path":"/foo",
 "_http_user_agent": "curl/7.74.0", "_http_host": "127.0.0.1:4468", "_http_scheme": "http",
 "timestamp": 1659523295, "level": 6, "_parent_id": "3449bda63ed70206",
 "_span_id": "f57c007257fee0ed", "_trace_id": "00000000000000000a5af97bffe6a8a2",
 "short_message": "TX started"}

....

{"_level_name": "INFO", "version": "1.1", "host": "unknown", "_tx_start": 1659523295,
 "_client_ip": "127.0.0.1", "_http_method": "GET", "_http_path": "/foo",
 "_http_user_agent": "curl/7.74.0", "_http_host": "127.0.0.1:4468", "_http_scheme": "http",
 "_body_bytes_sent": 0, "_http_status_code":200, "_tx_duration_ms":0, "_subject": "anonymous",
 "_access_granted": true, "timestamp":1659523295, "level": 6, "_parent_id": "3449bda63ed70206",
 "_span_id": "f57c007257fee0ed", "_trace_id": "00000000000000000a5af97bffe6a8a2",
 "short_message": "TX finished"}

Tracing

Heimdall uses OpenTelemetry for distributed tracing to record request paths. It supports OpenTelemetry environment variables and values as defined in the OpenTelemetry Environment Variables and OpenTelemetry SDK Configuration specifications. In addition, heimdall provides extra tracing options in its own configuration, described below.

Configuration

By default, tracing is enabled. You can customize this behavior in heimdall’s tracing configuration using the following properties.

  • enabled: boolean (optional)

    Enables or disables tracing. Defaults to true.

    Example 3. Disabling tracing.
    tracing:
      enabled: false
  • cover_rules: boolean (optional)

    Enables tracing at rule and rule-step level. Defaults to false for performance reasons. This setting is ignored when tracing is disabled (see above). See Rule-Specific Spans for details about generated spans and attributes.

  • span_processor: string (optional)

    Configures how heimdall processes created spans. Supported values are simple and batch. Defaults to batch. This property exists because there is no corresponding OTEL tracing environment variable.

    Example 4. Setting the span processor to export completed spans in batches.
    tracing:
      span_processor: batch

    Available options:

    • simple - Exports spans synchronously via the configured exporter. Not recommended for production use, as it is slower and has higher overhead. Useful for testing, debugging, or demos.

    • batch - Exports completed spans in batches. Recommended for production use.
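
The tracing properties described above can be combined. The following sketch shows a setup that might be used while analyzing rule behavior in a test environment: rule-level spans are switched on and spans are exported synchronously. The scenario itself is an assumption; for production, batch is the recommended span processor and cover_rules defaults to false for performance reasons.

```yaml
# Illustrative tracing setup for a test or debugging environment.
tracing:
  enabled: true           # the default, shown for completeness
  cover_rules: true       # additionally emit spans for rules and rule steps
  span_processor: simple  # synchronous export; not recommended for production
```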

Rule-Specific Spans

When tracing.cover_rules is set to true (see above), heimdall emits additional internal spans (span kind is set to internal) for rule and step execution.

Span: Rule Execution

Emitted for each executed rule.

Attribute     Type    Description
rule.id       string  The ID of the executed rule.
ruleset.id    string  Provider-specific ID of the ruleset.
ruleset.name  string  The name of the ruleset the rule belongs to (if set).
provider      string  The provider used to load the ruleset (e.g., kubernetes).

Span: Step Execution

Emitted for each executed step of a rule.

Attribute       Type    Description
step.id         string  The ID of the executed step.
mechanism.kind  string  The mechanism kind of the step (e.g., authenticator or authorizer).
mechanism.name  string  The mechanism type of the step.

For both span types, heimdall records encountered errors and sets the span status to Error.

Tracing Context Propagation

When a request arrives at heimdall, it creates a trace context from the incoming headers. In OTEL, these are the traceparent and tracestate headers defined by W3C Trace Context, and the baggage header defined by W3C Baggage. Creating this context and forwarding it to downstream services via outgoing headers is called propagation. The components that perform this are called propagators.

Not every service in a distributed system sets or understands OTEL-specific headers, as some still use vendor-specific tracing headers. Interoperability can be achieved by configuring propagators via the OTEL_PROPAGATORS environment variable. OTEL defines the following values:

  • tracecontext - W3C Trace Context propagator.

  • baggage - W3C Baggage propagator.

  • b3 - B3 Single propagator.

  • b3multi - B3 Multi propagator.

  • jaeger - Jaeger propagator.

  • xray - AWS X-Ray propagator.

  • ottrace - OT Trace propagator.

  • none - No automatically configured propagator.

All of these are supported by heimdall. In addition, the following propagator can be configured:

  • datadog - Datadog propagator [1].

Configured propagators are used for both inbound and outbound traffic.
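
According to the OpenTelemetry specification, OTEL_PROPAGATORS accepts a comma-separated list, so several header formats can be handled side by side. The fragment below is a sketch of such a setup, written as a Docker Compose style environment mapping. The service name heimdall and the use of Compose are assumptions made only for illustration. tracecontext and baggage are the OTEL names for the W3C Trace Context and W3C Baggage propagators.

```yaml
# Hypothetical Compose service; only the environment entry matters here.
services:
  heimdall:
    environment:
      # W3C Trace Context and Baggage headers plus Jaeger headers
      OTEL_PROPAGATORS: "tracecontext,baggage,jaeger"
```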

Span Exporters

Span exporters deliver spans to external receivers (collectors or agents). They are the final component in the trace export pipeline and are typically provided by APM vendors such as Jaeger, Zipkin, or Instana. Since not every distributed system has an up-to-date telemetry receiver that supports OTEL protocols, interoperability can be achieved by configuring exporters via the OTEL_TRACES_EXPORTER environment variable. OTEL defines the following values for this variable [2]:

  • otlp - OTLP exporter. Enabled by default if OTEL_TRACES_EXPORTER is not set.

  • zipkin - Zipkin exporter to export spans in Zipkin data model.

  • none - No automatically configured exporter for traces.

All of these are supported by heimdall. In addition, the following exporter can be configured:

  • instana - Instana exporter [3] to export spans in Instana data model.

Example Configuration

The environment variables below configure heimdall to use the Jaeger propagator and export spans via OTLP over gRPC to a collector at https://collector:4317.

OTEL_PROPAGATORS=jaeger
OTEL_TRACES_EXPORTER=otlp
OTEL_EXPORTER_OTLP_TRACES_PROTOCOL=grpc
OTEL_EXPORTER_OTLP_TRACES_ENDPOINT=https://collector:4317

If your environment already supports OpenTelemetry and the defaults are acceptable, OTEL_EXPORTER_OTLP_TRACES_ENDPOINT is likely the only required environment variable.

Metrics

Heimdall uses OpenTelemetry to emit metrics. Depending on the configuration, both push-based and pull-based metric export are supported.

Configuration

As with tracing, most metrics configuration is done via environment variables, as defined in the OpenTelemetry Environment Variables and OpenTelemetry SDK Configuration specifications.

By default, metrics are pushed to an OTEL collector. Alternatively, metrics can be exposed for pull-based scraping in Prometheus format.

As with tracing, metrics are generated and exported by default. You can customize this behavior in heimdall’s metrics configuration using the following properties:

  • enabled: boolean (optional)

    Enables or disables metrics collection. Defaults to true.

    Example 5. Disabling metrics.
    metrics:
      enabled: false
  • cover_rules: boolean (optional)

    Enables metrics collection for rule executions. Defaults to false for performance reasons. When enabled, the rule.execution.duration metric is available. This setting is ignored when metrics collection is disabled (see above).

  • cover_cache: boolean (optional)

    Enables metrics collection for cache usage. Defaults to false for performance reasons. When enabled, the cache.get.requests and cache.set.requests metrics are available. This setting is ignored when metrics collection is disabled (see above).
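
As a sketch, the fragment below enables both optional instrumentation areas described above. Whether the additional overhead is acceptable depends on the deployment, so treat the values as an illustration rather than a recommendation.

```yaml
# Illustrative metrics setup with the optional instrumentation enabled.
metrics:
  enabled: true      # the default, shown for completeness
  cover_rules: true  # makes rule.execution.duration available
  cover_cache: true  # makes cache.get.requests and cache.set.requests available
```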

Metric Exporters

By default, metrics are pushed to the OTEL collector using the http/protobuf transport protocol. You can change this behavior by setting either OTEL_EXPORTER_OTLP_METRICS_PROTOCOL or OTEL_EXPORTER_OTLP_PROTOCOL.

To configure where heimdall pushes metrics, define either OTEL_EXPORTER_OTLP_METRICS_ENDPOINT or OTEL_EXPORTER_OTLP_ENDPOINT.

To expose metrics via a pull-based endpoint (Prometheus style), set OTEL_METRICS_EXPORTER to "prometheus". In this mode, heimdall exposes 127.0.0.1:9464/metrics, which can be queried using HTTP GET. You can change the host and port by setting OTEL_EXPORTER_PROMETHEUS_HOST and OTEL_EXPORTER_PROMETHEUS_PORT.

You can also disable metrics export by setting the OTEL_METRICS_EXPORTER environment variable to none.
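
The pull-based setup described above could look as follows. The sketch uses a Docker Compose style environment mapping. The service name heimdall, the use of Compose, and the host value 0.0.0.0 (chosen so that a scraper outside the container can reach the endpoint) are assumptions made for illustration; 9464 is the default port mentioned above.

```yaml
# Hypothetical Compose service; only the environment entries matter here.
services:
  heimdall:
    environment:
      OTEL_METRICS_EXPORTER: "prometheus"       # switch from push to pull
      OTEL_EXPORTER_PROMETHEUS_HOST: "0.0.0.0"  # assumption: listen on all IPv4 addresses
      OTEL_EXPORTER_PROMETHEUS_PORT: "9464"     # the default port, set explicitly
```

With these values, a Prometheus server would scrape the /metrics path on port 9464 of the heimdall container.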

Available Metrics

All metrics except custom metrics adhere to the OpenTelemetry semantic conventions. For that reason, only custom metrics are listed in the tables below.

Metric: cache.get.requests

Provides insights into cache hits, misses, and errors when retrieving a key. The metric type is a Counter.

Attribute  Type    Description
backend    string  The configured cache backend (e.g., redis).
result     string  The result of the request (hit, miss, or error).

Metric: cache.set.requests

Provides insights into cache write operations, including successes and errors. The metric type is a Counter.

Attribute  Type    Description
backend    string  The configured cache backend (e.g., redis).
result     string  The result of the request (success or error).

Metric: certificate.expiry

Number of seconds until a certificate used by heimdall expires. The metric type is Gauge and the unit is s.

Attribute  Type    Description
issuer     string  Issuer DN of the certificate.
serial_nr  string  The serial number of the certificate.
subject    string  Subject DN of the certificate.
dns_names  string  DNS entries in the SAN extension.

Metric: rule.execution.duration

Provides insights into the execution duration of individual rules. The metric type is Histogram and the unit is s.

Attribute     Type    Description
rule.id       string  The ID of the executed rule.
ruleset.id    string  Provider-specific ID of the ruleset.
ruleset.name  string  The name of the ruleset the rule belongs to (if set).
provider      string  The provider used to load the ruleset (e.g., kubernetes).

Metric: rules.loaded

The number of loaded rules per rule set. The metric type is Gauge.

Attribute     Type    Description
ruleset.id    string  Provider-specific ID of the ruleset.
ruleset.name  string  The name of the ruleset the rule belongs to (if set).
provider      string  The provider used to load the ruleset (e.g., kubernetes).

Runtime Profiling

If enabled, heimdall exposes a /debug/pprof HTTP endpoint, by default on port 10251 (see the configuration options below). The endpoint serves runtime profiling data in the profile.proto format (also known as pprof), which tools such as Google’s pprof, Grafana Phlare, and Pyroscope can consume for visualization. The following information is available:

  • allocs - A sampling of all past memory allocations.

  • block - Stack traces that led to blocking on synchronization primitives.

  • cmdline - The command line invocation of the current program, with arguments separated by NUL bytes.

  • goroutine - Stack traces of all current goroutines.

  • heap - A sampling of memory allocations of live objects.

  • mutex - Stack traces of holders of contended mutexes.

  • profile - CPU profile. Profiling lasts for the duration specified in the seconds parameter, or 30 seconds if not specified.

  • symbol - Looks up the program counters listed in the request, responding with a table mapping program counters to function names.

  • threadcreate - Stack traces that led to the creation of new OS threads.

  • trace - Execution trace in binary form. Tracing lasts for the duration specified in the seconds parameter, or 1 second if not specified.

See also the API documentation for details about the actual API.

Configuration

Configuration for this service can be adjusted in heimdall’s profiling section using the following properties.

  • enabled: boolean (optional)

    Enables or disables runtime profiling. Defaults to false.

    Example 6. Enabling profiling.
    profiling:
      enabled: true
  • host: string (optional)

    Specifies the TCP/IP address on which heimdall listens for client connections requesting profiling data. The value 0.0.0.0 listens on all IPv4 addresses. Defaults to 127.0.0.1, which allows only local TCP/IP loopback connections.

    If you run heimdall in a container, set this property to a value that allows your APM system to scrape this information.
    Example 7. Configure heimdall to listen on 192.168.2.10.
    profiling:
      host: 192.168.2.10
  • port: integer (optional)

    Specifies the TCP port heimdall should listen on. Defaults to 10251.

    Example 8. Configure heimdall to listen on port 9999 for runtime profiling requests.
    profiling:
      port: 9999
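
The three properties can be set together. The following sketch enables profiling for a containerized deployment; the host value 0.0.0.0 (all IPv4 addresses, as described for the host property) and the port 9999 are example values chosen for illustration.

```yaml
# Illustrative profiling setup for a containerized deployment.
profiling:
  enabled: true  # profiling is disabled by default
  host: 0.0.0.0  # listen on all IPv4 addresses instead of 127.0.0.1
  port: 9999     # example value; the default is 10251
```

With this configuration, the profiling data would be served under /debug/pprof on port 9999.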

1. Datadog supports the OTLP protocol. For that reason, there is no exporter available.
2. The Jaeger exporter has been marked as deprecated and is no longer supported.
3. Instana supports the W3C header used by OTEL. For that reason, there is no propagator available.

Last updated on Apr 5, 2026