Rule Providers

Providers define the sources from which rule sets are loaded, which makes Heimdall’s behavior dynamic. Every provider you want to enable for a Heimdall instance must be configured in the providers section of Heimdall’s rules configuration.

The supported providers and their configuration options are described below.

Filesystem

The filesystem provider loads rule sets from a file system. Its configuration goes into the file_system property. This provider is handy for getting started with Heimdall, e.g. locally or with Docker, and for deployment strategies that run a Heimdall instance as a sidecar for each of your services.

The following configuration options are supported:

  • src: string (mandatory)

    Can either be a single file, containing a rule set, or a directory with files, each containing a rule set.

  • watch: boolean (optional)

    Whether the configured src should be watched for updates. Defaults to false. If src refers to a single file, the provider watches that file for changes. If src refers to a directory, the provider watches for files appearing and disappearing in that directory, as well as for changes to each file in it. Recursive lookup is not supported: if the configured directory contains further directories, these and their contents are ignored.
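The src semantics described above can be illustrated with a short Python sketch. It is not Heimdall's implementation; the function name rule_set_files is hypothetical and only mirrors the documented behavior: a single file is used as-is, and a directory is read without descending into subdirectories.

```python
from pathlib import Path


def rule_set_files(src: str) -> list[Path]:
    """Return the files a provider configured with `src` would read.

    Illustrative only: a single file is returned as-is; for a directory,
    only its direct children that are regular files are returned, so
    nested directories and their contents are skipped (no recursion).
    """
    path = Path(src)
    if path.is_file():
        return [path]
    if path.is_dir():
        return sorted(p for p in path.iterdir() if p.is_file())
    raise FileNotFoundError(f"src does not exist: {src}")
```

Calling rule_set_files("/path/to/rules/dir") would then list only the files directly inside that directory.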

This provider doesn’t need any additional configuration for a rule set. So the contents of files can be just a list of rules as described in Rule Sets.

Example 1. Load rule sets from the files residing in the /path/to/rules/dir directory and watch for changes.

file_system:
  src: /path/to/rules/dir
  watch: true

Example 2. Load rule sets from the /path/to/rules.yaml file without watching it for changes.

file_system:
  src: /path/to/rules.yaml

HTTP Endpoint

This provider loads rule sets from any remote endpoint accessible via HTTP(S) and supports rule sets in YAML as well as in JSON format. The format is determined from the Content-Type header of the endpoint's response, which must be either application/yaml or application/json. Otherwise, an error is logged and the response is ignored.
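As an illustration of the Content-Type based differentiation, here is a minimal Python sketch. The function name rule_set_format is hypothetical, and ignoring header parameters such as a charset is an assumption made for the example, not documented Heimdall behavior.

```python
def rule_set_format(content_type: str) -> str:
    """Map a Content-Type header value to "yaml" or "json".

    Header parameters such as "; charset=utf-8" are ignored here (an
    assumption). Any other media type raises ValueError, which stands in
    for "log an error and ignore the response".
    """
    media_type = content_type.split(";", 1)[0].strip().lower()
    formats = {"application/yaml": "yaml", "application/json": "json"}
    if media_type not in formats:
        raise ValueError(f"unsupported content type: {content_type!r}")
    return formats[media_type]
```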

The loading and removal of rules happens as follows:

  • if the response has the status code HTTP 200 OK and contains a rule set in a known format (see above), the corresponding rules are loaded (if the definitions are valid).

  • in case of network issues, like DNS errors, timeouts and the like, the rule sets previously received from the corresponding endpoints are preserved.

  • in any other case (e.g. a status code other than 200, an empty response body, an unsupported format, etc.), the corresponding rules are removed if these were previously loaded.
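The cases above can be summarized as a small decision function. This is a sketch in Python under stated assumptions, not the provider's code: the names Action and decide are hypothetical, network-level failures are treated as "preserve" following the second bullet, and the validation of the rule definitions themselves is not modeled.

```python
from enum import Enum
from typing import Optional


class Action(Enum):
    LOAD = "load"          # replace the known rules with the received ones
    PRESERVE = "preserve"  # keep whatever was loaded before
    REMOVE = "remove"      # drop previously loaded rules, if any


def decide(status: Optional[int], body: bytes, known_format: bool,
           network_error: bool = False) -> Action:
    """Pick the action for one poll of one endpoint."""
    if network_error:
        # DNS errors, timeouts and the like: no response was received.
        return Action.PRESERVE
    if status == 200 and body and known_format:
        return Action.LOAD
    # Non-200 status, empty body or unsupported format.
    return Action.REMOVE
```

For example, decide(404, b"", True) yields Action.REMOVE, while a timeout reported as decide(None, b"", False, network_error=True) yields Action.PRESERVE.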

The configuration of this provider goes into the http_endpoint property. In contrast to the Filesystem provider, it can be configured with as many endpoints to load rule sets from as the particular use case requires.

The following configuration options are supported:

  • watch_interval: Duration (optional)

    The interval at which the configured endpoints are polled for updates. Defaults to 0s (polling disabled).

  • endpoints: RuleSetEndpoint array (mandatory)

    Each entry of that array supports all the properties defined by Endpoint, except method, which is always GET. As with the Endpoint type, at least the url must be configured. The following properties are defined in addition:

    • rule_path_match_prefix: string (optional)

      This property can be used to create a kind of namespace for the rule sets retrieved from the different endpoints. If set, the provider checks whether the URLs specified in all rules retrieved from the referenced endpoint have the defined path prefix. If not, a warning is emitted and the rule set is ignored. This can be used to ensure that a rule retrieved from one endpoint does not collide with a rule from another endpoint.
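The prefix check can be pictured with the following Python sketch. It is a simplification under assumptions: rules are modeled as dictionaries with hypothetical id and url keys, the url is treated as a plain URL rather than a matching pattern, and the helper names are made up for illustration.

```python
from urllib.parse import urlsplit


def violating_rules(rules: list[dict], prefix: str) -> list[str]:
    """Return the ids of rules whose URL path does not start with `prefix`."""
    return [
        rule["id"]
        for rule in rules
        if not urlsplit(rule["url"]).path.startswith(prefix)
    ]


def accept_rule_set(rules: list[dict], prefix: str = "") -> bool:
    """Accept the rule set unless a prefix is set and some rule violates it.

    A single violating rule rejects the whole rule set, mirroring
    "a warning is emitted and the rule set is ignored".
    """
    if not prefix:
        return True
    return not violating_rules(rules, prefix)
```

With a prefix of /foo/bar, a rule set containing a rule for http://foo.bar/other would be rejected as a whole.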

HTTP caching according to RFC 7234 is enabled by default. It can be disabled by setting enable_http_cache to false.

This provider doesn’t need any additional configuration for a rule set. So the response body can be just a list of rules as described in Rule Sets.

Example 3. Minimal possible configuration

Here the provider is configured to load a rule set from one endpoint without polling it for changes.

http_endpoint:
  endpoints:
    - url: http://foo.bar/ruleset1

Example 4. Load rule sets from remote endpoints and watch for changes.

Here, the provider is configured to poll the two defined rule set endpoints for changes every 5 minutes.

The configuration for the first endpoint instructs Heimdall to ensure that all URLs defined in the rules coming from that endpoint match the defined path prefix.

The configuration for the second endpoint defines the rule_path_match_prefix as well. In addition, it sets the retry options to make the communication with that endpoint more resilient and, since the endpoint is protected by an API key, the corresponding auth options.

http_endpoint:
  watch_interval: 5m
  endpoints:
    - url: http://foo.bar/ruleset1
      rule_path_match_prefix: /foo/bar
    - url: http://foo.bar/ruleset2
      rule_path_match_prefix: /bar/foo
      retry:
        give_up_after: 5s
        max_delay: 250ms
      auth:
        type: api_key
        config:
          name: X-Api-Key
          value: super-secret
          in: header

Cloud Blob

This provider loads rule sets from cloud blobs, like AWS S3 buckets, Google Cloud Storage, Azure Blobs, or other API-compatible implementations, and supports rule sets in YAML as well as in JSON format. The format is determined from the Content-Type set in the metadata of the loaded blob, which must be either application/yaml or application/json. Otherwise, an error is logged and the blob is ignored.

The loading and removal of rules happens as follows:

  • if the response has the status code HTTP 200 OK and contains a rule set in a known format (see above), the corresponding rules are loaded (if the definitions are valid).

  • in case of network issues, like DNS errors, timeouts and the like, the rule sets previously received from the corresponding buckets are preserved.

  • in any other case (e.g. a status code other than 200, an empty response body, an unsupported format, etc.), the corresponding rules are removed if these were previously loaded.

The configuration of this provider goes into the cloud_blob property. As with the HTTP Endpoint provider, it can be configured with as many buckets/blobs to load rule sets from as the particular use case requires.

The following configuration options are supported:

  • watch_interval: Duration (optional)

    The interval at which the configured buckets are polled for updates. Defaults to 0s (polling disabled).

  • buckets: BlobReference array (mandatory)

    Each BlobReference entry in that array supports the following properties:

    • url: string (mandatory)

      The URL of the bucket or of a specific blob in the bucket.

    • prefix: string (optional)

      Indicates that only blobs with a key starting with this prefix should be retrieved.

    • rule_path_match_prefix: string (optional)

      Creates a kind of namespace for the rule sets retrieved from the blobs. If set, the provider checks whether the URL patterns specified in all rules retrieved from the referenced bucket have the defined path prefix. If that is not the case, a warning is emitted and the rule set is ignored. This can be used to ensure that a rule retrieved from one bucket does not override a rule from another bucket.

The storage service is selected based on the URL scheme. The supported schemes are:

  • s3 for AWS S3 buckets

  • gs for Google Cloud Storage

  • azblob for Azure Blob Storage

Other API-compatible storage services, like Minio, Ceph, SeaweedFS, etc., can be used as well. The corresponding options, as well as further ones, can be found in the Go CDK Blob documentation, on which the implementation of this provider is based.
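The scheme-based selection can be sketched in a few lines of Python. The mapping below uses the URL schemes registered by the Go CDK blob drivers (s3, gs and azblob); the names STORAGE_BY_SCHEME and storage_for are hypothetical and exist only for this illustration.

```python
from urllib.parse import urlsplit

# URL scheme -> storage service, following the Go CDK blob drivers.
STORAGE_BY_SCHEME = {
    "s3": "AWS S3",
    "gs": "Google Cloud Storage",
    "azblob": "Azure Blob Storage",
}


def storage_for(url: str) -> str:
    """Return the storage service a bucket URL refers to."""
    scheme = urlsplit(url).scheme
    try:
        return STORAGE_BY_SCHEME[scheme]
    except KeyError:
        raise ValueError(f"unsupported blob URL scheme: {scheme!r}") from None
```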

The communication with the storage services requires an active session with the corresponding cloud provider. The session information is taken from the vendor-specific environment variables or configuration. See AWS Session, GC Application Default Credentials and Azure Storage Access for more information.

Example 5. Minimal possible configuration

Here the provider is configured to load rule sets from all blobs stored on the Google Cloud Storage bucket named "my-bucket" without polling for changes.

cloud_blob:
  buckets:
    - url: gs://my-bucket

Example 6. Load rule sets from Google Cloud Storage and AWS S3 buckets and watch for changes.

cloud_blob:
  watch_interval: 2m
  buckets:
    - url: gs://my-bucket
      prefix: service1
      rule_path_match_prefix: /service1
    - url: gs://my-bucket
      prefix: service2
      rule_path_match_prefix: /service2
    - url: s3://my-bucket/my-rule-set?region=us-west-1

Here, the provider is configured to poll multiple buckets with rule sets for changes every 2 minutes.

The first two bucket reference configurations actually refer to the same bucket on Google Cloud Storage, but to different blobs, based on the configured blob prefix. The first one lets Heimdall load only those blobs whose keys start with service1, the second only those whose keys start with service2. As rule_path_match_prefix is defined for both as well, Heimdall will ensure that rule sets loaded from the corresponding blobs do not overlap in their URL matching definitions.

The last one instructs Heimdall to load a rule set from a specific blob, namely a blob named my-rule-set, which resides in the my-bucket AWS S3 bucket located in the us-west-1 AWS region.
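How such bucket references break down into bucket, blob key and options can be illustrated with the Python sketch below. It only demonstrates the URL structure and the prefix filtering described in this section; split_blob_url and select_keys are hypothetical helpers, and the blob keys in the usage note are made up.

```python
from urllib.parse import parse_qs, urlsplit


def split_blob_url(url: str) -> dict:
    """Split a blob URL into scheme, bucket, blob key and query parameters."""
    parts = urlsplit(url)
    return {
        "scheme": parts.scheme,
        "bucket": parts.netloc,
        # Empty when the URL names only the bucket, not a specific blob.
        "key": parts.path.lstrip("/"),
        "params": {k: v[0] for k, v in parse_qs(parts.query).items()},
    }


def select_keys(keys: list[str], prefix: str = "") -> list[str]:
    """Keep only the blob keys that start with `prefix` (all keys if empty)."""
    return [key for key in keys if key.startswith(prefix)]
```

For the last reference, split_blob_url("s3://my-bucket/my-rule-set?region=us-west-1") gives the bucket my-bucket, the key my-rule-set and the parameter region=us-west-1. For the first two, select_keys(["service1/a.yaml", "service2/b.yaml"], "service1") keeps only service1/a.yaml.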

Last updated on Nov 9, 2022