Skip to main content

cached

EXPERIMENTAL

This component is experimental and therefore subject to change or removal outside of major version releases.

Cache the result of applying one or more processors to messages identified by a key. If the key already exists within the cache the contents of the message will be replaced with the cached result instead of applying the processors. This component is therefore useful in situations where an expensive set of processors need only be executed periodically.

Introduced in version 4.3.0.

# Config fields, showing default values
label: ""
cached:
cache: ""
key: ""
ttl: ""
processors: []

The format of the data when stored within the cache is a custom and versioned schema chosen to balance performance and storage space. It is therefore not possible to point this processor to a cache that is pre-populated with data that this processor has not created itself.

Fields

cache

The cache resource to read and write processor results from.

Type: string

key

A key to be resolved for each message, if the key already exists in the cache then the cached result is used, otherwise the processors are applied and the result is cached under this key. The key could be static and therefore apply generally to all messages or it could be an interpolated expression that is potentially unique for each message. This field supports interpolation functions.

Type: string

# Examples
key: my_foo_result
key: ${! this.document.id }
key: ${! meta("kafka_key") }
key: ${! meta("kafka_topic") }

ttl

An optional expiry period to set for each cache entry. Some caches only have a general TTL and will therefore ignore this setting.

Type: string

processors

The list of processors whose result will be cached.

Type: array

Examples

In the following example we want to we enrich messages consumed from Kafka with data specific to the origin topic partition, we do this by placing an http processor within a branch, where the HTTP URL contains interpolation functions with the topic and partition in the path.

However, it would be inefficient to make this HTTP request for every single message as the result is consistent for all data of a given topic partition. We can solve this by placing our enrichment call within a cached processor where the key contains the topic and partition, resulting in messages that originate from the same topic/partition combination using the cached result of the prior.

pipeline:
processors:
- branch:
processors:
- cached:
key: '${! meta("kafka_topic") }-${! meta("kafka_partition") }'
cache: foo_cache
processors:
- bloblang: 'root = ""'
- http:
url: http://example.com/enrichment/${! meta("kafka_topic") }/${! meta("kafka_partition") }
verb: GET
result_map: 'root.enrichment = this'
cache_resources:
- label: foo_cache
memory:
# Disable compaction so that cached items never expire
compaction_interval: ""