

This component is experimental and therefore subject to change or removal outside of major version releases.

Decodes Parquet files into a batch of structured messages.

Introduced in version 4.4.0.

# Config fields, showing default values
label: ""
parquet_decode:
  byte_array_as_string: false

This processor relies on an underlying Parquet library that is itself experimental. Changes could therefore be made to how this processor functions outside of major version releases.

By default any BYTE_ARRAY or FIXED_LEN_BYTE_ARRAY value will be extracted as a byte slice ([]byte) unless the logical type is UTF8, in which case they are extracted as a string (string).

When a value extracted as a byte slice exists within a document that is later serialized as JSON, it is base64 encoded into a string, which is the default for arbitrary data fields. It is possible to convert these binary values to strings (or other data types) using Bloblang transformations such as root.foo = this.foo.string() or root.foo = this.foo.encode("hex"), etc.

However, in cases where all BYTE_ARRAY values within your data are strings, it may be easier to set the config field byte_array_as_string to true so that all of these values are automatically extracted as strings.
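The Bloblang conversion described above can be sketched as a small pipeline fragment. This is a minimal illustration rather than a canonical config: the column names name and checksum are hypothetical, and it assumes a release that includes the mapping processor (older releases may need the bloblang processor instead).

```yaml
pipeline:
  processors:
    # Expand each Parquet file into a batch of structured messages,
    # leaving BYTE_ARRAY values without a UTF8 logical type as byte slices.
    - parquet_decode: {}

    # Convert selected binary fields before the document is serialized as JSON.
    # `name` and `checksum` are hypothetical column names used for illustration.
    - mapping: |
        root = this
        root.name = this.name.string()
        root.checksum = this.checksum.encode("hex")
```

Converting field by field like this is mainly useful when only some binary columns hold text; when every BYTE_ARRAY column is a string, setting byte_array_as_string is the simpler option.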



byte_array_as_string

Whether to extract BYTE_ARRAY and FIXED_LEN_BYTE_ARRAY values as strings rather than byte slices in all cases. Values with a logical type of UTF8 will automatically be extracted as strings irrespective of this field. Enabling this field makes serialising the data as JSON more intuitive, as []byte values are serialised as base64 encoded strings by default.

Type: bool
Default: false


In this example we consume files from AWS S3 as they're written by listening to an SQS queue for upload events. We make sure to use the all-bytes codec so that each file is read into memory in full, which allows a parquet_decode processor to expand each file into a batch of messages. Finally, we write the data out to local files as newline-delimited JSON.

input:
  aws_s3:
    bucket: TODO
    prefix: foos/
    codec: all-bytes
    sqs:
      url: TODO
  processors:
    - parquet_decode:
        byte_array_as_string: true

output:
  file:
    codec: lines
    path: './foos/${! meta("s3_key") }.jsonl'