unarchive

Unarchives each message according to the selected archive format, expanding it into multiple messages within a batch.

unarchive:
  format: binary

When a message is unarchived, the new messages replace the original message in the batch. Messages that are selected but fail to unarchive (for example, because of an invalid format) remain unchanged in the batch but are flagged as failed, which allows you to apply error handling to them.

For the unarchive formats that contain file information (tar, zip), a metadata field called archive_filename is added to each message, containing the extracted filename.

Fields

format

Type: string. The unarchive format to use.

Options are: tar, zip, binary, lines, json_documents, json_array, json_map.

parts

Type: array. An optional array of indexes of the messages in a batch that the processor should apply to. If left empty, all messages are processed. This field is only applicable when messages are batched at the input level.

Indexes can be negative, in which case the part is selected from the end of the batch, counting backwards from -1 (the last message).
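
The index rules above can be sketched in Python as follows. This is only an illustration and not the processor's implementation: the helper name select_parts is hypothetical, and silently skipping out-of-range indexes is an assumption that this page does not confirm.

```python
def select_parts(batch_size, parts):
    """Return the batch positions targeted by `parts` (hypothetical helper)."""
    # An empty `parts` list means every message in the batch is processed.
    if not parts:
        return list(range(batch_size))

    selected = []
    for index in parts:
        # Negative indexes count backwards from the end: -1 is the last message.
        position = index + batch_size if index < 0 else index
        # Assumption: indexes that fall outside the batch are ignored.
        if 0 <= position < batch_size:
            selected.append(position)
    return selected
```

For a batch of four messages, parts of [0, -1] would select the first and the last message under these assumptions.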

Formats

tar

Extract messages from a standard Unix tape archive.
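
As a rough illustration of what the expansion produces, the Python sketch below uses the standard tarfile module to turn one archive into a list of messages, each carrying an archive_filename metadata value. The function name unarchive_tar and the dictionary shape are hypothetical, and skipping entries that are not regular files is an assumption rather than documented behaviour.

```python
import io
import tarfile


def unarchive_tar(blob):
    """Expand an uncompressed tar archive into message dicts (hypothetical shape)."""
    messages = []
    with tarfile.open(fileobj=io.BytesIO(blob), mode="r:") as archive:
        for member in archive:
            # Assumption: directories and other non-regular entries are skipped.
            if not member.isfile():
                continue
            content = archive.extractfile(member).read()
            messages.append(
                {
                    "content": content,
                    "metadata": {"archive_filename": member.name},
                }
            )
    return messages
```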

zip

Extract messages from a zip file.

binary

Extract messages from a binary blob format consisting of:

  • Four bytes containing the number of messages in the batch (in big endian)
  • For each message part:
    • Four bytes containing the length of the message (in big endian)
    • The content of the message
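
The layout above can be read and written with a few lines of Python. This is a minimal sketch, not the processor's own code: the function names are hypothetical, and treating both four-byte fields as unsigned integers is an assumption, since the description only specifies their width and byte order.

```python
import struct


def encode_binary_archive(messages):
    """Pack a list of byte strings into the blob layout described above."""
    # Four-byte big-endian message count (assumed unsigned).
    blob = struct.pack(">I", len(messages))
    for message in messages:
        # Four-byte big-endian length, followed by the raw content.
        blob += struct.pack(">I", len(message)) + message
    return blob


def decode_binary_archive(blob):
    """Split a blob in the layout described above into its messages."""
    if len(blob) < 4:
        raise ValueError("blob is too short to contain a message count")
    (count,) = struct.unpack_from(">I", blob, 0)
    offset = 4
    messages = []
    for _ in range(count):
        if offset + 4 > len(blob):
            raise ValueError("blob ends before a message length")
        (length,) = struct.unpack_from(">I", blob, offset)
        offset += 4
        end = offset + length
        if end > len(blob):
            raise ValueError("blob ends before the message content")
        messages.append(blob[offset:end])
        offset = end
    return messages
```

Raising an error on truncated input mirrors the idea that a message with an invalid format fails to unarchive instead of being partially expanded.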

lines

Extract the lines of a message each into their own message.

json_documents

Attempt to parse a message as a stream of concatenated JSON documents. Each parsed document is expanded into a new message.
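
One way to picture a stream of concatenated JSON documents is the Python sketch below, which uses json.JSONDecoder.raw_decode from the standard library to consume one document at a time. The function name is hypothetical, and allowing optional whitespace between documents is an assumption about the input, not a statement about how the processor parses it.

```python
import json


def split_json_documents(text):
    """Parse concatenated JSON documents and return them as a list."""
    decoder = json.JSONDecoder()
    documents = []
    position = 0
    while True:
        # Skip whitespace separating documents (assumed to be permitted).
        while position < len(text) and text[position].isspace():
            position += 1
        if position >= len(text):
            break
        # raw_decode returns the parsed value and the index where it ended.
        document, position = decoder.raw_decode(text, position)
        documents.append(document)
    return documents
```

Each element of the returned list corresponds to one new message. Input that is not valid JSON raises json.JSONDecodeError, which corresponds to the message failing to unarchive.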

json_array

Attempt to parse a message as a JSON array, and extract each element into its own message.

json_map

Attempt to parse the message as a JSON map and expand the contents of each element into a new message. A metadata field called archive_key is added to each message, containing the relevant key from the top-level map.
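
The effect of json_map can be sketched in Python as follows. The function name and the dictionary shape used to represent a message are hypothetical, and the output order simply follows the order of keys in the input, which this page does not specify for the processor itself.

```python
import json


def unarchive_json_map(raw):
    """Expand a top-level JSON object into one message dict per key."""
    parsed = json.loads(raw)
    if not isinstance(parsed, dict):
        # A non-object document cannot be expanded as a map.
        raise ValueError("expected a JSON object at the top level")
    return [
        {
            # Each value becomes the content of its own message.
            "content": json.dumps(value),
            # The top-level key is kept as metadata.
            "metadata": {"archive_key": key},
        }
        for key, value in parsed.items()
    ]
```

Under these assumptions, the input {"foo": {"a": 1}, "bar": [1, 2]} yields two messages, one with archive_key set to foo and one with archive_key set to bar.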