file

Consumes data from files on disk, emitting messages according to the chosen scanner.

# Common config fields, showing default values
input:
  label: ""
  file:
    paths: [] # No default (required)
    scanner:
      lines: {}

Metadata

This input adds the following metadata fields to each message:

- path
- mod_time_unix
- mod_time (RFC3339)

You can access these metadata fields using function interpolation.
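
As a minimal sketch of what this can look like, the fragment below logs the source path and modification time of each message. It assumes the `log` processor and the `metadata` interpolation function are available in your version; the message text itself is only illustrative.

```yaml
pipeline:
  processors:
    - log:
        level: INFO
        # metadata("path") and metadata("mod_time") read the fields added by the file input
        message: 'read from ${! metadata("path") } (modified ${! metadata("mod_time") })'
```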

Fields

paths

A list of paths to consume sequentially. Glob patterns are supported, including super globs (double star).

Type: array
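
For illustration, a list mixing a literal path, a plain glob and a super glob might look like the following. The directory and file names are hypothetical.

```yaml
input:
  file:
    paths:
      - ./data/first.log    # a single literal path (hypothetical name)
      - ./data/*.log        # every .log file directly under ./data
      - ./archive/**/*.log  # super glob: every .log file at any depth under ./archive
```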

scanner

The scanner by which the stream of bytes consumed will be broken out into individual messages. Scanners are useful for processing large sources of data without holding the entirety of it within memory. For example, the csv scanner allows you to process individual CSV rows without loading the entire CSV file in memory at once.

Type: scanner
Default: {"lines":{}}
Requires version 4.25.0 or newer

delete_on_finish

Whether to delete input files from the disk once they are fully consumed.

Type: bool
Default: false
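
A sketch of a config that removes each file once it has been fully read, assuming a hypothetical `./inbox` directory of line-delimited files:

```yaml
input:
  file:
    paths: [ ./inbox/*.jsonl ]  # hypothetical directory and extension
    scanner:
      lines: {}
    # Deletes each input file from disk after it has been fully consumed
    delete_on_finish: true
```

Because deletion is permanent, it may be worth trying a config like this against a copy of the data first.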

Examples

To consume a directory of CSV files as structured documents, use a glob pattern and the csv scanner:

input:
  file:
    paths: [ ./data/*.csv ]
    scanner:
      csv: {}