codec

package
v4.0.0-...-13a3402
Warning

This package is not in the latest version of its module.

Published: Aug 27, 2023 License: MIT Imports: 19 Imported by: 0

Documentation

Constants

This section is empty.

Variables

var ReaderDocs = docs.FieldString(
	"codec", "The way in which the bytes of a data source should be converted into discrete messages, codecs are useful for specifying how large files or continuous streams of data might be processed in small chunks rather than loading it all in memory. It's possible to consume lines using a custom delimiter with the `delim:x` codec, where x is the character sequence custom delimiter. Codecs can be chained with `/`, for example a gzip compressed CSV file can be consumed with the codec `gzip/csv`.", "lines", "delim:\t", "delim:foobar", "gzip/csv",
).HasAnnotatedOptions(
	"auto", "EXPERIMENTAL: Attempts to derive a codec for each file based on information such as the extension. For example, a .tar.gz file would be consumed with the `gzip/tar` codec. Defaults to all-bytes.",
	"all-bytes", "Consume the entire file as a single binary message.",
	"avro-ocf:marshaler=x", "EXPERIMENTAL: Consume a stream of Avro OCF datum. The `marshaler` parameter is optional and has the options: `goavro` (default), `json`. Use `goavro` if OCF contains logical types.",
	"chunker:x", "Consume the file in chunks of a given number of bytes.",
	"csv", "Consume structured rows as comma separated values, the first row must be a header row.",
	"csv:x", "Consume structured rows as values separated by a custom delimiter, the first row must be a header row. The custom delimiter must be a single character, e.g. the codec `\"csv:\\t\"` would consume a tab delimited file.",
	"csv-safe", "Consume structured rows like `csv`, but sends messages with empty maps on failure to parse. Includes row number and parsing errors (if any) in the message's metadata.",
	"delim:x", "Consume the file in segments divided by a custom delimiter.",
	"gzip", "Decompress a gzip file, this codec should precede another codec, e.g. `gzip/all-bytes`, `gzip/tar`, `gzip/csv`, etc.",
	"pgzip", "Decompress a gzip file in parallel, this codec should precede another codec, e.g. `pgzip/all-bytes`, `pgzip/tar`, `pgzip/csv`, etc.",
	"lines", "Consume the file in segments divided by linebreaks.",
	"multipart", "Consumes the output of another codec and batches messages together. A batch ends when an empty message is consumed. For example, the codec `lines/multipart` could be used to consume multipart messages where an empty line indicates the end of each batch.",
	"regex:(?m)^\\d\\d:\\d\\d:\\d\\d", "Consume the file in segments divided by regular expression.",
	"skipbom", "Skip one or more byte order marks for each opened reader, this codec should precede another codec, e.g. `skipbom/csv`, etc.",
	"tar", "Parse the file as a tar archive, and consume each file of the archive as a message.",
)

ReaderDocs is the static field documentation for input codecs.
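
The chaining rule described in the field documentation (for example `gzip/csv`) can be pictured as splitting the codec string on `/` and then separating an optional argument after the first `:`. The following is a minimal sketch of that idea, not the package's own parser; `codecStage` and `parseCodecChain` are hypothetical names, and the naive split would mishandle arguments that themselves contain `/`.

```go
package main

import (
	"fmt"
	"strings"
)

// codecStage is a hypothetical name for one element of a codec chain.
type codecStage struct {
	Name string // for example "gzip", "csv" or "delim"
	Arg  string // text after the first ':', empty when absent
}

// parseCodecChain splits a codec string such as "gzip/csv" into stages.
// It is an illustration only: arguments containing '/' are not supported.
func parseCodecChain(codec string) []codecStage {
	var stages []codecStage
	for _, part := range strings.Split(codec, "/") {
		name, arg, _ := strings.Cut(part, ":")
		stages = append(stages, codecStage{Name: name, Arg: arg})
	}
	return stages
}

func main() {
	for _, c := range []string{"gzip/csv", "delim:foobar", "lines/multipart"} {
		fmt.Printf("%q -> %+v\n", c, parseCodecChain(c))
	}
}
```

Each stage would then wrap the output of the previous one, which is why the documentation says that `gzip` and `skipbom` should precede another codec.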

var WriterDocs = docs.FieldString(
	"codec", "The way in which the bytes of messages should be written out into the output data stream. It's possible to write lines using a custom delimiter with the `delim:x` codec, where x is the character sequence custom delimiter.", "lines", "delim:\t", "delim:foobar",
).HasAnnotatedOptions(
	"all-bytes", "Only applicable to file based outputs. Writes each message to a file in full, if the file already exists the old content is deleted.",
	"append", "Append each message to the output stream without any delimiter or special encoding.",
	"lines", "Append each message to the output stream followed by a line break.",
	"delim:x", "Append each message to the output stream followed by a custom delimiter.",
)

WriterDocs is the static field documentation for output codecs.
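
The `append`, `lines` and `delim:x` options differ only in what follows each message in the output stream. A minimal sketch of that behaviour, assuming a plain `io.Writer` as the destination; `writeDelimited` and `render` are hypothetical helpers and not part of this package.

```go
package main

import (
	"bytes"
	"fmt"
	"io"
)

// writeDelimited appends msg to w followed by delim.
// An empty delim models "append", "\n" models "lines".
func writeDelimited(w io.Writer, msg []byte, delim string) error {
	if _, err := w.Write(msg); err != nil {
		return err
	}
	_, err := io.WriteString(w, delim)
	return err
}

// render writes every message into an in-memory buffer and returns the result.
func render(delim string, msgs ...string) string {
	var buf bytes.Buffer
	for _, m := range msgs {
		if err := writeDelimited(&buf, []byte(m), delim); err != nil {
			panic(err) // writes to a bytes.Buffer are not expected to fail
		}
	}
	return buf.String()
}

func main() {
	fmt.Printf("%q\n", render("\n", "foo", "bar"))     // lines
	fmt.Printf("%q\n", render("foobar", "foo", "bar")) // delim:foobar
	fmt.Printf("%q\n", render("", "foo", "bar"))       // append
}
```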

Functions

func GetWriter

func GetWriter(codec string) (WriterConstructor, WriterConfig, error)

GetWriter returns a constructor that creates write codecs.

Types

type Reader

type Reader interface {
	Next(context.Context) ([]*message.Part, ReaderAckFn, error)
	Close(context.Context) error
}

Reader is a codec type that reads message parts from a source.

type ReaderAckFn

type ReaderAckFn func(context.Context, error) error

ReaderAckFn is a function provided to a reader codec that it should call once the underlying io.ReadCloser is fully consumed.

type ReaderConfig

type ReaderConfig struct {
	MaxScanTokenSize int
}

ReaderConfig is a general configuration struct that covers all reader codecs.

func NewReaderConfig

func NewReaderConfig() ReaderConfig

NewReaderConfig creates a reader configuration with default values.
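
The page does not say how MaxScanTokenSize is applied. Its name matches `bufio.MaxScanTokenSize` from the standard library, so one plausible reading is that it caps the size of a single scanned token (for example one line). The sketch below illustrates that reading with `bufio.Scanner` only; `scanLines` is a hypothetical helper, and the link to this package's internals is an assumption.

```go
package main

import (
	"bufio"
	"fmt"
	"strings"
)

// scanLines splits input into lines while capping the scanner's buffer at
// maxTokenSize bytes. A line that cannot fit in the buffer together with its
// line break makes the scan stop with bufio.ErrTooLong.
func scanLines(input string, maxTokenSize int) ([]string, error) {
	sc := bufio.NewScanner(strings.NewReader(input))
	sc.Buffer(nil, maxTokenSize) // must be called before the first Scan
	var lines []string
	for sc.Scan() {
		lines = append(lines, sc.Text())
	}
	return lines, sc.Err()
}

func main() {
	lines, err := scanLines("ab\ncd\n", 8)
	fmt.Println(lines, err)

	_, err = scanLines("0123456789abcdef\n", 8)
	fmt.Println(err)
}
```

Under this reading, raising the limit trades memory for the ability to consume longer lines or segments.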

type ReaderConstructor

type ReaderConstructor func(string, io.ReadCloser, ReaderAckFn) (Reader, error)

ReaderConstructor creates a reader from a filename, an io.ReadCloser, and an ack func that the reader calls once it has finished with the io.ReadCloser. The filename can be empty and is usually ignored, but certain codecs may need it.
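
The relationship between a constructor, the reader it returns and the two levels of acknowledgement can be sketched with stand-in types. This is a simplified illustration and not this package's implementation: `lineReader`, `newLineReader` and `readAll` are hypothetical, raw `[]byte` lines replace `[]*message.Part`, and the source-level ack is issued from Close.

```go
package main

import (
	"bufio"
	"context"
	"errors"
	"fmt"
	"io"
	"strings"
)

// ackFn mirrors the shape of ReaderAckFn.
type ackFn func(context.Context, error) error

// lineReader is a hypothetical, simplified reader that yields one line at a time.
type lineReader struct {
	src       io.ReadCloser
	scanner   *bufio.Scanner
	sourceAck ackFn
	finished  bool
}

// newLineReader has the same parameter shape as ReaderConstructor.
func newLineReader(path string, src io.ReadCloser, ack ackFn) (*lineReader, error) {
	_ = path // a line-based reader has no use for the filename
	return &lineReader{src: src, scanner: bufio.NewScanner(src), sourceAck: ack}, nil
}

// Next returns the next line with a per-message ack, or io.EOF when exhausted.
func (r *lineReader) Next(ctx context.Context) ([]byte, ackFn, error) {
	if r.scanner.Scan() {
		line := append([]byte(nil), r.scanner.Bytes()...) // copy, the scanner reuses its buffer
		return line, func(context.Context, error) error { return nil }, nil
	}
	if err := r.scanner.Err(); err != nil {
		return nil, nil, err
	}
	r.finished = true
	return nil, nil, io.EOF
}

// Close releases the source and reports the outcome through the source-level ack.
func (r *lineReader) Close(ctx context.Context) error {
	closeErr := r.src.Close()
	var ackErr error
	if !r.finished {
		ackErr = errors.New("reader closed before the source was fully consumed")
	}
	if err := r.sourceAck(ctx, ackErr); err != nil {
		return err
	}
	return closeErr
}

// readAll drains input through a lineReader and reports whether the source
// ack was called without an error.
func readAll(input string) (lines []string, acked bool, err error) {
	ctx := context.Background()
	r, err := newLineReader("", io.NopCloser(strings.NewReader(input)),
		func(_ context.Context, e error) error {
			acked = e == nil
			return nil
		})
	if err != nil {
		return nil, false, err
	}
	for {
		line, ack, nerr := r.Next(ctx)
		if errors.Is(nerr, io.EOF) {
			break
		}
		if nerr != nil {
			return nil, false, nerr
		}
		lines = append(lines, string(line))
		_ = ack(ctx, nil) // acknowledge each message once it is handled
	}
	err = r.Close(ctx)
	return lines, acked, err
}

func main() {
	lines, acked, err := readAll("foo\nbar\n")
	fmt.Println(lines, acked, err)
}
```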

func GetReader

func GetReader(codec string, conf ReaderConfig) (ReaderConstructor, error)

GetReader returns a constructor that creates reader codecs.

type Writer

type Writer interface {
	Write(context.Context, *message.Part) error
	Close(context.Context) error
}

Writer is a codec type that writes message parts to an output destination.

type WriterConfig

type WriterConfig struct {
	Append     bool
	Truncate   bool
	CloseAfter bool
}

WriterConfig contains codec-specific configuration describing how write handles should be provided to the codec.
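
The page does not describe how a caller is expected to act on these fields. For a file-based output, one plausible interpretation is that Append and Truncate select the flags used to open the file, which would match the `all-bytes` description (old content is deleted) and the `append`, `lines` and `delim:x` descriptions (content is added to the stream). The sketch below illustrates that assumption with the standard library only; `writerConfig`, `openFlags` and `writeEach` are hypothetical, and the helper reopens and closes the file for every message, which is roughly what CloseAfter appears to request.

```go
package main

import (
	"fmt"
	"os"
	"path/filepath"
)

// writerConfig copies the three documented fields of WriterConfig.
type writerConfig struct {
	Append     bool
	Truncate   bool
	CloseAfter bool
}

// openFlags is a hypothetical mapping from the config to os.OpenFile flags.
func openFlags(cfg writerConfig) int {
	flags := os.O_CREATE | os.O_WRONLY
	if cfg.Append {
		flags |= os.O_APPEND
	}
	if cfg.Truncate {
		flags |= os.O_TRUNC
	}
	return flags
}

// writeEach opens the same temporary file once per message, writes the
// message, closes the file, and finally returns the file's contents.
func writeEach(cfg writerConfig, msgs ...string) (string, error) {
	dir, err := os.MkdirTemp("", "codec-sketch")
	if err != nil {
		return "", err
	}
	defer os.RemoveAll(dir)

	path := filepath.Join(dir, "out.txt")
	for _, m := range msgs {
		f, err := os.OpenFile(path, openFlags(cfg), 0o644)
		if err != nil {
			return "", err
		}
		_, werr := f.WriteString(m)
		cerr := f.Close()
		if werr != nil {
			return "", werr
		}
		if cerr != nil {
			return "", cerr
		}
	}
	data, err := os.ReadFile(path)
	return string(data), err
}

func main() {
	appended, err := writeEach(writerConfig{Append: true}, "a", "b")
	fmt.Printf("%q %v\n", appended, err)

	truncated, err := writeEach(writerConfig{Truncate: true, CloseAfter: true}, "a", "b")
	fmt.Printf("%q %v\n", truncated, err)
}
```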

type WriterConstructor

type WriterConstructor func(io.WriteCloser) (Writer, error)

WriterConstructor creates a writer from an io.WriteCloser.
