jsonrl

package
Version: v0.0.0-...-86e9f11
Warning

This package is not in the latest version of its module.

Published: Jan 7, 2024 License: Apache-2.0 Imports: 18 Imported by: 0

Documentation

Overview

Package jsonrl implements a Ragel-generated JSON parser that converts JSON data into ion data (see Convert).

Constants

const MaxDatumSize = 4 * 1024 * 1024

MaxDatumSize is the maximum size of a terminal datum in the JSON input. Fields that exceed this size are rejected. (In practice this is the upper bound on the size of strings in the source data.)

const MaxIndexingDepth = 3

MaxIndexingDepth is the maximum depth at which sparse indexing metadata will be collected.

const MaxObjectDepth = 64

MaxObjectDepth is the maximum level of recursion allowed in a JSON object.
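As an illustration of what a depth limit means, the following sketch measures the nesting depth of a JSON document using only the standard library. It is not how jsonrl enforces the limit; the helper name nestingDepth is hypothetical, and counting arrays as well as objects toward the depth is an assumption.

```go
package main

import (
	"encoding/json"
	"fmt"
	"io"
	"strings"
)

// maxObjectDepth mirrors the documented value of MaxObjectDepth.
const maxObjectDepth = 64

// nestingDepth (hypothetical helper) returns the deepest level of
// object/array nesting found in the JSON text s.
// Assumption: arrays count toward the depth just like objects.
func nestingDepth(s string) (int, error) {
	dec := json.NewDecoder(strings.NewReader(s))
	depth, deepest := 0, 0
	for {
		tok, err := dec.Token()
		if err == io.EOF {
			return deepest, nil
		}
		if err != nil {
			return 0, err
		}
		if d, ok := tok.(json.Delim); ok {
			switch d {
			case '{', '[':
				depth++
				if depth > deepest {
					deepest = depth
				}
			case '}', ']':
				depth--
			}
		}
	}
}

func main() {
	d, err := nestingDepth(`{"a": {"b": [1, 2]}}`)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(d, d <= maxObjectDepth) // prints "3 true"
}
```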

Variables

var (
	// ErrNoMatch is returned from Convert
	// when the input is not a well-formed
	// JSON object
	ErrNoMatch = errors.New("jsonrl: bad JSON object")
	// ErrTooLarge is returned from Convert
	// when the input would require more than
	// MaxObjectSize bytes of buffering in order
	// for a complete object to be parsed.
	ErrTooLarge = errors.New("jsonrl: object too large")
)

Functions

func Convert

func Convert(src io.Reader, dst *ion.Chunker, hints *Hint, cons []ion.Field) error

Convert reads JSON records from src and writes them to dst. If hints is non-nil, it uses hints to determine how certain fields are interpreted.

The JSON in src should be zero or more records, optionally wrapped in a JSON array. Convert will automatically flatten top-level arrays-of-records.

Convert will return an error if the input JSON is malformed, if it violates some internal limit (see MaxDatumSize, MaxObjectDepth, MaxIndexingDepth), or if the object does not fit in dst.Align after being serialized as ion data.
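The flattening behavior described above can be sketched with the standard library alone. This is not the Ragel-based implementation and it does not produce ion data; flattenRecords is a hypothetical helper that only shows how a stream of records, some wrapped in top-level arrays, reduces to a flat sequence of records.

```go
package main

import (
	"bytes"
	"encoding/json"
	"fmt"
	"io"
	"strings"
)

// flattenRecords (hypothetical helper) reads zero or more JSON values
// from s. A top-level array is unwrapped so that each of its elements
// becomes its own record; any other value is kept as a single record.
func flattenRecords(s string) ([]string, error) {
	dec := json.NewDecoder(strings.NewReader(s))
	var out []string
	for {
		var raw json.RawMessage
		err := dec.Decode(&raw)
		if err == io.EOF {
			return out, nil
		}
		if err != nil {
			return nil, err
		}
		trimmed := bytes.TrimSpace(raw)
		if len(trimmed) > 0 && trimmed[0] == '[' {
			// Unwrap one level only: nested arrays stay as they are.
			var elems []json.RawMessage
			if err := json.Unmarshal(trimmed, &elems); err != nil {
				return nil, err
			}
			for _, e := range elems {
				out = append(out, string(e))
			}
			continue
		}
		out = append(out, string(trimmed))
	}
}

func main() {
	recs, err := flattenRecords(`[{"a":1},{"b":2}] {"c":3}`)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	for _, r := range recs {
		fmt.Println(r)
	}
	// Output:
	// {"a":1}
	// {"b":2}
	// {"c":3}
}
```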

func ConvertCloudtrail

func ConvertCloudtrail(src io.Reader, dst *ion.Chunker, cons []ion.Field) error

ConvertCloudtrail works like Convert, except that it expects src to be formatted like AWS Cloudtrail logs, and it automatically flattens the elements of the top-level "Records" array into the structure fields.

For example, an input like this:

{"Records": [{"a": "b"}, {"c": "d"}]}

would become

{"a": "b"}
{"c": "d"}
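The transformation in the example above can be reproduced with encoding/json. This is a standard-library sketch of the reshaping only, not of ConvertCloudtrail itself; the type cloudtrailFile and the helper flattenCloudtrail are hypothetical names.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// cloudtrailFile (hypothetical type) models only the top-level
// "Records" array; every element is kept as raw JSON.
type cloudtrailFile struct {
	Records []json.RawMessage `json:"Records"`
}

// flattenCloudtrail (hypothetical helper) returns each element of the
// top-level "Records" array as a separate record.
func flattenCloudtrail(s string) ([]string, error) {
	var f cloudtrailFile
	if err := json.Unmarshal([]byte(s), &f); err != nil {
		return nil, err
	}
	out := make([]string, 0, len(f.Records))
	for _, r := range f.Records {
		out = append(out, string(r))
	}
	return out, nil
}

func main() {
	recs, err := flattenCloudtrail(`{"Records": [{"a": "b"}, {"c": "d"}]}`)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	for _, r := range recs {
		fmt.Println(r)
	}
	// Output:
	// {"a": "b"}
	// {"c": "d"}
}
```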

Types

type Hint

type Hint struct {
	// contains filtered or unexported fields
}

Hint represents a structure containing type-hints and/or other flags to be used by the JSON parser. See ParseHint for further information.

func ParseHint

func ParseHint(rules []byte) (hint *Hint, err error)

ParseHint parses JSON-encoded rules into a Hint structure which can later be used to pass type-hints and/or other flags to the JSON parser.

The input must contain a valid JSON array with the individual rules:

[
  { "path": "path.to.value.a", "hints": "hint" },
  { "path": "path.to.value.b", "hints": ["hint_a", "hint_b"] }
]

A JSON object may be used as an alternative (not recommended):

{
  "path.to.value.a": "hint",
  "path.to.value.b": ["hint_a", "hint_b"]
}

The precedence of overlapping rules is determined by the order in which the rules are written.

The '?'/'[?]' wildcard can be used to match all keys of the current level.

The '*'/'[*]' wildcard can be used to match all keys of the current level and all following levels. It must be the last segment in the path.
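One way to read the two wildcard rules is sketched below. The function matchPath is hypothetical and is not taken from the package. It assumes that paths are split on '.', that the bracketed forms are plain synonyms written as their own segment, and that '*' needs at least one remaining level to match.

```go
package main

import (
	"fmt"
	"strings"
)

// matchPath (hypothetical helper) reports whether a dotted rule
// pattern matches a dotted field path.
// Assumptions: "?" matches exactly one level, "*" matches one or more
// remaining levels and is only valid as the last pattern segment.
func matchPath(pattern, path string) bool {
	pat := strings.Split(pattern, ".")
	segs := strings.Split(path, ".")
	for i, p := range pat {
		switch p {
		case "*", "[*]":
			// Valid only in last position; consumes the rest of the path.
			return i == len(pat)-1 && len(segs) > i
		case "?", "[?]":
			if i >= len(segs) {
				return false
			}
		default:
			if i >= len(segs) || segs[i] != p {
				return false
			}
		}
	}
	return len(pat) == len(segs)
}

func main() {
	fmt.Println(matchPath("a.?.c", "a.b.c")) // true
	fmt.Println(matchPath("a.?", "a.b.c"))   // false
	fmt.Println(matchPath("a.*", "a.b.c"))   // true
	fmt.Println(matchPath("a.*.c", "a.b.c")) // false
}
```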

Supported actions:

  • `ignore` -> do not parse this property
  • `no_index` -> do not add this property to the sparse index

Supported hints:

  • string
  • number -> either float or int
  • int
  • bool
  • datetime -> RFC3339Nano
  • unix_seconds
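To show the shape of the recommended array form, the sketch below decodes it with encoding/json. It accepts "hints" as either a single string or an array of strings. The types rule and hintList are hypothetical and do not reflect the unexported fields of Hint; the sketch does not validate hint names or handle the object form.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// hintList (hypothetical type) accepts either "hint" or ["a", "b"].
type hintList []string

// UnmarshalJSON tries the single-string form first, then the array form.
func (h *hintList) UnmarshalJSON(data []byte) error {
	var one string
	if err := json.Unmarshal(data, &one); err == nil {
		*h = hintList{one}
		return nil
	}
	var many []string
	if err := json.Unmarshal(data, &many); err != nil {
		return err
	}
	*h = many
	return nil
}

// rule (hypothetical type) is one entry of the rules array.
type rule struct {
	Path  string   `json:"path"`
	Hints hintList `json:"hints"`
}

// parseRules decodes the array form, preserving the written order.
func parseRules(data []byte) ([]rule, error) {
	var rules []rule
	if err := json.Unmarshal(data, &rules); err != nil {
		return nil, err
	}
	return rules, nil
}

func main() {
	rules, err := parseRules([]byte(`[
	  { "path": "path.to.value.a", "hints": "int" },
	  { "path": "path.to.value.b", "hints": ["string", "no_index"] }
	]`))
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	for _, r := range rules {
		fmt.Println(r.Path, len(r.Hints))
	}
	// Output:
	// path.to.value.a 1
	// path.to.value.b 2
}
```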

func (*Hint) String

func (n *Hint) String() string

func (*Hint) UnmarshalJSON

func (n *Hint) UnmarshalJSON(data []byte) error

type MultiWriter

type MultiWriter interface {
	// Open should open a new output stream.
	// All calls to the Write method of
	// the output stream are guaranteed to
	// be of a fixed block alignment.
	// Close will be called on each stream
	// when the blocks are done being written.
	//
	// Calls to Write on the returned io.Writer
	// are allowed to return io.EOF if they
	// would no longer like to receive input.
	Open() (io.WriteCloser, error)
	io.Closer
	CloseError(error)
}

MultiWriter is an interface satisfied by ion output destinations that support multi-stream output.
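A minimal in-memory implementation might look like the sketch below. To keep the example self-contained, the interface is redeclared locally as multiWriter. The type memWriter is hypothetical, and treating CloseError as "record the error that aborted the output" is an assumption, since its semantics are not documented here.

```go
package main

import (
	"bytes"
	"errors"
	"fmt"
	"io"
	"sync"
)

// multiWriter is a local copy of the documented MultiWriter interface.
type multiWriter interface {
	Open() (io.WriteCloser, error)
	io.Closer
	CloseError(error)
}

// memStream is one in-memory output stream.
type memStream struct {
	bytes.Buffer
	closed bool
}

// Close marks the stream as finished.
func (s *memStream) Close() error {
	s.closed = true
	return nil
}

// memWriter (hypothetical type) collects every opened stream in memory.
type memWriter struct {
	mu      sync.Mutex
	streams []*memStream
	err     error
}

// Open creates a new independent stream; safe for concurrent callers.
func (m *memWriter) Open() (io.WriteCloser, error) {
	m.mu.Lock()
	defer m.mu.Unlock()
	s := &memStream{}
	m.streams = append(m.streams, s)
	return s, nil
}

// Close finishes the output and reports any recorded error.
func (m *memWriter) Close() error {
	m.mu.Lock()
	defer m.mu.Unlock()
	return m.err
}

// CloseError records err. Assumption: it signals an aborted output.
func (m *memWriter) CloseError(err error) {
	m.mu.Lock()
	defer m.mu.Unlock()
	m.err = err
}

// Compile-time check that memWriter satisfies the interface.
var _ multiWriter = (*memWriter)(nil)

func main() {
	m := &memWriter{}
	w, _ := m.Open()
	w.Write([]byte("block-1"))
	w.Close()
	fmt.Println(len(m.streams), m.streams[0].String()) // prints "1 block-1"
	m.CloseError(errors.New("aborted"))
	fmt.Println(m.Close()) // prints "aborted"
}
```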

type SimpleWriter

type SimpleWriter struct {
	W io.WriteCloser
	// contains filtered or unexported fields
}

SimpleWriter is a MultiWriter that wraps a single output io.WriteCloser.

func (*SimpleWriter) Close

func (s *SimpleWriter) Close() error

Close implements io.Closer

func (*SimpleWriter) CloseError

func (s *SimpleWriter) CloseError(err error)

func (*SimpleWriter) Open

func (s *SimpleWriter) Open() (io.WriteCloser, error)

Open implements MultiWriter.Open

func (*SimpleWriter) Write

func (s *SimpleWriter) Write(p []byte) (int, error)

Write implements io.Writer

type Splitter

type Splitter struct {
	// WindowSize is the window with which
	// the Splitter searches for newlines
	// on which to split its input
	WindowSize int
	// MaxParallel is the maximum parallelism
	// with which the input ndjson is translated
	MaxParallel int
	// Alignment is the alignment of output
	// chunks written to Output
	Alignment int
	// Output is the multi-stream output
	// of the translation
	Output MultiWriter
	// contains filtered or unexported fields
}

Splitter holds the configuration for splitting newline-delimited JSON.

func (*Splitter) Split

func (s *Splitter) Split(r io.ReaderAt, size int64) error

Split processes the given io.ReaderAt up to (but not including) the byte at index 'size' as newline-delimited JSON.

The input data is processed in parallel; the value of s.MaxParallel determines the maximum level of parallelism at which the data is processed.

The s.WindowSize variable determines how much data is read from r at once. The window size should be significantly larger than the maximum size of a line.
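The relationship between the window size and the line length can be illustrated with a standard-library sketch. It is not the Splitter implementation: splitSpans and countRecords are hypothetical helpers that cut the input into spans ending on a newline, one window at a time, and then count the non-empty lines of each span with bounded parallelism. A line longer than the window cannot be split and is reported as an error here, which is one reason the window should be much larger than any line.

```go
package main

import (
	"bytes"
	"errors"
	"fmt"
	"io"
	"strings"
	"sync"
)

// span is a half-open byte range [start, end) of the input.
type span struct{ start, end int64 }

// splitSpans reads r one window at a time and ends each span just
// after the last newline found in that window.
func splitSpans(r io.ReaderAt, size int64, window int) ([]span, error) {
	var spans []span
	buf := make([]byte, window)
	for off := int64(0); off < size; {
		n := int64(window)
		if off+n > size {
			n = size - off
		}
		if m, err := r.ReadAt(buf[:n], off); int64(m) < n {
			return nil, err
		}
		end := off + n
		if end < size { // not the final window: cut at a newline
			i := bytes.LastIndexByte(buf[:n], '\n')
			if i < 0 {
				return nil, errors.New("line longer than window")
			}
			end = off + int64(i) + 1
		}
		spans = append(spans, span{off, end})
		off = end
	}
	return spans, nil
}

// countRecords counts non-empty lines, processing at most maxParallel
// spans at the same time.
func countRecords(r io.ReaderAt, size int64, window, maxParallel int) (int, error) {
	spans, err := splitSpans(r, size, window)
	if err != nil {
		return 0, err
	}
	if maxParallel < 1 {
		maxParallel = 1
	}
	sem := make(chan struct{}, maxParallel) // limits running goroutines
	counts := make([]int, len(spans))
	errs := make([]error, len(spans))
	var wg sync.WaitGroup
	for i, sp := range spans {
		wg.Add(1)
		sem <- struct{}{}
		go func(i int, sp span) {
			defer wg.Done()
			defer func() { <-sem }()
			buf := make([]byte, sp.end-sp.start)
			if _, err := r.ReadAt(buf, sp.start); err != nil && err != io.EOF {
				errs[i] = err
				return
			}
			for _, line := range bytes.Split(buf, []byte{'\n'}) {
				if len(bytes.TrimSpace(line)) > 0 {
					counts[i]++
				}
			}
		}(i, sp)
	}
	wg.Wait()
	total := 0
	for i := range spans {
		if errs[i] != nil {
			return 0, errs[i]
		}
		total += counts[i]
	}
	return total, nil
}

// countString is a convenience wrapper for in-memory input.
func countString(s string, window, maxParallel int) (int, error) {
	return countRecords(strings.NewReader(s), int64(len(s)), window, maxParallel)
}

func main() {
	data := "{\"a\":1}\n{\"b\":2}\n{\"c\":3}\n"
	fmt.Println(countString(data, 20, 2)) // prints "3 <nil>"
}
```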
