jstream

package module
v1.0.1

This package is not in the latest version of its module.
Published: Aug 12, 2020 License: MIT Imports: 8 Imported by: 106

README

jstream

jstream is a streaming JSON parser and value extraction library for Go.

Unlike most JSON parsers, jstream is aware of document position and depth. This enables the extraction of values at a specified depth, eliminating the overhead of allocating encompassing arrays or objects, e.g.:

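Using the below example document:

[
  {
    "desc": "RGB",
    "colors": [ "red", "green", "blue" ]
  },
  {
    "desc": "CMYK",
    "colors": [ "cyan", "magenta", "yellow", "black" ]
  }
]

we can choose to extract and act only on the objects within the top-level array: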

f, err := os.Open("input.json")
if err != nil {
	log.Fatal(err) // stop early if the input file cannot be opened
}
defer f.Close()
decoder := jstream.NewDecoder(f, 1) // extract JSON values at a depth level of 1
for mv := range decoder.Stream() {
	fmt.Printf("%v\n", mv.Value)
}

output:

map[colors:[red green blue] desc:RGB]
map[colors:[cyan magenta yellow black] desc:CMYK]

likewise, increasing depth level to 3 yields:

red
green
blue
cyan
magenta
yellow
black

optionally, key:value pairs can be emitted as an individual struct:

decoder := jstream.NewDecoder(f, 2).EmitKV() // enable KV streaming at a depth level of 2

output:

jstream.KV{desc RGB}
jstream.KV{colors [red green blue]}
jstream.KV{desc CMYK}
jstream.KV{colors [cyan magenta yellow black]}

Installing

go get github.com/bcicen/jstream

Commandline

jstream comes with a CLI tool for quick viewing of parsed values from JSON input:

jstream -d 1 < input.json
{"colors":["red","green","blue"],"desc":"RGB"}
{"colors":["cyan","magenta","yellow","black"],"desc":"CMYK"}

detailed output with -v option:

cat input.json | jstream -v -d -1

depth	start	end	type   | value
2	018	023	string | "RGB"
3	041	046	string | "red"
3	048	055	string | "green"
3	057	063	string | "blue"
2	039	065	array  | ["red","green","blue"]
1	004	069	object | {"colors":["red","green","blue"],"desc":"RGB"}
2	087	093	string | "CMYK"
3	111	117	string | "cyan"
3	119	128	string | "magenta"
3	130	138	string | "yellow"
3	140	147	string | "black"
2	109	149	array  | ["cyan","magenta","yellow","black"]
1	073	153	object | {"colors":["cyan","magenta","yellow","black"],"desc":"CMYK"}
0	000	155	array  | [{"colors":["red","green","blue"],"desc":"RGB"},{"colors":["cyan","magenta","yellow","black"],"desc":"CMYK"}]

Options

Opt     Description
-d <n>  emit values at depth n; if n < 0, all values will be emitted
-kv     output inner key value pairs as newly formed objects
-v      output depth and offset details for each value
-h      display help dialog

Benchmarks

Obligatory benchmarks performed on files with arrays of objects, where the decoded objects are to be extracted.

Two file sizes are used: regular (1.6MB, 1000 objects) and large (128MB, 100000 objects).

input size  lib       MB/s  Allocated
regular     standard  97    3.6MB
regular     jstream   175   2.1MB
large       standard  92    305MB
large       jstream   404   69MB
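
The README does not say how the "standard" rows were measured. As a point of reference only, the sketch below shows one common way to extract objects from a top-level array with the standard library's encoding/json, decoding one element at a time. It is an assumption about what a baseline could look like, not the benchmark code; streamArray and countObjects are hypothetical names.

```go
package main

import (
	"encoding/json"
	"fmt"
	"io"
	"strings"
)

// streamArray decodes the elements of a top-level JSON array one at a time
// and calls fn for each, so the enclosing array is never materialized.
func streamArray(r io.Reader, fn func(map[string]interface{})) error {
	dec := json.NewDecoder(r)
	// Consume the opening '[' of the top-level array.
	if _, err := dec.Token(); err != nil {
		return err
	}
	for dec.More() {
		var obj map[string]interface{}
		if err := dec.Decode(&obj); err != nil {
			return err
		}
		fn(obj)
	}
	// Consume the closing ']'.
	_, err := dec.Token()
	return err
}

// countObjects is a small wrapper used to exercise streamArray.
func countObjects(doc string) (int, error) {
	n := 0
	err := streamArray(strings.NewReader(doc), func(map[string]interface{}) { n++ })
	return n, err
}

func main() {
	n, err := countObjects(`[{"desc": "RGB"}, {"desc": "CMYK"}]`)
	if err != nil {
		fmt.Println("decode failed:", err)
		return
	}
	fmt.Println("objects decoded:", n)
}
```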

In a real-world scenario, including initialization and reader overhead from varying blob sizes, performance can be expected as below:

[Figure: jstream performance chart]

Documentation

Constants

This section is empty.

Variables

var (
	ErrSyntax        = DecoderError{/* contains filtered or unexported fields */}
	ErrUnexpectedEOF = DecoderError{/* contains filtered or unexported fields */}
)

Predefined errors

Functions

This section is empty.

Types

type Decoder

type Decoder struct {
	// contains filtered or unexported fields
}

Decoder wraps an io.Reader to provide incremental decoding of JSON values

func NewDecoder

func NewDecoder(r io.Reader, emitDepth int) *Decoder

NewDecoder creates a new Decoder to read JSON values at the provided emitDepth from the provided io.Reader. If emitDepth is < 0, values at every depth will be emitted.

func (*Decoder) EmitKV

func (d *Decoder) EmitKV() *Decoder

EmitKV enables emitting a jstream.KV struct when the item(s) parsed at the configured emit depth are within a JSON object. By default, only the object values are emitted.

func (*Decoder) Err

func (d *Decoder) Err() error

Err returns the most recent decoder error if any, or nil

func (*Decoder) ObjectAsKVS

func (d *Decoder) ObjectAsKVS() *Decoder

ObjectAsKVS - by default, JSON objects are decoded as map[string]interface{}. This is fine in most cases, but a map does not preserve the input order of keys. Use this option to decode objects as KVS, which preserves the input order.

func (*Decoder) Pos

func (d *Decoder) Pos() int

Pos returns the number of bytes consumed from the underlying reader

func (*Decoder) Recursive

func (d *Decoder) Recursive() *Decoder

Recursive enables emitting all values at a depth higher than the configured emit depth; e.g. if an array is found at emit depth, all values within the array are emitted to the stream, then the array containing those values is emitted.

func (*Decoder) Stream

func (d *Decoder) Stream() chan *MetaValue

Stream begins decoding from the underlying reader and returns a streaming MetaValue channel for JSON values at the configured emitDepth.

type DecoderError added in v1.0.1

type DecoderError struct {
	// contains filtered or unexported fields
}

func (DecoderError) Error added in v1.0.1

func (e DecoderError) Error() string

func (DecoderError) ReaderErr added in v1.0.1

func (e DecoderError) ReaderErr() error

type KV

type KV struct {
	Key   string      `json:"key"`
	Value interface{} `json:"value"`
}

KV contains a key and value pair parsed from a decoded object

type KVS

type KVS []KV

KVS - represents key values in a JSON object

func (KVS) MarshalJSON

func (kvs KVS) MarshalJSON() ([]byte, error)

MarshalJSON - implements converting a KVS data structure into a JSON object with multiple keys and values.

type MetaValue

type MetaValue struct {
	Offset    int
	Length    int
	Depth     int
	Value     interface{}
	ValueType ValueType
}

MetaValue wraps a decoded interface value with the document position and depth at which the value was parsed

type ValueType

type ValueType int

ValueType - defines the type of each JSON value

const (
	Unknown ValueType = iota
	Null
	String
	Number
	Boolean
	Array
	Object
)

Different types of JSON value

Directories

Path Synopsis
cmd
