jstream

v0.0.0-...-00e8530
Published: Jan 4, 2020 License: MIT Imports: 8 Imported by: 0

README

jstream is a streaming JSON parser and value extraction library for Go.

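Unlike most JSON parsers, jstream is aware of both document position and depth. This enables the extraction of values at a specified depth, eliminating the overhead of allocating the encompassing arrays or objects. For example, using the below example document:

[example document image not reproduced; judging by the outputs below, it is a top-level array of two objects, each with a "desc" string and a "colors" array]

we can choose to extract and act on only the objects within the top-level array: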

f, err := os.Open("input.json")
if err != nil {
  log.Fatal(err) // stop early if the input file cannot be opened
}
defer f.Close()
decoder := jstream.NewDecoder(f, 1) // extract JSON values at a depth level of 1
for mv := range decoder.Stream() {
  fmt.Printf("%v\n", mv.Value)
}

output:

map[colors:[red green blue] desc:RGB]
map[colors:[cyan magenta yellow black] desc:CMYK]

likewise, increasing depth level to 3 yields:

red
green
blue
cyan
magenta
yellow
black

optionally, key:value pairs can be emitted as an individual struct:

decoder := jstream.NewDecoder(f, 2).EmitKV() // enable KV streaming at a depth level of 2

output:

jstream.KV{desc RGB}
jstream.KV{colors [red green blue]}
jstream.KV{desc CMYK}
jstream.KV{colors [cyan magenta yellow black]}

Installing

go get github.com/bcicen/jstream

Commandline

jstream comes with a CLI tool for quick viewing of parsed values from JSON input:

cat input.json | jstream -v -d 1
depth	start	end	type   | value

1	004	069	object | {"colors":["red","green","blue"],"desc":"RGB"}
1	073	153	object | {"colors":["cyan","magenta","yellow","black"],"desc":"CMYK"}

Options

Opt     Description
-d <n>  emit values at depth n. if n < 0, all values will be emitted
-v      output depth and offset details for each value
-h      display help dialog

Benchmarks

Obligatory benchmarks performed on files with arrays of objects, where the decoded objects are to be extracted.

Two file sizes are used: regular (1.6MB, 1000 objects) and large (128MB, 100000 objects)

input size  lib       MB/s  Allocated
regular     standard  97    3.6MB
regular     jstream   175   2.1MB
large       standard  92    305MB
large       jstream   404   69MB

In a real world scenario, including initialization and reader overhead from varying blob sizes, performance can be expected as below:

[benchmark chart not reproduced]

Documentation

Constants

This section is empty.

Variables

var (
	ErrSyntax        = SyntaxError{/* contains filtered or unexported fields */}
	ErrUnexpectedEOF = SyntaxError{/* contains filtered or unexported fields */}
)

Predefined errors

Functions

This section is empty.

Types

type Decoder

type Decoder struct {
	// contains filtered or unexported fields
}

Decoder wraps an io.Reader to provide incremental decoding of JSON values

func NewDecoder

func NewDecoder(r io.Reader, emitDepth int) *Decoder

NewDecoder creates a new Decoder to read JSON values at the provided emitDepth from the provided io.Reader. If emitDepth is < 0, values at every depth will be emitted.

func (*Decoder) EmitKV

func (d *Decoder) EmitKV() *Decoder

EmitKV enables emitting a jstream.KV struct when the item(s) parsed at the configured emit depth are within a JSON object. By default, only the object values are emitted.

func (*Decoder) Err

func (d *Decoder) Err() error

Err returns the most recent decoder error if any, or nil

func (*Decoder) ObjectAsKVS

func (d *Decoder) ObjectAsKVS() *Decoder

ObjectAsKVS makes the decoder preserve the input order of object keys. By default, JSON objects are returned as map[string]interface{}, which is fine in most cases but does not preserve input order. Use this option when the order matters.

func (*Decoder) Pos

func (d *Decoder) Pos() int

Pos returns the number of bytes consumed from the underlying reader

func (*Decoder) Recursive

func (d *Decoder) Recursive() *Decoder

Recursive enables emitting all values at a depth higher than the configured emit depth; e.g. if an array is found at emit depth, all values within the array are emitted to the stream, then the array containing those values is emitted.

func (*Decoder) Stream

func (d *Decoder) Stream() chan *MetaValue

Stream begins decoding from the underlying reader and returns a streaming MetaValue channel for JSON values at the configured emitDepth.

type KV

type KV struct {
	Key   string      `json:"key"`
	Value interface{} `json:"value"`
}

KV contains a key and value pair parsed from a decoded object

type KVS

type KVS []KV

KVS - represents key values in a JSON object

func (KVS) MarshalJSON

func (kvs KVS) MarshalJSON() ([]byte, error)

MarshalJSON - implements converting a KVS data structure into a JSON object with multiple keys and values.

type MetaValue

type MetaValue struct {
	Offset    int
	Length    int
	Depth     int
	Value     interface{}
	ValueType ValueType
}

MetaValue wraps a decoded interface value with the document position and depth at which the value was parsed

type SyntaxError

type SyntaxError struct {
	// contains filtered or unexported fields
}

func (SyntaxError) Error

func (e SyntaxError) Error() string

type ValueType

type ValueType int

ValueType - defines the type of each JSON value

const (
	Unknown ValueType = iota
	Null
	String
	Number
	Boolean
	Array
	Object
)

Different types of JSON value
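
For orientation, the sketch below shows one way decoded Go values could map onto these seven categories. It uses a local copy of the constants (valueType, vtNull and so on are hypothetical names) and assumes numbers arrive as float64 or int64 and containers as []interface{} or map[string]interface{}. How jstream assigns ValueType internally is not shown in this document, so treat this as an assumption:

```go
package main

import "fmt"

// valueType is a local mirror of the documented ValueType constants.
type valueType int

const (
	vtUnknown valueType = iota
	vtNull
	vtString
	vtNumber
	vtBoolean
	vtArray
	vtObject
)

// classify maps a decoded Go value to a JSON value category.
func classify(v interface{}) valueType {
	switch v.(type) {
	case nil:
		return vtNull
	case string:
		return vtString
	case float64, int64: // assumed numeric representations
		return vtNumber
	case bool:
		return vtBoolean
	case []interface{}:
		return vtArray
	case map[string]interface{}:
		return vtObject
	default:
		return vtUnknown
	}
}

func main() {
	samples := []interface{}{nil, "red", 1.5, true, []interface{}{}, map[string]interface{}{}}
	for _, s := range samples {
		fmt.Println(classify(s)) // prints 1 through 6, one per line
	}
}
```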

Directories

Path Synopsis
cmd