Documentation ¶
Constants ¶
const (
	KindInvalid uint8 = iota
	KindString
	KindNumber
	KindObject
	KindArray
	KindTrue
	KindFalse
	KindNull
)
nanojson packs all JSON data types into a Value. To tell which data type a Value holds, inspect its Kind field, which is set to one of the 8 constants above.
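As a minimal sketch of how code might branch on the Kind field, the program below redeclares the constants locally so that it compiles without importing nanojson. The helper kindName is hypothetical and is not part of the package.

```go
package main

import "fmt"

// Local mirror of the Kind constants documented above, so that this
// sketch compiles on its own without importing nanojson.
const (
	KindInvalid uint8 = iota
	KindString
	KindNumber
	KindObject
	KindArray
	KindTrue
	KindFalse
	KindNull
)

// kindName is a hypothetical helper (not part of nanojson) that maps a
// Kind to a human-readable name.
func kindName(k uint8) string {
	switch k {
	case KindString:
		return "string"
	case KindNumber:
		return "number"
	case KindObject:
		return "object"
	case KindArray:
		return "array"
	case KindTrue, KindFalse:
		return "bool"
	case KindNull:
		return "null"
	default:
		return "invalid"
	}
}

func main() {
	fmt.Println(kindName(KindArray))
	fmt.Println(kindName(KindFalse))
}
```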
Variables ¶
var Pools = struct {
	ValueSlice     *sync.Pool
	EncodeStateBuf *sync.Pool
	Value          *sync.Pool
	PropertyMap    *sync.Pool
}{
	ValueSlice: &sync.Pool{
		New: func() interface{} { return make([]Value, 0, 1024) },
	},
	EncodeStateBuf: &sync.Pool{
		New: func() interface{} { return make([]byte, 255) },
	},
	Value: &sync.Pool{
		New: func() interface{} { return &Value{} },
	},
	PropertyMap: &sync.Pool{
		New: func() interface{} { return make(map[string]int) },
	},
}
Pools holds the sync.Pools used to reuse data structures that would otherwise escape to the heap (and would have to be allocated every time, slowing the package down overall). Normally, users of the package don't need to tune these; if you have particular needs, however, it might come in handy to replace some of them.
ValueSlice ¶
ValueSlice is used when creating child elements of a Value, that is, when there is an object or an array. Since these slices enter the domain of the user, Children slices are not automatically returned to the pool; they will be, however, if the user practices good hygiene and calls Recycle on the root value once done with the parsed JSON data. Recycling greatly improves the speed of parsing.
The cap of the slices in the pool is vital. When parsing a JSON object or array, nanojson cycles through the child elements and appends values to the slice as long as the cap is not reached. Once len == cap, the parser gives the slice back to the pool and switches to append to grow the slice and add new elements. This, of course, incurs a costly memory allocation.
By default, the pool always returns a []Value with a cap of 1024, on the assumption that most APIs return at most that many elements in arrays and objects. The downside is that even a simple [1,2,3] takes up about 120 KB of memory (on 64-bit systems). This may seem disastrous, especially when considering a potentially dangerous payload like the one built by the following Python expression:
"[" + ("[" * 40 + "1337" + ",[[1337]]]" * 40 + ",") * 500 + "[[7331]]" + "]"
However, you should not forget that modern machines have virtual memory, which can help handle such abuses of memory, so you should probably not worry about exceeding physical RAM.
If you want to replace ValueSlice and want an estimate of how much memory a []Value takes, it's unsafe.Sizeof(Value{}) * cap.
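For readers who want to reproduce the payload in Go, the Python expression above can be translated with strings.Repeat. This is only a sketch: the function name nestedPayload is made up for illustration, and encoding/json is used here solely to confirm that the result is well-formed JSON.

```go
package main

import (
	"encoding/json"
	"fmt"
	"strings"
)

// nestedPayload builds the same string as the Python expression above:
// 500 deeply nested arrays followed by a final [[7331]].
func nestedPayload() string {
	inner := strings.Repeat("[", 40) + "1337" +
		strings.Repeat(",[[1337]]]", 40) + ","
	return "[" + strings.Repeat(inner, 500) + "[[7331]]" + "]"
}

func main() {
	p := nestedPayload()
	// Print the payload size in bytes and whether it is valid JSON.
	fmt.Println(len(p), json.Valid([]byte(p)))
}
```

Every "[" in the payload opens an array, and each of those arrays would need its own Children slice, which is why the per-slice cap matters for inputs of this shape.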
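The estimate can be sketched as follows. Because the real Value has unexported fields that are not visible in this documentation, the local type valueSketch mirrors only the exported fields, so the numbers it prints are a lower bound rather than the true size. The helpers sliceBytes and newSlicePool are hypothetical names used for illustration.

```go
package main

import (
	"fmt"
	"sync"
	"unsafe"
)

// valueSketch mirrors only the exported fields of Value shown in this
// documentation. The real type also has unexported fields, so the real
// size is larger than what this sketch reports.
type valueSketch struct {
	Kind     uint8
	Value    []byte
	Key      []byte
	Children []valueSketch
}

// sliceBytes applies the formula from the text: Sizeof(element) * cap.
func sliceBytes(capacity int) uintptr {
	return unsafe.Sizeof(valueSketch{}) * uintptr(capacity)
}

// newSlicePool returns a pool whose New function hands out empty slices
// with the requested capacity, in the same shape as Pools.ValueSlice.
func newSlicePool(capacity int) *sync.Pool {
	return &sync.Pool{
		New: func() interface{} { return make([]valueSketch, 0, capacity) },
	}
}

func main() {
	for _, c := range []int{16, 128, 1024} {
		fmt.Printf("cap %4d: at least %d bytes\n", c, sliceBytes(c))
	}
	s := newSlicePool(16).Get().([]valueSketch)
	fmt.Println(len(s), cap(s))
}
```

With the real package, the same idea would presumably be applied by assigning a new *sync.Pool to Pools.ValueSlice before parsing begins; a smaller cap trades memory for extra allocations on large arrays and objects.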
EncodeStateBuf ¶
When calling EncodeJSON on a Value, a []byte of size 255 is retrieved from EncodeStateBuf. (Note: for the moment it MUST be 255; no other size is allowed.) The buffer is used mostly to batch calls to Write. The change showed roughly a 0.75x improvement in speed in our benchmarks, although it places more strain on the encoder than on the writer.
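To illustrate why batching through a small fixed buffer reduces the number of Write calls, here is a generic sketch. It is not nanojson's encoder: the batcher and countingWriter types and the writeCalls helper are invented for this example, and only the 255-byte buffer size is taken from the text.

```go
package main

import (
	"fmt"
	"io"
)

// countingWriter records how many times Write is called.
type countingWriter struct{ calls int }

func (c *countingWriter) Write(p []byte) (int, error) {
	c.calls++
	return len(p), nil
}

// batcher collects bytes in a fixed-size buffer and forwards them to w
// only when the buffer is full or flush is called explicitly.
type batcher struct {
	w   io.Writer
	buf []byte
	n   int
}

func (b *batcher) writeByte(c byte) error {
	if b.n == len(b.buf) {
		if err := b.flush(); err != nil {
			return err
		}
	}
	b.buf[b.n] = c
	b.n++
	return nil
}

func (b *batcher) flush() error {
	if b.n == 0 {
		return nil
	}
	_, err := b.w.Write(b.buf[:b.n])
	b.n = 0
	return err
}

// writeCalls pushes total bytes through a 255-byte batcher and reports
// how many Write calls reached the underlying writer.
func writeCalls(total int) int {
	cw := &countingWriter{}
	b := &batcher{w: cw, buf: make([]byte, 255)}
	for i := 0; i < total; i++ {
		_ = b.writeByte('x')
	}
	_ = b.flush()
	return cw.calls
}

func main() {
	fmt.Println(writeCalls(1000)) // far fewer than 1000 Write calls
}
```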
Functions ¶
Types ¶
type LeftoverError ¶
type LeftoverError []byte
LeftoverError is returned by Parse when the given data contains more than the expected JSON value followed by optional trailing whitespace (' ', '\t', '\n' or '\r'). It contains the additional data, and the error message reports the number of bytes left over.
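Since Error is declared on the pointer receiver, callers would typically match this error with errors.As and a pointer target. The sketch below uses a local stand-in type, leftoverError, so it runs without nanojson. The checkTrailing helper, the message format, and the decision to exclude leading whitespace from the stored bytes are all assumptions made for illustration.

```go
package main

import (
	"bytes"
	"errors"
	"fmt"
)

// leftoverError is a local stand-in for nanojson's LeftoverError: a byte
// slice holding whatever followed the first JSON value.
type leftoverError []byte

// Error reports the number of bytes left over. The exact wording used by
// nanojson is not documented here; this format is an assumption.
func (e *leftoverError) Error() string {
	return fmt.Sprintf("%d bytes left over", len(*e))
}

// checkTrailing is a hypothetical helper: given the bytes that follow a
// parsed value, it returns nil if they are all whitespace, or a
// *leftoverError holding the remaining data otherwise.
func checkTrailing(rest []byte) error {
	trimmed := bytes.TrimLeft(rest, " \t\n\r")
	if len(trimmed) == 0 {
		return nil
	}
	e := leftoverError(trimmed)
	return &e
}

func main() {
	err := checkTrailing([]byte("  \n{\"next\":1}"))
	var lo *leftoverError
	if errors.As(err, &lo) {
		// The leftover bytes can be inspected or parsed again.
		fmt.Printf("%v: %q\n", err, []byte(*lo))
	}
}
```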
func (*LeftoverError) Error ¶
func (e *LeftoverError) Error() string
type ParseError ¶
type ParseError struct {
	// What were we parsing when the error was found?
	Kind uint8
	// Position in the byte slice passed to Parse, and character that triggered
	// the error.
	Pos  int
	Char byte
	// Proper reason why the error happened.
	Reason string
}
ParseError is a general error that occurred during parsing.
func (*ParseError) Error ¶
func (d *ParseError) Error() string
type UnmarshalOptions ¶
type UnmarshalOptions struct {
	// By default (false), Unmarshal will copy its data parameter to a new array
	// - that is because the caller might want to retain the original data,
	// whereas Value.Parse in nanojson actually rewrites the original byte
	// slice. Set to true if you don't care if we touch your data parameter.
	DisableDataCopy bool
	// When assigning a Value to a []byte, normally it is enough to do a simple
	// assignment which points at the reference in the original data array.
	// If CopyData is true, however, the value is copied over.
	// It only makes sense to set this to true if DisableDataCopy is also true.
	// (An example of appropriate use would be if you use a []byte from a pool
	// to pass to Unmarshal and you want to retain the struct in which you
	// Unmarshal for long after you give back the []byte to the pool. This would
	// ensure that the data in the struct doesn't become invalid.)
	// An important note: when the destination is a string the value is always
	// copied over regardless.
	CopyData bool
}
UnmarshalOptions specifies the options for parsing JSON data in nanojson.
func (*UnmarshalOptions) Unmarshal ¶
func (u *UnmarshalOptions) Unmarshal(data []byte, v interface{}) error
Unmarshal parses the first valid JSON value inside data and does its best to unmarshal it into v. If v is nil or not a pointer, Unmarshal returns an error. Unmarshal is not exactly backwards-compatible with its encoding/json equivalent, so some changes may be necessary, but the process should be rather painless if you are not using interfaces.
If, after the first value in data, there are more non-whitespace bytes, an error of type LeftoverError is returned and unmarshaling is interrupted.
Unmarshaling itself is loosely typed, and some values are converted automatically. Specifically:
bool: true if Kind == KindTrue, a string containing only "true", or a non-0 number. Rejected if it's an array, object or null.
numbers: strconv.ParseInt/Uint/Float, even if the JSON is a string. Will return an error if the strconv functions return one, or if it overflows the Go value.
string, []byte, [X]byte: the JSON string, or the raw unparsed number. Empty otherwise.
slices, arrays: will convert each child into its Go representation, except if the element is uint8/byte. (see above)
maps: rejected if key is not a string or JSON is not an object. Will set each key to match the converted JSON value.
interface: if *Value implements the interface, then a clone of the original *Value will be assigned to it. Otherwise, it is rejected.
struct: matching like encoding/json, except it's case sensitive.
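As an illustration of the loose typing described above, the bool rule could be sketched like this. This is not nanojson's implementation: toBool is a hypothetical function, the constants are redeclared locally, and treating KindFalse and non-"true" strings as false is an assumption inferred from the wording of the rule.

```go
package main

import (
	"errors"
	"fmt"
	"strconv"
)

// Local mirror of the Kind constants, so the sketch compiles on its own.
const (
	KindInvalid uint8 = iota
	KindString
	KindNumber
	KindObject
	KindArray
	KindTrue
	KindFalse
	KindNull
)

// toBool is a hypothetical rendering of the bool rule: true for KindTrue,
// for the string "true", and for non-zero numbers; arrays, objects and
// null are rejected.
func toBool(kind uint8, raw []byte) (bool, error) {
	switch kind {
	case KindTrue:
		return true, nil
	case KindFalse:
		return false, nil // assumption: false maps to false
	case KindString:
		return string(raw) == "true", nil
	case KindNumber:
		f, err := strconv.ParseFloat(string(raw), 64)
		if err != nil {
			return false, err
		}
		return f != 0, nil
	default:
		return false, errors.New("cannot convert this kind to bool")
	}
}

func main() {
	b, err := toBool(KindString, []byte("true"))
	fmt.Println(b, err)
	b, err = toBool(KindNumber, []byte("0"))
	fmt.Println(b, err)
	b, err = toBool(KindNull, nil)
	fmt.Println(b, err)
}
```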
The biggest difference to note here is that the unmarshaling process will not automatically create a Go value for you when you specify interface{}. On the contrary, it will simply set it to a *nanojson.Value, and you will have to handle the value dynamically yourself. Note that in this case, as well as when your struct has a field which is a Value or *Value, the Value must be cloned first to ensure data integrity, which incurs a costly alloc+copy (especially if the element has children!).
So what should you do when you need to handle a JSON value dynamically? Implement the Unmarshaler interface in a type you define. This way, the *Value will not be cloned, instead it will be passed by reference - and it will be your burden to ensure data integrity.
Unmarshal is also backwards-compatible with json.Unmarshaler. Note, though, that because nanojson's parsing process rewrites the original byte slice to decode strings, this incurs an EncodeJSON to a temporary buffer before calling UnmarshalJSON.
type Unmarshaler ¶
Unmarshaler is the interface implemented by types capable of unmarshaling a description of themselves as nanojson.Values. UnmarshalValue, if retaining the Value, should always clone it and never hold the reference beyond its lifespan.
type Value ¶
type Value struct {
	// Kind of the Value. See the constants Kind*.
	Kind uint8
	// Value is filled in the cases of KindNumber and KindString. For
	// KindString, Value is actually the parsed string, with all the slash
	// escaping replaced with their Go representation. Numbers are placed as
	// they are, and parsing of them is left to the user.
	// In encoding, if Kind is KindString, the Value is appropriately escaped.
	// Otherwise, in the case of KindNumber, Value is copied with no operation
	// in-between. (So yes, KindNumber can be used to encode raw JSON data.)
	Value []byte
	// Key is only set if the upper Value is of KindObject - in this case,
	// it will be set to the parsed string of the key.
	Key []byte
	// Children is set in case the Kind is KindObject or KindArray. The children
	// properties are listed in the Children - in the case of KindObject, the
	// children will also have their "Key" field set.
	Children []Value
	// contains filtered or unexported fields
}
Value specifies a single value in the JSON data. Value is the raw representation of JSON data before it is placed into a Go value. For efficiency, get Values from a pool (e.g. Pools.Value) and put them back when done, taking care to reset them first.
func (*Value) Clone ¶
Clone creates a copy of a Value that is completely independent of the original Value. It still depends on the original slice of bytes not being changed.
func (*Value) EncodeJSON ¶
EncodeJSON encodes v to its JSON representation, and writes the result to w.
func (*Value) Parse ¶
Parse parses a single value from b. To ensure zero allocation, Parse reserves the right to modify b's content (e.g. for parsing strings); for this reason, you must pass a copy as b if you wish to retain the original.
It is also important to note that v will hold references to parts of b, therefore v's content is only valid as long as b is not modified.
Parse will read only the first value inside of b, and expects the rest to be exclusively whitespace. If anything else is found, then an error of type LeftoverError is returned.
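The reason the input buffer gets modified, and why the result stays tied to it, can be shown with a small in-place unescape routine. This is a simplified sketch and not nanojson's parser: unescapeInPlace is a hypothetical function that handles only single-character escapes and does not handle \uXXXX sequences.

```go
package main

import "fmt"

// unescapeInPlace decodes simple backslash escapes by overwriting b and
// returns the prefix of b that holds the decoded bytes. No new memory is
// allocated: the result shares its backing array with b.
func unescapeInPlace(b []byte) []byte {
	w := 0
	for r := 0; r < len(b); r++ {
		c := b[r]
		if c == '\\' && r+1 < len(b) {
			r++
			switch b[r] {
			case 'n':
				c = '\n'
			case 't':
				c = '\t'
			case 'r':
				c = '\r'
			case 'b':
				c = '\b'
			case 'f':
				c = '\f'
			default:
				c = b[r] // covers \", \\ and \/
			}
		}
		b[w] = c
		w++
	}
	return b[:w]
}

func main() {
	original := []byte(`say \"hi\"\n`)
	// Copy first if the original bytes must be retained.
	work := append([]byte(nil), original...)
	decoded := unescapeInPlace(work)
	fmt.Printf("%q\n", decoded)
	fmt.Printf("%q\n", original) // untouched, because we decoded a copy

	// decoded aliases work: changing work changes decoded as well.
	work[0] = 'S'
	fmt.Printf("%q\n", decoded)
}
```

The last three lines of main show the aliasing that the text warns about: once the buffer is modified, anything that referenced it changes too.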
func (*Value) Property ¶
Property gets the child element of the object whose Key is s, or nil if it does not exist.
Property will use an internal property map to find the desired element if possible; this is the case for a Value which has been created or modified by Parse. If the map is not available, it will iterate through the items. The only issue which may arise is when v.Children has been modified and v was created through Parse: in that case, Property may return nil even if one of the children does have the desired key. The user can then write their own logic for finding the key, which should be pretty trivial.
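The lookup strategy and its stale-map caveat can be sketched with a local type. Nothing here is nanojson's code: the node type, its props field (a key-to-index map, guessed from the map[string]int in Pools.PropertyMap), and the property and scan methods are hypothetical.

```go
package main

import "fmt"

// node is a local stand-in for Value with a hypothetical property map
// from key to index in Children.
type node struct {
	Key      []byte
	Value    []byte
	Children []node
	props    map[string]int
}

// property uses the map when one is present and falls back to a linear
// scan otherwise. A stale map can therefore miss newer children.
func (n *node) property(s string) *node {
	if n.props != nil {
		i, ok := n.props[s]
		if !ok || i >= len(n.Children) {
			return nil
		}
		return &n.Children[i]
	}
	return n.scan(s)
}

// scan is the "own logic" fallback: iterate over Children, compare keys.
func (n *node) scan(s string) *node {
	for i := range n.Children {
		if string(n.Children[i].Key) == s {
			return &n.Children[i]
		}
	}
	return nil
}

func main() {
	obj := &node{
		Children: []node{{Key: []byte("a"), Value: []byte("1")}},
		props:    map[string]int{"a": 0},
	}
	// Modifying Children after "parsing" leaves the map out of date.
	obj.Children = append(obj.Children, node{Key: []byte("b"), Value: []byte("2")})

	fmt.Println(obj.property("b") == nil) // the stale map misses "b"
	fmt.Println(obj.scan("b") != nil)     // a manual scan still finds it
}
```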
func (*Value) Recycle ¶
func (v *Value) Recycle()
Recycle gives all the Children slices back to the pool, recursively, so that they can be reused in future parses. Callers must not retain references to v.Children or any of its values - if they wish to retain values, they should copy them or not call Recycle(). Keeping references to the children's .Value or .Key is allowed, as it is a reference to the slice and not actually reused.
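A recursive hand-back of child slices might look roughly like the sketch below. It uses a local node type and pool instead of nanojson's Value and Pools.ValueSlice, and the recycle function (which also counts the slices returned, purely for demonstration) is an assumption about the general shape, not the package's actual code.

```go
package main

import (
	"fmt"
	"sync"
)

// node is a local stand-in for Value, keeping only what the sketch needs.
type node struct {
	Value    []byte
	Children []node
}

// slicePool plays the role of Pools.ValueSlice in this sketch.
var slicePool = sync.Pool{
	New: func() interface{} { return make([]node, 0, 8) },
}

// recycle walks the tree depth first, puts every Children slice back in
// the pool with its length reset, and returns how many slices it gave
// back. After this call the slices may be reused by a later parse, which
// is why references to children must not be retained.
func recycle(n *node) int {
	count := 0
	for i := range n.Children {
		count += recycle(&n.Children[i])
	}
	if n.Children != nil {
		slicePool.Put(n.Children[:0])
		n.Children = nil
		count++
	}
	return count
}

func main() {
	root := node{Children: []node{
		{Value: []byte("1")},
		{Children: []node{{Value: []byte("2")}}},
	}}
	// Keeping a child's Value bytes is fine: they are not pooled.
	kept := root.Children[0].Value

	fmt.Println(recycle(&root), root.Children == nil, string(kept))
}
```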