jscan

v2.0.2
Published: Dec 30, 2023 | License: BSD-3-Clause | Imports: 7 | Imported by: 3

README

jscan

jscan provides a high-performance, zero-allocation JSON iterator and validator for Go. This module doesn't provide Marshal/Unmarshal capabilities; instead, it focuses on highly efficient iteration over JSON data with on-the-fly validation.

jscan is tested against https://github.com/nst/JSONTestSuite, a comprehensive test suite for RFC 8259 compliant JSON parsers.

See jscan-benchmark for benchmark results 🏎️ 🏁.

Example

https://go.dev/play/p/moP3l9EkebF

package main

import (
	"fmt"

	"github.com/romshark/jscan/v2"
)

func main() {
	j := `{
		"s": "value",
		"t": true,
		"f": false,
		"0": null,
		"n": -9.123e3,
		"o0": {},
		"a0": [],
		"o": {
			"k": "\"v\"",
			"a": [
				true,
				null,
				"item",
				-67.02e9,
				["foo"]
			]
		},
		"a3": [
			0,
			{
				"a3.a3":8
			}
		]
	}`

	err := jscan.Scan(j, func(i *jscan.Iterator[string]) (err bool) {
		fmt.Printf("%q:\n", i.Pointer())
		fmt.Printf("├─ valueType:  %s\n", i.ValueType().String())
		if k := i.Key(); k != "" {
			fmt.Printf("├─ key:        %q\n", k[1:len(k)-1])
		}
		if ai := i.ArrayIndex(); ai != -1 {
			fmt.Printf("├─ arrayIndex: %d\n", ai)
		}
		if v := i.Value(); v != "" {
			fmt.Printf("├─ value:      %q\n", v)
		}
		fmt.Printf("└─ level:      %d\n", i.Level())
		return false // Resume scanning
	})

	if err.IsErr() {
		fmt.Printf("ERR: %s\n", err)
		return
	}
}

Documentation

Constants

const (
	DefaultStackSizeIterator  = 64
	DefaultStackSizeValidator = 128
)

Default stack sizes

Functions

func Valid

func Valid[S ~string | ~[]byte](s S) bool

Valid returns true if s is a valid JSON value, otherwise returns false.

Unlike (*Validator).Valid this function will take a validator instance from a global pool and can therefore be less efficient. Consider reusing a Validator instance instead.

Types

type Error

type Error[S ~string | ~[]byte] struct {
	// Src refers to the original source.
	Src S

	// Index points to the error start index in the source.
	Index int

	// Code indicates the type of the error.
	Code ErrorCode
}

Error is a syntax error encountered during validation or iteration. The only exception is ErrorCodeCallback which indicates a callback explicitly breaking by returning true instead of a syntax error. (Error).IsErr() returning false is equivalent to err == nil.

func Scan

func Scan[S ~string | ~[]byte](
	s S, fn func(*Iterator[S]) (err bool),
) (err Error[S])

Scan calls fn for every encountered value including objects and arrays. When an object or array is encountered fn will also be called for each of its member and element values.

Unlike (*Parser).Scan this function will take an iterator instance from a global iterator pool and can therefore be less efficient. Consider reusing a Parser instance instead.

TIP: Explicitly cast s to string or []byte to use the global iterator pools and avoid an unnecessary iterator allocation, such as when dealing with json.RawMessage and similar types derived from string or []byte.

m := json.RawMessage(`1`)
jscan.Scan([]byte(m), // Cast m to []byte to avoid allocation!

WARNING: Don't use or alias *Iterator[S] after fn returns!

Example
j := `{
		"s": "value",
		"t": true,
		"f": false,
		"0": null,
		"n": -9.123e3,
		"o0": {},
		"a0": [],
		"o": {
			"k": "\"v\"",
			"a": [
				true,
				null,
				"item",
				-67.02e9,
				["foo"]
			]
		},
		"a3": [
			0,
			{
				"a3.a3":8
			}
		]
	}`

err := jscan.Scan(j, func(i *jscan.Iterator[string]) (err bool) {
	fmt.Printf("%q:\n", i.Pointer())
	fmt.Printf("├─ valueType:  %s\n", i.ValueType().String())
	if k := i.Key(); k != "" {
		fmt.Printf("├─ key:        %q\n", k[1:len(k)-1])
	}
	if ai := i.ArrayIndex(); ai != -1 {
		fmt.Printf("├─ arrayIndex: %d\n", ai)
	}
	if v := i.Value(); v != "" {
		fmt.Printf("├─ value:      %q\n", v)
	}
	fmt.Printf("└─ level:      %d\n", i.Level())
	return false // No Error, resume scanning
})

if err.IsErr() {
	fmt.Printf("ERR: %s\n", err)
	return
}
Output:

"":
├─ valueType:  object
└─ level:      0
"/s":
├─ valueType:  string
├─ key:        "s"
├─ value:      "\"value\""
└─ level:      1
"/t":
├─ valueType:  true
├─ key:        "t"
├─ value:      "true"
└─ level:      1
"/f":
├─ valueType:  false
├─ key:        "f"
├─ value:      "false"
└─ level:      1
"/0":
├─ valueType:  null
├─ key:        "0"
├─ value:      "null"
└─ level:      1
"/n":
├─ valueType:  number
├─ key:        "n"
├─ value:      "-9.123e3"
└─ level:      1
"/o0":
├─ valueType:  object
├─ key:        "o0"
└─ level:      1
"/a0":
├─ valueType:  array
├─ key:        "a0"
└─ level:      1
"/o":
├─ valueType:  object
├─ key:        "o"
└─ level:      1
"/o/k":
├─ valueType:  string
├─ key:        "k"
├─ value:      "\"\\\"v\\\"\""
└─ level:      2
"/o/a":
├─ valueType:  array
├─ key:        "a"
└─ level:      2
"/o/a/0":
├─ valueType:  true
├─ arrayIndex: 0
├─ value:      "true"
└─ level:      3
"/o/a/1":
├─ valueType:  null
├─ arrayIndex: 1
├─ value:      "null"
└─ level:      3
"/o/a/2":
├─ valueType:  string
├─ arrayIndex: 2
├─ value:      "\"item\""
└─ level:      3
"/o/a/3":
├─ valueType:  number
├─ arrayIndex: 3
├─ value:      "-67.02e9"
└─ level:      3
"/o/a/4":
├─ valueType:  array
├─ arrayIndex: 4
└─ level:      3
"/o/a/4/0":
├─ valueType:  string
├─ arrayIndex: 0
├─ value:      "\"foo\""
└─ level:      4
"/a3":
├─ valueType:  array
├─ key:        "a3"
└─ level:      1
"/a3/0":
├─ valueType:  number
├─ arrayIndex: 0
├─ value:      "0"
└─ level:      2
"/a3/1":
├─ valueType:  object
├─ arrayIndex: 1
└─ level:      2
"/a3/1/a3.a3":
├─ valueType:  number
├─ key:        "a3.a3"
├─ value:      "8"
└─ level:      3
Example (Decode2DIntArray)
j := `[[1,2,34,567],[8901,2147483647,-1,42]]`

s := [][]int{}
currentIndex := 0
err := jscan.Scan(j, func(i *jscan.Iterator[string]) (err bool) {
	switch i.Level() {
	case 0: // Root array
		return i.ValueType() != jscan.ValueTypeArray
	case 1: // Sub-array
		if i.ValueType() != jscan.ValueTypeArray {
			return true
		}
		currentIndex = len(s)
		s = append(s, []int{})
		return false
	}
	if i.ValueType() != jscan.ValueTypeNumber {
		// Unexpected array element type
		return true
	}
	vi, errp := strconv.ParseInt(i.Value(), 10, 32)
	if errp != nil {
		// Not a valid 32-bit signed integer
		return true
	}
	s[currentIndex] = append(s[currentIndex], int(vi))
	return false
})
if err.IsErr() {
	fmt.Println(err.Error())
	return
}
fmt.Println(s)
Output:

[[1 2 34 567] [8901 2147483647 -1 42]]
Example (Error_handling)
j := `"something...`

err := jscan.Scan(j, func(i *jscan.Iterator[string]) (err bool) {
	fmt.Println("This shall never be executed")
	return false // No Error, resume scanning
})

if err.IsErr() {
	fmt.Printf("ERR: %s\n", err)
	return
}
Output:

ERR: error at index 13: unexpected EOF

func ScanOne

func ScanOne[S ~string | ~[]byte](
	s S, fn func(*Iterator[S]) (err bool),
) (trailing S, err Error[S])

ScanOne calls fn for every encountered value including objects and arrays. When an object or array is encountered fn will also be called for each of its member and element values.

Unlike Scan, ScanOne doesn't return ErrorCodeUnexpectedToken when it encounters anything other than EOF after reading a valid JSON value. It returns an error, if any, and trailing: the substring of s with the scanned value cut off. In case of an error, trailing will be a substring of s cut up until the index where the error was encountered.

Unlike (*Parser).ScanOne this function will take an iterator instance from a global iterator pool and can therefore be less efficient. Consider reusing a Parser instance instead.

TIP: Explicitly cast s to string or []byte to use the global iterator pools and avoid an unnecessary iterator allocation, such as when dealing with json.RawMessage and similar types derived from string or []byte.

m := json.RawMessage(`1`)
jscan.ScanOne([]byte(m), // Cast m to []byte to avoid allocation!

WARNING: Don't use or alias *Iterator[S] after fn returns!

func Validate

func Validate[S ~string | ~[]byte](s S) Error[S]

Validate returns an error if s is invalid JSON.

Unlike (*Validator).Validate this function will take a validator instance from a global pool and can therefore be less efficient. Consider reusing a Validator instance instead.

TIP: Explicitly cast s to string or []byte to use the global validator pools and avoid an unnecessary validator allocation, such as when dealing with json.RawMessage and similar types derived from string or []byte.

m := json.RawMessage(`1`)
jscan.Validate([]byte(m)) // Cast m to []byte to avoid allocation!

func ValidateOne

func ValidateOne[S ~string | ~[]byte](s S) (trailing S, err Error[S])

ValidateOne scans one JSON value from s. It returns an error if the value is invalid, and trailing: the substring of s with the scanned value cut off. In case of an error, trailing will be a substring of s cut up until the index where the error was encountered.

Unlike (*Validator).ValidateOne this function will take a validator instance from a global pool and can therefore be less efficient. Consider reusing a Validator instance instead.

TIP: Explicitly cast s to string or []byte to use the global validator pools and avoid an unnecessary validator allocation, such as when dealing with json.RawMessage and similar types derived from string or []byte.

m := json.RawMessage(`1`)
jscan.ValidateOne([]byte(m)) // Cast m to []byte to avoid allocation!
Example
s := `-120.4` +
	`"string"` +
	`{"key":"value"}` +
	`[0,1]` +
	`true` +
	`false` +
	`null`

for offset, x := 0, s; x != ""; offset = len(s) - len(x) {
	var err jscan.Error[string]
	if x, err = jscan.ValidateOne(x); err.IsErr() {
		panic(fmt.Errorf("unexpected error: %w", err))
	}
	fmt.Println(s[offset : len(s)-len(x)])
}
Output:

-120.4
"string"
{"key":"value"}
[0,1]
true
false
null

func (Error[S]) Error

func (e Error[S]) Error() string

Error stringifies the error implementing the built-in error interface. Calling Error should be avoided in performance-critical code as it relies on dynamic memory allocation.

func (Error[S]) IsErr

func (e Error[S]) IsErr() bool

IsErr returns true if there is an error, otherwise returns false.

type ErrorCode

type ErrorCode int8

ErrorCode defines the error type.

const (

	// ErrorCodeInvalidEscape indicates the encounter of an invalid escape sequence.
	ErrorCodeInvalidEscape ErrorCode

	// ErrorCodeIllegalControlChar indicates the encounter of
	// an illegal control character in the source.
	ErrorCodeIllegalControlChar

	// ErrorCodeUnexpectedEOF indicates the encounter of an unexpected end of file.
	ErrorCodeUnexpectedEOF

	// ErrorCodeUnexpectedToken indicates the encounter of an unexpected token.
	ErrorCodeUnexpectedToken

	// ErrorCodeMalformedNumber indicates the encounter of a malformed number.
	ErrorCodeMalformedNumber

	// ErrorCodeCallback indicates return of true from the callback function.
	ErrorCodeCallback
)

type Iterator

type Iterator[S ~string | ~[]byte] struct {
	// contains filtered or unexported fields
}

Iterator provides access to the most recently encountered value.

func (*Iterator[S]) ArrayIndex

func (i *Iterator[S]) ArrayIndex() int

ArrayIndex returns either the index of the element value in the array or -1 if the value isn't inside an array.

func (*Iterator[S]) Key

func (i *Iterator[S]) Key() (key S)

Key returns either the object member key or "" when the value isn't a member of an object and hence doesn't have a key.

func (*Iterator[S]) KeyIndex

func (i *Iterator[S]) KeyIndex() int

KeyIndex returns either the start index of the member key string in the source or -1 when the value isn't a member of an object and hence doesn't have a key.

func (*Iterator[S]) KeyIndexEnd

func (i *Iterator[S]) KeyIndexEnd() int

KeyIndexEnd returns either the end index of the member key string in the source or -1 when the value isn't a member of an object and hence doesn't have a key.

func (*Iterator[S]) Level

func (i *Iterator[S]) Level() int

Level returns the depth level of the current value.

For example in the following JSON: `[1,2,3]` the array is situated at level 0 while the integers inside are situated at level 1.

func (*Iterator[S]) Pointer

func (i *Iterator[S]) Pointer() (s S)

Pointer returns the JSON pointer in RFC-6901 format.

func (*Iterator[S]) ScanStack

func (i *Iterator[S]) ScanStack(fn func(keyIndex, keyEnd, arrayIndex int))

ScanStack calls fn for every element in the stack. If keyIndex is != -1 then the element is a member value, otherwise arrayIndex indicates the index of the element in the underlying array.

func (*Iterator[S]) Value

func (i *Iterator[S]) Value() (value S)

Value returns the value if any.

func (*Iterator[S]) ValueIndex

func (i *Iterator[S]) ValueIndex() int

ValueIndex returns the start index of the value in the source.

func (*Iterator[S]) ValueIndexEnd

func (i *Iterator[S]) ValueIndexEnd() int

ValueIndexEnd returns the end index of the value in the source if any. Object and array values have a -1 end index because their end is unknown during traversal.

func (*Iterator[S]) ValueType

func (i *Iterator[S]) ValueType() ValueType

ValueType returns the value type identifier.

func (*Iterator[S]) ViewPointer

func (i *Iterator[S]) ViewPointer(fn func(p []byte))

ViewPointer calls fn and provides the buffer holding the JSON pointer in RFC-6901 format. Consider using (*Iterator[S]).Pointer() instead for safety and convenience.

WARNING: do not use or alias p after fn returns, only reading and copying p are considered safe!

type Parser

type Parser[S ~string | ~[]byte] struct {
	// contains filtered or unexported fields
}

Parser wraps an iterator in a reusable instance. Reusing a parser instance is more efficient than global functions that rely on a global iterator pool.

func NewParser

func NewParser[S ~string | ~[]byte](preallocStackFrames int) *Parser[S]

NewParser creates a new reusable parser instance. A higher preallocStackFrames value implies greater memory usage but also reduces the chance of dynamic memory allocations, which can occur when the JSON depth surpasses the stack size. A preallocStackFrames of 32 is equivalent to ~1KiB of memory usage on 64-bit systems (1 frame = ~32 bytes). Use DefaultStackSizeIterator when not sure.

func (*Parser[S]) Scan

func (p *Parser[S]) Scan(
	s S, fn func(*Iterator[S]) (err bool),
) Error[S]

Scan calls fn for every encountered value including objects and arrays. When an object or array is encountered fn will also be called for each of its member and element values.

WARNING: Don't use or alias *Iterator[S] after fn returns!

func (*Parser[S]) ScanOne

func (p *Parser[S]) ScanOne(
	s S, fn func(*Iterator[S]) (err bool),
) (trailing S, err Error[S])

ScanOne calls fn for every encountered value including objects and arrays. When an object or array is encountered fn will also be called for each of its member and element values.

Unlike Scan, ScanOne doesn't return ErrorCodeUnexpectedToken when it encounters anything other than EOF after reading a valid JSON value. It returns an error, if any, and trailing: the substring of s with the scanned value cut off. In case of an error, trailing will be a substring of s cut up until the index where the error was encountered.

WARNING: Don't use or alias *Iterator[S] after fn returns!

type Validator

type Validator[S ~string | ~[]byte] struct {
	// contains filtered or unexported fields
}

Validator is a reusable validator instance. The validator is more efficient than the parser at JSON validation. A validator instance can be more efficient than global Valid, Validate and ValidateOne function calls due to potential stack frame allocation avoidance.

func NewValidator

func NewValidator[S ~string | ~[]byte](preallocStackFrames int) *Validator[S]

NewValidator creates a new reusable validator instance. A higher preallocStackFrames value implies greater memory usage but also reduces the chance of dynamic memory allocations, which can occur when the JSON depth surpasses the stack size. A preallocStackFrames of 1024 is equivalent to ~1KiB of memory usage (1 frame = 1 byte). Use DefaultStackSizeValidator when not sure.

func (*Validator[S]) Valid

func (v *Validator[S]) Valid(s S) bool

Valid returns true if s is a valid JSON value, otherwise returns false.

func (*Validator[S]) Validate

func (v *Validator[S]) Validate(s S) Error[S]

Validate returns an error if s is invalid JSON, otherwise returns a zero value of Error[S].

func (*Validator[S]) ValidateOne

func (v *Validator[S]) ValidateOne(s S) (trailing S, err Error[S])

ValidateOne scans one JSON value from s. It returns an error if the value is invalid, and trailing: the substring of s with the scanned value cut off. In case of an error, trailing will be a substring of s cut up until the index where the error was encountered.

type ValueType

type ValueType int8

ValueType defines a JSON value type

const (
	ValueTypeObject ValueType
	ValueTypeArray
	ValueTypeNull
	ValueTypeFalse
	ValueTypeTrue
	ValueTypeString
	ValueTypeNumber
)

JSON value types

func (ValueType) String

func (t ValueType) String() string

Directories

Path Synopsis
internal