zll

package
v0.0.0-...-86e9f11 Latest
Warning

This package is not in the latest version of its module.

Published: Jan 7, 2024 License: Apache-2.0 Imports: 14 Imported by: 0

Documentation

Overview

Package zll exposes types and procedures related to low-level zion decoding. Callers should prefer the high-level zion package.

Constants

const (
	BucketBits = 4
	NumBuckets = 1 << BucketBits
	BucketMask = NumBuckets - 1
)
const MaxBucketSize = 1 << 21

MaxBucketSize is the maximum size of a compressed bucket.

Variables

This section is empty.

Functions

func AppendMagic

func AppendMagic(dst []byte, algo BucketAlgo, seed uint8) []byte

AppendMagic appends the zion magic bytes plus the seed bits to dst.

NOTE: currently only the lowest 4 bits of seed should be set. The rest are reserved for future use.
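Because only the low four bits of seed are meaningful at present, a caller may want to clear the reserved bits before passing a value on. The following is a minimal sketch of that masking step; lowSeedBits and seedMask are hypothetical names and are not part of zll.

```go
package main

import "fmt"

// seedMask covers the four low bits that the NOTE above
// describes as currently meaningful.
const seedMask = 0x0f

// lowSeedBits is a hypothetical helper (not part of zll) that
// clears the reserved upper bits of a seed value.
func lowSeedBits(seed uint8) uint8 {
	return seed & seedMask
}

func main() {
	fmt.Printf("%#x\n", lowSeedBits(0xa7)) // prints 0x7
}
```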

func FrameSize

func FrameSize(src []byte) (int, error)

FrameSize returns the number of compressed bytes within the next frame. This is the same number that Decompress would return as the number of bytes consumed if called on src.
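A size function of this shape lets a caller step through a buffer frame by frame without decompressing anything. The sketch below illustrates the loop with a stand-in size function. It assumes a toy format (a one-byte payload length followed by the payload) that is NOT the zion frame layout; toyFrameSize and splitFrames are hypothetical names.

```go
package main

import (
	"errors"
	"fmt"
)

// toyFrameSize stands in for a FrameSize-like function.
// ASSUMPTION: the toy format is a 1-byte payload length followed
// by the payload; this is not the zion frame layout.
func toyFrameSize(src []byte) (int, error) {
	if len(src) == 0 {
		return 0, errors.New("empty input")
	}
	n := 1 + int(src[0])
	if n > len(src) {
		return 0, errors.New("truncated frame")
	}
	return n, nil
}

// splitFrames cuts src into frames by repeatedly asking frameSize
// how many bytes the next frame occupies.
func splitFrames(src []byte, frameSize func([]byte) (int, error)) ([][]byte, error) {
	var frames [][]byte
	for len(src) > 0 {
		n, err := frameSize(src)
		if err != nil {
			return nil, err
		}
		if n <= 0 || n > len(src) {
			return nil, errors.New("invalid frame size")
		}
		frames = append(frames, src[:n])
		src = src[n:]
	}
	return frames, nil
}

func main() {
	stream := []byte{2, 'a', 'b', 1, 'c'}
	frames, err := splitFrames(stream, toyFrameSize)
	fmt.Println(len(frames), err) // prints: 2 <nil>
}
```

Passing the size function as a parameter keeps the loop independent of the frame format, so the same shape should apply to any function with this signature.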

func Hash64

func Hash64(seed uint32, sym ion.Symbol) uint64

Hash64 hashes a symbol using a 32-bit seed.

func IsMagic

func IsMagic(x []byte) bool

IsMagic returns true if x begins with the 4-byte magic number for zion-encoded streams, or false otherwise.

func SymbolBucket

func SymbolBucket(seed uint32, selector uint8, sym ion.Symbol) int

SymbolBucket maps an ion symbol to a bucket using a specific hash seed and bit-selector.
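One plausible way to map a hash to one of the NumBuckets buckets is to pick a BucketBits-wide group of the hash and mask it with BucketMask. The sketch below shows that arithmetic only. It makes two assumptions that the text above does not confirm: that the selector chooses which 4-bit group of the hash is used, and that FNV-1a can stand in for Hash64. The names toyHash64 and toyBucket are hypothetical, and symbols are modeled as plain uint32 values.

```go
package main

import (
	"encoding/binary"
	"fmt"
	"hash/fnv"
)

// Local copies of the package constants, so the sketch compiles alone.
const (
	bucketBits = 4
	numBuckets = 1 << bucketBits
	bucketMask = numBuckets - 1
)

// toyHash64 stands in for Hash64. ASSUMPTION: FNV-1a is used only
// to keep the sketch self-contained; zll's hash function differs.
func toyHash64(seed uint32, sym uint32) uint64 {
	var buf [8]byte
	binary.LittleEndian.PutUint32(buf[:4], seed)
	binary.LittleEndian.PutUint32(buf[4:], sym)
	h := fnv.New64a()
	h.Write(buf[:])
	return h.Sum64()
}

// toyBucket picks one bucketBits-wide group of the hash.
// ASSUMPTION: the selector chooses which 4-bit group is used.
func toyBucket(seed uint32, selector uint8, sym uint32) int {
	shift := uint(selector&0xf) * bucketBits
	return int((toyHash64(seed, sym) >> shift) & bucketMask)
}

func main() {
	for sym := uint32(10); sym < 13; sym++ {
		fmt.Println(sym, toyBucket(0, 0, sym))
	}
}
```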

Types

type BucketAlgo

type BucketAlgo uint8

BucketAlgo is an algorithm used to compress buckets.

const (
	// CompressZstd indicates that buckets are compressed
	// using vanilla zstd compression.
	CompressZstd BucketAlgo = iota
	// CompressIguanaV0 indicates that buckets are
	// compressed using the experimental iguana compression.
	CompressIguanaV0
	// CompressIguanaV0Specialized indicates
	// that buckets are compressed using the experimental
	// iguana compression OR a specialized algorithm given
	// by the first byte of the data.
	// If the first byte of the data is a null byte, then
	// IguanaV0 is used.
	CompressIguanaV0Specialized
)

func (BucketAlgo) Compress

func (a BucketAlgo) Compress(hints *BucketHints, src, dst []byte) ([]byte, error)

Compress compresses data from src and appends it to dst, returning the new dst slice or an error. If [hints] is non-nil, it may be used to improve the quality of the compression performed.

func (BucketAlgo) Decompress

func (a BucketAlgo) Decompress(src, dst []byte) ([]byte, int, error)

Decompress decompresses data from src, appending it to dst. Decompress returns the new dst, the number of compressed bytes consumed, and the first error encountered, if any.

func (BucketAlgo) String

func (a BucketAlgo) String() string

type BucketHints

type BucketHints struct {
	// Elements is the number of (symbol, value) pairs in the bucket.
	Elements int
	// TypeSet is a bitmap containing all of the
	// possible ion types for values in this bucket.
	TypeSet uint16
	// ListTypeSet is a bitmap of all the possible
	// ion types for sub-elements of the top level type
	// when the top-level type is an ion list type.
	// (This may be zero even when the top-level type
	// is only a list type iff the top-level lists are all empty.)
	ListTypeSet uint16
}

BucketHints is a set of hints to be provided for compression. The zero value of BucketHints implies there are no hints available.
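A 16-bit bitmap has room for one bit per four-bit Ion type code. The sketch below shows how such a bitmap could be built and queried. It assumes that bit N corresponds to type code N, which the field comments do not state; typeSet and onlyLists are hypothetical helpers, not part of zll.

```go
package main

import "fmt"

// Ion binary type codes used by this sketch
// (string is 0x8, list is 0xb, struct is 0xd).
const (
	typeString = 0x8
	typeList   = 0xb
	typeStruct = 0xd
)

// typeSet builds a 16-bit bitmap from type codes.
// ASSUMPTION: bit N of the bitmap corresponds to type code N.
func typeSet(codes ...uint8) uint16 {
	var set uint16
	for _, c := range codes {
		set |= 1 << (c & 0xf)
	}
	return set
}

// onlyLists reports whether the bitmap contains the list type
// and nothing else.
func onlyLists(set uint16) bool {
	return set == 1<<typeList
}

func main() {
	fmt.Printf("%#x\n", typeSet(typeString, typeList)) // prints 0x900
	fmt.Println(onlyLists(typeSet(typeList)))          // prints true
}
```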

type Buckets

type Buckets struct {
	// Shape is used to determine the seed and
	// symbol table used for populating the right buckets.
	Shape *Shape

	// Pos is the starting position of each
	// bucket within Decompressed, or -1 if the bucket
	// has not yet been decoded.
	Pos [NumBuckets]int32
	// Decompressed contains the raw decompressed buckets
	Decompressed []byte
	// Compressed contains the compressed buckets
	Compressed []byte
	// SymbolBits is a bitmap of symbol IDs;
	// only top-level symbols that need to be
	// extracted have their corresponding bit set.
	SymbolBits []uint64
	// BucketBits is a bitmap of buckets;
	// bit N == 1 implies that bucket N has been
	// decompressed.
	BucketBits uint32
	// Decomps is the number of individual bucket
	// decompression operations that have been performed.
	Decomps int

	// SkipPadding, if set, causes the calls to
	// Select and SelectSymbols to skip padding
	// Decompressed. If SkipPadding is not set,
	// then Decompressed is padded so that its
	// capacity allows the byte at len(Decompressed)-1
	// to be read with an 8-byte load.
	SkipPadding bool
}

Buckets represents the decompression state of the "buckets" portion of a zion block.
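The padding described for SkipPadding amounts to keeping at least seven spare bytes of capacity past the end of the slice, so that an 8-byte load starting at the final byte stays inside the backing array. The sketch below is one way to establish that property for an arbitrary slice. It is an interpretation of the field comment rather than zll's own code, and padFor8ByteLoad is a hypothetical name.

```go
package main

import "fmt"

// padFor8ByteLoad returns a slice with the same length and contents
// as buf and a capacity of at least len(buf)+7, so that an 8-byte
// load starting at the last byte stays within the backing array.
// The input is returned unchanged if it already has enough slack.
func padFor8ByteLoad(buf []byte) []byte {
	const slack = 7
	if cap(buf)-len(buf) >= slack {
		return buf
	}
	out := make([]byte, len(buf), len(buf)+slack)
	copy(out, buf)
	return out
}

func main() {
	buf := padFor8ByteLoad([]byte("zion"))
	fmt.Println(len(buf), cap(buf)-len(buf) >= 7) // prints: 4 true
}
```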

func (*Buckets) Reset

func (b *Buckets) Reset(shape *Shape, compressed []byte)

Reset resets the state of b to point to the given shape and compressed buckets.

func (*Buckets) Select

func (b *Buckets) Select(components []string) error

Select ensures that all the buckets corresponding to the selected components are already decompressed. Supplying a nil list of components causes all buckets to be decompressed. Select may be called more than once with different sets of components. Each time Select is called, it resets b.BucketBits and b.SymbolBits to correspond to the most-recently-selected components, but it does not reset the b.Pos displacements into decompressed data.

func (*Buckets) SelectAll

func (b *Buckets) SelectAll() error

SelectAll is equivalent to b.Select(nil).

func (*Buckets) SelectSymbols

func (b *Buckets) SelectSymbols(syms []ion.Symbol) error

SelectSymbols works identically to Select, but it picks the top-level path components by their symbol IDs rather than the names of the path components.

func (*Buckets) Selected

func (b *Buckets) Selected(sym ion.Symbol) bool

Selected indicates whether or not the symbol is one of the symbols selected by Select for the current symbol table.
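A membership test like this is typically a single bit lookup in a bitmap such as SymbolBits. The sketch below shows one common layout for a []uint64 bitmap indexed by symbol ID. It assumes that symbol N is stored in word N/64 at bit N%64, which the documentation above does not confirm; symbolSet and its methods are hypothetical, and symbols are modeled as plain uint values.

```go
package main

import "fmt"

// symbolSet is a bitmap indexed by symbol ID, one bit per symbol.
// ASSUMPTION: symbol N lives in word N/64 at bit N%64;
// zll's actual layout may differ.
type symbolSet []uint64

// add sets the bit for sym, growing the bitmap as needed.
func (s *symbolSet) add(sym uint) {
	word := sym / 64
	for uint(len(*s)) <= word {
		*s = append(*s, 0)
	}
	(*s)[word] |= 1 << (sym % 64)
}

// has reports whether the bit for sym is set. Symbols beyond the
// end of the bitmap are treated as not selected.
func (s symbolSet) has(sym uint) bool {
	word := sym / 64
	return word < uint(len(s)) && s[word]&(1<<(sym%64)) != 0
}

func main() {
	var set symbolSet
	set.add(3)
	set.add(70)
	fmt.Println(set.has(70), set.has(71)) // prints: true false
}
```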

type Shape

type Shape struct {
	// Symtab is the current symbol table.
	// Callers may plug in their own symbol table implementation.
	Symtab Symtab
	// Bits is the raw shape bitstream, including the leading symbol table.
	Bits []byte
	// Start is the position within Bits that the actual shape bits start.
	// This may be non-zero if the Bits stream has a leading symbol table.
	Start int
	// Seed is the 32-bit seed stored in the shape preamble;
	// it contains the selector used to hash symbols and the
	// algorithm used to compress buckets. All the other bits
	// are reserved and should be zero.
	Seed uint32
}

Shape manages the stateful part of decoding relevant to the "shape" portion of the stream.

func (*Shape) Algo

func (s *Shape) Algo() BucketAlgo

Algo returns the bucket compression algorithm as indicated by s.Seed.

func (*Shape) Count

func (s *Shape) Count() (int, error)

Count returns the number of records implied by the shape bitstream. The caller should have already called s.Decode at least once to populate the shape bits.

func (*Shape) Decode

func (s *Shape) Decode(src []byte) ([]byte, error)

Decode decodes the shape portion of src into s.Symtab and s.Bits and returns the buckets portion. Note that zion streams tend to be stateful, so the order in which Decode is called on sequences of blocks will change how s.Symtab is computed.

func (*Shape) SymbolBucket

func (s *Shape) SymbolBucket(sym ion.Symbol) int

SymbolBucket determines the bucket in which the top-level fields associated with sym would be encoded.

type Symtab

type Symtab interface {
	Unmarshal([]byte) ([]byte, error)
	Symbolize(x string) (ion.Symbol, bool)
}
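
Since Shape.Symtab accepts any implementation of this interface, a caller-supplied table only needs these two methods. The sketch below shows the method shapes on a map-backed type. It is heavily simplified: the local symbol type stands in for ion.Symbol so the code compiles alone, mapSymtab is a hypothetical name, and Unmarshal is a placeholder that consumes nothing. What the slice returned by Unmarshal means is not stated above; the sketch assumes it is the unconsumed remainder of the input.

```go
package main

import "fmt"

// symbol stands in for ion.Symbol so the sketch compiles on its own.
type symbol uint

// mapSymtab is a hypothetical map-backed table with the same
// method shapes as the Symtab interface.
type mapSymtab struct {
	ids map[string]symbol
}

// Unmarshal is a placeholder. ASSUMPTION: a real table would decode
// symbol definitions from the front of src and return the rest;
// this sketch consumes nothing and returns src unchanged.
func (m *mapSymtab) Unmarshal(src []byte) ([]byte, error) {
	return src, nil
}

// Symbolize returns the symbol for x and whether x is known.
func (m *mapSymtab) Symbolize(x string) (symbol, bool) {
	id, ok := m.ids[x]
	return id, ok
}

func main() {
	st := &mapSymtab{ids: map[string]symbol{"name": 10, "size": 11}}
	sym, ok := st.Symbolize("size")
	fmt.Println(sym, ok) // prints: 11 true
}
```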
