Documentation ¶
Overview ¶
Package rezi provides the ability to encode and decode data in Rarefied Encoding (Compressible) Interchange format. It allows types that implement encoding.BinaryUnmarshaler and encoding.BinaryMarshaler to be easily read from and written to byte slices. It has an interface similar to the json package, where one function is used to encode all supported types, and another function receives bytes and a receiver for decoded data and infers how to decode the bytes based on the receiver.
The Enc function is used to encode any supported type to REZI bytes:
    import "github.com/dekarrin/rezi"

    func main() {
        specialNumber := 413
        name := "TEREZI"

        var numData []byte
        var nameData []byte
        var err error

        numData, err = rezi.Enc(specialNumber)
        if err != nil {
            panic(err.Error())
        }

        nameData, err = rezi.Enc(name)
        if err != nil {
            panic(err.Error())
        }
    }
Data from multiple calls to Enc() can be combined into a single block of data by appending them together:
    var allData []byte
    allData = append(allData, numData...)
    allData = append(allData, nameData...)
The Dec function is used to decode data from REZI bytes:
    var readNumber int
    var readName string
    var n int
    var err error

    // n and err are already declared, so assign with = rather than :=.
    n, err = rezi.Dec(allData, &readNumber)
    if err != nil {
        panic(err.Error())
    }
    // Skip past the bytes that were just consumed.
    allData = allData[n:]

    n, err = rezi.Dec(allData, &readName)
    if err != nil {
        panic(err.Error())
    }
    allData = allData[n:]
Error Checking ¶
Errors in REZI have specific types that can be checked in order to determine the cause of an error. These errors work with the standard errors package and must be checked by using errors.Is.
As mentioned in the errors package documentation, errors should not be checked with simple equality checks. REZI enforces this fully; comparing a non-nil error with `==` will never evaluate to true.
That is,

    if err == rezi.Error

is not only the non-preferred way of checking an error, but will always return false. Instead, do:

    if errors.Is(err, rezi.Error)
There are several error types defined for checking non-nil errors. Error is the type that all non-nil errors from REZI will match. It may be caused by some other underlying error; again, use errors.Is to check this, even if a non-rezi error is being checked. For instance, to check if an error was caused due to the supplied bytes being shorter than expected, use errors.Is(err, io.ErrUnexpectedEOF).
See the individual functions for a list of error types that non-nil returned errors may be checked against.
Supported Data Types ¶
REZI supports several built-in basic Go types: int (as well as all of its unsigned and specific-bitsize varieties), string, bool, and any type that implements encoding.BinaryMarshaler (for encoding) or whose pointer type implements encoding.BinaryUnmarshaler (for decoding).
Floating point types and complex types are not supported at this time, although they may be added in a future release.
Slices and maps are supported with some stipulations. Slices must contain only other supported types (or pointers to them). Maps have the same restrictions on their values, but only maps with a key type of string, int (or any of its unsigned and specific-bitsize varieties), or bool are supported.
Pointers to any supported type are also accepted, including to other pointer types with any number of indirections. The REZI format encodes information on how many levels of indirection are valid, though of course note that it does not have any concept of two different pointer variables pointing to the same data.
Binary Data Format ¶
REZI uses a binary format for all supported types. Other than bool, which is encoded as simply one of two byte values, an encoded value will start with one or more "info" bytes that give metadata on the value itself. This is typically the length of the full value, but may include additional information such as whether the encoded value is in fact a nil pointer.
Note that the info byte does not give information on the type of the encoded value, besides whether it is nil (and still, the type of the nil is not encoded). Types of the encoded values are inferred by the pointer receiver that is passed to Dec(). If a pointer to an int is passed to it, the bytes will be interpreted as an encoded int; likewise, if a pointer to a string is passed to it, the bytes will be interpreted as an encoded string.
The INFO Byte

    Layout:

    SXNILLLL
    |      |
    MSB  LSB
The info byte has information coded into its bits represented as SXNILLLL, where each letter stands for a particular bit, from most significant to least significant.
The bit labeled "S" is the sign bit; when high (1), it indicates that the following integer value is negative.
The "X" bit is the extension flag, and indicates that the next byte is a second info byte with additional information. At this time that bit is unused, but is planned to be used in future releases.
The "N" bit is the explicit nil flag, and when set it indicates that the value is a nil and that there are no following bytes which make up its encoding, with the exception of any indirection amount indicators.
The "I" bit is the indirection bit. If set, it indicates that the following bytes encode the number of additional indirections of the pointer beyond the initial indirection at which the nil occurs. For instance, a nil *int value is encoded as simply the info byte 0b00100000, but a non-nil **int that points at a nil *int would be encoded with one level of additional indirection, and the info byte's I bit would be set.
The "L" bits make up the length of the value. Together, they are a 4-bit unsigned integer that indicates how many of the following bytes are part of the encoded value. If the I bit is set on the info byte, the L bits give the number of bytes that make up the indirection level rather than the actual value.
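As an illustration of the SXNILLLL layout, the following sketch splits an info byte into its fields with bit masks. The `infoByte` struct and `parseInfo` function are hypothetical names chosen for this example and are not part of the rezi package.

```go
package main

import "fmt"

// infoByte is a hypothetical struct holding the fields of an SXNILLLL byte.
type infoByte struct {
	Negative bool // S: sign bit
	Extended bool // X: extension flag (currently unused by the format)
	Nil      bool // N: explicit nil flag
	Indirect bool // I: indirection bit
	Length   int  // LLLL: 4-bit unsigned count of following bytes
}

// parseInfo extracts each field from the byte using a mask per bit position.
func parseInfo(b byte) infoByte {
	return infoByte{
		Negative: b&0x80 != 0,
		Extended: b&0x40 != 0,
		Nil:      b&0x20 != 0,
		Indirect: b&0x10 != 0,
		Length:   int(b & 0x0f),
	}
}

func main() {
	fmt.Printf("%+v\n", parseInfo(0b00100000)) // info byte of a nil *int
	fmt.Printf("%+v\n", parseInfo(0x82))       // negative integer with 2 value bytes
}
```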
Bool Values

    Layout:

    [ VALUE ]
     1 byte
Boolean values are encoded in REZI as the byte value 0x01 for true, or 0x00 for false. Bool is the only type whose encoded value does not begin with an info byte, although a pointer-to-bool may be encoded with an info byte if it is nil.
Integer Values

    Layout:

    [ INFO ] [ INT VALUE ]
     1 byte    0..8 bytes
Integer values begin with the info byte. Assuming that it is not nil, the 4 L bits of the info byte give the number of bytes that are in the value itself, and the S bit represents whether the value is negative.
The INT VALUE portion of the integer includes all bytes necessary to rebuild the integer value. It is created by first taking the integer's value expanded to 64-bits, and then removing all leading insignificant bytes (those with a value of 0x00 for positive integers, or those with a value of 0xff for negative integers). These bytes are then used as the INT VALUE.
As a result of the above encoding, certain integer values can be encoded with no bytes in INT VALUE at all; the 64-bit representation for 0 is all 0x00's, and therefore has no significant bytes. Likewise, the 64-bit representation for -1 using two's complement representation is all 0xff's.
All Go integer types are encoded in the same way. This includes int, int8, int16, int32, int64, uint, uint8, uint16, uint32, and uint64. The specific interpretation into a value is handled at decoding time by inferring the type from the pointer passed to Dec.
String Values

    Layout:

    [ INFO ] [ INT VALUE ] [ CODEPOINT 1 ] ... [ CODEPOINT N ]
    <---CODEPOINT COUNT--> <------------CODEPOINTS----------->
          1..9 bytes             COUNT..COUNT*4 bytes
String values are encoded as a count of codepoints (which is itself encoded as an integer value), followed by the Unicode codepoints that make up the string encoded with UTF-8. Due to the count being of Unicode codepoints rather than bytes, the actual number of bytes in an encoded string will be between the minimum and maximum number of bytes needed to encode a codepoint in UTF-8, multiplied by the number of codepoints.
encoding.BinaryMarshaler Values

    Layout:

    [ INFO ] [ INT VALUE ] [ MARSHALED BYTES ]
    <-------COUNT--------> <-MARSHALED BYTES->
          1..9 bytes           COUNT bytes
Any type that implements encoding.BinaryMarshaler is encoded by taking the result of calling its MarshalBinary() method and prepending it with an integer value giving the number of bytes in it.
Slice Values

    Layout:

    [ INFO ] [ INT VALUE ] [ ITEM 1 ] ... [ ITEM N ]
    <-------COUNT--------> <--------VALUES--------->
          1..9 bytes              COUNT bytes
Slices are encoded as a count of bytes that make up the entire slice, followed by the encoded value of each element in the slice. There is no special delimiter between the encoded elements; when one ends, the next one begins.
Map Values

    Layout:

    [ INFO ] [ INT VALUE ] [ KEY 1 ] [ VALUE 1 ] ... [ KEY N ] [ VALUE N ]
    <-------COUNT--------> <-------------------VALUES-------------------->
          1..9 bytes                         COUNT bytes
Map values are encoded as a count of all bytes that make up the entire map, followed by pairs of the encoded keys and associated values for each element of the map. Each pair consists of the encoded key, followed immediately by the encoded value that the key maps to. There is no special delimiter between key-value pairs or between the key and value in a pair; where one ends, the next one begins.
The encoded keys are placed in a consistent order; encoding the same map will result in the same encoding regardless of the order of keys encountered during iteration over the keys.
Nil Values

    Layout:

    [ INFO ] [ INT VALUE ]
     1 byte    0..8 bytes
Nil values are encoded similarly to integers, with one major exception: the nil bit in the info byte is set to true. This allows a nil to be stored in the same place as a length count, so when interpreting data, a length count can be checked for nil and if nil, instead of the normal value being decoded, a nil value is decoded.
Nil pointers to a non-pointer type of any kind are encoded as a single info byte with the nil bit set and the indirection bit unset.
Pointers that are themselves not nil but point to another pointer which is nil are encoded slightly differently. In this case, the info byte will have both the nil bit and the indirection bit set, and its length bits will be non-zero and give the number of bytes which follow that make up an encoded integer. The encoded integer gives the number of indirections that are done before a nil pointer is arrived at. For instance, a ***int that points to a valid **int that itself points to a valid *int which is nil would be encoded as a nil with indirection level of 2.
Encoded nil values are *not* typed; they will be interpreted as the same type as the pointed-to value of the receiver passed to REZI during decoding.
Pointer Values

    Layout:

    (either encoded value type, or encoded nil)
A pointer is not encoded in a special manner. Instead, the value it points to is encoded as though it were not a pointer, and when decoding to a pointer, the value is first decoded, then a pointer to the decoded value is used as the value of the pointer.
If a pointer is nil, it is instead encoded as a nil value.
Pointers that have multiple levels of indirection before arriving at the pointed-to value are not treated any differently when non-nil; i.e. an **int which points to an *int which points to an int with value 413 would be encoded as an integer value representing 413. If a pointer with multiple levels of indirection has a nil somewhere in the indirection chain, it is encoded as a nil value; see the section on nil value encodings for a description of how this information is captured.
Compatibility:
Older versions of the REZI encoding indicated nil by giving -1 as the byte count. This version of REZI will read this as well and can interpret it correctly. Note, however, that it will only be able to handle a single level of indirection, i.e. a nil pointer-to-type, with no additional indirections.
Index ¶
- Variables
- func Dec(data []byte, v interface{}) (n int, err error)
- func DecBinary(data []byte, b encoding.BinaryUnmarshaler) (int, error) (deprecated)
- func DecBool(data []byte) (bool, int, error) (deprecated)
- func DecInt(data []byte) (int, int, error) (deprecated)
- func DecMapStringToBinary[E encoding.BinaryUnmarshaler](data []byte) (map[string]E, int, error) (deprecated)
- func DecMapStringToInt(data []byte) (map[string]int, int, error) (deprecated)
- func DecSliceBinary[E encoding.BinaryUnmarshaler](data []byte) ([]E, int, error) (deprecated)
- func DecSliceString(data []byte) ([]string, int, error) (deprecated)
- func DecString(data []byte) (string, int, error) (deprecated)
- func Enc(v interface{}) (data []byte, err error)
- func EncBinary(b encoding.BinaryMarshaler) []byte (deprecated)
- func EncBool(b bool) []byte (deprecated)
- func EncInt(i int) []byte (deprecated)
- func EncMapStringToBinary[E encoding.BinaryMarshaler](m map[string]E) []byte (deprecated)
- func EncMapStringToInt(m map[string]int) []byte (deprecated)
- func EncSliceBinary[E encoding.BinaryMarshaler](sl []E) []byte (deprecated)
- func EncSliceString(sl []string) []byte (deprecated)
- func EncString(s string) []byte (deprecated)
- func MustDec(data []byte, v interface{}) int
- func MustEnc(v interface{}) []byte
- type Decoder (deprecated)
- type Encoder (deprecated)
Variables ¶
    var (
        // Error is a general error returned from encoding and decoding functions.
        // All non-nil errors returned from this package will return true for the
        // expression errors.Is(err, Error).
        Error = errors.New("a problem related to the binary REZI format has occurred")

        // ErrMarshalBinary indicates that calling a MarshalBinary method on a type
        // that was being encoded returned a non-nil error. Any error returned from
        // this package that was caused by this will return true for the expression
        // errors.Is(err, ErrMarshalBinary).
        ErrMarshalBinary = errors.New("MarshalBinary() returned an error")

        // ErrUnmarshalBinary indicates that calling an UnmarshalBinary method on a
        // type that was being decoded returned a non-nil error. Any error returned
        // from this package that was caused by this will return true for the
        // expression errors.Is(err, ErrUnmarshalBinary).
        ErrUnmarshalBinary = errors.New("UnmarshalBinary() returned an error")

        // ErrInvalidType indicates that the value to be encoded or decoded to is
        // not of a valid type. Any error returned from this package that was caused
        // by this will return true for the expression
        // errors.Is(err, ErrInvalidType).
        ErrInvalidType = errors.New("data is not the correct type")

        // ErrMalformedData indicates that there is a problem with the data being
        // decoded. Any error returned from this package that was caused by this
        // will return true for the expression errors.Is(err, ErrMalformedData).
        ErrMalformedData = errors.New("data cannot be interpretered")
    )
Functions ¶
func Dec ¶ added in v1.1.0
Dec decodes a value from REZI-format bytes in data, starting with the first byte in it. Returns the number of bytes consumed in order to read the complete value. If the data slice was constructed by appending encoded values together, then skipping over n bytes after a successful call to Dec will result in the next call to Dec reading the next subsequent value.
V must be a non-nil pointer to a type supported by REZI. The type of v is examined to determine how to decode the value. The data itself is not examined for type inference, therefore v must be a pointer to a compatible type. V is only assigned to on successful decoding; if this function returns a non-nil error, v will not have been assigned to.
If a problem occurs while decoding, the returned error will be non-nil and will return true for errors.Is(err, rezi.Error). Additionally, the same expression will return true for other error types, depending on the cause of the error. Do not check error types with the equality operator ==; this will always return false.
Non-nil errors from this function can match the following error types:

- Error in all cases.
- ErrInvalidType if the type pointed to by v is not supported or if v is a nil pointer.
- ErrUnmarshalBinary if an implementor of encoding.BinaryUnmarshaler returns an error from its UnmarshalBinary method (additionally, the returned error will match the same types that the error returned from UnmarshalBinary would match).
- io.ErrUnexpectedEOF if there are fewer bytes than necessary to decode the value.
- ErrMalformedData if there is any problem with the data itself (including there being fewer bytes than necessary to decode the value).
func DecBinary
deprecated
func DecBinary(data []byte, b encoding.BinaryUnmarshaler) (int, error)
DecBinary decodes a value at the start of the given bytes and calls UnmarshalBinary on the provided object with those bytes. If a nil value was encoded, then a nil byte slice is passed to the UnmarshalBinary func.
It returns the total number of bytes read from the data bytes.
Deprecated: this function has been replaced by Dec.
func DecMapStringToBinary
deprecated
func DecSliceBinary
deprecated
func DecSliceBinary[E encoding.BinaryUnmarshaler](data []byte) ([]E, int, error)
DecSliceBinary decodes a slice of implementors of encoding.BinaryUnmarshaler from the data bytes.
Deprecated: This function requires the slice value type to directly implement encoding.BinaryUnmarshaler. Use Dec instead, which allows any type as a slice value provided that a *pointer* to it implements encoding.BinaryUnmarshaler.
func Enc ¶ added in v1.1.0
Enc encodes a value to REZI-format bytes. The type of the value is examined to determine how to encode it. No type information is included in the returned bytes, so it is up to the caller to keep track of it and use a receiver of a compatible type when decoding.
If a problem occurs while encoding, the returned error will be non-nil and will return true for errors.Is(err, rezi.Error). Additionally, the same expression will return true for other error types, depending on the cause of the error. Do not check error types with the equality operator ==; this will always return false.
Non-nil errors from this function can match the following error types:

- Error in all cases.
- ErrInvalidType if the type of v is not supported.
- ErrMarshalBinary if an implementor of encoding.BinaryMarshaler returns an error from its MarshalBinary method (additionally, the returned error will match the same types that the error returned from MarshalBinary would match).
func EncBinary
deprecated
func EncBinary(b encoding.BinaryMarshaler) []byte
EncBinary encodes a BinaryMarshaler as a slice of bytes. The value can later be decoded with DecBinary. Encoded output starts with an integer (as encoded by EncInt) indicating the number of bytes following that make up the object, followed by that many bytes containing the encoded value.
The output will be variable length; it will contain the encoded integer count (1 to 9 bytes) followed by the number of bytes given by that count.
Deprecated: This function has been replaced by Enc.
func EncBool
deprecated
EncBool encodes the bool value as a slice of bytes. The value can later be decoded with DecBool. No type indicator is included in the output; it is up to the caller to add this if they so wish it.
The output will always contain exactly 1 byte.
Deprecated: This function has been replaced by Enc.
func EncInt
deprecated
EncInt encodes the int value as a slice of bytes. The value can later be decoded with DecInt. No type indicator is included in the output; it is up to the caller to add this if they so wish it. Integers up to 64 bits are supported with this encoding scheme.
The returned slice will be 1 to 9 bytes long. Integers larger in magnitude will result in longer slices; only 0 and -1 are encoded as a single byte.
Encoded integers start with an info byte that packs the sign and the number of following bytes needed to represent the value together. The sign is encoded as the most significant bit (the first/leftmost bit) of the byte, with 0 being positive and 1 being negative. The next significant 3 bits are unused. The least significant 4 bits contain the number of bytes that are used to encode the integer value. The bits in the info byte can be represented as `SXXXLLLL`, where S is the sign bit, X are unused bits, and L are the bits that encode the remaining length.
The remaining bytes give the value being encoded as a 2's complement 64-bit big-endian integer, omitting any leading bytes that would be encoded as 0x00 if the integer is positive, or 0xff if the integer is negative. The value 0 is special and is encoded as the info byte 0x00 with no additional bytes. Because two's complement is used and as a result of the rules, -1 also requires no bytes besides the info byte (because it would simply be a series of eight 0xff bytes), and is therefore encoded as 0x80.
Additional examples: 1 would be encoded as [0x01 0x01], 2 as [0x01 0x02], 500 as [0x02 0x01 0xf4], etc. -2 would be encoded as [0x81 0xfe], -500 as [0x82 0xfe 0x0c], etc.
Deprecated: This function has been replaced by Enc.
func EncMapStringToBinary
deprecated
func EncMapStringToBinary[E encoding.BinaryMarshaler](m map[string]E) []byte
EncMapStringToBinary encodes a map of string to an implementer of encoding.BinaryMarshaler as bytes. The order of keys in output is guaranteed to be consistent.
Deprecated: This function has been replaced by Enc.
func EncMapStringToInt
deprecated
func EncSliceBinary
deprecated
func EncSliceBinary[E encoding.BinaryMarshaler](sl []E) []byte
EncSliceBinary encodes a slice of implementors of encoding.BinaryMarshaler as bytes.
Deprecated: This function has been replaced by Enc.
func EncSliceString
deprecated
func EncString
deprecated
EncString encodes a string value as a slice of bytes. The value can later be decoded with DecString. Encoded string output starts with an integer (as encoded by EncInt) indicating the number of characters (Unicode codepoints) in the string, followed by the string encoded as UTF-8.
The output will be variable length; it will contain the encoded integer count (1 to 9 bytes) followed by the bytes that make up X characters, where X is the value of that count. Due to the specifics of how UTF-8 strings are encoded, X may or may not be the actual number of bytes used.
Deprecated: This function has been replaced by Enc.
Types ¶
type Decoder
deprecated
    type Decoder[E any] interface {
        // DecodeBool decodes a bool value at the current position within the buffer
        // of the Decoder and advances the current position past the read bytes.
        DecodeBool() (bool, error)

        // DecodeInt decodes an int value at the current position within the buffer
        // of the Decoder and advances the current position past the read bytes.
        DecodeInt() (int, error)

        // DecodeString decodes a string value at the current position within the
        // buffer of the Decoder and advances the current position past the read
        // bytes.
        DecodeString() (string, error)

        // Decode decodes a value at the current position within the buffer of the
        // Decoder and advances the current position past the read bytes. Unlike the
        // other functions, instead of returning the value this one will set the
        // value of the given item.
        Decode(o E) error
    }
Decoder decodes the primitive types bool, int, and string, as well as a type that is specified by its type parameter (usually an interface of some XMarshaler type, such as BinaryUnmarshaler).
Deprecated: Does not function as intended.
func NewBinaryDecoder
deprecated
func NewBinaryDecoder() Decoder[encoding.BinaryUnmarshaler]
NewBinaryDecoder creates a Decoder that can decode bytes and uses an object's UnmarshalBinary method to decode non-trivial types.
Deprecated: Do not use.
type Encoder
deprecated
    type Encoder[E any] interface {
        EncodeBool(b bool)
        EncodeInt(i int)
        EncodeString(s string)
        Encode(o E)

        // Bytes returns all encoded values as sequential bytes.
        Bytes() []byte
    }
Encoder encodes the primitive types bool, int, and string, as well as a type that is specified by its type parameter (usually an interface of some XMarshaler type, such as BinaryMarshaler).
Deprecated: Does not function as intended.
func NewBinaryEncoder
deprecated
func NewBinaryEncoder() Encoder[encoding.BinaryMarshaler]
NewBinaryEncoder creates an Encoder that can encode to bytes and uses an object's MarshalBinary method to encode non-trivial types.
Deprecated: Do not use.