huffman

package module
v1.2.1 Latest Latest
Warning

This package is not in the latest version of its module.

Go to latest
Published: Jun 30, 2021 License: BSD-2-Clause Imports: 10 Imported by: 1

README

huffman

Go Latest Release Go Doc License Go Report Card

Canonical Huffman codes in Go.

Documentation

Overview

Package huffman implements canonical Huffman codes. These are useful for DEFLATE and other compression algorithms.

References:

<https://www.rfc-editor.org/rfc/rfc1951.html>, Section 3.2.2

<https://en.wikipedia.org/wiki/Canonical_Huffman_code>

Index

Constants

View Source
const InvalidSymbol = Symbol(-1)

InvalidSymbol is returned by some functions to clearly indicate that no symbol is being returned.

View Source
const MaxSymbol = Symbol(math.MaxInt32)

MaxSymbol is the maximum valid symbol.

Variables

This section is empty.

Functions

This section is empty.

Types

type Code

type Code struct {
	// Size holds the number of valid bits.
	Size byte

	// Bits holds the actual values of the bits.  The least significant bit
	// of Bits is the first bit.
	Bits uint32
}

Code represents a sequence of bits.

func MakeCode

func MakeCode(size byte, bits uint32) Code

MakeCode is a convenience function that constructs a Code.

func MakeReversedCode

func MakeReversedCode(size byte, bits uint32) Code

MakeReversedCode constructs a Code from a sequence of bits that's in the wrong order, i.e. the least significant bit is the *last* bit in the sequence, instead of the first.

func (Code) Reversed

func (hc Code) Reversed() Code

Reversed returns the corresponding Code with the bits in reverse order.

func (Code) String

func (hc Code) String() string

String returns the string representation of this Code.

type Decoder

type Decoder struct {
	// contains filtered or unexported fields
}

Decoder implements a decoder for canonical Huffman codes.

func NewDecoder added in v1.1.0

func NewDecoder(sizes []byte) *Decoder

NewDecoder is a convenience function that allocates a new Decoder and calls Init on it. If Init returns an error, NewDecoder panics.

func (Decoder) DebugString added in v1.1.0

func (d Decoder) DebugString() string

DebugString returns a programmer-readable debugging string of the Decoder's current state.

func (Decoder) Decode

func (d Decoder) Decode(hc Code) (symbol Symbol, minSize byte, maxSize byte)

Decode attempts to decode a Huffman code into a Symbol.

If the Decode is completely successful, symbol >= 0 and minSize == maxSize.

If the Decode fails due to insufficient bits, symbol == InvalidSymbol and at least (minSize - hc.Size) additional bits are required to decode this symbol. No more than (maxSize - hc.Size) additional bits will be required.

If the Decode fails due to unreasonable input, symbol == InvalidSymbol and minSize == maxSize == 0.

func (Decoder) Dump

func (d Decoder) Dump(w io.Writer) (int64, error)

Dump writes DebugString() to the given writer.

func (Decoder) Encoder added in v1.1.0

func (d Decoder) Encoder() *Encoder

Encoder returns a new Encoder which mirrors this Decoder.

func (Decoder) GoString added in v1.1.0

func (d Decoder) GoString() string

GoString returns a Go expression that would reconstruct this Decoder.

func (*Decoder) Init

func (d *Decoder) Init(sizes []byte) error

Init initializes this Decoder. The argument consists of zero or more bit lengths, one for each symbol in the code, which is used to construct the canonical Huffman code per the algorithm in RFC 1951 Section 3.2.2. Symbols with an assigned bit length of 0 are omitted from the code entirely.

Not all inputs are valid for constructing a canonical Huffman code. In particular, this method will reject "degenerate" codes which use overly long big lengths for some inputs. Degenerate codes consisting of 0 valid symbols or 1 valid symbol are permitted, however, as there is no way to construct a non-degenerate Huffman code for such cases.

func (*Decoder) InitFromEncoder added in v1.1.0

func (d *Decoder) InitFromEncoder(e Encoder) error

InitFromEncoder initializes this Decoder to be the mirror of the given Encoder.

func (Decoder) MarshalJSON added in v1.1.0

func (d Decoder) MarshalJSON() ([]byte, error)

MarshalJSON renders this Decoder as JSON data.

func (Decoder) MaxSize

func (d Decoder) MaxSize() byte

MaxSize is the bit length of the longest legal code.

func (Decoder) MaxSymbol

func (d Decoder) MaxSymbol() Symbol

MaxSymbol is the last Symbol in the code's alphabet.

(The first Symbol in the code's alphabet is always 0.)

func (Decoder) MinSize

func (d Decoder) MinSize() byte

MinSize is the bit length of the shortest legal code.

func (Decoder) NumSymbols added in v1.1.0

func (d Decoder) NumSymbols() uint

NumSymbols returns the total number of symbols in the code's alphabet.

func (Decoder) SizeBySymbol

func (d Decoder) SizeBySymbol() []byte

SizeBySymbol returns a copy of the original bit length array used to initialize this Decoder.

func (Decoder) String added in v1.1.0

func (d Decoder) String() string

String returns a brief string representation.

func (*Decoder) UnmarshalJSON added in v1.1.0

func (d *Decoder) UnmarshalJSON(raw []byte) error

UnmarshalJSON initializes this Decoder from JSON data.

type Encoder

type Encoder struct {
	// contains filtered or unexported fields
}

Encoder implements an encoder for canonical Huffman codes.

func NewEncoder added in v1.1.0

func NewEncoder(numSymbols int, frequencies []uint32) *Encoder

NewEncoder is a convenience function that allocates a new Encoder and calls Init on it.

func NewEncoderFromSizes added in v1.1.0

func NewEncoderFromSizes(sizes []byte) *Encoder

NewEncoderFromSizes is a convenience function that allocates a new Encoder and calls InitFromSizes on it. If InitFromSizes returns an error, NewEncoderFromSizes panics.

func (Encoder) DebugString added in v1.1.0

func (e Encoder) DebugString() string

DebugString returns a programmer-readable debugging string of the Encoder's current state.

func (Encoder) Decoder added in v1.1.0

func (e Encoder) Decoder() *Decoder

Decoder returns a new Decoder which mirrors this Encoder.

func (Encoder) Dump

func (e Encoder) Dump(w io.Writer) (int64, error)

Dump writes DebugString() to the given writer.

func (Encoder) Encode

func (e Encoder) Encode(symbol Symbol) Code

Encode encodes a Symbol into a Huffman-coded bit string.

func (Encoder) GoString added in v1.1.0

func (e Encoder) GoString() string

GoString returns a Go expression that would reconstruct this Encoder.

func (*Encoder) Init

func (e *Encoder) Init(numSymbols int, frequencies []uint32)

Init initializes this Encoder. The first argument tells Init how many Symbols are in this code's alphabet, and the second argument lists the frequency (i.e. number of occurrences) for each Symbol in the code, one for each Symbol except that any Symbol not represented in the list is assumed to have a frequency of 0.

func (*Encoder) InitFromDecoder added in v1.1.0

func (e *Encoder) InitFromDecoder(d Decoder) error

InitFromDecoder initializes this Encoder to be the mirror of the given Decoder.

func (*Encoder) InitFromSizes added in v1.1.0

func (e *Encoder) InitFromSizes(sizes []byte) error

InitFromSizes initializes this Encoder from a list of bit lengths, one for each symbol in the code. See Decoder.Init for more details.

func (Encoder) MarshalJSON added in v1.1.0

func (e Encoder) MarshalJSON() ([]byte, error)

MarshalJSON renders this Encoder as JSON data.

func (Encoder) MaxSize

func (e Encoder) MaxSize() byte

MaxSize is the bit length of the longest legal code.

func (Encoder) MaxSymbol

func (e Encoder) MaxSymbol() Symbol

MaxSymbol is the last Symbol in the code's alphabet.

(The first Symbol in the code's alphabet is always 0.)

func (Encoder) MinSize

func (e Encoder) MinSize() byte

MinSize is the bit length of the shortest legal code.

func (Encoder) NumSymbols added in v1.1.0

func (e Encoder) NumSymbols() uint

NumSymbols returns the total number of symbols in the code's alphabet.

func (Encoder) SizeBySymbol

func (e Encoder) SizeBySymbol() []byte

SizeBySymbol returns an array containing the bit length for each Symbol in the alphabet. This array can be transmitted to another party and used by Decoder to reconstruct this Huffman code on the receiving end.

func (Encoder) String added in v1.1.0

func (e Encoder) String() string

String returns a brief string representation.

func (*Encoder) UnmarshalJSON added in v1.1.0

func (e *Encoder) UnmarshalJSON(raw []byte) error

UnmarshalJSON initializes this Encoder from JSON data.

type Symbol

type Symbol int32

Symbol represents a symbol in an arbitrary alphabet. Negative symbols are not valid.

Jump to

Keyboard shortcuts

? : This menu
/ : Search site
f or F : Jump to
y or Y : Canonical URL