bits

package module
v0.0.0-...-1e96df0 Latest Latest
Warning

This package is not in the latest version of its module.

Go to latest
Published: Sep 26, 2021 License: Unlicense Imports: 3 Imported by: 4

README

================================
`Succinct Data Structure`_ Trie_
================================

.. image:: https://img.shields.io/badge/Language-Go-blue.svg
   :target: https://golang.org/

.. image:: https://godoc.org/github.com/siongui/go-succinct-data-structure-trie?status.svg
   :target: https://godoc.org/github.com/siongui/go-succinct-data-structure-trie

.. image:: https://github.com/siongui/go-succinct-data-structure-trie/workflows/Test%20Package/badge.svg
    :target: https://github.com/siongui/go-succinct-data-structure-trie/blob/master/.github/workflows/build.yml

.. image:: https://goreportcard.com/badge/github.com/siongui/go-succinct-data-structure-trie
   :target: https://goreportcard.com/report/github.com/siongui/go-succinct-data-structure-trie

.. image:: https://img.shields.io/badge/license-Unlicense-blue.svg
   :target: https://raw.githubusercontent.com/siongui/go-succinct-data-structure-trie/master/UNLICENSE

.. image:: https://img.shields.io/twitter/url/https/github.com/siongui/go-succinct-data-structure-trie.svg?style=social
   :target: https://twitter.com/intent/tweet?text=Wow:&url=%5Bobject%20Object%5D


Implementation of `Succinct Trie`_ [1]_ in Go_.

The trie structure is great for fast lookup of dictionary words, but if the
vocabulary of the dictionary is big, it may takes a lot of space to store the
constructed trie. For this reason, succinct data structure is applied to the
trie strcuture and we can both have fast lookup and small space requirement.


Usage
=====

- Basic example: `basic usage <example/basic/usage.go>`__
- Advanced example: `pali dir <example/pali/>`__

UNLICENSE
=========

Released in public domain. See UNLICENSE_.


References
==========

.. [1] `Succinct Data Structures: Cramming 80,000 words into a Javascript file. <http://stevehanov.ca/blog/?id=120>`_
       (`source code <http://www.hanovsolutions.com/trie/Bits.js>`__)

.. [2] Google Search `succinct data structure <https://www.google.com/search?q=succinct+data+structure>`__

.. [3] Google Search `succinct trie <https://www.google.com/search?q=succinct+trie>`__

.. [4] Google Search `golang const array <https://www.google.com/search?q=golang+const+array>`__

.. [5] Google Search `golang function as argument <https://www.google.com/search?q=golang+function+as+argument>`__

.. [6] Google Search `golang charcodeat <https://www.google.com/search?q=golang+charcodeat>`__

       `string - Go lang's equivalent of charCode() method of JavaScript - Stack Overflow <http://stackoverflow.com/questions/31239330/go-langs-equivalent-of-charcode-method-of-javascript>`_

.. [7] `[Golang] Succinct Trie Implementation <https://siongui.github.io/2016/02/08/go-succinct-trie-implementation/>`_

.. [8] `[JavaScript] Bug in Succinct Trie Implementation of Bits.js <https://siongui.github.io/2016/02/02/javascript-bug-in-succinct-trie-implementation-of-bits-js/>`_

.. _Go: https://golang.org/
.. _UNLICENSE: https://unlicense.org/
.. _Succinct Data Structure: https://www.google.com/search?q=Succinct+Data+Structure
.. _Trie: https://www.google.com/search?q=Trie
.. _Succinct Trie: https://www.google.com/search?q=Succinct+Trie

Documentation

Overview

Package go-succinct-data-structure-trie implements trie with succinct data structure in Go.

Index

Constants

This section is empty.

Variables

View Source
var BASE64 = "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789-_"
View Source
var BASE64_CACHE = map[string]uint{
	"A": 0, "B": 1, "C": 2, "D": 3, "E": 4, "F": 5, "G": 6, "H": 7,
	"I": 8, "J": 9, "K": 10, "L": 11, "M": 12, "N": 13, "O": 14,
	"P": 15, "Q": 16, "R": 17, "S": 18, "T": 19, "U": 20, "V": 21,
	"W": 22, "X": 23, "Y": 24, "Z": 25, "a": 26, "b": 27, "c": 28,
	"d": 29, "e": 30, "f": 31, "g": 32, "h": 33, "i": 34, "j": 35,
	"k": 36, "l": 37, "m": 38, "n": 39, "o": 40, "p": 41, "q": 42,
	"r": 43, "s": 44, "t": 45, "u": 46, "v": 47, "w": 48, "x": 49,
	"y": 50, "z": 51, "0": 52, "1": 53, "2": 54, "3": 55, "4": 56,
	"5": 57, "6": 58, "7": 59, "8": 60, "9": 61, "-": 62, "_": 63,
}

*

Returns the decimal value of the given character unit.
View Source
var BitsInByte = [256]uint{}/* 256 elements not displayed */
View Source
var L1 uint = 32 * 32

*

Fixed values for the L1 and L2 table sizes in the Rank Directory
View Source
var L2 uint = 32
View Source
var MaskTop = [7]uint{
	0x3f, 0x1f, 0x0f, 0x07, 0x03, 0x01, 0x00,
}
View Source
var W uint = 6

*

The width of each unit of the encoding, in bits. Here we use 6, for base-64
encoding.

Functions

func CHR

func CHR(id uint) string

*

Returns the character unit that represents the given value. If this were
binary data, we would simply return id.

func ORD

func ORD(ch string) uint

func SetAllowedCharacters

func SetAllowedCharacters(alphabet string)

Types

type BitString

type BitString struct {
	// contains filtered or unexported fields
}

*

Given a string of data (eg, in BASE-64), the BitString class supports
reading or counting a number of bits from an arbitrary position in the
string.

func (*BitString) Count

func (bs *BitString) Count(p, n uint) uint

*

Counts the number of bits set to 1 starting at position p and
ending at position p + n

func (*BitString) Get

func (bs *BitString) Get(p, n uint) uint

*

Returns a decimal number, consisting of a certain number, n, of bits
starting at a certain position, p.

func (*BitString) GetData

func (bs *BitString) GetData() string

*

Returns the internal string of bytes

func (*BitString) Init

func (bs *BitString) Init(data string)

func (*BitString) Rank

func (bs *BitString) Rank(x uint) uint

*

Returns the number of bits set to 1 up to and including position x.
This is the slow implementation used for testing.

type BitWriter

type BitWriter struct {
	// contains filtered or unexported fields
}

*

The BitWriter will create a stream of bytes, letting you write a certain
number of bits at a time. This is part of the encoder, so it is not
optimized for memory or speed.

func (*BitWriter) GetData

func (bw *BitWriter) GetData() string

*

Get the bitstring represented as a javascript string of bytes

func (*BitWriter) GetDebugString

func (bw *BitWriter) GetDebugString(group uint) string

*

Returns the bits as a human readable binary string for debugging

func (*BitWriter) Write

func (bw *BitWriter) Write(data, numBits uint)

*

Write some data to the bit string. The number of bits must be 32 or
fewer.

type FrozenTrie

type FrozenTrie struct {
	// contains filtered or unexported fields
}

*

The FrozenTrie is used for looking up words in the encoded trie.

@param data A string representing the encoded trie.

@param directoryData A string representing the RankDirectory. The global L1
and L2 constants are used to determine the L1Size and L2size.

@param nodeCount The number of nodes in the trie.

func (*FrozenTrie) GetNodeByIndex

func (f *FrozenTrie) GetNodeByIndex(index uint) FrozenTrieNode

*

Retrieve the FrozenTrieNode of the trie, given its index in level-order.
This is a private function that you don't have to use.

func (*FrozenTrie) GetRoot

func (f *FrozenTrie) GetRoot() FrozenTrieNode

*

Retrieve the root node. You can use this node to obtain all of the other
nodes in the trie.

func (*FrozenTrie) GetSuggestedWords

func (f *FrozenTrie) GetSuggestedWords(word string, limit int) []string

*

  • Given a word, returns array of words, prefix of which is word

func (*FrozenTrie) Init

func (f *FrozenTrie) Init(data, directoryData string, nodeCount uint)

func (*FrozenTrie) Lookup

func (f *FrozenTrie) Lookup(word string) bool

*

Look-up a word in the trie. Returns true if and only if the word exists
in the trie.

type FrozenTrieNode

type FrozenTrieNode struct {
	// contains filtered or unexported fields
}

*

This class is used for traversing the succinctly encoded trie.

func (*FrozenTrieNode) GetChild

func (f *FrozenTrieNode) GetChild(index uint) FrozenTrieNode

*

Returns the FrozenTrieNode for the given child.

@param index The 0-based index of the child of this node. For example, if
the node has 5 children, and you wanted the 0th one, pass in 0.

func (*FrozenTrieNode) GetChildCount

func (f *FrozenTrieNode) GetChildCount() uint

*

Returns the number of children.

type RankDirectory

type RankDirectory struct {
	// contains filtered or unexported fields
}

*

The rank directory allows you to build an index to quickly compute the
rank() and select() functions. The index can itself be encoded as a binary
string.

func CreateRankDirectory

func CreateRankDirectory(data string, numBits, l1Size, l2Size uint) RankDirectory

*

Used to build a rank directory from the given input string.

@param data A javascript string containing the data, as readable using the
BitString object.

@param numBits The number of bits to index.

@param l1Size The number of bits that each entry in the Level 1 table
summarizes. This should be a multiple of l2Size.

@param l2Size The number of bits that each entry in the Level 2 table
summarizes.

func (*RankDirectory) GetData

func (rd *RankDirectory) GetData() string

*

Returns the string representation of the directory.

func (*RankDirectory) Init

func (rd *RankDirectory) Init(directoryData, bitData string, numBits, l1Size, l2Size uint)

func (*RankDirectory) Rank

func (rd *RankDirectory) Rank(which, x uint) uint

*

Returns the number of 1 or 0 bits (depending on the "which" parameter) to
to and including position x.

func (*RankDirectory) Select

func (rd *RankDirectory) Select(which, y uint) uint

*

Returns the position of the y'th 0 or 1 bit, depending on the "which"
parameter.

type Trie

type Trie struct {
	// contains filtered or unexported fields
}

func (*Trie) Apply

func (t *Trie) Apply(fn func(*TrieNode))

*

Apply a function to each node, traversing the trie in level order.

func (*Trie) Encode

func (t *Trie) Encode() string

*

Encode the trie and all of its nodes. Returns a string representing the
encoded data.

func (*Trie) GetNodeCount

func (t *Trie) GetNodeCount() uint

*

Returns the number of nodes in the trie

func (*Trie) Init

func (t *Trie) Init()

func (*Trie) Insert

func (t *Trie) Insert(word string)

*

Inserts a word into the trie. This function is fastest if the words are
inserted in alphabetical order.

type TrieNode

type TrieNode struct {
	// contains filtered or unexported fields
}

*

A Trie node, for use in building the encoding trie. This is not needed for
the decoder.

Directories

Path Synopsis
example
A Succinct Trie for Go By Siong-Ui Te Released to the public domain.
A Succinct Trie for Go By Siong-Ui Te Released to the public domain.

Jump to

Keyboard shortcuts

? : This menu
/ : Search site
f or F : Jump to
y or Y : Canonical URL