Documentation ¶
Overview ¶
Package succinct provides several succinct data types.
Index ¶
Examples ¶
Constants ¶
This section is empty.
Variables ¶
This section is empty.
Functions ¶
This section is empty.
Types ¶
type Set ¶
type Set struct {
// contains filtered or unexported fields
}
Set is a succinct, sorted and static string set impl with compacted trie as storage. The space cost is about half lower than the original data.
Implementation ¶
It stores sorted strings in a compacted trie(AKA prefix tree). A trie node has at most 256 outgoing labels. A label is just a single byte. E.g., [ab, abc, abcd, axy, buv] is represented with a trie like the following: (Numbers are node id)
^ -a-> 1 -b-> 3 $ | | `c-> 6 $ | | `d-> 9 $ | `x-> 4 -y-> 7 $ `b-> 2 -u-> 5 -v-> 8 $
Internally it uses a packed []byte and a bitmap with `len([]byte)` bits to describe the outgoing labels of a node,:
^: ab 00 1: bx 00 2: u 0 3: c 0 4: y 0 5: v 0 6: d 0 7: ø 8: ø 9: ø
In storage it packs labels together and bitmaps joined with separator `1`:
labels(ignore space): "ab bx u c y v d" label bitmap: 0010010101010101111
Finally leaf nodes are indicated by another bitmap `leaves`, in which a `1` at i-th bit indicates the i-th node is a leaf:
leaves: 0001001111
func NewSet ¶
NewSet creates a new *Set struct, from a slice of sorted strings.
Example ¶
keys := []string{ "A", "Aani", "Aaron", "Aaronic", "Aaronical", "Aaronite", "Aaronitic", "Aaru", "Ab", "Ababdeh", "Ababua", "Abadite", } s := NewSet(keys) for _, k := range []string{"Aani", "Foo", "Ababdeh"} { found := s.Has(k) fmt.Printf("lookup %10s, found: %v\n", k, found) }
Output: lookup Aani, found: true lookup Foo, found: false lookup Ababdeh, found: true
Example (Memory) ¶
keys := testkeys.Load("200kweb2") original := 0 for _, k := range keys { original += len(k) } s := NewSet(keys) fmt.Println("With", len(keys)/1000, "thousands keys:") fmt.Println(" Original size:", original/1024, "KB") fmt.Println(" Compressed size:", size.Of(s)/1024, "KB, ratio:", size.Of(s)*100/original, "%") fmt.Println("Memory layout:") fmt.Println(size.Stat(s, 10, 1))
Output: With 235 thousands keys: Original size: 2204 KB Compressed size: 1258 KB, ratio: 57 % Memory layout: *succinct.Set: 1288412 succinct.Set: 1288404 leaves: []uint64: 99128 0: uint64: 8 labelBitmap: []uint64: 198224 0: uint64: 8 labels: []uint8: 792800 0: uint8: 1 ranks: []int32: 99128 0: int32: 4 selects: []int32: 99124 0: int32: 4