trie

package module
v0.0.1 Latest Latest
Warning

This package is not in the latest version of its module.

Go to latest
Published: Mar 22, 2023 License: MIT Imports: 4 Imported by: 0

README

trie godoc Build

An implementation of the Trie data structure in Go. It provides more features than the usual Trie prefix-search, and is meant to be used for auto-completion.

Demo

A WebAssembly demo can be tried at shivamMg.github.io/trie.

Features

  • Keys are []string instead of string, thereby supporting more use cases - e.g. []string{the quick brown fox} can be a key where each word will be a node in the Trie
  • Support for Put key and Delete key
  • Support for Prefix search - e.g. searching for nation might return nation, national, nationalism, nationalist, etc.
  • Support for Edit distance search (aka Levenshtein distance) - e.g. searching for wheat might return similar looking words like wheat, cheat, heat, what, etc.
  • Order of search results is deterministic. It follows insertion order.

Examples

tri := trie.New()
// Put keys ([]string) and values (any)
tri.Put([]string{"the"}, 1)
tri.Put([]string{"the", "quick", "brown", "fox"}, 2)
tri.Put([]string{"the", "quick", "sports", "car"}, 3)
tri.Put([]string{"the", "green", "tree"}, 4)
tri.Put([]string{"an", "apple", "tree"}, 5)
tri.Put([]string{"an", "umbrella"}, 6)

tri.Root().Print()
// Output (full trie with terminals ending with ($)):
// ^
// ├─ the ($)
// │  ├─ quick
// │  │  ├─ brown
// │  │  │  └─ fox ($)
// │  │  └─ sports
// │  │     └─ car ($)
// │  └─ green
// │     └─ tree ($)
// └─ an
//    ├─ apple
//    │  └─ tree ($)
//    └─ umbrella ($)

results := tri.Search([]string{"the", "quick"})
for _, res := range results.Results {
	fmt.Println(res.Key, res.Value)
}
// Output (prefix-based search):
// [the quick brown fox] 2
// [the quick sports car] 3

key := []string{"the", "tree"}
results = tri.Search(key, trie.WithMaxEditDistance(2), // An edit can be insert, delete, replace
	trie.WithEditOps())
for _, res := range results.Results {
	fmt.Println(res.Key, res.EditDistance) // EditDistance is number of edits needed to convert to [the tree]
}
// Output (results not more than 2 edits away from [the tree]):
// [the] 1
// [the green tree] 1
// [an apple tree] 2
// [an umbrella] 2

result := results.Results[2]
fmt.Printf("To convert %v to %v:\n", result.Key, key)
printEditOps(result.EditOps)
// Output (edit operations needed to covert a result to [the tree]):
// To convert [an apple tree] to [the tree]:
// - delete "an"
// - replace "apple" with "the"
// - don't edit "tree"

results = tri.Search(key, trie.WithMaxEditDistance(2), trie.WithTopKLeastEdited(), trie.WithMaxResults(2))
for _, res := range results.Results {
	fmt.Println(res.Key, res.Value, res.EditDistance)
}
// Output (top 2 least edited results):
// [the] 1 1
// [the green tree] 4 1

References

Documentation

Overview

Example
package main

import (
	"fmt"

	"github.com/shivamMg/trie"
)

func printEditOps(ops []*trie.EditOp) {
	for _, op := range ops {
		switch op.Type {
		case trie.EditOpTypeNoEdit:
			fmt.Printf("- don't edit %q\n", op.KeyPart)
		case trie.EditOpTypeInsert:
			fmt.Printf("- insert %q\n", op.KeyPart)
		case trie.EditOpTypeDelete:
			fmt.Printf("- delete %q\n", op.KeyPart)
		case trie.EditOpTypeReplace:
			fmt.Printf("- replace %q with %q\n", op.KeyPart, op.ReplaceWith)
		}
	}
}

func main() {
	tri := trie.New()
	// Put keys ([]string) and values (any)
	tri.Put([]string{"the"}, 1)
	tri.Put([]string{"the", "quick", "brown", "fox"}, 2)
	tri.Put([]string{"the", "quick", "sports", "car"}, 3)
	tri.Put([]string{"the", "green", "tree"}, 4)
	tri.Put([]string{"an", "apple", "tree"}, 5)
	tri.Put([]string{"an", "umbrella"}, 6)

	tri.Root().Print()
	// Output (full trie with terminals ending with ($)):
	// ^
	// ├─ the ($)
	// │  ├─ quick
	// │  │  ├─ brown
	// │  │  │  └─ fox ($)
	// │  │  └─ sports
	// │  │     └─ car ($)
	// │  └─ green
	// │     └─ tree ($)
	// └─ an
	//    ├─ apple
	//    │  └─ tree ($)
	//    └─ umbrella ($)

	results := tri.Search([]string{"the", "quick"})
	for _, res := range results.Results {
		fmt.Println(res.Key, res.Value)
	}
	// Output (prefix-based search):
	// [the quick brown fox] 2
	// [the quick sports car] 3

	key := []string{"the", "tree"}
	results = tri.Search(key, trie.WithMaxEditDistance(2), // An edit can be insert, delete, replace
		trie.WithEditOps())
	for _, res := range results.Results {
		fmt.Println(res.Key, res.EditDistance) // EditDistance is number of edits needed to convert to [the tree]
	}
	// Output (results not more than 2 edits away from [the tree]):
	// [the] 1
	// [the green tree] 1
	// [an apple tree] 2
	// [an umbrella] 2

	result := results.Results[2]
	fmt.Printf("To convert %v to %v:\n", result.Key, key)
	printEditOps(result.EditOps)
	// Output (edit operations needed to covert a result to [the tree]):
	// To convert [an apple tree] to [the tree]:
	// - delete "an"
	// - replace "apple" with "the"
	// - don't edit "tree"

	results = tri.Search(key, trie.WithMaxEditDistance(2), trie.WithTopKLeastEdited(), trie.WithMaxResults(2))
	for _, res := range results.Results {
		fmt.Println(res.Key, res.Value, res.EditDistance)
	}
	// Output (top 2 least edited results):
	// [the] 1 1
	// [the green tree] 4 1
}
Output:

Index

Examples

Constants

View Source
const (
	RootKeyPart = "^"
)

Variables

This section is empty.

Functions

func WithEditOps

func WithEditOps() func(*SearchOptions)

WithEditOps can be passed to Search() alongside WithMaxEditDistance(). When passed, Search() also returns EditOps for each SearchResult. EditOps can be used to determine the minimum number of edit operations needed to convert a result Key into the Search()-ed key.

e.g. Searching for "wheat" in a Trie that stores English words might return "beat". EditOps for this result might be: 1. insert "w" 2. replace "b" with "h".

There might be multiple ways to edit a key into another. EditOps represents only one.

Computing EditOps makes Search() slower.

func WithExactKey

func WithExactKey() func(*SearchOptions)

WithExactKey can be passed to Search(). When passed, Search() returns just the result with Key=Search()-ed key. If the key does not exist, result list will be empty.

func WithMaxEditDistance

func WithMaxEditDistance(maxDistance int) func(*SearchOptions)

WithMaxEditDistance can be passed to Search(). When passed, Search() changes its default behaviour from Prefix search to Edit distance search. It can be used to return "Approximate" results instead of strict Prefix search results.

maxDistance is the maximum number of edits allowed on Trie keys to consider them as a SearchResult. Higher the maxDistance, more lenient and slower the search becomes.

e.g. If a Trie stores English words, then searching for "wheat" with maxDistance=1 might return similar looking words like "wheat", "cheat", "heat", "what", etc. With maxDistance=2 it might also return words like "beat", "ahead", etc.

Read about Edit distance: https://en.wikipedia.org/wiki/Edit_distance

func WithMaxResults

func WithMaxResults(maxResults int) func(*SearchOptions)

WithMaxResults can be passed to Search(). When passed, Search() will return at most maxResults number of results.

func WithTopKLeastEdited

func WithTopKLeastEdited() func(*SearchOptions)

WithTopKLeastEdited can be passed to Search() alongside WithMaxEditDistance() and WithMaxResults(). When passed, Search() returns maxResults number of results that have the lowest EditDistances. Results are sorted on EditDistance (lowest to highest).

e.g. In a Trie that stores English words searching for "wheat" might return "wheat" (EditDistance=0), "cheat" (EditDistance=1), "beat" (EditDistance=2) - in that order.

Types

type EditOp

type EditOp struct {
	Type EditOpType
	// KeyPart:
	// - In case of NoEdit, KeyPart is to be retained.
	// - In case of Insert, KeyPart is to be inserted in the key.
	// - In case of Delete/Replace, KeyPart is the part of the key on which delete/replace is performed.
	KeyPart string
	// ReplaceWith is set for Type=EditOpTypeReplace
	ReplaceWith string
}

EditOp represents an Edit Operation.

type EditOpType

type EditOpType int
const (
	EditOpTypeNoEdit EditOpType = iota
	EditOpTypeInsert
	EditOpTypeDelete
	EditOpTypeReplace
)

type Node

type Node struct {
	// contains filtered or unexported fields
}

Node is a tree node inside Trie.

func (*Node) ChildNodes

func (n *Node) ChildNodes() []*Node

ChildNodes returns the child-nodes of this Node.

func (*Node) Children

func (n *Node) Children() []tree.Node

Children is used in Print(). Use ChildNodes() to get child-nodes of this Node.

func (*Node) Data

func (n *Node) Data() interface{}

Data is used in Print(). Use Value() to get value at this Node.

func (*Node) IsTerminal

func (n *Node) IsTerminal() bool

IsTerminal returns a boolean that tells whether a key ends at this Node.

func (*Node) KeyPart

func (n *Node) KeyPart() string

KeyPart returns the part (string) of the key ([]string) that this Node represents.

func (*Node) Print

func (n *Node) Print()

Print prints the tree rooted at this Node. A Trie's root node is printed as RootKeyPart. All the terminal nodes are suffixed with ($).

func (*Node) SetValue added in v0.0.1

func (n *Node) SetValue(value interface{})

SetValue sets the value for the key ending at this Node. If Node is not a terminal, value is not set.

func (*Node) Sprint

func (n *Node) Sprint() string

func (*Node) Value

func (n *Node) Value() interface{}

Value returns the value stored for the key ending at this Node. If Node is not a terminal, it returns nil.

type SearchOptions

type SearchOptions struct {
	// contains filtered or unexported fields
}

type SearchResult

type SearchResult struct {
	// Key is the key that was Put() into the Trie.
	Key []string
	// Value is the value that was Put() into the Trie.
	Value interface{}
	// EditDistance is the number of edits (insert/delete/replace) needed to convert Key into the Search()-ed key.
	EditDistance int
	// EditOps is the list of edit operations (see EditOpType) needed to convert Key into the Search()-ed key.
	EditOps []*EditOp
	// contains filtered or unexported fields
}

type SearchResults

type SearchResults struct {
	Results []*SearchResult
	// contains filtered or unexported fields
}

type Trie

type Trie struct {
	// contains filtered or unexported fields
}

Trie is the trie data structure.

func New

func New() *Trie

New returns a new instance of Trie.

func (*Trie) Delete

func (t *Trie) Delete(key []string) (value interface{}, existed bool)

Delete deletes key-value for the given key in the Trie. It returns (value, true) if the key existed, else (nil, false).

func (*Trie) Put

func (t *Trie) Put(key []string, value interface{}) (existed bool)

Put upserts value for the given key in the Trie. It returns a boolean depending on whether the key already existed or not.

func (*Trie) Root

func (t *Trie) Root() *Node

Root returns the root node of the Trie.

func (*Trie) Search

func (t *Trie) Search(key []string, options ...func(*SearchOptions)) *SearchResults

Search() takes a key and some options to return results (see SearchResult) from the Trie. Without any options, it does a Prefix search i.e. result Keys have the same prefix as key. Order of the results is deterministic and will follow the order in which Put() was called for the keys. See "With*" functions for options accepted by Search().

func (*Trie) Walk added in v0.0.1

func (t *Trie) Walk(key []string, walker WalkFunc) error

Walk traverses the Trie and calls walker function. If walker function returns an error, Walk early-returns with that error. Traversal follows insertion order.

type WalkFunc added in v0.0.1

type WalkFunc func(key []string, node *Node) error

Directories

Path Synopsis
demo

Jump to

Keyboard shortcuts

? : This menu
/ : Search site
f or F : Jump to
y or Y : Canonical URL