Documentation
Constants
This section is empty.
Variables
This section is empty.
Functions
Types
type Decoder
type Decoder[WordID, VariantID any] interface {
	IDToWord(WordID) string
	IDToVariant(VariantID) Variant
}
Decoder supplies the necessary decoding information to translate IDs into strings.
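As an illustration of how the Decoder interface might be satisfied, here is a minimal sketch that treats uint32 IDs as indexes into slices. The sliceDecoder type, the choice of uint32 for both ID types, and the sample data are assumptions made for this example; the Variant and Decoder declarations are copied locally only so the snippet compiles on its own.

```go
package main

import "fmt"

// Variant and Decoder are redeclared here so the sketch is self-contained.
type Variant []byte

type Decoder[WordID, VariantID any] interface {
	IDToWord(WordID) string
	IDToVariant(VariantID) Variant
}

// sliceDecoder is a hypothetical Decoder that uses each ID as a slice index.
// It panics on an out-of-range ID, which a real implementation may want to avoid.
type sliceDecoder struct {
	words    []string
	variants []Variant
}

func (d sliceDecoder) IDToWord(id uint32) string     { return d.words[id] }
func (d sliceDecoder) IDToVariant(id uint32) Variant { return d.variants[id] }

// Compile-time check that sliceDecoder satisfies the interface.
var _ Decoder[uint32, uint32] = sliceDecoder{}

func main() {
	d := sliceDecoder{
		words:    []string{"hello", "world"},
		variants: []Variant{{0}, {1}}, // placeholder bytes, not a real Variant encoding
	}
	fmt.Println(d.IDToWord(1))
}
```

Any lookup structure works as long as it returns the word and Variant that the matching Encoder assigned to each ID.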
type Document
type Document[WordID, VariantID comparable] struct {
	Start            string
	ByteLen, WordLen int
	// Words holds the root words present in the document. This slice can
	// be reordered without affecting the encoding.
	Words    []Locations[WordID]
	Variants *huffslice.Slice[VariantID]
}
Document is a string encoded as the root words. This makes identifying which words are in a document fast.
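Because Words can be reordered without changing the encoding, one way to make word lookups fast is to sort the slice by ID and binary-search it. The sketch below assumes uint32 word IDs and works on a bare []Locations slice rather than a full Document; sortWords and hasWord are hypothetical helpers, not part of the package.

```go
package main

import (
	"fmt"
	"sort"
)

// Locations is redeclared here so the sketch is self-contained.
type Locations[T comparable] struct {
	ID   T
	Idxs []uint32
}

// sortWords orders the entries by ID so they can be binary-searched.
func sortWords(words []Locations[uint32]) {
	sort.Slice(words, func(i, j int) bool { return words[i].ID < words[j].ID })
}

// hasWord reports whether id is present. It requires words to be sorted by ID.
func hasWord(words []Locations[uint32], id uint32) bool {
	i := sort.Search(len(words), func(i int) bool { return words[i].ID >= id })
	return i < len(words) && words[i].ID == id
}

func main() {
	words := []Locations[uint32]{
		{ID: 42, Idxs: []uint32{0, 3}},
		{ID: 7, Idxs: []uint32{1}},
		{ID: 19, Idxs: []uint32{2}},
	}
	sortWords(words)
	fmt.Println(hasWord(words, 19), hasWord(words, 20))
}
```

A linear scan works just as well for short documents; sorting pays off only when the same document is queried many times.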
type DocumentDecoder
type DocumentDecoder[WordID, VariantID comparable] struct {
	Decoder[WordID, VariantID]
	WordSingleToken WordID
	VarSingleToken  VariantID
}
DocumentDecoder can decode a Document into a string.
func (DocumentDecoder[WordID, VariantID]) Decode
func (dec DocumentDecoder[WordID, VariantID]) Decode(doc *Document[WordID, VariantID]) string
Decode decodes a Document into a string.
type DocumentEncoder
type DocumentEncoder[WordID, VariantID comparable] struct {
	Encoder[WordID, VariantID]
	Splitter        func(string) (string, []string)
	RootVariant     func(string) (string, Variant)
	WordSingleToken WordID
	VarSingleToken  VariantID
}
DocumentEncoder can encode a string into a Document.
func (DocumentEncoder[WordID, VariantID]) Build
func (enc DocumentEncoder[WordID, VariantID]) Build(str string) *Document[WordID, VariantID]
Build takes a string and encodes it as a Document.
type Encoder
type Encoder[WordID, VariantID any] interface {
	WordToID(string) WordID
	VariantToID(Variant) VariantID
}
Encoder supplies the necessary encoding information to translate strings into IDs.
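One possible way to satisfy Encoder is a pair of maps that hand out the next free ID the first time a word or variant is seen. This is only a sketch: mapEncoder, newMapEncoder, and the use of uint32 IDs are assumptions for illustration. Because Variant is a byte slice and therefore cannot be a map key, the sketch keys variants by their string conversion.

```go
package main

import "fmt"

// Variant and Encoder are redeclared here so the sketch is self-contained.
type Variant []byte

type Encoder[WordID, VariantID any] interface {
	WordToID(string) WordID
	VariantToID(Variant) VariantID
}

// mapEncoder is a hypothetical Encoder that assigns sequential IDs on first use.
type mapEncoder struct {
	words    map[string]uint32
	variants map[string]uint32 // keyed by string(Variant); slices are not comparable
}

func newMapEncoder() *mapEncoder {
	return &mapEncoder{words: map[string]uint32{}, variants: map[string]uint32{}}
}

// lookup returns the existing ID for key, or assigns the next free one.
func lookup(m map[string]uint32, key string) uint32 {
	if id, ok := m[key]; ok {
		return id
	}
	id := uint32(len(m))
	m[key] = id
	return id
}

func (e *mapEncoder) WordToID(w string) uint32     { return lookup(e.words, w) }
func (e *mapEncoder) VariantToID(v Variant) uint32 { return lookup(e.variants, string(v)) }

// Compile-time check that *mapEncoder satisfies the interface.
var _ Encoder[uint32, uint32] = (*mapEncoder)(nil)

func main() {
	enc := newMapEncoder()
	fmt.Println(enc.WordToID("hello"), enc.WordToID("world"), enc.WordToID("hello"))
}
```

An encoder like this is not safe for concurrent use, and a matching Decoder would need the inverse tables to translate the IDs back.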
type Locations
type Locations[T comparable] struct {
	ID   T
	Idxs []uint32
}
Locations holds an ID and the indexes at which that ID occurs.
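To make the shape of this data concrete, the sketch below groups a sequence of IDs into Locations values, recording every position at which each ID appears. The collect helper is hypothetical, and treating Idxs as zero-based positions in the sequence is an assumption; the package may count indexes differently.

```go
package main

import "fmt"

// Locations is redeclared here so the sketch is self-contained.
type Locations[T comparable] struct {
	ID   T
	Idxs []uint32
}

// collect groups positions by ID, keeping IDs in order of first appearance.
func collect[T comparable](ids []T) []Locations[T] {
	pos := make(map[T]int) // ID -> index of its entry in out
	var out []Locations[T]
	for i, id := range ids {
		j, ok := pos[id]
		if !ok {
			j = len(out)
			pos[id] = j
			out = append(out, Locations[T]{ID: id})
		}
		out[j].Idxs = append(out[j].Idxs, uint32(i))
	}
	return out
}

func main() {
	for _, loc := range collect([]string{"the", "cat", "saw", "the", "dog"}) {
		fmt.Println(loc.ID, loc.Idxs)
	}
}
```

Each distinct ID is stored once however often it repeats, which is what keeps a membership check cheap.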
type Variant
type Variant []byte
Variant encodes the casing of a word and the non-alphanumeric characters that follow the word.
func RootVariant
RootVariant finds the prefix of the string containing letters and numbers, and the Variant needed to convert the root back to the original input.
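The prefix-finding half of this description can be sketched as below. The helper splitRoot is hypothetical and is not the package's RootVariant: it returns the leftover text as a plain string instead of a Variant, because the byte layout of Variant is not documented here. Lowercasing the root and treating only Unicode letters and digits as alphanumeric are also assumptions.

```go
package main

import (
	"fmt"
	"strings"
	"unicode"
)

// splitRoot returns the lowercased leading run of letters and digits in s,
// plus whatever follows it. It is an illustrative stand-in only.
func splitRoot(s string) (root, rest string) {
	end := len(s)
	for i, r := range s {
		if !unicode.IsLetter(r) && !unicode.IsDigit(r) {
			end = i // byte offset of the first non-alphanumeric rune
			break
		}
	}
	return strings.ToLower(s[:end]), s[end:]
}

func main() {
	root, rest := splitRoot("Hello, ")
	fmt.Printf("%q %q\n", root, rest)
}
```

A real Variant would also have to record which letters were uppercase, so that the original casing can be restored from the root.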