document

package
v0.0.0-...-157c9c8
Warning

This package is not in the latest version of its module.
Published: Apr 26, 2024 License: GPL-3.0 Imports: 8 Imported by: 0

Documentation

Index

Constants

This section is empty.

Variables

This section is empty.

Functions

func Parse

func Parse(str string) (string, []string)

func Root

func Root(str string) string

Root finds the prefix of the string that contains only letters and numbers.
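
The prefix scan described above can be sketched as follows. This is a minimal illustration, not the package's implementation: rootSketch is a hypothetical name, and it assumes "letters and numbers" means unicode.IsLetter and unicode.IsDigit. It leaves casing untouched, which may differ from what Root actually does.

```go
package main

import (
	"fmt"
	"unicode"
)

// rootSketch is a hypothetical stand-in for Root. It returns the leading run
// of letters and digits in str and stops at the first other rune.
func rootSketch(str string) string {
	for i, r := range str {
		if !unicode.IsLetter(r) && !unicode.IsDigit(r) {
			return str[:i] // i is the byte offset of the first non-alphanumeric rune
		}
	}
	return str // the whole string is alphanumeric
}

func main() {
	fmt.Printf("%q\n", rootSketch("hello, world")) // "hello"
	fmt.Printf("%q\n", rootSketch("abc123!?"))     // "abc123"
	fmt.Printf("%q\n", rootSketch("..."))          // ""
}
```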

Types

type ChangeSet

type ChangeSet[T any] struct {
	Added, Removed slice.Slice[T]
}

ChangeSet lists the IDs that were added to and removed from a Document.
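
To make the idea of a change set concrete, here is a rough sketch that compares the IDs present before and after an edit. It is an assumption-laden illustration: changeSetSketch uses plain slices rather than slice.Slice[T], diffIDs is a hypothetical helper, and the package may compute its ChangeSet quite differently.

```go
package main

import (
	"fmt"
	"sort"
)

// changeSetSketch mirrors the shape of ChangeSet but uses plain slices.
// It is an illustration, not the package type.
type changeSetSketch[T any] struct {
	Added, Removed []T
}

// diffIDs (hypothetical) reports which IDs appear only in after (Added)
// and which appear only in before (Removed).
func diffIDs(before, after []int) changeSetSketch[int] {
	old := make(map[int]bool, len(before))
	for _, id := range before {
		old[id] = true
	}
	cur := make(map[int]bool, len(after))
	var cs changeSetSketch[int]
	for _, id := range after {
		if cur[id] {
			continue // already handled this ID
		}
		cur[id] = true
		if !old[id] {
			cs.Added = append(cs.Added, id)
		}
	}
	for id := range old {
		if !cur[id] {
			cs.Removed = append(cs.Removed, id)
		}
	}
	// Map iteration order is random, so sort for stable output.
	sort.Ints(cs.Added)
	sort.Ints(cs.Removed)
	return cs
}

func main() {
	cs := diffIDs([]int{1, 2, 3}, []int{2, 3, 4, 5})
	fmt.Println(cs.Added, cs.Removed) // [4 5] [1]
}
```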

type Decoder

type Decoder[WordID, VariantID any] interface {
	IDToWord(WordID) string
	IDToVariant(VariantID) Variant
}

Decoder supplies the necessary decoding information to translate IDs into strings.
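
One way to satisfy this interface is with lookup tables indexed by ID. The sketch below copies the Decoder and Variant declarations from this page so that it compiles on its own; tableDecoder, the choice of uint32 IDs, and the Variant bytes are all hypothetical.

```go
package main

import "fmt"

// Variant and Decoder are copied from the declarations on this page.
type Variant []byte

type Decoder[WordID, VariantID any] interface {
	IDToWord(WordID) string
	IDToVariant(VariantID) Variant
}

// tableDecoder is a hypothetical Decoder backed by two slices.
// IDs are treated as indexes into those slices.
type tableDecoder struct {
	words    []string
	variants []Variant
}

func (d tableDecoder) IDToWord(id uint32) string     { return d.words[id] }
func (d tableDecoder) IDToVariant(id uint32) Variant { return d.variants[id] }

// Compile-time check that tableDecoder satisfies the interface.
var _ Decoder[uint32, uint32] = tableDecoder{}

func main() {
	dec := tableDecoder{
		words:    []string{"the", "cat"},
		variants: []Variant{{0}}, // placeholder bytes; the real encoding is not shown here
	}
	fmt.Println(dec.IDToWord(1)) // cat
}
```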

type Document

type Document[WordID, VariantID comparable] struct {
	Start            string
	ByteLen, WordLen int
	// Words holds the root words present in the document. This slice can
	// be reordered without affecting the encoding.
	Words    []Locations[WordID]
	Variants *huffslice.Slice[VariantID]
}

Document is a string encoded as its root words, together with the variants needed to restore the original text. This makes it fast to identify which words a document contains.

func (*Document[WordID, VariantID]) WordIDs

func (doc *Document[WordID, VariantID]) WordIDs() []WordID

WordIDs returns a slice with all the WordIDs in the document.
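
Given the Words field above, WordIDs plausibly collects the ID from each entry. The following is a guess at that behaviour, not the actual method body: wordIDsSketch is a hypothetical free function, and the order of the real result is not documented.

```go
package main

import "fmt"

// Locations is copied from the declaration on this page.
type Locations[T comparable] struct {
	ID   T
	Idxs []uint32
}

// wordIDsSketch (hypothetical) returns the ID of every entry in words,
// in the order the entries are stored.
func wordIDsSketch[T comparable](words []Locations[T]) []T {
	ids := make([]T, 0, len(words))
	for _, w := range words {
		ids = append(ids, w.ID)
	}
	return ids
}

func main() {
	words := []Locations[uint32]{
		{ID: 7, Idxs: []uint32{0, 2}},
		{ID: 3, Idxs: []uint32{1}},
	}
	fmt.Println(wordIDsSketch(words)) // [7 3]
}
```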

type DocumentDecoder

type DocumentDecoder[WordID, VariantID comparable] struct {
	Decoder[WordID, VariantID]
	WordSingleToken WordID
	VarSingleToken  VariantID
}

DocumentDecoder can decode a Document into a string.

func (DocumentDecoder[WordID, VariantID]) Decode

func (dec DocumentDecoder[WordID, VariantID]) Decode(doc *Document[WordID, VariantID]) string

Decode converts a Document back into a string.

type DocumentEncoder

type DocumentEncoder[WordID, VariantID comparable] struct {
	Encoder[WordID, VariantID]
	Splitter        func(string) (string, []string)
	RootVariant     func(string) (string, Variant)
	WordSingleToken WordID
	VarSingleToken  VariantID
}

DocumentEncoder can encode a string into a Document.

func (DocumentEncoder[WordID, VariantID]) Build

func (enc DocumentEncoder[WordID, VariantID]) Build(str string) *Document[WordID, VariantID]

Build takes a string and encodes it as a Document.

func (DocumentEncoder[WordID, VariantID]) Update

func (enc DocumentEncoder[WordID, VariantID]) Update(doc *Document[WordID, VariantID], str string) *ChangeSet[WordID]

Update re-encodes doc from str and returns a ChangeSet of the word IDs that were added and removed.

type Encoder

type Encoder[WordID, VariantID any] interface {
	WordToID(string) WordID
	VariantToID(Variant) VariantID
}

Encoder supplies the necessary encoding information to translate strings into IDs.
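
A simple Encoder could assign a fresh ID the first time it sees each word or variant. The sketch below takes that approach under stated assumptions: internEncoder and newInternEncoder are hypothetical, IDs are uint32, and variants are keyed by string(v) because a []byte cannot be a map key. The Encoder and Variant declarations are copied from this page.

```go
package main

import "fmt"

// Variant and Encoder are copied from the declarations on this page.
type Variant []byte

type Encoder[WordID, VariantID any] interface {
	WordToID(string) WordID
	VariantToID(Variant) VariantID
}

// internEncoder is a hypothetical Encoder that hands out sequential IDs.
type internEncoder struct {
	wordIDs map[string]uint32
	varIDs  map[string]uint32 // keyed by string(Variant)
}

func newInternEncoder() *internEncoder {
	return &internEncoder{
		wordIDs: make(map[string]uint32),
		varIDs:  make(map[string]uint32),
	}
}

// intern returns the existing ID for key or assigns the next free one.
func intern(m map[string]uint32, key string) uint32 {
	id, ok := m[key]
	if !ok {
		id = uint32(len(m))
		m[key] = id
	}
	return id
}

func (e *internEncoder) WordToID(w string) uint32     { return intern(e.wordIDs, w) }
func (e *internEncoder) VariantToID(v Variant) uint32 { return intern(e.varIDs, string(v)) }

// Compile-time check that *internEncoder satisfies the interface.
var _ Encoder[uint32, uint32] = (*internEncoder)(nil)

func main() {
	enc := newInternEncoder()
	fmt.Println(enc.WordToID("the"), enc.WordToID("cat"), enc.WordToID("the")) // 0 1 0
}
```

Sequential IDs keep the tables dense, which pairs naturally with a slice-backed Decoder, but nothing on this page requires that choice.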

type Locations

type Locations[T comparable] struct {
	ID   T
	Idxs []uint32
}

Locations holds an ID and the indexes at which that ID occurs.
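
As an illustration of this structure, the sketch below groups a sequence of IDs into one Locations entry per distinct ID. It assumes that Idxs holds positions in the sequence of words, which this page does not state explicitly; locate is a hypothetical helper and the Locations declaration is copied from above.

```go
package main

import "fmt"

// Locations is copied from the declaration on this page.
type Locations[T comparable] struct {
	ID   T
	Idxs []uint32
}

// locate (hypothetical) builds one Locations entry per distinct ID,
// recording every position at which that ID appears in ids.
func locate[T comparable](ids []T) []Locations[T] {
	pos := make(map[T]int) // ID -> index of its entry in out
	var out []Locations[T]
	for i, id := range ids {
		j, ok := pos[id]
		if !ok {
			j = len(out)
			pos[id] = j
			out = append(out, Locations[T]{ID: id})
		}
		out[j].Idxs = append(out[j].Idxs, uint32(i))
	}
	return out
}

func main() {
	fmt.Println(locate([]string{"the", "cat", "the", "mat"}))
	// [{the [0 2]} {cat [1]} {mat [3]}]
}
```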

type Variant

type Variant []byte

Variant encodes the casing of a word and the non-alphanumeric characters that follow the word.

func RootVariant

func RootVariant(str string) (string, Variant)

RootVariant finds the prefix of the string that contains only letters and numbers, along with the Variant needed to convert the root back to the original input.

func (Variant) Apply

func (v Variant) Apply(root string, buf []byte) []byte

Apply applies a variant to a word. The root is expected to be all lower case. The casing is changed according to the variant, and the non-alphanumeric runes are appended.
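
The byte layout of Variant is not documented on this page, so the following sketch uses an invented one purely to show the append-to-buffer pattern: byte 0 is a casing flag and the remaining bytes are the trailing text. toyVariant, its constants, and apply are all hypothetical, and the real encoding is likely different.

```go
package main

import (
	"fmt"
	"strings"
	"unicode"
	"unicode/utf8"
)

// toyVariant uses an INVENTED layout: v[0] is a casing flag and v[1:] is the
// trailing non-alphanumeric text. It is not the package's Variant encoding.
type toyVariant []byte

const (
	caseLower = iota // leave the root as is
	caseTitle        // upper-case the first rune
	caseUpper        // upper-case every rune
)

// apply appends the restored word to buf and returns the extended slice,
// mirroring the (root, buf) -> []byte shape of Variant.Apply.
func (v toyVariant) apply(root string, buf []byte) []byte {
	if len(v) == 0 {
		return append(buf, root...)
	}
	switch v[0] {
	case caseUpper:
		buf = append(buf, strings.ToUpper(root)...)
	case caseTitle:
		if root != "" {
			r, size := utf8.DecodeRuneInString(root)
			buf = utf8.AppendRune(buf, unicode.ToUpper(r))
			buf = append(buf, root[size:]...)
		}
	default:
		buf = append(buf, root...)
	}
	return append(buf, v[1:]...) // trailing punctuation and spacing
}

func main() {
	buf := toyVariant{caseTitle, ',', ' '}.apply("hello", nil)
	buf = toyVariant{caseLower, '!'}.apply("world", buf)
	fmt.Println(string(buf)) // Hello, world!
}
```

Passing buf through successive calls lets a decoder rebuild a whole document in one growing byte slice instead of allocating a string per word.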
