package text

v1.3.0
Published: Mar 23, 2024 | License: GPL-3.0 | Imports: 9 | Imported by: 0

Documentation

Constants

This section is empty.

Variables

var (
	ErrInvalidUtf8 = fmt.Errorf("invalid UTF-8")
)

Functions

func Repeat

func Repeat(c rune, n int) string

Repeat creates a string with the same character repeated n times.

func Reverse

func Reverse(s string) string

Reverse reverses the bytes of a string. The result may not be valid UTF-8.
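
To see why a byte-wise reversal can break UTF-8, consider the following minimal sketch. The helper `reverseBytes` is a hypothetical stand-in written for illustration; it is not taken from the package source.

```go
package main

import (
	"fmt"
	"unicode/utf8"
)

// reverseBytes is a hypothetical stand-in for Reverse: it reverses the
// bytes of s without regard to rune boundaries.
func reverseBytes(s string) string {
	b := []byte(s)
	for i, j := 0, len(b)-1; i < j; i, j = i+1, j-1 {
		b[i], b[j] = b[j], b[i]
	}
	return string(b)
}

func main() {
	// ASCII text survives because every rune is a single byte.
	ascii := reverseBytes("abc")
	fmt.Println(ascii, utf8.ValidString(ascii))

	// "é" is two bytes in UTF-8 (0xC3 0xA9). Swapping them puts a
	// continuation byte first, which is not valid UTF-8.
	multi := reverseBytes("é")
	fmt.Println(utf8.ValidString(multi))
}
```

Callers that need a rune-wise reversal would have to decode the string into runes first; the sketch only illustrates the byte-wise behavior described above.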

func ToggleRuneCase

func ToggleRuneCase(r rune) rune

ToggleRuneCase switches the case of a rune: a lowercase rune becomes uppercase, and an uppercase rune becomes lowercase.
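
One way to get this behavior with the standard `unicode` package is sketched below. The function `toggleCase` is hypothetical, and returning runes without a case unchanged is an assumption; the documentation above does not say what ToggleRuneCase does with them.

```go
package main

import (
	"fmt"
	"unicode"
)

// toggleCase is a hypothetical stand-in for ToggleRuneCase.
func toggleCase(r rune) rune {
	switch {
	case unicode.IsUpper(r):
		return unicode.ToLower(r)
	case unicode.IsLower(r):
		return unicode.ToUpper(r)
	default:
		// Assumption: runes without a case are returned unchanged.
		return r
	}
}

func main() {
	for _, r := range "aZ7é" {
		fmt.Printf("%c -> %c\n", r, toggleCase(r))
	}
}
```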

Types

type LineMatch added in v0.5.0

type LineMatch struct {
	LeftLineNum  uint64
	RightLineNum uint64
}

LineMatch represents a line matched between two documents.

func Align added in v0.5.0

func Align(leftReader, rightReader io.Reader) ([]LineMatch, error)

Align matches lines in the left document with identical lines in the right document.

type Reader added in v0.3.0

type Reader struct {
	// contains filtered or unexported fields
}

Reader reads UTF-8 bytes from a text.Tree. It implements io.Reader. Copying the struct produces a new, independent reader. text.Tree is NOT thread-safe, so reading from a tree while modifying it is undefined behavior!

func (*Reader) Read added in v0.3.0

func (r *Reader) Read(b []byte) (int, error)

Read implements io.Reader.

func (*Reader) ReadRune added in v0.3.0

func (r *Reader) ReadRune() (rune, int, error)

ReadRune implements io.RuneReader. If the next bytes in the reader are not valid UTF-8, it returns ErrInvalidUtf8. If there are no more bytes to read, it returns io.EOF.
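
The contract described here (a decoded rune, a dedicated error for malformed bytes, and io.EOF at the end) can be sketched over a plain byte slice. The type `sliceRuneReader` and the variable `errInvalidUTF8` are hypothetical names for illustration; the real Reader reads from a Tree, whose internals are not shown on this page.

```go
package main

import (
	"errors"
	"fmt"
	"io"
	"unicode/utf8"
)

// errInvalidUTF8 plays the role of ErrInvalidUtf8 (hypothetical name).
var errInvalidUTF8 = errors.New("invalid UTF-8")

// sliceRuneReader is a hypothetical reader over an in-memory byte slice.
type sliceRuneReader struct {
	buf []byte
}

// ReadRune decodes one rune, satisfying io.RuneReader.
func (r *sliceRuneReader) ReadRune() (rune, int, error) {
	if len(r.buf) == 0 {
		return 0, 0, io.EOF
	}
	c, size := utf8.DecodeRune(r.buf)
	if c == utf8.RuneError && size == 1 {
		// DecodeRune reports malformed input as (RuneError, 1).
		return 0, 0, errInvalidUTF8
	}
	r.buf = r.buf[size:]
	return c, size, nil
}

func main() {
	var rr io.RuneReader = &sliceRuneReader{buf: []byte("hé")}
	for {
		c, size, err := rr.ReadRune()
		if err != nil {
			fmt.Println("stopped:", err)
			break
		}
		fmt.Printf("%c (%d bytes)\n", c, size)
	}
}
```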

type ReverseReader added in v0.3.0

type ReverseReader struct {
	Reader
}

ReverseReader reads bytes in reverse order.

func (*ReverseReader) Read added in v0.3.0

func (r *ReverseReader) Read(b []byte) (int, error)

Read implements io.Reader.

func (*ReverseReader) ReadRune added in v0.3.0

func (r *ReverseReader) ReadRune() (rune, int, error)

ReadRune implements io.RuneReader.

type RuneStack added in v1.3.0

type RuneStack struct {
	// contains filtered or unexported fields
}

RuneStack represents a string with efficient operations to push/pop runes. The zero value is equivalent to an empty string.

func (*RuneStack) Len added in v1.3.0

func (rs *RuneStack) Len() int

func (*RuneStack) Pop added in v1.3.0

func (rs *RuneStack) Pop() (bool, rune)

func (*RuneStack) Push added in v1.3.0

func (rs *RuneStack) Push(r rune)

func (*RuneStack) String added in v1.3.0

func (rs *RuneStack) String() string
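
The RuneStack methods are undocumented on this page, so the following is only a sketch of how such a type might behave, using a rune slice. The type `runeStack` is hypothetical. Two points are assumptions: that the bool returned by Pop reports whether a rune was available, and that Len counts runes rather than bytes.

```go
package main

import "fmt"

// runeStack is a hypothetical stand-in for RuneStack, backed by a rune
// slice. Its zero value is an empty stack, ready to use.
type runeStack struct {
	runes []rune
}

// Push appends a rune to the end of the string.
func (rs *runeStack) Push(r rune) { rs.runes = append(rs.runes, r) }

// Pop removes and returns the most recently pushed rune.
// Assumption: the bool is false when the stack is empty.
func (rs *runeStack) Pop() (bool, rune) {
	if len(rs.runes) == 0 {
		return false, 0
	}
	last := rs.runes[len(rs.runes)-1]
	rs.runes = rs.runes[:len(rs.runes)-1]
	return true, last
}

// Len returns the number of runes (assumption: runes, not bytes).
func (rs *runeStack) Len() int { return len(rs.runes) }

// String returns the pushed runes in push order.
func (rs *runeStack) String() string { return string(rs.runes) }

func main() {
	var rs runeStack
	for _, r := range "héllo" {
		rs.Push(r)
	}
	ok, r := rs.Pop()
	fmt.Println(ok, string(r), rs.String(), rs.Len())
}
```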

type Searcher

type Searcher struct {
	// contains filtered or unexported fields
}

Searcher searches for an exact match of a query. It uses the Knuth-Morris-Pratt algorithm, which runs in O(n+m) time, where n is the length of the text and m is the length of the query.
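
For readers unfamiliar with Knuth-Morris-Pratt, here is a compact sketch of the algorithm over runes. The functions `buildPrefixTable` and `kmpIndex` are hypothetical and operate on strings rather than an io.Reader; they are not the package's implementation. Treating an empty query as "no match" is also an assumption.

```go
package main

import "fmt"

// buildPrefixTable returns, for each index i, the length of the longest
// proper prefix of q[:i+1] that is also a suffix of it.
func buildPrefixTable(q []rune) []int {
	table := make([]int, len(q))
	k := 0
	for i := 1; i < len(q); i++ {
		for k > 0 && q[i] != q[k] {
			k = table[k-1]
		}
		if q[i] == q[k] {
			k++
		}
		table[i] = k
	}
	return table
}

// kmpIndex returns the rune offset of the first occurrence of query in
// text. Each text rune is examined once, so the scan never backs up.
func kmpIndex(text, query string) (bool, uint64) {
	q := []rune(query)
	if len(q) == 0 {
		return false, 0 // assumption: an empty query never matches
	}
	table := buildPrefixTable(q)
	k := 0           // number of query runes matched so far
	pos := uint64(0) // rune position of the current text rune
	for _, c := range text {
		for k > 0 && c != q[k] {
			k = table[k-1]
		}
		if c == q[k] {
			k++
		}
		if k == len(q) {
			return true, pos + 1 - uint64(len(q))
		}
		pos++
	}
	return false, 0
}

func main() {
	fmt.Println(kmpIndex("ababcabcabababd", "ababd")) // true 10
	fmt.Println(kmpIndex("héllo wörld", "wörld"))     // true 6
	fmt.Println(kmpIndex("abc", "xyz"))               // false 0
}
```

Because the text is consumed strictly left to right, the same loop works on a stream of runes, which is what makes the algorithm a natural fit for a reader-based API.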

func NewSearcher

func NewSearcher(query string) *Searcher

func (*Searcher) LastInReader added in v0.5.0

func (s *Searcher) LastInReader(r io.Reader) (bool, uint64, error)

LastInReader finds the last occurrence of a query in the text produced by an io.Reader. If it finds a match, it returns the offset (in rune positions) from the start of the reader.

func (*Searcher) Limit added in v0.2.0

func (s *Searcher) Limit(offset uint64) *Searcher

Limit sets the maximum offset (in rune positions) for the end of a match. For example, a limit of 3 would allow matches that end on the second rune from the reader, but not on the following runes.

func (*Searcher) NextInReader

func (s *Searcher) NextInReader(r io.Reader) (bool, uint64, error)

NextInReader finds the next occurrence of a query in the text produced by an io.Reader. If it finds a match, it returns the offset (in rune positions) from the start of the reader.

func (*Searcher) NoLimit added in v0.5.0

func (s *Searcher) NoLimit() *Searcher

NoLimit removes any limit set on the Searcher.

type Tree

type Tree struct {
	// contains filtered or unexported fields
}

text.Tree is a data structure for representing UTF-8 text. It supports efficient insertions, deletions, and lookup by character offset and line number.

It is inspired by two papers:

Boehm, H. J., Atkinson, R., & Plass, M. (1995). Ropes: an alternative to strings. Software: Practice and Experience, 25(12), 1315-1330.

Rao, J., & Ross, K. A. (2000, May). Making B+-trees cache conscious in main memory. In Proceedings of the 2000 ACM SIGMOD international conference on Management of data (pp. 475-486).

Like a rope, the tree maintains character counts at each level to efficiently locate a character at a given offset. To use the CPU cache efficiently, all children of a node are pre-allocated in a group (what the Rao & Ross paper calls a "full" cache-sensitive B+ tree), and the parent uses offsets within the node group to identify child nodes. All nodes are carefully designed to fit as much data as possible within a 64-byte cache line.
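
The role of the per-level character counts can be illustrated with a much simpler structure: a binary rope whose nodes cache the rune count of their subtree. The `node` type and its helpers are hypothetical and deliberately ignore the cache-sensitive node grouping described above; the sketch only shows how cached counts let a lookup skip whole subtrees.

```go
package main

import (
	"fmt"
	"unicode/utf8"
)

// node is a hypothetical rope node. Leaves hold text; inner nodes hold
// two children. Every node caches the rune count of its subtree.
type node struct {
	left, right *node
	text        string
	numChars    uint64
}

func leaf(s string) *node {
	return &node{text: s, numChars: uint64(utf8.RuneCountInString(s))}
}

func concat(l, r *node) *node {
	return &node{left: l, right: r, numChars: l.numChars + r.numChars}
}

// runeAt descends using the cached counts, skipping any subtree that
// ends before charPos. It reports false if charPos is past the end.
func (n *node) runeAt(charPos uint64) (rune, bool) {
	for n.left != nil {
		if charPos < n.left.numChars {
			n = n.left
		} else {
			charPos -= n.left.numChars
			n = n.right
		}
	}
	// Only the single leaf that contains charPos is scanned.
	for _, r := range n.text {
		if charPos == 0 {
			return r, true
		}
		charPos--
	}
	return 0, false
}

func main() {
	root := concat(concat(leaf("hé"), leaf("llo ")), leaf("wörld"))
	for _, pos := range []uint64{1, 6, 7, 11} {
		r, ok := root.runeAt(pos)
		fmt.Println(pos, string(r), ok)
	}
}
```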

func NewTree

func NewTree() *Tree

NewTree returns a tree representing an empty string.

func NewTreeFromReader

func NewTreeFromReader(r io.Reader) (*Tree, error)

NewTreeFromReader creates a new Tree from a reader that produces UTF-8 text. This is more efficient than inserting the bytes into an empty tree. Returns an error if the bytes are invalid UTF-8.

func NewTreeFromString

func NewTreeFromString(s string) (*Tree, error)

NewTreeFromString creates a new Tree from a UTF-8 string.

func (*Tree) DeleteAtPosition

func (t *Tree) DeleteAtPosition(charPos uint64) (bool, rune)

DeleteAtPosition removes the UTF-8 character at the specified position (0-indexed). If charPos is past the end of the text, this has no effect.

func (*Tree) InsertAtPosition

func (t *Tree) InsertAtPosition(charPos uint64, c rune) error

InsertAtPosition inserts a UTF-8 character at the specified position (0-indexed). If charPos is past the end of the text, the character is appended at the end. Returns an error if c is not a valid UTF-8 character.
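
The boundary rules for InsertAtPosition and DeleteAtPosition can be modeled on a plain rune slice. The functions `insertAt` and `deleteAt` are hypothetical and run in linear time, unlike the tree. Reading the bool from DeleteAtPosition as "a character was removed" is an assumption, since this page does not document the return values.

```go
package main

import (
	"errors"
	"fmt"
	"unicode/utf8"
)

// insertAt inserts c at charPos, clamping positions past the end so the
// rune is appended. It rejects runes that cannot be encoded as UTF-8.
func insertAt(text []rune, charPos uint64, c rune) ([]rune, error) {
	if !utf8.ValidRune(c) {
		return text, errors.New("invalid UTF-8")
	}
	if charPos > uint64(len(text)) {
		charPos = uint64(len(text))
	}
	out := make([]rune, 0, len(text)+1)
	out = append(out, text[:charPos]...)
	out = append(out, c)
	out = append(out, text[charPos:]...)
	return out, nil
}

// deleteAt removes the rune at charPos. The bool reports whether a rune
// was removed (an assumed reading of the (bool, rune) return values).
func deleteAt(text []rune, charPos uint64) ([]rune, bool, rune) {
	if charPos >= uint64(len(text)) {
		return text, false, 0 // past the end: no effect
	}
	removed := text[charPos]
	out := append(append([]rune{}, text[:charPos]...), text[charPos+1:]...)
	return out, true, removed
}

func main() {
	text := []rune("hllo")
	text, _ = insertAt(text, 1, 'e')
	text, _ = insertAt(text, 99, '!') // past the end: appended
	fmt.Println(string(text))

	text, ok, removed := deleteAt(text, 0)
	fmt.Println(string(text), ok, string(removed))
}
```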

func (*Tree) LineNumForPosition

func (t *Tree) LineNumForPosition(charPos uint64) uint64

LineNumForPosition returns the line number (0-indexed) for the line containing the specified position.

func (*Tree) LineStartPosition

func (t *Tree) LineStartPosition(lineNum uint64) uint64

LineStartPosition returns the position of the first character at the specified line (0-indexed). If the line number is greater than the maximum line number, returns one past the position of the last character.
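
The two line lookups are inverses of a sort, which a linear scan over a string makes easy to see. The functions below are hypothetical and assume that lines are separated by '\n' and that a newline belongs to the line it terminates; the Tree presumably answers the same questions without scanning the whole text.

```go
package main

import "fmt"

// lineNumForPosition counts the newlines before charPos (a rune index).
func lineNumForPosition(s string, charPos uint64) uint64 {
	var line, pos uint64
	for _, r := range s {
		if pos >= charPos {
			break
		}
		if r == '\n' {
			line++
		}
		pos++
	}
	return line
}

// lineStartPosition returns the rune index of the first character on
// lineNum, or the total rune count if there is no such line.
func lineStartPosition(s string, lineNum uint64) uint64 {
	var line, pos uint64
	for _, r := range s {
		if line == lineNum {
			return pos
		}
		if r == '\n' {
			line++
		}
		pos++
	}
	return pos
}

func main() {
	s := "ab\ncd\nef"
	fmt.Println(lineStartPosition(s, 0), lineStartPosition(s, 1), lineStartPosition(s, 2))
	fmt.Println(lineStartPosition(s, 9)) // no such line: one past the last character
	fmt.Println(lineNumForPosition(s, 4))
}
```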

func (*Tree) NumChars

func (t *Tree) NumChars() uint64

NumChars returns the total number of characters (runes) in the tree.

func (*Tree) NumLines

func (t *Tree) NumLines() uint64

NumLines returns the total number of lines in the tree.

func (*Tree) ReaderAtPosition

func (t *Tree) ReaderAtPosition(charPos uint64) Reader

ReaderAtPosition returns a reader starting at the UTF-8 character at the specified position (0-indexed). If the position is past the end of the text, the returned reader will read zero bytes.

func (*Tree) ReverseReaderAtPosition added in v0.3.0

func (t *Tree) ReverseReaderAtPosition(charPos uint64) ReverseReader

ReverseReaderAtPosition returns a reverse reader starting at the specified position.

func (*Tree) String

func (t *Tree) String() string

String returns the text in the tree as a string.
