utf8

package
v0.0.0-...-4aa4c59
Warning

This package is not in the latest version of its module.

Published: Jun 21, 2022 License: MIT Imports: 2 Imported by: 0

Documentation

Constants

This section is empty.

Variables

This section is empty.

Functions

func RunesLen

func RunesLen(p []Rune) (n int)

RunesLen returns the total number of bytes required to encode the runes in p as UTF-8. Any rune that has no valid UTF-8 encoding is ignored, i.e., its length is taken to be 0.
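The counting rule can be approximated with the standard library alone. The following is a minimal sketch, not the package's implementation: runesLen is a hypothetical stand-in that takes plain rune values, and it assumes "invalid" means whatever utf8.RuneLen rejects.

```go
package main

import (
	"fmt"
	"unicode/utf8"
)

// runesLen is a hypothetical stand-in for RunesLen that works on plain runes.
// utf8.RuneLen reports -1 for values that cannot be encoded as UTF-8, and
// those values contribute zero bytes here.
func runesLen(p []rune) (n int) {
	for _, r := range p {
		if l := utf8.RuneLen(r); l > 0 {
			n += l
		}
	}
	return n
}

func main() {
	// 'a' needs 1 byte, 'é' needs 2, '世' needs 3; the surrogate 0xD800
	// cannot be encoded and is ignored.
	fmt.Println(runesLen([]rune{'a', 'é', '世', 0xD800})) // 6
}
```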

Types

type Iterable

type Iterable struct {
	Iterator
	// contains filtered or unexported fields
}

Iterable defines a concrete implementation of an Iterator that provides type-agnostic methods over the interface.

func (*Iterable) GlyphCount

func (s *Iterable) GlyphCount() (count int)

GlyphCount scans the current range and counts the number of runes that are not within any escape sequence.

Once scanning completes, the receiver's internal indices are restored to the values they had when the method was called.

Note that only runes in escape sequences that begin at or after the Iterator's first element (at RuneHead) are excluded from the count. An example of this limitation is shown and discussed below.

Three different Iterators over the same backing array ([9]rune) are shown:

 [ 'H', 'l', 'o', ESC, '[', '2', 'D', 'e', 'l' ]   // Backing array
  ==== ==== ==== ____ ____ ____ ____ ==== ====
 { +1   +2   +3   --   --   --   --   +4   +5  }   // (1.) 5 glyphs
 { +1   +2   +3   --   -- }                        // (2.) 3 glyphs
                          { +1   +2   +3   +4  }   // (3.) 4 glyphs

The second and third Iterators together form the same sequence as the first
Iterator, so their combined glyph count should logically equal the first
Iterator's count of 5. Instead, it is 3 + 4 = 7.

This happens because the boundary between the second and third Iterators
falls inside the escape sequence. The third Iterator begins after the
sequence's start, so the final 2 runes of the escape sequence ('2' and 'D')
are erroneously counted as ordinary runes.
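The three counts in the diagram can be reproduced with a small standalone counter. This is only a sketch under stated assumptions: glyphCount is a hypothetical function over a plain []rune, and it assumes an escape sequence is ESC, '[', then any runes up to and including a final rune in the range '@' through '~'. The package may recognize escape sequences differently.

```go
package main

import "fmt"

const esc = '\x1b'

// glyphCount is a hypothetical counter, not the package's implementation.
// It skips sequences of the assumed form ESC '[' ... final, where the
// final rune lies in '@'..'~', and counts every other rune.
func glyphCount(p []rune) (count int) {
	for i := 0; i < len(p); i++ {
		if p[i] == esc && i+1 < len(p) && p[i+1] == '[' {
			i += 2 // skip ESC and '['
			for i < len(p) && (p[i] < '@' || p[i] > '~') {
				i++ // skip parameter runes
			}
			continue // the loop increment steps over the final rune
		}
		count++
	}
	return count
}

func main() {
	backing := []rune{'H', 'l', 'o', esc, '[', '2', 'D', 'e', 'l'}

	fmt.Println(glyphCount(backing))     // 5: the whole sequence is skipped
	fmt.Println(glyphCount(backing[:5])) // 3: the sequence starts inside the range
	fmt.Println(glyphCount(backing[5:])) // 4: '2' and 'D' look like ordinary runes
}
```

A caller that splits a range and needs the partial counts to add up would have to place the split outside any escape sequence.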

func (*Iterable) Len

func (s *Iterable) Len() (n int)

Len scans the current range and counts the number of bytes required to encode each valid rune. Rune encodings that are invalid UTF-8 are considered to have zero bytes. Unlike GlyphCount, runes in escape sequences are included.

Once scanning completes, the receiver's internal indices are restored to the values they had when the method was called.

func (*Iterable) Next

func (s *Iterable) Next() (r *Rune)

Next returns the next Rune in s.

If no elements remain in s, Next returns a Rune r such that r.IsError() == true and r.Len() == 0.

func (*Iterable) Reset

func (s *Iterable) Reset() *Iterable

Reset resets the internal indices of s based on the backing Iterator's current head and tail indices.

func (*Iterable) Slice

func (s *Iterable) Slice(lo, hi int) *Iterable

Slice sets the internal indices of s based on the backing Iterator's current head and tail indices offset by the given lo and hi slice indices.

Both lo and hi are relative to head and tail of s, such that lo=0 always refers to s's head index (even if s was previously sliced).

If lo and/or hi are negative, they are treated as unspecified slice indices. For example, Slice(-1, N) is equivalent to s[:N], Slice(N, -1) is equivalent to s[N:], and Slice(-1, -1) is equivalent to s[:] (which is also equivalent to Reset).
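The negative-index rule maps directly onto ordinary slice expressions. The sketch below illustrates only that rule on a plain []rune; window is a hypothetical helper, and it does not model how Slice interacts with the backing Iterator's head and tail.

```go
package main

import "fmt"

// window is a hypothetical helper that mirrors the documented index rule on
// a plain slice: a negative lo or hi is treated as unspecified.
func window(s []rune, lo, hi int) []rune {
	if lo < 0 {
		lo = 0 // unspecified low bound, as in s[:hi]
	}
	if hi < 0 {
		hi = len(s) // unspecified high bound, as in s[lo:]
	}
	return s[lo:hi]
}

func main() {
	s := []rune("Hello")

	fmt.Println(string(window(s, -1, 2)))  // He    (s[:2])
	fmt.Println(string(window(s, 2, -1)))  // llo   (s[2:])
	fmt.Println(string(window(s, -1, -1))) // Hello (s[:])
}
```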

type IterableRune

type IterableRune []rune

IterableRune implements Iterator using the native Go type []rune.

Use a construct like the following to convert an existing []rune to Iterable without causing a copy/alloc:

	var aSlice = []rune{...} // Some global
	...
	it := Iterable{Iterator: (*IterableRune)(&aSlice)}

var NonIterableRune IterableRune
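The pointer conversion recommended for IterableRune avoids a copy because Go permits converting between pointer types whose base types share an underlying type. The sketch below redeclares IterableRune locally so that it compiles on its own; asIterableRune is a hypothetical helper name, and the Iterable wrapper itself is left out.

```go
package main

import "fmt"

// IterableRune is redeclared locally so the sketch compiles on its own.
type IterableRune []rune

// asIterableRune is a hypothetical helper. Converting the pointer reuses the
// same slice header, so no elements are copied or allocated.
func asIterableRune(s *[]rune) *IterableRune {
	return (*IterableRune)(s)
}

func main() {
	aSlice := []rune("abc")

	ir := asIterableRune(&aSlice)
	(*ir)[0] = 'X' // write through the converted pointer

	// The original slice observes the write, so both names share storage.
	fmt.Println(string(aSlice)) // Xbc
}
```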

func (*IterableRune) RuneAt

func (ir *IterableRune) RuneAt(i int) *Rune

RuneAt returns a pointer to (*ir)[i]. Implements Iterator for native Go type []rune.

func (*IterableRune) RuneHead

func (ir *IterableRune) RuneHead() uint32

RuneHead returns 0. Implements Iterator for native Go type []rune.

func (*IterableRune) RuneTail

func (ir *IterableRune) RuneTail() uint32

RuneTail returns len(*ir). Implements Iterator for native Go type []rune.

type Iterator

type Iterator interface {
	// RuneHead returns the index of the first Rune element.
	RuneHead() uint32
	// RuneTail returns the index of the last Rune element.
	RuneTail() uint32
	// RuneAt is the normal array-like accessor that returns a Rune for a given
	// 0-based list index.
	RuneAt(i int) *Rune
}

Iterator defines a list of Rune values that is randomly accessible by sequential index and bounded by head and tail indices.

This interface provides a []Rune accessor abstraction for structures that store Rune-like data in some other type such as the native Go type []rune.
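As an illustration of the abstraction, a custom store can satisfy Iterator with three small methods. Everything in this sketch is an assumption made for the example: Rune and Iterator are redeclared locally, lineBuffer and collect are hypothetical names, RuneAt is taken to accept an absolute index into the store, and the tail index is treated as exclusive (consistent with IterableRune.RuneTail returning len(*ir)).

```go
package main

import "fmt"

// Rune and Iterator are redeclared locally so the sketch compiles on its own.
type Rune rune

type Iterator interface {
	RuneHead() uint32
	RuneTail() uint32
	RuneAt(i int) *Rune
}

// lineBuffer is a hypothetical fixed-size store whose live content occupies
// buf[head:tail].
type lineBuffer struct {
	buf        [8]Rune
	head, tail uint32
}

func (b *lineBuffer) RuneHead() uint32   { return b.head }
func (b *lineBuffer) RuneTail() uint32   { return b.tail }
func (b *lineBuffer) RuneAt(i int) *Rune { return &b.buf[i] }

// collect walks any Iterator from head up to, but not including, tail.
func collect(it Iterator) string {
	var out []rune
	for i := it.RuneHead(); i < it.RuneTail(); i++ {
		out = append(out, rune(*it.RuneAt(int(i))))
	}
	return string(out)
}

func main() {
	b := &lineBuffer{head: 2, tail: 7}
	copy(b.buf[2:], []Rune{'H', 'e', 'l', 'l', 'o'})

	fmt.Println(collect(b)) // Hello
}
```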

type Rune

type Rune rune

Rune extends type rune with unbuffered implementations of several interfaces from standard library package "io".

func (*Rune) Encode

func (r *Rune) Encode(p []byte) (n int, err error)

Encode writes into p the UTF-8 encoding of r and returns the number of bytes written. Returns 0, ErrWriteOverflow if p is not large enough to hold the encoding of r.
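The overflow check can be sketched with unicode/utf8. This is not the package's code: encode is a hypothetical free function, errWriteOverflow is a local placeholder for ErrWriteOverflow, and returning 0 with a nil error for an unencodable rune is an assumption made only so the example is complete.

```go
package main

import (
	"errors"
	"fmt"
	"unicode/utf8"
)

// errWriteOverflow is a local placeholder for the package's ErrWriteOverflow.
var errWriteOverflow = errors.New("write overflow")

// encode is a hypothetical counterpart to (*Rune).Encode. It checks the
// required size first because utf8.EncodeRune expects p to be large enough.
func encode(r rune, p []byte) (n int, err error) {
	need := utf8.RuneLen(r)
	if need < 0 {
		return 0, nil // assumption: an unencodable rune writes nothing
	}
	if need > len(p) {
		return 0, errWriteOverflow
	}
	return utf8.EncodeRune(p, r), nil
}

func main() {
	buf := make([]byte, 2)

	n, err := encode('é', buf)
	fmt.Println(n, err) // 2 <nil>

	n, err = encode('世', buf) // needs 3 bytes, buf holds 2
	fmt.Println(n, err)       // 0 write overflow
}
```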

func (*Rune) Equals

func (r *Rune) Equals(a Rune) bool

Equals returns true if and only if r and a are the same Unicode code point.

func (*Rune) EqualsRune

func (r *Rune) EqualsRune(a rune) bool

EqualsRune returns true if and only if r and a are the same Unicode code point.

func (*Rune) IsError

func (r *Rune) IsError() bool

IsError returns true if and only if r is nil or equal to RuneError.

func (*Rune) Len

func (r *Rune) Len() (n int)

Len returns the number of bytes required to encode r. Returns 0 if r is not a valid UTF-8 encoding.

func (*Rune) Read

func (r *Rune) Read(p []byte) (n int, err error)

Read implements io.Reader.

func (*Rune) Rune

func (r *Rune) Rune() rune

Rune returns r as a Go native rune type.

func (*Rune) Set

func (r *Rune) Set(a Rune)

Set sets r equal to a.

func (*Rune) SetRune

func (r *Rune) SetRune(a rune)

SetRune sets r equal to the Go native rune a.
