skiprope

package module
v1.0.0 Latest Latest
Warning

This package is not in the latest version of its module.

Go to latest
Published: Oct 18, 2017 License: MIT Imports: 5 Imported by: 0

README

SkipRope GoDoc Build Status Coverage Status

package skiprope is an implementation of the rope data structure. It's not strictly a rope as per se. It's not built on top of a binary tree like most rope data structures are.

Instead it's built on top of a skip list. This makes it more akin to a piece table than a true rope.

This library is quite complete as it is (it has all the functions and I need for my projects). However, any addition, corrections etc are welcome. Please see CONTRIBUTING.md for more information on how to contribute.

Installing

This package is go-gettable: go get -u github.com/chewxy/skiprope.

This package is versioned with SemVer 2.0, and does not have any external dependencies except for testing (of which the fantastic assert package is used).

Usage

r := skiprope.New()
if err := r.Insert(0, "Hello World! Hello, 世界"); err != nil {
	// handle error
}
if err  := r.Insert(5, "there "); err != nil {
	// handle error
}
fmt.Println(r.String()) // "Hello there World! Hello, 世界"

FAQ

How do I keep track of row, col information?

The best way is to use a linebuffer data structure in conjunction with the Rope:

type Foo struct{
	*skiprope.Rope
	linebuf []int
}

The linebuf is essentially a list of offsets to a newline character. The main reason why I didn't add a linebuf slice into the *Rope data structure is because while it's a common thing to do, for a couple of my projects I didn't need it. The nature of Go's composable data structure and programs make it quite easy to add these things. A pro-tip is to extend the Insert, InsertRunes and EraseAt methods.

I do not think it to be wise to add it to the core data structure.

When is *Rope going to implement io.Writer and io.Reader?

The main reason why I didn't do it was mostly because I didn't need it. However, I've been asked about this before. I personally don't have the bandwidth to do it.

Please send a pull request :)

Benchmarks

There is a benchmark mini-program that is not required for the running, but here are the results:

go test -run=^$ -bench=. -benchmem -cpuprofile=test.prof
BenchmarkNaiveRandomInsert-8   	   50000	     37774 ns/op	      15 B/op	       0 allocs/op
BenchmarkRopeRandomInsert-8    	 3000000	       407 ns/op	      27 B/op	       0 allocs/op
BenchmarkERopeRandomInsert-8   	 1000000	      2143 ns/op	    1161 B/op	      25 allocs/op
PASS

History, Development and Acknowledgements

This package started its life as a textbook binary-tree data structure for another personal project of mine. Over time, I decided to add more and more optimizations to it. The original design had a split at every new-line character.

I started by moving more and more things off the heap, and onto the stack. As I wondered how to incorporate a search structure using a skiplist, I stumbled onto a well developed library for C, librope.

It had everything I had wanted: a rope-like structure, using skiplists to find nodes, minimal allocations on heap, and a solution to my problem wrt keeping the skiplists off heap. The solution turned out to be to be the skiplist data structure, without a pointer. So I ended up adapting most of Joseph Gentle's algorithm. So this library owes most of its work to Joseph Gentle.

Hours before releasing this library, I had a consult by Egon Elbre, who gave good advice on whether just sticking with []rune was a good idea. He managed to convince me that it isn't, so the first pull request was made to update this library to deal with []byte instead. As a result, memory use went down 40B/op at the cost of an increase of about 30 ns/op. The number can be further shaved down with better optimizations.

Documentation

Overview

package skiprope provides rope-like data structure for efficient manipulation of large strings.

Index

Examples

Constants

View Source
const (
	MaxHeight  = 60 // maximum size of the skiplist
	BucketSize = 64 // data bucket size in a knot - about 64 bytes is optimal for insertion on a core i7.
)

Variables

View Source
var Bias = 20

Bias indicates the probability that a new knot will have height of n+1. This is the parameter to tweak when considering the tradeoff between high amounts of append operations and amount of random writes.

The higher the bias is, the better the data structure is at performing end-of-string appends. The tradeoff is performance of random writes will deterioriate.

Functions

This section is empty.

Types

type Rope

type Rope struct {
	Head knot
	// contains filtered or unexported fields
}

Rope is a rope data structure built on top of a skip list.

func New

func New() *Rope

New creates a new Rope.

func (*Rope) Before

func (r *Rope) Before(at int, fn func(r rune) bool) (retVal int, retRune rune, err error)

Before returns the index of the first rune that matches the function before the given point.

Example: "Hello World". Let's say `at` is at 9 (rune = r). And we want to find the whitespace before it. This function will return 5, which is the byte index of the rune immediately after the whitespace.

Example
r := New()
_ = r.Insert(0, "Hello World. This is a long sentence. The purpose of this long sentence is to make sure there is more than BucketSize worth of runes")
before, char, err := r.Before(70, unicode.IsSpace)
if err != nil {
	fmt.Printf("Error: cannot get before 70: %v", err)
}

fmt.Printf("First whitespace before position 70: %d - %q\n", before, char)
Output:

First whitespace before position 70: 63 - ' '

func (*Rope) EraseAt

func (r *Rope) EraseAt(point, n int) (err error)

EraseAt erases n runes starting from the point.

func (*Rope) Index

func (r *Rope) Index(at int) rune

Index returns the rune at the given index.

func (*Rope) Insert

func (r *Rope) Insert(at int, str string) error

Insert inserts the string at the point

func (*Rope) InsertBytes

func (r *Rope) InsertBytes(point int, data []byte) (err error)

InsertBytes inserts the bytes at the point.

func (*Rope) InsertRunes

func (r *Rope) InsertRunes(point int, data []rune) (err error)

InsertRunes inserts the runes at the point

func (*Rope) Runes

func (r *Rope) Runes() int

Runes is the number of runes in the rope

func (*Rope) Size

func (r *Rope) Size() int

Size is the length of the rope.

func (*Rope) String

func (r *Rope) String() string

String returns the rope as a full string.

func (*Rope) Substr

func (r *Rope) Substr(pointA, pointB int) string

Substr creates a substring.

func (*Rope) SubstrBytes

func (r *Rope) SubstrBytes(pointA, pointB int) []byte

SubstrBytes returns a byte slice given the "substring" of the rope. Both pointA and pointB refers to the rune, not the byte offset.

Example: "你好world" has a length of 11 bytes. If we only want "好", we'd have to call SubstrBytes(1, 2), not SubstrBytes(3, 6) which you would if you were dealing with pure bytes

func (*Rope) SubstrRunes

func (r *Rope) SubstrRunes(pointA, pointB int) []rune

SubstrRunes is like Substr, but returns []rune

Directories

Path Synopsis

Jump to

Keyboard shortcuts

? : This menu
/ : Search site
f or F : Jump to
y or Y : Canonical URL