bufrr

Published: Nov 29, 2016 License: MIT Imports: 3 Imported by: 5

README

bufrr - a buffered rune reader

Language: Go

Synopsis

Package bufrr provides a buffered rune reader, with both PeekRune and UnreadRune. It takes an io.Reader providing the source, buffers it by wrapping with a bufio.Reader, and creates a new Reader implementing the bufrr.RunePeeker interface (an io.RuneScanner interface plus an additional PeekRune method).

Additionally, bufrr.Reader translates an io.EOF error into the invalid rune value -1 (defined as bufrr.EOF).

Internally, bufrr.Reader is a bufio.Reader plus a single-rune peek buffer and a single-rune unread buffer.

Code Example

import (
	"github.com/SteelSeries/bufrr"
	"strings"
)

func ExampleBufrr() {

	// example input
	in := strings.NewReader("abc")

	// construct buffered rune reader
	buf := bufrr.NewReader(in)

	var err error
	var r, p rune

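	// common sequence of operations when lexing an awkward grammar
	r, _, err = buf.ReadRune() // next: consume a rune
	// [...]
	p, _, err = buf.PeekRune() // peek: look at the following rune without consuming it
	// [...]
	err = buf.UnreadRune() // backup: unread the rune returned by ReadRune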
	// [...]
}

Motivation

When writing Unicode/UTF-8 parsers/lexers/tokenizers in Go, it is preferable to work with the higher-level native rune type instead of []byte.

A common sequence of operations that a tokenizer performs on its input stream is:

  1. next (read)
  2. peek (look-ahead)
  3. backup (unread)

Requirement: a simple API providing ReadRune(), PeekRune() and UnreadRune().

  • bufio.Reader has ReadRune and UnreadRune -- but no PeekRune (it does have Peek, which returns bytes). Furthermore, under certain conditions, bufio.Reader seems to have some unexpected behaviour when combining peeks with unreads.
  • scanner.Scanner is rune-based, with Read and Peek -- but no Unread.

I considered adding PeekRune() to bufio.Reader, as the easiest option. But once I got halfway through the implementation I realised there were some edge cases where things became trickier than I'd expected (due to bufio.Reader's current implementation).

I considered adding Unread() to scanner.Scanner, but decided this would introduce unnecessary complexity - plus scanner.Scanner is higher-level than needed, having additional unrequired functionality; to implement a tokenizer over the top of it would really be duplicating too much functionality.

After all this, I finally decided the easiest option was to implement a simple wrapper for bufio.Reader with the functionality I needed - it was the least amount of work I could do: my API requirement is only 3 methods.

As two of my methods are already covered by the io.RuneScanner interface, the bufrr.RunePeeker interface simply extends this with the addition of a PeekRune() method.
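
Because the interface is small, lexing helpers can be written against it rather than against a concrete reader. The sketch below is an illustration under stated assumptions: runePeeker mirrors the shape of bufrr.RunePeeker so that the example compiles without the package, while sliceReader and readIdent are hypothetical names and are not part of bufrr.

```go
package main

import (
	"errors"
	"fmt"
	"io"
	"unicode"
	"unicode/utf8"
)

const eof rune = -1 // hypothetical stand-in for bufrr.EOF

// runePeeker mirrors the shape of bufrr.RunePeeker.
type runePeeker interface {
	io.RuneScanner
	PeekRune() (r rune, w int, err error)
}

// sliceReader is a hypothetical in-memory implementation used only here.
type sliceReader struct {
	runes []rune
	pos   int
}

// PeekRune reports the rune at the current position, or eof past the end.
func (s *sliceReader) PeekRune() (rune, int, error) {
	if s.pos >= len(s.runes) {
		return eof, 0, nil
	}
	r := s.runes[s.pos]
	return r, utf8.RuneLen(r), nil
}

// ReadRune returns the same rune as PeekRune and advances the position.
func (s *sliceReader) ReadRune() (rune, int, error) {
	r, w, err := s.PeekRune()
	s.pos++
	return r, w, err
}

// UnreadRune steps the position back by one rune.
func (s *sliceReader) UnreadRune() error {
	if s.pos == 0 {
		return errors.New("sliceReader: nothing to unread")
	}
	s.pos--
	return nil
}

// Compile-time check that sliceReader satisfies the interface.
var _ runePeeker = (*sliceReader)(nil)

// readIdent consumes letters and stops, without consuming, at the first non-letter.
func readIdent(rp runePeeker) (string, error) {
	var ident []rune
	for {
		r, _, err := rp.PeekRune()
		if err != nil {
			return string(ident), err
		}
		if !unicode.IsLetter(r) {
			return string(ident), nil
		}
		if _, _, err := rp.ReadRune(); err != nil {
			return string(ident), err
		}
		ident = append(ident, r)
	}
}

func main() {
	rp := &sliceReader{runes: []rune("foo+bar")}
	ident, _ := readIdent(rp)
	next, _, _ := rp.ReadRune()
	fmt.Printf("%s %c\n", ident, next) // prints "foo +"
}
```

The peek lets readIdent decide where the identifier ends while leaving the delimiter in the stream for the caller.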

Why bufio.Reader? Tokenizers arguably/usually work over a buffered input stream (supporting both peek and unread implies at least a minimal amount of buffering, i.e. two runes - plus buffered I/O is generally a good thing).

An eventual end-of-file is an expected condition when parsing, lexing or tokenizing. Therefore, representing EOF as a token/marker in the rune stream, distinct from any error conditions encountered while reading the stream, is preferable, and leads to cleaner client code.

To this end, when bufrr.Reader reaches EOF, both ReadRune() and PeekRune() will return an invalid rune value of -1 (defined as bufrr.EOF), and will never return an io.EOF error.

Installation

Fetch the code:

go get github.com/SteelSeries/bufrr

Import the package into your code:

import (
	...
	"github.com/SteelSeries/bufrr"
	...
)

API Reference

See autogenerated documentation at: http://godoc.org/github.com/SteelSeries/bufrr

API Overview

Constructors

func NewReader(rd io.Reader) *bufrr.Reader
func NewReaderSize(rd io.Reader, size int) *bufrr.Reader

bufrr.Reader methods

bufrr.Reader implements all the methods of interface bufrr.RunePeeker, namely:

ReadRune() (r rune, w int, err error)
PeekRune() (r rune, w int, err error)
UnreadRune() error

bufrr.RunePeeker interface

type RunePeeker interface {
	io.RuneScanner
	PeekRune() (r rune, w int, err error)
}

Tests

To run the tests:

cd $GOPATH/src/github.com/SteelSeries/bufrr
go test

The tests could do with improvement. They only test the basic API functionality and do not test all of the edge cases. But this is not to say that the code is not fully tested, per se; it is in fact well exercised by several file parsers I have written.

Contributors

Bug reports and pull requests are most welcome!

License

This work is distributed under an MIT License - see LICENSE file for details.

Documentation

Constants

const EOF = -1

Variables

var ErrInvalidUnreadRune = errors.New("bufrr: invalid use of UnreadRune")

Types

type Reader

type Reader struct {
	// contains filtered or unexported fields
}

Example

s := strings.NewReader("abc")
b := NewReader(s)

r, _, _ := b.ReadRune()
fmt.Printf("%d\n", r)
r, _, _ = b.ReadRune()
fmt.Printf("%d\n", r)
r, _, _ = b.ReadRune()
fmt.Printf("%d\n", r)
r, _, _ = b.ReadRune()
fmt.Printf("%d\n", r)
r, _, _ = b.ReadRune()
fmt.Printf("%d\n", r)
Output:

97
98
99
-1
-1

func NewReader

func NewReader(rd io.Reader) *Reader

func NewReaderSize

func NewReaderSize(rd io.Reader, size int) *Reader

func (*Reader) PeekRune

func (b *Reader) PeekRune() (r rune, w int, err error)

func (*Reader) ReadRune

func (b *Reader) ReadRune() (r rune, w int, err error)

func (*Reader) UnreadRune

func (b *Reader) UnreadRune() error

type RunePeeker

type RunePeeker interface {
	io.RuneScanner
	PeekRune() (r rune, w int, err error)
}
