ten

package module

Version: v0.0.0-...-f8d227e | Published: Aug 19, 2020 | License: MIT | Imports: 5 | Imported by: 0


Repository

github.com/AlexanderEkdahl/ten


README ¶

ten

Efficient binary encoding for tensors. The format is supported by WebDataset with the .ten filename extension.

The reference implementation was developed in Python by tmbdev at Nvidia.

Installation

go get github.com/AlexanderEkdahl/ten

This package has no dependencies outside the Go standard library.

Usage

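// Write an encoded 3x3 float32 tensor to w (an io.Writer).
e := ten.NewEncoder(w)
if err := e.Encode([]float32{1, 2, 3, 4, 5, 6, 7, 8, 9}, []int{3, 3}, "policy"); err != nil {
	log.Fatal(err)
}

// Read the next encoded tensor from r (an io.Reader).
d := ten.NewDecoder(r)
tensorData, shape, info, err := d.Decode()
if err != nil {
	log.Fatal(err)
}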

float16 is not supported due to missing support in Go (#32022).

Benchmarks

$ go test -bench .
goos: linux
goarch: amd64
pkg: github.com/AlexanderEkdahl/ten
BenchmarkDecoder/100-16         	  531340	      2107 ns/op	 189.88 MB/s
BenchmarkDecoder/500-16         	  224413	      5368 ns/op	 372.55 MB/s
BenchmarkDecoder/1000-16        	  120788	      9430 ns/op	 424.20 MB/s
BenchmarkDecoder/10000-16       	   15444	     78619 ns/op	 508.79 MB/s
BenchmarkEncoder/100-16         	  920988	      1231 ns/op	 324.97 MB/s
BenchmarkEncoder/500-16         	  324014	      3880 ns/op	 515.49 MB/s
BenchmarkEncoder/1000-16        	  159812	      7117 ns/op	 562.01 MB/s
BenchmarkEncoder/10000-16       	   18616	     64677 ns/op	 618.46 MB/s

Documentation ¶

Overview ¶

Package ten provides efficient binary encoding for tensors. The format is 8-byte aligned and can be used directly for computations when transmitted, for example, via RDMA. The format is supported by WebDataset with the `.ten` filename extension. It is also used by Tensorcom and Tensorcom RDMA, and can be used for fast tensor storage with LMDB and in disk files (which can be memory-mapped).

Data is encoded as a series of chunks:

magic number (int64)
length in bytes (int64)
bytes (multiple of 64 bytes long)

Arrays are a header chunk followed by a data chunk. Header chunks have the following structure:

dtype (int64)
8 byte array name
ndim (int64)
dim[0]
dim[1]
...
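As a rough sketch of the header structure listed above, the following builds the header bytes for a tensor. `encodeHeader` and `pad8` are hypothetical names, not part of this package. Treating dtype as an 8-byte ASCII name such as "float32", zero-padding short fields, and using little-endian int64 values are all assumptions made for illustration.

```go
package main

import (
	"bytes"
	"encoding/binary"
	"errors"
	"fmt"
)

// pad8 places s in an 8-byte field, zero-padded on the right.
func pad8(s string) ([8]byte, error) {
	var field [8]byte
	if len(s) > len(field) {
		return field, errors.New("value does not fit in 8 bytes")
	}
	copy(field[:], s)
	return field, nil
}

// encodeHeader is a hypothetical helper that lays out dtype, name, ndim
// and the dimensions in the order given by the header structure.
func encodeHeader(dtype, name string, shape []int) ([]byte, error) {
	d, err := pad8(dtype) // assumption: dtype is an 8-byte ASCII name
	if err != nil {
		return nil, err
	}
	n, err := pad8(name)
	if err != nil {
		return nil, err
	}
	var buf bytes.Buffer
	buf.Write(d[:])
	buf.Write(n[:])
	fields := []int64{int64(len(shape))} // ndim, followed by each dim
	for _, dim := range shape {
		fields = append(fields, int64(dim))
	}
	if err := binary.Write(&buf, binary.LittleEndian, fields); err != nil {
		return nil, err
	}
	return buf.Bytes(), nil
}

func main() {
	header, err := encodeHeader("float32", "policy", []int{3, 3})
	if err != nil {
		panic(err)
	}
	// 8 (dtype) + 8 (name) + 8 (ndim) + 2*8 (dims)
	fmt.Println(len(header)) // prints 40
}
```

The 8-byte name field is why the info string passed to Encode is limited to 8 bytes.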

Index ¶

Variables
type Decoder
- func NewDecoder(r io.Reader) *Decoder
- func (d *Decoder) Decode() (tensorData interface{}, shape []int, info string, err error)
type Encoder
- func NewEncoder(w io.Writer) *Encoder
- func (e *Encoder) Encode(tensorData interface{}, shape []int, info string) error

Constants ¶

This section is empty.

Variables ¶


var (
	ErrMagicNumberMismatch     = fmt.Errorf("magic number mismatch")
	ErrNegativeLength          = fmt.Errorf("negative length")
	ErrNegativeDimensions      = fmt.Errorf("negative dimensions")
	ErrDecodingUnsupportedType = fmt.Errorf("unsupported data type")
)

Decoding errors


var (
	ErrTooManyDimensions = fmt.Errorf("too many dimensions")
	ErrInfoTooLong       = fmt.Errorf("info can not exceed 8 bytes")
)

Encoding errors


var MagicNumber = []byte{0x7e, 0x54, 0x65, 0x6e, 0x42, 0x69, 0x6e, 0x7e}

MagicNumber is the magic number before every chunk.

Functions ¶

This section is empty.

Types ¶

type Decoder ¶

type Decoder struct {
	// contains filtered or unexported fields
}

A Decoder reads and decodes tensor data from an input stream.

func NewDecoder ¶

func NewDecoder(r io.Reader) *Decoder

NewDecoder returns a new decoder that reads from r.

func (*Decoder) Decode ¶

func (d *Decoder) Decode() (tensorData interface{}, shape []int, info string, err error)

Decode reads the next ten-encoded tensor from its input.
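Because tensorData is returned as interface{}, a caller will usually need a type switch before using it. The sketch below shows one way to do that and to check the data against the shape. `describe` is a hypothetical helper, and the set of concrete slice types handled here is an assumption about what Decode may return, not a statement of what it does return.

```go
package main

import "fmt"

// describe is a hypothetical helper that inspects a decoded tensor and
// verifies that the element count matches the product of the shape.
func describe(tensorData interface{}, shape []int) (string, error) {
	want := 1
	for _, dim := range shape {
		want *= dim
	}

	var kind string
	var n int
	// Assumption: decoded data arrives as a flat slice of a numeric type.
	switch v := tensorData.(type) {
	case []float32:
		kind, n = "float32", len(v)
	case []float64:
		kind, n = "float64", len(v)
	case []int32:
		kind, n = "int32", len(v)
	case []int64:
		kind, n = "int64", len(v)
	default:
		return "", fmt.Errorf("unexpected tensor type %T", tensorData)
	}

	if n != want {
		return "", fmt.Errorf("shape %v needs %d elements, got %d", shape, want, n)
	}
	return fmt.Sprintf("%s tensor with shape %v", kind, shape), nil
}

func main() {
	s, err := describe([]float32{1, 2, 3, 4, 5, 6, 7, 8, 9}, []int{3, 3})
	if err != nil {
		panic(err)
	}
	fmt.Println(s) // prints "float32 tensor with shape [3 3]"
}
```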

type Encoder ¶

type Encoder struct {
	// contains filtered or unexported fields
}

An Encoder writes tensors to an output stream.

func NewEncoder ¶

func NewEncoder(w io.Writer) *Encoder

NewEncoder returns a new encoder that writes to w.

func (*Encoder) Encode ¶

func (e *Encoder) Encode(tensorData interface{}, shape []int, info string) error

Encode writes the encoding of tensorData, with the given shape, to the stream along with a custom info header. The info string can not exceed 8 bytes (see ErrInfoTooLong).
