tfutils

v0.0.0-...-21f1f2a Latest
Published: Oct 20, 2017 License: MIT Imports: 11 Imported by: 3

README

tfutils

tensorflow utils for golang

A collection of tensorflow helpers useful in golang.

Motivation

tensorflow's TFRecord format can be read and written using the native RecordWriter and RecordReader. While these are exposed to your go environment via tensorflow, they are not handy for building simple translation tools that need to read or operate on images to generate tfrecords. Enter tfutils.

What you get (so far ...)

A reader that implements the go equivalent of the RecordReader, and a similar writer.

The vendor directory contains the go code generated from the protobuf definitions using the protoc tool. The .proto files required to generate it are found in the tensorflow project and are not checked into this repo.

Record format:

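| Field     | Type   | Description                                                                    |
|-----------|--------|--------------------------------------------------------------------------------|
| length    | uint64 | The length of the Example tfrecord                                             |
| lengthCrc | uint32 | 32-bit CRC of the 8-byte length                                                |
| data      | []byte | length-sized slice of bytes - this is the serialized version of the tfrecord  |
| dataCrc   | uint32 | 32-bit CRC of the data                                                         |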
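
The layout above can be sketched with the standard library alone. This is a minimal illustration, not the package's own implementation: it assumes the usual TensorFlow conventions (little-endian integers, CRC-32C checksums, and the rotate-and-add mask), and the names `maskedCRC`, `encodeRecord` and `decodeRecord` are hypothetical.

```go
package main

import (
	"encoding/binary"
	"errors"
	"fmt"
	"hash/crc32"
)

// castagnoli is the CRC-32C table; TFRecord checksums are assumed to use it.
var castagnoli = crc32.MakeTable(crc32.Castagnoli)

// maskedCRC is a hypothetical helper: CRC-32C of b, rotated right by 15 bits
// and offset by a constant, as TensorFlow does for record checksums.
func maskedCRC(b []byte) uint32 {
	c := crc32.Checksum(b, castagnoli)
	return ((c >> 15) | (c << 17)) + 0xa282ead8
}

// encodeRecord frames data as length | lengthCrc | data | dataCrc.
func encodeRecord(data []byte) []byte {
	out := make([]byte, 12, 12+len(data)+4)
	binary.LittleEndian.PutUint64(out[0:8], uint64(len(data)))
	binary.LittleEndian.PutUint32(out[8:12], maskedCRC(out[0:8]))
	out = append(out, data...)
	var tail [4]byte
	binary.LittleEndian.PutUint32(tail[:], maskedCRC(data))
	return append(out, tail[:]...)
}

// decodeRecord verifies both checksums of a single framed record and
// returns its payload.
func decodeRecord(rec []byte) ([]byte, error) {
	if len(rec) < 16 {
		return nil, errors.New("record too short")
	}
	if binary.LittleEndian.Uint32(rec[8:12]) != maskedCRC(rec[0:8]) {
		return nil, errors.New("length checksum mismatch")
	}
	n := binary.LittleEndian.Uint64(rec[0:8])
	if n != uint64(len(rec)-16) {
		return nil, errors.New("length does not match record size")
	}
	data := rec[12 : 12+n]
	if binary.LittleEndian.Uint32(rec[12+n:]) != maskedCRC(data) {
		return nil, errors.New("data checksum mismatch")
	}
	return data, nil
}

func main() {
	rec := encodeRecord([]byte("hello"))
	data, err := decodeRecord(rec)
	fmt.Println(len(rec), string(data), err) // prints: 21 hello <nil>
}
```

A five-byte payload yields a 21-byte record: 8 bytes of length, 4 of lengthCrc, 5 of data and 4 of dataCrc.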

Caveats

The protobuf files needed to interpret the Example are bundled in this "library", and so is the generated go code that handles serialization and de-serialization of the data. This decreases the complexity of building this tool-set, but the bundled code might get out of sync with the latest and greatest reader / writer logic. However, this is not something that should change often (if at all), and most protobuf implementations promise backwards compatibility by ignoring fields that they cannot recognize.

Usage

See tests defined in reader_test.go and writer_test.go.

Generating the protocol buffer stuff

protoc -I=$TENSORFLOW_DIR --go_out=$GOPATH/src/ $TENSORFLOW_DIR/tensorflow/core/example/*.proto

Documentation

Index

Constants

This section is empty.

Variables

This section is empty.

Functions

func CrcMask

func CrcMask(v uint32) uint32

func CrcUnmask

func CrcUnmask(v uint32) uint32
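
CrcMask and CrcUnmask carry no description here. Assuming they follow TensorFlow's masking scheme (rotate the CRC right by 15 bits, then add a fixed constant), the pair could look like the sketch below. The names `mask`, `unmask` and `maskDelta` are hypothetical stand-ins, not the package's code.

```go
package main

import "fmt"

// maskDelta is the constant TensorFlow adds when masking a CRC.
const maskDelta uint32 = 0xa282ead8

// mask rotates crc right by 15 bits and adds maskDelta (wrapping on overflow).
func mask(crc uint32) uint32 {
	return ((crc >> 15) | (crc << 17)) + maskDelta
}

// unmask reverses mask: subtract maskDelta, then rotate left by 15 bits.
func unmask(masked uint32) uint32 {
	rot := masked - maskDelta
	return (rot >> 17) | (rot << 15)
}

func main() {
	for _, c := range []uint32{0, 1, 0xdeadbeef, 0xffffffff} {
		// Each line reports whether unmask(mask(c)) recovers c.
		fmt.Println(unmask(mask(c)) == c)
	}
}
```

Masking keeps the stored checksum different from the raw CRC, which matters when the checksummed data itself contains embedded CRCs.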

func GetFeatureFromGoType

func GetFeatureFromGoType(x interface{}) (*tf.Feature, error)

GetFeatureFromGoType returns a tensorflow Feature instance for the underlying golang type. Note that not all primitive types are massaged into one of the feature types here (yet).

func GetFeatureMapFromTFRecord

func GetFeatureMapFromTFRecord(data []byte) (*tf.Features, error)

func GetFeaturesFromMap

func GetFeaturesFromMap(m map[string]interface{}) (*tf.Features, error)

GetFeaturesFromMap returns a tensorflow Features instance based on the generic map of string->value being passed in.

func GetTFRecordStringForFeatures

func GetTFRecordStringForFeatures(fs *tf.Features) ([]byte, error)

GetTFRecordStringForFeatures returns a serialized version of an "Example" protobuffer for writing using a TF RecordWriter.

func MaskedCrc

func MaskedCrc(bs []byte, n int64) uint32

Types

type CompressionType

type CompressionType int
const (
	CompressionTypeNone CompressionType = iota
	CompressionTypeZlib
)

type RecordReader

type RecordReader struct {
	// contains filtered or unexported fields
}

RecordReader implements a reader which can work on a queue of tf record files to extract the nested "Example" protobufs written using a matching tf record writer.

func NewReader

func NewReader(queue []string, options *RecordReaderOptions) (*RecordReader, error)

NewReader returns a new instance of a record reader which accepts a queue of files to read from.

func (*RecordReader) NumRecordsProduced

func (rr *RecordReader) NumRecordsProduced() int

NumRecordsProduced returns the number of records that this record reader has produced.

func (*RecordReader) ReadRecord

func (rr *RecordReader) ReadRecord() ([]byte, error)

ReadRecord returns the next available record. If the current file reaches EOF, the next file is dequeued from the queue and parsed. If the queue is empty, ReadRecord returns io.EOF to the caller.
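
A caller would typically loop until io.EOF. The sketch below shows that loop against a hypothetical `recordSource` interface that mirrors the documented ReadRecord signature, with an in-memory stand-in (`sliceSource`) so it runs without tfutils or any tfrecord files. A `*RecordReader` should satisfy the same interface, given the signature above.

```go
package main

import (
	"errors"
	"fmt"
	"io"
)

// recordSource is a hypothetical interface matching the documented
// ReadRecord signature.
type recordSource interface {
	ReadRecord() ([]byte, error)
}

// sliceSource is a stand-in reader that serves records from memory and
// returns io.EOF once they are exhausted.
type sliceSource struct{ records [][]byte }

func (s *sliceSource) ReadRecord() ([]byte, error) {
	if len(s.records) == 0 {
		return nil, io.EOF
	}
	rec := s.records[0]
	s.records = s.records[1:]
	return rec, nil
}

// readAll drains src, treating io.EOF as normal termination and any
// other error as a failure.
func readAll(src recordSource) ([][]byte, error) {
	var out [][]byte
	for {
		rec, err := src.ReadRecord()
		if errors.Is(err, io.EOF) {
			return out, nil
		}
		if err != nil {
			return out, err
		}
		out = append(out, rec)
	}
}

func main() {
	src := &sliceSource{records: [][]byte{[]byte("a"), []byte("b")}}
	recs, err := readAll(src)
	fmt.Println(len(recs), err) // prints: 2 <nil>
}
```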

type RecordReaderOptions

type RecordReaderOptions struct {
	CompressionType CompressionType
}

RecordReaderOptions specifies reader options for the tf record reader.

type RecordWriter

type RecordWriter struct {
	// contains filtered or unexported fields
}

RecordWriter implements a writer that appends tfrecord strings to an output file. The `dstfile` is the path to the tfrecord file, and the `options` stores a copy of the writer options.

func NewWriter

func NewWriter(path string, options *RecordWriterOptions) (*RecordWriter, error)

NewWriter returns a new instance of a tfrecord writer.

func (*RecordWriter) Close

func (rw *RecordWriter) Close() error

func (*RecordWriter) Flush

func (rw *RecordWriter) Flush() error

func (*RecordWriter) WriteRecord

func (rw *RecordWriter) WriteRecord(data []byte) error
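
Close, Flush and WriteRecord are undocumented, so the following is only a sketch of one reasonable calling pattern: write each record, flush, and always close. It assumes Close does not necessarily flush buffered data. `recordSink`, `memorySink` and `writeAll` are hypothetical names; the interface simply mirrors the three method signatures listed above.

```go
package main

import "fmt"

// recordSink is a hypothetical interface mirroring the documented
// RecordWriter methods.
type recordSink interface {
	WriteRecord(data []byte) error
	Flush() error
	Close() error
}

// memorySink is an in-memory stand-in used to exercise writeAll.
type memorySink struct {
	records [][]byte
	flushed bool
	closed  bool
}

func (m *memorySink) WriteRecord(data []byte) error {
	m.records = append(m.records, append([]byte(nil), data...))
	return nil
}
func (m *memorySink) Flush() error { m.flushed = true; return nil }
func (m *memorySink) Close() error { m.closed = true; return nil }

// writeAll writes every record, flushes, and closes dst even when a
// write fails. The first error encountered is the one returned.
func writeAll(dst recordSink, records [][]byte) (err error) {
	defer func() {
		if cerr := dst.Close(); err == nil {
			err = cerr
		}
	}()
	for _, rec := range records {
		if err = dst.WriteRecord(rec); err != nil {
			return err
		}
	}
	return dst.Flush()
}

func main() {
	sink := &memorySink{}
	err := writeAll(sink, [][]byte{[]byte("a"), []byte("b")})
	fmt.Println(len(sink.records), sink.flushed, sink.closed, err)
}
```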

type RecordWriterOptions

type RecordWriterOptions struct {
	CompressionType CompressionType
}

RecordWriterOptions defines the options to open the record writer with.
