peanut

v1.0.5
Published: Jan 18, 2023 License: BSD-3-Clause Imports: 19 Imported by: 0

README

peanut

peanut is a Go package to write tagged data structs to disk in a variety of formats.

Its primary purpose is to provide a single consistent interface for easy, ceremony-free persistence of record-based struct data.

Each distinct struct type is written to an individual file (or table), automatically created, each named according to the name of the struct. Field/column names in each file/table are derived from struct tags. All writers use the same tags.

Currently supported formats are CSV, TSV, Excel (.xlsx), JSON Lines (JSONL), and SQLite. Additional writers are also provided to assist with testing and debugging. Multiple writers can be combined using MultiWriter.

All writers perform atomic file operations, writing data to a temporary location and moving it to the final output location when Close is called.

About

peanut might be 'the right tool for the job' when you are building an app or tool that needs to output multiple different record types to disk, perhaps with requirements that change over time (whether during development or after initial deployment), or with a need for multiple output formats (during development/testing, or as final output).

It is well suited as an output solution for data conversion tools, ETL pipelines, data-acquisition or extraction tools/apps, web scrapers, structured logging, persistence of captured data/metadata/events, job reporting, and similar work, whether you are building an ad-hoc tool as a quick hack or part of a bigger, more serious project.

peanut initially evolved as part of a larger closed-source project. It is tried and tested, and production-ready.

Quickstart

Installation

Get the package:

go get github.com/jimsmart/peanut

Use the package within your code:

import "github.com/jimsmart/peanut"

API

All peanut writers implement this interface:

type Writer interface {
    Write(r interface{}) error
    Close() error
    Cancel() error
}

Usage

  1. Tag some structs.
  2. Initialise a peanut.Writer to use.
  3. Collect and assign data into tagged structs.
  4. Call Write() to write records, repeating until done.
  5. Call Close() to finish.

Example Code

See GoDocs.

Documentation

GoDocs https://godoc.org/github.com/jimsmart/peanut

Testing

To run the tests, execute go test inside the project folder.

For a full coverage report, try:

go test -coverprofile=coverage.out && go tool cover -html=coverage.out

License

Package peanut is copyright 2020-2022 by Jim Smart and released under the BSD 3-Clause License.

History

  • v1.0.5 (2023-01-18) Updated dependencies.
  • v1.0.4 (2022-12-16) Updated dependencies.
  • v1.0.3 (2021-04-19) Relax semantics of Close/Cancel. Improved error handling.
  • v1.0.2 (2021-04-19) Fixup handling of uints.
  • v1.0.1 (2021-04-19) Repository made public.

Documentation

Overview

Package peanut writes tagged data structs to disk in a variety of formats. Its primary purpose is to provide a single consistent interface for easy, ceremony-free persistence of record-based struct data.

Each distinct struct type is written to an individual file (or table), automatically created, each named according to the name of the struct. Field/column names in each file/table are derived from struct tags. All writers use the same tags.

Currently supported formats are CSV, TSV, Excel (.xlsx), JSON Lines (JSONL), and SQLite. Additional writers are also provided to assist with testing and debugging. Multiple writers can be combined using MultiWriter.

All writers have the same basic interface: a Write method that can take any appropriately tagged struct; a Close method, which should be called to successfully complete writing; and a Cancel method, which should be called to abort writing and clean up, in the event of an error or cancellation. It is safe to make multiple calls to Cancel, and it is safe to call Close after having previously called Cancel.

All writers output their files atomically: all output is written to a temporary location and only moved to the final output location when Close is called, so the output folder never contains any partially written files.
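The temp-then-move pattern can be sketched with the standard library alone. This is a minimal illustration, not peanut's actual implementation: the atomicFile type, createAtomic, and demo are hypothetical names, and the sketch assumes the system temporary directory and the destination are on the same filesystem, since os.Rename can fail across filesystems.

```go
package main

import (
	"fmt"
	"os"
	"path/filepath"
)

// atomicFile is a hypothetical helper, not part of peanut.
// It writes to a temporary file and only moves it to the
// final path when Close is called.
type atomicFile struct {
	tmp   *os.File
	final string
}

// createAtomic opens a temporary file in the default temp directory.
// Assumption: that directory shares a filesystem with final.
func createAtomic(final string) (*atomicFile, error) {
	tmp, err := os.CreateTemp("", "atomic-*")
	if err != nil {
		return nil, err
	}
	return &atomicFile{tmp: tmp, final: final}, nil
}

func (a *atomicFile) Write(p []byte) (int, error) { return a.tmp.Write(p) }

// Close closes the temporary file and moves it into place.
func (a *atomicFile) Close() error {
	if err := a.tmp.Close(); err != nil {
		os.Remove(a.tmp.Name())
		return err
	}
	return os.Rename(a.tmp.Name(), a.final)
}

// Cancel discards the temporary file; the final path is never created.
func (a *atomicFile) Cancel() error {
	a.tmp.Close()
	return os.Remove(a.tmp.Name())
}

// exists reports whether path is present on disk.
func exists(path string) bool {
	_, err := os.Stat(path)
	return err == nil
}

// demo reports whether the final file existed before and after Close.
func demo() (before, after bool, err error) {
	dir, err := os.MkdirTemp("", "atomic-demo")
	if err != nil {
		return false, false, err
	}
	defer os.RemoveAll(dir)

	final := filepath.Join(dir, "my-Shape-data.csv")
	f, err := createAtomic(final)
	if err != nil {
		return false, false, err
	}
	if _, err := f.Write([]byte("shape_id,name\n")); err != nil {
		f.Cancel()
		return false, false, err
	}
	before = exists(final)
	if err := f.Close(); err != nil {
		return before, false, err
	}
	return before, exists(final), nil
}

func main() {
	before, after, err := demo()
	fmt.Println(before, after, err)
}
```

In this sketch the destination only appears once Close succeeds, and Cancel leaves nothing behind at the final path.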

Struct Tagging

Structs to be used with peanut must have appropriately tagged fields, for example:

type Shape struct {
	ShapeID  string `peanut:"shape_id"`
	Name     string `peanut:"name"`
	NumSides int    `peanut:"num_sides"`
}

type Color struct {
	ColorID string `peanut:"color_id"`
	Name    string `peanut:"name"`
	RGB     string `peanut:"rgb"`
}

Fields without tags do not get written as output.
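To illustrate how tag-driven column names can work, the following sketch reads the peanut tag of each field using reflection and skips untagged fields. It is an illustration only, not peanut's own code: taggedColumns and the extra Notes field are hypothetical, and the function assumes it is given a struct or a pointer to a struct.

```go
package main

import (
	"fmt"
	"reflect"
)

type Shape struct {
	ShapeID  string `peanut:"shape_id"`
	Name     string `peanut:"name"`
	NumSides int    `peanut:"num_sides"`
	Notes    string // untagged, so it is skipped
}

// taggedColumns returns the peanut tag of each tagged field, in field order.
// x must be a struct or a pointer to a struct (an assumption of this sketch).
func taggedColumns(x interface{}) []string {
	t := reflect.TypeOf(x)
	if t.Kind() == reflect.Ptr {
		t = t.Elem()
	}
	var cols []string
	for i := 0; i < t.NumField(); i++ {
		if tag, ok := t.Field(i).Tag.Lookup("peanut"); ok && tag != "" {
			cols = append(cols, tag)
		}
	}
	return cols
}

func main() {
	fmt.Println(taggedColumns(&Shape{})) // prints [shape_id name num_sides]
}
```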

Usage

First create a writer, for example:

w := peanut.NewCSVWriter("/some/path/my-", "-data")

Next, write some records to it:

x := &Shape{
	ShapeID:  "sid1",
	Name:     "Square",
	NumSides: 4,
}
err := w.Write(x)
// ...

y := &Color{
	ColorID: "cid1",
	Name:    "red",
	RGB:     "ff0000",
}
err = w.Write(y)
// ...

z := &Shape{
	ShapeID:  "sid2",
	Name:     "Octagon",
	NumSides: 8,
}
err = w.Write(z)
// ...

When successfully completed:

err = w.Close()

// Output files will be:
// /some/path/my-Shape-data.csv
// /some/path/my-Color-data.csv

Or, to abort the whole operation in the event of an error or cancellation while writing records:

err = w.Cancel()
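The Close-on-success, Cancel-on-error flow can be wrapped in a small helper. The sketch below redeclares the Writer interface locally so that it runs without the package installed; writeAll and countingWriter are hypothetical names used only for illustration.

```go
package main

import (
	"errors"
	"fmt"
)

// Writer mirrors the interface documented by peanut.
type Writer interface {
	Write(r interface{}) error
	Close() error
	Cancel() error
}

// countingWriter is a hypothetical stand-in used only for this sketch.
// It fails on the write numbered failAt (1-based); 0 means never fail.
type countingWriter struct {
	failAt  int
	writes  int
	closes  int
	cancels int
}

func (c *countingWriter) Write(r interface{}) error {
	c.writes++
	if c.writes == c.failAt {
		return errors.New("simulated write failure")
	}
	return nil
}
func (c *countingWriter) Close() error  { c.closes++; return nil }
func (c *countingWriter) Cancel() error { c.cancels++; return nil }

// writeAll writes every record, cancelling on the first error
// and closing only when all writes succeeded.
func writeAll(w Writer, records ...interface{}) error {
	for _, r := range records {
		if err := w.Write(r); err != nil {
			// Abort: discard partial output and report the write error.
			w.Cancel()
			return err
		}
	}
	return w.Close()
}

func main() {
	ok := &countingWriter{}
	fmt.Println(writeAll(ok, "a", "b"), ok.closes, ok.cancels)

	bad := &countingWriter{failAt: 2}
	fmt.Println(writeAll(bad, "a", "b", "c"), bad.closes, bad.cancels)
}
```

With a real peanut writer, the same helper would accept any of the writers described in this document, since they all implement Writer.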

MultiWriter

Multiple writers can be combined using MultiWriter:

w1 := peanut.NewCSVWriter("/some/path/my-", "-data")
w2 := peanut.NewExcelWriter("/some/path/my-", "-data")
w3 := &peanut.LogWriter{}
w := peanut.MultiWriter(w1, w2, w3)

Here w will write records to CSV files, Excel files, and a logger.
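As a rough picture of what duplicating calls across writers involves, here is a self-contained fan-out sketch. It is not peanut's MultiWriter implementation: fanOut, newFanOut, and memWriter are hypothetical, and the error policy (call every writer, return the first error) is an assumption of this sketch.

```go
package main

import "fmt"

// Writer mirrors the interface documented by peanut.
type Writer interface {
	Write(r interface{}) error
	Close() error
	Cancel() error
}

// memWriter is a hypothetical in-memory writer used only by this sketch.
type memWriter struct {
	records []interface{}
	closed  bool
}

func (m *memWriter) Write(r interface{}) error {
	m.records = append(m.records, r)
	return nil
}
func (m *memWriter) Close() error  { m.closed = true; return nil }
func (m *memWriter) Cancel() error { m.records = nil; return nil }

// fanOut forwards every call to each wrapped writer.
type fanOut struct{ writers []Writer }

// newFanOut is a hypothetical constructor for the sketch.
func newFanOut(writers ...Writer) Writer { return &fanOut{writers: writers} }

// each calls fn on every writer and returns the first error seen, if any.
func (f *fanOut) each(fn func(Writer) error) error {
	var first error
	for _, w := range f.writers {
		if err := fn(w); err != nil && first == nil {
			first = err
		}
	}
	return first
}

func (f *fanOut) Write(r interface{}) error {
	return f.each(func(w Writer) error { return w.Write(r) })
}
func (f *fanOut) Close() error  { return f.each(Writer.Close) }
func (f *fanOut) Cancel() error { return f.each(Writer.Cancel) }

func main() {
	a, b := &memWriter{}, &memWriter{}
	w := newFanOut(a, b)
	w.Write("record")
	w.Close()
	fmt.Println(len(a.records), len(b.records), a.closed, b.closed) // prints 1 1 true true
}
```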

Limitations

Behaviour is undefined for types with the same name but in different packages, such as package1.Foo and package2.Foo.

Supported datatypes for struct fields: string, bool, float32, float64, int, int8, int16, int32, int64, uint, uint8, uint16, uint32, uint64.

Pointer following and nested structs are currently unsupported.

Tagging a field that has an unsupported datatype will result in an error when Write is called.

Variables

var ErrClosedWriter = errors.New("peanut: write on closed writer")

ErrClosedWriter is the error used for write operations on a closed writer.

Types

type CSVWriter

type CSVWriter struct {
	// contains filtered or unexported fields
}

CSVWriter writes records to CSV files, writing each record type to an individual CSV file automatically.

Filenames for each corresponding record type are derived accordingly:

prefix + type.Name() + suffix + extension

Where extension is ".csv" or ".tsv" accordingly.
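The filename rule can be reproduced in a few lines. The helper below is a hypothetical sketch (outputFilename is not a peanut function) that derives the type name via reflection, following pointers one level.

```go
package main

import (
	"fmt"
	"reflect"
)

type Shape struct{}

// outputFilename builds prefix + type name + suffix + extension.
// x must be a named struct type or a pointer to one.
func outputFilename(prefix, suffix, ext string, x interface{}) string {
	t := reflect.TypeOf(x)
	if t.Kind() == reflect.Ptr {
		t = t.Elem()
	}
	return prefix + t.Name() + suffix + ext
}

func main() {
	fmt.Println(outputFilename("/some/path/my-", "-data", ".csv", &Shape{}))
	// prints /some/path/my-Shape-data.csv
}
```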

The first row of resulting CSV file(s) will contain headers using names extracted from the struct's field tags. Records' fields are written in the order that they appear within the struct.
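The header-then-rows layout can be demonstrated with encoding/csv. This sketch is an assumption-laden illustration rather than CSVWriter's code: toCSV is a hypothetical helper, it writes to a string instead of a file, and it formats values with fmt.Sprint, which may differ from how peanut formats them.

```go
package main

import (
	"bytes"
	"encoding/csv"
	"fmt"
	"reflect"
)

type Shape struct {
	ShapeID  string `peanut:"shape_id"`
	Name     string `peanut:"name"`
	NumSides int    `peanut:"num_sides"`
}

// toCSV writes a header row taken from the peanut tags, then one row
// per record, with fields in struct order.
func toCSV(records ...Shape) (string, error) {
	var buf bytes.Buffer
	w := csv.NewWriter(&buf)

	t := reflect.TypeOf(Shape{})
	header := make([]string, 0, t.NumField())
	for i := 0; i < t.NumField(); i++ {
		header = append(header, t.Field(i).Tag.Get("peanut"))
	}
	if err := w.Write(header); err != nil {
		return "", err
	}

	for _, r := range records {
		v := reflect.ValueOf(r)
		row := make([]string, 0, v.NumField())
		for i := 0; i < v.NumField(); i++ {
			// Assumption: default formatting is good enough for the sketch.
			row = append(row, fmt.Sprint(v.Field(i).Interface()))
		}
		if err := w.Write(row); err != nil {
			return "", err
		}
	}
	w.Flush()
	return buf.String(), w.Error()
}

func main() {
	out, err := toCSV(
		Shape{ShapeID: "sid1", Name: "Square", NumSides: 4},
		Shape{ShapeID: "sid2", Name: "Octagon", NumSides: 8},
	)
	fmt.Print(out, err)
}
```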

The caller must call Close on successful completion of all writing, to ensure buffers are flushed and files are properly written to disk.

In the event of an error or cancellation, the caller must call Cancel before quitting, to ensure closure and cleanup of any partially written files.

func NewCSVWriter

func NewCSVWriter(prefix, suffix string) *CSVWriter

NewCSVWriter returns a new CSVWriter, using prefix and suffix when building its output filenames, and using ".csv" file extension with comma ',' as a field separator.

See CSVWriter (above) for output filename details.

func NewTSVWriter

func NewTSVWriter(prefix, suffix string) *CSVWriter

NewTSVWriter returns a new CSVWriter configured to write TSV files, using prefix and suffix when building its output filenames, and using ".tsv" file extension with tab '\t' as a field separator.

See CSVWriter (above) for output filename details.

func (*CSVWriter) Cancel

func (w *CSVWriter) Cancel() error

Cancel should be called in the event of an error occurring, to properly close and delete any partially written files.

func (*CSVWriter) Close

func (w *CSVWriter) Close() error

Close flushes all buffers and writers, and closes the output files.

Calling Close after a previous call to Cancel is safe, and always results in a no-op.

func (*CSVWriter) Write

func (w *CSVWriter) Write(x interface{}) error

Write is called to persist records. Each record is written to an individual row in the corresponding output file, according to the type of the given record.

type DiscardWriter

type DiscardWriter struct{}

DiscardWriter is a Writer that does nothing.

func (*DiscardWriter) Cancel

func (*DiscardWriter) Cancel() error

Cancel does nothing and returns nil.

func (*DiscardWriter) Close

func (*DiscardWriter) Close() error

Close does nothing and returns nil.

func (*DiscardWriter) Write

func (*DiscardWriter) Write(x interface{}) error

Write does nothing and returns nil.

type ExcelWriter

type ExcelWriter struct {
	// contains filtered or unexported fields
}

ExcelWriter writes records to Excel files, writing each record type to an individual Excel file automatically.

Filenames for each corresponding record type are derived accordingly:

prefix + type.Name() + suffix + ".xlsx"

The first row of resulting Excel file(s) will contain headers using names extracted from the struct's field tags, and will be frozen. Records' fields are written in the order that they appear within the struct.

The caller must call Close on successful completion of all writing, to ensure buffers are flushed and files are properly written to disk.

In the event of an error or cancellation, the caller must call Cancel before quitting, to ensure closure and cleanup of any partially written files.

func NewExcelWriter

func NewExcelWriter(prefix, suffix string) *ExcelWriter

NewExcelWriter returns a new ExcelWriter, using prefix and suffix when building its output filenames.

See ExcelWriter (above) for output filename details.

func (*ExcelWriter) Cancel

func (w *ExcelWriter) Cancel() error

Cancel should be called in the event of an error occurring.

func (*ExcelWriter) Close

func (w *ExcelWriter) Close() error

Close the writer, ensuring all files are saved.

Calling Close after a previous call to Cancel is safe, and always results in a no-op.

func (*ExcelWriter) Write

func (w *ExcelWriter) Write(x interface{}) error

Write is called to persist records. Each record is written to an individual row in the corresponding output file, according to the type of the given record.

type JSONLWriter

type JSONLWriter struct {
	// contains filtered or unexported fields
}

JSONLWriter writes records to JSON Lines files, writing each record type to an individual JSON Lines file automatically.

Filenames for each corresponding record type are derived accordingly:

prefix + type.Name() + suffix + ".jsonl"

The caller must call Close on successful completion of all writing, to ensure buffers are flushed and files are properly written to disk.

In the event of an error or cancellation, the caller must call Cancel before quitting, to ensure closure and cleanup of any partially written files.

func NewJSONLWriter

func NewJSONLWriter(prefix, suffix string) *JSONLWriter

NewJSONLWriter returns a new JSONLWriter, using prefix and suffix when building its output filenames.

See JSONLWriter (above) for output filename details.

func (*JSONLWriter) Cancel

func (w *JSONLWriter) Cancel() error

Cancel should be called in the event of an error occurring, to properly close and delete any partially written files.

func (*JSONLWriter) Close

func (w *JSONLWriter) Close() error

Close flushes all buffers and writers, and closes the output files.

Calling Close after a previous call to Cancel is safe, and always results in a no-op.

func (*JSONLWriter) Write

func (w *JSONLWriter) Write(x interface{}) error

Write is called to persist records. Each record is written to an individual row in the corresponding output file, according to the type of the given record.

type LogWriter

type LogWriter struct {
	Logger  *log.Logger
	Verbose bool
	// contains filtered or unexported fields
}

LogWriter writes records to a log.Logger.

If Logger is nil at runtime, a new log.Logger will be created when needed, writing to os.Stderr.

func (*LogWriter) Cancel

func (w *LogWriter) Cancel() error

Cancel should be called in the event of an error occurring.

func (*LogWriter) Close

func (w *LogWriter) Close() error

Close should be called after successfully writing records.

func (*LogWriter) Write

func (w *LogWriter) Write(x interface{}) error

Write is called to persist records.

type MockWriter

type MockWriter struct {
	Headers            map[string][]string
	Data               map[string][]map[string]string
	DisableDataCapture map[string]bool
	CalledWrite        int
	CalledClose        int
	CalledCancel       int
	// contains filtered or unexported fields
}

MockWriter captures written data in memory, to provide easy mocking when testing code that uses peanut.

func (*MockWriter) Cancel

func (w *MockWriter) Cancel() error

Cancel should be called in the event of an error occurring.

func (*MockWriter) Close

func (w *MockWriter) Close() error

Close should be called after successfully writing records.

func (*MockWriter) Write

func (w *MockWriter) Write(x interface{}) error

Write is called to persist records.

type SQLiteWriter

type SQLiteWriter struct {
	// contains filtered or unexported fields
}

SQLiteWriter writes records to an SQLite database, writing each record type to an individual table automatically.

During writing, the database file is held in a temporary location, and only moved into its final destination during a successful Close operation.

Note that if an existing database with the same filename already exists at the given output location, it will be silently overwritten.

The caller must call Close on successful completion of all writing, to ensure proper cleanup, and the relocation of the database from its temporary location during writing, to its final output.

In the event of an error or cancellation, the caller must call Cancel before quitting, to ensure closure and cleanup of any partially written data.

SQLiteWriter supports additional tag values to denote the primary key:

type Shape struct {
	ShapeID  string `peanut:"shape_id,pk"`
	Name     string `peanut:"name"`
	NumSides int    `peanut:"num_sides"`
}

type Color struct {
	ColorID string `peanut:"color_id,pk"`
	Name    string `peanut:"name"`
	RGB     string `peanut:"rgb"`
}

Compound primary keys are also supported.
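To make the pk tag option concrete, the sketch below parses tags of the form "name,pk" and builds an illustrative CREATE TABLE statement, including a compound key when several fields carry pk. None of this is taken from SQLiteWriter: parseTag and createTableSQL are hypothetical, and the column type mapping and exact SQL text are assumptions.

```go
package main

import (
	"fmt"
	"reflect"
	"strings"
)

type Shape struct {
	ShapeID  string `peanut:"shape_id,pk"`
	Name     string `peanut:"name"`
	NumSides int    `peanut:"num_sides"`
}

// parseTag splits a tag such as "shape_id,pk" into column name and pk flag.
func parseTag(tag string) (name string, pk bool) {
	parts := strings.Split(tag, ",")
	for _, opt := range parts[1:] {
		if opt == "pk" {
			pk = true
		}
	}
	return parts[0], pk
}

// createTableSQL builds an illustrative CREATE TABLE statement.
// The type mapping below is an assumption made for this sketch.
func createTableSQL(x interface{}) string {
	t := reflect.TypeOf(x)
	if t.Kind() == reflect.Ptr {
		t = t.Elem()
	}
	var cols, pks []string
	for i := 0; i < t.NumField(); i++ {
		f := t.Field(i)
		tag, ok := f.Tag.Lookup("peanut")
		if !ok {
			continue // untagged fields are not written
		}
		name, pk := parseTag(tag)
		sqlType := "INTEGER" // bools and all integer kinds
		switch f.Type.Kind() {
		case reflect.String:
			sqlType = "TEXT"
		case reflect.Float32, reflect.Float64:
			sqlType = "REAL"
		}
		cols = append(cols, fmt.Sprintf("%q %s", name, sqlType))
		if pk {
			pks = append(pks, fmt.Sprintf("%q", name))
		}
	}
	if len(pks) > 0 {
		cols = append(cols, "PRIMARY KEY ("+strings.Join(pks, ", ")+")")
	}
	return fmt.Sprintf("CREATE TABLE %q (%s)", t.Name(), strings.Join(cols, ", "))
}

func main() {
	fmt.Println(createTableSQL(Shape{}))
}
```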

SQLiteWriter has no support for foreign keys, indexes, etc.

func NewSQLiteWriter

func NewSQLiteWriter(filename string) *SQLiteWriter

NewSQLiteWriter returns a new SQLiteWriter, using the given filename + ".sqlite" as its final output location.

func (*SQLiteWriter) Cancel

func (w *SQLiteWriter) Cancel() error

Cancel should be called in the event of an error occurring, to properly close any used resources, and delete the partially written database from its temporary location.

func (*SQLiteWriter) Close

func (w *SQLiteWriter) Close() error

Close cleans up all used resources, closes the database connection, and moves the database to its final location.

Calling Close after a previous call to Cancel is safe, and always results in a no-op.

func (*SQLiteWriter) Write

func (w *SQLiteWriter) Write(x interface{}) error

Write is called to persist records. Each record is written to an individual row in the corresponding table within the output database, according to the type of the given record.

type Writer

type Writer interface {
	Write(r interface{}) error
	Close() error
	Cancel() error
}

Writer defines a record-based writer.

func MultiWriter

func MultiWriter(writers ...Writer) Writer

MultiWriter creates a writer that duplicates its method calls to all the provided writers.
