datatable

v0.0.0-...-04681c6
Published: Jun 19, 2017 License: MIT Imports: 4 Imported by: 0

README

datatable

An in-memory relational table in Go similar to C#'s System.Data.DataTable, supporting slicing and joining operations.

Documentation

Types

type DataTable

type DataTable struct {
	// contains filtered or unexported fields
}

DataTable is an in-memory relational table. The data values are immutable.

func FromCSV

func FromCSV(file *csv.Reader) (*DataTable, error)

FromCSV reads data values from the given CSV reader and initializes a new DataTable.

func HashJoin

func HashJoin(left, right *DataTable, fnLeft, fnRight func([]string) string) *DataTable

HashJoin performs an equi-join on the two tables and returns the result as a new DataTable. fnLeft and fnRight are functions that take a row as input and return the value used for the equality condition of the join. HashJoin is generally faster than Join, which performs a nested-loop join, but it uses more memory because of the temporary hash table.

Example
dt := NewDataTable(3)
dt.AppendRow([]string{"a", "b", "c"})
dt.AppendRow([]string{"e", "f", "g"})
dt.AppendRow([]string{"f", "k", "x"})
dt.AppendRow([]string{"g", "h", "l"})

dt2 := NewDataTable(2)
dt2.AppendRow([]string{"a", "1"})
dt2.AppendRow([]string{"f", "2"})
dt2.AppendRow([]string{"k", "3"})

// Join dt and dt2 on their first columns.
dt3 := HashJoin(dt, dt2,
	func(l []string) string {
		return l[0]
	}, func(r []string) string {
		return r[0]
	})

dt3.ToCSV(os.Stdout)
Output:

a,b,c,a,1
f,k,x,f,2
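
The build-and-probe idea behind a hash join can be sketched on plain [][]string rows. This is an illustration under assumptions, not the package's implementation: hashJoinRows is a hypothetical helper, and building the hash table on the right-hand input is an arbitrary choice made here.

```go
package main

import "fmt"

// hashJoinRows is an illustrative stand-in, not the package's code.
// It indexes the right rows by key, then probes the index with each
// left row.
func hashJoinRows(left, right [][]string, fnLeft, fnRight func([]string) string) [][]string {
	// Build phase: group right rows by their join key.
	index := make(map[string][][]string)
	for _, r := range right {
		k := fnRight(r)
		index[k] = append(index[k], r)
	}

	// Probe phase: look up each left row's key in the index.
	var out [][]string
	for _, l := range left {
		for _, r := range index[fnLeft(l)] {
			row := make([]string, 0, len(l)+len(r))
			row = append(row, l...)
			row = append(row, r...)
			out = append(out, row)
		}
	}
	return out
}

func main() {
	left := [][]string{{"a", "b", "c"}, {"e", "f", "g"}, {"f", "k", "x"}, {"g", "h", "l"}}
	right := [][]string{{"a", "1"}, {"f", "2"}, {"k", "3"}}
	first := func(row []string) string { return row[0] }

	for _, row := range hashJoinRows(left, right, first, first) {
		fmt.Println(row)
	}
}
```

The map is the temporary hash table that costs extra memory; in exchange, each left row is matched with a lookup instead of a scan over every right row.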

func Join

func Join(left, right *DataTable, fn func(l, r []string) bool) *DataTable

Join performs a relational join between the left and right tables. The join condition is defined by the function fn, which takes two rows, l and r, from the left and right tables respectively, and returns whether the two rows should be joined. The join result is returned as a new DataTable. Each joined row contains all the fields from the input tables, in the order [left table fields ... right table fields ...].

Example
dt := NewDataTable(3)
dt.AppendRow([]string{"a", "b", "c"})
dt.AppendRow([]string{"e", "f", "g"})
dt.AppendRow([]string{"f", "k", "x"})
dt.AppendRow([]string{"g", "h", "l"})

dt2 := NewDataTable(2)
dt2.AppendRow([]string{"a", "1"})
dt2.AppendRow([]string{"f", "2"})
dt2.AppendRow([]string{"k", "3"})

// Join dt and dt2 on their first columns.
dt3 := Join(dt, dt2, func(l, r []string) bool {
	return l[0] == r[0]
})

dt3.ToCSV(os.Stdout)
Output:

a,b,c,a,1
f,k,x,f,2
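
Because the condition is an arbitrary predicate over two rows, a join of this kind has to compare every left row with every right row. A minimal nested-loop sketch on plain [][]string rows follows; nestedLoopJoin is a hypothetical helper and not the package's code. The example joins the second column of the left rows against the first column of the right rows.

```go
package main

import "fmt"

// nestedLoopJoin is an illustrative stand-in, not the package's code.
// It tests fn on every (left, right) pair and concatenates the rows
// for which fn returns true.
func nestedLoopJoin(left, right [][]string, fn func(l, r []string) bool) [][]string {
	var out [][]string
	for _, l := range left {
		for _, r := range right {
			if !fn(l, r) {
				continue
			}
			row := make([]string, 0, len(l)+len(r))
			row = append(row, l...) // left fields first
			row = append(row, r...) // then right fields
			out = append(out, row)
		}
	}
	return out
}

func main() {
	left := [][]string{{"a", "b", "c"}, {"e", "f", "g"}, {"f", "k", "x"}, {"g", "h", "l"}}
	right := [][]string{{"a", "1"}, {"f", "2"}, {"k", "3"}}

	// Match the left table's second column with the right table's
	// first column.
	joined := nestedLoopJoin(left, right, func(l, r []string) bool {
		return l[1] == r[0]
	})
	for _, row := range joined {
		fmt.Println(row)
	}
}
```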

func LeftJoin

func LeftJoin(left, right *DataTable, fn func(l, r []string) bool) *DataTable

LeftJoin is similar to Join, except that every row from the left table is part of the join result even if it does not join with any row from the right table. Such a row appears as [left table fields ... empty fields], where the empty fields have the same number of columns as the right table.
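
The padding behaviour can be sketched on plain [][]string rows. This is an illustration, not the package's code: leftJoinRows and concat are hypothetical helpers, the right table's column count is passed explicitly as rightCols, and "empty fields" is assumed to mean empty strings.

```go
package main

import "fmt"

// concat returns a new row holding the fields of l followed by r.
func concat(l, r []string) []string {
	row := make([]string, 0, len(l)+len(r))
	row = append(row, l...)
	return append(row, r...)
}

// leftJoinRows is an illustrative stand-in, not the package's code.
// Left rows without any match are kept and padded with rightCols
// empty strings (an assumption about what "empty fields" means).
func leftJoinRows(left, right [][]string, rightCols int, fn func(l, r []string) bool) [][]string {
	var out [][]string
	for _, l := range left {
		matched := false
		for _, r := range right {
			if fn(l, r) {
				matched = true
				out = append(out, concat(l, r))
			}
		}
		if !matched {
			out = append(out, concat(l, make([]string, rightCols)))
		}
	}
	return out
}

func main() {
	left := [][]string{{"a", "b", "c"}, {"e", "f", "g"}, {"f", "k", "x"}, {"g", "h", "l"}}
	right := [][]string{{"a", "1"}, {"f", "2"}, {"k", "3"}}

	joined := leftJoinRows(left, right, 2, func(l, r []string) bool {
		return l[0] == r[0]
	})
	for _, row := range joined {
		fmt.Printf("%q\n", row) // %q makes the empty padding visible
	}
}
```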

func NewDataTable

func NewDataTable(ncol int) *DataTable

NewDataTable creates a new data table with a given number of columns.

func (*DataTable) AppendRow

func (dt *DataTable) AppendRow(row []string) error

AppendRow appends a new row at the bottom of the table.

func (*DataTable) ApplyColumn

func (dt *DataTable) ApplyColumn(fn func(int, string) error, y int) error

ApplyColumn calls the function fn on every value in column y, from the first row to the last. fn takes two arguments: the first is the row index and the second is the corresponding value. An error is returned immediately if one is encountered.

Example
dt := NewDataTable(3)
dt.AppendRow([]string{"a", "b", "c"})
dt.AppendRow([]string{"e", "f", "g"})
dt.AppendRow([]string{"f", "k", "x"})
dt.AppendRow([]string{"g", "h", "l"})

// Concatenate all values in the first column
s := ""
dt.ApplyColumn(func(x int, v string) error {
	s += v
	return nil
}, 0)

fmt.Println(s)
Output:

aefg

func (*DataTable) ApplyColumns

func (dt *DataTable) ApplyColumns(fn func(int, []string) error, ys ...int) error

ApplyColumns calls the function fn on the values in the columns given by their indexes, from the first row to the last. fn takes two arguments: the first is the row index and the second is the corresponding row projected onto the given columns. An error is returned immediately if one is encountered.

Example
dt := NewDataTable(3)
dt.AppendRow([]string{"a", "b", "c"})
dt.AppendRow([]string{"e", "f", "g"})
dt.AppendRow([]string{"a", "b", "x"})
dt.AppendRow([]string{"e", "h", "l"})

// Count the number of unique pairs in the first two
// columns
s := make(map[string]bool)
dt.ApplyColumns(func(x int, vs []string) error {
	pair := strings.Join(vs, ",")
	s[pair] = true
	return nil
}, 0, 1)

fmt.Println(len(s))
Output:

3

func (*DataTable) Get

func (dt *DataTable) Get(x, y int) string

Get returns the value at row x and column y.

func (*DataTable) GetColumn

func (dt *DataTable) GetColumn(y int) []string

GetColumn returns the column at index y.

func (*DataTable) GetRow

func (dt *DataTable) GetRow(x int) []string

GetRow returns the row at index x.

func (*DataTable) MarshalJSON

func (dt *DataTable) MarshalJSON() ([]byte, error)

MarshalJSON marshals data table into JSON.

func (*DataTable) Merge

func (dt *DataTable) Merge(dt2 *DataTable, matches map[int]int)

Merge takes another DataTable dt2 and a 1-to-1 mapping from this table's column indexes to dt2's column indexes, then appends new rows to this table with values from dt2.
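
How the column mapping might drive the copy can be sketched on plain [][]string rows. This is an assumption-laden illustration, not the package's code: mergeRows is a hypothetical helper, columns absent from the mapping are assumed to stay empty, and no bounds checking is done.

```go
package main

import "fmt"

// mergeRows is an illustrative stand-in, not the package's code.
// matches maps a destination column index to a source column index.
// Destination columns missing from matches are left as empty strings
// (an assumption; the package documentation does not say).
func mergeRows(dst [][]string, ncol int, src [][]string, matches map[int]int) [][]string {
	for _, s := range src {
		row := make([]string, ncol)
		for dstCol, srcCol := range matches {
			row[dstCol] = s[srcCol]
		}
		dst = append(dst, row)
	}
	return dst
}

func main() {
	dst := [][]string{{"a", "b", "c"}} // 3 columns
	src := [][]string{{"x", "1"}, {"y", "2"}} // 2 columns

	// dst column 0 takes src column 0; dst column 2 takes src column 1.
	merged := mergeRows(dst, 3, src, map[int]int{0: 0, 2: 1})
	for _, row := range merged {
		fmt.Printf("%q\n", row)
	}
}
```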

func (*DataTable) NumCol

func (dt *DataTable) NumCol() int

NumCol returns the number of columns in the table.

func (*DataTable) NumRow

func (dt *DataTable) NumRow() int

NumRow returns the number of rows in the table.

func (*DataTable) Project

func (dt *DataTable) Project(ys ...int) *DataTable

Project creates a new DataTable that has only a subset of the columns, which are indicated by the given column indexes.

Example
dt := NewDataTable(3)
dt.AppendRow([]string{"a", "b", "c"})
dt.AppendRow([]string{"e", "f", "g"})
dt.AppendRow([]string{"f", "k", "x"})
dt.AppendRow([]string{"g", "h", "l"})

// Project on the first and the third column
dt2 := dt.Project(0, 2)

dt2.ToCSV(os.Stdout)
Output:

a,c
e,g
f,x
g,l

func (*DataTable) RemoveColumn

func (dt *DataTable) RemoveColumn(y int) error

RemoveColumn deletes the column at index y.

func (*DataTable) RemoveRow

func (dt *DataTable) RemoveRow(x int)

RemoveRow deletes the row at index x.

func (*DataTable) Slice

func (dt *DataTable) Slice(x, n int) *DataTable

Slice takes a contiguous subset of at most n rows, starting at index x, and makes a new DataTable from them. Note that, unlike Project, the new DataTable uses the underlying rows of the original DataTable, so changes to the new table may affect the original.

Example
dt := NewDataTable(3)
dt.AppendRow([]string{"a", "b", "c"})
dt.AppendRow([]string{"e", "f", "g"})
dt.AppendRow([]string{"f", "k", "x"})
dt.AppendRow([]string{"g", "h", "l"})

// Take 2 rows starting at the row index 1
dt2 := dt.Slice(1, 2)

dt2.ToCSV(os.Stdout)
Output:

e,f,g
f,k,x
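
The sharing of underlying rows mirrors how Go slices behave. The sketch below assumes, purely for illustration, that a table stores its rows as a [][]string; sliceRows is a hypothetical helper and not the package's code.

```go
package main

import "fmt"

// sliceRows is an illustrative stand-in, not the package's code.
// It returns at most n rows starting at index x. The result shares
// its row storage with the input instead of copying it.
func sliceRows(rows [][]string, x, n int) [][]string {
	if x < 0 || x >= len(rows) || n <= 0 {
		return nil
	}
	end := x + n
	if end > len(rows) {
		end = len(rows) // "at most n": clamp to the table's end
	}
	return rows[x:end]
}

func main() {
	rows := [][]string{{"a", "b", "c"}, {"e", "f", "g"}, {"f", "k", "x"}, {"g", "h", "l"}}

	sub := sliceRows(rows, 1, 2)

	// Writing through the sub-slice is visible in the original,
	// because both refer to the same underlying rows.
	sub[0][0] = "E"
	fmt.Println(rows[1][0], len(sub))
}
```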

func (*DataTable) ToCSV

func (dt *DataTable) ToCSV(file io.Writer) error

ToCSV writes the table in standard CSV format to the given writer.

func (*DataTable) UnmarshalJSON

func (dt *DataTable) UnmarshalJSON(data []byte) error

UnmarshalJSON parses data table from JSON.
