Documentation ¶
Index ¶
- type DataTable
- func FromCSV(file *csv.Reader) (*DataTable, error)
- func HashJoin(left, right *DataTable, fnLeft, fnRight func([]string) string) *DataTable
- func Join(left, right *DataTable, fn func(l, r []string) bool) *DataTable
- func LeftJoin(left, right *DataTable, fn func(l, r []string) bool) *DataTable
- func NewDataTable(ncol int) *DataTable
- func (dt *DataTable) AppendRow(row []string) error
- func (dt *DataTable) ApplyColumn(fn func(int, string) error, y int) error
- func (dt *DataTable) ApplyColumns(fn func(int, []string) error, ys ...int) error
- func (dt *DataTable) Get(x, y int) string
- func (dt *DataTable) GetColumn(y int) []string
- func (dt *DataTable) GetRow(x int) []string
- func (dt *DataTable) MarshalJSON() ([]byte, error)
- func (dt *DataTable) Merge(dt2 *DataTable, matches map[int]int)
- func (dt *DataTable) NumCol() int
- func (dt *DataTable) NumRow() int
- func (dt *DataTable) Project(ys ...int) *DataTable
- func (dt *DataTable) RemoveColumn(y int) error
- func (dt *DataTable) RemoveRow(x int)
- func (dt *DataTable) Slice(x, n int) *DataTable
- func (dt *DataTable) ToCSV(file io.Writer) error
- func (dt *DataTable) UnmarshalJSON(data []byte) error
Types ¶
type DataTable ¶
type DataTable struct {
// contains filtered or unexported fields
}
DataTable is an in-memory relational table. The data values are immutable.
func HashJoin ¶
HashJoin performs an equi-join on the two tables and returns the result as a new DataTable. fnLeft and fnRight each take a row as input and return the value used for the equality condition of the join. HashJoin is generally faster than Join, which performs a nested-loop join, but it uses more memory because of the temporary hash table.
Example ¶
dt := NewDataTable(3)
dt.AppendRow([]string{"a", "b", "c"})
dt.AppendRow([]string{"e", "f", "g"})
dt.AppendRow([]string{"f", "k", "x"})
dt.AppendRow([]string{"g", "h", "l"})
dt2 := NewDataTable(2)
dt2.AppendRow([]string{"a", "1"})
dt2.AppendRow([]string{"f", "2"})
dt2.AppendRow([]string{"k", "3"})
// Join dt and dt2 on their first columns.
dt3 := HashJoin(dt, dt2,
	func(l []string) string { return l[0] },
	func(r []string) string { return r[0] })
dt3.ToCSV(os.Stdout)

Output:
a,b,c,a,1
f,k,x,f,2
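The trade-off described above (faster than a nested-loop join, at the cost of a temporary hash table) can be illustrated with a minimal stand-alone sketch. It works on plain [][]string rows rather than on DataTable, and the function name hashJoin and the choice to index the right table are assumptions made for illustration, not the package's implementation.

```go
package main

import (
	"fmt"
	"strings"
)

// hashJoin is a hypothetical equi-join over plain rows. It only
// illustrates the build/probe technique; it is not the package's code.
func hashJoin(left, right [][]string, keyLeft, keyRight func([]string) string) [][]string {
	// Build phase: index every right row by its join key.
	// This map is the temporary hash table that costs extra memory.
	index := make(map[string][][]string)
	for _, r := range right {
		k := keyRight(r)
		index[k] = append(index[k], r)
	}

	// Probe phase: one map lookup per left row replaces a full scan
	// of the right table.
	var out [][]string
	for _, l := range left {
		for _, r := range index[keyLeft(l)] {
			row := make([]string, 0, len(l)+len(r))
			row = append(row, l...)
			row = append(row, r...)
			out = append(out, row)
		}
	}
	return out
}

func main() {
	left := [][]string{{"a", "b", "c"}, {"e", "f", "g"}, {"f", "k", "x"}, {"g", "h", "l"}}
	right := [][]string{{"a", "1"}, {"f", "2"}, {"k", "3"}}
	first := func(row []string) string { return row[0] }

	for _, row := range hashJoin(left, right, first, first) {
		fmt.Println(strings.Join(row, ","))
	}
	// Prints:
	// a,b,c,a,1
	// f,k,x,f,2
}
```

With n left rows and m right rows, this sketch does roughly n+m key computations, whereas a nested-loop join evaluates its condition n*m times.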
func Join ¶
Join performs a relational join between the left and right tables. The join condition is defined by the function fn, which takes two rows, l and r, from the left and right tables respectively, and returns whether the two rows should be joined. The join result is returned as a new data table. Each joined row contains all the fields from the input tables, in the order [left table fields ... right table fields ...].
Example ¶
dt := NewDataTable(3)
dt.AppendRow([]string{"a", "b", "c"})
dt.AppendRow([]string{"e", "f", "g"})
dt.AppendRow([]string{"f", "k", "x"})
dt.AppendRow([]string{"g", "h", "l"})
dt2 := NewDataTable(2)
dt2.AppendRow([]string{"a", "1"})
dt2.AppendRow([]string{"f", "2"})
dt2.AppendRow([]string{"k", "3"})
// Join dt and dt2 on their first columns.
dt3 := Join(dt, dt2, func(l, r []string) bool {
	return l[0] == r[0]
})
dt3.ToCSV(os.Stdout)

Output:
a,b,c,a,1
f,k,x,f,2
func LeftJoin ¶
LeftJoin is similar to Join, except that every row from the left table is part of the join result even if it does not join with any row from the right table. Such a row appears as [left table fields ... empty fields], where the number of empty fields equals the number of columns in the right table.
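The left-join semantics can be sketched on plain [][]string rows. The helper names leftJoin and concat, and the explicit rightCols parameter, are hypothetical and exist only for this illustration; the package's own implementation may differ.

```go
package main

import (
	"fmt"
	"strings"
)

// concat returns a new row holding the fields of l followed by those of r.
func concat(l, r []string) []string {
	row := make([]string, 0, len(l)+len(r))
	row = append(row, l...)
	return append(row, r...)
}

// leftJoin is a hypothetical nested-loop left join over plain rows.
// rightCols is the number of columns in the right table; it determines
// how many empty fields pad a left row that has no match.
func leftJoin(left, right [][]string, rightCols int, fn func(l, r []string) bool) [][]string {
	var out [][]string
	for _, l := range left {
		matched := false
		for _, r := range right {
			if fn(l, r) {
				out = append(out, concat(l, r))
				matched = true
			}
		}
		if !matched {
			// Keep the left row and pad it with empty fields.
			out = append(out, concat(l, make([]string, rightCols)))
		}
	}
	return out
}

func main() {
	left := [][]string{{"a", "b", "c"}, {"e", "f", "g"}, {"f", "k", "x"}, {"g", "h", "l"}}
	right := [][]string{{"a", "1"}, {"f", "2"}, {"k", "3"}}

	joined := leftJoin(left, right, 2, func(l, r []string) bool { return l[0] == r[0] })
	for _, row := range joined {
		fmt.Println(strings.Join(row, ","))
	}
	// Prints:
	// a,b,c,a,1
	// e,f,g,,
	// f,k,x,f,2
	// g,h,l,,
}
```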
func NewDataTable ¶
NewDataTable creates a new data table with a given number of columns.
func (*DataTable) ApplyColumn ¶
ApplyColumn calls the function fn on every value in column y, from the first row to the last. fn takes two arguments: the first is the row index and the second is the corresponding value. If fn returns an error, ApplyColumn stops and returns that error immediately.
Example ¶
dt := NewDataTable(3)
dt.AppendRow([]string{"a", "b", "c"})
dt.AppendRow([]string{"e", "f", "g"})
dt.AppendRow([]string{"f", "k", "x"})
dt.AppendRow([]string{"g", "h", "l"})
// Concatenate all values in the first column
s := ""
dt.ApplyColumn(func(x int, v string) error {
	s += v
	return nil
}, 0)
fmt.Println(s)

Output:
aefg
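The early-return behaviour on error can be sketched with a stand-alone helper over plain [][]string rows. The names applyColumn and errStop are hypothetical, and the loop is an assumption about how such a visitor is typically written, not the package's source.

```go
package main

import (
	"errors"
	"fmt"
)

// errStop is a hypothetical sentinel error used to abort the iteration.
var errStop = errors.New("stop")

// applyColumn is a hypothetical stand-in that visits column y of each row
// in order and returns the first error produced by fn.
func applyColumn(rows [][]string, y int, fn func(int, string) error) error {
	for x, row := range rows {
		if err := fn(x, row[y]); err != nil {
			return err // later rows are never visited
		}
	}
	return nil
}

func main() {
	rows := [][]string{{"a", "b", "c"}, {"e", "f", "g"}, {"f", "k", "x"}, {"g", "h", "l"}}

	visited := 0
	err := applyColumn(rows, 0, func(x int, v string) error {
		if v == "f" {
			return errStop
		}
		visited++
		return nil
	})
	fmt.Println(visited, err)
	// Prints: 2 stop
}
```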
func (*DataTable) ApplyColumns ¶
ApplyColumns calls the function fn on the values of multiple columns, given by their indexes, from the first row to the last. fn takes two arguments: the first is the row index and the second is the corresponding row projected on the given columns. If fn returns an error, ApplyColumns stops and returns that error immediately.
Example ¶
dt := NewDataTable(3)
dt.AppendRow([]string{"a", "b", "c"})
dt.AppendRow([]string{"e", "f", "g"})
dt.AppendRow([]string{"a", "b", "x"})
dt.AppendRow([]string{"e", "h", "l"})
// Count the number of unique pairs in the first two columns
s := make(map[string]bool)
dt.ApplyColumns(func(x int, vs []string) error {
	pair := strings.Join(vs, ",")
	s[pair] = true
	return nil
}, 0, 1)
fmt.Println(len(s))

Output:
3
func (*DataTable) MarshalJSON ¶
MarshalJSON marshals the data table into JSON.
func (*DataTable) Merge ¶
Merge takes another DataTable dt2 and a 1-to-1 mapping from this table's column indexes to dt2's column indexes, then appends new rows to this table with values from dt2.
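How a column mapping drives such a merge can be sketched on plain [][]string rows. The function mergeRows is hypothetical, and leaving unmapped columns as empty strings is an assumption; the description above does not say how the package fills them.

```go
package main

import (
	"fmt"
	"strings"
)

// mergeRows is a hypothetical illustration of a column-mapped merge.
// matches maps a destination column index to a source column index.
// ASSUMPTION: destination columns absent from matches are left empty.
func mergeRows(dst [][]string, ncol int, src [][]string, matches map[int]int) [][]string {
	for _, s := range src {
		row := make([]string, ncol)
		for dstCol, srcCol := range matches {
			row[dstCol] = s[srcCol]
		}
		dst = append(dst, row)
	}
	return dst
}

func main() {
	dst := [][]string{{"a", "b", "c"}}
	src := [][]string{{"1", "x"}, {"2", "y"}}

	// Destination column 0 takes source column 1;
	// destination column 2 takes source column 0.
	merged := mergeRows(dst, 3, src, map[int]int{0: 1, 2: 0})
	for _, row := range merged {
		fmt.Println(strings.Join(row, ","))
	}
	// Prints:
	// a,b,c
	// x,,1
	// y,,2
}
```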
func (*DataTable) Project ¶
Project creates a new DataTable that has only a subset of the columns, which are indicated by the given column indexes.
Example ¶
dt := NewDataTable(3)
dt.AppendRow([]string{"a", "b", "c"})
dt.AppendRow([]string{"e", "f", "g"})
dt.AppendRow([]string{"f", "k", "x"})
dt.AppendRow([]string{"g", "h", "l"})
// Project on the first and the third column
dt2 := dt.Project(0, 2)
dt2.ToCSV(os.Stdout)

Output:
a,c
e,g
f,x
g,l
func (*DataTable) RemoveColumn ¶
RemoveColumn deletes the column at index y.
func (*DataTable) Slice ¶
Slice takes a contiguous subset of at most n rows, starting at index x, and makes a new DataTable from them. Note that, unlike Project, the new DataTable uses the underlying rows of the original DataTable, so changes to the new table may affect the original.
Example ¶
dt := NewDataTable(3)
dt.AppendRow([]string{"a", "b", "c"})
dt.AppendRow([]string{"e", "f", "g"})
dt.AppendRow([]string{"f", "k", "x"})
dt.AppendRow([]string{"g", "h", "l"})
// Take 2 rows starting at the row index 1
dt2 := dt.Slice(1, 2)
dt2.ToCSV(os.Stdout)

Output:
e,f,g
f,k,x
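Why a table built on shared underlying rows can affect the original is easiest to see with ordinary Go slices. The sketch below assumes the rows are held in a [][]string and that the view is a plain sub-slice; sliceRows is a hypothetical helper, and DataTable's internals are unexported and may differ.

```go
package main

import "fmt"

// sliceRows is a hypothetical helper that returns at most n rows starting
// at index x as a sub-slice. The result shares its backing array with rows.
// It assumes 0 <= x <= len(rows).
func sliceRows(rows [][]string, x, n int) [][]string {
	end := x + n
	if end > len(rows) {
		end = len(rows)
	}
	return rows[x:end]
}

func main() {
	rows := [][]string{{"a", "b", "c"}, {"e", "f", "g"}, {"f", "k", "x"}, {"g", "h", "l"}}

	view := sliceRows(rows, 1, 2)
	fmt.Println(len(view), cap(view)) // Prints: 2 3

	// The view still has spare capacity, so append writes into the shared
	// backing array and overwrites the original's last row.
	view = append(view, []string{"z", "z", "z"})
	fmt.Println(rows[3]) // Prints: [z z z]
}
```

A copy-based approach (allocating a new slice and copying the rows into it) avoids this aliasing at the cost of extra memory.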
func (*DataTable) UnmarshalJSON ¶
UnmarshalJSON parses a data table from JSON.