imports

package
v1.0.0 Latest
Published: Apr 5, 2021 License: MIT Imports: 19 Imported by: 0

Documentation

Overview

Package imports provides functionality to read data in other formats in order to populate a DataFrame. It provides the inverse functionality of the exports package.

Index

Constants

This section is empty.

Variables

This section is empty.

Functions

func LoadFromCSV

func LoadFromCSV(ctx context.Context, r io.ReadSeeker, options ...CSVLoadOptions) (*dataframe.DataFrame, error)

LoadFromCSV will load data from a CSV file.

func LoadFromJSON

func LoadFromJSON(ctx context.Context, r io.ReadSeeker, options ...JSONLoadOptions) (*dataframe.DataFrame, error)

LoadFromJSON will load data from a jsonl (JSON Lines) file. The first row determines which fields will be imported for subsequent rows.

func LoadFromParquet

func LoadFromParquet(ctx context.Context, src source.ParquetFile, opts ...ParquetLoadOptions) (*dataframe.DataFrame, error)

LoadFromParquet will load data from a Parquet file.

NOTE: This function is experimental and the implementation is likely to change.

Example (gist):

import (
	"context"

	"github.com/rocketlaunchr/dataframe-go/imports"
	"github.com/xitongsys/parquet-go-source/local"
)

func main() {
	ctx := context.Background()

	fr, err := local.NewLocalFileReader("file.parquet")
	if err != nil {
		panic(err)
	}
	defer fr.Close()

	df, err := imports.LoadFromParquet(ctx, fr)
	if err != nil {
		panic(err)
	}
	_ = df // use df
}

func LoadFromSQL

func LoadFromSQL(ctx context.Context, stmt interface{}, options *SQLLoadOptions, args ...interface{}) (*dataframe.DataFrame, error)

LoadFromSQL will load data from an SQL database. stmt must be a *sql.Stmt or the equivalent from the mysql-go package.

See: https://godoc.org/github.com/rocketlaunchr/mysql-go#Stmt

Types

type CSVLoadOptions

type CSVLoadOptions struct {

	// Comma is the field delimiter.
	// The default value is ',' when CSVLoadOptions is not provided.
	// Comma must be a valid rune and must not be \r, \n,
	// or the Unicode replacement character (0xFFFD).
	Comma rune

	// Comment, if not 0, is the comment character. Lines beginning with the
	// Comment character without preceding whitespace are ignored.
	// With leading whitespace the Comment character becomes part of the
	// field, even if TrimLeadingSpace is true.
	// Comment must be a valid rune and must not be \r, \n,
	// or the Unicode replacement character (0xFFFD).
	// It must also not be equal to Comma.
	Comment rune

	// If TrimLeadingSpace is true, leading white space in a field is ignored.
	// This is done even if the field delimiter, Comma, is white space.
	TrimLeadingSpace bool

	// LargeDataSet should be set to true for large datasets.
	// It will set the capacity of the underlying slices of the Dataframe by performing a basic parse
	// of the full dataset before processing the data fully.
	// Preallocating memory can provide speed improvements. Benchmarks should be performed for your use-case.
	LargeDataSet bool

	// DictateDataType is used to inform LoadFromCSV what the true underlying data type is for a given field name.
	// The key must be the case-sensitive field name.
	// The value for a given key must be of the data type of the data.
	// eg. For a string use "". For an int64 use int64(0). What is relevant is the data type and not the value itself.
	//
	// NOTE: A custom Series must implement the NewSerieser interface and be able to interpret strings to work.
	DictateDataType map[string]interface{}

	// NilValue allows you to set what string value in the CSV file should be interpreted as a nil value for
	// the purposes of insertion.
	//
	// Common values are: NULL, \N, NaN, NA
	NilValue *string

	// InferDataTypes can be set to true if the underlying data type should be automatically detected.
	// Using DictateDataType is the recommended approach (especially for large datasets or memory constrained systems).
	// DictateDataType always takes precedence when determining the type.
	// If the data type could not be detected, NewSeriesString is used.
	InferDataTypes bool

	// Headers must be set if the CSV file does not contain a header row. This must be nil if the CSV file contains a
	// header row.
	Headers []string
}

CSVLoadOptions is likely to change.

type Converter

type Converter struct {
	ConcreteType  interface{}
	ConverterFunc GenericDataConverter
}

Converter is used to convert input data into a generic data type. This is required when importing data for a Generic Series ("dataframe.SeriesGeneric"). As a special case, if ConcreteType is time.Time, then a SeriesTime is used.

Example:

opts := imports.CSVLoadOptions{
   DictateDataType: map[string]interface{}{
      "Date": imports.Converter{
         ConcreteType: time.Time{},
         ConverterFunc: func(in interface{}) (interface{}, error) {
            return time.Parse("2006-01-02", in.(string))
         },
      },
   },
}

type Database

type Database int

Database is used to set the Database. Different databases have different syntax for placeholders etc.

const (
	// PostgreSQL database
	PostgreSQL Database = 0
	// MySQL database
	MySQL Database = 1
)

type GenericDataConverter

type GenericDataConverter func(in interface{}) (interface{}, error)

GenericDataConverter is used to convert input data into a generic data type. This is required when importing data for a Generic Series ("SeriesGeneric").

type JSONLoadOptions

type JSONLoadOptions struct {

	// LargeDataSet should be set to true for large datasets.
	// It will set the capacity of the underlying slices of the Dataframe by performing a basic parse
	// of the full dataset before processing the data fully.
	// Preallocating memory can provide speed improvements. Benchmarks should be performed for your use-case.
	LargeDataSet bool

	// DictateDataType is used to inform LoadFromJSON what the true underlying data type is for a given field name.
	// The key must be the case-sensitive field name.
	// The value for a given key must be of the data type of the data.
	// eg. For a string use "". For an int64 use int64(0). What is relevant is the data type and not the value itself.
	//
	// NOTE: A custom Series must implement the NewSerieser interface and be able to interpret strings to work.
	DictateDataType map[string]interface{}

	// ErrorOnUnknownFields will generate an error if an unknown field is encountered after the first row.
	ErrorOnUnknownFields bool
}

JSONLoadOptions is likely to change.

type ParquetLoadOptions

type ParquetLoadOptions struct {
}

ParquetLoadOptions is likely to change.

type SQLLoadOptions

type SQLLoadOptions struct {

	// KnownRowCount is used to set the capacity of the underlying slices of the Dataframe.
	// The maximum number of rows supported (on a 64-bit machine) is 9,223,372,036,854,775,807 (the maximum value of a signed 64-bit integer).
	// Preallocating memory can provide speed improvements. Benchmarks should be performed for your use-case.
	//
	// WARNING: Some databases may allow tables to contain more rows than the maximum supported.
	KnownRowCount *int

	// DictateDataType is used to inform LoadFromSQL what the true underlying data type is for a given column name.
	// The key must be the case-sensitive column name.
	// The value for a given key must be of the data type of the data.
	// eg. For a string use "". For an int64 use int64(0). What is relevant is the data type and not the value itself.
	//
	// NOTE: A custom Series must implement the NewSerieser interface and be able to interpret strings to work.
	DictateDataType map[string]interface{}

	// Database is used to set the Database.
	Database Database

	// Query can be set to the sql stmt if a *sql.DB, *sql.Tx, *sql.Conn or the equivalent from the mysql-go package is provided.
	//
	// See: https://godoc.org/github.com/rocketlaunchr/mysql-go
	Query string
}

SQLLoadOptions is likely to change.
