import "github.com/cockroachdb/cockroach/pkg/workload"
Package workload provides an abstraction for generators of SQL query loads (and the requisite initial data), as well as tools for working with these generators.
connection.go csv.go driver.go histogram.go pgx_helpers.go sql_runner.go workload.go
ApproxDatumSize returns the canonical size of a datum as returned from a call to `Table.InitialRowFn`. NB: These datums end up getting serialized in different ways, which means there's no one size that will be correct for all of them.
CSVMux returns a mux over HTTP handlers for CSV data in all tables in the given generators.
HandleCSV configures a Generator with url params and outputs the data for a single Table as a CSV (optionally limiting the rows via `row-start` and `row-end` params). It is intended for use in implementing a `net/http.Handler`.
Register is a hook for init-time registration of Generator implementations. This allows only the necessary generators to be compiled into a given binary.
SanitizeUrls verifies that the given SQL connection strings have the correct SQL database set, rewriting them in place if necessary. The database name is returned.
func Setup(
ctx context.Context, db *gosql.DB, gen Generator, batchSize, concurrency int,
) (int64, error)
Setup creates the given tables and fills them with initial data via batched INSERTs. batchSize will only be used when positive (but INSERTs are batched either way). The function is idempotent and can be called multiple times if the Generator does not have any initial rows.
The size of the loaded data is returned in bytes, suitable for use with SetBytes of benchmarks. The exact definition of this is deferred to the ApproxDatumSize implementation.
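To illustrate the batched-INSERT approach Setup describes, the following sketch builds one multi-row INSERT statement per batch of initial rows. The helper `buildBatchInsert` is hypothetical, not the package's actual implementation, and only handles integer and string datums.

```go
package main

import (
	"fmt"
	"strings"
)

// buildBatchInsert is a hypothetical helper showing one multi-row INSERT per
// batch, as Setup's "batched INSERTs" suggests. Only strings and values that
// render cleanly with %v are handled here.
func buildBatchInsert(table string, rows [][]interface{}) string {
	var sb strings.Builder
	fmt.Fprintf(&sb, "INSERT INTO %s VALUES ", table)
	for i, row := range rows {
		if i > 0 {
			sb.WriteString(", ")
		}
		sb.WriteString("(")
		for j, datum := range row {
			if j > 0 {
				sb.WriteString(", ")
			}
			switch v := datum.(type) {
			case string:
				// Escape single quotes for SQL string literals.
				fmt.Fprintf(&sb, "'%s'", strings.ReplaceAll(v, "'", "''"))
			default:
				fmt.Fprintf(&sb, "%v", v)
			}
		}
		sb.WriteString(")")
	}
	return sb.String()
}

func main() {
	rows := [][]interface{}{{0, "a"}, {1, "b"}}
	fmt.Println(buildBatchInsert("bank", rows))
	// → INSERT INTO bank VALUES (0, 'a'), (1, 'b')
}
```

In the real package a batch comes from `BatchedTuples.Batch(i)`, and batches are inserted concurrently across `concurrency` workers.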
Split creates the range splits defined by the given table.
StringTuple returns the given datums as strings suitable for use directly in SQL.
TODO(dan): Remove this once SCATTER supports placeholders.
func WriteCSVRows(
ctx context.Context, w io.Writer, table Table, rowStart, rowEnd int, sizeBytesLimit int64,
) (rowBatchIdx int, err error)
WriteCSVRows writes the specified table rows as a CSV. If sizeBytesLimit is > 0, it will be used as an approximate upper bound for how much to write. The next rowStart is returned (i.e., the last row written + 1).
type BatchedTuples struct {
// NumBatches is the number of batches of tuples.
NumBatches int
// NumTotal is the total number of tuples in all batches. Not all generators
// will know this, it's set to 0 when unknown.
NumTotal int
// Batch is a function to deterministically compute a batch of tuples given
// its index.
Batch func(int) [][]interface{}
}
BatchedTuples is a generic generator of tuples (SQL rows, PKs to split at, etc.). Tuples are generated in batches of arbitrary size. Each batch has an index in `[0,NumBatches)` and a batch can be generated given only its index.
func Tuples(count int, fn func(int) []interface{}) BatchedTuples
Tuples returns a BatchedTuples where each batch has size 1.
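To make the batching contract concrete, here is a sketch using local mirrors of the documented BatchedTuples struct and Tuples helper (the real ones live in pkg/workload; the field set below is taken from the docs above).

```go
package main

import "fmt"

// BatchedTuples mirrors the documented struct, for illustration only.
type BatchedTuples struct {
	NumBatches int
	NumTotal   int
	Batch      func(int) [][]interface{}
}

// Tuples mirrors the documented helper: each batch has size 1, so batch i
// contains exactly the single tuple fn(i).
func Tuples(count int, fn func(int) []interface{}) BatchedTuples {
	return BatchedTuples{
		NumBatches: count,
		NumTotal:   count,
		Batch: func(i int) [][]interface{} {
			return [][]interface{}{fn(i)}
		},
	}
}

func main() {
	rows := Tuples(3, func(i int) []interface{} {
		return []interface{}{i, fmt.Sprintf("row-%d", i)}
	})
	// Batches are deterministic from the index alone, so they can be
	// generated independently (e.g. on different nodes) with no coordination.
	fmt.Println(rows.NumBatches, rows.Batch(1))
}
```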
type ConnFlags struct {
*pflag.FlagSet
DBOverride string
Concurrency int
// Method for issuing queries; see SQLRunner.
Method string
}
ConnFlags is a helper for common flags that are relevant to QueryLoads.
NewConnFlags returns an initialized ConnFlags.
type FlagMeta struct {
// RuntimeOnly may be set to true only if the corresponding flag has no
// impact on the behavior of any Tables in this workload.
RuntimeOnly bool
// CheckConsistencyOnly is expected to be true only if the corresponding
// flag only has an effect on the CheckConsistency hook.
CheckConsistencyOnly bool
}
FlagMeta is metadata about a workload flag.
type Flags struct {
*pflag.FlagSet
// Meta is keyed by flag name and may be nil if no metadata is needed.
Meta map[string]FlagMeta
}
Flags is a container for flags and associated metadata.
Flagser returns the flags this Generator is configured with. Any randomness in the Generator must be deterministic from these options so that table data initialization, query work, etc. can be distributed by sending only these flags.
type Generator interface {
// Meta returns meta information about this generator, including a name,
// description, and a function to create instances of it.
Meta() Meta
// Tables returns the set of tables for this generator, including schemas
// and initial data.
Tables() []Table
}
Generator represents one or more SQL query loads and associated initial data.
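A minimal Generator implementation might look like the following sketch. It uses local stand-ins for Meta, Table, and Generator (pared down to the fields documented on this page); the toy `sequence` generator and its schema are invented for illustration.

```go
package main

import "fmt"

// Local stand-ins for the documented types, to show the shape of an
// implementation. The real types live in pkg/workload.
type Meta struct {
	Name        string
	Description string
	Version     string
	New         func() Generator
}

type Table struct {
	Name   string // unqualified, pre-escaped for use directly in SQL
	Schema string // `CREATE TABLE <name>` prefix omitted
}

type Generator interface {
	Meta() Meta
	Tables() []Table
}

// sequence is a hypothetical generator with one table of sequential integers.
type sequence struct{ rows int }

func (g *sequence) Meta() Meta {
	return Meta{
		Name:        "sequence",
		Description: "a single table of sequential integers",
		Version:     "1.0.0",
		New:         func() Generator { return &sequence{rows: 10} },
	}
}

func (g *sequence) Tables() []Table {
	return []Table{{
		Name:   "seq",
		Schema: "(n INT PRIMARY KEY)",
	}}
}

func main() {
	var g Generator = &sequence{rows: 10}
	fmt.Println(g.Meta().Name, len(g.Tables()))
}
```

In the real package, such a generator would call Register(meta) from an init() func and would typically also implement Flagser, Hookser, and Opser.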
type HistogramRegistry struct {
// contains filtered or unexported fields
}
HistogramRegistry is a thread-safe enclosure for a (possibly large) number of named histograms. It allows for "tick"ing them periodically to reset the counts and also supports aggregations.
func NewHistogramRegistry() *HistogramRegistry
NewHistogramRegistry returns an initialized HistogramRegistry.
func (w *HistogramRegistry) GetHandle() *Histograms
GetHandle returns a thread-local handle for creating and registering NamedHistograms.
func (w *HistogramRegistry) Tick(fn func(HistogramTick))
Tick aggregates all registered histograms, grouped by name. It is expected to be called periodically from one goroutine.
type HistogramTick struct {
// Name is the name given to the histograms represented by this tick.
Name string
// Hist is the merged result of the represented histograms for this tick.
// Hist.TotalCount() is the number of operations that occurred for this tick.
Hist *hdrhistogram.Histogram
// Cumulative is the merged result of the represented histograms for all
// time. Cumulative.TotalCount() is the total number of operations that have
// occurred over all time.
Cumulative *hdrhistogram.Histogram
// Elapsed is the amount of time since the last tick.
Elapsed time.Duration
// Now is the time at which the tick was gathered. It covers the period
// [Now-Elapsed,Now).
Now time.Time
}
HistogramTick is an aggregation of ticking all histograms in a HistogramRegistry with a given name.
func (t HistogramTick) Snapshot() SnapshotTick
Snapshot creates a SnapshotTick from the receiver.
type Histograms struct {
// contains filtered or unexported fields
}
Histograms is a thread-local handle for creating and registering NamedHistograms.
func (w *Histograms) Get(name string) *NamedHistogram
Get returns a NamedHistogram with the given name, creating and registering it if necessary. The result is cached, so callers do not need to cache it themselves.
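The caching behavior of Get can be sketched as a name-keyed map behind a mutex. This is an illustration of the documented contract (repeated calls with the same name return the same handle), not the package's actual code, and NamedHistogram is reduced to a stub.

```go
package main

import (
	"fmt"
	"sync"
)

// NamedHistogram is a stub standing in for the real type.
type NamedHistogram struct{ name string }

// Histograms sketches the documented caching: Get returns the same
// *NamedHistogram for repeated calls with the same name.
type Histograms struct {
	mu   sync.Mutex
	hist map[string]*NamedHistogram
}

func (w *Histograms) Get(name string) *NamedHistogram {
	w.mu.Lock()
	defer w.mu.Unlock()
	if w.hist == nil {
		w.hist = make(map[string]*NamedHistogram)
	}
	h, ok := w.hist[name]
	if !ok {
		h = &NamedHistogram{name: name}
		w.hist[name] = h
	}
	return h
}

func main() {
	hists := &Histograms{}
	a, b := hists.Get("read"), hists.Get("read")
	fmt.Println(a == b) // same cached handle
}
```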
type Hooks struct {
// Validate is called after workload flags are parsed. It should return an
// error if the workload configuration is invalid.
Validate func() error
// PreLoad is called after workload tables are created and before workload
// data is loaded. It is not called when storing or loading a fixture.
// Implementations should be idempotent.
PreLoad func(*gosql.DB) error
// PostLoad is called after workload tables are created and workload data is
// loaded. It is also called after restoring a fixture. This, for example, is
// where creating foreign keys should go. Implementations should be idempotent.
PostLoad func(*gosql.DB) error
// PostRun is called after workload run has ended, with the duration of the
// run. This is where any post-run special printing or validation can be done.
PostRun func(time.Duration) error
// CheckConsistency is called to run generator-specific consistency checks.
// These are expected to pass after the initial data load as well as after
// running queryload.
CheckConsistency func(context.Context, *gosql.DB) error
}
Hooks stores functions to be called at points in the workload lifecycle.
Hookser returns any hooks associated with the generator.
type Meta struct {
// Name is a unique name for this generator.
Name string
// Description is a short description of this generator.
Description string
// Version is a semantic version for this generator. It should be bumped
// whenever InitialRowFn or InitialRowCount change for any of the tables.
Version string
// New returns an unconfigured instance of this generator.
New func() Generator
}
Meta is used to register a Generator at init time and holds meta information about this generator, including a name, description, and a function to create instances of it.
Get returns the registered Generator with the given name, if it exists.
Registered returns all registered Generators.
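The Register/Get pair behaves like an init-time registry keyed by Meta.Name. The sketch below shows the pattern with simplified local types (Meta.New returns interface{} here instead of Generator); it is an illustration of the documented behavior, not the package's code.

```go
package main

import "fmt"

// Meta here is a pared-down stand-in for workload.Meta.
type Meta struct {
	Name string
	New  func() interface{} // stands in for func() Generator
}

// registered sketches the registry behind Register/Get/Registered.
var registered = map[string]Meta{}

// Register panics on duplicate names, since each generator should register
// exactly once (from an init() func in its own package).
func Register(m Meta) {
	if _, ok := registered[m.Name]; ok {
		panic(fmt.Sprintf("generator %q already registered", m.Name))
	}
	registered[m.Name] = m
}

// Get returns the registered Meta with the given name, if it exists.
func Get(name string) (Meta, error) {
	m, ok := registered[name]
	if !ok {
		return Meta{}, fmt.Errorf("unknown generator: %s", name)
	}
	return m, nil
}

func main() {
	// In the real package, each generator package calls Register from init(),
	// so only the generators compiled into the binary appear in the registry.
	Register(Meta{Name: "bank", New: func() interface{} { return nil }})
	m, err := Get("bank")
	fmt.Println(m.Name, err)
}
```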
MultiConnPool maintains a set of pgx ConnPools (to different servers).
func NewMultiConnPool(maxTotalConnections int, urls ...string) (*MultiConnPool, error)
NewMultiConnPool creates a new MultiConnPool (with one pool per url). The pools have approximately the same number of max connections, adding up to maxTotalConnections.
func (m *MultiConnPool) Close()
Close closes all the pools.
func (m *MultiConnPool) Get() *pgx.ConnPool
Get returns one of the pools, in round-robin manner.
func (m *MultiConnPool) PrepareEx(ctx context.Context, name, sql string, opts *pgx.PrepareExOptions) (*pgx.PreparedStatement, error)
PrepareEx prepares the given statement on all the pools.
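The round-robin Get can be sketched with an atomic counter indexing into the slice of per-URL pools, which spreads workers across servers without locking. This is an illustration of the documented behavior with a stub `pool` type, not the package's implementation.

```go
package main

import (
	"fmt"
	"sync/atomic"
)

// pool is a stub standing in for *pgx.ConnPool.
type pool struct{ url string }

// multiConnPool sketches round-robin selection across per-URL pools.
type multiConnPool struct {
	counter uint32
	pools   []*pool
}

// Get returns one of the pools, in round-robin manner.
func (m *multiConnPool) Get() *pool {
	i := atomic.AddUint32(&m.counter, 1) - 1
	return m.pools[i%uint32(len(m.pools))]
}

func main() {
	m := &multiConnPool{pools: []*pool{{"node1"}, {"node2"}, {"node3"}}}
	for i := 0; i < 4; i++ {
		fmt.Print(m.Get().url, " ")
	}
	fmt.Println()
	// node1 node2 node3 node1
}
```

The atomic counter makes Get safe to call from many worker goroutines concurrently, which matches how QueryLoad WorkerFns typically use a shared MultiConnPool.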
type NamedHistogram struct {
// contains filtered or unexported fields
}
NamedHistogram is a named histogram for use in Operations. It is thread-safe but intended to be thread-local.
func (w *NamedHistogram) Record(elapsed time.Duration)
Record saves a new datapoint and should be called once per logical operation.
Opser returns the work functions for this generator. The tables are required to have been created and initialized before running these.
PgxTx is a thin wrapper that implements the crdb.Tx interface, allowing pgx transactions to be used with ExecuteInTx.
Commit is part of the crdb.Tx interface.
func (tx *PgxTx) ExecContext(ctx context.Context, sql string, args ...interface{}) (gosql.Result, error)
ExecContext is part of the crdb.Tx interface.
Rollback is part of the crdb.Tx interface.
type QueryLoad struct {
SQLDatabase string
// WorkerFns is one function per worker. It is to be called once per unit of
// work to be done.
WorkerFns []func(context.Context) error
// Close, if set, is called before the process exits, giving workloads a
// chance to print some information.
// It's guaranteed that the ctx passed to WorkerFns (if they're still running)
// has been canceled by the time this is called (so an implementer can
// synchronize with the WorkerFns if need be).
Close func(context.Context)
// ResultHist is the name of the NamedHistogram to use for the benchmark
// formatted results output at the end of `./workload run`. The empty string
// will use the sum of all histograms.
//
// TODO(dan): This will go away once more of run.go moves inside Operations.
ResultHist string
}
QueryLoad represents some SQL query workload performable on a database initialized with the requisite tables.
type SQLRunner struct {
// contains filtered or unexported fields
}
SQLRunner is a helper for issuing SQL statements; it supports multiple methods for issuing queries.
Queries need to first be defined using calls to Define. Then the runner must be initialized, after which we can use the handles returned by Define.
Sample usage:
sr := &workload.SQLRunner{}
sel := sr.Define("SELECT x FROM t WHERE y = $1")
ins := sr.Define("INSERT INTO t(x, y) VALUES ($1, $2)")
err := sr.Init(ctx, conn, flags)
// [handle err]
row := sel.QueryRow(1)
// [use row]
_, err = ins.Exec(5, 6)
// [handle err]
A runner should typically be associated with a single worker.
func (sr *SQLRunner) Define(sql string) StmtHandle
Define creates a handle for the given statement. The handle can be used after Init is called.
func (sr *SQLRunner) Init(ctx context.Context, name string, mcp *MultiConnPool, flags *ConnFlags) error
Init initializes the runner; must be called after calls to Define and before the StmtHandles are used.
The name is used for naming prepared statements. Multiple workers that use the same set of defined queries can and should use the same name.
The way we issue queries is set by flags.Method:
- "prepare": we prepare the query once during Init, then we reuse it for each execution. This results in a Bind and Execute on the server each time we run a query (on the given connection). Note that it's important to prepare on separate connections if there are many parallel workers; this avoids lock contention in the sql.Rows objects they produce. See #30811.
- "noprepare": each query is issued separately (on the given connection). This results in Parse, Bind, Execute on the server each time we run a query.
- "simple": each query is issued in a single string; parameters are rendered inside the string. This results in a single SimpleExecute request to the server for each query. Note that only a few parameter types are supported.
type SnapshotTick struct {
Name string
Hist *hdrhistogram.Snapshot
Elapsed time.Duration
Now time.Time
}
SnapshotTick parallels HistogramTick but replaces the histogram with a snapshot that is suitable for serialization. Additionally, it only contains the per-tick histogram, not the cumulative histogram. (The cumulative histogram can be computed by aggregating all of the per-tick histograms.)
type StmtHandle struct {
// contains filtered or unexported fields
}
StmtHandle is associated with a (possibly prepared) statement; created by SQLRunner.Define.
func (h StmtHandle) Exec(ctx context.Context, args ...interface{}) (pgx.CommandTag, error)
Exec executes a query that doesn't return rows. The query is executed on the connection that was passed to SQLRunner.Init.
See pgx.Conn.Exec.
func (h StmtHandle) ExecTx(ctx context.Context, tx *pgx.Tx, args ...interface{}) (pgx.CommandTag, error)
ExecTx executes a query that doesn't return rows, inside a transaction.
See pgx.Conn.Exec.
Query executes a query that returns rows.
See pgx.Conn.Query.
QueryRow executes a query that is expected to return at most one row.
See pgx.Conn.QueryRow.
QueryRowTx executes a query that is expected to return at most one row, inside a transaction.
See pgx.Conn.QueryRow.
func (h StmtHandle) QueryTx(ctx context.Context, tx *pgx.Tx, args ...interface{}) (*pgx.Rows, error)
QueryTx executes a query that returns rows, inside a transaction.
See pgx.Tx.Query.
type Table struct {
// Name is the unqualified table name, pre-escaped for use directly in SQL.
Name string
// Schema is the SQL formatted schema for this table, with the `CREATE TABLE
// <name>` prefix omitted.
Schema string
// InitialRows is the initial rows that will be present in the table after
// setup is completed.
InitialRows BatchedTuples
// Splits is the initial splits that will be present in the table after
// setup is completed.
Splits BatchedTuples
}
Table represents a single table in a Generator. It includes a name, schema, and initial data.
| Path | Synopsis |
|---|---|
| bank | |
| cli | |
| examples | |
| interleavedpartitioned | |
| jsonload | |
| kv | |
| ledger | |
| querybench | |
| queue | |
| tpcc | |
| tpch | |
| ycsb | Package ycsb is the workload specified by the Yahoo! Cloud Serving Benchmark. |
Package workload imports 30 packages and is imported by 17 packages. Updated 2018-10-18.