cockroach: github.com/cockroachdb/cockroach/pkg/workload

package workload

import "github.com/cockroachdb/cockroach/pkg/workload"

Package workload provides an abstraction for generators of SQL query loads (and requisite initial data) as well as tools for working with these generators.

Package Files

connection.go csv.go driver.go histogram.go pgx_helpers.go sql_runner.go workload.go

func ApproxDatumSize

func ApproxDatumSize(x interface{}) int64

ApproxDatumSize returns the canonical size of a datum as returned from a call to `Table.InitialRowFn`. NB: These datums end up getting serialized in different ways, which means there's no one size that will be correct for all of them.

func CSVMux

func CSVMux(metas []Meta) *http.ServeMux

CSVMux returns a mux over HTTP handlers for CSV data in all tables in the given generators.

func HandleCSV

func HandleCSV(w http.ResponseWriter, req *http.Request, prefix string, meta Meta) error

HandleCSV configures a Generator with url params and outputs the data for a single Table as a CSV (optionally limiting the rows via `row-start` and `row-end` params). It is intended for use in implementing a `net/http.Handler`.

func Register

func Register(m Meta)

Register is a hook for init-time registration of Generator implementations. This allows only the necessary generators to be compiled into a given binary.

func SanitizeUrls

func SanitizeUrls(gen Generator, dbOverride string, urls []string) (string, error)

SanitizeUrls verifies that the given SQL connection strings have the correct SQL database set, rewriting them in place if necessary. This database name is returned.

func Setup

func Setup(
    ctx context.Context, db *gosql.DB, gen Generator, batchSize, concurrency int,
) (int64, error)

Setup creates the given tables and fills them with initial data via batched INSERTs. batchSize will only be used when positive (but INSERTs are batched either way). The function is idempotent and can be called multiple times if the Generator does not have any initial rows.

The size of the loaded data is returned in bytes, suitable for use with SetBytes of benchmarks. The exact definition of this is deferred to the ApproxDatumSize implementation.
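To make the batching concrete, the following is a minimal sketch of how one batch of rows might be rendered as a single multi-row INSERT with numbered placeholders. The helper name buildBatchInsert and the table name are hypothetical; this is an illustration of the idea, not the code Setup actually runs.

```go
package main

import (
	"fmt"
	"strings"
)

// buildBatchInsert is a hypothetical helper (not part of package workload).
// It renders one multi-row INSERT with $N placeholders for the given rows
// and returns the statement together with the flattened arguments.
func buildBatchInsert(table string, rows [][]interface{}) (string, []interface{}) {
	var sb strings.Builder
	var args []interface{}
	fmt.Fprintf(&sb, "INSERT INTO %s VALUES ", table)
	for i, row := range rows {
		if i > 0 {
			sb.WriteString(", ")
		}
		sb.WriteString("(")
		for j, datum := range row {
			if j > 0 {
				sb.WriteString(", ")
			}
			args = append(args, datum)
			// The placeholder number is the position of the datum in args.
			fmt.Fprintf(&sb, "$%d", len(args))
		}
		sb.WriteString(")")
	}
	return sb.String(), args
}

func main() {
	rows := [][]interface{}{{1, "a"}, {2, "b"}}
	stmt, args := buildBatchInsert("t", rows)
	fmt.Println(stmt)
	fmt.Println(len(args))
}
```

A caller would pass the statement and arguments to something like db.ExecContext once per batch, which keeps the number of round trips proportional to the number of batches rather than the number of rows.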

func Split

func Split(ctx context.Context, db *gosql.DB, table Table, concurrency int) error

Split creates the range splits defined by the given table.

func StringTuple

func StringTuple(datums []interface{}) []string

StringTuple returns the given datums as strings suitable for use directly in SQL.

TODO(dan): Remove this once SCATTER supports placeholders.

func WriteCSVRows

func WriteCSVRows(
    ctx context.Context, w io.Writer, table Table, rowStart, rowEnd int, sizeBytesLimit int64,
) (rowBatchIdx int, err error)

WriteCSVRows writes the specified table rows as a csv. If sizeBytesLimit is > 0, it will be used as an approximate upper bound for how much to write. The next rowStart is returned (so last row written + 1).

type BatchedTuples

type BatchedTuples struct {
    // NumBatches is the number of batches of tuples.
    NumBatches int
    // NumTotal is the total number of tuples in all batches. Not all generators
    // will know this, it's set to 0 when unknown.
    NumTotal int
    // Batch is a function to deterministically compute a batch of tuples given
    // its index.
    Batch func(int) [][]interface{}
}

BatchedTuples is a generic generator of tuples (SQL rows, PKs to split at, etc). Tuples are generated in batches of arbitrary size. Each batch has an index in `[0,NumBatches)` and a batch can be generated given only its index.

func Tuples

func Tuples(count int, fn func(int) []interface{}) BatchedTuples

Tuples returns a BatchedTuples where each batch has size 1.
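As an illustration of the index-addressable batches described above, here is a small sketch. The lower-case batchedTuples type and tuples function are local stand-ins written from the doc comments so that the example compiles on its own; they are assumptions, not copies of the package's code.

```go
package main

import "fmt"

// batchedTuples mirrors the documented BatchedTuples struct as a local
// stand-in so that this sketch compiles without the workload package.
type batchedTuples struct {
	NumBatches int
	NumTotal   int
	Batch      func(int) [][]interface{}
}

// tuples sketches the documented behavior of Tuples: each batch holds exactly
// one tuple, so batch i is just fn(i). Setting NumTotal to count is an
// assumption of this sketch.
func tuples(count int, fn func(int) []interface{}) batchedTuples {
	return batchedTuples{
		NumBatches: count,
		NumTotal:   count,
		Batch: func(batchIdx int) [][]interface{} {
			return [][]interface{}{fn(batchIdx)}
		},
	}
}

func main() {
	bt := tuples(3, func(i int) []interface{} {
		return []interface{}{i, fmt.Sprintf("row-%d", i)}
	})
	// Each batch depends only on its index, so batches can be generated
	// independently and in any order (here: in reverse).
	for idx := bt.NumBatches - 1; idx >= 0; idx-- {
		fmt.Println(idx, bt.Batch(idx))
	}
}
```

Since a batch is a pure function of its index, different workers can each take a disjoint range of indexes and produce the same data a single worker would.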

type ConnFlags

type ConnFlags struct {
    *pflag.FlagSet
    DBOverride  string
    Concurrency int
    // Method for issuing queries; see SQLRunner.
    Method string
}

ConnFlags is a helper of common flags that are relevant to QueryLoads.

func NewConnFlags

func NewConnFlags(genFlags *Flags) *ConnFlags

NewConnFlags returns an initialized ConnFlags.

type FlagMeta

type FlagMeta struct {
    // RuntimeOnly may be set to true only if the corresponding flag has no
    // impact on the behavior of any Tables in this workload.
    RuntimeOnly bool
    // CheckConsistencyOnly is expected to be true only if the corresponding
    // flag only has an effect on the CheckConsistency hook.
    CheckConsistencyOnly bool
}

FlagMeta is metadata about a workload flag.

type Flags

type Flags struct {
    *pflag.FlagSet
    // Meta is keyed by flag name and may be nil if no metadata is needed.
    Meta map[string]FlagMeta
}

Flags is a container for flags and associated metadata.

type Flagser

type Flagser interface {
    Generator
    Flags() Flags
}

Flagser returns the flags this Generator is configured with. Any randomness in the Generator must be deterministic from these options so that table data initialization, query work, etc can be distributed by sending only these flags.

type Generator

type Generator interface {
    // Meta returns meta information about this generator, including a name,
    // description, and a function to create instances of it.
    Meta() Meta

    // Tables returns the set of tables for this generator, including schemas
    // and initial data.
    Tables() []Table
}

Generator represents one or more SQL query loads and associated initial data.

type HistogramRegistry

type HistogramRegistry struct {
    // contains filtered or unexported fields
}

HistogramRegistry is a thread-safe enclosure for a (possibly large) number of named histograms. It allows for "tick"ing them periodically to reset the counts and also supports aggregations.

func NewHistogramRegistry

func NewHistogramRegistry() *HistogramRegistry

NewHistogramRegistry returns an initialized HistogramRegistry.

func (*HistogramRegistry) GetHandle

func (w *HistogramRegistry) GetHandle() *Histograms

GetHandle returns a thread-local handle for creating and registering NamedHistograms.

func (*HistogramRegistry) Tick

func (w *HistogramRegistry) Tick(fn func(HistogramTick))

Tick aggregates all registered histograms, grouped by name. It is expected to be called periodically from one goroutine.

type HistogramTick

type HistogramTick struct {
    // Name is the name given to the histograms represented by this tick.
    Name string
    // Hist is the merged result of the represented histograms for this tick.
    // Hist.TotalCount() is the number of operations that occurred for this tick.
    Hist *hdrhistogram.Histogram
    // Cumulative is the merged result of the represented histograms for all
    // time. Cumulative.TotalCount() is the total number of operations that have
    // occurred over all time.
    Cumulative *hdrhistogram.Histogram
    // Elapsed is the amount of time since the last tick.
    Elapsed time.Duration
    // Now is the time at which the tick was gathered. It covers the period
    // [Now-Elapsed,Now).
    Now time.Time
}

HistogramTick is an aggregation of ticking all histograms in a HistogramRegistry with a given name.

func (HistogramTick) Snapshot

func (t HistogramTick) Snapshot() SnapshotTick

Snapshot creates a SnapshotTick from the receiver.

type Histograms

type Histograms struct {
    // contains filtered or unexported fields
}

Histograms is a thread-local handle for creating and registering NamedHistograms.

func (*Histograms) Get

func (w *Histograms) Get(name string) *NamedHistogram

Get returns a NamedHistogram with the given name, creating and registering it if necessary. The result is cached, so there is no need to cache it in the workload.

type Hooks

type Hooks struct {
    // Validate is called after workload flags are parsed. It should return an
    // error if the workload configuration is invalid.
    Validate func() error
    // PreLoad is called after workload tables are created and before workload
    // data is loaded. It is not called when storing or loading a fixture.
    // Implementations should be idempotent.
    PreLoad func(*gosql.DB) error
    // PostLoad is called after workload tables are created and workload data
    // is loaded. It is also called after restoring a fixture. This, for
    // example, is where creating foreign keys should go. Implementations
    // should be idempotent.
    PostLoad func(*gosql.DB) error
    // PostRun is called after workload run has ended, with the duration of the
    // run. This is where any post-run special printing or validation can be done.
    PostRun func(time.Duration) error
    // CheckConsistency is called to run generator-specific consistency checks.
    // These are expected to pass after the initial data load as well as after
    // running queryload.
    CheckConsistency func(context.Context, *gosql.DB) error
}

Hooks stores functions to be called at points in the workload lifecycle.

type Hookser

type Hookser interface {
    Generator
    Hooks() Hooks
}

Hookser returns any hooks associated with the generator.
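Hookser, like Flagser and Opser, embeds Generator and adds one method, so a caller can discover the extra capability with a type assertion. The sketch below shows that pattern with local stand-in types; generator, hooks, hookser, and the two example generators are hypothetical and much smaller than the real types (for instance, the stand-in generator has a Name method instead of Meta and Tables).

```go
package main

import "fmt"

// Local stand-ins that mirror the shape of the documented interfaces; the
// real types live in package workload and carry more methods and fields.
type generator interface {
	Name() string
}

type hooks struct {
	Validate func() error
}

type hookser interface {
	generator
	Hooks() hooks
}

// plainGen implements only the base interface.
type plainGen struct{}

func (plainGen) Name() string { return "plain" }

// checkedGen additionally provides hooks.
type checkedGen struct{ rows int }

func (checkedGen) Name() string { return "checked" }

func (g checkedGen) Hooks() hooks {
	return hooks{Validate: func() error {
		if g.rows < 0 {
			return fmt.Errorf("rows must be non-negative, got %d", g.rows)
		}
		return nil
	}}
}

// validate runs the Validate hook when the generator offers one and treats
// generators without hooks as valid.
func validate(gen generator) error {
	h, ok := gen.(hookser)
	if !ok || h.Hooks().Validate == nil {
		return nil
	}
	return h.Hooks().Validate()
}

func main() {
	fmt.Println(validate(plainGen{}))
	fmt.Println(validate(checkedGen{rows: 10}))
	fmt.Println(validate(checkedGen{rows: -1}))
}
```

The nil check on the hook function matters in this style, because a Hooks value may set only some of its fields.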

type Meta

type Meta struct {
    // Name is a unique name for this generator.
    Name string
    // Description is a short description of this generator.
    Description string
    // Version is a semantic version for this generator. It should be bumped
    // whenever InitialRowFn or InitialRowCount change for any of the tables.
    Version string
    // New returns an unconfigured instance of this generator.
    New func() Generator
}

Meta is used to register a Generator at init time and holds meta information about this generator, including a name, description, and a function to create instances of it.

func Get

func Get(name string) (Meta, error)

Get returns the registered Generator with the given name, if it exists.

func Registered

func Registered() []Meta

Registered returns all registered Generators.
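Register, Get, and Registered together form an init-time registry. A minimal sketch of that pattern follows, using local stand-ins: the meta type is a trimmed-down version of Meta, and the map, the duplicate-name panic, and the sorted listing are assumptions of this example rather than documented behavior.

```go
package main

import (
	"fmt"
	"sort"
)

// meta is a trimmed-down local stand-in for the documented Meta struct.
type meta struct {
	Name        string
	Description string
}

// registered is a hypothetical package-level registry keyed by name.
var registered = make(map[string]meta)

// register adds m to the registry; panicking on duplicates is an assumption
// of this sketch.
func register(m meta) {
	if _, ok := registered[m.Name]; ok {
		panic(m.Name + " is already registered")
	}
	registered[m.Name] = m
}

// get looks up a registered generator by name.
func get(name string) (meta, error) {
	m, ok := registered[name]
	if !ok {
		return meta{}, fmt.Errorf("unknown generator: %s", name)
	}
	return m, nil
}

// list returns all registered generators sorted by name.
func list() []meta {
	metas := make([]meta, 0, len(registered))
	for _, m := range registered {
		metas = append(metas, m)
	}
	sort.Slice(metas, func(i, j int) bool { return metas[i].Name < metas[j].Name })
	return metas
}

// In a real generator package this call would sit in that package's init, so
// that importing the package is what makes the generator available.
func init() {
	register(meta{Name: "demo", Description: "a demo generator"})
}

func main() {
	m, err := get("demo")
	fmt.Println(m.Description, err)
	_, err = get("missing")
	fmt.Println(err)
	fmt.Println(len(list()))
}
```

With this layout a binary links in only the generators whose packages it imports, typically through blank imports.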

type MultiConnPool

type MultiConnPool struct {
    Pools []*pgx.ConnPool
    // contains filtered or unexported fields
}

MultiConnPool maintains a set of pgx ConnPools (to different servers).

func NewMultiConnPool

func NewMultiConnPool(maxTotalConnections int, urls ...string) (*MultiConnPool, error)

NewMultiConnPool creates a new MultiConnPool (with one pool per url). The pools have approximately the same number of max connections, adding up to maxTotalConnections.
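One way to read "approximately the same number of max connections, adding up to maxTotalConnections" is an even split with the remainder spread over the first pools. The helper below, splitConnections, is hypothetical and only illustrates that arithmetic; the package may divide connections differently.

```go
package main

import "fmt"

// splitConnections is a hypothetical helper: it divides total connections
// across n pools so that the sizes differ by at most one and sum to total.
func splitConnections(total, n int) []int {
	sizes := make([]int, n)
	for i := range sizes {
		sizes[i] = total / n
		// The first total%n pools each take one of the leftover connections.
		if i < total%n {
			sizes[i]++
		}
	}
	return sizes
}

func main() {
	fmt.Println(splitConnections(10, 3)) // prints [4 3 3]
}
```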

func (*MultiConnPool) Close

func (m *MultiConnPool) Close()

Close closes all the pools.

func (*MultiConnPool) Get

func (m *MultiConnPool) Get() *pgx.ConnPool

Get returns one of the pools, in round-robin manner.
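Round-robin selection that stays safe under concurrent callers is commonly built on an atomic counter. The sketch below shows that technique with a local roundRobin type that holds strings in place of *pgx.ConnPool values; it is an assumption about how such a Get could work, not the package's implementation.

```go
package main

import (
	"fmt"
	"sync/atomic"
)

// roundRobin hands out items in rotation. It is a local stand-in, with
// strings in place of the *pgx.ConnPool values a MultiConnPool holds.
type roundRobin struct {
	items   []string
	counter uint32
}

// get returns the next item in rotation; the atomic increment keeps the
// rotation consistent when several goroutines call get at once.
func (r *roundRobin) get() string {
	i := atomic.AddUint32(&r.counter, 1) - 1
	return r.items[i%uint32(len(r.items))]
}

func main() {
	rr := &roundRobin{items: []string{"pool-a", "pool-b", "pool-c"}}
	for i := 0; i < 4; i++ {
		fmt.Println(rr.get())
	}
}
```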

func (*MultiConnPool) PrepareEx

func (m *MultiConnPool) PrepareEx(
    ctx context.Context, name, sql string, opts *pgx.PrepareExOptions,
) (*pgx.PreparedStatement, error)

PrepareEx prepares the given statement on all the pools.

type NamedHistogram

type NamedHistogram struct {
    // contains filtered or unexported fields
}

NamedHistogram is a named histogram for use in Operations. It is thread-safe but intended to be thread-local.

func (*NamedHistogram) Record

func (w *NamedHistogram) Record(elapsed time.Duration)

Record saves a new datapoint and should be called once per logical operation.

type Opser

type Opser interface {
    Generator
    Ops(urls []string, reg *HistogramRegistry) (QueryLoad, error)
}

Opser returns the work functions for this generator. The tables are required to have been created and initialized before running these.

type PgxTx

type PgxTx pgx.Tx

PgxTx is a thin wrapper that implements the crdb.Tx interface, allowing pgx transactions to be used with ExecuteInTx.

func (*PgxTx) Commit

func (tx *PgxTx) Commit() error

Commit is part of the crdb.Tx interface.

func (*PgxTx) ExecContext

func (tx *PgxTx) ExecContext(
    ctx context.Context, sql string, args ...interface{},
) (gosql.Result, error)

ExecContext is part of the crdb.Tx interface.

func (*PgxTx) Rollback

func (tx *PgxTx) Rollback() error

Rollback is part of the crdb.Tx interface.

type QueryLoad

type QueryLoad struct {
    SQLDatabase string

    // WorkerFns is one function per worker. It is to be called once per unit of
    // work to be done.
    WorkerFns []func(context.Context) error

    // Close, if set, is called before the process exits, giving workloads a
    // chance to print some information.
    // It's guaranteed that the ctx passed to WorkerFns (if they're still running)
    // has been canceled by the time this is called (so an implementer can
    // synchronize with the WorkerFns if need be).
    Close func(context.Context)

    // ResultHist is the name of the NamedHistogram to use for the benchmark
    // formatted results output at the end of `./workload run`. The empty string
    // will use the sum of all histograms.
    //
    // TODO(dan): This will go away once more of run.go moves inside Operations.
    ResultHist string
}

QueryLoad represents some SQL query workload performable on a database initialized with the requisite tables.

type SQLRunner

type SQLRunner struct {
    // contains filtered or unexported fields
}

SQLRunner is a helper for issuing SQL statements; it supports multiple methods for issuing queries.

Queries need to first be defined using calls to Define. Then the runner must be initialized, after which we can use the handles returned by Define.

Sample usage:

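sr := &workload.SQLRunner{}

sel := sr.Define("SELECT x FROM t WHERE y = $1")
ins := sr.Define("INSERT INTO t(x, y) VALUES ($1, $2)")

// mcp is a *MultiConnPool and flags a *ConnFlags; the name "example" is
// arbitrary and only used for naming prepared statements.
err := sr.Init(ctx, "example", mcp, flags)
// [handle err]

row := sel.QueryRow(ctx, 1)
// [use row]

_, err = ins.Exec(ctx, 5, 6)
// [handle err]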

A runner should typically be associated with a single worker.

func (*SQLRunner) Define

func (sr *SQLRunner) Define(sql string) StmtHandle

Define creates a handle for the given statement. The handle can be used after Init is called.

func (*SQLRunner) Init

func (sr *SQLRunner) Init(
    ctx context.Context, name string, mcp *MultiConnPool, flags *ConnFlags,
) error

Init initializes the runner; must be called after calls to Define and before the StmtHandles are used.

The name is used for naming prepared statements. Multiple workers that use the same set of defined queries can and should use the same name.

The way we issue queries is set by flags.Method:

- "prepare": we prepare the query once during Init, then we reuse it for
  each execution. This results in a Bind and Execute on the server each time
  we run a query (on the given connection). Note that it's important to
  prepare on separate connections if there are many parallel workers; this
  avoids lock contention in the sql.Rows objects they produce. See #30811.

- "noprepare": each query is issued separately (on the given connection).
  This results in Parse, Bind, Execute on the server each time we run a
  query.

- "simple": each query is issued in a single string; parameters are
  rendered inside the string. This results in a single SimpleExecute
  request to the server for each query. Note that only a few parameter types
  are supported.

type SnapshotTick

type SnapshotTick struct {
    Name    string
    Hist    *hdrhistogram.Snapshot
    Elapsed time.Duration
    Now     time.Time
}

SnapshotTick parallels HistogramTick but replaces the histogram with a snapshot that is suitable for serialization. Additionally, it only contains the per-tick histogram, not the cumulative histogram. (The cumulative histogram can be computed by aggregating all of the per-tick histograms.)
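The aggregation mentioned in the parenthetical can be sketched with a deliberately simplified histogram. The tickCounts type below is a hypothetical stand-in that maps a latency bucket to an operation count; the real snapshots are hdrhistogram values, which are merged through that library rather than by hand.

```go
package main

import "fmt"

// tickCounts is a simplified stand-in for a per-tick histogram: it maps a
// latency bucket (in milliseconds) to the number of operations recorded.
type tickCounts map[int64]int64

// merge adds every bucket of tick into cumulative.
func merge(cumulative, tick tickCounts) {
	for bucket, n := range tick {
		cumulative[bucket] += n
	}
}

// total returns the number of operations across all buckets.
func total(h tickCounts) int64 {
	var sum int64
	for _, n := range h {
		sum += n
	}
	return sum
}

func main() {
	// Two per-tick histograms, as they might arrive from deserialized ticks.
	ticks := []tickCounts{{1: 5, 2: 1}, {1: 3, 4: 2}}
	cumulative := tickCounts{}
	for _, t := range ticks {
		merge(cumulative, t)
	}
	fmt.Println(total(cumulative)) // prints 11
}
```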

type StmtHandle

type StmtHandle struct {
    // contains filtered or unexported fields
}

StmtHandle is associated with a (possibly prepared) statement; created by SQLRunner.Define.

func (StmtHandle) Exec

func (h StmtHandle) Exec(ctx context.Context, args ...interface{}) (pgx.CommandTag, error)

Exec executes a query that doesn't return rows. The query is executed on the connection that was passed to SQLRunner.Init.

See pgx.Conn.Exec.

func (StmtHandle) ExecTx

func (h StmtHandle) ExecTx(
    ctx context.Context, tx *pgx.Tx, args ...interface{},
) (pgx.CommandTag, error)

ExecTx executes a query that doesn't return rows, inside a transaction.

See pgx.Conn.Exec.

func (StmtHandle) Query

func (h StmtHandle) Query(ctx context.Context, args ...interface{}) (*pgx.Rows, error)

Query executes a query that returns rows.

See pgx.Conn.Query.

func (StmtHandle) QueryRow

func (h StmtHandle) QueryRow(ctx context.Context, args ...interface{}) *pgx.Row

QueryRow executes a query that is expected to return at most one row.

See pgx.Conn.QueryRow.

func (StmtHandle) QueryRowTx

func (h StmtHandle) QueryRowTx(ctx context.Context, tx *pgx.Tx, args ...interface{}) *pgx.Row

QueryRowTx executes a query that is expected to return at most one row, inside a transaction.

See pgx.Conn.QueryRow.

func (StmtHandle) QueryTx

func (h StmtHandle) QueryTx(
    ctx context.Context, tx *pgx.Tx, args ...interface{},
) (*pgx.Rows, error)

QueryTx executes a query that returns rows, inside a transaction.

See pgx.Tx.Query.

type Table

type Table struct {
    // Name is the unqualified table name, pre-escaped for use directly in SQL.
    Name string
    // Schema is the SQL formatted schema for this table, with the `CREATE TABLE
    // <name>` prefix omitted.
    Schema string
    // InitialRows is the initial rows that will be present in the table after
    // setup is completed.
    InitialRows BatchedTuples
    // Splits is the initial splits that will be present in the table after
    // setup is completed.
    Splits BatchedTuples
}

Table represents a single table in a Generator. Included is a name, schema, and initial data.

Directories

Path                      Synopsis
bank
cli
examples
interleavedpartitioned
jsonload
kv
ledger
querybench
queue
tpcc
tpch
ycsb                      Package ycsb is the workload specified by the Yahoo! Cloud Serving Benchmark.

Package workload imports 30 packages and is imported by 17 packages. Updated 2018-10-18.