luci: go.chromium.org/luci/common/bq

package bq

import "go.chromium.org/luci/common/bq"

Package bq is a library for working with BigQuery.

Limits

Please see the BigQuery docs (https://cloud.google.com/bigquery/quotas#streaminginserts) for the current limits on streaming inserts. The client is responsible for ensuring that its usage of this package stays within those limits.

A note on the maximum rows per request: Put batches rows per request, ensuring that no more than 10,000 rows are sent per request, and allows a custom batch size. BigQuery recommends 500 as a practical limit, so this package uses 500 as the default; experiment with your specific schema and data sizes to find the batch size with the best balance of throughput and latency for your use case.
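The batching behavior described above can be sketched as a plain chunking function. This is an illustration only, not the library's implementation: the rows type is simplified to strings, and batchSize stands in for Uploader.BatchSize (default 500).

```go
package main

import "fmt"

// batch splits rows into chunks of at most batchSize, mirroring how
// Put groups rows per request so no request exceeds the limit.
func batch(rows []string, batchSize int) [][]string {
	var out [][]string
	for len(rows) > batchSize {
		out = append(out, rows[:batchSize])
		rows = rows[batchSize:]
	}
	if len(rows) > 0 {
		out = append(out, rows)
	}
	return out
}

func main() {
	rows := make([]string, 1200)
	for i := range rows {
		rows[i] = fmt.Sprintf("row-%d", i)
	}
	chunks := batch(rows, 500)
	fmt.Println(len(chunks))                    // 3
	fmt.Println(len(chunks[0]), len(chunks[2])) // 500 200
}
```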

Authentication

Authentication for Cloud projects happens during client creation; see https://godoc.org/cloud.google.com/go#pkg-examples for examples. What form this takes depends on the application.

Monitoring

You can use tsmon (https://godoc.org/go.chromium.org/luci/common/tsmon) to track upload latency and errors.

If the Uploader.UploadsMetricName field is not empty, Uploader will create a counter metric to track successes and failures.

Index

Package Files

doc.go eventupload.go insertid.go

type InsertIDGenerator

type InsertIDGenerator struct {
    // Counter is an atomically-managed counter used to differentiate Insert
    // IDs produced by the same process.
    Counter int64
    // Prefix should be able to uniquely identify this specific process,
    // to differentiate Insert IDs produced by different processes.
    //
    // If empty, prefix will be derived from system and process specific
    // properties.
    Prefix string
}

InsertIDGenerator generates unique Insert IDs.

BigQuery uses Insert IDs to deduplicate rows in the streaming insert buffer. The association between Insert ID and row persists only for the time the row is in the buffer.

InsertIDGenerator is safe for concurrent use.

var ID InsertIDGenerator

ID is the global InsertIDGenerator.

func (*InsertIDGenerator) Generate

func (id *InsertIDGenerator) Generate() string

Generate returns a unique Insert ID.
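A minimal sketch of the prefix-plus-atomic-counter scheme the type documents. The names mirror InsertIDGenerator's fields, but the ID format shown ("prefix:n") and the literal prefix are illustrative assumptions; the real generator derives a default prefix from system and process properties, which is omitted here.

```go
package main

import (
	"fmt"
	"sync/atomic"
)

// generator is a simplified stand-in for InsertIDGenerator: a prefix
// identifying this process plus an atomically incremented counter,
// making Generate safe for concurrent use.
type generator struct {
	Counter int64
	Prefix  string
}

// Generate returns "<prefix>:<n>", unique within this process.
func (g *generator) Generate() string {
	n := atomic.AddInt64(&g.Counter, 1)
	return fmt.Sprintf("%s:%d", g.Prefix, n)
}

func main() {
	g := &generator{Prefix: "host-1234"} // hypothetical prefix
	fmt.Println(g.Generate())            // host-1234:1
	fmt.Println(g.Generate())            // host-1234:2
}
```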

type Row

type Row struct {
    proto.Message // embedded

    // InsertID is unique per insert operation to handle deduplication.
    InsertID string
}

Row implements bigquery.ValueSaver.

func (*Row) Save

func (r *Row) Save() (map[string]bigquery.Value, string, error)

Save is used by bigquery.Uploader.Put when inserting values into a table.

type Uploader

type Uploader struct {
    *bigquery.Uploader
    // Uploader is bound to a specific table. DatasetID and Table ID are
    // provided for reference.
    DatasetID string
    TableID   string
    // UploadsMetricName is a string used to create a tsmon Counter metric
    // for event upload attempts via Put, e.g.
    // "/chrome/infra/commit_queue/events/count". If unset, no metric will
    // be created.
    UploadsMetricName string

    // BatchSize is the max number of rows to send to BigQuery at a time.
    // The default is 500.
    BatchSize int
    // contains filtered or unexported fields
}

Uploader contains the necessary data for streaming data to BigQuery.

func NewUploader

func NewUploader(ctx context.Context, c *bigquery.Client, datasetID, tableID string) *Uploader

NewUploader constructs a new Uploader struct.

DatasetID and TableID are provided to the BigQuery client to gain access to a particular table.

You may want to change the default configuration of the embedded bigquery.Uploader; see its documentation for details.

Set UploadsMetricName on the resulting Uploader to use the default counter metric.

Set BatchSize to set a custom batch size.

func (*Uploader) Put

func (u *Uploader) Put(ctx context.Context, messages ...proto.Message) error

Put uploads one or more rows to the BigQuery service. Put takes care of adding InsertIDs, used by BigQuery to deduplicate rows.

If any rows do not match one of the expected types, Put will not attempt to upload any rows and returns an InvalidTypeError.

Put returns a PutMultiError if one or more rows failed to be uploaded. The PutMultiError contains a RowInsertionError for each failed row.

Put will retry on temporary errors. If an error persists, the call can run indefinitely; because of this, if ctx does not have a timeout, Put will add one.

See bigquery documentation and source code for detailed information on how struct values are mapped to rows.

Package bq imports 23 packages and is imported by 4 packages. Updated 2018-12-12.