import "github.com/apache/beam/sdks/go/pkg/beam/runners/dataflow/dataflowlib"
Package dataflowlib translates a Beam pipeline model to the Dataflow API job model, for submission to Google Cloud Dataflow.
execute.go fixup.go job.go messages.go metrics.go stage.go translate.go
func Execute(ctx context.Context, raw *pipepb.Pipeline, opts *JobOptions, workerURL, jarURL, modelURL, endpoint string, async bool) (*dataflowPipelineResult, error)
Execute submits a pipeline as a Dataflow job.
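A minimal sketch of a caller driving Execute directly, assuming the *pipepb.Pipeline has been produced elsewhere; the staging URLs and endpoint below are illustrative placeholders, not values prescribed by the package.

package main

import (
	"context"
	"log"

	pipepb "github.com/apache/beam/sdks/go/pkg/beam/model/pipeline_v1"
	"github.com/apache/beam/sdks/go/pkg/beam/runners/dataflow/dataflowlib"
)

// runJob is a sketch: raw is a pipeline proto built elsewhere, and the GCS
// staging URLs are placeholders for objects uploaded with StageFile/StageModel.
func runJob(ctx context.Context, raw *pipepb.Pipeline, opts *dataflowlib.JobOptions) error {
	workerURL := "gs://my-bucket/staging/worker" // staged worker binary (placeholder)
	jarURL := ""                                 // no custom worker jar
	modelURL := "gs://my-bucket/staging/model"   // staged pipeline model (placeholder)
	endpoint := ""                               // use the default Dataflow endpoint

	// async=false asks Execute to monitor the job rather than return
	// immediately after submission (inferred from the parameter name).
	result, err := dataflowlib.Execute(ctx, raw, opts, workerURL, jarURL, modelURL, endpoint, false)
	if err != nil {
		return err
	}
	log.Printf("job finished: %v", result)
	return nil
}

func main() {}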
Fixup adjusts the proto pipeline for Dataflow-specific quirks.
FromMetricUpdates extracts metrics from a slice of MetricUpdate objects and groups them into counters, distributions and gauges.
Dataflow currently only reports Counter and Distribution metrics to Cloud Monitoring. Gauge metrics are not supported. The output metrics.Results will not contain any gauges.
func GetMetrics(ctx context.Context, client *df.Service, project, region, jobID string) (*df.JobMetrics, error)
GetMetrics returns a collection of metrics describing the progress of a job by making a call to the Cloud Monitoring service.
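As a sketch of pulling the raw metric data that FromMetricUpdates groups, the call below fetches a job's MetricUpdate slice; client, project, region, and jobID are assumed to come from an earlier submission, and df refers to google.golang.org/api/dataflow/v1b3 as in the signatures above.

package main

import (
	"context"
	"log"

	"github.com/apache/beam/sdks/go/pkg/beam/runners/dataflow/dataflowlib"
	df "google.golang.org/api/dataflow/v1b3"
)

// logJobMetrics is a sketch: the identifiers name a job submitted earlier.
func logJobMetrics(ctx context.Context, client *df.Service, project, region, jobID string) error {
	jm, err := dataflowlib.GetMetrics(ctx, client, project, region, jobID)
	if err != nil {
		return err
	}
	// df.JobMetrics carries the individual MetricUpdate values; per the note
	// above, only counters and distributions are reported.
	for _, mu := range jm.Metrics {
		log.Printf("metric update: %+v", mu)
	}
	return nil
}

func main() {}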
NewClient creates a new Dataflow client with default application credentials and CloudPlatformScope. The Dataflow endpoint is optionally overridden.
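A sketch of constructing the client that the functions above take; the (ctx, endpoint) parameter list is an assumption, since NewClient's signature is not reproduced on this page.

package main

import (
	"context"
	"log"

	"github.com/apache/beam/sdks/go/pkg/beam/runners/dataflow/dataflowlib"
)

func main() {
	ctx := context.Background()
	// An empty endpoint keeps the default Dataflow endpoint; note the
	// parameter list here is assumed, not shown above.
	client, err := dataflowlib.NewClient(ctx, "")
	if err != nil {
		log.Fatalf("creating Dataflow client: %v", err)
	}
	_ = client // pass to Submit, GetMetrics, WaitForCompletion, etc.
}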
PrintJob logs the Dataflow job.
StageFile uploads a file to GCS.
StageModel uploads the pipeline model to GCS as a unique object.
func Submit(ctx context.Context, client *df.Service, project, region string, job *df.Job) (*df.Job, error)
Submit submits a prepared job to Cloud Dataflow.
func Translate(ctx context.Context, p *pipepb.Pipeline, opts *JobOptions, workerURL, jarURL, modelURL string) (*df.Job, error)
Translate translates a pipeline to a Dataflow job.
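Translate and Submit compose into a two-step submission, sketched below with placeholder staging URLs; the pipeline proto and client are assumed to come from elsewhere.

package main

import (
	"context"

	pipepb "github.com/apache/beam/sdks/go/pkg/beam/model/pipeline_v1"
	"github.com/apache/beam/sdks/go/pkg/beam/runners/dataflow/dataflowlib"
	df "google.golang.org/api/dataflow/v1b3"
)

// translateAndSubmit is a sketch: p is a pipeline proto built elsewhere and
// the GCS staging URLs are placeholders.
func translateAndSubmit(ctx context.Context, client *df.Service, p *pipepb.Pipeline, opts *dataflowlib.JobOptions) (*df.Job, error) {
	workerURL := "gs://my-bucket/staging/worker" // placeholder
	jarURL := ""                                 // no custom worker jar
	modelURL := "gs://my-bucket/staging/model"   // placeholder

	// Translate builds the Dataflow job model; Submit sends it to the service.
	job, err := dataflowlib.Translate(ctx, p, opts, workerURL, jarURL, modelURL)
	if err != nil {
		return nil, err
	}
	return dataflowlib.Submit(ctx, client, opts.Project, opts.Region, job)
}

func main() {}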
func WaitForCompletion(ctx context.Context, client *df.Service, project, region, jobID string) error
WaitForCompletion monitors the given job until completion. It logs any messages and state changes received.
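A sketch of blocking on a job returned by Submit; only the job ID is read from the returned *df.Job.

package main

import (
	"context"
	"log"

	"github.com/apache/beam/sdks/go/pkg/beam/runners/dataflow/dataflowlib"
	df "google.golang.org/api/dataflow/v1b3"
)

// awaitJob is a sketch: job is the *df.Job returned by Submit.
func awaitJob(ctx context.Context, client *df.Service, project, region string, job *df.Job) {
	// WaitForCompletion logs messages and state changes until the job
	// completes, returning an error if monitoring fails.
	if err := dataflowlib.WaitForCompletion(ctx, client, project, region, job.Id); err != nil {
		log.Fatalf("job %s: %v", job.Id, err)
	}
	log.Printf("job %s completed", job.Id)
}

func main() {}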
type JobOptions struct {
	// Name is the job name.
	Name string
	// Experiments are additional experiments.
	Experiments []string
	// Pipeline options
	Options runtime.RawOptions

	Project             string
	Region              string
	Zone                string
	Network             string
	Subnetwork          string
	NoUsePublicIPs      bool
	NumWorkers          int64
	DiskSizeGb          int64
	MachineType         string
	Labels              map[string]string
	ServiceAccountEmail string
	WorkerRegion        string
	WorkerZone          string

	// Autoscaling settings
	Algorithm     string
	MaxNumWorkers int64

	TempLocation string

	// Worker is the worker binary override.
	Worker string
	// WorkerJar is a custom worker jar.
	WorkerJar string

	TeardownPolicy string
}
JobOptions captures the various options for submitting jobs to Dataflow.
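A sketch of populating JobOptions for a submission; the project, region, bucket, and experiment values are placeholders, and only a subset of the fields above is set.

package main

import "github.com/apache/beam/sdks/go/pkg/beam/runners/dataflow/dataflowlib"

// exampleOptions returns illustrative settings; every field used is listed in
// the JobOptions struct above.
func exampleOptions() *dataflowlib.JobOptions {
	return &dataflowlib.JobOptions{
		Name:          "wordcount-example",
		Project:       "my-gcp-project", // placeholder
		Region:        "us-central1",    // placeholder
		TempLocation:  "gs://my-bucket/tmp",
		NumWorkers:    3,
		MaxNumWorkers: 10,
		MachineType:   "n1-standard-2",
		Labels:        map[string]string{"team": "data"},
		Experiments:   []string{"use_runner_v2"}, // placeholder experiment
	}
}

func main() {}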
Package dataflowlib imports 34 packages and is imported by 1 package.