udf

package module
v0.2.0 Latest Latest
Warning

This package is not in the latest version of its module.

Go to latest
Published: Dec 24, 2019 License: MIT Imports: 13 Imported by: 0

README

Apache Beam Golang UDF

Run UDFs (User Defined Functions) on Apache Beam Golang SDK.

Go Modules activation:

export GO111MODULE=on

CSV parse example:

# Direct
go run examples/parse/csv/csv.go

# Direct with internet files
go run examples/parse/csv/csv.go --location="http://localhost:8081/"

# Direct with GCS files
go run examples/parse/csv/csv.go --location=gs://apache-beam-golang-udf

# Dataflow with GCS files
go run examples/parse/csv/csv.go --location=gs://apache-beam-golang-udf \
    --max_num_workers=1 \
    --num_workers=1 \
    --project=marcelo-henrique-neppel \
    --runner=dataflow \
    --staging_location=gs://apache-beam-golang-udf/bin \
    --temp_location=gs://apache-beam-golang-udf/temp \
    --worker_harness_container_image=apachebeam/go_sdk:latest \
    --worker_machine_type=n1-standard-1

# Flink with GCS files
./gradlew :runners:flink:1.9:job-server:runShadow # For using the embedded cluster

./gradlew :runners:flink:1.9:job-server:runShadow -PflinkMasterUrl=localhost:8081 # For using a separate cluster

go run examples/parse/csv/csv.go --location=gs://apache-beam-golang-udf \
    --endpoint=localhost:8099 \
    --runner=flink

JSON parse example:

# Direct
go run examples/parse/json/json.go

# Direct with internet files
go run examples/parse/json/json.go --location="http://localhost:8081/"

# Direct with GCS files
go run examples/parse/json/json.go --location=gs://apache-beam-golang-udf

# Dataflow with GCS files
go run examples/parse/json/json.go --location=gs://apache-beam-golang-udf \
    --max_num_workers=1 \
    --num_workers=1 \
    --project=marcelo-henrique-neppel \
    --runner=dataflow \
    --staging_location=gs://apache-beam-golang-udf/bin \
    --temp_location=gs://apache-beam-golang-udf/temp \
    --worker_harness_container_image=apachebeam/go_sdk:latest \
    --worker_machine_type=n1-standard-1

# Flink with GCS files
./gradlew :runners:flink:1.9:job-server:runShadow # For using the embedded cluster

./gradlew :runners:flink:1.9:job-server:runShadow -PflinkMasterUrl=localhost:8081 # For using a separate cluster

go run examples/parse/json/json.go --location=gs://apache-beam-golang-udf \
    --endpoint=localhost:8099 \
    --runner=flink

XML parse example:

# Direct
go run examples/parse/xml/xml.go

# Direct with internet files
go run examples/parse/xml/xml.go --location="http://localhost:8081/"

# Direct with GCS files
go run examples/parse/xml/xml.go --location=gs://apache-beam-golang-udf

# Dataflow with GCS files
go run examples/parse/xml/xml.go --location=gs://apache-beam-golang-udf \
    --max_num_workers=1 \
    --num_workers=1 \
    --project=marcelo-henrique-neppel \
    --runner=dataflow \
    --staging_location=gs://apache-beam-golang-udf/bin \
    --temp_location=gs://apache-beam-golang-udf/temp \
    --worker_harness_container_image=apachebeam/go_sdk:latest \
    --worker_machine_type=n1-standard-1

# Flink with GCS files
./gradlew :runners:flink:1.9:job-server:runShadow # For using the embedded cluster

./gradlew :runners:flink:1.9:job-server:runShadow -PflinkMasterUrl=localhost:8081 # For using a separate cluster

go run examples/parse/xml/xml.go --location=gs://apache-beam-golang-udf \
    --endpoint=localhost:8099 \
    --runner=flink

On examples using GCS files, please upload the example files to one of your buckets first.

Documentation

Index

Constants

This section is empty.

Variables

This section is empty.

Functions

func DownloadGlob

func DownloadGlob(ctx context.Context, glob string) ([]byte, error)

DownloadGlob shows

func GetFunction

func GetFunction(ctx context.Context, glob string, functionPackage string, functionName string) (interface{}, error)

GetFunction shows

Types

This section is empty.

Directories

Path Synopsis
examples

Jump to

Keyboard shortcuts

? : This menu
/ : Search site
f or F : Jump to
y or Y : Canonical URL