bucket2bq

command module

README

bucket2bq

Create an inventory of objects in a single GCS bucket and upload the inventory to BigQuery.

This small application discovers all the objects in a Google Cloud Storage bucket and writes an Avro file containing each object and its attributes. This file can then be imported into BigQuery.
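Once the Avro file has been generated, it can be loaded into BigQuery with the standard bq command-line tool. A minimal sketch, with placeholder dataset and table names (the supplied run.sh script described later automates these steps):

bq load --source_format=AVRO --use_avro_logical_types my_dataset.gcs_objects ./gcs.avro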

Usage

The inventory program can be run as a standalone binary. For example:

./bucket2bq -bucket "name-of-bucket-to-inventory"

It has several options:

./bucket2bq -help
GCS Bucket object metadata to BigQuery, version 0.1.0
Usage of ./bucket2bq:
  -alsologtostderr
        log to standard error as well as files
  -avro_schema string
        Avro schema (default: use embedded) (default "embedded")
  -bucket string
        bucket name (default "bucketname")
  -buffer_size int
        file buffer (default 1000)
  -concurrency int
        concurrency (GOMAXPROCS) (default 4)
  -file string
        output file name (default "gcs.avro")
  -log_backtrace_at value
        when logging hits line file:N, emit a stack trace
  -log_dir string
        If non-empty, write log files in this directory
  -logtostderr
        log to standard error instead of files
  -stderrthreshold value
        logs at or above this threshold go to stderr
  -v value
        log level for V logs
  -versions
        include GCS object versions
  -vmodule value
        comma-separated list of pattern=N settings for file-filtered logging

You can also use the supplied run.sh script, which creates the bucket inventory and uploads it to a BigQuery table. The script reads the following environment variables as input (a usage sketch follows the list):

  • BUCKET2BQ_BUCKET: GCS bucket name to inventory
  • BUCKET2BQ_PROJECT: project ID where the scratch storage bucket and BigQuery dataset reside
  • BUCKET2BQ_DATASET: BigQuery dataset name (e.g. gcs2bq)
  • BUCKET2BQ_TABLE: BigQuery table name (e.g. objects)
  • BUCKET2BQ_SCRATCH_BUCKET: bucket for storing the temporary Avro file to be loaded into BigQuery (no gs:// prefix)
  • BUCKET2BQ_LOCATION: location for the bucket and dataset, if they need to be created (e.g. EU)
  • BUCKET2BQ_VERSIONS: set to any non-empty value to retrieve object versions as well
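A minimal sketch of a run.sh invocation follows; all values below are illustrative, so substitute your own project, bucket, and dataset names:

export BUCKET2BQ_BUCKET="my-bucket-to-inventory"
export BUCKET2BQ_PROJECT="my-project"
export BUCKET2BQ_DATASET="gcs2bq"
export BUCKET2BQ_TABLE="objects"
export BUCKET2BQ_SCRATCH_BUCKET="my-scratch-bucket"
export BUCKET2BQ_LOCATION="EU"
# Optional: include object versions in the inventory
# export BUCKET2BQ_VERSIONS=1
./run.sh
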
Installing

Docker containers with this application are publicly available at ghcr.io/brews/bucket2bq.
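A minimal sketch of pulling and running the container, assuming the image entrypoint is the bucket2bq binary and that credentials are supplied via a mounted service account key (the key path, output directory, and bucket name below are illustrative):

docker pull ghcr.io/brews/bucket2bq:latest
docker run --rm \
  -e GOOGLE_APPLICATION_CREDENTIALS=/creds/key.json \
  -v "/path/to/key.json:/creds/key.json:ro" \
  -v "$(pwd):/out" \
  ghcr.io/brews/bucket2bq -bucket "my-bucket" -file /out/gcs.avro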

You can also install the inventory binary on your computer by running:

go install github.com/brews/bucket2bq@latest
Building

You can build the binary either manually or with the supplied Dockerfile:

docker build -t bucket2bq .

Support

Source code is available online at https://github.com/brews/bucket2bq.

Please file bugs at https://github.com/brews/bucket2bq/issues.

This software is available under the Apache License, Version 2.0.

This software is a modification of the "gcs2bq" tool, available from https://github.com/GoogleCloudPlatform/professional-services/tree/main/tools/gcs2bq under an Apache-2.0 license.

Documentation

Overview

Modifications copyright 2022 Brewster Malevich
Copyright 2020-2022 Google LLC

Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with the License. You may obtain a copy of the License at

http://www.apache.org/licenses/LICENSE-2.0

Unless required by applicable law or agreed to in writing, software distributed under the License is distributed on an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the License for the specific language governing permissions and limitations under the License.
