sq

command module
v0.48.3 Latest Latest
Warning

This package is not in the latest version of its module.

Go to latest
Published: Mar 12, 2024 License: MIT Imports: 7 Imported by: 0

README

Go Reference Go Report Card License Main pipeline

sq data wrangler

sq is a command line tool that provides jq-style access to structured data sources: SQL databases, or document formats like CSV or Excel. It is the lovechild of sql+jq.

sq

sq executes jq-like queries, or database-native SQL. It can join across sources: join a CSV file to a Postgres table, or MySQL with Excel.

sq outputs to a multitude of formats including JSON, Excel, CSV, HTML, Markdown and XML, and can insert query results directly to a SQL database.

sq can also inspect sources to view metadata about the source structure (tables, columns, size). You can use sq diff to compare tables, or entire databases. sq has commands for common database operations to copy, truncate, and drop tables.

Find out more at sq.io.

Install

macOS
brew install neilotoole/sq/sq
Linux
/bin/sh -c "$(curl -fsSL https://sq.io/install.sh)"
Windows
scoop bucket add sq https://github.com/neilotoole/sq
scoop install sq
Go
go install github.com/neilotoole/sq
Docker

The ghcr.io/neilotoole/sq image is preloaded with sq and a handful of related tools like jq.

Local
# Shell into a one-time container.
$ docker run -it ghcr.io/neilotoole/sq zsh

# Start detached (background) container named "sq-shell".
$ docker run -d --name sq-shell ghcr.io/neilotoole/sq
# Shell into that container.
$ docker exec -it sq-shell zsh 
Kubernetes

Running sq in a Kubernetes environment is useful for DB migrations, as well as general data wrangling.

# Start pod named "sq-shell".
$ kubectl run sq-shell --image ghcr.io/neilotoole/sq
# Shell into the pod.
$ kubectl exec -it sq-shell -- zsh 

See other install options.

Overview

Use sq help to see command help. Docs are over at sq.io. Read the overview, and tutorial. The cookbook has recipes for common tasks, and the query guide covers sq's query language.

The major concept is: sq operates on data sources, which are treated as SQL databases (even if the source is really a CSV or XLSX file etc.).

In a nutshell, you sq add a source (giving it a handle), and then execute commands against the source.

Sources

Initially there are no sources.

$ sq ls

Let's add a source. First we'll add a SQLite database, but this could also be Postgres, SQL Server etc., or a document source such Excel or CSV.

Download the sample DB, and sq add the source.

$ wget https://sq.io/testdata/sakila.db

$ sq add ./sakila.db
@sakila  sqlite3  sakila.db

$ sq ls -v
HANDLE   ACTIVE  DRIVER   LOCATION                         OPTIONS
@sakila  active  sqlite3  sqlite3:///Users/demo/sakila.db

$ sq ping @sakila
@sakila       1ms  pong

$ sq src
@sakila  sqlite3  sakila.db

The sq ping command simply pings the source to verify that it's available.

sq src lists the active source, which in our case is @sakila. You can change the active source using sq src @other_src. When there's an active source specified, you can usually omit the handle from sq commands. Thus you could instead do:

$ sq ping
@sakila  1ms  pong

[!TIP] Document sources such as CSV or Excel can be added from the local filesystem, or from an HTTP URL.

$ sq add https://acme.s3.amazonaws.com/sales.csv

sq inspect remote See the sources docs for more.

Query

Fundamentally, sq is for querying data. The jq-style syntax is covered in detail in the query guide.

sq query where slq

The above query selected some rows from the actor table. You could also use native SQL, e.g.:

sq query where sql

But we're flying a bit blind here: how did we know about the actor table?

Inspect

sq inspect is your friend.

sq inspect

Use sq inspect -v to see more detail. Or use -j to get JSON output:

sq inspect -j

Combine sq inspect with jq for some useful capabilities. Here's how to list all the table names in the active source:

$ sq inspect -j | jq -r '.tables[] | .name'
actor
address
category
city
country
customer
[...]

And here's how you could export each table to a CSV file:

$ sq inspect -j | jq -r '.tables[] | .name' | xargs -I % sq .% --csv --output %.csv
$ ls
actor.csv     city.csv	    customer_list.csv  film_category.csv  inventory.csv  rental.csv		     staff.csv
address.csv   country.csv   film.csv	       film_list.csv	  language.csv	 sales_by_film_category.csv  staff_list.csv
category.csv  customer.csv  film_actor.csv     film_text.csv	  payment.csv	 sales_by_store.csv	     store.csv

Note that you can also inspect an individual table:

sq inspect actor verbose

Read more about sq inspect.

Diff

Use sq diff to compare metadata, or row data, for sources, or individual tables.

The default behavior is to diff table schema and row counts. Table row data is not compared in this mode.

sq diff

Use --data to compare row data.

sq diff data

There are many more options available. See the diff docs.

Insert query results

sq query results can be output in various formats (text, json, csv, etc.). Those results can also be "outputted" as an insert into a database table.

That is, you can use sq to insert results from a Postgres query into a MySQL table, or copy an Excel worksheet into a SQLite table, or a push a CSV file into a SQL Server table etc.

[!TIP] If you want to copy a table inside the same (database) source, use sq tbl copy instead, which uses the database's native table copy functionality.

Here we query a CSV file, and insert the results into a Postgres table.

sq query insert inspect

Cross-source joins

sq can perform the usual joins. Here's how you would join tables actor, film_actor, and film:

$ sq '.actor | join(.film_actor, .actor_id) | join(.film, .film_id) | .first_name, .last_name, .title'

But sq can also join across data sources. That is, you can join an Excel worksheet with a Postgres table, or join a CSV file with MySQL, and so on.

This example joins a Postgres database, an Excel worksheet, and a CSV file.

sq join multi source

Read more about cross-source joins in the query guide.

Table commands

sq provides several handy commands for working with tables: tbl copy, tbl truncate and tbl drop. Note that these commands work directly against SQL database sources, using their native SQL commands.

$ sq tbl copy .actor .actor_copy
Copied table: @sakila.actor --> @sakila.actor_copy (200 rows copied)

$ sq tbl truncate .actor_copy
Truncated 200 rows from @sakila.actor_copy

$ sq tbl drop .actor_copy
Dropped table @sakila.actor_copy
UNIX pipes

For file-based sources (such as CSV or XLSX), you can sq add the source file, but you can also pipe it:

$ cat ./example.xlsx | sq .Sheet1

Similarly, you can inspect:

$ cat ./example.xlsx | sq inspect

Drivers

sq knows how to deal with a data source type via a driver implementation. To view the installed/supported drivers:

$ sq driver ls
DRIVER     DESCRIPTION                          
sqlite3    SQLite                               
postgres   PostgreSQL                           
sqlserver  Microsoft SQL Server / Azure SQL Edge
mysql      MySQL                                
csv        Comma-Separated Values               
tsv        Tab-Separated Values                 
json       JSON                                 
jsona      JSON Array: LF-delimited JSON arrays 
jsonl      JSON Lines: LF-delimited JSON objects
xlsx       Microsoft Excel XLSX                 

Output formats

sq has many output formats:

CHANGELOG

See CHANGELOG.md.

Acknowledgements

Documentation

Overview

Package main contains sq's main function.

Directories

Path Synopsis
cli
Package cli implements sq's CLI.
Package cli implements sq's CLI.
buildinfo
Package buildinfo hosts build info variables populated via ldflags.
Package buildinfo hosts build info variables populated via ldflags.
cobraz
Package cobraz contains supplemental logic for dealing with spf13/cobra.
Package cobraz contains supplemental logic for dealing with spf13/cobra.
config
Package config holds CLI configuration.
Package config holds CLI configuration.
config/yamlstore
Package yamlstore contains an implementation of config.Store that uses YAML files for persistence.
Package yamlstore contains an implementation of config.Store that uses YAML files for persistence.
config/yamlstore/upgrades/v0.34.0
Package v0_34_0 upgrades YAML config to v0.34.0.
Package v0_34_0 upgrades YAML config to v0.34.0.
diff
Package diff contains sq's diff implementation.
Package diff contains sq's diff implementation.
flag
Package flag holds CLI flags.
Package flag holds CLI flags.
hostinfo
Package hostinfo provides high-level details about the runtime OS.
Package hostinfo provides high-level details about the runtime OS.
output
Package output provides interfaces and implementations for outputting data and messages.
Package output provides interfaces and implementations for outputting data and messages.
output/commonw
Package commonw contains miscellaneous common output writer functionality.
Package commonw contains miscellaneous common output writer functionality.
output/csvw
Package csvw implements writers for CSV.
Package csvw implements writers for CSV.
output/htmlw
Package htmlw implements a RecordWriter for HTML.
Package htmlw implements a RecordWriter for HTML.
output/jsonw
Package jsonw implements output writers for JSON.
Package jsonw implements output writers for JSON.
output/markdownw
Package markdownw implements writers for Markdown.
Package markdownw implements writers for Markdown.
output/tablew
Package tablew implements text table output writers.
Package tablew implements text table output writers.
output/tablew/internal
Package tablewriter creates & generates text based table
Package tablewriter creates & generates text based table
output/xlsxw
Package xlsxw implements output writers for Microsoft Excel.
Package xlsxw implements output writers for Microsoft Excel.
output/xmlw
Package xmlw implements output writers for XML.
Package xmlw implements output writers for XML.
output/yamlw
Package yamlw implements output writers for YAML.
Package yamlw implements output writers for YAML.
pprofile
Package pprofile encapsulates pprof functionality.
Package pprofile encapsulates pprof functionality.
run
Package run holds the run.Run construct, which encapsulates CLI state for a command execution.
Package run holds the run.Run construct, which encapsulates CLI state for a command execution.
testrun
Package testrun contains helper functionality for executing CLI tests.
Package testrun contains helper functionality for executing CLI tests.
Package drivers is the parent package of the concrete sq driver implementations.
Package drivers is the parent package of the concrete sq driver implementations.
csv
Package csv implements the sq driver for CSV/TSV et al.
Package csv implements the sq driver for CSV/TSV et al.
json
Package json implements the sq driver for JSON.
Package json implements the sq driver for JSON.
postgres
Package postgres implements the sq driver for postgres.
Package postgres implements the sq driver for postgres.
sqlite3
Package sqlite3 implements the sq driver for SQLite.
Package sqlite3 implements the sq driver for SQLite.
sqlite3/internal/sqlparser
Package sqlparser contains SQL parsing functionality for SQLite.
Package sqlparser contains SQL parsing functionality for SQLite.
sqlserver
Package sqlserver implements the sq driver for SQL Server.
Package sqlserver implements the sq driver for SQL Server.
userdriver
Package userdriver implements the "user-driver" functionality that allows users to define source driver types declaratively.
Package userdriver implements the "user-driver" functionality that allows users to define source driver types declaratively.
userdriver/xmlud
Package xmlud provides user driver XML import functionality.
Package xmlud provides user driver XML import functionality.
xlsx
Package xlsx implements the sq driver for Microsoft Excel.
Package xlsx implements the sq driver for Microsoft Excel.
Package libsq implements the core sq functionality.
Package libsq implements the core sq functionality.
ast
Package ast holds types and functionality for the SLQ AST.
Package ast holds types and functionality for the SLQ AST.
ast/antlrz
Package antlrz contains utilities for working with ANTLR4.
Package antlrz contains utilities for working with ANTLR4.
ast/render
Package render provides the mechanism for rendering ast into SQL.
Package render provides the mechanism for rendering ast into SQL.
core
Package core contains only libsq core sub-packages.
Package core contains only libsq core sub-packages.
core/cleanup
Package cleanup provides functionality for executing cleanup functions.
Package cleanup provides functionality for executing cleanup functions.
core/colorz
Package colorz provides supplemental color functionality.
Package colorz provides supplemental color functionality.
core/debugz
Package debugz contains functionality for debugging.
Package debugz contains functionality for debugging.
core/diffdoc
Package diffdoc provides core diff functionality, with a focus on streaming and concurrency.
Package diffdoc provides core diff functionality, with a focus on streaming and concurrency.
core/diffdoc/internal/go-udiff
Package udiff computes differences between text files or strings.
Package udiff computes differences between text files or strings.
core/diffdoc/internal/go-udiff/difftest
Package difftest supplies a set of tests that will operate on any implementation of a diff algorithm as exposed by diff "github.com/neilotoole/sq/libsq/core/diffdoc/internal/go-udiff"
Package difftest supplies a set of tests that will operate on any implementation of a diff algorithm as exposed by diff "github.com/neilotoole/sq/libsq/core/diffdoc/internal/go-udiff"
core/diffdoc/internal/go-udiff/lcs
package lcs contains code to find longest-common-subsequences (and diffs)
package lcs contains code to find longest-common-subsequences (and diffs)
core/diffdoc/internal/go-udiff/myers
Package myers implements the Myers diff algorithm.
Package myers implements the Myers diff algorithm.
core/errz
Package errz is sq's error package.
Package errz is sq's error package.
core/execz
Package execz builds on stdlib os/exec.
Package execz builds on stdlib os/exec.
core/ioz
Package ioz contains supplemental io functionality.
Package ioz contains supplemental io functionality.
core/ioz/checksum
Package checksum provides functions for working with checksums.
Package checksum provides functions for working with checksums.
core/ioz/contextio
Package contextio provides io decorators that are context-aware.
Package contextio provides io decorators that are context-aware.
core/ioz/httpz
Package httpz provides functionality supplemental to stdlib http.
Package httpz provides functionality supplemental to stdlib http.
core/ioz/lockfile
Package lockfile implements a pid lock file mechanism.
Package lockfile implements a pid lock file mechanism.
core/jointype
Package jointype enumerates the various SQL JOIN types.
Package jointype enumerates the various SQL JOIN types.
core/kind
Package kind encapsulates the notion of data "kind": that is, it is an abstraction over data types across implementations.
Package kind encapsulates the notion of data "kind": that is, it is an abstraction over data types across implementations.
core/langz
Package langz contains miscellaneous functionality supplemental to core golang stuff like slices and channels.
Package langz contains miscellaneous functionality supplemental to core golang stuff like slices and channels.
core/lg
Package lg contains utility functions for working with slog.
Package lg contains utility functions for working with slog.
core/lg/devlog
Package devlog contains a custom slog.Handler for developer-friendly log output.
Package devlog contains a custom slog.Handler for developer-friendly log output.
core/lg/lga
Package lga ("log attribute") holds constants for log attribute names.
Package lga ("log attribute") holds constants for log attribute names.
core/lg/lgm
Package lgm ("log message") contains constants for log messages.
Package lgm ("log message") contains constants for log messages.
core/lg/lgt
Package lgt provides a mechanism for getting a *slog.Logger that outputs to testing.T. See lgt.New.
Package lgt provides a mechanism for getting a *slog.Logger that outputs to testing.T. See lgt.New.
core/lg/slogbuf
Package slogbuf implements a Buffer that stores log records that can later be replayed on a slog.Handler.
Package slogbuf implements a Buffer that stores log records that can later be replayed on a slog.Handler.
core/lg/userlogdir
Package userlogdir has a single function, UserLogDir, that returns an OS-specific path for storing user logs.
Package userlogdir has a single function, UserLogDir, that returns an OS-specific path for storing user logs.
core/oncecache
Package oncecache contains a strongly-typed, concurrency-safe, context-aware, dependency-free, in-memory, on-demand object [Cache], focused on write-once, read-often ergonomics.
Package oncecache contains a strongly-typed, concurrency-safe, context-aware, dependency-free, in-memory, on-demand object [Cache], focused on write-once, read-often ergonomics.
core/options
Package options implements config options.
Package options implements config options.
core/progress
Package progress contains progress bar widget functionality.
Package progress contains progress bar widget functionality.
core/record
Package record holds the record.Record type, which is the core type for holding query results.
Package record holds the record.Record type, which is the core type for holding query results.
core/retry
Package retry implements retry functionality.
Package retry implements retry functionality.
core/runtimez
Package runtimez provides functionality supplemental to stdlib's runtime pkg.
Package runtimez provides functionality supplemental to stdlib's runtime pkg.
core/schema
Package schema provides functionality for modeling SQL constructs.
Package schema provides functionality for modeling SQL constructs.
core/sqlz
Package sqlz contains core types such as Kind and Record.
Package sqlz contains core types such as Kind and Record.
core/stringz
Package stringz contains string functions similar in spirit to the stdlib strings package.
Package stringz contains string functions similar in spirit to the stdlib strings package.
core/tablefq
Package tablefq is a tiny package that holds the tablefq.T type, which is a fully-qualified SQL table name.
Package tablefq is a tiny package that holds the tablefq.T type, which is a fully-qualified SQL table name.
core/tailbuf
Package tailbuf contains a tail buffer [Buf] of fixed size that provides a window on the tail of the items written via [Buf.Write].
Package tailbuf contains a tail buffer [Buf] of fixed size that provides a window on the tail of the items written via [Buf.Write].
core/termz
Package termz contains a handful of terminal utilities.
Package termz contains a handful of terminal utilities.
core/timez
Package timez contains time functionality.
Package timez contains time functionality.
core/tuning
Package tuning contains tuning options.
Package tuning contains tuning options.
core/urlz
Package urlz contains URL utility functionality.
Package urlz contains URL utility functionality.
driver/dialect
Package dialect contains functionality for SQL dialects.
Package dialect contains functionality for SQL dialects.
files
Package files contains functionality for dealing with files, including remote files (e.g.
Package files contains functionality for dealing with files, including remote files (e.g.
files/internal/downloader
Package downloader provides a mechanism for getting files from HTTP/S URLs, making use of a mostly RFC-compliant cache.
Package downloader provides a mechanism for getting files from HTTP/S URLs, making use of a mostly RFC-compliant cache.
source
Package source provides functionality for dealing with data sources.
Package source provides functionality for dealing with data sources.
source/drivertype
Package drivertype defines drivertype.Type, which is the type of a driver, e.g.
Package drivertype defines drivertype.Type, which is the type of a driver, e.g.
source/location
Package location contains functionality related to source location.
Package location contains functionality related to source location.
source/mdcache
Package mdcache contains a [Cache] that caches source metadata.
Package mdcache contains a [Cache] that caches source metadata.
source/metadata
Package metadata contains types that model source metadata.
Package metadata contains types that model source metadata.
Package testh (test helper) contains functionality for testing.
Package testh (test helper) contains functionality for testing.
fixt
Package fixt contains common test fixture values.
Package fixt contains common test fixture values.
proj
Package proj contains test utilities for dealing with project paths and the like.
Package proj contains test utilities for dealing with project paths and the like.
sakila
Package sakila holds test constants and such for the sakila test sources.
Package sakila holds test constants and such for the sakila test sources.
testsrc
Package testsrc holds testing constants (in addition to pkg sakila).
Package testsrc holds testing constants (in addition to pkg sakila).
tu
Package tu contains basic generic test utilities.
Package tu contains basic generic test utilities.

Jump to

Keyboard shortcuts

? : This menu
/ : Search site
f or F : Jump to
y or Y : Canonical URL