extractor

v0.2.1
Published: Jan 28, 2023 License: MIT Imports: 6 Imported by: 0

Extractor

A simple utility to list or extract columns from a CSV file, or to split a CSV file into several files partitioned by a column.

Usage and Examples

There are three commands as of now: list, extract, and partition.

Usage: extractor <command>

Flags:
  -h, --help           Show context-sensitive help.
  -v, --verbose        Enable debug mode
      --sep="comma"    Separator to be used.

Commands:
  list       List CSV Columns
  extract    Extracts columns from CSV

Run "extractor <command> --help" for more information on a command.

List

This command lists the columns available in the input file.

Usage: extractor list <input>

List CSV Columns

Arguments:
  <input>    Input filename

Flags:
  -h, --help           Show context-sensitive help.
  -v, --verbose        Enable debug mode
      --sep="comma"    Separator to be used.

Example:

$❯ extractor list eng.4.csv
* Round
* Date
* Team 1
* FT
* Team 2
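
As an illustration of what listing columns involves, here is a minimal Go sketch that reads only the header record with the standard encoding/csv package. The names listColumns and listColumnsString are hypothetical and the sample row is a placeholder; this is not the tool's actual implementation.

```go
package main

import (
	"encoding/csv"
	"fmt"
	"io"
	"strings"
)

// listColumns reads only the header record of a delimited stream and
// returns its fields. Hypothetical helper, not the tool's real API.
func listColumns(r io.Reader, sep rune) ([]string, error) {
	cr := csv.NewReader(r)
	cr.Comma = sep
	header, err := cr.Read()
	if err != nil {
		return nil, fmt.Errorf("reading header: %w", err)
	}
	return header, nil
}

// listColumnsString is a convenience wrapper for in-memory data.
func listColumnsString(data string, sep rune) ([]string, error) {
	return listColumns(strings.NewReader(data), sep)
}

func main() {
	// Header taken from the example above; the data row is a placeholder.
	sample := "Round,Date,Team 1,FT,Team 2\nr,d,a,s,b\n"
	cols, err := listColumnsString(sample, ',')
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	for _, c := range cols {
		fmt.Println("*", c)
	}
}
```

Only the first record is read, so the cost does not depend on the size of the file.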

Extract

This command extracts the given columns from the input file and writes them to the output file. In verbose (debug) mode, --count sets how often a progress update is shown.

Usage: extractor extract <input> <output> <columns> ...

Extracts columns from CSV

Arguments:
  <input>          Input filename
  <output>         Output filename
  <columns> ...    Columns to extract

Flags:
  -h, --help           Show context-sensitive help.
  -v, --verbose        Enable debug mode
      --sep="comma"    Separator to be used.

      --count=1000     Frequency to show progress

Example:

$❯ extractor extract eng.4.csv /dev/stdout "Team 1" "Team 2"
Team 1,Team 2
Barrow AFC,Stevenage FC
Bolton Wanderers FC,Forest Green Rovers FC
Bradford City AFC,Colchester United FC
Cambridge United FC,Carlisle United FC
Cheltenham Town FC,Morecambe FC
Walsall FC,Grimsby Town FC
Mansfield Town FC,Tranmere Rovers FC
Oldham Athletic AFC,Leyton Orient FC
Port Vale FC,Crawley Town FC
...

$❯ extractor extract eng.4.csv eng.4.teams.csv "Team 1" "Team 2" -v --count 10
Opening input file... eng.4.csv
Opening output file... eng.4.teams.csv
Starting extraction...
Extracted 553 records.
Finished.
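
The extraction step can be sketched in Go roughly as follows: look up each requested column name in the header, then copy only those positions from every record. The names extractColumns and extractString are hypothetical, the error handling for unknown columns is an assumption, and the values in the Round and FT columns of the sample are placeholders.

```go
package main

import (
	"encoding/csv"
	"fmt"
	"io"
	"strings"
)

// extractColumns copies the named columns from in to out and returns the
// number of data records written (the header is not counted).
// Hypothetical helper, not the tool's real API.
func extractColumns(in io.Reader, out io.Writer, columns []string, sep rune) (int, error) {
	cr := csv.NewReader(in)
	cr.Comma = sep
	cw := csv.NewWriter(out)
	cw.Comma = sep

	header, err := cr.Read()
	if err != nil {
		return 0, fmt.Errorf("reading header: %w", err)
	}
	// Map each header name to its position.
	pos := make(map[string]int, len(header))
	for i, name := range header {
		pos[name] = i
	}
	// Resolve the requested names; an unknown name is an error (assumption).
	idx := make([]int, len(columns))
	for i, name := range columns {
		p, ok := pos[name]
		if !ok {
			return 0, fmt.Errorf("column %q not found", name)
		}
		idx[i] = p
	}
	if err := cw.Write(columns); err != nil {
		return 0, err
	}
	count := 0
	row := make([]string, len(idx))
	for {
		rec, err := cr.Read()
		if err == io.EOF {
			break
		}
		if err != nil {
			return count, err
		}
		for i, p := range idx {
			row[i] = rec[p]
		}
		if err := cw.Write(row); err != nil {
			return count, err
		}
		count++
	}
	cw.Flush()
	return count, cw.Error()
}

// extractString is a convenience wrapper for in-memory data.
func extractString(data string, columns []string, sep rune) (string, int, error) {
	var sb strings.Builder
	n, err := extractColumns(strings.NewReader(data), &sb, columns, sep)
	return sb.String(), n, err
}

func main() {
	sample := "Round,Team 1,FT,Team 2\nr,Barrow AFC,s,Stevenage FC\n"
	out, n, err := extractString(sample, []string{"Team 1", "Team 2"}, ',')
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Print(out)
	fmt.Printf("Extracted %d records.\n", n)
}
```

Records are streamed one at a time, so memory use stays flat regardless of input size. A progress report every --count records could be added inside the loop.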

Partition

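This command splits the input into multiple files, one per distinct value of the given column. It supports verbose mode, which prints more detail about the tasks it runs.

To omit the partition column from the output files, pass --drop.

By default, the prefix and suffix of each output filename come from splitting the input filename at its last "."; a "-" and the column value are placed between them. For example, timezone.csv partitioned on a value of Pacific produces timezone-Pacific.csv. Use --prefix and --suffix to set them explicitly.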
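
The naming rule can be sketched in Go as below. The helpers splitName and outputName are hypothetical. Two details are assumptions rather than documented behavior: that the suffix keeps the dot, and that a name without a dot becomes the prefix in full.

```go
package main

import (
	"fmt"
	"strings"
)

// splitName splits a filename at its last "." into a prefix and a suffix.
// The suffix keeps the dot. With no dot, the whole name is the prefix
// (assumption; the README does not cover this case).
func splitName(name string) (prefix, suffix string) {
	i := strings.LastIndex(name, ".")
	if i < 0 {
		return name, ""
	}
	return name[:i], name[i:]
}

// outputName builds prefix + "-" + value + suffix.
func outputName(input, value string) string {
	prefix, suffix := splitName(input)
	return prefix + "-" + value + suffix
}

func main() {
	fmt.Println(outputName("timezone.csv", "Pacific")) // timezone-Pacific.csv
}
```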

Usage: extractor partition <input> <column>

Partitions a file based on the given column from CSV

Arguments:
  <input>     Input filename
  <column>    Column to use to split by

Flags:
  -h, --help             Show context-sensitive help.
  -v, --verbose          Enable debug mode
      --sep="comma"      Separator to be used.

      --prefix=STRING    Prefix of the output file
      --suffix=STRING    Suffix of the output file
      --drop             Drop Column in output file(s)

Example:

$❯ head -5 timezone.csv # Check the file
Value,Label,Group
Africa/Abidjan,Abidjan,Africa
Africa/Accra,Accra,Africa
Africa/Addis_Ababa,Addis Ababa,Africa
Africa/Algiers,Algiers,Africa
$❯ extractor partition timezone.csv Group # Run partition command
$❯ ls -1 timezone-*.csv
timezone-Africa.csv
timezone-America.csv
timezone-Antarctica.csv
...
timezone-Pacific.csv
timezone-UTC.csv
$❯ head -5 timezone-Pacific.csv
Value,Label,Group
Pacific/Apia,Apia,Pacific
Pacific/Auckland,Auckland,Pacific
Pacific/Bougainville,Bougainville,Pacific
Pacific/Chatham,Chatham,Pacific

Example:

$❯ # Run partition command and drop the partitioned column
$❯ extractor partition --drop timezone.csv Group
$❯ head -5 timezone-Pacific.csv
Value,Label
Pacific/Apia,Apia
Pacific/Auckland,Auckland
Pacific/Bougainville,Bougainville
Pacific/Chatham,Chatham
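
For illustration, the partitioning logic, including --drop, might look like the Go sketch below. To stay self-contained it keeps each output in memory, keyed by column value, instead of writing files. The names partition and partitionString are hypothetical, and this is not the tool's actual implementation.

```go
package main

import (
	"encoding/csv"
	"fmt"
	"io"
	"strings"
)

// partition groups records by the value of column and returns one CSV
// document per distinct value, keyed by that value. With drop set, the
// partition column is omitted from every output. Hypothetical helper.
func partition(in io.Reader, column string, drop bool, sep rune) (map[string]string, error) {
	cr := csv.NewReader(in)
	cr.Comma = sep
	header, err := cr.Read()
	if err != nil {
		return nil, fmt.Errorf("reading header: %w", err)
	}
	col := -1
	for i, name := range header {
		if name == column {
			col = i
			break
		}
	}
	if col < 0 {
		return nil, fmt.Errorf("column %q not found", column)
	}
	// project returns the record as it should appear in an output.
	project := func(rec []string) []string {
		if !drop {
			return rec
		}
		out := make([]string, 0, len(rec)-1)
		out = append(out, rec[:col]...)
		return append(out, rec[col+1:]...)
	}

	bufs := map[string]*strings.Builder{}
	writers := map[string]*csv.Writer{}
	for {
		rec, err := cr.Read()
		if err == io.EOF {
			break
		}
		if err != nil {
			return nil, err
		}
		key := rec[col]
		w, ok := writers[key]
		if !ok {
			// First record for this value: start a new output with the header.
			bufs[key] = &strings.Builder{}
			w = csv.NewWriter(bufs[key])
			w.Comma = sep
			writers[key] = w
			if err := w.Write(project(header)); err != nil {
				return nil, err
			}
		}
		if err := w.Write(project(rec)); err != nil {
			return nil, err
		}
	}
	result := make(map[string]string, len(writers))
	for key, w := range writers {
		w.Flush()
		if err := w.Error(); err != nil {
			return nil, err
		}
		result[key] = bufs[key].String()
	}
	return result, nil
}

// partitionString is a convenience wrapper for in-memory data.
func partitionString(data, column string, drop bool, sep rune) (map[string]string, error) {
	return partition(strings.NewReader(data), column, drop, sep)
}

func main() {
	sample := "Value,Label,Group\n" +
		"Africa/Abidjan,Abidjan,Africa\n" +
		"Pacific/Apia,Apia,Pacific\n" +
		"Pacific/Auckland,Auckland,Pacific\n"
	parts, err := partitionString(sample, "Group", true, ',')
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Print(parts["Pacific"])
}
```

A file-based version would open one output file per distinct value, named by the prefix and suffix rule described earlier, in place of the in-memory builders.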
