GoWarehouseValidator

command module
v0.0.6
Published: May 27, 2021 | License: MIT | Imports: 15 | Imported by: 0

A simple CLI tool for validating datasets.

WHY?

During the initial phase of building a data warehouse, it is necessary to make sure that the input datasets have a standard format.

During the initial loading of a dataset, it is crucial that the input data have a consistent format and consistent data types. This way, you can implement the ETL processing in less time, without having to check and validate the input dataset every time a problem occurs in the ETL process.

This tool aims to validate the input dataset (for now only CSV files are supported) by checking that the type of each column matches the one specified in the configuration file.
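The kind of check described above can be sketched in a few lines of Go. This is an illustrative sketch and not the tool's actual code: the function names checkValue and validateCSV, the sample rows, and the choice to treat an empty cell as null are all assumptions. The type names (DATE, STRING, INTEGER, and the |NULLABLE suffix) are the ones used in the configuration file.

```go
package main

import (
	"encoding/csv"
	"fmt"
	"strconv"
	"strings"
	"time"
)

// checkValue reports whether raw satisfies a spec such as "INTEGER" or
// "STRING|NULLABLE". An empty cell counts as null (an assumption of this
// sketch). dateLayout is a Go time layout, not a strftime string.
func checkValue(raw, spec, dateLayout string) bool {
	parts := strings.Split(spec, "|")
	nullable := len(parts) > 1 && parts[1] == "NULLABLE"
	if raw == "" {
		return nullable
	}
	switch parts[0] {
	case "INTEGER":
		_, err := strconv.Atoi(raw)
		return err == nil
	case "DATE":
		_, err := time.Parse(dateLayout, raw)
		return err == nil
	case "STRING":
		return true
	}
	return false // unknown type name
}

// validateCSV returns one message per cell that violates its rule.
// The first record is treated as the header.
func validateCSV(data string, sep rune, rules map[string]string, dateLayout string) ([]string, error) {
	r := csv.NewReader(strings.NewReader(data))
	r.Comma = sep
	rows, err := r.ReadAll() // also rejects rows with a different field count
	if err != nil {
		return nil, err
	}
	if len(rows) == 0 {
		return nil, fmt.Errorf("empty input")
	}
	header := rows[0]
	var problems []string
	for i, row := range rows[1:] {
		for j, cell := range row {
			spec, ok := rules[header[j]]
			if !ok {
				continue // columns without a rule are ignored in this sketch
			}
			if !checkValue(cell, spec, dateLayout) {
				problems = append(problems,
					fmt.Sprintf("row %d, column %q: %q is not %s", i+2, header[j], cell, spec))
			}
		}
	}
	return problems, nil
}

func main() {
	// Invented sample rows; only the column names come from the README.
	data := "data,stato,deceduti,note\n" +
		"2020-02-24T18:00:00,ITA,7,\n" +
		"2020-02-25T18:00:00,ITA,ten,\n"
	rules := map[string]string{
		"data": "DATE", "stato": "STRING",
		"deceduti": "INTEGER", "note": "STRING|NULLABLE",
	}
	problems, err := validateCSV(data, ',', rules, "2006-01-02T15:04:05")
	if err != nil {
		fmt.Println("cannot read CSV:", err)
		return
	}
	for _, p := range problems {
		fmt.Println(p)
	}
}
```

Collecting every violation instead of stopping at the first one makes it possible to fix a dataset in a single pass.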

HOW?

The tool uses a configuration file like the following one:

{
  // path lists all the files to validate
  "path": [
    "s3://my-bucket-test-s3/covid.csv",
    "dataset/covid.csv"
  ],
  // a path can be specified as an S3 URI
  "_path": "s3://my-bucket/my_csv_file.csv",
  // region of the S3 bucket
  "region": "us-east-2",
  // separator of the CSV file
  "separator": ",",
  // date format, in strftime syntax
  "date_format": "%Y-%m-%dT%H:%M:%S",
  // key-value pairs: CSV header name -> expected type
  "validation": {
    "data": "DATE",
    "stato": "STRING",
    "ricoverati_con_sintomi": "INTEGER",
    "terapia_intensiva": "INTEGER",
    "totale_ospedalizzati": "INTEGER",
    "isolamento_domiciliare": "INTEGER",
    "totale_positivi": "INTEGER",
    "variazione_totale_positivi": "INTEGER",
    "nuovi_positivi": "INTEGER",
    "dimessi_guariti": "INTEGER",
    "deceduti": "INTEGER",
    "casi_da_sospetto_diagnostico": "INTEGER|NULLABLE",
    "casi_da_screening": "INTEGER|NULLABLE",
    "totale_casi": "INTEGER",
    "tamponi": "INTEGER",
    "casi_testati": "INTEGER|NULLABLE",
    "note": "STRING|NULLABLE",
    "ingressi_terapia_intensiva": "INTEGER|NULLABLE",
    "note_test": "STRING|NULLABLE",
    "note_casi": "STRING|NULLABLE"
  }
}
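Two details of this file matter when reading it from Go: encoding/json does not accept // comments, and Go's time package uses reference layouts rather than strftime directives. The sketch below shows one way to deal with both. It is not the tool's own loader; the Config struct and the helpers stripLineComments, goLayout, and loadConfig are hypothetical names, and goLayout covers only the six directives that appear in the sample.

```go
package main

import (
	"encoding/json"
	"fmt"
	"strings"
)

// Config mirrors the keys of the sample configuration. The struct and its
// field names are assumptions of this sketch. Keys without a matching field,
// such as "_path", are ignored by encoding/json.
type Config struct {
	Path       []string          `json:"path"`
	Region     string            `json:"region"`
	Separator  string            `json:"separator"`
	DateFormat string            `json:"date_format"`
	Validation map[string]string `json:"validation"`
}

// stripLineComments drops lines whose first non-blank characters are "//".
// A "//" inside a value such as "s3://..." is left untouched.
func stripLineComments(src string) string {
	var kept []string
	for _, line := range strings.Split(src, "\n") {
		if strings.HasPrefix(strings.TrimSpace(line), "//") {
			continue
		}
		kept = append(kept, line)
	}
	return strings.Join(kept, "\n")
}

// goLayout translates a handful of strftime directives into a Go time layout.
func goLayout(strftime string) string {
	return strings.NewReplacer(
		"%Y", "2006", "%m", "01", "%d", "02",
		"%H", "15", "%M", "04", "%S", "05",
	).Replace(strftime)
}

// loadConfig removes comment lines and decodes the remaining JSON.
func loadConfig(src string) (Config, error) {
	var c Config
	err := json.Unmarshal([]byte(stripLineComments(src)), &c)
	return c, err
}

func main() {
	src := `{
  // path lists all the files to validate
  "path": ["s3://my-bucket-test-s3/covid.csv", "dataset/covid.csv"],
  "separator": ",",
  "date_format": "%Y-%m-%dT%H:%M:%S",
  "validation": {"data": "DATE", "note": "STRING|NULLABLE"}
}`
	cfg, err := loadConfig(src)
	if err != nil {
		fmt.Println("invalid configuration:", err)
		return
	}
	fmt.Println(cfg.Path)
	fmt.Println(goLayout(cfg.DateFormat))
}
```

Only whole-line comments are removed, so a trailing comment after a value would still make the JSON invalid in this sketch.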

Usage

./GoWarehouseValidator -conf "conf/conf.json"

Using go

go get -v -u github.com/alessiosavi/GoWarehouseValidator

./GoWarehouseValidator -conf "conf/conf.json"
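For readers unfamiliar with Go command-line flags, a -conf option like the one above can be read with the standard flag package. The sketch below is an assumption about how such a flag might be handled, not the tool's source; parseConf and its error message are hypothetical.

```go
package main

import (
	"flag"
	"fmt"
)

// parseConf extracts the value of -conf from args and rejects a missing value.
func parseConf(args []string) (string, error) {
	fs := flag.NewFlagSet("GoWarehouseValidator", flag.ContinueOnError)
	conf := fs.String("conf", "", "path to the JSON configuration file")
	if err := fs.Parse(args); err != nil {
		return "", err
	}
	if *conf == "" {
		return "", fmt.Errorf("missing -conf")
	}
	return *conf, nil
}

func main() {
	// A real program would pass os.Args[1:]; fixed arguments keep the sketch
	// reproducible.
	path, err := parseConf([]string{"-conf", "conf/conf.json"})
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println("configuration file:", path)
}
```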
