tigon

command module
v0.0.0-...-cd3a907
Published: Sep 11, 2018 License: MIT Imports: 21 Imported by: 0

README

Tigon

Tigon is a parser tool. If the input files are compressed, it uncompresses them, parses them according to the given configs, and loads the results into the database. The three operations run concurrently. The concurrency settings are given in config.toml.
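
The three concurrent operations can be pictured as a pipeline of bounded worker pools. The following is only a minimal sketch of that idea, not Tigon's actual code: the names `stage` and `runPipeline` and the placeholder stage functions are hypothetical, and the worker counts simply mirror the defaults from the `[concurency]` table.

```go
package main

import (
	"fmt"
	"sync"
)

// stage starts `workers` goroutines that apply fn to every item read from in.
// The returned channel is closed once all workers have finished.
func stage(workers int, in <-chan string, fn func(string) string) <-chan string {
	out := make(chan string)
	var wg sync.WaitGroup
	for i := 0; i < workers; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			for item := range in {
				out <- fn(item)
			}
		}()
	}
	go func() {
		wg.Wait()
		close(out)
	}()
	return out
}

// runPipeline chains three stages and returns how many files reached the end.
// The stage bodies are placeholders (assumption); real work would go there.
func runPipeline(files []string, uncompressN, transformN, loadN int) int {
	src := make(chan string)
	go func() {
		defer close(src)
		for _, f := range files {
			src <- f
		}
	}()

	uncompressed := stage(uncompressN, src, func(f string) string { return f + ":uncompressed" })
	parsed := stage(transformN, uncompressed, func(f string) string { return f + ":parsed" })
	loaded := stage(loadN, parsed, func(f string) string { return f + ":loaded" })

	count := 0
	for range loaded {
		count++
	}
	return count
}

func main() {
	// 8, 16, 8 mirror the uncompress, transform, and load defaults.
	n := runPipeline([]string{"a.zip", "b.tar.gz", "c.csv"}, 8, 16, 8)
	fmt.Println("files processed:", n)
}
```

Because each stage reads from the previous stage's channel, a file can be transformed while others are still being uncompressed, and each stage's parallelism is capped independently.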

Tigon currently supports the following:
  • Uncompress files | zip, tar.gz, tgz, tar, tar.bz2, tar.xz, rar, 7z, gz
  • Transform from csv, txt, xls, xlsx
  • Load to Oracle (Sqlldr)
  • Scheduled runs
config.toml default settings
[path]
raw = "workspace/files_raw"
parsed = "workspace/files_parsed"
backup = "workspace/files_backup"
config = "config"

[customFileExtention]
parsedFileExt = ".parsed"
oracleControlFileExt = ".ctl"

[concurency]
uncompress = 8
transform = 16
load = 8

[scheduler]
every = 5 # seconds
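
As an illustration of the `[scheduler]` setting, a run every N seconds could be driven by a `time.Ticker`. This is a hedged sketch only: `intervalFromSeconds` and `runEvery` are hypothetical names, the job is a placeholder, and how Tigon schedules its runs internally is not stated in this README.

```go
package main

import (
	"fmt"
	"time"
)

// intervalFromSeconds converts the `every` value (seconds) to a Duration.
func intervalFromSeconds(every int) time.Duration {
	return time.Duration(every) * time.Second
}

// runEvery calls job once per tick until it has run `times` times,
// then stops the ticker and returns the number of completed runs.
func runEvery(interval time.Duration, times int, job func()) int {
	ticker := time.NewTicker(interval)
	defer ticker.Stop()
	runs := 0
	for runs < times {
		<-ticker.C
		job()
		runs++
	}
	return runs
}

func main() {
	fmt.Println("configured interval:", intervalFromSeconds(5))

	// A short interval is used here so the example finishes quickly;
	// a real run would pass intervalFromSeconds(every) instead.
	runs := runEvery(10*time.Millisecond, 3, func() {
		fmt.Println("checking for new raw files (placeholder job)")
	})
	fmt.Println("runs:", runs)
}
```
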
Build & Usage

Before you start, run the following commands. They create the necessary folders, including a folder in the workspace named after the argument you pass, according to config.toml.

FolderName is a separator for your raw files.

$ dep ensure
$ go build
$ ./tigon <FolderName>

Then copy your files to workspace/files_raw/. You need to create a TOML config file like the one below in your config directory (workspace/files_config/.toml); it holds the settings for the files in the folder you pass to Tigon. Then start Tigon with the command that follows the sample.

Sample config file for txt files

The ".test" keyword below must be the same as the name of the file in your FolderName folder. Each file must have a transform setting and a load setting under its name. The same settings file is used for all files in the same directory.

[transform]
    [transform.test]
        parseColumns = [0,3,4]
        parseDataStartIndex = 1
        parseDataEndIndex = -1
        fileSplitChar = ""
        fileRegexStr = "[^?!(\\s)]+"
        outputSplitChar = "|"

[load]
    [load.test]
        db = "oracle"
        username = "username"
        password = "password"
        tnsname = "tnsName"
        loadControlFile = """
                            LOAD DATA
                            INFILE 'workspace/files_parsed/sample_txt/test.parsed'
                            BADFILE 'workspace/files_parsed/sample_txt/test.bad'
                            DISCARDFILE 'workspace/files_parsed/sample_txt/test.dsc'
                            APPEND INTO TABLE SCHEMA_NAME.TABLE_NAME
                            Fields terminated by "|" Optionally enclosed by '"'
                            (
                            DATA_DATE DATE "YYYY-MM-DD HH24:MI" NULLIF (DATA_DATE="NULL"),
                            COLUMN1,
                            COLUMN2
                            )
                        """
$ ./tigon <FolderName>
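
To make the transform settings in the sample more concrete, here is a minimal sketch of how one text line might be turned into a parsed line. It is an interpretation, not Tigon's implementation: the function names are hypothetical, and it assumes that `fileRegexStr` extracts tokens, `parseColumns` selects tokens by zero-based index, `parseDataStartIndex = 1` skips a header line, and `parseDataEndIndex = -1` means "through the last line".

```go
package main

import (
	"fmt"
	"regexp"
	"strings"
)

// sampleTokenRe is the fileRegexStr from the sample config, unescaped from TOML.
var sampleTokenRe = regexp.MustCompile(`[^?!(\s)]+`)

// transformLine extracts tokens with tokenRe, keeps the tokens at the given
// column indexes, and joins them with outSep. It reports false when the
// line has too few tokens for the requested columns.
func transformLine(line string, tokenRe *regexp.Regexp, columns []int, outSep string) (string, bool) {
	tokens := tokenRe.FindAllString(line, -1)
	picked := make([]string, 0, len(columns))
	for _, c := range columns {
		if c < 0 || c >= len(tokens) {
			return "", false
		}
		picked = append(picked, tokens[c])
	}
	return strings.Join(picked, outSep), true
}

// transformLines applies transformLine to lines[start:end] and drops lines
// that do not have enough tokens. end < 0 is assumed to mean "to the end".
func transformLines(lines []string, start, end int, tokenRe *regexp.Regexp, columns []int, outSep string) []string {
	if start < 0 {
		start = 0
	}
	if end < 0 || end > len(lines) {
		end = len(lines)
	}
	var out []string
	for i := start; i < end; i++ {
		if parsed, ok := transformLine(lines[i], tokenRe, columns, outSep); ok {
			out = append(out, parsed)
		}
	}
	return out
}

func main() {
	lines := []string{
		"date region zone count status", // header, skipped by start index 1
		"2018-09-11 north east 42 ok",
	}
	for _, l := range transformLines(lines, 1, -1, sampleTokenRe, []int{0, 3, 4}, "|") {
		fmt.Println(l) // prints "2018-09-11|42|ok"
	}
}
```

The joined output uses the same "|" separator that the sample control file declares in its "Fields terminated by" clause, so the parsed file and the loader agree on the field delimiter.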
To Do
  • XML parser
  • Add unit tests
  • Add MySQL, PostgreSQL, and Elasticsearch loaders
  • Fix the XLS transformer, which does not always work properly

License

MIT
