kazaam

v3.4.9
Published: Jun 10, 2021 License: MIT Imports: 7 Imported by: 2

README

kazaam

Description

Kazaam was created with the goal of supporting easy and fast transformations of JSON data with Golang. This functionality provides us with an easy mechanism for taking intermediate JSON message representations and transforming them to formats required by arbitrary third-party APIs.

Inspired by Jolt, Kazaam supports JSON to JSON transformation via a transform "specification" also defined in JSON. A specification consists of one or more "operations". See Specification Support, below, for more details.

Documentation

API Documentation is available at http://godoc.org/gopkg.in/qntfy/kazaam.v3.

Features

Kazaam is primarily designed to be used as a library for transforming arbitrary JSON. It ships with nine built-in transform types, described below, which provide significant flexibility in reshaping JSON data.

Also included when you go get Kazaam is a binary, kazaam, that can be used for development and testing of new transform specifications.

Finally, Kazaam supports the implementation of custom transform types. We encourage and appreciate pull requests for new transform types so that they can be incorporated into the Kazaam distribution, but we understand that time constraints or licensing issues sometimes prevent this. See the API documentation for details on how to write and register custom transforms.

Due to performance considerations, Kazaam does not fully validate that input data is valid JSON. The IsJson() function is provided for convenience if this functionality is needed, but it may significantly slow down use of Kazaam.

Specification Support

Kazaam currently supports the following transforms:

  • shift
  • concat
  • coalesce
  • extract
  • timestamp
  • uuid
  • default
  • pass
  • delete
Shift

The shift transform is the current Kazaam workhorse used for remapping of fields. The specification supports jsonpath-esque JSON accesses and sets. Concretely, the specification

{
  "operation": "shift",
  "spec": {
    "object.id": "doc.uid",
    "gid2": "doc.guid[1]",
    "allGuids": "doc.guidObjects[*].id"
  }
}

executed on a JSON message with format

{
  "doc": {
    "uid": 12345,
    "guid": ["guid0", "guid2", "guid4"],
    "guidObjects": [{"id": "guid0"}, {"id": "guid2"}, {"id": "guid4"}]
  },
  "top-level-key": null
}

would result in

{
  "object": {
    "id": 12345
  },
  "gid2": "guid2",
  "allGuids": ["guid0", "guid2", "guid4"]
}

The jsonpath implementation supports a few special cases:

  • Array accesses: Retrieve nth element from array
  • Array wildcarding: indexing an array with [*] will return every matching element in an array
  • Top-level object capture: Mapping $ into a field will nest the entire original object under the requested key
  • Array append/prepend and set: Append and prepend an array with [+] and [-]. Attempting to write an array element that does not exist results in null padding as needed to add that element at the specified index (useful with "inplace").

The shift transform also supports a "require" field. When set to true, Kazaam will throw an error if any of the paths in the source JSON are not present.

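Finally, shift is destructive by default: the output contains only the fields produced by the spec, and the rest of the input is discarded. To apply the mappings to the original message instead, set the optional "inplace" field to true.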
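
To make the remapping idea concrete, here is a minimal standard-library sketch of a shift-like operation. It is not Kazaam's implementation: getPath, setPath, and shiftSketch are hypothetical helpers introduced for illustration, only dotted object paths are handled (no array indexing, wildcards, or "require"), and missing source paths are simply skipped, which is a choice made for this sketch.

```go
package main

import (
	"encoding/json"
	"fmt"
	"strings"
)

// getPath walks a dotted path such as "doc.uid" through nested JSON objects.
// Array indexing and wildcards are deliberately left out of this sketch.
func getPath(doc map[string]interface{}, path string) (interface{}, bool) {
	var cur interface{} = doc
	for _, key := range strings.Split(path, ".") {
		obj, ok := cur.(map[string]interface{})
		if !ok {
			return nil, false
		}
		next, ok := obj[key]
		if !ok {
			return nil, false
		}
		cur = next
	}
	return cur, true
}

// setPath stores value at a dotted path, creating intermediate objects as needed.
func setPath(doc map[string]interface{}, path string, value interface{}) {
	keys := strings.Split(path, ".")
	cur := doc
	for _, key := range keys[:len(keys)-1] {
		next, ok := cur[key].(map[string]interface{})
		if !ok {
			next = map[string]interface{}{}
			cur[key] = next
		}
		cur = next
	}
	cur[keys[len(keys)-1]] = value
}

// shiftSketch builds a new document that contains only the mapped fields.
// spec maps a target path to a source path, as in the shift spec above.
func shiftSketch(input []byte, spec map[string]string) ([]byte, error) {
	var doc map[string]interface{}
	if err := json.Unmarshal(input, &doc); err != nil {
		return nil, err
	}
	out := map[string]interface{}{}
	for target, source := range spec {
		if v, ok := getPath(doc, source); ok {
			setPath(out, target, v)
		}
	}
	return json.Marshal(out)
}

func main() {
	input := []byte(`{"doc":{"uid":12345},"top-level-key":null}`)
	out, err := shiftSketch(input, map[string]string{"object.id": "doc.uid"})
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(string(out)) // prints {"object":{"id":12345}}
}
```

Because the output starts from an empty object, "top-level-key" is dropped, which mirrors the destructive default described above.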

Concat

The concat transform allows the combination of fields and literal strings into a single string value.

{
    "operation": "concat",
    "spec": {
        "sources": [{
            "value": "TEST"
        }, {
            "path": "a.timestamp"
        }],
        "targetPath": "a.timestamp",
        "delim": ","
    }
}

executed on a JSON message with format

{
    "a": {
        "timestamp": 1481305274
    }
}

would result in

{
    "a": {
        "timestamp": "TEST,1481305274"
    }
}

Notes:

  • sources: list of items to combine (in the order listed)
    • literal values are specified via value
    • field values are specified via path (supports the same addressing as shift)
  • targetPath: where to place the resulting string
    • if this is an existing path, the result will replace the current value.
  • delim: Optional delimiter

The concat transform also supports a "require" field. When set to true, Kazaam will throw an error if any of the paths in the source JSON are not present.
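
The joining logic can be sketched with the standard library alone. The code below is an illustration under stated assumptions, not Kazaam's code: source, lookup, and concatSketch are hypothetical names, only dotted object paths are supported, and a missing path is always reported as an error (roughly what "require" set to true is described as doing).

```go
package main

import (
	"bytes"
	"encoding/json"
	"fmt"
	"strings"
)

// source mirrors one entry of the "sources" list: a literal Value or a Path.
type source struct {
	Value string
	Path  string
}

// lookup walks a dotted path through nested JSON objects.
func lookup(doc map[string]interface{}, path string) (interface{}, bool) {
	var cur interface{} = doc
	for _, key := range strings.Split(path, ".") {
		obj, ok := cur.(map[string]interface{})
		if !ok {
			return nil, false
		}
		next, ok := obj[key]
		if !ok {
			return nil, false
		}
		cur = next
	}
	return cur, true
}

// concatSketch joins literals and field values, in order, using delim.
func concatSketch(input []byte, sources []source, delim string) (string, error) {
	dec := json.NewDecoder(bytes.NewReader(input))
	dec.UseNumber() // keep numbers as written instead of converting to float64
	var doc map[string]interface{}
	if err := dec.Decode(&doc); err != nil {
		return "", err
	}
	parts := make([]string, 0, len(sources))
	for _, s := range sources {
		if s.Path == "" {
			parts = append(parts, s.Value)
			continue
		}
		v, ok := lookup(doc, s.Path)
		if !ok {
			return "", fmt.Errorf("path %q not found", s.Path)
		}
		parts = append(parts, fmt.Sprint(v))
	}
	return strings.Join(parts, delim), nil
}

func main() {
	input := []byte(`{"a":{"timestamp":1481305274}}`)
	out, err := concatSketch(input, []source{{Value: "TEST"}, {Path: "a.timestamp"}}, ",")
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(out) // prints TEST,1481305274
}
```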

Coalesce

A coalesce transform provides the ability to check multiple possible keys to find a desired value. The first matching key found of those provided is returned.

{
  "operation": "coalesce",
  "spec": {
    "firstObjectId": ["doc.guidObjects[0].uid", "doc.guidObjects[0].id"]
  }
}

executed on a JSON message with format

{
  "doc": {
    "uid": 12345,
    "guid": ["guid0", "guid2", "guid4"],
    "guidObjects": [{"id": "guid0"}, {"id": "guid2"}, {"id": "guid4"}]
  }
}

would result in

{
  "doc": {
    "uid": 12345,
    "guid": ["guid0", "guid2", "guid4"],
    "guidObjects": [{"id": "guid0"}, {"id": "guid2"}, {"id": "guid4"}]
  },
  "firstObjectId": "guid0"
}

Coalesce also supports an ignore array in the spec. If an otherwise matching key has a value listed in ignore, it is not considered a match. This is useful, for example, for skipping empty strings:

{
  "operation": "coalesce",
  "spec": {
    "ignore": [""],
    "firstObjectId": ["doc.guidObjects[0].uid", "doc.guidObjects[0].id"]
  }
}
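
The first-match-with-ignore rule can be sketched as follows. This is an illustration only: coalesceSketch and ignored are hypothetical names, and candidate keys are looked up at the top level rather than through Kazaam's path syntax.

```go
package main

import (
	"encoding/json"
	"fmt"
	"reflect"
)

// ignored reports whether v equals any entry of the ignore list.
func ignored(v interface{}, ignore []interface{}) bool {
	for _, ig := range ignore {
		if reflect.DeepEqual(v, ig) {
			return true
		}
	}
	return false
}

// coalesceSketch returns the value of the first candidate key that is
// present and whose value is not in the ignore list.
func coalesceSketch(input []byte, candidates []string, ignore []interface{}) (interface{}, bool, error) {
	var doc map[string]interface{}
	if err := json.Unmarshal(input, &doc); err != nil {
		return nil, false, err
	}
	for _, key := range candidates {
		v, ok := doc[key]
		if !ok || ignored(v, ignore) {
			continue
		}
		return v, true, nil
	}
	return nil, false, nil
}

func main() {
	input := []byte(`{"uid":"","id":"guid0"}`)
	v, found, err := coalesceSketch(input, []string{"uid", "id"}, []interface{}{""})
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(v, found) // prints guid0 true
}
```

Without the ignore list, the same call would return the empty string stored under "uid", since that key is the first one present.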
Extract

An extract transform provides the ability to select a sub-object and have kazaam return that sub-object as the top-level object. For example

{
  "operation": "extract",
  "spec": {
    "path": "doc.guidObjects[0].path.to.subobject"
  }
}

executed on a JSON message with format

{
  "doc": {
    "uid": 12345,
    "guid": ["guid0", "guid2", "guid4"],
    "guidObjects": [{"path": {"to": {"subobject": {"name": "the.subobject", "field": "field.in.subobject"}}}}, {"id": "guid2"}, {"id": "guid4"}]
  }
}

would result in

{
  "name": "the.subobject",
  "field": "field.in.subobject"
}
Timestamp

A timestamp transform parses and formats time strings using Go's time layout syntax, which is based on the reference time Mon Jan 2 15:04:05 MST 2006. Note: this operation is done in-place. If you want to preserve the original string(s), pair the transform with shift. The transform also supports the $now operator for inputFormat, which sets the current timestamp at the specified path, formatted according to the outputFormat. $unix is supported for both input and output formats and denotes Unix time: the number of seconds elapsed since January 1, 1970 UTC, as an integer string.

{
  "operation": "timestamp",
  "spec": {
    "timestamp[0]": {
      "inputFormat": "Mon Jan _2 15:04:05 -0700 2006",
      "outputFormat": "2006-01-02T15:04:05-0700"
    },
    "nowTimestamp": {
      "inputFormat": "$now",
      "outputFormat": "2006-01-02T15:04:05-0700"
    },
    "epochTimestamp": {
      "inputFormat": "2006-01-02T15:04:05-0700",
      "outputFormat": "$unix"
    }
  }
}

executed on a JSON message with format

{
  "timestamp": [
    "Sat Jul 22 08:15:27 +0000 2017",
    "Sun Jul 23 08:15:27 +0000 2017",
    "Mon Jul 24 08:15:27 +0000 2017"
  ]
}

would result in

{
  "timestamp": [
    "2017-07-22T08:15:27+0000",
    "Sun Jul 23 08:15:27 +0000 2017",
    "Mon Jul 24 08:15:27 +0000 2017"
  ],
  "nowTimestamp": "2017-09-08T19:15:27+0000"
}
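
The layout strings in the spec are ordinary Go time layouts, so the conversion for "timestamp[0]" can be reproduced with the standard time package. The sketch below is not Kazaam's code; reformatTimestamp and toUnixString are hypothetical helpers introduced for illustration.

```go
package main

import (
	"fmt"
	"strconv"
	"time"
)

// reformatTimestamp parses value with inputLayout and renders it with outputLayout.
// Both layouts are written in terms of Go's reference time.
func reformatTimestamp(value, inputLayout, outputLayout string) (string, error) {
	t, err := time.Parse(inputLayout, value)
	if err != nil {
		return "", err
	}
	return t.Format(outputLayout), nil
}

// toUnixString parses value and returns seconds since the Unix epoch as an
// integer string, similar in spirit to an outputFormat of "$unix".
func toUnixString(value, inputLayout string) (string, error) {
	t, err := time.Parse(inputLayout, value)
	if err != nil {
		return "", err
	}
	return strconv.FormatInt(t.Unix(), 10), nil
}

func main() {
	out, err := reformatTimestamp(
		"Sat Jul 22 08:15:27 +0000 2017",
		"Mon Jan _2 15:04:05 -0700 2006",
		"2006-01-02T15:04:05-0700",
	)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(out) // prints 2017-07-22T08:15:27+0000

	unix, err := toUnixString(out, "2006-01-02T15:04:05-0700")
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(unix) // prints 1500711327
}
```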
UUID

A uuid transform generates a UUID based on the spec. Currently supports UUIDv3, UUIDv4, UUIDv5.

Version 4 needs only a very simple spec; the "version" field is required:

{
    "operation": "uuid",
    "spec": {
        "doc.uuid": {
            "version": 4
        }
    }
}

executed on a JSON message with format

{
  "doc": {
    "author_id": 11122112,
    "document_id": 223323,
    "meta": {
      "id": 23
    }
  }
}

would result in

{
  "doc": {
    "author_id": 11122112,
    "document_id": 223323,
    "meta": {
      "id": 23
    },
    "uuid": "f03bacc1-f4e0-4371-a5c5-e8160d3d6c0c"
  }
}

UUIDv3 and UUIDv5 are a bit more complex. They require a namespace, which must itself be a valid UUID, and a list of names, each given as a path whose value is used to generate the UUID. If a path doesn't exist in the incoming document, its default value is used instead. Note that both of these fields must be strings. For the namespace field you can either use one of the four predefined namespaces (DNS, URL, OID, and X500) or pass your own UUID.

{
   "operation":"uuid",
   "spec":{
      "doc.uuid":{
         "version":5,
         "namespace":"DNS",
         "names":[
            {"path":"doc.author_name", "default":"some string"},
            {"path":"doc.type", "default":"another string"}
         ]
      }
   }
}

executed on a JSON message with format

{
  "doc": {
    "author_name": "jason",
    "type": "secret-document",
    "document_id": 223323,
    "meta": {
      "id": 23
    }
  }
}

would result in

{
  "doc": {
    "author_name": "jason",
    "type": "secret-document",
    "document_id": 223323,
    "meta": {
      "id": 23
    },
    "uuid": "f03bacc1-f4e0-4371-a7c5-e8160d3d6c0c"
  }
}
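
For readers unfamiliar with name-based UUIDs, the sketch below shows how a version 5 UUID is derived from a namespace and a single name, following RFC 4122, using only the standard library. It is not Kazaam's implementation, uuidV5 and namespaceDNS are names introduced here, and how Kazaam combines several "names" entries into one input is not covered by the text above, so this sketch hashes one name only.

```go
package main

import (
	"crypto/sha1"
	"encoding/hex"
	"fmt"
)

// namespaceDNS is the predefined DNS namespace UUID from RFC 4122
// (6ba7b810-9dad-11d1-80b4-00c04fd430c8).
var namespaceDNS = [16]byte{
	0x6b, 0xa7, 0xb8, 0x10, 0x9d, 0xad, 0x11, 0xd1,
	0x80, 0xb4, 0x00, 0xc0, 0x4f, 0xd4, 0x30, 0xc8,
}

// uuidV5 hashes namespace followed by name with SHA-1, keeps the first
// 16 bytes, and sets the version and variant bits.
func uuidV5(namespace [16]byte, name string) string {
	h := sha1.New()
	h.Write(namespace[:])
	h.Write([]byte(name))
	sum := h.Sum(nil)

	var u [16]byte
	copy(u[:], sum[:16])
	u[6] = (u[6] & 0x0f) | 0x50 // version 5
	u[8] = (u[8] & 0x3f) | 0x80 // RFC 4122 variant

	s := hex.EncodeToString(u[:])
	return s[0:8] + "-" + s[8:12] + "-" + s[12:16] + "-" + s[16:20] + "-" + s[20:32]
}

func main() {
	// The same namespace and name always yield the same UUID.
	fmt.Println(uuidV5(namespaceDNS, "python.org")) // prints 886313e1-3b8a-5372-9b90-0c9aee199e5d
	fmt.Println(uuidV5(namespaceDNS, "jason") == uuidV5(namespaceDNS, "jason"))
}
```

Unlike version 4, which is random, versions 3 and 5 are deterministic, so repeated transforms of the same document produce the same identifier.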
Default

A default transform provides the ability to set a key's value explicitly. For example

{
  "operation": "default",
  "spec": {
    "type": "message"
  }
}

would ensure that the output JSON message includes {"type": "message"}.

Delete

A delete transform provides the ability to delete keys in place.

{
  "operation": "delete",
  "spec": {
    "paths": ["doc.uid", "doc.guidObjects[1]"]
  }
}

executed on a JSON message with format

{
  "doc": {
    "uid": 12345,
    "guid": ["guid0", "guid2", "guid4"],
    "guidObjects": [{"id": "guid0"}, {"id": "guid2"}, {"id": "guid4"}]
  }
}

would result in

{
  "doc": {
    "guid": ["guid0", "guid2", "guid4"],
    "guidObjects": [{"id": "guid0"}, {"id": "guid4"}]
  }
}
Pass

A pass transform, as the name implies, passes the input data unchanged to the output. This is used internally when a null transform spec is specified, but may also be useful for testing.

Usage

To start, go get the versioned repository:

go get gopkg.in/qntfy/kazaam.v3
Using as an executable program

If you want to create an executable binary from this project, follow these steps (you'll need go installed and $GOPATH set):

go get gopkg.in/qntfy/kazaam.v3
cd $GOPATH/src/gopkg.in/qntfy/kazaam.v3/kazaam
go install

This will create an executable in $GOPATH/bin, as you would expect from the normal go install behavior.

Examples

See godoc examples.

Documentation

Overview

Package kazaam provides a simple interface for transforming arbitrary JSON in Golang.

Constants

const (
	// ParseError is thrown when there is a JSON parsing error
	ParseError = iota
	// RequireError is thrown when the JSON path does not exist and is required
	RequireError
	// SpecError is thrown when the kazaam specification is not properly formatted
	SpecError
)

Functions

func IsJson

func IsJson(s []byte) bool

By default, kazaam does not fully validate input data. Use IsJson() if you need to confirm input is valid before transforming. Note: This operation is very slow and memory/alloc intensive relative to most transforms.
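
As a rough illustration of validating before transforming, the sketch below gates a transform behind a validity check. It uses the standard library's json.Valid as a stand-in for IsJson, and validateThen and the identity transform are hypothetical names introduced here, not part of Kazaam.

```go
package main

import (
	"encoding/json"
	"errors"
	"fmt"
)

// validateThen runs transform only when input is syntactically valid JSON.
// json.Valid stands in for kazaam's IsJson in this sketch.
func validateThen(input []byte, transform func([]byte) ([]byte, error)) ([]byte, error) {
	if !json.Valid(input) {
		return nil, errors.New("input is not valid JSON")
	}
	return transform(input)
}

func main() {
	// identity is a placeholder for a real transform call.
	identity := func(b []byte) ([]byte, error) { return b, nil }

	good := []byte(`{"name": "the.subobject", "field": "field.in.subobject"}`)
	bad := []byte(`{"name": "the.subobject", "field", "field.in.subobject"}`)

	if out, err := validateThen(good, identity); err == nil {
		fmt.Println("ok:", string(out))
	}
	if _, err := validateThen(bad, identity); err != nil {
		fmt.Println("rejected:", err)
	}
}
```

Whether the extra pass is worth its cost depends on how much the input source can be trusted; validating once at the edge of the system is one way to avoid paying for it on every transform.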

func IsJsonFast

func IsJsonFast(s []byte) bool

IsJsonFast provides experimental fast validation using jsonparser.

Types

type Config

type Config struct {
	// contains filtered or unexported fields
}

Config is used to configure a Kazaam Transformer object. Note: a manually-initialized config object (not created with `NewDefaultConfig`) will be UNAWARE of the built-in Kazaam transforms. Built-in and third-party Kazaam transforms will have to be manually registered for Kazaam to be able to transform data.

func NewDefaultConfig

func NewDefaultConfig() Config

NewDefaultConfig returns a properly initialized Config object that contains required mappings for all the built-in transform types.

func (*Config) RegisterTransform

func (c *Config) RegisterTransform(name string, function TransformFunc) error

RegisterTransform registers a new transform type that satisfies the TransformFunc signature within the Kazaam configuration with the provided name. This function enables end-users to create and use custom transforms within Kazaam.

Example
// use the default config to have access to built-in kazaam transforms
kc := NewDefaultConfig()

// register the new custom transform called "copy" which supports copying the
// value of a top-level key to another top-level key
kc.RegisterTransform("copy", func(spec *transform.Config, data []byte) ([]byte, error) {
	// the internal `Spec` will contain a mapping of source and target keys
	for targetField, sourceFieldInt := range *spec.Spec {
		sourceField := sourceFieldInt.(string)
		// Note: jsonparser.Get() strips quotes from returned strings, so a real
		// transform would need handling for that. We use a Number below for simplicity.
		result, _, _, _ := jsonparser.Get(data, sourceField)
		data, _ = jsonparser.Set(data, result, targetField)
	}
	return data, nil
})

k, _ := New(`[{"operation": "copy", "spec": {"output": "input"}}]`, kc)
kazaamOut, _ := k.TransformJSONStringToString(`{"input":72}`)

fmt.Println(kazaamOut)
Output:

{"input":72,"output":72}

type Error

type Error struct {
	ErrMsg  string
	ErrType int
}

Error provides an error message (ErrMsg) and integer code (ErrType) for errors thrown during the execution of a transform.

func (*Error) Error

func (e *Error) Error() string

Error returns a string representation of the Error.

type Kazaam

type Kazaam struct {
	// contains filtered or unexported fields
}

Kazaam includes internal data required for handling the transformation. A Kazaam object must be initialized using the `New` or `NewKazaam` functions.

func New

func New(specString string, config Config) (*Kazaam, error)

New creates a new Kazaam instance by parsing the `specString` argument as JSON and returns a pointer to it. The string `specString` must be valid JSON or empty for `New` to return a Kazaam object. This function also accepts a `Config` object used for modifying the behavior of the Kazaam Transformer.

If `specString` is an empty string, the default Kazaam behavior when the Transform variants are called is to return the original data unmodified.

At initialization time, the spec is checked to ensure that it is valid JSON. Further, `New` confirms that all individual specs have a properly-specified `operation` and that details are set if required. If the spec is invalid, a nil Kazaam pointer and an explanation of the error are returned. The contents of the transform specification are further validated at Transform time.

Currently, the Config object allows end users to register additional transform types to support performing custom transformations not supported by the canonical set of transforms shipped with Kazaam.

Example
// Initialize a default Kazaam instance (i.e. same as NewKazaam(spec string))
k, _ := New(`[{"operation": "shift", "spec": {"output": "input"}}]`, NewDefaultConfig())
kazaamOut, _ := k.TransformJSONStringToString(`{"input":"input value"}`)

fmt.Println(kazaamOut)
Output:

{"output":"input value"}

func NewKazaam

func NewKazaam(specString string) (*Kazaam, error)

NewKazaam creates a new Kazaam instance with a default configuration. See documentation for `New` for complete details.

Example
k, _ := NewKazaam(`[{"operation": "shift", "spec": {"output": "input"}}]`)
kazaamOut, _ := k.TransformJSONStringToString(`{"input":"input value"}`)

fmt.Println(kazaamOut)
Output:

{"output":"input value"}

func (*Kazaam) Transform

func (k *Kazaam) Transform(data []byte) ([]byte, error)

Transform makes a copy of the byte slice `data`, transforms it according to the loaded spec, and returns the new, modified byte slice.

func (*Kazaam) TransformInPlace

func (k *Kazaam) TransformInPlace(data []byte) ([]byte, error)

TransformInPlace takes the byte slice `data`, transforms it according to the loaded spec, and modifies the byte slice in place.

Note: this is a destructive operation: the transformation is done in place. You must perform a deep copy of the data prior to calling TransformInPlace if the original JSON object must be retained.
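
For a byte slice, retaining the original amounts to copying the bytes before the in-place call. The sketch below shows the pattern with a hypothetical stand-in, upperInPlace, in place of a real in-place transform; withCopy is likewise a name introduced for illustration.

```go
package main

import "fmt"

// upperInPlace is a hypothetical stand-in for an in-place transform:
// it overwrites the caller's slice and returns it.
func upperInPlace(data []byte) []byte {
	for i, b := range data {
		if b >= 'a' && b <= 'z' {
			data[i] = b - ('a' - 'A')
		}
	}
	return data
}

// withCopy runs an in-place function on a private copy so that the
// caller's original bytes are left untouched.
func withCopy(data []byte, inPlace func([]byte) []byte) []byte {
	dup := append([]byte(nil), data...)
	return inPlace(dup)
}

func main() {
	original := []byte(`{"input":"value"}`)
	transformed := withCopy(original, upperInPlace)

	fmt.Println(string(original))    // prints {"input":"value"}
	fmt.Println(string(transformed)) // prints {"INPUT":"VALUE"}
}
```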

func (*Kazaam) TransformJSONString

func (k *Kazaam) TransformJSONString(data string) ([]byte, error)

TransformJSONString loads the JSON string `data`, transforms it as per the spec, and returns the transformed JSON as a []byte.

This function is especially useful when one may need to extract multiple fields from the transformed JSON.

func (*Kazaam) TransformJSONStringToString

func (k *Kazaam) TransformJSONStringToString(data string) (string, error)

TransformJSONStringToString loads the JSON string `data`, transforms it as per the spec, and returns the transformed JSON string.

type TransformFunc

type TransformFunc func(spec *transform.Config, data []byte) ([]byte, error)

TransformFunc defines the contract that any Transform function implementation must abide by. The transform's first argument is a `*transform.Config` object that contains any configuration necessary for the transform. The second argument is a `[]byte` object that contains the data to be transformed.

The data object passed in should be modified in-place and returned. Where that is not possible, a new `[]byte` object should be created and returned. The function should return an error if necessary. Transforms should strive to fail gracefully whenever possible.

Directories

Path       Synopsis
kazaam     A simple command-line interface (CLI) for executing kazaam transforms on data from files or stdin.
transform  Package transform contains canonical implementations of Kazaam transforms.
