godap

command module
v0.0.0-...-a32b705 Latest Latest
Warning

This package is not in the latest version of its module.

Go to latest
Published: Apr 20, 2019 License: MIT Imports: 14 Imported by: 0

README

GODAP: The Data Analysis Pipeline

(a port of the ruby-based DAP: https://github.com/rapid7/dap)

Build Status

DAP was created to transform text-based data on the command-line, specializing in transforms that are annoying or difficult to do with existing tools.

DAP reads data using an input plugin, transforms it through a series of filters, and prints it out again using an output plugin. Every record is treated as a document (aka: hash/dict) and filters are used to reduce, expand, and transform these documents as they pass through. Think of DAP as a mashup between sed, awk, grep, csvtool, and jq, with map/reduce capabilities.

DAP was written to process terabyte-sized public scan datasets, such as those provided by https://opendata.rapid7.com/. This go version of dap supports parallel processing of data. Results are forwarded to stdout and consistency of ordering is not guaranteed (and are highly likely to be out of order when compared to the input data stream).

Installation

Install Go version 1.12 or higher.

go get github.com/rapid7/godap

godap supports pcap and geoip, which provide an input and filters, respectively. To enable support for these, you must pass a libpcap or libgeoip tag to your go get command. You must also have those libraries installed on your system (libpcap-dev or libgeoip).

For example:

go get -tags="libpcap libgeoip" github.com/rapid7/godap

Will compile in support for both the pcap input and the geoip filters (geo_ip and geo_ip_org)

Usage

Quick Setup for GeoIP Lookups

Note: The documentation below assumes you've properly setup $GOPATH and $PATH (usually $GOPATH/bin:$PATH) per the official golang documentation.

$ go get github.com/rapid7/godap
$ sudo bash
# mkdir -p /var/lib/geoip && cd /var/lib/geoip && wget http://geolite.maxmind.com/download/geoip/database/GeoLiteCity.dat.gz && gunzip GeoLiteCity.dat.gz && mv GeoLiteCity.dat geoip.dat
$  echo 8.8.8.8 | godap lines + geo_ip line + json
{"line":"8.8.8.8","line.country_code":"US","line.country_code3":"USA","line.country_name":"United States","line.latitude":"38.0","line.longitude":"-97.0"}

Where dap gets fun is doing transforms, like just grabbing the country code:

$  echo 8.8.8.8 | godap lines + geo_ip line + select line.country_code3 + lines
USA

Inputs, filters and outputs

The general syntax when calling godap is godap <input> + (<filter +> <filter +> <filter +> ...) + <output>, where input, filter and output correspond to one of the supported features below. Filters are optional, though an input and output are required. Each feature component is separated by +. Component options are specified immediately after the component declaration. For example, streaming from a wifi adapter and spitting out json documents would resemble: godap pcap iface=en0 rfmon=true + json. Component options with spaces or other complexities can be specified using shell-like quoting. For example, for a bpf pcap filter on the pcap component: godap pcap iface=en0 'filter="tcp port 80"' + json.

Inputs

Specifies that the input stream is a packet capture. Currently supports streaming in from a file or interface.

NOTE: godap MUST be built using -tags="libpcap" for pcap filter support. If not, the pcap filter will be unavailable.

Option Description Value Default
iface The interface to read packets from. If iface is specified, file must not be specified string interface id <none>
file The pcap file to read from. If file is specified, iface must not be specified string filename <none>
promisc Whether to capture in promiscuous mode boolean false
timeout The capture timeout integer -1 (inf)
snaplen The snap length integer 65536
rfmon Whether to capture in monitor mode (applicable only to adapters which support it) boolean false

Example:

  • Pull packets in from monitor mode: godap pcap iface=en0 rfmon=true + json

  • Read pcap (or pcap-ng) file contents and convert to json: godap pcap file=foo.pcap + json

  • Live capture in promiscuous mode: godap pcap iface=en0 promisc=true + json

  • json

Specifies that the input stream is represented as JSON data.

Option Description Value Default
file The file to stream from. If not specified, stdin is assumed. Can also be - for stdin. string filename stdin

Example:

$  echo '{"a":2}' | godap json + lines
2

Specifies that the input stream is represented as newline-terminated plaintext.

Option Description Value Default
file The file to stream from. If not specified, stdin is assumed. Can also be - for stdin. string filename stdin

Example:

$  echo hello world | godap lines + json
{"line":"hello world"}

Filters

Renames a document field.

Option Description Value Default
<document key> The document key to rename string <destination key> <none>

Example:

$  echo world | godap lines + rename line=hello + json
{"hello":"world"}

Prevents a document from further processing if a specified key is not present

Option Description Value Default
<document key> The document key that must not exist <none> <none>

Example:

$  echo '{"foo":"bar"}' | godap json + not_exists foo + json

Extracts comma-separated fields in the specified key's value into new documents

Option Description Value Default
<document key> The document key which will be split <none> <none>

Example:

$  echo '{"foo":"bar,baz"}' | godap json + split_comma foo + json
{"foo":"bar,baz","foo.word":"bar"}
{"foo":"bar,baz","foo.word":"baz"}
  • field_split_line

Extracts fields separated by a newline from the source key's value into new fields of the same document. Each new key is named <origkey>.f### where ### is an incremental integer indicating the original field position from left to right.

Option Description Value Default
<document key> The document key which will be split <none> <none>

Example:

$  echo '{"foo":"bar\nbaz"}' | godap json + field_split_line foo + json
{"foo":"bar\nbaz","foo.f1":"bar","foo.f2":"baz"}

Filters out a document if the value for a given key is empty

Option Description Value Default
<document key> The document key to filter on <none> <none>

Example:

$  echo '{"foo":"bar,baz"}' | godap json + not_empty foo + json
{"foo":"bar,baz"}

Splits a key into multiple new subkeys each containing a field from the original value split by \t. Each new key is named <origkey>.f### where ### is an incremental integer indicating the original field position from left to right.

Option Description Value Default
<document key> The document key to split <none> <none>

Example:

$  echo '{"foo":"bar\tbaz"}' | godap json + field_split_tab foo + json
{"foo":"bar\tbaz","foo.f1":"bar","foo.f2":"baz"}

Sets the value of the specified key to the empty string

Option Description Value Default
<document key> The key to truncate <none> <none>

Example:

$  echo '{"foo":"bar\tbaz"}' | godap json + truncate foo + json
{"foo":""}

Flattens pesky nested json into "key.name" properties of the top-level document. The original key/value are left untouched (you can remove them using the remove filter).

Option Description Value Default
<document key> The key to flatten <none> <none>

Example:

$ echo '{"foo":{"bar": "baz"}}' | godap json + flatten foo + json
{"foo":{"bar":"baz"},"foo.bar":"baz"}

Adds a new value to the document

Option Description Value Default
<document key> The key to truncate <document value> <empty string>

Example:

$  echo '{"foo":"bar\tbaz"}' | godap json + insert a=b + json
{"a":"b","foo":"bar\tbaz"}

Splits a field that contains an array data type value into multiple new fields. Each new key is named <origkey>.f### where ### is an incremental integer indicating the original field position from left to right. The array can contain multiple different data types.

Option Description Value Default
<document key> The key to split <none> <none>

Example:

$  echo '{"foo":["a",2]}' | godap json + field_split_array foo + json
{"foo":["a",2],"foo.f1":"a","foo.f2":2}

Ensures the specified key exists in the source document. If it does not, the document is removed from the pipeline.

Option Description Value Default
<document key> The key that must exist <none> <none>

Example:

$  echo '{"foo":"bar\tbaz"}' | godap json + exists a + json
$  echo '{"foo":"bar\tbaz"}' | godap json + exists foo + json
{"foo":"bar\tbaz"}

Splits a given key's value into multiple new documents with the same key name, each document containing a field extracted from the source key's value separated by a newline.

Option Description Value Default
<document key> The key to split <none> <none>

Example:

$  echo '{"foo":"bar\nbaz"}' | godap json + split_line foo + json
{"foo":"bar\nbaz","foo.line":"bar"}
{"foo":"bar\nbaz","foo.line":"baz"}

Keeps only the specified keys in the resulting document. Multiple key names can be specified.

Option Description Value Default
<document key> The key to keep <none> <none>

Example:

$  echo '{"foo":"bar", "baz":"qux", "a":"b"}' | godap json + select foo + json
{"foo":"bar"}
$  echo '{"foo":"bar", "baz":"qux", "a":"b"}' | godap json + select foo baz + json
{"baz":"qux","foo":"bar"}

Removes the specified keys from the source document.

Option Description Value Default
<document key> The key to remove <none> <none>

Example:

$  echo '{"foo":"bar", "baz":"qux", "a":"b"}' | godap json + remove foo baz + json
{"a":"b"}

Ensures a document key includes a specified string.

Option Description Value Default
<document key> The key to remove string contains_str <none>

Example:

$  echo '{"foo":"bar", "baz":"qux", "a":"b"}' | godap json + include a=c + json
$  echo '{"foo":"bar", "baz":"qux", "a":"b"}' | godap json + include a=b + json
{"a":"b","baz":"qux","foo":"bar"}

Ensures a document key includes a specified string.

Option Description Value Default
<document key> The key to transform utf8encode or ascii or base64encode or base64decode or upcase or downcase or hexencode or ascii <none>

Example:

$  echo '{"foo":"bar"}' | godap json + transform foo=base64encode + json
{"foo":"YmFy"}

Reverses the string contents of one or more fields. If the field is not a string, this is a no-op.

Option Description Value Default
<field name> The name of the string field to reverse <none> <none>

Example:

$ echo '{"foo":"baz","bar":"qux"}' | godap json + reverse foo bar + json
{"bar":"xuq","foo":"zab"}

Joins one or more source fields into a destination field, separated by a comma. This filter will attempt to cast the source field types to a string representation.

Option Description Value Default
source A comma-separated list of source field names <string field name comma separated> <none>
dest A destination field which will receive the join result <string field name> <none>
sep A separator to join the fields by <string> ,

Example:

$ echo '{"foo":"baz","bar":"qux"}' | godap json + join source=foo,bar dest=example sep=# + json
{"bar":"qux","example":"baz#qux","foo":"baz"}

Runs an input field through a given database, adding recog match fields to the input document.

Option Description Value Default
<field_name>=<database_name> one or more space separated pairs of a field name "=" a recog database name (the name from the recog xml matches attribute) <string field name "=" string database name, space separated> <none>

Example:

$ echo "9.8.2rc1-RedHat-9.8.2-0.62.rc1.el6_9.2" | godap lines + recog line=dns.versionbind + json
{"line":"9.8.2rc1-RedHat-9.8.2-0.62.rc1.el6_9.2","line.recog.os.cpe23":"cpe:/o:redhat:enterprise_linux:6","line.recog.os.family":"Linux","line.recog.os.product":"Enterprise Linux","line.recog.os.vendor":"Red Hat","line.recog.os.version":"6","line.recog.os.version.version":"9","line.recog.service.cpe23":"cpe:/a:isc:bind:9.8.2rc1","line.recog.service.family":"BIND","line.recog.service.product":"BIND","line.recog.service.vendor":"ISC","line.recog.service.version":"9.8.2rc1"}

Outputs

Documentation

The Go Gopher

There is no documentation for this package.

Directories

Path Synopsis

Jump to

Keyboard shortcuts

? : This menu
/ : Search site
f or F : Jump to
y or Y : Canonical URL