grok

v1.0.0
Published: Jul 11, 2017 License: Apache-2.0 Imports: 4 Imported by: 15

README

Grok

This is a fork of github.com/vjeantet/grok with improved concurrency. The fork is not 100% API compatible with the original, but the underlying implementation is (mostly) the same.

The main intention of this fork is to remove all mutexes from the library so that it scales properly when used from multiple goroutines. In addition, because grok is an extension of the regexp package, the function scheme of this library should be closer to that of Go's regexp package.

Changes

  • All patterns have to be known at creation time
  • No storage of known grok expressions (has to be done by the user, similar to the Go regexp package)
  • No Mutexes used anymore (this library now scales as it should)
  • No Graphsort required anymore to resolve dependencies
  • All known patterns text files have been converted to go maps
  • Structured code to make it easier to maintain
  • Added tgo.ttesting dependency to make unit tests easier to write
  • Fixed type hint case sensitivity and added string type
  • Added []byte based functions
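
To illustrate why a dependency sort becomes unnecessary once all patterns are known up front, here is a minimal sketch that resolves %{NAME} and %{NAME:field} references by plain recursion over a pattern map. The expand function, its depth limit, and the way fields are turned into named groups are assumptions made for this example; the library's actual implementation may differ.

```go
package main

import (
	"fmt"
	"regexp"
)

// ref matches grok references such as %{NAME} or %{NAME:field}.
var ref = regexp.MustCompile(`%\{(\w+)(?::(\w+))?\}`)

// expand replaces every reference in expr with its definition from patterns,
// recursing into nested references. depth guards against cyclic definitions.
// Hypothetical helper, not the library's code.
func expand(expr string, patterns map[string]string, depth int) (string, error) {
	if depth > 32 {
		return "", fmt.Errorf("pattern nesting too deep (cycle?)")
	}
	var firstErr error
	out := ref.ReplaceAllStringFunc(expr, func(m string) string {
		parts := ref.FindStringSubmatch(m)
		name, field := parts[1], parts[2]
		def, ok := patterns[name]
		if !ok {
			if firstErr == nil {
				firstErr = fmt.Errorf("unknown pattern %q", name)
			}
			return m
		}
		sub, err := expand(def, patterns, depth+1)
		if err != nil {
			if firstErr == nil {
				firstErr = err
			}
			return m
		}
		if field != "" {
			return "(?P<" + field + ">" + sub + ")" // named capture for the field
		}
		return "(?:" + sub + ")" // plain reference, no capture
	})
	if firstErr != nil {
		return "", firstErr
	}
	return out, nil
}

func main() {
	patterns := map[string]string{
		"USERNAME": `[a-zA-Z0-9._-]+`,
		"USER":     `%{USERNAME}`,
	}
	expr, err := expand(`%{USER:name} logged in`, patterns, 0)
	if err != nil {
		panic(err)
	}
	fmt.Println(expr) // (?P<name>(?:[a-zA-Z0-9._-]+)) logged in
	re := regexp.MustCompile(expr)
	fmt.Println(re.FindStringSubmatch("alice logged in")[1]) // alice
}
```

Because the map is complete before any expression is expanded, the order in which patterns were defined does not matter.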

Benchmarks

Original version

BenchmarkNew-8                      2000        899731 ns/op      720324 B/op       3438 allocs/op
BenchmarkCaptures-8                10000        200695 ns/op        4570 B/op          5 allocs/op
BenchmarkCapturesTypedFake-8       10000        197983 ns/op        4571 B/op          5 allocs/op
BenchmarkCapturesTypedReal-8       10000        206392 ns/op        4754 B/op         16 allocs/op
BenchmarkParallelCaptures-8        10000        208389 ns/op        4570 B/op          5 allocs/op (added locally)

This version

BenchmarkNew-8                      5000        357586 ns/op      285374 B/op       1611 allocs/op
BenchmarkCaptures-8                10000        200825 ns/op        4570 B/op          5 allocs/op
BenchmarkCapturesTypedFake-8       10000        197306 ns/op        4570 B/op          5 allocs/op
BenchmarkCapturesTypedReal-8       10000        194882 ns/op        4140 B/op         12 allocs/op
BenchmarkParallelCaptures-8        30000         55583 ns/op        4576 B/op          5 allocs/op

Improvements

BenchmarkNew-8                     +150%
BenchmarkParallelCaptures-8        +274%
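
The percentages above are consistent with comparing the ns/op columns of the two runs. As a rough check, assuming "improvement" means the gain in operations per unit of time, the figures can be recomputed as follows:

```go
package main

import "fmt"

// speedupPercent returns how much more work per unit of time the new version
// performs relative to the old one, given the ns/op value of each.
func speedupPercent(oldNs, newNs float64) float64 {
	return (oldNs/newNs - 1) * 100
}

func main() {
	// ns/op values taken from the benchmark tables above.
	fmt.Printf("BenchmarkNew-8              %+.1f%%\n", speedupPercent(899731, 357586))
	fmt.Printf("BenchmarkParallelCaptures-8 %+.1f%%\n", speedupPercent(208389, 55583))
}
```

This yields roughly +151.6% and +274.9%, which matches the listed values after rounding down.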

Documentation

Constants

This section is empty.

Variables

var DefaultPatterns = map[string]string{
	"USERNAME":           `[a-zA-Z0-9._-]+`,
	"USER":               `%{USERNAME}`,
	"EMAILLOCALPART":     `[a-zA-Z][a-zA-Z0-9_.+-=:]+`,
	"EMAILADDRESS":       `%{EMAILLOCALPART}@%{HOSTNAME}`,
	"HTTPDUSER":          `%{EMAILADDRESS}|%{USER}`,
	"INT":                `(?:[+-]?(?:[0-9]+))`,
	"BASE10NUM":          `([+-]?(?:[0-9]+(?:\.[0-9]+)?)|\.[0-9]+)`,
	"NUMBER":             `(?:%{BASE10NUM})`,
	"BASE16NUM":          `(0[xX]?[0-9a-fA-F]+)`,
	"POSINT":             `\b(?:[1-9][0-9]*)\b`,
	"NONNEGINT":          `\b(?:[0-9]+)\b`,
	"WORD":               `\b\w+\b`,
	"NOTSPACE":           `\S+`,
	"SPACE":              `\s*`,
	"DATA":               `.*?`,
	"GREEDYDATA":         `.*`,
	"QUOTEDSTRING":       `"([^"\\]*(\\.[^"\\]*)*)"|\'([^\'\\]*(\\.[^\'\\]*)*)\'`,
	"UUID":               `[A-Fa-f0-9]{8}-(?:[A-Fa-f0-9]{4}-){3}[A-Fa-f0-9]{12}`,
	"MAC":                `(?:%{CISCOMAC}|%{WINDOWSMAC}|%{COMMONMAC})`,
	"CISCOMAC":           `(?:(?:[A-Fa-f0-9]{4}\.){2}[A-Fa-f0-9]{4})`,
	"WINDOWSMAC":         `(?:(?:[A-Fa-f0-9]{2}-){5}[A-Fa-f0-9]{2})`,
	"COMMONMAC":          `(?:(?:[A-Fa-f0-9]{2}:){5}[A-Fa-f0-9]{2})`,
	"IPV6":               `((([0-9A-Fa-f]{1,4}:){7}([0-9A-Fa-f]{1,4}|:))|(([0-9A-Fa-f]{1,4}:){6}(:[0-9A-Fa-f]{1,4}|((25[0-5]|2[0-4]\d|1\d\d|[1-9]?\d)(\.(25[0-5]|2[0-4]\d|1\d\d|[1-9]?\d)){3})|:))|(([0-9A-Fa-f]{1,4}:){5}(((:[0-9A-Fa-f]{1,4}){1,2})|:((25[0-5]|2[0-4]\d|1\d\d|[1-9]?\d)(\.(25[0-5]|2[0-4]\d|1\d\d|[1-9]?\d)){3})|:))|(([0-9A-Fa-f]{1,4}:){4}(((:[0-9A-Fa-f]{1,4}){1,3})|((:[0-9A-Fa-f]{1,4})?:((25[0-5]|2[0-4]\d|1\d\d|[1-9]?\d)(\.(25[0-5]|2[0-4]\d|1\d\d|[1-9]?\d)){3}))|:))|(([0-9A-Fa-f]{1,4}:){3}(((:[0-9A-Fa-f]{1,4}){1,4})|((:[0-9A-Fa-f]{1,4}){0,2}:((25[0-5]|2[0-4]\d|1\d\d|[1-9]?\d)(\.(25[0-5]|2[0-4]\d|1\d\d|[1-9]?\d)){3}))|:))|(([0-9A-Fa-f]{1,4}:){2}(((:[0-9A-Fa-f]{1,4}){1,5})|((:[0-9A-Fa-f]{1,4}){0,3}:((25[0-5]|2[0-4]\d|1\d\d|[1-9]?\d)(\.(25[0-5]|2[0-4]\d|1\d\d|[1-9]?\d)){3}))|:))|(([0-9A-Fa-f]{1,4}:){1}(((:[0-9A-Fa-f]{1,4}){1,6})|((:[0-9A-Fa-f]{1,4}){0,4}:((25[0-5]|2[0-4]\d|1\d\d|[1-9]?\d)(\.(25[0-5]|2[0-4]\d|1\d\d|[1-9]?\d)){3}))|:))|(:(((:[0-9A-Fa-f]{1,4}){1,7})|((:[0-9A-Fa-f]{1,4}){0,5}:((25[0-5]|2[0-4]\d|1\d\d|[1-9]?\d)(\.(25[0-5]|2[0-4]\d|1\d\d|[1-9]?\d)){3}))|:)))(%.+)?`,
	"IPV4":               `(?:(?:25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)\.){3}(?:25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)`,
	"IP":                 `(?:%{IPV6}|%{IPV4})`,
	"HOSTNAME":           `\b(?:[0-9A-Za-z][0-9A-Za-z-]{0,62})(?:\.(?:[0-9A-Za-z][0-9A-Za-z-]{0,62}))*(\.?|\b)`,
	"HOST":               `%{HOSTNAME}`,
	"IPORHOST":           `(?:%{IP}|%{HOSTNAME})`,
	"HOSTPORT":           `%{IPORHOST}:%{POSINT}`,
	"PATH":               `(?:%{UNIXPATH}|%{WINPATH})`,
	"UNIXPATH":           `(/[\w_%!$@:.,-]?/?)(\S+)?`,
	"TTY":                `(?:/dev/(pts|tty([pq])?)(\w+)?/?(?:[0-9]+))`,
	"WINPATH":            `([A-Za-z]:|\\)(?:\\[^\\?*]*)+`,
	"URIPROTO":           `[A-Za-z]+(\+[A-Za-z+]+)?`,
	"URIHOST":            `%{IPORHOST}(?::%{POSINT:port})?`,
	"URIPATH":            `(?:/[A-Za-z0-9$.+!*'(){},~:;=@#%_\-]*)+`,
	"URIPARAM":           `\?[A-Za-z0-9$.+!*'|(){},~@#%&/=:;_?\-\[\]<>]*`,
	"URIPATHPARAM":       `%{URIPATH}(?:%{URIPARAM})?`,
	"URI":                `%{URIPROTO}://(?:%{USER}(?::[^@]*)?@)?(?:%{URIHOST})?(?:%{URIPATHPARAM})?`,
	"MONTH":              `\b(?:Jan(?:uary|uar)?|Feb(?:ruary|ruar)?|M(?:a|ä)?r(?:ch|z)?|Apr(?:il)?|Ma(?:y|i)?|Jun(?:e|i)?|Jul(?:y)?|Aug(?:ust)?|Sep(?:tember)?|O(?:c|k)?t(?:ober)?|Nov(?:ember)?|De(?:c|z)(?:ember)?)\b`,
	"MONTHNUM":           `(?:0?[1-9]|1[0-2])`,
	"MONTHNUM2":          `(?:0[1-9]|1[0-2])`,
	"MONTHDAY":           `(?:(?:0[1-9])|(?:[12][0-9])|(?:3[01])|[1-9])`,
	"DAY":                `(?:Mon(?:day)?|Tue(?:sday)?|Wed(?:nesday)?|Thu(?:rsday)?|Fri(?:day)?|Sat(?:urday)?|Sun(?:day)?)`,
	"YEAR":               `(\d\d){1,2}`,
	"HOUR":               `(?:2[0123]|[01]?[0-9])`,
	"MINUTE":             `(?:[0-5][0-9])`,
	"SECOND":             `(?:(?:[0-5]?[0-9]|60)(?:[:.,][0-9]+)?)`,
	"TIME":               `([^0-9]?)%{HOUR}:%{MINUTE}(?::%{SECOND})([^0-9]?)`,
	"DATE_US":            `%{MONTHNUM}[/-]%{MONTHDAY}[/-]%{YEAR}`,
	"DATE_EU":            `%{MONTHDAY}[./-]%{MONTHNUM}[./-]%{YEAR}`,
	"ISO8601_TIMEZONE":   `(?:Z|[+-]%{HOUR}(?::?%{MINUTE}))`,
	"ISO8601_SECOND":     `(?:%{SECOND}|60)`,
	"TIMESTAMP_ISO8601":  `%{YEAR}-%{MONTHNUM}-%{MONTHDAY}[T ]%{HOUR}:?%{MINUTE}(?::?%{SECOND})?%{ISO8601_TIMEZONE}?`,
	"DATE":               `%{DATE_US}|%{DATE_EU}`,
	"DATESTAMP":          `%{DATE}[- ]%{TIME}`,
	"TZ":                 `(?:[PMCE][SD]T|UTC)`,
	"DATESTAMP_RFC822":   `%{DAY} %{MONTH} %{MONTHDAY} %{YEAR} %{TIME} %{TZ}`,
	"DATESTAMP_RFC2822":  `%{DAY}, %{MONTHDAY} %{MONTH} %{YEAR} %{TIME} %{ISO8601_TIMEZONE}`,
	"DATESTAMP_OTHER":    `%{DAY} %{MONTH} %{MONTHDAY} %{TIME} %{TZ} %{YEAR}`,
	"DATESTAMP_EVENTLOG": `%{YEAR}%{MONTHNUM2}%{MONTHDAY}%{HOUR}%{MINUTE}%{SECOND}`,
	"HTTPDERROR_DATE":    `%{DAY} %{MONTH} %{MONTHDAY} %{TIME} %{YEAR}`,
	"SYSLOGTIMESTAMP":    `%{MONTH} +%{MONTHDAY} %{TIME}`,
	"PROG":               `[\x21-\x5a\x5c\x5e-\x7e]+`,
	"SYSLOGPROG":         `%{PROG:program}(?:\[%{POSINT:pid}\])?`,
	"SYSLOGHOST":         `%{IPORHOST}`,
	"SYSLOGFACILITY":     `<%{NONNEGINT:facility}.%{NONNEGINT:priority}>`,
	"HTTPDATE":           `%{MONTHDAY}/%{MONTH}/%{YEAR}:%{TIME} %{INT}`,
	"QS":                 `%{QUOTEDSTRING}`,
	"SYSLOGBASE":         `%{SYSLOGTIMESTAMP:timestamp} (?:%{SYSLOGFACILITY} )?%{SYSLOGHOST:logsource} %{SYSLOGPROG}:`,
	"COMMONAPACHELOG":    `%{IPORHOST:clientip} %{HTTPDUSER:ident} %{USER:auth} \[%{HTTPDATE:timestamp}\] "(?:%{WORD:verb} %{NOTSPACE:request}(?: HTTP/%{NUMBER:httpversion})?|%{DATA:rawrequest})" %{NUMBER:response} (?:%{NUMBER:bytes}|-)`,
	"COMBINEDAPACHELOG":  `%{COMMONAPACHELOG} %{QS:referrer} %{QS:agent}`,
	"HTTPD20_ERRORLOG":   `\[%{HTTPDERROR_DATE:timestamp}\] \[%{LOGLEVEL:loglevel}\] (?:\[client %{IPORHOST:clientip}\] ){0,1}%{GREEDYDATA:errormsg}`,
	"HTTPD24_ERRORLOG":   `\[%{HTTPDERROR_DATE:timestamp}\] \[%{WORD:module}:%{LOGLEVEL:loglevel}\] \[pid %{POSINT:pid}:tid %{NUMBER:tid}\]( \(%{POSINT:proxy_errorcode}\)%{DATA:proxy_errormessage}:)?( \[client %{IPORHOST:client}:%{POSINT:clientport}\])? %{DATA:errorcode}: %{GREEDYDATA:message}`,
	"HTTPD_ERRORLOG":     `%{HTTPD20_ERRORLOG}|%{HTTPD24_ERRORLOG}`,
	"LOGLEVEL":           `([Aa]lert|ALERT|[Tt]race|TRACE|[Dd]ebug|DEBUG|[Nn]otice|NOTICE|[Ii]nfo|INFO|[Ww]arn?(?:ing)?|WARN?(?:ING)?|[Ee]rr?(?:or)?|ERR?(?:OR)?|[Cc]rit?(?:ical)?|CRIT?(?:ICAL)?|[Ff]atal|FATAL|[Ss]evere|SEVERE|EMERG(?:ENCY)?|[Ee]merg(?:ency)?)`,
}

DefaultPatterns is a collection of patterns that are added to each Grok instance if not explicitly disabled.
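
Entries that contain no %{...} reference are ordinary Go regular expressions and can be tried out directly with the standard regexp package. The sketch below copies the UUID and IPV4 definitions from the map above; fullMatch is a hypothetical helper that anchors the pattern so the whole input has to match.

```go
package main

import (
	"fmt"
	"regexp"
)

// Definitions copied verbatim from DefaultPatterns.
const (
	uuid = `[A-Fa-f0-9]{8}-(?:[A-Fa-f0-9]{4}-){3}[A-Fa-f0-9]{12}`
	ipv4 = `(?:(?:25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)\.){3}(?:25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)`
)

// fullMatch reports whether s matches pattern from start to end.
func fullMatch(pattern, s string) bool {
	re := regexp.MustCompile(`^(?:` + pattern + `)$`)
	return re.MatchString(s)
}

func main() {
	fmt.Println(fullMatch(uuid, "123e4567-e89b-12d3-a456-426614174000")) // true
	fmt.Println(fullMatch(ipv4, "192.168.0.1"))                          // true
	fmt.Println(fullMatch(ipv4, "256.1.1.1"))                            // false
}
```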

Functions

This section is empty.

Types

type CompiledGrok

type CompiledGrok struct {
	// contains filtered or unexported fields
}

CompiledGrok represents a compiled Grok expression. Use Grok.Compile to generate a CompiledGrok object.

func (CompiledGrok) GetFields

func (compiled CompiledGrok) GetFields() []string

GetFields returns a list of all named fields in this grok expression

func (CompiledGrok) Match

func (compiled CompiledGrok) Match(data []byte) bool

Match returns true if the given data matches the pattern.

func (CompiledGrok) MatchString

func (compiled CompiledGrok) MatchString(text string) bool

MatchString returns true if the given text matches the pattern.

func (CompiledGrok) Parse

func (compiled CompiledGrok) Parse(data []byte) map[string][]byte

Parse processes the given data and returns a map containing the values of all named fields as byte slices. If a field is parsed more than once, the last match is returned.

func (CompiledGrok) ParseString

func (compiled CompiledGrok) ParseString(text string) map[string]string

ParseString processes the given text and returns a map containing the values of all named fields as strings. If a field is parsed more than once, the last match is returned.

func (CompiledGrok) ParseStringToMultiMap

func (compiled CompiledGrok) ParseStringToMultiMap(text string) map[string][]string

ParseStringToMultiMap acts like ParseString but allows multiple matches per field.

func (CompiledGrok) ParseStringTyped

func (compiled CompiledGrok) ParseStringTyped(text string) (map[string]interface{}, error)

ParseStringTyped processes the given text and returns a map containing the values of all named fields converted to their corresponding types. If no type hint is given, the value will be converted to string.
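
As a rough illustration of type-hint conversion, the sketch below maps a captured string to a Go value based on a hint, falling back to string when no hint is given. The hint names "int" and "float" and the convert helper are assumptions for this example (the changes list above only confirms a string type and case-insensitive hints); consult the package source for the hints it actually supports.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// convert turns a captured value into a typed value according to hint.
// Hints are compared case-insensitively; unknown or empty hints yield a string.
// Hypothetical helper; the supported hint names are assumed.
func convert(value, hint string) (interface{}, error) {
	switch strings.ToLower(hint) {
	case "int":
		return strconv.Atoi(value)
	case "float":
		return strconv.ParseFloat(value, 64)
	default: // "string" or no hint
		return value, nil
	}
}

func main() {
	v, _ := convert("404", "INT")
	fmt.Printf("%T %v\n", v, v) // int 404

	v, _ = convert("1.5", "float")
	fmt.Printf("%T %v\n", v, v) // float64 1.5

	v, _ = convert("GET", "")
	fmt.Printf("%T %v\n", v, v) // string GET

	_, err := convert("abc", "int")
	fmt.Println(err != nil) // true
}
```

Returning an error for a value that does not fit its hint is in line with the (map[string]interface{}, error) result of the typed functions.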

func (CompiledGrok) ParseToMultiMap

func (compiled CompiledGrok) ParseToMultiMap(data []byte) map[string][][]byte

ParseToMultiMap acts like Parse but allows multiple matches per field.

func (CompiledGrok) ParseTyped

func (compiled CompiledGrok) ParseTyped(data []byte) (map[string]interface{}, error)

ParseTyped processes the given data and returns a map containing the values of all named fields converted to their corresponding types. If no type hint is given, the value will be converted to string.

type Config

type Config struct {
	NamedCapturesOnly   bool
	SkipDefaultPatterns bool
	RemoveEmptyValues   bool
	Patterns            map[string]string
}

Config is used to pass a set of configuration values to the grok.New function.

type Grok

type Grok struct {
	// contains filtered or unexported fields
}

Grok holds a cache of known pattern substitutions and acts as a builder for compiled grok patterns. All pattern substitutions must be passed at creation time and cannot be changed during runtime.

func New

func New(config Config) (*Grok, error)

New returns a Grok object that caches a given set of patterns and creates compiled grok patterns based on the passed configuration settings. You can use multiple grok objects that act independently.

func (Grok) Compile

func (grok Grok) Compile(pattern string) (*CompiledGrok, error)

Compile precompiles a given grok expression. This function should be used when a grok expression is used more than once.

func (Grok) Match

func (grok Grok) Match(pattern string, data []byte) (bool, error)

Match returns true if the given data matches the pattern. The given pattern is compiled on every call to this function. If you want to call this function more than once, consider using Compile.

func (Grok) MatchString

func (grok Grok) MatchString(pattern, text string) (bool, error)

MatchString returns true if the given text matches the pattern. The given pattern is compiled on every call to this function. If you want to call this function more than once, consider using Compile.

func (Grok) Parse

func (grok Grok) Parse(pattern string, data []byte) (map[string][]byte, error)

Parse processes the given data and returns a map containing the values of all named fields as byte slices. If a field is parsed more than once, the last match is returned. The given pattern is compiled on every call to this function. If you want to call this function more than once, consider using Compile.

func (Grok) ParseString

func (grok Grok) ParseString(pattern, text string) (map[string]string, error)

ParseString processes the given text and returns a map containing the values of all named fields as strings. If a field is parsed more than once, the last match is returned. The given pattern is compiled on every call to this function. If you want to call this function more than once, consider using Compile.

func (Grok) ParseStringToMultiMap

func (grok Grok) ParseStringToMultiMap(pattern, text string) (map[string][]string, error)

ParseStringToMultiMap acts like ParseString but allows multiple matches per field. The given pattern is compiled on every call to this function. If you want to call this function more than once, consider using Compile.

func (Grok) ParseStringTyped

func (grok Grok) ParseStringTyped(pattern, text string) (map[string]interface{}, error)

ParseStringTyped processes the given text and returns a map containing the values of all named fields converted to their corresponding types. If no type hint is given, the value will be converted to string. The given pattern is compiled on every call to this function. If you want to call this function more than once, consider using Compile.

func (Grok) ParseToMultiMap

func (grok Grok) ParseToMultiMap(pattern string, data []byte) (map[string][][]byte, error)

ParseToMultiMap acts like Parse but allows multiple matches per field. The given pattern is compiled on every call to this function. If you want to call this function more than once, consider using Compile.

func (Grok) ParseTyped

func (grok Grok) ParseTyped(pattern string, data []byte) (map[string]interface{}, error)

ParseTyped processes the given data and returns a map containing the values of all named fields converted to their corresponding types. If no type hint is given, the value will be converted to string. The given pattern is compiled on every call to this function. If you want to call this function more than once, consider using Compile.
