dendrite

package module
v0.0.0-...-2fd0859
Published: Jul 6, 2015 License: MIT Imports: 20 Imported by: 0

README

Dendrite

Are you looking for downloads, a tutorial, or the manifesto?

Overview

Dendrite scrapes your existing logs and re-emits a unified log stream in modern, sensible, structured formats like JSON and StatsD over common protocols such as TCP, UDP, and file streams. SSL/TLS, RFC5424 Syslog, HTTP, etc., should be coming soon (Want to contribute?).

Why Dendrite?

Unified, structured logs and metrics are awesome.

Once you have a unified log stream, it's easy to build and use tools that consume, forward, and analyze your logs in scalable and understandable ways.

Logging is easier than instrumentation.

All applications generate logs. Not all applications are instrumented for metrics.

There are many disparate instrumentation libraries: JMX, StatsD, Metrics, Ostrich, and others. (We might start polling them later, if people find this convenient.)

Files are easy to read. Extracting metrics and statistics out of log files can be much easier than instrumenting an entire application to emit metrics.

Configure dendrite, not every application.

In today's open-source environment, it's common for a Ruby on Rails app, for example, to be served by HAProxy, Nginx, a Varnish server, a Rack server, and Rails itself. And then you'll want slow query logs from your database and your Redis server... and what about your work queue system...? The list goes on!

With Dendrite, it's easy to create and share useful configuration cookbooks for each of these services, drop them into your /etc/dendrite/conf.d directory, reload Dendrite, and be off and running with real-time metrics.

Dendrite is structured.

Logs are more than lines of text. Dendrite understands dates, numbers, counters, timings, searchable strings, fields, and more.

Dendrite is tiny.

Running the optimized agent on your servers typically consumes less than 5MB of RAM, and very little CPU.

Configuration

Dendrite loads a config file from /etc/dendrite/conf.yaml; this can be overridden with the -f flag. Dendrite then loads any YAML files it finds in a conf.d directory alongside the main conf.yaml file and merges their configurations into the main config. By convention, each conf.d file should contain only one source or destination group.

The primary YAML file looks like this:

global:
  # Where we store pointers into the current position of files
  offset_dir: /var/lib/dendrite
  
  # Number of existing bytes to consume when we observe a new file.  Equivalent
  # to tail -n, but with bytes instead of lines.  -1 for unlimited.
  max_backfill_bytes: 0
  # more keys may be added in later dendrite versions
sources:
  # ... (usually empty, delegated to the conf.d)
destinations:
  # ... (usually empty, delegated to the conf.d)

A typical conf.d YAML file looks like:

sources:
  # a key/name for the service
  syslog:
  
    # Asterisks, etc. are useful. Syntax is documented at
    # http://golang.org/pkg/path/filepath/#Match
    glob: /var/log/system.log

    # The log lines are parsed with a RE2 regex
    # (https://code.google.com/p/re2/wiki/Syntax). Named matching groups
    # become columns in the structured output.
    #
    # This pattern parses my OS X syslog. Syslog isn't consistent, 
    # so this may not work on your system.
    pattern: "(?P<date>.*?:[0-9]+) (?P<user>\\S+) (?P<prog>\\w+)\\[(?P<pid>\\d+)\\]: (?P<text>.*)"
    
    # The output of the regexp can be post-processed. This allows you
    # to specify type information, etc.
    #
    # Current field types are string, date, int, double and timestamp.
    #
    # There are "treatments," which include tokenized, as well as 
    # gauge, timing, and metric. The last few treatments are for 
    # specialized integers, and will be treated differently by statsd.
    fields:
      # tstamp is the field name in the output.
      tstamp:
        # date is the name of the regex match group.
        name: date
        type: timestamp
        format: "Jan _2 15:04:05"
      line: 
        # you can match numbered subgroups, in addition to named ones.
        group: 0
      tokens: 
        name: text
        type: string
        # this will create an array of the matched tokens.
        treatment: tokenized
        pattern: \S+\b
      text: 
        # If it weren't for the tokens field above, this would be
        # unnecessary. All named match groups are implicitly turned into
        # string fields. However, since I used the "text" match group
        # above, the implicit string field no longer exists.
        type: string
      pid:
        type: int

Or for a destination conf.d YAML file:

destinations:
  # a key/URL for the destination.  Typically, the scheme portion of the URL
  # is of the form transport+encoding.  We currently support statsd and
  # json encodings, as well as udp, tcp, and file transports.
  #
  # Also, the colon in URLs always needs to be quoted, so it isn't
  # confused with nested YAML.
  stats: "udp+statsd://foo.bar.com:1234"

  # another example
  tmp: "file+json:///tmp/json.log"

Look in the cookbook directory for more examples.

Getting Started

  1. Download the latest binary for your system from our downloads page. Alternately, if you have a Go environment and want to run from trunk, you can simply run go get github.com/onemorecloud/dendrite/cmd/dendrite.
  2. Walk through the tutorial.

Contributing

Join us in our Google Group or HipChat room to give feedback and chat about how you might be interested in using and/or contributing.

Ideas:

  • Implement a cookbook entry for your favorite log format.
  • Open issues
  • Tackle an output protocol/encoding you'd like to see included.

Documentation

Index

Constants

const (
	String = iota
	Integer
	Double
	Timestamp
)
const (
	Simple = iota
	Tokens
	Hash
	Gauge
	Metric
	Counter
)

Variables

var DefaultPattern = "(?P<line>.*?)\r?\n"
var EmptyReader = new(noOpReader)

Functions

func NewAnyReader

func NewAnyReader(r []io.Reader) io.Reader

func NewFileReadWriter

func NewFileReadWriter(path string) (io.ReadWriteCloser, error)

func NewReadWriter

func NewReadWriter(u *url.URL) (io.ReadWriteCloser, error)

func NewTCPReadWriter

func NewTCPReadWriter(u *url.URL) (io.ReadWriteCloser, error)

func NewUDPReadWriter

func NewUDPReadWriter(u *url.URL) (io.ReadWriteCloser, error)

func RecursiveMergeNoConflict

func RecursiveMergeNoConflict(a map[string]interface{}, b map[string]interface{}, path string) error

func Unescape

func Unescape(in string) string

func YamlUnmarshal

func YamlUnmarshal(node yaml.Node) interface{}

Types

type Column

type Column struct {
	Type      FieldType
	Treatment FieldTreatment
	Value     interface{}
}

type Config

type Config struct {
	OffsetDir        string
	MaxBackfillBytes int64
	MaxLineSizeBytes int64
	Destinations     []DestinationConfig
	Sources          []SourceConfig
}

func NewConfig

func NewConfig(configFile string, hostname string) (*Config, error)

Mostly delegate

func (*Config) CreateAllTailGroups

func (config *Config) CreateAllTailGroups(drain chan Record) TailGroups

func (*Config) CreateDestinations

func (config *Config) CreateDestinations() Destinations

type Destination

type Destination struct {
	Encoder Encoder
	RW      io.ReadWriter
}

func NewDestination

func NewDestination(config DestinationConfig) (*Destination, error)

type DestinationConfig

type DestinationConfig struct {
	Name string
	Url  *url.URL
}

type Destinations

type Destinations []*Destination

func NewDestinations

func NewDestinations() Destinations

func (*Destinations) Consume

func (dests *Destinations) Consume(ch chan Record, finished chan bool)

func (*Destinations) Reader

func (dests *Destinations) Reader() io.Reader

type Encoder

type Encoder interface {
	Encode(out map[string]Column, writer io.Writer)
}

func NewEncoder

func NewEncoder(u *url.URL) (Encoder, error)

type FieldConfig

type FieldConfig struct {
	Name      string
	Alias     string
	Type      FieldType
	Treatment FieldTreatment
	Group     int
	Format    string
	Pattern   *regexp.Regexp
	Salt      string
}

type FieldTreatment

type FieldTreatment int

type FieldType

type FieldType int

type JsonEncoder

type JsonEncoder struct{}

func (*JsonEncoder) Encode

func (*JsonEncoder) Encode(out map[string]Column, writer io.Writer)

type Parser

type Parser interface {
	Consume(bytes []byte, counter *int64)
}

func NewRegexpParser

func NewRegexpParser(hostname string, group string, file string, output chan Record, pattern string, fields []FieldConfig, maxLineSize int64) Parser

type RawStringEncoder

type RawStringEncoder struct{}

func (*RawStringEncoder) Encode

func (*RawStringEncoder) Encode(out map[string]Column, writer io.Writer)

type Record

type Record map[string]Column

type RegexpParser

type RegexpParser struct {
	// contains filtered or unexported fields
}

func (*RegexpParser) Consume

func (parser *RegexpParser) Consume(bytes []byte, counter *int64)

type SourceConfig

type SourceConfig struct {
	Glob             string
	Pattern          string
	Fields           []FieldConfig
	Name             string
	OffsetDir        string
	Hostname         string
	MaxBackfillBytes int64
	MaxLineSizeBytes int64
}

type StatsdEncoder

type StatsdEncoder struct{}

func (*StatsdEncoder) Encode

func (*StatsdEncoder) Encode(out map[string]Column, writer io.Writer)

type SystemTimeProvider

type SystemTimeProvider struct{}

func (*SystemTimeProvider) Now

func (*SystemTimeProvider) Now() time.Time

type Tail

type Tail struct {
	Path       string
	OffsetPath string
	Watcher    watch.FileWatcher
	Parser     Parser
	// contains filtered or unexported fields
}

func NewTail

func NewTail(parser Parser, maxBackfill int64, path string, offsetPath string, offset int64) *Tail

func (*Tail) Close

func (tail *Tail) Close()

func (*Tail) LoadOffset

func (tail *Tail) LoadOffset()

func (*Tail) Offset

func (tail *Tail) Offset() int64

func (*Tail) Poll

func (tail *Tail) Poll()

func (*Tail) SetOffset

func (tail *Tail) SetOffset(o int64)

func (*Tail) StartWatching

func (tail *Tail) StartWatching()

func (*Tail) Stat

func (tail *Tail) Stat() (fi os.FileInfo, err error)

func (*Tail) WriteOffset

func (tail *Tail) WriteOffset()

type TailGroup

type TailGroup struct {
	Glob      string
	Pattern   string
	OffsetDir string
	Name      string
	Hostname  string
	Tails     map[string]*Tail
	// contains filtered or unexported fields
}

func NewTailGroup

func NewTailGroup(config SourceConfig, output chan Record) *TailGroup

func (*TailGroup) NewParser

func (group *TailGroup) NewParser(file string) Parser

func (*TailGroup) Poll

func (group *TailGroup) Poll()

func (*TailGroup) Refresh

func (group *TailGroup) Refresh()

type TailGroups

type TailGroups []*TailGroup

func (*TailGroups) Loop

func (groups *TailGroups) Loop()

func (*TailGroups) Poll

func (groups *TailGroups) Poll()

func (*TailGroups) Refresh

func (groups *TailGroups) Refresh()

type TimeProvider

type TimeProvider interface {
	Now() time.Time
}

var StandardTimeProvider TimeProvider = new(SystemTimeProvider)

Directories

Path Synopsis
cmd
