counterd

module
v0.0.0-...-a548f54
Published: Feb 11, 2018 License: MIT

counterd daemon

This project implements a simple daemon for counting unique events for a set of attributes. The counterd daemon accepts events that look like:

{
    "id": "3D8125BD-BEE4-4E90-A15F-81F42C380C55",
    "date": "2018-01-31T01:12:53Z",
    "attributes": {
        "foo": "bar",
        "zip": "zap"
    }
}

The id uniquely identifies the event, and the attributes can be an arbitrary set of key/value pairs. The date can be omitted, in which case the server substitutes the current time.

When a new event is received, it is turned into a set of counter keys, and the corresponding HyperLogLog key in Redis is updated for each. For the above event, the keys that would be updated are:

  • day:2018-01-31:foo:bar:zip:zap
  • week:2018-01-28:foo:bar:zip:zap
  • month:2018-01:foo:bar:zip:zap

By using a HyperLogLog key in Redis, counterd can handle many thousands of updates per second. The tradeoff is that the count of unique events is not perfectly accurate. However, this value is accurate within a few percentage points. See the Redis documentation for more details.

To make the counters more usable, counterd supports snapshotting the counter values into a PostgreSQL database. When a snapshot is taken, the domain of key/value pairs seen so far is recorded in the attributes_domain table, and the counters are updated in the counters table.

Here is an example of each:

select * from attributes_domain;
 attribute |   value
-----------+-----------
 foo       | bar
 zip       | zap
(2 rows)

select * from counters limit 5;
                  id                  | interval |        date         |           attributes            | count
--------------------------------------+----------+---------------------+---------------------------------+-------
 0b9e8f7e-6f61-43c9-9f93-cc4a2a28cd80 | day      | 2018-01-30 00:00:00 | {"foo": "bar", "zip": "zap"}    |   406

This format makes it easy to query for the sum across various attributes:

select sum(count) from counters where attributes->'foo' = '"bar"';
  sum
-------
  406
(1 row)

Usage

The counterd command has a few subcommands:

* server: Runs a long-lived daemon which serves the API and can optionally snapshot periodically.
* snapshot: Used to snapshot the counters and update the database.
* sim: Used to simulate input to the server API. Used for testing and benchmarking.
* dbinit: Used to initialize the database and create the needed tables.

Each command documents its own arguments. All commands share a configuration file written in HCL (HashiCorp Configuration Language). Below is an example file:


// Configures the listen address for the API server. Below is the default.
listen_address = "127.0.0.1:8001"

// Configures the address of the redis server to use. Below is the default.
redis_address = "127.0.0.1:6379"

// Provides the address of the postgresql database in URL format. Below is the default.
postgresql_address = "postgres://postgres@localhost/postgres?sslmode=disable"

// Configure details of the snapshot
snapshot {
    // Configures how often the server daemon should perform snapshotting.
    // By default this is blank, and snapshotting is disabled. The cron syntax
    // is documented here: https://godoc.org/github.com/robfig/cron
    cron = "@hourly"

    // Configures which counter values to update in the database. The update threshold
    // is how long before the current time to scan for counters and update the database.
    // As an example, if set to "24h", all counters that could have been modified by
    // a date in the last 24h will be updated. Defaults to 3 hours.
    update_threshold = "3h"

    // Configures which counter values to delete from redis. The delete threshold
    // is how long before the current time to scan for counters and delete from redis.
    // As an example, if set to "2232h" (93 days, roughly 3 months), all counters older
    // than that would be deleted. Defaults to 3 months.
    delete_threshold = "2232h"
}

// Configure optional authentication
auth {
    // Required is used to optionally enable authentication. When enabled, an API client
    // must provide an "Authorization: Bearer <Token>" header to authenticate. Defaults to false.
    required = false

    // Tokens is a list of bearer tokens that are authorized to use the API.
    // Any number of tokens can be specified.
    tokens = ["D0816608-AB58-4AC8-9563-8D9F13B2F89D", "31937DCC-748A-4F4C-B568-016E3293B60D"]
}

// Configure optional filtering of attributes
attributes {
    // Whitelist is used to filter the set of attribute keys to only those explicitly in the list.
    // Any other attribute keys will be ignored.
    whitelist = ["foo", "bar"]

    // Blacklist is used to filter the set of attribute keys to exclude those in the list.
    // Any other attribute keys will be allowed.
    blacklist = ["zip"]
}
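The attribute filtering described in the configuration can be sketched in Go as follows. The function `filterAttributes` is hypothetical, and the source does not say how the two lists interact; this sketch assumes that an empty list disables the corresponding filter and that the whitelist is applied before the blacklist.

```go
package main

import "fmt"

// filterAttributes applies an optional whitelist and blacklist to attrs.
// Assumptions (not confirmed by the source): an empty list disables that
// filter, and when both are set the whitelist is applied first.
func filterAttributes(attrs map[string]string, whitelist, blacklist []string) map[string]string {
	allowed := make(map[string]bool, len(whitelist))
	for _, key := range whitelist {
		allowed[key] = true
	}
	denied := make(map[string]bool, len(blacklist))
	for _, key := range blacklist {
		denied[key] = true
	}

	filtered := make(map[string]string)
	for key, value := range attrs {
		if len(whitelist) > 0 && !allowed[key] {
			continue // not explicitly whitelisted
		}
		if denied[key] {
			continue // explicitly blacklisted
		}
		filtered[key] = value
	}
	return filtered
}

func main() {
	attrs := map[string]string{"foo": "bar", "zip": "zap"}
	// Uses the whitelist and blacklist values from the example file above.
	fmt.Println(filterAttributes(attrs, []string{"foo", "bar"}, []string{"zip"}))
	// prints map[foo:bar]
}
```

Under these assumptions, the example event would be counted only under its foo attribute, because zip is absent from the whitelist and also present in the blacklist.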

API

The counterd daemon serves a REST API over HTTP. The endpoints are documented below.

/v1/ingress

This endpoint is used to ingest a new event. It supports the PUT method and expects a JSON object as the request body, in the following format:

{
    "id": "3D8125BD-BEE4-4E90-A15F-81F42C380C55",
    "date": "2018-01-31T01:12:53Z",
    "attributes": {
        "foo": "bar",
        "zip": "zap"
    }
}

The id field must uniquely identify the event. The attributes can be an arbitrary set of key/value pairs, but neither keys nor values may contain the reserved colon (":") character. The date can be omitted, in which case the server substitutes the current time.

The server will return a 200 response code and no body on success.
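A client for this endpoint might look like the following Go sketch, which uses only the standard library. The `Event` type and `newIngressRequest` helper are hypothetical names introduced for illustration. The Content-Type header is an assumption, since the source does not say whether the server requires it, and the client-side colon check is a convenience rather than documented behavior.

```go
package main

import (
	"bytes"
	"encoding/json"
	"fmt"
	"net/http"
	"strings"
	"time"
)

// Event mirrors the JSON request body documented above.
type Event struct {
	ID         string            `json:"id"`
	Date       *time.Time        `json:"date,omitempty"` // nil lets the server use the current time
	Attributes map[string]string `json:"attributes"`
}

// newIngressRequest builds a PUT request for /v1/ingress.
// An empty token leaves the Authorization header unset.
func newIngressRequest(baseURL, token string, ev Event) (*http.Request, error) {
	// Reject the reserved colon early instead of waiting for the server.
	for key, value := range ev.Attributes {
		if strings.Contains(key, ":") || strings.Contains(value, ":") {
			return nil, fmt.Errorf("attribute %q=%q contains the reserved colon", key, value)
		}
	}
	body, err := json.Marshal(ev)
	if err != nil {
		return nil, err
	}
	req, err := http.NewRequest(http.MethodPut, baseURL+"/v1/ingress", bytes.NewReader(body))
	if err != nil {
		return nil, err
	}
	req.Header.Set("Content-Type", "application/json") // assumption, see lead-in
	if token != "" {
		req.Header.Set("Authorization", "Bearer "+token)
	}
	return req, nil
}

func main() {
	ev := Event{
		ID:         "3D8125BD-BEE4-4E90-A15F-81F42C380C55",
		Attributes: map[string]string{"foo": "bar", "zip": "zap"},
	}
	// 127.0.0.1:8001 is the default listen address from the example config.
	req, err := newIngressRequest("http://127.0.0.1:8001", "", ev)
	if err != nil {
		fmt.Println("invalid event:", err)
		return
	}
	resp, err := http.DefaultClient.Do(req)
	if err != nil {
		fmt.Println("request failed:", err)
		return
	}
	defer resp.Body.Close()
	if resp.StatusCode != http.StatusOK {
		fmt.Println("unexpected status:", resp.Status)
		return
	}
	fmt.Println("event accepted")
}
```

When authentication is enabled in the configuration, pass one of the configured tokens as the second argument so that the bearer header is sent.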

Caveats

The counter structure used means there is a key in Redis and a row in the database for every distinct combination of attribute values. If you have a very large domain of attributes (many keys or values), you should ensure Redis has enough memory to store all the counters and that your database is appropriately sized.
