consul2pd

command module
v0.0.0-...-529526c Latest
Published: Mar 21, 2018 License: MIT Imports: 13 Imported by: 0

README

Consul 2 Pagerduty

Overview:

Health checks in Consul generate events, and you can watch those events. More details can be found [here](https://www.consul.io/docs/agent/watches.html).

PagerDuty is a service for alert notifications; they call it incident management. It's not a free service, so check whether you really want to use [them](https://www.pagerduty.com/). They do have a 30-day free trial, which is useful for testing.

Consul events contain the full overview of a service. If you have the same service on 3 hosts and 1 is failing, all 3 services are reported in the event you are watching and you will see 1 failing and 2 passing. All the information in the event from consul is added to the PD event as details.

If you plan to use Consul for monitoring, you should look at something like CCH. Consul can be very noisy, and failing or removing a service on the first failure isn't always useful. CCH lets you set parameters around your checks that are not present in Consul service monitoring.

Consul2pd:

Consul2pd pushes the events 1:1 to PagerDuty. It requires nearly no configuration.

  • You have to supply your PD API key to Consul.
  • You can supply a proxy. It will read the HTTPS_PROXY environment variable if set.
  • You can add URL information to support troubleshooting links.
Configuration Details and options:

Consul2pd checks the Consul KV store for the API key to use. It first checks for a key that matches the service name from the Consul event. If that can't be found, it tries a default key. Therefore, at a minimum, you have to supply one default API key. To set a key for a specific service, here http-test:

consul kv put infra/pagerduty/http-test/key YourKeyHere

or set the default key:

consul kv put infra/pagerduty/default/key yourdefaultkey
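The lookup order described above can be sketched in Go. This is only an illustration, not the code consul2pd uses: `kvGetter` is a hypothetical stand-in for a Consul KV read, and `keyPath` and `lookupAPIKey` are names made up for this example.

```go
package main

import (
	"errors"
	"fmt"
)

// kvGetter is a hypothetical stand-in for a Consul KV read.
// It returns the value and whether the key exists.
type kvGetter func(path string) (string, bool)

// keyPath builds the KV path layout used in the commands above.
func keyPath(name string) string {
	return "infra/pagerduty/" + name + "/key"
}

// lookupAPIKey tries the service-specific key first, then the default key.
func lookupAPIKey(get kvGetter, service string) (string, error) {
	for _, name := range []string{service, "default"} {
		if v, ok := get(keyPath(name)); ok && v != "" {
			return v, nil
		}
	}
	return "", errors.New("no PagerDuty key for " + service + " and no default key set")
}

func main() {
	// An in-memory map plays the role of the Consul KV store here.
	kv := map[string]string{
		"infra/pagerduty/http-test/key": "YourKeyHere",
		"infra/pagerduty/default/key":   "yourdefaultkey",
	}
	get := func(p string) (string, bool) { v, ok := kv[p]; return v, ok }

	for _, svc := range []string{"http-test", "other-service"} {
		key, err := lookupAPIKey(get, svc)
		fmt.Println(svc, key, err)
	}
}
```

A service without its own entry falls through to the default key, which is why at least the default has to exist.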

You can also select a key through a tag on the service. The tag applies to the full service, not to an individual check. The tag needs to be set to Pdkey:keyname.

consul kv put infra/pagerduty/_keyname_/key aKeyForThatService
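As a rough sketch, turning a service tag into the KV path could look like the Go below. The function name `keyNameFromTags` and the key name `lowurgency` are hypothetical, and the exact, case-sensitive match on `Pdkey:` is an assumption; how consul2pd ranks a tag key against a key named after the service is not stated here, so the sketch does not cover that.

```go
package main

import (
	"fmt"
	"strings"
)

// tagPrefix is the tag prefix named in the text above.
const tagPrefix = "Pdkey:"

// keyNameFromTags returns the key name from the first "Pdkey:keyname" tag.
// Tags with an empty key name are ignored (an assumption of this sketch).
func keyNameFromTags(tags []string) (string, bool) {
	for _, tag := range tags {
		if strings.HasPrefix(tag, tagPrefix) {
			if name := strings.TrimPrefix(tag, tagPrefix); name != "" {
				return name, true
			}
		}
	}
	return "", false
}

func main() {
	// "lowurgency" is a made-up key name for illustration.
	tags := []string{"http", "Pdkey:lowurgency"}
	if name, ok := keyNameFromTags(tags); ok {
		fmt.Println("infra/pagerduty/" + name + "/key")
	}
}
```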

Using different keys is something you should consider in production. Events are pushed to PD without any further checking, and you probably don't want a page for every one of them. Each PD key can be assigned to a different queue, which in turn can have a different time-to-page threshold, so you can set keys for the different urgencies.

Consul2pd resolves alerts when the event shows the service as healthy again.

Setting a proxy:

consul kv put infra/pagerdutyproxy http://127.0.0.1:8888

This setting overrides HTTPS_PROXY if that is set in the environment. If the Consul KV entry is not set, the environment variable is used automatically.

The automated pickup of the service works by service name. In the examples above the service is called http-test. The JSON output from consul watch includes that name, and it is used for the KV lookup of the right key.
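The precedence between the KV entry and the environment can be sketched as follows. This is a minimal Go illustration under assumptions: `chooseProxy` and `newTransport` are hypothetical names, and the KV value is passed in as a plain string rather than read from Consul.

```go
package main

import (
	"fmt"
	"net/http"
	"net/url"
	"os"
)

// chooseProxy prefers the value from the Consul KV entry and falls back to
// the HTTPS_PROXY value from the environment.
func chooseProxy(kvProxy, envProxy string) string {
	if kvProxy != "" {
		return kvProxy
	}
	return envProxy
}

// newTransport returns an HTTP transport that uses the given proxy,
// or no proxy at all when the string is empty.
func newTransport(proxy string) (*http.Transport, error) {
	t := &http.Transport{}
	if proxy == "" {
		return t, nil
	}
	u, err := url.Parse(proxy)
	if err != nil {
		return nil, err
	}
	t.Proxy = http.ProxyURL(u)
	return t, nil
}

func main() {
	// Assumed to have been read from infra/pagerdutyproxy beforehand.
	kvProxy := "http://127.0.0.1:8888"

	proxy := chooseProxy(kvProxy, os.Getenv("HTTPS_PROXY"))
	if _, err := newTransport(proxy); err != nil {
		fmt.Println("bad proxy:", err)
		return
	}
	fmt.Println("using proxy:", proxy)
}
```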

You still need to install a watch per service - this is required by consul

consul watch -type service -service http-test /usr/local/bin/consul2pd

If you don't want to create those watches manually, you can use a tool like [consulwatches](https://gitlab.com/strasheim/consulwatches), which works by service tag, or you can use consul-template to generate them.

{
  "watches": [
      {{range $index,$element := services }}
      {{if gt $index 0 }}   ,{{end}}{
      "type": "service",
      "service": "{{.Name}}",
      "handler": "/usr/local/bin/consul2pd"
    } {{end}}
  ]
}
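For comparison, the same watch configuration can be produced from a list of service names in Go. This is only a sketch of the output shape the template above aims for; `buildWatchConfig` and the second service name `web` are hypothetical.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// watch mirrors one entry of the "watches" array in the template above.
type watch struct {
	Type    string `json:"type"`
	Service string `json:"service"`
	Handler string `json:"handler"`
}

// watchConfig is the top-level document.
type watchConfig struct {
	Watches []watch `json:"watches"`
}

// buildWatchConfig creates one service watch per service name.
func buildWatchConfig(services []string, handler string) watchConfig {
	cfg := watchConfig{Watches: make([]watch, 0, len(services))}
	for _, s := range services {
		cfg.Watches = append(cfg.Watches, watch{Type: "service", Service: s, Handler: handler})
	}
	return cfg
}

func main() {
	cfg := buildWatchConfig([]string{"http-test", "web"}, "/usr/local/bin/consul2pd")
	out, err := json.MarshalIndent(cfg, "", "  ")
	if err != nil {
		fmt.Println(err)
		return
	}
	fmt.Println(string(out))
}
```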
Logging

Consul2pd automatically logs to syslog at level INFO. If syslog isn't found, logging goes to stdout. It logs:

  • The number of requests to be worked on
  • The critical events it has received
  • The failed submissions
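A syslog-with-stdout-fallback setup like the one described can be sketched with Go's standard `log/syslog` package, which is available on Unix-like systems only. The function name `logWriter`, the daemon facility and the log messages are assumptions made for this example.

```go
package main

import (
	"io"
	"log"
	"log/syslog"
	"os"
)

// logWriter returns a syslog writer at INFO level, or stdout when no
// syslog daemon can be reached.
func logWriter(tag string) io.Writer {
	w, err := syslog.New(syslog.LOG_INFO|syslog.LOG_DAEMON, tag)
	if err != nil {
		return os.Stdout
	}
	return w
}

func main() {
	logger := log.New(logWriter("consul2pd"), "", 0)

	// Hypothetical messages covering the three things listed above.
	logger.Printf("requests to work on: %d", 3)
	logger.Printf("critical event received for service %s", "http-test")
	logger.Printf("submission failed for service %s", "http-test")
}
```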
Decoration of Events

Consul2pd adds information to the single event reported to PD. Every trigger event contains the complete overview, sent to PagerDuty as custom_details.
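To make the shape of such an event concrete, here is a hedged Go sketch. It assumes the PagerDuty Events API v2 field names (`routing_key`, `event_action`, `dedup_key`, `payload.custom_details`); which API version consul2pd targets is not stated here. The function `buildEvent`, the summary text and the use of the service name as dedup key are assumptions for illustration.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// pdPayload follows the Events API v2 payload fields (assumed, see lead-in).
type pdPayload struct {
	Summary       string      `json:"summary"`
	Source        string      `json:"source"`
	Severity      string      `json:"severity"`
	CustomDetails interface{} `json:"custom_details,omitempty"`
}

// pdEvent is the envelope sent to PagerDuty.
type pdEvent struct {
	RoutingKey  string     `json:"routing_key"`
	EventAction string     `json:"event_action"`
	DedupKey    string     `json:"dedup_key"`
	Payload     *pdPayload `json:"payload,omitempty"`
}

// buildEvent creates a trigger event carrying the full Consul event as
// custom_details when the service is critical, and a resolve event otherwise.
// Using the service name as dedup key lets the resolve match the trigger.
func buildEvent(routingKey, service string, critical bool, details interface{}) pdEvent {
	ev := pdEvent{RoutingKey: routingKey, DedupKey: service, EventAction: "resolve"}
	if critical {
		ev.EventAction = "trigger"
		ev.Payload = &pdPayload{
			Summary:       "consul: " + service + " is critical",
			Source:        service,
			Severity:      "critical",
			CustomDetails: details,
		}
	}
	return ev
}

func main() {
	// A heavily trimmed sample of what a Consul service watch hands over.
	raw := `[{"Node":{"Node":"host1"},"Service":{"Service":"http-test"},"Checks":[{"Status":"critical"}]}]`
	var details interface{}
	if err := json.Unmarshal([]byte(raw), &details); err != nil {
		fmt.Println(err)
		return
	}
	out, err := json.MarshalIndent(buildEvent("YourKeyHere", "http-test", true, details), "", "  ")
	if err != nil {
		fmt.Println(err)
		return
	}
	fmt.Println(string(out))
}
```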

Backlog Queue

Consul2pd adds a note under "infra/pagerduty/queue/" in the Consul KV store if a service failure can't be reported to PD. Before making a new HTTP call to PD, it consults the queue, fetches the latest state of the queued service from Consul, and reports that to PD.

Details about the APIs
Nitty Gritty

Where possible, the Consul data is queried with the stale parameter, which allows reads that do not go through the leader. The retry queue has a delay of at least 2 minutes and only runs when another event is triggered in Consul. Furthermore, only events that PD answers with status 429 (the server asks the client to retry later) are put there.

Links for the masses:

The endpoint is infra/pagerduty/default/links or, for a specific service, infra/pagerduty/{service_name}/links. Internally an array is built. Lines are separated by newlines, and 2 lines are required per link: the URL followed by its label.
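The two queue rules above (only 429 responses are queued, and retries wait at least 2 minutes) can be written down as two small predicates. This Go sketch uses hypothetical names, and passing the timestamps as Unix seconds is an assumption about how a queue entry might record its age.

```go
package main

import (
	"fmt"
	"net/http"
)

// retryDelaySeconds is the minimum age of a queue entry before it is retried.
const retryDelaySeconds = 120

// shouldQueue reports whether a failed submission belongs in the retry queue.
// Only HTTP 429 (Too Many Requests) qualifies.
func shouldQueue(statusCode int) bool {
	return statusCode == http.StatusTooManyRequests
}

// dueForRetry reports whether an entry queued at queuedAt may be retried at now.
// Both values are Unix timestamps in seconds.
func dueForRetry(queuedAt, now int64) bool {
	return now-queuedAt >= retryDelaySeconds
}

func main() {
	for _, code := range []int{202, 429, 500} {
		fmt.Println(code, "queued:", shouldQueue(code))
	}
	fmt.Println("retry after 60s:", dueForRetry(1000, 1060))
	fmt.Println("retry after 120s:", dueForRetry(1000, 1120))
}
```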

consul kv put infra/pagerduty/default/links "https://example.org
Help"

or

consul kv put infra/pagerduty/default/links "https://example.org
Help
https://example.org/morehelp
REALLY I need HELP"
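Parsing such a value into link pairs can be sketched in Go as below. The `link` type and `parseLinks` function are hypothetical names; the `href` and `text` JSON field names follow the PagerDuty Events API v2 link format, which is an assumption about the target. Dropping a trailing line without a partner is a choice of this sketch, not documented behaviour.

```go
package main

import (
	"fmt"
	"strings"
)

// link is one troubleshooting link: a URL and its label.
type link struct {
	Href string `json:"href"`
	Text string `json:"text"`
}

// parseLinks splits the KV value on newlines and pairs the lines up:
// first the URL, then the label. An unpaired last line is dropped.
func parseLinks(raw string) []link {
	lines := strings.Split(strings.TrimSpace(raw), "\n")
	var links []link
	for i := 0; i+1 < len(lines); i += 2 {
		links = append(links, link{
			Href: strings.TrimSpace(lines[i]),
			Text: strings.TrimSpace(lines[i+1]),
		})
	}
	return links
}

func main() {
	raw := "https://example.org\nHelp\nhttps://example.org/morehelp\nREALLY I need HELP"
	for _, l := range parseLinks(raw) {
		fmt.Printf("%s -> %s\n", l.Text, l.Href)
	}
}
```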
Eventually Consistent Bridge

Consul2pd is an eventually consistent bridge. If the number of events exceeds the number of API requests PD accepts, the backlog queue is created and worked through on the next event that is triggered.

  • There is no active self-trigger. If you require one, you need to build it yourself via a Consul check (a simple check that runs date, for example).
  • The queue is worked from a to z, meaning the order depends on the Consul KV recursive read, which is a lexical read.
  • PD only accepts 60 events per minute per key. If that limit is an issue for your environment, create different keys per service.
  • When a new Consul client joins, the complete state is triggered by Consul and pushed to PD. At those times you will very likely need a minute or more to reach consistency again.
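Two of the points above lend themselves to a small worked example: the lexical order in which the queue is processed, and how long a backlog takes to drain at 60 events per minute per key. The Go below is a sketch with hypothetical function names, and the queue key layout `infra/pagerduty/queue/<service>` is an assumption.

```go
package main

import (
	"fmt"
	"sort"
)

// workOrder returns the queue keys in byte-wise lexical order, the order in
// which a recursive KV read is described to return them. The input is not modified.
func workOrder(keys []string) []string {
	ordered := append([]string(nil), keys...)
	sort.Strings(ordered)
	return ordered
}

// minutesToDrain estimates how many minutes a backlog needs for one key
// when only perMinute events are accepted per minute (rounded up).
func minutesToDrain(events, perMinute int) int {
	if events <= 0 || perMinute <= 0 {
		return 0
	}
	return (events + perMinute - 1) / perMinute
}

func main() {
	keys := []string{
		"infra/pagerduty/queue/web",
		"infra/pagerduty/queue/api",
		"infra/pagerduty/queue/http-test",
	}
	for _, k := range workOrder(keys) {
		fmt.Println(k)
	}
	// 150 queued events against a limit of 60 per minute.
	fmt.Println("minutes to drain:", minutesToDrain(150, 60))
}
```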
Tests and no Tests and CI

No unit tests have been added to this tool. There are only the CI tests, which run on every commit. Since the number of isolated functions that are not networking functions is very low, the integration tests need to be enough. If you disagree, please create PRs with your unit tests. The CI pipeline includes go lint and vet as well as a functional test.

Consul Tokens

Consul2pd reads the environment variable TOKEN. If it has a value, that value is assumed to be a Consul ACL token and is added to every Consul request consul2pd makes.
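Attaching the token to a request can be sketched in Go as follows. It assumes the token travels in Consul's `X-Consul-Token` HTTP header and that the agent listens on `http://127.0.0.1:8500`; the helper name `newConsulRequest` is hypothetical. The `?stale` query parameter in the example URL corresponds to the stale reads mentioned earlier.

```go
package main

import (
	"fmt"
	"net/http"
	"os"
)

// newConsulRequest builds a GET request against the Consul HTTP API and
// adds the ACL token header only when a token is present.
func newConsulRequest(base, path, token string) (*http.Request, error) {
	req, err := http.NewRequest(http.MethodGet, base+path, nil)
	if err != nil {
		return nil, err
	}
	if token != "" {
		req.Header.Set("X-Consul-Token", token)
	}
	return req, nil
}

func main() {
	// The agent address is an assumption; TOKEN is the variable named above.
	req, err := newConsulRequest(
		"http://127.0.0.1:8500",
		"/v1/kv/infra/pagerduty/default/key?stale",
		os.Getenv("TOKEN"),
	)
	if err != nil {
		fmt.Println(err)
		return
	}
	fmt.Println(req.URL.String(), "token set:", req.Header.Get("X-Consul-Token") != "")
}
```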

Maintenance Mode

You can set a Consul KV key to make consul2pd do nothing. Note that the key itself is spelled maintainance.

consul kv put infra/pagerduty/maintainance true

The value does not need to be true. If it is set to anything but false, it is treated as true. It is best to delete the value once the maintenance mode has ended.
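The rule "anything but false counts as true" can be expressed as a one-line check. In this Go sketch the function name `maintenanceActive` is hypothetical, and the exact, case-sensitive comparison with `false` is an assumption.

```go
package main

import "fmt"

// maintenanceActive interprets the value of infra/pagerduty/maintainance.
// found tells whether the key exists at all; a missing key means no maintenance.
func maintenanceActive(value string, found bool) bool {
	if !found {
		return false
	}
	return value != "false"
}

func main() {
	fmt.Println("key missing:", maintenanceActive("", false))
	fmt.Println("value true:", maintenanceActive("true", true))
	fmt.Println("value yes:", maintenanceActive("yes", true))
	fmt.Println("value false:", maintenanceActive("false", true))
}
```

Because a leftover value other than false keeps consul2pd silent, deleting the key after maintenance is the safest way to switch alerts back on.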
