minimal-server-monitoring

module
v1.1.4
Published: Mar 23, 2024 License: MIT


This tool lets you monitor a typical home server running applications in containers and receive alerts on your smartphone. It is designed to be light and simple (no database, no GUI, a single configuration file).

Features

  • run in a container (tested with both docker and podman)
  • send notifications to any service supported by shoutrrr
  • alert when a container is restarting forever
  • alert when a container isn't started
  • alert when a target is unreachable (ping)
  • alert when available disk space is low
  • alert when a systemd service has failed
  • notify when a container image is updated (an alternative to watchtower if you are running podman with podman-auto-update)

Versioning and packaging

This tool follows semantic versioning.

Pre-built images are available on GitHub Packages:

  • ghcr.io/mcarbonne/minimal-server-monitoring:main (main branch)
  • ghcr.io/mcarbonne/minimal-server-monitoring:latest (latest tagged version)
  • ghcr.io/mcarbonne/minimal-server-monitoring:x.x.x
  • ghcr.io/mcarbonne/minimal-server-monitoring:x.x
  • ghcr.io/mcarbonne/minimal-server-monitoring:x

For automatic updates (watchtower, podman-auto-update...), using the latest major tag available (ghcr.io/mcarbonne/minimal-server-monitoring:1) is recommended to avoid breaking changes.

Minimal configuration

Bare minimum (container monitoring only, alerts sent with shoutrrr):

docker run \
-e MACHINENAME=$(hostname) \
-e SHOUTRRR=XXXXXXX \
-v .../cache.json:/app/cache.json \
-v /var/run/docker.sock:/var/run/docker.sock:ro \
--name minimal-server-monitoring -d ghcr.io/mcarbonne/minimal-server-monitoring:1

Custom config.json:

docker run \
-v .../config.json:/app/config.json:ro \
-v .../cache.json:/app/cache.json \
-v /var/run/docker.sock:/var/run/docker.sock:ro \
-v /run/systemd:/run/systemd:ro \
--name minimal-server-monitoring -d ghcr.io/mcarbonne/minimal-server-monitoring:1

  • -v .../config.json:/app/config.json:ro: override the default configuration file with your settings. The default configuration file is available in the repository. Have a look at example_config.json for an exhaustive list of available parameters.
  • -v .../cache.json:/app/cache.json: persist the cache
  • -v /var/run/docker.sock:/var/run/docker.sock:ro: give access to the host docker daemon (required for container provider). Use /run/podman/podman.sock:/var/run/docker.sock:ro if you are using podman.
  • -v /run/systemd:/run/systemd:ro: give access to the host systemd (required for systemd provider)

Internal

flowchart TD
subgraph Scraping
    Storage
      Sc(Schedule scrapers)
      Sc-..->S1 & S2 & S3
      S1("`**Scraper n°1**
      - provider: container
      - scrape_interval: 15s`")
      S2("`**Scraper n°2**
      - provider: ping
      - scrape_interval: 30s`")
      S3(...)
    S1 & S2 & S3 -->SC
    SC{{Collect ScrapeResult}}
    Storage[(Storage)]
    S1 & S2 & S3<-.->Storage
end

SC--"- states\n- messages"-->AlertCenter

subgraph AlertCenter
    AC{{"Generate notifications"}}
    AC--notifications-->F
    F{{Filtering}}
    F--filtered notifications-->G
    G{{Grouping}}
end
G--filtered and grouped notifications-->Notifier
subgraph Notifier
    C{{Send notifications}}
    N1(Shoutrrr)
    N2(...)
    C-->N1
    C-->N2
end

Scraping

Schedule configured scrapers. Each scraper may emit multiple states and multiple messages. Unlike some other monitoring tools, decisions (i.e. whether a metric is healthy) are made in the scrapers.

Multiple instances of a given provider may be allowed (depending on provider).

A State metric is the combination of a metricId, a state (boolean) and a message. Example: metricId: "container_XXXX_state", isHealthy: false, message: "XXXX isn't running"

A Message metric is the combination of a metricId and a message. Example: metricId: "container_XXXX_updated", message: "container XXXX was updated ...."
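
The two metric kinds can be pictured as plain structs. The following is only a sketch: the type and field names (State, Message, ScrapeResult, containerState) are assumptions made for illustration and may not match the project's actual Go types.

```go
package main

import "fmt"

// State is a hypothetical representation of a State metric:
// a metric identifier, a health flag, and a human-readable message.
type State struct {
	MetricID  string
	IsHealthy bool
	Message   string
}

// Message is a hypothetical representation of a Message metric.
type Message struct {
	MetricID string
	Message  string
}

// ScrapeResult groups everything a single scraper run may emit.
type ScrapeResult struct {
	States   []State
	Messages []Message
}

// containerState builds a State shaped like the example above.
func containerState(name string, running bool) State {
	s := State{MetricID: "container_" + name + "_state", IsHealthy: running}
	if !running {
		s.Message = name + " isn't running"
	}
	return s
}

func main() {
	res := ScrapeResult{
		States:   []State{containerState("XXXX", false)},
		Messages: []Message{{MetricID: "container_XXXX_updated", Message: "container XXXX was updated"}},
	}
	fmt.Printf("%+v\n", res)
}
```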

Providers can persist data using Storage, a simple key-value database.

The following providers are implemented:

container

  • no parameters
  • only one instance allowed
  • messages (for every running container):
    • when a container image is updated
  • states (for every running container):
    • container status (check if started)
    • container restart (check if restarting forever)

ping

| parameter | description | required | default value |
|---|---|---|---|
| targets | list of IP addresses/hostnames to ping | yes | - |
| retry_count | how many times to retry if a ping fails | no | 3 |

  • provides one state: whether the target is reachable
  • multiple instances allowed

filesystemusage

| parameter | description | required | default value |
|---|---|---|---|
| mountpoints | list of mount points to check | yes | - |
| threshold_percent | minimum threshold (percentage) of available disk space | no | 20 |

  • provides one state per mountpoint
  • multiple instances allowed
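
As an illustration of the threshold_percent check, here is a minimal sketch of the comparison. The function names are hypothetical, and treating a value exactly at the threshold as healthy is an assumption; the provider's real behavior at the boundary is not documented here.

```go
package main

import "fmt"

// availablePercent returns free space as a percentage of total space.
func availablePercent(availableBytes, totalBytes uint64) float64 {
	if totalBytes == 0 {
		return 0
	}
	return float64(availableBytes) / float64(totalBytes) * 100
}

// isHealthy reports whether the available space is at or above the threshold.
// Assumption: a mountpoint exactly at the threshold counts as healthy.
func isHealthy(availableBytes, totalBytes uint64, thresholdPercent float64) bool {
	return availablePercent(availableBytes, totalBytes) >= thresholdPercent
}

func main() {
	const gib = 1 << 30
	fmt.Println(isHealthy(50*gib, 100*gib, 20)) // true: 50% available
	fmt.Println(isHealthy(10*gib, 100*gib, 20)) // false: 10% available
}
```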

systemd

  • no parameters
  • only one instance allowed
  • states (for every service):
    • service active state (ActiveState != failed)

AlertCenter

AlertCenter is responsible for:

  • emitting notifications from scrape results
  • avoiding being flooded with notifications (filtering + grouping)

Generate notifications

If a state is marked as failed unhealthy_threshold times in a row, a notification is sent (metric XX failed). If a state is marked as OK healthy_threshold times in a row, a notification is sent (metric XX OK).

Messages are forwarded as notifications (no processing at this step).
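
The threshold rule for states can be sketched as a small per-metric tracker that counts consecutive identical observations. This is an illustration, not the project's code: the tracker type is hypothetical, and it assumes that a metric starts out healthy and that a notification is only sent when the reported status changes.

```go
package main

import "fmt"

// tracker follows one metric and decides when a notification is due.
type tracker struct {
	unhealthyThreshold int
	healthyThreshold   int
	reportedHealthy    bool // last status that was notified (assumed healthy at start)
	lastObserved       bool // most recent scrape result
	streak             int  // consecutive scrapes equal to lastObserved
}

func newTracker(unhealthyThreshold, healthyThreshold int) *tracker {
	return &tracker{
		unhealthyThreshold: unhealthyThreshold,
		healthyThreshold:   healthyThreshold,
		reportedHealthy:    true,
		lastObserved:       true,
	}
}

// observe records one scrape result and returns the notification text,
// or an empty string when nothing has to be sent.
func (t *tracker) observe(metricID string, healthy bool) string {
	if healthy == t.lastObserved {
		t.streak++
	} else {
		t.lastObserved = healthy
		t.streak = 1
	}
	if healthy == t.reportedHealthy {
		return "" // status already reported, nothing new to say
	}
	if !healthy && t.streak >= t.unhealthyThreshold {
		t.reportedHealthy = false
		return "metric " + metricID + " failed"
	}
	if healthy && t.streak >= t.healthyThreshold {
		t.reportedHealthy = true
		return "metric " + metricID + " OK"
	}
	return ""
}

func main() {
	t := newTracker(2, 2)
	// Prints "metric XX failed" after the second false,
	// then "metric XX OK" after the second true.
	for _, healthy := range []bool{false, false, false, true, true} {
		if n := t.observe("XX", healthy); n != "" {
			fmt.Println(n)
		}
	}
}
```

With this design a metric that flaps between healthy and unhealthy on every scrape never reaches either threshold, so it produces no notification.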

Filtering

Avoid sending too many notifications for a given metricId. Each metricId is allowed to send at most 5 messages every 30 minutes.

Grouping

When processing a notification, wait up to 15 seconds to group at most 10 notifications.
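
The grouping step can be sketched with a channel and a timer: a batch is flushed either when it is full or when the wait expires. The group function below is hypothetical and only illustrates the idea; it assumes the wait starts when the first notification of a batch arrives.

```go
package main

import (
	"fmt"
	"time"
)

// group reads notifications from in and emits batches on the returned channel.
// A batch is flushed when it holds maxSize items, when maxWait has elapsed
// since its first item arrived, or when in is closed.
func group(in <-chan string, maxWait time.Duration, maxSize int) <-chan []string {
	out := make(chan []string)
	go func() {
		defer close(out)
		for first := range in {
			batch := []string{first}
			timer := time.NewTimer(maxWait)
		collect:
			for len(batch) < maxSize {
				select {
				case n, ok := <-in:
					if !ok {
						break collect // input closed: flush what we have
					}
					batch = append(batch, n)
				case <-timer.C:
					break collect // waited long enough: flush
				}
			}
			timer.Stop()
			out <- batch
		}
	}()
	return out
}

func main() {
	in := make(chan string)
	go func() {
		for i := 0; i < 12; i++ {
			in <- fmt.Sprintf("notification %d", i)
		}
		close(in)
	}()
	// 12 notifications sent back to back yield one batch of 10 and one of 2.
	for batch := range group(in, 15*time.Second, 10) {
		fmt.Println(len(batch))
	}
}
```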

Notifier

Send all notifications to all configured notifiers. Multiple instances of each type are allowed.
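
The fan-out itself is straightforward. In the sketch below, the Notifier interface and the memoryNotifier stand-in are assumptions made for illustration; the real notifier (for example the shoutrrr one) would deliver the message instead of storing it.

```go
package main

import (
	"errors"
	"fmt"
)

// Notifier is a hypothetical interface; the project's real one may differ.
type Notifier interface {
	Send(message string) error
}

// memoryNotifier records messages instead of delivering them.
type memoryNotifier struct {
	name string
	sent []string
}

func (m *memoryNotifier) Send(message string) error {
	m.sent = append(m.sent, message)
	return nil
}

// sendAll delivers every notification to every notifier.
// A failing notifier does not prevent the others from being tried;
// all errors are joined into the returned error (nil if none occurred).
func sendAll(notifiers []Notifier, notifications []string) error {
	var errs []error
	for _, n := range notifiers {
		for _, msg := range notifications {
			if err := n.Send(msg); err != nil {
				errs = append(errs, err)
			}
		}
	}
	return errors.Join(errs...)
}

func main() {
	a := &memoryNotifier{name: "first"}
	b := &memoryNotifier{name: "second"} // a second instance of the same type
	err := sendAll([]Notifier{a, b}, []string{"metric XX failed", "metric YY OK"})
	fmt.Println(err, a.name, len(a.sent), b.name, len(b.sent))
}
```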

Directories

Path Synopsis
cmd
pkg
