Published: Mar 6, 2024 License: MIT Imports: 13 Imported by: 0


ClickHouse Ingester

The ClickHouse ingester writes logs from Kafka into ClickHouse. To ship logs from Fluent Bit into Kafka, the official Kafka output plugin can be used.
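As an illustrative sketch, a Fluent Bit output section using the official Kafka plugin might look like the following. The broker address is an assumption; adjust it to your environment:

```ini
[OUTPUT]
    # Official Fluent Bit Kafka output plugin
    Name    kafka
    Match   *
    # Assumed broker address; replace with your Kafka bootstrap brokers
    Brokers kafka.kafka.svc.cluster.local:9092
    # Topic the ingester consumes from (the ingester's default is "fluent-bit")
    Topics  fluent-bit
```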

Ingester

Configuration

An example Deployment for the ClickHouse ingester can be found in the ingester.yaml file. The following command-line flags and environment variables can be used to configure the ingester:

| Command-Line Flag | Environment Variable | Description | Default |
| ----------------- | -------------------- | ----------- | ------- |
| `--metrics-server.address` | `METRICS_SERVER_ADDRESS` | The address the metrics server should listen on. | `:2021` |
| `--clickhouse.address` | `CLICKHOUSE_ADDRESS` | ClickHouse address to connect to. | |
| `--clickhouse.database` | `CLICKHOUSE_DATABASE` | ClickHouse database name. | `logs` |
| `--clickhouse.username` | `CLICKHOUSE_USERNAME` | ClickHouse username for the connection. | |
| `--clickhouse.password` | `CLICKHOUSE_PASSWORD` | ClickHouse password for the connection. | |
| `--clickhouse.dial-timeout` | `CLICKHOUSE_DIAL_TIMEOUT` | ClickHouse dial timeout. | `10s` |
| `--clickhouse.conn-max-lifetime` | `CLICKHOUSE_CONN_MAX_LIFETIME` | ClickHouse maximum connection lifetime. | `1h` |
| `--clickhouse.max-idle-conns` | `CLICKHOUSE_MAX_IDLE_CONNS` | ClickHouse maximum number of idle connections. | `1` |
| `--clickhouse.max-open-conns` | `CLICKHOUSE_MAX_OPEN_CONNS` | ClickHouse maximum number of open connections. | `1` |
| `--clickhouse.async-insert` | `CLICKHOUSE_ASYNC_INSERT` | Enable async inserts. | `false` |
| `--clickhouse.wait-for-async-insert` | `CLICKHOUSE_WAIT_FOR_ASYNC_INSERT` | Wait for async inserts. | `false` |
| `--clickhouse.batch-size` | `CLICKHOUSE_BATCH_SIZE` | The number of log lines to buffer before they are written to ClickHouse. | `100000` |
| `--clickhouse.flush-interval` | `CLICKHOUSE_FLUSH_INTERVAL` | The maximum amount of time to wait before buffered logs are written to ClickHouse. | `60s` |
| `--clickhouse.force-number-fields` | `CLICKHOUSE_FORCE_NUMBER_FIELDS` | A list of fields which should be parsed as numbers. | `[]` |
| `--clickhouse.force-underscores` | `CLICKHOUSE_FORCE_UNDERSCORES` | Replace all `.` with `_` in keys. | `false` |
| `--kafka.brokers` | `KAFKA_BROKERS` | Kafka bootstrap brokers to connect to, as a comma-separated list. | |
| `--kafka.group` | `KAFKA_GROUP` | Kafka consumer group definition. | `kafka-clickhouse` |
| `--kafka.version` | `KAFKA_VERSION` | Kafka cluster version. | `2.1.1` |
| `--kafka.topics` | `KAFKA_TOPICS` | Kafka topics to be consumed, as a comma-separated list. | `fluent-bit` |
| `--log.format` | `LOG_FORMAT` | The log format. Must be `console` or `json`. | `console` |
| `--log.level` | `LOG_LEVEL` | The log level. Must be `debug`, `info`, `warn`, `error`, `fatal` or `panic`. | `info` |

Development

We are using kind for local development. To create a new Kubernetes cluster together with a local Docker registry, run the cluster/cluster.sh script:

./cluster/cluster.sh

Once the cluster is running we can build and push the Docker image for the ingester:

docker build -f cmd/ingester/Dockerfile -t localhost:5000/klogs:latest-ingester .
docker push localhost:5000/klogs:latest-ingester

# To run the Docker image locally, the following command can be used:
docker run -it --rm localhost:5000/klogs:latest-ingester

In the next step we have to create our ClickHouse cluster via the ClickHouse Operator. To do that, deploy all the files from the cluster/clickhouse-operator and cluster/clickhouse folders:

k apply -f cluster/clickhouse-operator
k apply -f cluster/clickhouse

Once ClickHouse is running we have to connect to one of the ClickHouse instances to create our SQL schema. The schema can be found in the schema.sql file; execute each SQL statement one by one on the ClickHouse instance:

k exec -n clickhouse -it chi-clickhouse-sharded-0-0-0 -c clickhouse -- clickhouse-client

Before we can deploy Fluent Bit and the ingester, we have to deploy Kafka using the following command:

k apply -f cluster/kafka

Now we can deploy Fluent Bit to ingest all logs into Kafka and the ingester to write the logs from Kafka into ClickHouse:

k apply -f cluster/fluent-bit/ingester
k logs -n fluent-bit -l app=fluent-bit -f

To check whether the logs are arriving in ClickHouse, you can use the following SQL queries:

SELECT count(*) FROM logs.logs;
SELECT * FROM logs.logs LIMIT 10;

SELECT count(*) FROM logs.logs_local;
SELECT * FROM logs.logs_local LIMIT 10;

To clean up all the created resources run the following commands:

kind delete cluster --name fluent-bit-clickhouse
docker stop kind-registry
docker rm kind-registry
