dp-observation-extractor

  • Consumes a Kafka message specifying a CSV file hosted on AWS S3
  • Retrieves the file and produces a Kafka message for each row in the CSV
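The row-splitting behaviour above can be sketched with only the Go standard library. This is an illustrative sketch, not the service's actual code: `rowsToMessages` is a hypothetical helper, and the real service streams the file from S3 and publishes each row to Kafka rather than returning a slice.

```go
package main

import (
	"encoding/csv"
	"fmt"
	"io"
	"strings"
)

// rowsToMessages reads CSV data and returns one message payload per row,
// mirroring the service's behaviour of producing a Kafka message for each
// row of the retrieved file. (Hypothetical helper for illustration only.)
func rowsToMessages(csvData string) ([]string, error) {
	r := csv.NewReader(strings.NewReader(csvData))
	var messages []string
	for {
		record, err := r.Read()
		if err == io.EOF {
			break // end of file: all rows have been converted
		}
		if err != nil {
			return nil, err // malformed CSV row
		}
		messages = append(messages, strings.Join(record, ","))
	}
	return messages, nil
}

func main() {
	msgs, err := rowsToMessages("obs1,dim1\nobs2,dim2\n")
	if err != nil {
		panic(err)
	}
	fmt.Println(len(msgs)) // 2
}
```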

Getting started

You may need Vault to run this service:

  • Run brew install vault
  • Run vault server -dev
  • Clone the repo: go get github.com/ONSdigital/dp-observation-extractor
  • Run the application: make debug

Running in isolation

  • Run Kafka consumer/producer apps
  • Run a local S3 store (for example, localstack; see LOCALSTACK_HOST in the configuration below)

Kafka scripts

Scripts for updating and debugging Kafka can be found in dp-data-tools.

Configuration

| Environment variable         | Default                             | Description |
| ---------------------------- | ----------------------------------- | ----------- |
| BIND_ADDR                    | ":21600"                            | The port to bind to |
| AWS_REGION                   | "eu-west-1"                         | The AWS region to use |
| BUCKET_NAMES                 | ons-dp-publishing-uploaded-datasets | The expected S3 bucket names from which the CSV files will be obtained |
| ENCRYPTION_DISABLED          | true                                | A boolean flag indicating whether encryption of files is disabled |
| GRACEFUL_SHUTDOWN_TIMEOUT    | "5s"                                | The graceful shutdown timeout |
| HEALTHCHECK_INTERVAL         | 30s                                 | The period of time between health checks |
| HEALTHCHECK_CRITICAL_TIMEOUT | 90s                                 | The period of time after which failing checks will result in a critical global check status |
| KAFKA_ADDR                   | "localhost:9092"                    | The addresses of the Kafka brokers (comma-separated) |
| KAFKA_VERSION                | "1.0.2"                             | The Kafka version that this service expects to connect to |
| KAFKA_OFFSET_OLDEST          | true                                | Set the Kafka offset to oldest if true |
| KAFKA_SEC_PROTO              | unset                               | If set to "TLS", Kafka connections will use TLS [1] |
| KAFKA_SEC_CA_CERTS           | unset                               | CA cert chain for the server cert [1] |
| KAFKA_SEC_CLIENT_KEY         | unset                               | PEM for the client key [1] |
| KAFKA_SEC_CLIENT_CERT        | unset                               | PEM for the client certificate [1] |
| KAFKA_SEC_SKIP_VERIFY        | false                               | Ignores server certificate issues if true [1] |
| LOCALSTACK_HOST              | ""                                  | Localstack host to connect to for local S3 functionality |
| ERROR_PRODUCER_TOPIC         | "report-events"                     | The Kafka topic to send report event errors to |
| FILE_CONSUMER_GROUP          | "dimensions-inserted"               | The Kafka consumer group to consume file messages from |
| FILE_CONSUMER_TOPIC          | "dimensions-inserted"               | The Kafka topic to consume file messages from |
| OBSERVATION_PRODUCER_TOPIC   | "observation-extracted"             | The Kafka topic to send the observation messages to |
| VAULT_ADDR                   | http://localhost:8200               | The Vault address |
| VAULT_TOKEN                  | -                                   | Vault token required for the client to talk to Vault (use make debug to create a Vault token) |
| VAULT_PATH                   | secret/shared/psk                   | The Vault path where the PSKs will be stored |

Notes:

1. For more info, see the kafka TLS examples documentation: https://github.com/ONSdigital/dp-kafka/tree/main/examples#tls
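
For example, TLS can be enabled by exporting the KAFKA_SEC_* variables from the configuration table. The certificate paths below are placeholders; substitute your own.

```shell
# Enable TLS for Kafka connections (paths are placeholders)
export KAFKA_SEC_PROTO=TLS
export KAFKA_SEC_CA_CERTS=/path/to/ca.pem
export KAFKA_SEC_CLIENT_CERT=/path/to/client-cert.pem
export KAFKA_SEC_CLIENT_KEY=/path/to/client-key.pem
```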

Contributing

See CONTRIBUTING for details.

License

Copyright © 2016-2021, Office for National Statistics (https://www.ons.gov.uk)

Released under MIT license, see LICENSE for details.
