A lightweight system to automatically scale Kinesis Data Streams up and down based on throughput.
Event Flow
Step 1: Metrics flow from the Kinesis Data Stream(s) into CloudWatch Metrics (Bytes/Sec, Records/Sec)
Step 2: Two alarms, Scale Up and Scale Down, evaluate those metrics and decide when to scale
Step 3: When a scaling alarm triggers, it sends a message to the Scaling SNS Topic
Step 4: The Scaling Lambda processes that SNS message and…
  Scales the Kinesis Data Stream up or down using UpdateShardCount
    Scale Up events double the number of shards in the stream
    Scale Down events halve the number of shards in the stream
  Updates the metric math on the Scale Up and Scale Down alarms to reflect the new shard count
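The arithmetic behind Step 4 can be sketched roughly as follows. This is an illustration only, not the project's actual Lambda code: the function names are hypothetical, the per-shard write limits are taken here as 1 MiB/s and 1,000 records/s, and rounding the halved count up is an assumption made so the target never falls below half the current count.

```python
# Per-shard ingress limits for Kinesis Data Streams (assumed values for this sketch).
BYTES_PER_SHARD_PER_SEC = 1024 * 1024
RECORDS_PER_SHARD_PER_SEC = 1000


def target_shard_count(current: int, direction: str) -> int:
    """Return the shard count to request for a scale event (hypothetical helper)."""
    if current < 1:
        raise ValueError("a stream has at least one shard")
    if direction == "up":
        return current * 2
    if direction == "down":
        # Round up so an odd count never drops below half (5 -> 3, not 2).
        return max(1, (current + 1) // 2)
    raise ValueError(f"unknown direction: {direction!r}")


def ingress_utilization(bytes_per_sec: float, records_per_sec: float,
                        shard_count: int) -> float:
    """Fraction of stream capacity in use; the tighter of the two limits wins.

    This mirrors what the alarms' metric math has to compute, which is why
    the alarms must be updated whenever shard_count changes.
    """
    byte_util = bytes_per_sec / (shard_count * BYTES_PER_SHARD_PER_SEC)
    record_util = records_per_sec / (shard_count * RECORDS_PER_SHARD_PER_SEC)
    return max(byte_util, record_util)
```

For example, a 4-shard stream receiving 3,000 records per second is at 75% of its record capacity; after a Scale Up to 8 shards the same traffic is at 37.5%, so the alarm expressions need the new shard count to evaluate correctly.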
Features
Designed for simplicity and a minimal service footprint.
Proven. This system has been battle-tested, scaling thousands of production streams without issue.
Suitable for scaling large numbers of streams. Each additional stream requires only 2 CloudWatch alarms.
Operations friendly. Everything is viewable, editable, and debuggable in the console; there is no need to drop into the CLI to see what's going on.
Takes into account both ingress metrics, Records Per Second and Bytes Per Second, when deciding to scale a stream up or down.
Can optionally take egress needs into account via Max Iterator Age. A stream whose consumers are N minutes behind (configurable) will not scale down when incoming traffic drops, because reducing its shard count would also reduce much-needed Lambda processing power (Lambdas per Shard).
Designed out of the box to work within the limit of 10 UpdateShardCount calls per rolling 24-hour period.
Emits a custom CloudWatch error metric if scaling fails. You can alarm on this metric for added peace of mind.
Can optionally adjust reserved concurrency for your Lambda consumers as it scales their streams up and down.
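Two of the features above, the Max Iterator Age guard and the rolling 24-hour UpdateShardCount budget, can be illustrated with a small sketch. It is not taken from the project: the function names, the 25% scale-down threshold, and the 30-minute default are hypothetical values chosen for the example, and the project may track its call budget differently.

```python
from datetime import datetime, timedelta

MAX_UPDATES_PER_WINDOW = 10          # limit stated above
WINDOW = timedelta(hours=24)         # rolling window stated above


def updates_remaining(previous_updates: list[datetime], now: datetime) -> int:
    """Count how many UpdateShardCount calls are still available right now."""
    recent = [t for t in previous_updates if now - t < WINDOW]
    return max(0, MAX_UPDATES_PER_WINDOW - len(recent))


def may_scale_down(ingress_utilization: float,
                   max_iterator_age_minutes: float,
                   scale_down_threshold: float = 0.25,   # hypothetical
                   max_behind_minutes: float = 30.0) -> bool:  # hypothetical
    """Allow a scale down only if traffic is low AND consumers are keeping up."""
    if max_iterator_age_minutes >= max_behind_minutes:
        # Consumers are behind; keep the shards (and their Lambdas).
        return False
    return ingress_utilization < scale_down_threshold
```

With these defaults, a stream at 10% utilization whose consumers are 45 minutes behind stays at its current size, while the same stream with consumers 5 minutes behind is allowed to scale down.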