go-concurrency-limits

command module
v0.7.0 Latest Latest
Warning

This package is not in the latest version of its module.

Go to latest
Published: Dec 28, 2021 License: Apache-2.0 Imports: 0 Imported by: 0

README

GoDoc Build Status Coverage Status Releases Releases

Background

When thinking of service availability operators traditionally think in terms of RPS (requests per second). Stress tests are normally performed to determine the RPS at which point the service tips over. RPS limits are then set somewhere below this tipping point (say 75% of this value) and enforced via a token bucket. However, in large distributed systems that auto-scale this value quickly goes out of date and the service falls over by becoming non-responsive as it is unable to gracefully shed excess load. Instead of thinking in terms of RPS, we should be thinking in terms of concurrent request where we apply queuing theory to determine the number of concurrent requests a service can handle before a queue starts to build up, latencies increase and the service eventually exhausts a hard limit such as CPU, memory, disk or network. This relationship is covered very nicely with Little's Law where Limit = Average RPS * Average Latency.

Concurrency limits are very easy to enforce but difficult to determine as they would require operators to fully understand the hardware services run on and coordinate how they scale. Instead we'd prefer to measure or estimate the concurrency limits at each point in the network. As systems scale and hit limits each node will adjust and enforce its local view of the limit. To estimate the limit we borrow from common TCP congestion control algorithms by equating a system's concurrency limit to a TCP congestion window.

Before applying the algorithm we need to set some ground rules.

  • We accept that every system has an inherent concurrency limit that is determined by a hard resources, such as number of CPU cores.
  • We accept that this limit can change as a system auto-scales.
  • For large and complex distributed systems it's impossible to know all the hard resources.
  • We can use latency measurements to determine when queuing happens.
  • We can use timeouts and rejected requests to aggressively back off.

Limit Algorithms

Vegas

Delay based algorithm where the bottleneck queue is estimated as

L * (1 - minRTT/sampleRtt)

At the end of each sampling window the limit is increased by 1 if the queue is less than alpha (typically a value between 2-3) or decreased by 1 if the queue is greater than beta (typically a value between 4-6 requests).

Gradient2

This algorithm attempts to address bias and drift when using minimum latency measurements. To do this the algorithm tracks uses the measure of divergence between two exponential averages over a long and short time time window. Using averages the algorithm can smooth out the impact of outliers for bursty traffic. Divergence duration is used as a proxy to identify a queueing trend at which point the algorithm aggresively reduces the limit.

Enforcement Strategies

Simple

In the simplest use case we don't want to differentiate between requests and so enforce a single gauge of the number of inflight requests. Requests are rejected immediately once the gauge value equals the limit.

Partitioned

For a slightly more complex system, it's desirable to partition requests to different backend/services. For example, you might shard by a customer id modulus 64 and the remainder you use as a unique backend identifier to target the the request. This allows for specific partitions to begin failing while others are operation normally.

Percentage

For more complex systems it's desirable to provide certain quality of service guarantees while still making efficient use of resources. Here we guarantee specific types of requests get a certain percentage of the concurrency limit. For example, a system that takes both live and batch traffic may want to give live traffic 100% of the limit during heavy load and is OK with starving batch traffic. Or, a system may want to guarantee that 50% of the limit is given to write traffic so writes are never starved.

Integrations

GRPC

A concurrency limiter may be installed either on the server or client. The choice of limiter depends on your use case. For the most part it is recommended to use a dynamic delay based limiter such as the VegasLimit on the server and either a pure loss based (AIMDLimit) or combined loss and delay based limiter on the client.

Server Limiter

The purpose of the server limiter is to protect the server from either increased client traffic (batch apps or retry storms) or latency spikes from a dependent service. With the limiter installed the server can ensure that latencies remain low by rejecting excess traffic with Status.UNAVAILABLE errors.

In this example a GRPC server is configured with a single adaptive limiter that is shared among batch and live traffic with live traffic guaranteed 90% of throughput and 10% guaranteed to batch. For simplicity we just expect the client to send a "group" header identifying it as 'live' or 'batch'. Ideally this should be done using TLS certificates and a server side lookup of identity to grouping. Any requests not identified as either live or batch may only use excess capacity.

import (
    gclGrpc "github.com/platnummonkey/go-concurrency-limits/grpc"
)

// setup grpc server with this option
serverOption := grpc.UnaryInterceptor(
    gclGrpc.UnaryServerInterceptor(
        gclGrpc.WithLimiter(...),
        gclGrpc.WithServerResponseTypeClassifier(..),
    ),
)
Client Limiter

There are two main use cases for client side limiters. A client side limiter can protect the client service from its dependent services by failing fast and serving a degraded experience to its client instead of having its latency go up and its resources eventually exhausted. For batch applications that call other services a client side limiter acts as a backpressure mechanism ensuring that the batch application does not put unnecessary load on dependent services.

In this example a GRPC client will use a blocking version of the VegasLimit to block the caller when the limit has been reached.

import (
    gclGrpc "github.com/platnummonkey/go-concurrency-limits/grpc"
)

// setup grpc client with this option
dialOption := grpc.WithUnaryInterceptor(
    gclGrpc.UnaryClientInterceptor(
        gclGrpc.WithLimiter(...),
        gclGrpc.WithClientResponseTypeClassifier(...),
    ),
)

References Used

  1. Original Java implementation - Netflix - https://github.com/netflix/concurrency-limits/
  2. Windowless Moving Percentile - Martin Jambon - https://mjambon.com/2016-07-23-moving-percentile/

Documentation

Overview

Package provides primitives for concurrency control in complex systems.

Directories

Path Synopsis
Package core provides the package interfaces.
Package core provides the package interfaces.
Package examples contains examples of using this package to solve concurrency problems.
Package examples contains examples of using this package to solve concurrency problems.
grpc_streaming/pb
Package pb is a generated protocol buffer package.
Package pb is a generated protocol buffer package.
grpc_unary/pb
Package pb is a generated protocol buffer package.
Package pb is a generated protocol buffer package.
Package grpc provides GRPC client/server mixins to add concurrency control.
Package grpc provides GRPC client/server mixins to add concurrency control.
Package limit provides several useful limit implementations.
Package limit provides several useful limit implementations.
functions
Package functions provides additional helper functions to the limit package.
Package functions provides additional helper functions to the limit package.
Package limiter provides common limiter implementations that are useful.
Package limiter provides common limiter implementations that are useful.
Package measurements provides measurement reading implementations
Package measurements provides measurement reading implementations
metric_registry
datadog
Package datadog implements the metric registry interface for a Datadog provider.
Package datadog implements the metric registry interface for a Datadog provider.
gometrics
Package gometrics implements the metric registry interface for a gometrics provider.
Package gometrics implements the metric registry interface for a gometrics provider.
Package patterns provides common patterns as higher level abstractions from the library building blocks.
Package patterns provides common patterns as higher level abstractions from the library building blocks.
pool
Package pool provides common pool patterns for concurrency control.
Package pool provides common pool patterns for concurrency control.
Package strategy provides common strategy implementations.
Package strategy provides common strategy implementations.
matchers
Package matchers provides basic matchers for partitioned type strategies.
Package matchers provides basic matchers for partitioned type strategies.

Jump to

Keyboard shortcuts

? : This menu
/ : Search site
f or F : Jump to
y or Y : Canonical URL