sidekick

package module

v0.5.2-0...-68b3f38 Published: Jun 28, 2021 License: AGPL-3.0 Imports: 47 Imported by: 0


README ¶

sidekick

sidekick is a high-performance sidecar load-balancer. By attaching a tiny load balancer as a sidecar to each of the client application processes, you can eliminate the centralized loadbalancer bottleneck and DNS failover management. sidekick automatically avoids sending traffic to the failed servers by checking their health via the readiness API and HTTP error returns.

Architecture

[Figure: architecture diagram]

Demo

[Figure: sidekick-demo]

Install

Binary Releases

OS	ARCH	Binary
Linux	amd64	linux-amd64
Linux	arm64	linux-arm64
Linux	ppc64le	linux-ppc64le
Linux	s390x	linux-s390x
Apple	amd64	darwin-amd64
Windows	amd64	windows-amd64

You can also verify the binary with minisign by downloading the corresponding .minisig signature file. Then run:

minisign -Vm sidekick-<OS>-<ARCH> -P RWTx5Zr1tiHQLwG9keckT0c45M3AGeHD6IvimQHpyRywVWGbP1aVSGav

Docker

Pull the latest release via:

docker pull minio/sidekick

Build from source

GO111MODULE=on go get github.com/minio/sidekick/cmd/sidekick

You will need a working Go environment; see How to install Go. The minimum required version is go1.16.

Usage

NAME:
  sidekick - High-Performance sidecar load-balancer

USAGE:
  sidekick - [FLAGS] SITE1 [SITE2..]

FLAGS:
  --address value, -a value           listening address for sidekick (default: ":8080")
  --health-path value, -p value       health check path
  --read-health-path value, -r value  health check path for read access - valid only for failover site
  --health-port value                 health check port (default: 0)
  --health-duration value, -d value   health check duration in seconds (default: 5)
  --insecure, -i                      disable TLS certificate verification
  --log, -l                           enable logging
  --trace value, -t value             enable request tracing - valid values are [all,application,minio] (default: "all")
  --quiet, -q                         disable console messages
  --json                              output sidekick logs and trace in json format
  --debug                             output verbose trace
  --cacert value                      CA certificate to verify peer against
  --client-cert value                 client certificate file
  --client-key value                  client private key file
  --cert value                        server certificate file
  --key value                         server private key file
  --help, -h                          show help
  --version, -v                       print the version

SITE:
  Each SITE is a comma separated list of pools of that site: http://172.17.0.{2...5},http://172.17.0.{6...9}.
  If all servers in SITE1 are down, then the traffic is routed to the next site - SITE2.
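
The site failover rule described above can be sketched in Go as follows. This is only an illustration of the rule as stated in the help text, not sidekick's implementation; the `backend` and `site` types and the `pickBackends` function are hypothetical names introduced for this example.

```go
package main

import "fmt"

// backend and site are hypothetical types for this sketch; they are not
// sidekick's own data structures.
type backend struct {
	endpoint string
	online   bool
}

type site struct {
	backends []backend
}

// pickBackends returns the online backends of the first site that has any.
// Traffic moves to the next site only when every server in the preceding
// site is down.
func pickBackends(sites []site) []backend {
	for _, s := range sites {
		var up []backend
		for _, b := range s.backends {
			if b.online {
				up = append(up, b)
			}
		}
		if len(up) > 0 {
			return up
		}
	}
	return nil
}

func main() {
	sites := []site{
		{backends: []backend{{"http://site1-minio1:9000", false}, {"http://site1-minio2:9000", false}}},
		{backends: []backend{{"http://site2-minio1:9000", true}, {"http://site2-minio2:9000", false}}},
	}
	// All of site 1 is down, so only site 2's online backend is selected.
	for _, b := range pickBackends(sites) {
		fmt.Println(b.endpoint) // prints http://site2-minio1:9000
	}
}
```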

Examples

Load balance across a web service using DNS provided IPs

$ sidekick --health-path=/ready http://myapp.myorg.dom

Load balance across 4 MinIO Servers (http://minio1:9000 to http://minio4:9000)

$ sidekick --health-path=/minio/health/ready --address :8000 http://minio{1...4}:9000

Two sites with 4 servers each

$ sidekick --health-path=/minio/health/ready http://site1-minio{1...4}:9000 http://site2-minio{1...4}:9000
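
The {1...4} ellipsis pattern used in the examples above stands for a numeric range of hosts. A minimal sketch of such an expansion, assuming a single numeric range per pattern, might look like this in Go. The `expand` function and its regular expression are assumptions made for illustration; sidekick's own parser may support more forms.

```go
package main

import (
	"fmt"
	"regexp"
	"strconv"
)

// ellipsis matches a single numeric range such as {1...4}.
var ellipsis = regexp.MustCompile(`\{(\d+)\.\.\.(\d+)\}`)

// expand replaces the first {start...end} range in pattern with each number
// in the range. Patterns without a range are returned unchanged.
// This simplified version handles only one range per pattern.
func expand(pattern string) ([]string, error) {
	loc := ellipsis.FindStringSubmatchIndex(pattern)
	if loc == nil {
		return []string{pattern}, nil
	}
	start, err := strconv.Atoi(pattern[loc[2]:loc[3]])
	if err != nil {
		return nil, err
	}
	end, err := strconv.Atoi(pattern[loc[4]:loc[5]])
	if err != nil {
		return nil, err
	}
	if start > end {
		return nil, fmt.Errorf("invalid range %d...%d", start, end)
	}
	out := make([]string, 0, end-start+1)
	for i := start; i <= end; i++ {
		out = append(out, pattern[:loc[0]]+strconv.Itoa(i)+pattern[loc[1]:])
	}
	return out, nil
}

func main() {
	endpoints, err := expand("http://minio{1...4}:9000")
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	for _, e := range endpoints {
		fmt.Println(e)
	}
}
```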

Real-world Example with spark-operator

sidekick can run as a sidecar for the Spark driver and executors. To begin, install spark-operator and MinIO on your Kubernetes cluster.

Optionally, create a Kubernetes namespace named spark-operator:

kubectl create ns spark-operator

Configure spark-operator

We will use the Spark operator maintained by GCP at https://github.com/GoogleCloudPlatform/spark-on-k8s-operator

helm repo add incubator http://storage.googleapis.com/kubernetes-charts-incubator
helm install spark-operator incubator/sparkoperator --namespace spark-operator  --set sparkJobNamespace=spark-operator --set enableWebhook=true

Install MinIO

helm install minio-distributed stable/minio --namespace spark-operator --set accessKey=minio,secretKey=minio123,persistence.enabled=false,mode=distributed

NOTE: persistence is disabled here for testing. Make sure you use persistence with PVs for production workloads. For more details, read our helm documentation.

Once minio-distributed is up and running, configure mc and upload some data. We will use mybucket as the bucket name.

Port-forward to access minio-cluster locally.

kubectl port-forward pod/minio-distributed-0 9000

Create a bucket named mybucket and upload some text data for the Spark word count sample.

mc config host add minio-distributed http://localhost:9000 minio minio123
mc mb minio-distributed/mybucket
for i in 1 2 3 4; do mc cp /etc/hosts minio-distributed/mybucket/mydata/$i.txt; done

Run the spark job in k8s

apiVersion: "sparkoperator.k8s.io/v1beta2"
kind: SparkApplication
metadata:
  name: spark-minio-app
  namespace: spark-operator
spec:
  sparkConf:
    spark.kubernetes.allocation.batch.size: "50"
  hadoopConf:
    "fs.s3a.endpoint": "http://127.0.0.1:9000"
    "fs.s3a.access.key": "minio"
    "fs.s3a.secret.key": "minio123"
    "fs.s3a.path.style.access": "true"
    "fs.s3a.impl": "org.apache.hadoop.fs.s3a.S3AFileSystem"
  type: Scala
  sparkVersion: 2.4.5
  mode: cluster
  image: minio/spark:v2.4.5-hadoop-3.1
  imagePullPolicy: Always
  restartPolicy:
      type: OnFailure
      onFailureRetries: 3
      onFailureRetryInterval: 10
      onSubmissionFailureRetries: 5
      onSubmissionFailureRetryInterval: 20

  mainClass: org.apache.spark.examples.JavaWordCount
  mainApplicationFile: "local:///opt/spark/examples/target/original-spark-examples_2.11-2.4.6-SNAPSHOT.jar"
  arguments:
  - "s3a://mybucket/mydata"
  driver:
    cores: 1
    coreLimit: "1000m"
    memory: "512m"
    labels:
      version: 2.4.5
    sidecars:
    - name: minio-lb
      image: "minio/sidekick:v1.0.0"
      imagePullPolicy: Always
      args: ["--health-path", "/minio/health/ready", "--address", ":9000", "http://minio-distributed-{0...3}.minio-distributed-svc.spark-operator.svc.cluster.local:9000"]
      ports:
        - containerPort: 9000

  executor:
    cores: 1
    instances: 4
    memory: "512m"
    labels:
      version: 2.4.5
    sidecars:
    - name: minio-lb
      image: "minio/sidekick:v1.0.0"
      imagePullPolicy: Always
      args: ["--health-path", "/minio/health/ready", "--address", ":9000", "http://minio-distributed-{0...3}.minio-distributed-svc.spark-operator.svc.cluster.local:9000"]
      ports:
        - containerPort: 9000

kubectl create -f spark-job.yaml
kubectl logs -f --namespace spark-operator spark-minio-app-driver spark-kubernetes-driver

High Performance S3 Cache

An S3-compatible object store can be configured as shared cache storage. This allows applications using the Sidekick load balancer to share a distributed cache, enabling hot-tier caching. The cache can be any S3-compatible object store, either within the network or remote. It offers vastly improved time to first byte for applications while fully utilizing cache storage capacity and reducing network traffic.

Run sidekick configured with high performance cache on baremetal

Caching is enabled by setting the sidekick cache environment variables, which specify the endpoint of the S3-compatible object store and the access key and secret key used to authenticate to it. Objects are cached to the shared store on GET if the object returned by the backend exceeds a configurable minimum size. The default minimum size is 1MB.

export SIDEKICK_CACHE_ENDPOINT="http://minio-remote:9000"
export SIDEKICK_CACHE_ACCESS_KEY="minio"
export SIDEKICK_CACHE_SECRET_KEY="minio123"
export SIDEKICK_CACHE_BUCKET="cache01"
export SIDEKICK_CACHE_MIN_SIZE=64MB
export SIDEKICK_CACHE_HEALTH_DURATION=20
sidekick --health-path=/minio/health/ready http://minio{1...16}:9000
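
To make the role of these variables concrete, here is a rough Go sketch that reads them into a config struct and applies the documented 1MB default. The `cacheConfig` type, `loadCacheConfig`, and `parseSize` are hypothetical helpers written for this example. Treating MB as decimal and MiB as binary is an assumption, and sidekick's own parsing may differ.

```go
package main

import (
	"errors"
	"fmt"
	"os"
	"strconv"
	"strings"
)

// cacheConfig is a hypothetical holder for the cache settings.
type cacheConfig struct {
	Endpoint, AccessKey, SecretKey, Bucket string
	MinSize                                int64 // bytes
}

// units lists the size suffixes this sketch understands (an assumption).
var units = []struct {
	suffix string
	mult   int64
}{
	{"KiB", 1 << 10}, {"MiB", 1 << 20}, {"GiB", 1 << 30},
	{"KB", 1000}, {"MB", 1000 * 1000}, {"GB", 1000 * 1000 * 1000},
}

// parseSize converts strings such as "64MB" or "32MiB" to a byte count.
func parseSize(s string) (int64, error) {
	s = strings.TrimSpace(s)
	for _, u := range units {
		if strings.HasSuffix(s, u.suffix) {
			n, err := strconv.ParseInt(strings.TrimSpace(strings.TrimSuffix(s, u.suffix)), 10, 64)
			if err != nil {
				return 0, err
			}
			return n * u.mult, nil
		}
	}
	return strconv.ParseInt(s, 10, 64) // plain byte count
}

// loadCacheConfig reads the SIDEKICK_CACHE_* variables through getenv,
// which is injected so the function can be exercised without a real environment.
func loadCacheConfig(getenv func(string) string) (cacheConfig, error) {
	cfg := cacheConfig{
		Endpoint:  getenv("SIDEKICK_CACHE_ENDPOINT"),
		AccessKey: getenv("SIDEKICK_CACHE_ACCESS_KEY"),
		SecretKey: getenv("SIDEKICK_CACHE_SECRET_KEY"),
		Bucket:    getenv("SIDEKICK_CACHE_BUCKET"),
		MinSize:   1000 * 1000, // documented default of 1MB
	}
	if cfg.Endpoint == "" {
		return cfg, errors.New("SIDEKICK_CACHE_ENDPOINT is not set")
	}
	if v := getenv("SIDEKICK_CACHE_MIN_SIZE"); v != "" {
		n, err := parseSize(v)
		if err != nil {
			return cfg, fmt.Errorf("invalid SIDEKICK_CACHE_MIN_SIZE %q: %w", v, err)
		}
		cfg.MinSize = n
	}
	return cfg, nil
}

func main() {
	cfg, err := loadCacheConfig(os.Getenv)
	if err != nil {
		fmt.Println("cache disabled:", err)
		return
	}
	fmt.Printf("caching objects of at least %d bytes in bucket %q at %s\n",
		cfg.MinSize, cfg.Bucket, cfg.Endpoint)
}
```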

Run the spark job in k8s

The following example shows how to configure sidekick as a high-performance cache sidecar with the spark-operator framework in a Kubernetes environment.

apiVersion: "sparkoperator.k8s.io/v1beta2"
kind: SparkApplication
metadata:
  name: spark-minio-app
  namespace: spark-operator
spec:
  sparkConf:
    spark.kubernetes.allocation.batch.size: "50"
  hadoopConf:
    "fs.s3a.endpoint": "http://127.0.0.1:9000"
    "fs.s3a.access.key": "minio"
    "fs.s3a.secret.key": "minio123"
    "fs.s3a.path.style.access": "true"
    "fs.s3a.impl": "org.apache.hadoop.fs.s3a.S3AFileSystem"
  type: Scala
  sparkVersion: 2.4.5
  mode: cluster
  image: minio/spark:v2.4.5-hadoop-3.1
  imagePullPolicy: Always
  restartPolicy:
      type: OnFailure
      onFailureRetries: 3
      onFailureRetryInterval: 10
      onSubmissionFailureRetries: 5
      onSubmissionFailureRetryInterval: 20

  mainClass: org.apache.spark.examples.JavaWordCount
  mainApplicationFile: "local:///opt/spark/examples/target/original-spark-examples_2.11-2.4.6-SNAPSHOT.jar"
  arguments:
  - "s3a://mybucket/mydata"
  driver:
    cores: 1
    coreLimit: "1000m"
    memory: "512m"
    labels:
      version: 2.4.5
    sidecars:
    - name: minio-lb
      image: "minio/sidekick:v0.5.8"
      imagePullPolicy: Always
      args: ["--health-path", "/minio/health/ready", "--address", ":9000", "http://minio-distributed-{0...3}.minio-distributed-svc.spark-operator.svc.cluster.local:9000"]
      env:
       - name: SIDEKICK_CACHE_ENDPOINT
         value: "http://minio-remote:9000"
       - name: SIDEKICK_CACHE_ACCESS_KEY
         value: "minio"
       - name: SIDEKICK_CACHE_SECRET_KEY
         value: "minio123"
       - name: SIDEKICK_CACHE_BUCKET
         value: "cache01"
       - name: SIDEKICK_CACHE_MIN_SIZE
         value: "32MiB"
       - name: SIDEKICK_CACHE_HEALTH_DURATION
         value: "20"
      ports:
        - containerPort: 9000

  executor:
    cores: 1
    instances: 4
    memory: "512m"
    labels:
      version: 2.4.5
    sidecars:
    - name: minio-lb
      image: "minio/sidekick:v0.5.8"
      imagePullPolicy: Always
      args: ["--health-path", "/minio/health/ready", "--address", ":9000", "http://minio-distributed-{0...3}.minio-distributed-svc.spark-operator.svc.cluster.local:9000"]
      env:
       - name: SIDEKICK_CACHE_ENDPOINT
         value: "http://minio-remote:9000"
       - name: SIDEKICK_CACHE_ACCESS_KEY
         value: "minio"
       - name: SIDEKICK_CACHE_SECRET_KEY
         value: "minio123"
       - name: SIDEKICK_CACHE_BUCKET
         value: "cache01"
       - name: SIDEKICK_CACHE_MIN_SIZE
         value: "32MiB"
       - name: SIDEKICK_CACHE_HEALTH_DURATION
         value: "20"
      ports:
        - containerPort: 9000

kubectl create -f spark-job.yaml
kubectl logs -f --namespace spark-operator spark-minio-app-driver spark-kubernetes-driver

Features

The Sidekick cache layer is implemented as a high-performance middleware wrapper around the Sidekick load balancer. All GET requests that qualify for caching under the RFC 7234 cache specification and exceed the configured minimum size are streamed simultaneously to the application and the S3 cache. This allows simple, fast caching with zero memory overhead and without affecting performance.
If an object is already cached in the S3 store, its ETag and LastModified date are verified with the backend unless the Cache-Control header explicitly specifies "immutable" or "only-if-cached". Any cached entry that fails the ETag and/or LastModified checks, or that has been deleted from the backend, is evicted from the cache automatically.
When a cached resource is stale, it is revalidated with the backend using an If-None-Match header to check whether it is in fact still fresh. If so, the backend returns a 304 (Not Modified) response without sending the body of the requested resource, saving bandwidth.
Sidekick cache honors standard HTTP caching policies such as 'Cache-Control', 'Expires', etc. specified in request and response directives.
GET requests with Range headers are not cached to keep the codebase simple.
Health-Check: A health check is provided at the path "/v1/health". It returns "200 OK" if at least one of the sites is reachable; otherwise it returns a "502 Bad Gateway" error.

Documentation ¶

Index ¶

Constants
Variables
func SidekickMain()
type Backend
- func (b *Backend) ErrorHandler(w http.ResponseWriter, r *http.Request, err error)
- func (b *Backend) Online() bool
type BackendStats
type ConnStats
type HTTPRangeSpec
- func (h *HTTPRangeSpec) GetLength(resourceSize int64) (rangeLength int64, err error)
- func (h *HTTPRangeSpec) GetOffsetLength(resourceSize int64) (start, length int64, err error)
- func (h *HTTPRangeSpec) String(resourceSize int64) string
type ResponseRecorder
- func NewRecorder() *ResponseRecorder
- func (rw *ResponseRecorder) Flush()
- func (rw *ResponseRecorder) Header() http.Header
- func (rw *ResponseRecorder) Result() *http.Response
- func (rw *ResponseRecorder) Write(buf []byte) (int, error)
- func (rw *ResponseRecorder) WriteHeader(code int)
type ResponseWriter
- func NewResponseWriter(w http.ResponseWriter) *ResponseWriter
- func (lrw *ResponseWriter) Body() []byte
- func (lrw *ResponseWriter) Flush()
- func (lrw *ResponseWriter) Size() int
- func (lrw *ResponseWriter) Write(p []byte) (int, error)
- func (lrw *ResponseWriter) WriteHeader(code int)
type ResponseWriterWrapper
- func (w *ResponseWriterWrapper) Write(p []byte) (int, error)
- func (w *ResponseWriterWrapper) WriteHeader(statusCode int)
type S3CacheClient
type TraceInfo
- func InternalTrace(req *http.Request, resp *http.Response, reqTime, respTime time.Time) TraceInfo
- func Trace(f http.HandlerFunc, logBody bool, w http.ResponseWriter, r *http.Request, ...) TraceInfo
- func (trc TraceInfo) String() string

Constants ¶

const (
	// CacheControl header
	CacheControl = "Cache-Control"
	// Expires header
	Expires = "Expires"

	// EnvCacheEndpoint cache endpoint
	EnvCacheEndpoint = "SIDEKICK_CACHE_ENDPOINT"
	// EnvCacheAccessKey cache access key
	EnvCacheAccessKey = "SIDEKICK_CACHE_ACCESS_KEY"
	// EnvCacheSecretKey cache secret key
	EnvCacheSecretKey = "SIDEKICK_CACHE_SECRET_KEY"
	// EnvCacheBucket bucket to cache to.
	EnvCacheBucket = "SIDEKICK_CACHE_BUCKET"
	// EnvCacheMinSize minimum size of object that should be cached.
	EnvCacheMinSize = "SIDEKICK_CACHE_MIN_SIZE"
	// EnvCacheHealthCheckDuration - health check duration
	EnvCacheHealthCheckDuration = "SIDEKICK_CACHE_HEALTH_DURATION"
)

const (
	// LogMsgType for log messages
	LogMsgType = "LOG"
	// TraceMsgType for trace messages
	TraceMsgType = "TRACE"
	// DebugMsgType for debug output
	DebugMsgType = "DEBUG"
)

Variables ¶

var BodyPlaceHolder = []byte("<BODY>")

BodyPlaceHolder is a dummy body placeholder.

Functions ¶

func SidekickMain ¶

func SidekickMain()

Types ¶

type Backend ¶

type Backend struct {
	Stats *BackendStats
	// contains filtered or unexported fields
}

Backend is an entity to which requests get load balanced.

func (*Backend) ErrorHandler ¶

func (b *Backend) ErrorHandler(w http.ResponseWriter, r *http.Request, err error)

ErrorHandler called by httputil.ReverseProxy for errors.

func (*Backend) Online ¶

func (b *Backend) Online() bool

Online returns true if backend is up

type BackendStats ¶

type BackendStats struct {
	sync.Mutex
	LastDowntime    time.Duration
	CumDowntime     time.Duration
	TotCalls        int64
	TotCallFailures int64
	MinLatency      time.Duration
	MaxLatency      time.Duration
	CumLatency      time.Duration
	Rx              int64
	Tx              int64
	UpSince         time.Time
	DowntimeStart   time.Time
}

BackendStats holds server stats for backend

type ConnStats ¶

type ConnStats struct {
	// contains filtered or unexported fields
}

ConnStats - statistics on backend

type HTTPRangeSpec ¶

type HTTPRangeSpec struct {
	// Does the range spec refer to a suffix of the object?
	IsSuffixLength bool

	// Start and end offset specified in range spec
	Start, End int64
}

HTTPRangeSpec represents a range specification as supported by S3 GET object request.

Case 1: Not present -> represented by a nil RangeSpec
Case 2: bytes=1-10 (absolute start and end offsets) -> RangeSpec{false, 1, 10}
Case 3: bytes=10- (absolute start offset with end offset unspecified) -> RangeSpec{false, 10, -1}
Case 4: bytes=-30 (suffix length specification) -> RangeSpec{true, -30, -1}

func (*HTTPRangeSpec) GetLength ¶

func (h *HTTPRangeSpec) GetLength(resourceSize int64) (rangeLength int64, err error)

GetLength - get length of range

func (*HTTPRangeSpec) GetOffsetLength ¶

func (h *HTTPRangeSpec) GetOffsetLength(resourceSize int64) (start, length int64, err error)

GetOffsetLength computes the start offset and length of the range given the size of the resource
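
A possible way to compute the offset and length for the range forms documented for HTTPRangeSpec is sketched below, assuming standard HTTP byte-range semantics. The `rangeSpec` type mirrors the exported fields only, and `offsetLength` is a hypothetical function. The package's actual method may handle edge cases differently.

```go
package main

import (
	"errors"
	"fmt"
)

// rangeSpec mirrors the exported fields of HTTPRangeSpec for this sketch.
type rangeSpec struct {
	IsSuffixLength bool
	Start, End     int64
}

var errInvalidRange = errors.New("requested range is not satisfiable")

// offsetLength resolves a range against a resource of the given size.
func offsetLength(r *rangeSpec, size int64) (start, length int64, err error) {
	switch {
	case r == nil:
		// No range requested: the whole resource.
		return 0, size, nil
	case r.IsSuffixLength:
		// Start holds the negated suffix length, e.g. -30 for bytes=-30.
		length = -r.Start
		if length > size {
			length = size
		}
		return size - length, length, nil
	case r.Start >= size:
		return 0, 0, errInvalidRange
	case r.End == -1 || r.End >= size:
		// Open-ended range, or an end past the resource: read to the end.
		return r.Start, size - r.Start, nil
	default:
		return r.Start, r.End - r.Start + 1, nil
	}
}

func main() {
	const size = 100
	specs := []*rangeSpec{
		{false, 1, 10},   // bytes=1-10
		{false, 10, -1},  // bytes=10-
		{true, -30, -1},  // bytes=-30
	}
	for _, s := range specs {
		start, length, err := offsetLength(s, size)
		fmt.Println(start, length, err)
	}
}
```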

func (*HTTPRangeSpec) String ¶

func (h *HTTPRangeSpec) String(resourceSize int64) string

String returns stringified representation of range for a particular resource size.

type ResponseRecorder ¶

type ResponseRecorder struct {
	StatusCode int

	Flushed bool // Flushed is whether the Handler called Flush.
	// contains filtered or unexported fields
}

ResponseRecorder returns a wrapped response writer to get underlying http.Response for cache handler.

func NewRecorder ¶

func NewRecorder() *ResponseRecorder

NewRecorder returns an initialized ResponseRecorder.

func (*ResponseRecorder) Flush ¶

func (rw *ResponseRecorder) Flush()

Flush calls underlying Flush method

func (*ResponseRecorder) Header ¶

func (rw *ResponseRecorder) Header() http.Header

Header needed for implementing "net/http".ResponseWriter

func (*ResponseRecorder) Result ¶

func (rw *ResponseRecorder) Result() *http.Response

Result returns the response generated by the handler. It blocks on rw.ch until the header or some content has been written. The returned Response will have at least its StatusCode, Header, and Body set.

func (*ResponseRecorder) Write ¶

func (rw *ResponseRecorder) Write(buf []byte) (int, error)

Write implements http.ResponseWriter. The data in buf is written to the pipeWriter

func (*ResponseRecorder) WriteHeader ¶

func (rw *ResponseRecorder) WriteHeader(code int)

WriteHeader implements http.ResponseWriter.

type ResponseWriter ¶

type ResponseWriter struct {
	http.ResponseWriter
	StatusCode int
	// Response body should be logged
	LogBody         bool
	TimeToFirstByte time.Duration
	StartTime       time.Time
	// contains filtered or unexported fields
}

ResponseWriter - is a wrapper to trap the http response status code.

func NewResponseWriter ¶

func NewResponseWriter(w http.ResponseWriter) *ResponseWriter

NewResponseWriter - returns a wrapped response writer to trap http status codes for auditing purposes.

func (*ResponseWriter) Body ¶

func (lrw *ResponseWriter) Body() []byte

Body - Return response body.

func (*ResponseWriter) Flush ¶

func (lrw *ResponseWriter) Flush()

Flush - Calls the underlying Flush.

func (*ResponseWriter) Size ¶

func (lrw *ResponseWriter) Size() int

Size - returns the number of bytes written

func (*ResponseWriter) Write ¶

func (lrw *ResponseWriter) Write(p []byte) (int, error)

func (*ResponseWriter) WriteHeader ¶

func (lrw *ResponseWriter) WriteHeader(code int)

WriteHeader - writes http status code

type ResponseWriterWrapper ¶

type ResponseWriterWrapper struct {
	http.ResponseWriter
	// contains filtered or unexported fields
}

func (*ResponseWriterWrapper) Write ¶

func (w *ResponseWriterWrapper) Write(p []byte) (int, error)

func (*ResponseWriterWrapper) WriteHeader ¶

func (w *ResponseWriterWrapper) WriteHeader(statusCode int)

type S3CacheClient ¶

type S3CacheClient struct {
	*minio.Core
	// contains filtered or unexported fields
}

S3CacheClient client to S3 cache storage.

type TraceInfo ¶

type TraceInfo struct {
	Type      string            `json:"type"`
	NodeName  string            `json:"nodename"`
	ReqInfo   traceRequestInfo  `json:"request"`
	RespInfo  traceResponseInfo `json:"response"`
	CallStats traceCallStats    `json:"stats"`
}

TraceInfo - represents a trace record, additionally also reports errors if any while listening on trace.

func InternalTrace ¶

func InternalTrace(req *http.Request, resp *http.Response, reqTime, respTime time.Time) TraceInfo

InternalTrace returns trace for sidekick http requests

func Trace ¶

func Trace(f http.HandlerFunc, logBody bool, w http.ResponseWriter, r *http.Request, endpoint string) TraceInfo

Trace gets trace of http request

func (TraceInfo) String ¶

func (trc TraceInfo) String() string

