chester

command module
v0.0.0-...-3593051
Published: Oct 21, 2021 License: MIT Imports: 31 Imported by: 0

README

Chester

Chester is an autoscaling tool for Cloud SQL. It leverages Stackdriver Monitoring, Pub/Sub, and Cloud Functions, and is made up of several moving pieces, each with its own section in this doc.

  • Chester-Daemon
    • Deployment on GKE that listens to pubsub and has administrative access to modify configmaps and other namespaced deployments in the cluster
    • Scales up/down read replicas in cloudsql
    • Updates the configmap on ProxySQL to enable reading/writing to those hostgroups
    • Repository - This one
  • ChesterModels
    • Helper functions and common structs across the ecosystem
  • GCF
  • ProxySQL
    • Exists as a deployment in GKE, currently non-clusterized, so the configuration is stored in a configmap
    • Applications connect to ProxySQL via the internal service endpoint in K8S
  • Chester-API
  • Datastore
    • Persistent no-sql storage for event data
  • Pub/Sub
    • Async messaging for alerting.

Why

Because Cloud SQL on Google Cloud is missing the autoscaling functionality that AWS Aurora has. The official word from the Google Cloud SQL team is to leverage Cloud Functions to do this work; however, their official implementation leaves a lot to be desired, so this is the expansion.

How it works at a high level

Chester works by setting up an alert in Stackdriver that carries instance metadata in its documentation, scaling read replicas up or down when the alert fires, then updating the ProxySQL configuration.

Configs

  • NETWORK_PROJECT_ID - The project ID of the shared vpc host project.
  • NETWORK_NAME - The name of the shared vpc host network.
  • PROJECT_ID - Project ID of the GCP project where chester exists.
  • PUBSUB_CREDS - Physical location of the JSON token we use to auth against pubsub.
  • DATASTORE_CREDS - Physical location of the JSON token we use to auth against datastore.
  • PUBSUB_TOPIC - Name of the topic used to broadcast messages to chester services
  • PUBSUB_SUBSCRIPTION - Name of the subscription used to listen to messages from the topic
  • SQLADMIN_CREDS - Physical location of the JSON token we use to auth against the sqladmin api.
  • IN_CLUSTER - Boolean, whether the daemon is running inside the cluster. Used primarily for dev work when you don't want to spin up minikube
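
As a rough illustration of how these settings could be read at startup, here is a minimal sketch. The `Config` struct and `LoadConfig` function are hypothetical names, not Chester's actual code. It also assumes that every string setting is required and that an unset IN_CLUSTER means false, neither of which this README states.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// Config mirrors the environment variables listed above.
// The struct and its field names are illustrative, not Chester's real types.
type Config struct {
	NetworkProjectID   string
	NetworkName        string
	ProjectID          string
	PubSubCreds        string
	DatastoreCreds     string
	PubSubTopic        string
	PubSubSubscription string
	SQLAdminCreds      string
	InCluster          bool
}

// LoadConfig reads each setting through getenv (pass os.Getenv in real use)
// and reports every missing variable in a single error.
func LoadConfig(getenv func(string) string) (Config, error) {
	var missing []string
	get := func(key string) string {
		v := getenv(key)
		if v == "" {
			// Assumption: all string settings are required.
			missing = append(missing, key)
		}
		return v
	}

	cfg := Config{
		NetworkProjectID:   get("NETWORK_PROJECT_ID"),
		NetworkName:        get("NETWORK_NAME"),
		ProjectID:          get("PROJECT_ID"),
		PubSubCreds:        get("PUBSUB_CREDS"),
		DatastoreCreds:     get("DATASTORE_CREDS"),
		PubSubTopic:        get("PUBSUB_TOPIC"),
		PubSubSubscription: get("PUBSUB_SUBSCRIPTION"),
		SQLAdminCreds:      get("SQLADMIN_CREDS"),
	}

	// Assumption: an unset IN_CLUSTER is treated as false.
	if raw := getenv("IN_CLUSTER"); raw != "" {
		inCluster, err := strconv.ParseBool(raw)
		if err != nil {
			return Config{}, fmt.Errorf("IN_CLUSTER: %w", err)
		}
		cfg.InCluster = inCluster
	}

	if len(missing) > 0 {
		return Config{}, fmt.Errorf("missing environment variables: %s", strings.Join(missing, ", "))
	}
	return cfg, nil
}
```

Taking `getenv` as a parameter keeps the loader testable without touching the real process environment. Call `LoadConfig(os.Getenv)` from `main`.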

Stackdriver

Alerting Policy

Violates when: Any cloudsql.googleapis.com/database/network/connections stream is <above/below> a threshold of n for greater than <1/5> minute(s)

The documentation field of these alerts needs to contain the instance/policy metadata as JSON.

Example:

{ 
  "replica_basename":"sql-development-", 
  "sql_master_instance":"sql-development", 
  "action":"add" 
}

Parameters:

  • replica_basename = string, the default base name for the read replicas
  • sql_master_instance = string, name of the immutable writer
  • action = string, whether we are adding or removing a read replica
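
A minimal sketch of how a consumer might decode and validate that documentation blob follows. The `AlertMetadata` type and `ParseAlertMetadata` function are hypothetical names. The accepted action values "add" and "remove" are an assumption drawn from the example above and the daemon description; the README does not list the exact strings.

```go
package main

import (
	"encoding/json"
	"errors"
	"fmt"
)

// AlertMetadata matches the JSON keys shown in the example above.
// The Go type name is illustrative only.
type AlertMetadata struct {
	ReplicaBasename   string `json:"replica_basename"`
	SQLMasterInstance string `json:"sql_master_instance"`
	Action            string `json:"action"`
}

// ParseAlertMetadata decodes the alert documentation and rejects
// blobs that are missing a field or carry an unexpected action.
func ParseAlertMetadata(doc string) (AlertMetadata, error) {
	var m AlertMetadata
	if err := json.Unmarshal([]byte(doc), &m); err != nil {
		return AlertMetadata{}, fmt.Errorf("decode alert documentation: %w", err)
	}
	if m.ReplicaBasename == "" || m.SQLMasterInstance == "" {
		return AlertMetadata{}, errors.New("replica_basename and sql_master_instance are required")
	}
	switch m.Action {
	case "add", "remove": // Assumption: these are the only two actions.
		return m, nil
	default:
		return AlertMetadata{}, fmt.Errorf("unsupported action %q", m.Action)
	}
}
```

Validating at the edge means a malformed alert fails loudly before anything touches Cloud SQL.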

GCF

The Google Cloud Function takes the event data from Stackdriver, converts it into a Datastore object in the chester namespace, then sends that message to Pub/Sub.

Datastore

Overview
{
  "IncidentID":"abcd",
  "PolicyName":"add_sql_replica",
  "State":"open",
  "StartedAt": 100000,
  "ClosedTimestamp":0,
  "Condition":{
    "IncidentID":"abcd",
    "PolicyName":"add_sql_replica"
  },
  "SqlMasterInstance":"sql-development",
  "ReplicaBaseName":"sql-development-",
  "Documentation":{
    "Content":"json blob here",
    "MimeType":"markdown"
  },
  "InProgress":true,
  "Action":"add"
}
Parameters
  • IncidentID = string, the GCF replaces every "." with "-"
  • PolicyName = string, the alert name
  • State = string, whether this is an open incident or not
  • StartedAt = int64, timestamp of when this incident started
  • ClosedTimestamp = int64, timestamp of when this incident ended
  • Condition = object, duplicates the IncidentID and PolicyName stored at the top level
  • Documentation = object
    • MimeType = string, not really needed
    • Content = Documentation from the alert, contains database and alert metadata as a json encoded string.
      { 
        "replica_basename":"sql-development-", 
        "sql_master_instance":"sql-development", 
        "action":"add" 
      }
    
  • SqlMasterInstance = string, stored from the Documentation.Content object, contains the immutable writer instance
  • ReplicaBaseName = string, stored from the Documentation.Content object, contains the base name used for the db instance
  • InProgress = bool, reserved for incident locking in the future
  • Action = string, stored from the Documentation.Content object. This used to be derived from PolicyName, but that would not work if we extend Chester to other databases.
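
To make the mapping concrete, here is a hedged sketch of the conversion the GCF is described as performing: normalize the incident ID, copy the metadata out of Documentation.Content, and mirror the ID and policy name into Condition. The field names follow the JSON overview above. The `NewEvent` constructor and its signature are hypothetical and are not taken from ChesterModels.

```go
package main

import (
	"encoding/json"
	"fmt"
	"strings"
)

// Condition repeats the identifying fields stored at the top level.
type Condition struct {
	IncidentID string
	PolicyName string
}

// Documentation holds the raw alert documentation.
type Documentation struct {
	Content  string
	MimeType string
}

// Event mirrors the Datastore object shown in the overview.
// Without struct tags, encoding/json emits these exact key names.
type Event struct {
	IncidentID        string
	PolicyName        string
	State             string
	StartedAt         int64
	ClosedTimestamp   int64
	Condition         Condition
	SqlMasterInstance string
	ReplicaBaseName   string
	Documentation     Documentation
	InProgress        bool
	Action            string
}

// NewEvent is a hypothetical constructor illustrating the described mapping.
func NewEvent(incidentID, policyName, state string, startedAt int64, doc Documentation) (Event, error) {
	// The alert documentation is a JSON-encoded string.
	var meta struct {
		ReplicaBasename   string `json:"replica_basename"`
		SQLMasterInstance string `json:"sql_master_instance"`
		Action            string `json:"action"`
	}
	if err := json.Unmarshal([]byte(doc.Content), &meta); err != nil {
		return Event{}, fmt.Errorf("decode documentation content: %w", err)
	}

	// Every "." in the incident ID is replaced with "-".
	id := strings.ReplaceAll(incidentID, ".", "-")

	return Event{
		IncidentID:        id,
		PolicyName:        policyName,
		State:             state,
		StartedAt:         startedAt,
		Condition:         Condition{IncidentID: id, PolicyName: policyName},
		SqlMasterInstance: meta.SQLMasterInstance,
		ReplicaBaseName:   meta.ReplicaBasename,
		Documentation:     doc,
		Action:            meta.Action,
		// InProgress is left false here; how it gets set is not specified.
	}, nil
}
```

Marshalling the result with `json.Marshal` should yield the same key names as the overview, since the struct carries no JSON tags.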

Chester-Daemon

How it works:

  1. Connects to PubSub/Datastore/K8S
  2. Listens to events on PubSub
  3. If the event is an add event
    1. Get the event data
    2. Generate a new instance
    3. Get that instance's IP address
    4. Add to proxysql config
    5. If event is closed, call it a day, else repeat
  4. If the event is remove
    1. Get the event data
    2. Find a chester generated instance
    3. Get the private IP
    4. Remove that from proxysql config
    5. Remove that instance from CloudSQL
    6. If the event is closed, call it a day, else repeat
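
The branching above can be sketched as a single dispatch function. This is only an outline under stated assumptions: the `Ops` struct and its function fields are hypothetical stand-ins for the Cloud SQL and ProxySQL configmap calls, the action strings "add" and "remove" are assumed, and the "repeat until the event is closed" loop is left to the caller.

```go
package main

import "fmt"

// Event carries only the fields this sketch needs.
type Event struct {
	Action            string
	SqlMasterInstance string
	ReplicaBaseName   string
}

// Ops bundles hypothetical hooks for the external systems.
// None of these names come from Chester itself.
type Ops struct {
	CreateReplica   func(master, baseName string) (instance string, err error)
	FindReplica     func(master, baseName string) (instance string, err error)
	PrivateIP       func(instance string) (ip string, err error)
	AddToProxy      func(ip string) error
	RemoveFromProxy func(ip string) error
	DeleteReplica   func(instance string) error
}

// HandleEvent performs one pass of the add or remove flow.
func HandleEvent(ev Event, ops Ops) error {
	switch ev.Action {
	case "add":
		instance, err := ops.CreateReplica(ev.SqlMasterInstance, ev.ReplicaBaseName)
		if err != nil {
			return fmt.Errorf("create replica: %w", err)
		}
		ip, err := ops.PrivateIP(instance)
		if err != nil {
			return fmt.Errorf("look up IP of %s: %w", instance, err)
		}
		return ops.AddToProxy(ip)

	case "remove":
		instance, err := ops.FindReplica(ev.SqlMasterInstance, ev.ReplicaBaseName)
		if err != nil {
			return fmt.Errorf("find replica: %w", err)
		}
		ip, err := ops.PrivateIP(instance)
		if err != nil {
			return fmt.Errorf("look up IP of %s: %w", instance, err)
		}
		// Take the replica out of the proxy config before deleting it,
		// matching the order of the steps listed above.
		if err := ops.RemoveFromProxy(ip); err != nil {
			return fmt.Errorf("remove %s from proxy: %w", ip, err)
		}
		return ops.DeleteReplica(instance)

	default:
		return fmt.Errorf("unknown action %q", ev.Action)
	}
}
```

Passing the side effects in as functions keeps the ordering logic testable without a cluster or a Cloud SQL project.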

Chester-API

HTTP layer used for updating database configurations in a programmatic way, because I'm not manually redeploying every time we add a DB.

Arch Diagram

Todo

  • Add support for mounting CES/PEM keys to the proxysql containers - pending ProxySQL adding support for SSL on individual read replicas
