ead

package module
v0.0.0-...-8ea12a5 Latest Latest
Warning

This package is not in the latest version of its module.

Go to latest
Published: May 3, 2023 License: MIT Imports: 16 Imported by: 0

README

Automate EC2 instance replacement with updated ECS optimized AMI for ECS clusters

What is this?

ecs-ami-deploy is a library of code for intelligently replacing instances in an autoscaling group (ASG) with the latest ECS Optimized image. Based on a couple assumptions, this library can replace all instances in an ECS cluster without causing downtime for any of the running services.

This process is available as:

  • A Go Module: import "github.com/silinternational/ecs-ami-deploy"
  • A command line application. See cli/ directory
  • A Lambda function that can be scheduled or triggered automatically whenever a new ECS optimized AMI is released

Why did we build this?

AWS releases new ECS optimized AMIs on a pretty frequent basis. It's generally a good idea to keep up to date with releases for security, performance, and other enhancements. However, replacing instances can be a complicated dance if you also want to maintain a high level of availability for the services you're running on the instances to be replaced. AWS has a feature to "refresh" instances in an autoscaling group, but unfortunately it isn't "ECS aware". Rather, it removes EC2 instances from the autoscaling group, terminates them, and replaces them. When the new instances are "healthy" it moves on to replace more instances. It does not take into account if the services you have running in ECS are stable yet when removing the next instance. As a result, refreshing instances in an ASG using the built-in feature can and will likely cause downtime for your ECS services.

That isn't really an acceptable solution for us, so we created ecs-ami-deploy to refresh instances with an awareness of the services running in ECS and to replace EC2 instances only when ECS services are stable. This process does assume more than one task per service is running so that when an EC2 instance is removed, ECS will launch a new task on a different instance while other tasks for this service remain where they are. Then one by one, EC2 instances are removed from service and terminated, and only when all services are stable again, that is they have zero pending tasks, then the next EC2 instance can be removed from service and so forth.

Idempotency

Gracefully replacing instances can take some time, especially for clusters with many instances supporting them. Since it is expected that this process will be run by a Lambda, which has a runtime limit of 15 minutes, the process was designed to be fault-tolerant and to pick up where it left off should it be killed due to timeout. One of the ways this is accomplished is that all EC2 instances in the ASG are tagged before being detached and terminated. On successive runs the process looks for tagged instances for the given cluster that are no longer in service and continues the graceful termination process while monitoring the ECS cluster services for stability.

If --force-replacement is enabled, the process will always replace all instances whether there is a newer AMI available or not. When --force-replacement is enabled the process is not idempotent.

Instance Replacement Process

  1. Look up latest AMI based on either the given AMI filter, or the default: amzn2-ami-ecs-hvm-*-x86_64-ebs
  2. Identify the ASG for the given ECS cluster to get current launch template and instances list
  3. Compare latest AMI with AMI in use by launch template
    1. If cluster is not using latest AMI, or force replacement is enabled, proceed to #4
    2. Else if using latest AMI already, jump to #10
  4. Create new launch template version with new AMI
  5. Update launch template default version and set ASG to use the latest template version ("$Latest")
  6. Detach existing instances from ASG and replace with new ones
  7. Wait for new instances to reach InService state with ASG
  8. Watch ECS cluster instances until all new ones are registered and available
  9. For each old instance that needs to be removed:
    1. Deregister one instance from ECS cluster
    2. Wait for zero pending tasks in cluster
    3. Terminate old ASG EC2 instance
  10. Scan all EC2 instances for any instances tagged for termination as part of this operation in case any were missed on a previous run due to timeout or something else. For each:
    1. Terminate instance
    2. Wait for zero pending tasks in cluster

Todo

  • Consistentify logging vs. returning errors
  • Add a logger that can send output to an email as well as stdout
  • Create Lambda wrapper and provide trigger examples for schedule and SNS when newer AMI is released

CLI Usage

  1. Grab the latest binary for your platform at https://github.com/silinternational/ecs-ami-deploy/releases
  2. The CLI makes use of AWS's SDK for Go, which can load authentication credentials from various places similar to the AWS CLI itself
  3. Run ecs-ami-deploy list-clusters to check if it's working and what clusters you have available.
  4. If you have multiple profiles configured in your ~/.aws/credentials file, you can use the -p or --profile flags to specify a different profile.
  5. The CLI defaults to region us-east-1, you can use the -r or --region flags to specify a different region
  6. The CLI has help information built in for the various subcommands and their supported flags, use -h or --help flags with each subcommand for more information.

Documentation

Index

Constants

View Source
const (
	DefaultAMIFilter           = "amzn2-ami-ecs-hvm-*-x86_64-ebs"
	DefaultPollingTimeout      = 15 * time.Minute
	DefaultPollingInterval     = 5 * time.Second
	DefaultLaunchTemplateLimit = 5
	DefaultTimestampLayout     = "20060102T150405"
	MinimumIntervalsForStable  = 6
	TagNameASG                 = "ecs-ami-deploy-asg"
	TagNameTerminate           = "ecs-ami-deploy-terminate"
	Version                    = "0.0.0"
)

Variables

View Source
var DefaultConfig = Config{
	AMIFilter:                DefaultAMIFilter,
	Cluster:                  "",
	ForceReplacement:         false,
	LaunchTemplateLimit:      DefaultLaunchTemplateLimit,
	LaunchTemplateNamePrefix: "",
	Logger:                   nil,
	PollingInterval:          DefaultPollingInterval,
	PollingTimeout:           DefaultPollingTimeout,
	TimestampLayout:          DefaultTimestampLayout,
}

Functions

This section is empty.

Types

type ClusterMeta

type ClusterMeta struct {
	Cluster ecsTypes.Cluster
	Image   ec2types.Image
}

type Config

type Config struct {
	AMIFilter                string
	Cluster                  string
	ForceReplacement         bool
	LaunchTemplateLimit      int
	LaunchTemplateNamePrefix string
	Logger                   *log.Logger
	PollingInterval          time.Duration
	PollingTimeout           time.Duration
	TimestampLayout          string
}

type Upgrader

type Upgrader struct {
	// contains filtered or unexported fields
}

func NewUpgrader

func NewUpgrader(awsCfg aws.Config, config *Config) (*Upgrader, error)

func (*Upgrader) LatestAMI

func (u *Upgrader) LatestAMI() (ec2types.Image, error)

LatestAMI finds the latest ECS optimized AMI for the given (or default) filter and returns the ec2types.Image and/or an error

func (*Upgrader) ListClusters

func (u *Upgrader) ListClusters() ([]ClusterMeta, error)

func (*Upgrader) UpgradeCluster

func (u *Upgrader) UpgradeCluster() error

Directories

Path Synopsis
cli
cmd

Jump to

Keyboard shortcuts

? : This menu
/ : Search site
f or F : Jump to
y or Y : Canonical URL