In this section, we will prepare our cluster to handle Spot interruptions.
If the available On-Demand capacity of a particular instance type is deleted, the Spot Instance is sent an interruption notice two minutes ahead to gracefully wrap up things. We will deploy a pod on each spot instance to detect and redeploy applications elsewhere in the cluster.
The first thing that we need to do is deploy the AWS Node Termination Handler on each Spot Instance. This will monitor the EC2 metadata service on the instance for an interruption notice.
The termination handler consists of a ServiceAccount
, ClusterRole
, ClusterRoleBinding
, and a DaemonSet
.
The workflow can be summarized as:
By default, the aws-node-termination-handler will run on all of your nodes (on-demand and spot). If your spot instances are labeled, you can configure aws-node-termination-handler
to only run on your labeled spot nodes. If you’re using the tag lifecycle=Ec2Spot
, you can run the following to apply our spot-node-selector overlay.
kubectl apply -k https://github.com/aws/aws-node-termination-handler/config/overlays/spot-node-selector?ref=master
View the pods. There should be one for each spot node.
kubectl --namespace=kube-system get daemonsets