Install Karpenter on AWS EKS cluster

July 16, 2022


Note: Karpenter is designed to be cloud-provider agnostic but currently only supports AWS.

[Diagram: Pods using Karpenter to optimize capacity]


Karpenter deployment
Create an Amazon EKS cluster and node group, then set up Karpenter and deploy the Provisioner API.

1) Set the following environment variables:
export CLUSTER_NAME=karpenter-demo
export AWS_DEFAULT_REGION=us-west-2
AWS_ACCOUNT_ID=$(aws sts get-caller-identity --query Account --output text)

2) Create a cluster with eksctl.
The following example configuration specifies a basic cluster with one initial node and sets up an IAM OIDC provider for the cluster to enable IAM roles for Pods (IRSA). Note: For an existing EKS cluster, see Create an IAM OIDC provider for your cluster to determine whether you already have one or need to create one; a quick check is sketched below.
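For an existing cluster, a quick check might look like this (a sketch; it assumes the environment variables from step 1 and that your AWS CLI points at that cluster):
# Print the cluster's OIDC issuer URL
aws eks describe-cluster --name ${CLUSTER_NAME} --query "cluster.identity.oidc.issuer" --output text
# If no IAM OIDC provider matches that issuer, associate one with eksctl
eksctl utils associate-iam-oidc-provider --cluster ${CLUSTER_NAME} --approve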
eksctl create cluster -f - << EOF
---
apiVersion: eksctl.io/v1alpha5
kind: ClusterConfig
metadata:
  name: ${CLUSTER_NAME}
  region: ${AWS_DEFAULT_REGION}
  version: "1.21"
managedNodeGroups:
  - instanceType: m5.large
    amiFamily: AmazonLinux2
    name: ${CLUSTER_NAME}-ng
    desiredCapacity: 1
    minSize: 1
    maxSize: 2
iam:
  withOIDC: true
EOF
This returns:
2022-07-17 05:35:00 [ℹ]  eksctl version 0.105.0
2022-07-17 05:35:00 [ℹ]  using region us-west-2
2022-07-17 05:35:01 [ℹ]  setting availability zones to [us-west-2a us-west-2d us-west-2b]
2022-07-17 05:35:01 [ℹ]  subnets for us-west-2a - public:192.168.0.0/19 private:192.168.96.0/19
2022-07-17 05:35:01 [ℹ]  subnets for us-west-2d - public:192.168.32.0/19 private:192.168.128.0/19
2022-07-17 05:35:01 [ℹ]  subnets for us-west-2b - public:192.168.64.0/19 private:192.168.160.0/19
2022-07-17 05:35:01 [ℹ]  nodegroup "karpenter-demo-ng" will use "" [AmazonLinux2/1.21]
2022-07-17 05:35:01 [ℹ]  using Kubernetes version 1.21
2022-07-17 05:35:01 [ℹ]  creating EKS cluster "karpenter-demo" in "us-west-2" region with managed nodes
2022-07-17 05:35:01 [ℹ]  1 nodegroup (karpenter-demo-ng) was included (based on the include/exclude rules)
2022-07-17 05:35:01 [ℹ]  will create a CloudFormation stack for cluster itself and 0 nodegroup stack(s)
2022-07-17 05:35:01 [ℹ]  will create a CloudFormation stack for cluster itself and 1 managed nodegroup stack(s)
2022-07-17 05:35:01 [ℹ]  if you encounter any issues, check CloudFormation console or try 'eksctl utils describe-stacks --region=us-west-2 --cluster=karpenter-demo'
2022-07-17 05:35:01 [ℹ]  Kubernetes API endpoint access will use default of {publicAccess=true, privateAccess=false} for cluster "karpenter-demo" in "us-west-2"
2022-07-17 05:35:01 [ℹ]  CloudWatch logging will not be enabled for cluster "karpenter-demo" in "us-west-2"
2022-07-17 05:35:01 [ℹ]  you can enable it with 'eksctl utils update-cluster-logging --enable-types={SPECIFY-YOUR-LOG-TYPES-HERE (e.g. all)} --region=us-west-2 --cluster=karpenter-demo'
2022-07-17 05:35:01 [ℹ]
2 sequential tasks: { create cluster control plane "karpenter-demo",
    2 sequential sub-tasks: {
        4 sequential sub-tasks: {
            wait for control plane to become ready,
            associate IAM OIDC provider,
            2 sequential sub-tasks: {
                create IAM role for serviceaccount "kube-system/aws-node",
                create serviceaccount "kube-system/aws-node",
            },
            restart daemonset "kube-system/aws-node",
        },
        create managed nodegroup "karpenter-demo-ng",
    }
}
2022-07-17 05:35:01 [ℹ]  building cluster stack "eksctl-karpenter-demo-cluster"
2022-07-17 05:35:04 [ℹ]  deploying stack "eksctl-karpenter-demo-cluster"
2022-07-17 05:35:34 [ℹ]  waiting for CloudFormation stack "eksctl-karpenter-demo-cluster"
2022-07-17 05:36:05 [ℹ]  waiting for CloudFormation stack "eksctl-karpenter-demo-cluster"
2022-07-17 05:37:06 [ℹ]  waiting for CloudFormation stack "eksctl-karpenter-demo-cluster"
2022-07-17 05:38:07 [ℹ]  waiting for CloudFormation stack "eksctl-karpenter-demo-cluster"
2022-07-17 05:39:08 [ℹ]  waiting for CloudFormation stack "eksctl-karpenter-demo-cluster"
2022-07-17 05:40:09 [ℹ]  waiting for CloudFormation stack "eksctl-karpenter-demo-cluster"
2022-07-17 05:41:10 [ℹ]  waiting for CloudFormation stack "eksctl-karpenter-demo-cluster"
2022-07-17 05:42:11 [ℹ]  waiting for CloudFormation stack "eksctl-karpenter-demo-cluster"
2022-07-17 05:43:12 [ℹ]  waiting for CloudFormation stack "eksctl-karpenter-demo-cluster"
2022-07-17 05:44:13 [ℹ]  waiting for CloudFormation stack "eksctl-karpenter-demo-cluster"
2022-07-17 05:45:14 [ℹ]  waiting for CloudFormation stack "eksctl-karpenter-demo-cluster"
2022-07-17 05:47:23 [ℹ]  building iamserviceaccount stack "eksctl-karpenter-demo-addon-iamserviceaccount-kube-system-aws-node"
2022-07-17 05:47:24 [ℹ]  deploying stack "eksctl-karpenter-demo-addon-iamserviceaccount-kube-system-aws-node"
2022-07-17 05:47:25 [ℹ]  waiting for CloudFormation stack "eksctl-karpenter-demo-addon-iamserviceaccount-kube-system-aws-node"
2022-07-17 05:47:56 [ℹ]  waiting for CloudFormation stack "eksctl-karpenter-demo-addon-iamserviceaccount-kube-system-aws-node"
2022-07-17 05:47:57 [ℹ]  serviceaccount "kube-system/aws-node" already exists
2022-07-17 05:47:57 [ℹ]  updated serviceaccount "kube-system/aws-node"
2022-07-17 05:47:59 [ℹ]  daemonset "kube-system/aws-node" restarted
2022-07-17 05:48:01 [ℹ]  building managed nodegroup stack "eksctl-karpenter-demo-nodegroup-karpenter-demo-ng"
2022-07-17 05:48:01 [ℹ]  deploying stack "eksctl-karpenter-demo-nodegroup-karpenter-demo-ng"
2022-07-17 05:48:02 [ℹ]  waiting for CloudFormation stack "eksctl-karpenter-demo-nodegroup-karpenter-demo-ng"
2022-07-17 05:48:33 [ℹ]  waiting for CloudFormation stack "eksctl-karpenter-demo-nodegroup-karpenter-demo-ng"
2022-07-17 05:49:16 [ℹ]  waiting for CloudFormation stack "eksctl-karpenter-demo-nodegroup-karpenter-demo-ng"
2022-07-17 05:50:57 [ℹ]  waiting for CloudFormation stack "eksctl-karpenter-demo-nodegroup-karpenter-demo-ng"
2022-07-17 05:51:55 [ℹ]  waiting for CloudFormation stack "eksctl-karpenter-demo-nodegroup-karpenter-demo-ng"
2022-07-17 05:51:55 [ℹ]  waiting for the control plane availability...
2022-07-17 05:51:55 [✔]  saved kubeconfig as "/Users/l***u/.kube/config"
2022-07-17 05:51:55 [ℹ]  no tasks
2022-07-17 05:51:55 [✔]  all EKS cluster resources for "karpenter-demo" have been created
2022-07-17 05:51:56 [ℹ]  nodegroup "karpenter-demo-ng" has 1 node(s)
2022-07-17 05:51:56 [ℹ]  node "ip-192-168-19-125.us-west-2.compute.internal" is ready
2022-07-17 05:51:56 [ℹ]  waiting for at least 1 node(s) to become ready in "karpenter-demo-ng"
2022-07-17 05:51:56 [ℹ]  nodegroup "karpenter-demo-ng" has 1 node(s)
2022-07-17 05:51:56 [ℹ]  node "ip-192-168-19-125.us-west-2.compute.internal" is ready
2022-07-17 05:51:58 [ℹ]  kubectl command should work with "/Users/l***u/.kube/config", try 'kubectl get nodes'
2022-07-17 05:51:58 [✔]  EKS cluster "karpenter-demo" in "us-west-2" region is ready

Export the cluster endpoint for later use when installing the Karpenter Helm chart:
export CLUSTER_ENDPOINT="$(aws eks describe-cluster --name ${CLUSTER_NAME} --query "cluster.endpoint" --output text)"
echo $CLUSTER_ENDPOINT
https://3402****AADD.gr7.us-west-2.eks.amazonaws.com

Verify that the aws-node service account is annotated with the IAM role created by eksctl:
kubectl get serviceaccount -n kube-system aws-node -o yaml
apiVersion: v1
kind: ServiceAccount
metadata:
  annotations:
    eks.amazonaws.com/role-arn: arn:aws:iam::111122223333:role/eksctl-karpenter-demo-addon-iamserviceaccoun-Role1-OUQZSEK2KI49
    ...
  labels:
    app.kubernetes.io/managed-by: eksctl
  name: aws-node
  namespace: kube-system
  ...
secrets:
- name: aws-node-token-kvtfr

3) Tag the subnets with kubernetes.io/cluster/$CLUSTER_NAME.
Karpenter discovers subnets tagged with kubernetes.io/cluster/$CLUSTER_NAME. Add this tag to the subnets associated with your cluster: retrieve the private subnet IDs from the cluster stack and tag them with the cluster name.
SUBNET_IDS=$(aws cloudformation describe-stacks \
 --stack-name eksctl-${CLUSTER_NAME}-cluster \
 --query 'Stacks[].Outputs[?OutputKey==`SubnetsPrivate`].OutputValue' \
 --output text)

aws ec2 create-tags \
 --resources $(echo $SUBNET_IDS | tr ',' '\n') \
 --tags Key="kubernetes.io/cluster/${CLUSTER_NAME}",Value=
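To confirm the tag was applied, an optional sanity check is to list subnets by the tag key:
# Should return the private subnet IDs tagged in the previous command
aws ec2 describe-subnets \
 --filters "Name=tag-key,Values=kubernetes.io/cluster/${CLUSTER_NAME}" \
 --query 'Subnets[].SubnetId' --output text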

4) Create the Karpenter node IAM role and instance profile.
Kubernetes worker nodes launched by Karpenter must run with an EC2 instance profile that grants the permissions necessary to run containers and configure networking. In this setup the instance profile is named KarpenterNodeInstanceProfile-${ClusterName}; it is passed to Karpenter via the aws.defaultInstanceProfile Helm value in step 7.

TEMPOUT=$(mktemp)

curl -fsSL https://karpenter.sh/v0.13.2/getting-started/getting-started-with-eksctl/cloudformation.yaml > $TEMPOUT \
&& aws cloudformation deploy \
  --stack-name Karpenter-${CLUSTER_NAME} \
  --template-file ${TEMPOUT} \
  --capabilities CAPABILITY_NAMED_IAM \
  --parameter-overrides ClusterName=${CLUSTER_NAME}
This returns:
Waiting for changeset to be created..
Waiting for stack create/update to complete
Successfully created/updated stack - Karpenter-karpenter-demo
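Optionally, verify that the stack created the instance profile and attached the node role (the query below should return KarpenterNodeRole-karpenter-demo):
aws iam get-instance-profile \
 --instance-profile-name KarpenterNodeInstanceProfile-${CLUSTER_NAME} \
 --query 'InstanceProfile.Roles[].RoleName' --output text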

For reference, the content of cloudformation.yaml:
AWSTemplateFormatVersion: "2010-09-09"
Description: Resources used by https://github.com/aws/karpenter
Parameters:
  ClusterName:
    Type: String
    Description: "EKS cluster name"
Resources:
  KarpenterNodeInstanceProfile:
    Type: "AWS::IAM::InstanceProfile"
    Properties:
      InstanceProfileName: !Sub "KarpenterNodeInstanceProfile-${ClusterName}"
      Path: "/"
      Roles:
        - Ref: "KarpenterNodeRole"
  KarpenterNodeRole:
    Type: "AWS::IAM::Role"
    Properties:
      RoleName: !Sub "KarpenterNodeRole-${ClusterName}"
      Path: /
      AssumeRolePolicyDocument:
        Version: "2012-10-17"
        Statement:
          - Effect: Allow
            Principal:
              Service:
                !Sub "ec2.${AWS::URLSuffix}"
            Action:
              - "sts:AssumeRole"
      ManagedPolicyArns:
        - !Sub "arn:${AWS::Partition}:iam::aws:policy/AmazonEKS_CNI_Policy"
        - !Sub "arn:${AWS::Partition}:iam::aws:policy/AmazonEKSWorkerNodePolicy"
        - !Sub "arn:${AWS::Partition}:iam::aws:policy/AmazonEC2ContainerRegistryReadOnly"
        - !Sub "arn:${AWS::Partition}:iam::aws:policy/AmazonSSMManagedInstanceCore"
  KarpenterControllerPolicy:
    Type: AWS::IAM::ManagedPolicy
    Properties:
      ManagedPolicyName: !Sub "KarpenterControllerPolicy-${ClusterName}"
      PolicyDocument:
        Version: "2012-10-17"
        Statement:
          - Effect: Allow
            Resource: "*"
            Action:
              # Write Operations
              - ec2:CreateLaunchTemplate
              - ec2:CreateFleet
              - ec2:RunInstances
              - ec2:CreateTags
              - iam:PassRole
              - ec2:TerminateInstances
              - ec2:DeleteLaunchTemplate
              # Read Operations
              - ec2:DescribeLaunchTemplates
              - ec2:DescribeInstances
              - ec2:DescribeSecurityGroups
              - ec2:DescribeSubnets
              - ec2:DescribeInstanceTypes
              - ec2:DescribeInstanceTypeOfferings
              - ec2:DescribeAvailabilityZones
              - ec2:DescribeSpotPriceHistory
              - ssm:GetParameter
              - pricing:GetProducts

5) Grant instances that use this profile permission to connect to the cluster by adding the Karpenter node role to your aws-auth ConfigMap.
eksctl create iamidentitymapping \
 --username system:node:{{EC2PrivateDNSName}} \
 --cluster ${CLUSTER_NAME} \
 --arn arn:aws:iam::${AWS_ACCOUNT_ID}:role/KarpenterNodeRole-${CLUSTER_NAME} \
 --group system:bootstrappers \
 --group system:nodes
This returns:
2022-07-16 21:49:52 [ℹ]  adding identity "arn:aws:iam::111122223333:role/KarpenterNodeRole-karpenter-demo" to auth ConfigMap

This should update the aws-auth ConfigMap, which you can inspect with kubectl describe configmap aws-auth -n kube-system:
Name:         aws-auth
Namespace:    kube-system
Labels:       <none>
Annotations:  <none>
Data
====
mapRoles:
----
- groups:
  - system:bootstrappers
  - system:nodes
  rolearn: arn:aws:iam::123456789012:role/eksctl-karpenter-demo-nodegroup-k-NodeInstanceRole-9BVA46MVZMRO
  username: system:node:{{EC2PrivateDNSName}}
- groups:
  - system:bootstrappers
  - system:nodes
  rolearn: arn:aws:iam::123456789012:role/KarpenterNodeRole-karpenter-demo
  username: system:node:{{EC2PrivateDNSName}}
mapUsers:
----
[]
BinaryData
====
Events:  <none>


6) Create the KarpenterController IAM role. Karpenter needs permissions such as launching and terminating instances. The following command creates an AWS IAM role for the controller; because of --role-only, the Kubernetes service account itself is created later by the Karpenter Helm chart and annotated with this role's ARN (IRSA).
eksctl create iamserviceaccount \
  --cluster "${CLUSTER_NAME}" --name karpenter --namespace karpenter \
  --role-name "${CLUSTER_NAME}-karpenter" \
  --attach-policy-arn "arn:aws:iam::${AWS_ACCOUNT_ID}:policy/KarpenterControllerPolicy-${CLUSTER_NAME}" \
  --role-only \
  --approve
This returns:
2022-07-16 21:50:55 [ℹ]  1 existing iamserviceaccount(s) (kube-system/aws-node) will be excluded
2022-07-16 21:50:55 [ℹ]  1 iamserviceaccount (karpenter/karpenter) was included (based on the include/exclude rules)
2022-07-16 21:50:55 [!]  serviceaccounts that exist in Kubernetes will be excluded, use --override-existing-serviceaccounts to override
2022-07-16 21:50:55 [ℹ]  1 task: { create IAM role for serviceaccount "karpenter/karpenter" }
2022-07-16 21:50:55 [ℹ]  building iamserviceaccount stack "eksctl-karpenter-demo-addon-iamserviceaccount-karpenter-karpenter"
2022-07-16 21:50:55 [ℹ]  deploying stack "eksctl-karpenter-demo-addon-iamserviceaccount-karpenter-karpenter"
2022-07-16 21:50:56 [ℹ]  waiting for CloudFormation stack "eksctl-karpenter-demo-addon-iamserviceaccount-karpenter-karpenter"
2022-07-16 21:51:27 [ℹ]  waiting for CloudFormation stack "eksctl-karpenter-demo-addon-iamserviceaccount-karpenter-karpenter"
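Before installing the Helm chart, you can confirm that the role was created; its ARN is what we export as KARPENTER_IAM_ROLE_ARN in the next step:
aws iam get-role --role-name "${CLUSTER_NAME}-karpenter" --query 'Role.Arn' --output text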

7) Install Karpenter Helm chart.
export KARPENTER_IAM_ROLE_ARN="arn:aws:iam::${AWS_ACCOUNT_ID}:role/${CLUSTER_NAME}-karpenter"

Add the Karpenter chart repository, if you haven't already done so:
helm repo add karpenter https://charts.karpenter.sh
"karpenter" has been added to your repositories

Update the locally cached chart information from the repositories:
helm repo update
Hang tight while we grab the latest from your chart repositories...
...Successfully got an update from the "karpenter" chart repository
...Successfully got an update from the "cilium" chart repository
Update Complete. ⎈Happy Helming!⎈

helm upgrade karpenter karpenter/karpenter --install --namespace karpenter \
  --create-namespace --version v0.13.2 \
  --set serviceAccount.annotations."eks\.amazonaws\.com/role-arn"=${KARPENTER_IAM_ROLE_ARN} \
  --set clusterName=${CLUSTER_NAME} \
  --set clusterEndpoint=${CLUSTER_ENDPOINT} \
  --set aws.defaultInstanceProfile=KarpenterNodeInstanceProfile-${CLUSTER_NAME} \
  --wait # for the defaulting webhook to install before creating a Provisioner
This returns:
Release "karpenter" does not exist. Installing it now.
NAME: karpenter
LAST DEPLOYED: Sat Jul 16 22:01:24 2022
NAMESPACE: karpenter
STATUS: deployed
REVISION: 1
TEST SUITE: None
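It is worth confirming that the Karpenter controller is up and ready before creating a Provisioner:
kubectl get pods -n karpenter
kubectl get deployment -n karpenter karpenter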

8) (Optional) Enable debug logging.
kubectl patch configmap config-logging -n karpenter --patch '{"data":{"loglevel.controller":"debug"}}'
configmap/config-logging patched

9) Deploy the Provisioner by applying the following spec. It layers requirements for capacity type (Spot and On-Demand) and CPU architecture (arm64 and amd64); taints can be added the same way for GPU-based use cases. Application Pods then narrow these constraints further with their own scheduling constraints.
cat <<EOF | kubectl apply -f -
apiVersion: karpenter.sh/v1alpha5
kind: Provisioner
metadata:
  name: default
spec:
  requirements:
    - key: "karpenter.sh/capacity-type"
      operator: In
      values: ["spot", "on-demand"]
    - key: "kubernetes.io/arch" 
      operator: In
      values: ["arm64", "amd64"]
  limits:
    resources:
      cpu: 1000
  provider:    
    subnetSelector:
      kubernetes.io/cluster/$CLUSTER_NAME: '*'
    securityGroupSelector:
      kubernetes.io/cluster/$CLUSTER_NAME: '*'  
  ttlSecondsAfterEmpty: 30
EOF
This returns:
provisioner.karpenter.sh/default created
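You can inspect the resulting object to confirm the requirements and provider selectors were accepted:
kubectl get provisioners
kubectl describe provisioner default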

10) Run the application deployment on specific capacity, instance type, hardware architecture, and Availability Zone using Pod scheduling constraints.


Sample deployment
In the following sample deployment, we use nodeSelector to pin the Pods to a specific Availability Zone (topology.kubernetes.io/zone), on-demand capacity (karpenter.sh/capacity-type: on-demand), the arm64 architecture (kubernetes.io/arch: arm64), and a specific instance type (node.kubernetes.io/instance-type), so that Karpenter launches new nodes satisfying these Pod scheduling constraints.
cat <<EOF | kubectl apply -f -
apiVersion: apps/v1
kind: Deployment
metadata:
  name: inflate
spec:
  replicas: 0
  selector:
    matchLabels:
      app: inflate
  template:
    metadata:
      labels:
        app: inflate
    spec:
      nodeSelector:
        node.kubernetes.io/instance-type: t4g.2xlarge
        karpenter.sh/capacity-type: on-demand
        topology.kubernetes.io/zone: us-west-2a
        kubernetes.io/arch: arm64
      terminationGracePeriodSeconds: 0
      containers:
        - name: inflate
          image: public.ecr.aws/eks-distro/kubernetes/pause:3.5
          resources:
            requests:
              cpu: 1
EOF
Or, using a different instance type (r6gd.xlarge):
cat <<EOF | kubectl apply -f -
apiVersion: apps/v1
kind: Deployment
metadata:
  name: inflate
spec:
  replicas: 0
  selector:
    matchLabels:
      app: inflate
  template:
    metadata:
      labels:
        app: inflate
    spec:
      nodeSelector:
        node.kubernetes.io/instance-type: r6gd.xlarge
        karpenter.sh/capacity-type: on-demand
        topology.kubernetes.io/zone: us-west-2a
        kubernetes.io/arch: arm64
      terminationGracePeriodSeconds: 0
      containers:
        - name: inflate
          image: public.ecr.aws/eks-distro/kubernetes/pause:3.5
          resources:
            requests:
              cpu: 1
EOF
This returns:
deployment.apps/inflate created

1) Scale the above deployment.
kubectl scale deployment inflate --replicas 3
deployment.apps/inflate scaled

2) Review the Karpenter Pod logs for events and more details.
kubectl logs -f -n karpenter -l app.kubernetes.io/name=karpenter -c controller
2022-07-16T14:08:17.805Z	DEBUG	controller.provisioning	Discovered subnets: [subnet-0b946c0c2c8c19b45 (us-west-2a) subnet-0417d1570d137d12e (us-west-2d) subnet-02fd9368eedb5e2e2 (us-west-2b)]	{"commit": "fd19ba2", "provisioner": "default"}
2022-07-16T14:08:17.913Z	INFO	controller.provisioning	Computed packing of 1 node(s) for 3 pod(s) with instance type option(s) [r6gd.xlarge]	{"commit": "fd19ba2", "provisioner": "default"}
2022-07-16T14:08:18.091Z	DEBUG	controller.provisioning	Discovered security groups: [sg-054cafe14bfa6f85d]	{"commit": "fd19ba2", "provisioner": "default"}
2022-07-16T14:08:18.094Z	DEBUG	controller.provisioning	Discovered kubernetes version 1.22	{"commit": "fd19ba2", "provisioner": "default"}
2022-07-16T14:08:18.177Z	DEBUG	controller.provisioning	Discovered ami-0bea0f4e1d10dcca7 for query /aws/service/eks/optimized-ami/1.22/amazon-linux-2-arm64/recommended/image_id	{"commit": "fd19ba2", "provisioner": "default"}
2022-07-16T14:08:18.177Z	DEBUG	controller.provisioning	Discovered caBundle, length 1099	{"commit": "fd19ba2", "provisioner": "default"}
2022-07-16T14:08:18.406Z	DEBUG	controller.provisioning	Created launch template, Karpenter-karpenter-demo-9378935887162504259	{"commit": "fd19ba2", "provisioner": "default"}
2022-07-16T14:08:20.636Z	INFO	controller.provisioning	Launched instance: i-02a1b104a0ad0356e, hostname: ip-192-168-139-148.us-west-2.compute.internal, type: r6gd.xlarge, zone: us-west-2a, capacityType: on-demand	{"commit": "fd19ba2", "provisioner": "default"}
2022-07-16T14:08:20.676Z	INFO	controller.provisioning	Bound 3 pod(s) to node ip-192-168-139-148.us-west-2.compute.internal	{"commit": "fd19ba2", "provisioner": "default"}
2022-07-16T14:08:20.676Z	INFO	controller.provisioning	Waiting for unschedulable pods	{"commit": "fd19ba2", "provisioner": "default"}
2022-07-16T14:09:19.830Z	DEBUG	controller.provisioning	Discovered 408 EC2 instance types	{"commit": "fd19ba2", "provisioner": "default"}
2022-07-16T14:09:19.987Z	DEBUG	controller.provisioning	Discovered subnets: [subnet-0b946c0c2c8c19b45 (us-west-2a) subnet-0417d1570d137d12e (us-west-2d) subnet-02fd9368eedb5e2e2 (us-west-2b)]	{"commit": "fd19ba2", "provisioner": "default"}
2022-07-16T14:09:20.141Z	DEBUG	controller.provisioning	Discovered EC2 instance types zonal offerings	{"commit": "fd19ba2", "provisioner": "default"}
---
2022-07-17T00:17:43.818Z	DEBUG	controller.events	Normal	{"commit": "062a029", "object": {"kind":"Pod","namespace":"default","name":"inflate-f9587f7c6-78l7k","uid":"6af76de8-ab74-4c32-8283-3aa38994982e","apiVersion":"v1","resourceVersion":"30364"}, "reason": "NominatePod", "message": "Pod should schedule on ip-192-168-111-244.us-west-2.compute.internal"}
2022-07-17T00:17:43.818Z	DEBUG	controller.events	Normal	{"commit": "062a029", "object": {"kind":"Pod","namespace":"default","name":"inflate-f9587f7c6-f46tx","uid":"d66e590a-4692-448c-8e2d-07b246692d80","apiVersion":"v1","resourceVersion":"30370"}, "reason": "NominatePod", "message": "Pod should schedule on ip-192-168-111-244.us-west-2.compute.internal"}
2022-07-17T00:17:43.818Z	DEBUG	controller.events	Normal	{"commit": "062a029", "object": {"kind":"Pod","namespace":"default","name":"inflate-f9587f7c6-4gpfr","uid":"9fd9ac38-cb52-4cd4-9aae-9e9103e7c426","apiVersion":"v1","resourceVersion":"30373"}, "reason": "NominatePod", "message": "Pod should schedule on ip-192-168-111-244.us-west-2.compute.internal"}
2022-07-17T00:18:35.999Z	INFO	controller.node	Added TTL to empty node	{"commit": "062a029", "node": "ip-192-168-111-244.us-west-2.compute.internal"}
2022-07-17T00:18:36.038Z	INFO	controller.node	Added TTL to empty node	{"commit": "062a029", "node": "ip-192-168-111-244.us-west-2.compute.internal"}
2022-07-17T00:18:45.076Z	DEBUG	controller.events	Normal	{"commit": "062a029", "object": {"kind":"Pod","namespace":"default","name":"inflate-f9587f7c6-78l7k","uid":"6af76de8-ab74-4c32-8283-3aa38994982e","apiVersion":"v1","resourceVersion":"30691"}, "reason": "NominatePod", "message": "Pod should schedule on ip-192-168-111-244.us-west-2.compute.internal"}
2022-07-17T00:18:45.076Z	DEBUG	controller.events	Normal	{"commit": "062a029", "object": {"kind":"Pod","namespace":"default","name":"inflate-f9587f7c6-f46tx","uid":"d66e590a-4692-448c-8e2d-07b246692d80","apiVersion":"v1","resourceVersion":"30693"}, "reason": "NominatePod", "message": "Pod should schedule on ip-192-168-111-244.us-west-2.compute.internal"}
2022-07-17T00:18:45.076Z	DEBUG	controller.events	Normal	{"commit": "062a029", "object": {"kind":"Pod","namespace":"default","name":"inflate-f9587f7c6-4gpfr","uid":"9fd9ac38-cb52-4cd4-9aae-9e9103e7c426","apiVersion":"v1","resourceVersion":"30695"}, "reason": "NominatePod", "message": "Pod should schedule on ip-192-168-111-244.us-west-2.compute.internal"}
2022-07-17T00:18:46.950Z	INFO	controller.node	Removed emptiness TTL from node	{"commit": "062a029", "node": "ip-192-168-111-244.us-west-2.compute.internal"}
2022-07-17T00:18:46.970Z	INFO	controller.node	Removed emptiness TTL from node	{"commit": "062a029", "node": "ip-192-168-111-244.us-west-2.compute.internal"}
2022-07-17T00:22:44.900Z	DEBUG	controller.node-state	Discovered 539 EC2 instance types	{"commit": "062a029", "node": "ip-192-168-111-244.us-west-2.compute.internal"}
2022-07-17T00:22:45.078Z	DEBUG	controller.node-state	Discovered subnets: [subnet-0f6ba29c24dd48d8c (us-west-2d) subnet-01b827461efe10187 (us-west-2a) subnet-044338d5111ed565b (us-west-2b)]	{"commit": "062a029", "node": "ip-192-168-111-244.us-west-2.compute.internal"}
2022-07-17T00:22:45.228Z	DEBUG	controller.node-state	Discovered EC2 instance types zonal offerings	{"commit": "062a029", "node": "ip-192-168-111-244.us-west-2.compute.internal"}

3) Validate the nodes and application Pods with the following commands; the Pods should be in the Running state.
kubectl get node -L node.kubernetes.io/instance-type,kubernetes.io/arch,karpenter.sh/capacity-type
NAME                                            STATUS   ROLES    AGE     VERSION                INSTANCE-TYPE   ARCH    CAPACITY-TYPE
ip-192-168-111-244.us-west-2.compute.internal   Ready    <none>   6m15s   v1.21.12-eks-5308cf7   t4g.2xlarge     arm64   on-demand
ip-192-168-19-125.us-west-2.compute.internal    Ready    <none>   153m    v1.21.12-eks-5308cf7   m5.large        amd64

kubectl get pods -o wide
NAME                      READY   STATUS    RESTARTS   AGE     IP                NODE                                            NOMINATED NODE   READINESS GATES
inflate-f9587f7c6-4gpfr   1/1     Running   0          8m59s   192.168.116.156   ip-192-168-111-244.us-west-2.compute.internal   <none>           <none>
inflate-f9587f7c6-78l7k   1/1     Running   0          8m59s   192.168.102.188   ip-192-168-111-244.us-west-2.compute.internal   <none>           <none>
inflate-f9587f7c6-f46tx   1/1     Running   0          8m59s   192.168.117.10    ip-192-168-111-244.us-west-2.compute.internal   <none>           <none>
We can see that Karpenter applied the layered constraints and launched a node satisfying the workload's scheduling constraints: instance type, a specific Availability Zone, and hardware architecture.


Groupless node upgrades
When using node groups (self-managed or managed) with an EKS cluster, upgrading worker nodes to a newer Kubernetes version means either migrating to a new node group (self-managed) or rolling out a new Auto Scaling group of worker nodes (managed), as described in the managed node group update behavior. With Karpenter's groupless autoscaling, node upgrades instead rely on the node expiry time-to-live value.

The Karpenter Provisioner API supports node expiry: a node expires once it reaches the configured time-to-live value (ttlSecondsUntilExpired). The same setting can be used to upgrade nodes: after the configured period, nodes are terminated and replaced with newer nodes.

Note: Karpenter supports using custom launch templates. When using a custom launch template, you are taking responsibility for maintaining the launch template, including updating which AMI is used (that is, for security updates). In the default configuration, Karpenter will use the latest version of the EKS optimized AMI, which is maintained by AWS.

1) Validate the current EKS cluster Kubernetes version with the following command.

aws eks describe-cluster --name ${CLUSTER_NAME} | grep -i version
        "version": "1.22",
        "platformVersion": "eks.4",
            "alpha.eksctl.io/eksctl-version": "0.105.0",

2) Deploy a PodDisruptionBudget (PDB) for your application deployment. A PDB limits how many Pods of a replicated application can be down simultaneously due to voluntary disruptions.
cat <<EOF | kubectl apply -f -
apiVersion: policy/v1beta1
kind: PodDisruptionBudget
metadata:
  name: inflate-pdb
spec:
  minAvailable: 2
  selector:
    matchLabels:
      app: inflate
EOF
This returns:
Warning: policy/v1beta1 PodDisruptionBudget is deprecated in v1.21+, unavailable in v1.25+; use policy/v1 PodDisruptionBudget
poddisruptionbudget.policy/inflate-pdb created
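As the deprecation warning indicates, on Kubernetes 1.21 and later the stable policy/v1 API can be used instead; an equivalent manifest would look like this (a sketch with the same selector and minAvailable):
cat <<EOF | kubectl apply -f -
apiVersion: policy/v1
kind: PodDisruptionBudget
metadata:
  name: inflate-pdb
spec:
  minAvailable: 2
  selector:
    matchLabels:
      app: inflate
EOF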

kubectl get pdb
NAME          MIN AVAILABLE   MAX UNAVAILABLE   ALLOWED DISRUPTIONS   AGE
inflate-pdb   2               N/A               1                     64s

kubectl get deploy inflate
NAME      READY   UP-TO-DATE   AVAILABLE   AGE
inflate   3/3     3            3           8m37s

3) Upgrade the EKS cluster to a newer Kubernetes version; one way to do this with eksctl is sketched below. Afterwards, we can see that the cluster was upgraded successfully to 1.22.
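The upgrade command itself might look like this (without --approve, eksctl only prints the upgrade plan; the managed node group is upgraded separately):
eksctl upgrade cluster --name ${CLUSTER_NAME} --version 1.22 --approve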
aws eks describe-cluster --name ${CLUSTER_NAME} | grep -i version
        "version": "1.22",
        "platformVersion": "eks.4",
            "alpha.eksctl.io/eksctl-version": "0.105.0",

4) Checking the workload and the node Karpenter created earlier: nodes launched before the control-plane upgrade keep running the EKS optimized AMI that matched the cluster version at the time they were launched; Karpenter does not upgrade existing nodes in place.
kubectl get node -L node.kubernetes.io/instance-type,kubernetes.io/arch,karpenter.sh/capacity-type
NAME                                            STATUS   ROLES    AGE     VERSION               INSTANCE-TYPE   ARCH    CAPACITY-TYPE
ip-192-168-139-148.us-west-2.compute.internal   Ready    <none>   7m47s   v1.22.9-eks-810597c   r6gd.xlarge     arm64   on-demand
ip-192-168-28-196.us-west-2.compute.internal    Ready    <none>   35m     v1.22.9-eks-810597c   m5.large        amd64

kubectl get pods -o wide
NAME                       READY   STATUS    RESTARTS   AGE     IP                NODE                                            NOMINATED NODE   READINESS GATES
inflate-599c98dd86-dfmts   1/1     Running   0          8m36s   192.168.148.13    ip-192-168-139-148.us-west-2.compute.internal   <none>           <none>
inflate-599c98dd86-jkjss   1/1     Running   0          8m36s   192.168.147.112   ip-192-168-139-148.us-west-2.compute.internal   <none>           <none>
inflate-599c98dd86-jzznj   1/1     Running   0          8m36s   192.168.140.10    ip-192-168-139-148.us-west-2.compute.internal   <none>           <none>

5) Now, let's reconfigure the Karpenter Provisioner and append ttlSecondsUntilExpired. This enables node expiry, which lets existing nodes be terminated and replaced with new ones matching the current EKS cluster Kubernetes version (1.22).
cat <<EOF | kubectl apply -f -
apiVersion: karpenter.sh/v1alpha5
kind: Provisioner
metadata:
  name: default
spec:
  requirements:
    - key: "karpenter.sh/capacity-type"
      operator: In
      values: ["spot", "on-demand"]
    - key: "kubernetes.io/arch" 
      operator: In
      values: ["arm64", "amd64"]
  limits:
    resources:
      cpu: 1000
  provider:    
    subnetSelector:
      kubernetes.io/cluster/$CLUSTER_NAME: '*'
    securityGroupSelector:
      kubernetes.io/cluster/$CLUSTER_NAME: '*'  
  ttlSecondsAfterEmpty: 30
  ttlSecondsUntilExpired: 1800
EOF
This returns:
provisioner.karpenter.sh/default configured
Note: If ttlSecondsUntilExpired is nil (unset), the feature is disabled and nodes will never expire. As an example value, node expiry can be set to 30 days with ttlSecondsUntilExpired: 2592000 (60 * 60 * 24 * 30 seconds).
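If you prefer not to re-apply the full spec, the same field can be set with a merge patch on the existing Provisioner (a sketch using the 30-day value):
kubectl patch provisioner default --type merge -p '{"spec":{"ttlSecondsUntilExpired":2592000}}'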

6) Review the Karpenter pod logs for events and more details.

kubectl logs -f -n karpenter -l app.kubernetes.io/name=karpenter -c controller
2022-07-16T14:09:20.141Z	DEBUG	controller.provisioning	Discovered EC2 instance types zonal offerings	{"commit": "fd19ba2", "provisioner": "default"}
2022-07-16T14:11:45.805Z	DEBUG	controller.aws.launchtemplate	Deleted launch template lt-01814ee28aa5a0998	{"commit": "fd19ba2"}
2022-07-16T14:14:21.078Z	DEBUG	controller.provisioning	Discovered 408 EC2 instance types	{"commit": "fd19ba2", "provisioner": "default"}
2022-07-16T14:14:21.189Z	DEBUG	controller.provisioning	Discovered subnets: [subnet-0b946c0c2c8c19b45 (us-west-2a) subnet-0417d1570d137d12e (us-west-2d) subnet-02fd9368eedb5e2e2 (us-west-2b)]	{"commit": "fd19ba2", "provisioner": "default"}
2022-07-16T14:14:21.349Z	DEBUG	controller.provisioning	Discovered EC2 instance types zonal offerings	{"commit": "fd19ba2", "provisioner": "default"}
2022-07-16T14:17:39.163Z	DEBUG	controller.provisioning	Discovered subnets: [subnet-0b946c0c2c8c19b45 (us-west-2a) subnet-0417d1570d137d12e (us-west-2d) subnet-02fd9368eedb5e2e2 (us-west-2b)]	{"commit": "fd19ba2", "provisioner": "default"}
2022-07-16T14:17:39.166Z	INFO	controller.provisioning	Waiting for unschedulable pods	{"commit": "fd19ba2", "provisioner": "default"}
2022-07-16T14:19:22.370Z	DEBUG	controller.provisioning	Discovered 408 EC2 instance types	{"commit": "fd19ba2", "provisioner": "default"}
2022-07-16T14:19:22.417Z	DEBUG	controller.provisioning	Discovered subnets: [subnet-0b946c0c2c8c19b45 (us-west-2a) subnet-0417d1570d137d12e (us-west-2d) subnet-02fd9368eedb5e2e2 (us-west-2b)]	{"commit": "fd19ba2", "provisioner": "default"}
2022-07-16T14:19:22.559Z	DEBUG	controller.provisioning	Discovered EC2 instance types zonal offerings	{"commit": "fd19ba2", "provisioner": "default"}
Note: During this process, Karpenter respects the PodDisruptionBudget, discovers the current Kubernetes version, uses the latest EKS optimized AMI for that version, and launches a new node for the workload. The old node is then cordoned, drained, and deleted by Karpenter.

If we validate the nodes and application Pods with the following commands, we can see that the Karpenter-launched node now runs the same Kubernetes version as the EKS cluster (1.22).

kubectl get node -L node.kubernetes.io/instance-type,kubernetes.io/arch,karpenter.sh/capacity-type
NAME                                            STATUS   ROLES    AGE   VERSION               INSTANCE-TYPE   ARCH    CAPACITY-TYPE
ip-192-168-139-148.us-west-2.compute.internal   Ready    <none>   13m   v1.22.9-eks-810597c   r6gd.xlarge     arm64   on-demand
ip-192-168-28-196.us-west-2.compute.internal    Ready    <none>   41m   v1.22.9-eks-810597c   m5.large        amd64

kubectl get pods -o wide
NAME                       READY   STATUS    RESTARTS   AGE   IP                NODE                                            NOMINATED NODE   READINESS GATES
inflate-599c98dd86-dfmts   1/1     Running   0          15m   192.168.148.13    ip-192-168-139-148.us-west-2.compute.internal   <none>           <none>
inflate-599c98dd86-jkjss   1/1     Running   0          15m   192.168.147.112   ip-192-168-139-148.us-west-2.compute.internal   <none>           <none>
inflate-599c98dd86-jzznj   1/1     Running   0          15m   192.168.140.10    ip-192-168-139-148.us-west-2.compute.internal   <none>           <none>
This demonstration shows that Karpenter respects the PDB and that node expiry can be used to upgrade or repack nodes launched by Karpenter: nodes are retired after the TTL and replaced with updated versions. See How Karpenter nodes are deprovisioned in the Karpenter documentation for details on ttlSecondsUntilExpired and ttlSecondsAfterEmpty.


Cleanup
Delete the Provisioner resources that were created.
kubectl delete provisioner default
provisioner.karpenter.sh "default" deleted
Remove Karpenter and delete the infrastructure from your AWS account.
helm uninstall karpenter --namespace karpenter
release "karpenter" uninstalled

eksctl delete iamserviceaccount --cluster ${CLUSTER_NAME} --name karpenter --namespace karpenter
2022-07-16 22:25:55 [ℹ]  1 iamserviceaccount (karpenter/karpenter) was included (based on the include/exclude rules)
2022-07-16 22:25:59 [ℹ]  1 task: {
    2 sequential sub-tasks: {
        delete IAM role for serviceaccount "karpenter/karpenter" [async],
        delete serviceaccount "karpenter/karpenter",
    } }
2022-07-16 22:25:59 [ℹ]  will delete stack "eksctl-karpenter-demo-addon-iamserviceaccount-karpenter-karpenter"
2022-07-16 22:26:00 [ℹ]  serviceaccount "karpenter/karpenter" was already deleted

aws cloudformation delete-stack --stack-name Karpenter-${CLUSTER_NAME}

aws ec2 describe-launch-templates \
    | jq -r ".LaunchTemplates[].LaunchTemplateName" \
    | grep -i Karpenter-${CLUSTER_NAME} \
    | xargs -I{} aws ec2 delete-launch-template --launch-template-name {}

eksctl delete cluster --name ${CLUSTER_NAME}


Conclusion
Karpenter can scale nodes quickly and with very little latency. In this post, we demonstrated how nodes can be scaled for different use cases with the Provisioner API, leveraging well-known Kubernetes labels and taints and using Pod scheduling constraints in the deployment so that Pods land on Karpenter-provisioned nodes. This lets us run different types of workloads, each with its own capacity and requirements. We also saw the upgrade behavior for nodes launched by Karpenter by enabling the node expiry time ttlSecondsUntilExpired in the Provisioner API.




References

Managing Pod Scheduling Constraints and Groupless Node Upgrades with Karpenter in Amazon EKS

Introducing Karpenter – An Open-Source High-Performance Kubernetes Cluster Autoscaler

Karpenter in Practice for Kubernetes Node Autoscaling: Deploying a GPU Inference Application (Chinese)

Karpenter: An Open-Source High-Performance Kubernetes Cluster Autoscaler (Chinese WeChat article)


Category: container Tags: public
