Increasing the number of available IP addresses for EC2 nodes

January 13, 2024

This blog post explains how I significantly increased the number of IP addresses that nodes can assign to Pods by assigning IP prefixes, rather than individual secondary IP addresses, to K8s worker nodes.

The procedure, and thus this content, is based on the AWS documentation and tailored to my environment. This means the post is not comprehensive guidance for all scenarios. For the overall picture, please consult the AWS documentation. The references section at the bottom of this page lists a few of the documents I referred to while performing this task.

Restriction

Each EC2 instance has a maximum number of ENIs and a maximum number of IP addresses that can be assigned to each network interface.
Each node consumes one IP address from each network interface.
The remaining IP addresses can be assigned to Pods.
In my case, the worker nodes have spare compute and memory resources but can't accommodate additional Pods, because they have run out of IP addresses to assign to Pods.
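As a rough illustration of this limit, the sketch below applies the formula AWS documents for individual secondary IP addresses: ENIs * (IPs per ENI - 1) + 2. The function name is my own, and the t4g.medium limits (3 network interfaces, 6 IPv4 addresses each) are the values I understand the EC2 instance type limits to be, so verify them for your instance type.

```shell
# max_pods_secondary_ip ENIS IPS_PER_ENI   (hypothetical helper name)
# Each ENI keeps one address for the node itself. The +2 accounts for the
# host-network Pods (aws-node and kube-proxy), which do not need a Pod IP.
max_pods_secondary_ip() {
  echo $(( $1 * ($2 - 1) + 2 ))
}

# t4g.medium (assumed limits): 3 ENIs, 6 IPv4 addresses per ENI
max_pods_secondary_ip 3 6   # prints 17
```

With only 17 Pods per node, a node can run out of IP addresses long before it runs out of CPU or memory.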

A prefix covers a block of IP addresses (for IPv4, a /28 block of 16 addresses). If an EKS cluster is not configured for IP prefix assignment, it must make more EC2 API calls to configure the network interfaces and IP addresses necessary for Pod connectivity.
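To see what prefix assignment buys, here is a small sketch of the capacity arithmetic, assuming each secondary address slot on an ENI holds a /28 prefix (16 addresses) instead of a single address. The helper name is mine, the t4g.medium limits (3 ENIs, 6 slots per ENI) are assumed as above, and the cap of 110 is the default maximum mentioned later in this post.

```shell
PREFIX_SIZE=16   # a /28 IPv4 prefix holds 2^(32-28) = 16 addresses

# max_pods_prefix ENIS SLOTS_PER_ENI CAP   (hypothetical helper name)
max_pods_prefix() {
  raw=$(( $1 * ($2 - 1) * $PREFIX_SIZE + 2 ))
  # Never advertise more than the cap, even if the addresses allow it.
  if [ "$raw" -gt "$3" ]; then
    echo "$3"
  else
    echo "$raw"
  fi
}

max_pods_prefix 3 6 110   # raw capacity is 242, so this prints 110
```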

Considerations

AWS suggests considering the following factors:
- Each Amazon EC2 instance type supports a maximum number of Pods. If an EKS node group consists of multiple instance types, the smallest number of maximum Pods for an instance in the cluster is applied to all nodes in the cluster. In my case, I only use t4g.medium, so this point does not apply.
- By default, the maximum number of Pods that you can run on a node is 110. This number could be changed. If you change the number and have an existing managed node group, the next AMI or launch template update of your node group results in new nodes coming up with the changed value.
- When transitioning from assigning IP addresses to assigning IP prefixes, AWS recommends that you create new node groups to increase the number of available IP addresses, rather than doing a rolling replacement of existing nodes. Running Pods on a node that has both IP addresses and prefixes assigned can lead to inconsistency in the advertised IP address capacity, impacting future workloads on the node. For the recommended way of performing the transition, see Replace all nodes during migration from Secondary IP mode to Prefix Delegation mode or vice versa in the Amazon EKS best practices guide. In my case, I use a launch template to perform the transition, so the procedure follows the AWS recommendation.
- (For clusters with Linux nodes only) If you're also using security groups for Pods, with POD_SECURITY_GROUP_ENFORCING_MODE=standard and AWS_VPC_K8S_CNI_EXTERNALSNAT=false, when your Pods communicate with endpoints outside of your VPC, the node's security groups are used, rather than any security groups you've assigned to your Pods. If you're also using security groups for Pods, with POD_SECURITY_GROUP_ENFORCING_MODE=strict, when your Pods communicate with endpoints outside of your VPC, the Pod's security groups are used.
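The first consideration above (the smallest maximum wins in a mixed node group) can be sketched as a simple minimum. The helper name and the three input values are purely illustrative, not taken from my cluster.

```shell
# min_max_pods VALUE...   (hypothetical helper name)
# Prints the smallest of the given per-instance-type maximum Pod counts.
min_max_pods() {
  min=$1
  shift
  for v in "$@"; do
    if [ "$v" -lt "$min" ]; then
      min=$v
    fi
  done
  echo "$min"
}

# Illustrative values for three instance types in one node group
min_max_pods 110 29 17   # prints 17
```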

Prerequisites

- The subnets that your Amazon EKS nodes are in must have sufficient contiguous /28 (for IPv4 clusters) CIDR blocks. Using IP prefixes can fail if IP addresses are scattered throughout the subnet CIDR. AWS recommends the following:
--- Use a subnet CIDR reservation so that even if any IP addresses within the reserved range are still in use, the IP addresses aren't reassigned upon their release. This ensures that prefixes are available for allocation without segmentation.
--- Use new subnets that are specifically used for running the workloads that IP prefixes are assigned to. Both Windows and Linux workloads can run in the same subnet when assigning IP prefixes.
- To assign IP prefixes to your nodes, your nodes must be AWS Nitro-based. Instances that aren't Nitro-based continue to allocate individual secondary IP addresses, but have a significantly lower number of IP addresses to assign to Pods than Nitro-based instances do. In my case, T4g instances are built on the Nitro System.
- For clusters with Linux nodes only – If your cluster is configured for the IPv4 family, you must have version 1.9.0 or later of the Amazon VPC CNI plugin for Kubernetes add-on installed. You can check your current version with the following command.

% kubectl describe daemonset aws-node --namespace kube-system | grep Image | cut -d "/" -f 2

amazon-k8s-cni-init:v1.16.0-eksbuild.1
amazon-k8s-cni:v1.16.0-eksbuild.1
amazon
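The 1.9.0 requirement can also be checked in a script. This is a minimal sketch, assuming a `sort` that supports `-V` (GNU coreutils and recent BSD/macOS versions). The `version_ge` helper is my own, and the image string is copied from the output above.

```shell
# version_ge A B: succeeds when version A >= version B (hypothetical helper)
version_ge() {
  # With version sort, the smaller version comes first.
  [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$2" ]
}

image="amazon-k8s-cni:v1.16.0-eksbuild.1"
version=${image#*:v}     # strip everything up to ":v" -> 1.16.0-eksbuild.1
version=${version%%-*}   # strip the build suffix      -> 1.16.0

if version_ge "$version" "1.9.0"; then
  echo "VPC CNI $version meets the 1.9.0 requirement"
else
  echo "VPC CNI $version is too old"
fi
```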

To increase the amount of available IP addresses for your Amazon EC2 nodes

1. Configure the EKS cluster to assign IP address prefixes to nodes. Complete the procedure on the tab that matches your node's operating system. In my case, the OS is Linux.
1.a Enable the parameter to assign prefixes to network interfaces for the Amazon VPC CNI DaemonSet. When you deploy a 1.21 or later cluster, version 1.10.1 or later of the Amazon VPC CNI plugin for Kubernetes add-on is deployed with it. If you created the cluster with the IPv6 family, this setting was set to true by default. If you created the cluster with the IPv4 family, this setting was set to false by default.

% kubectl set env daemonset aws-node -n kube-system ENABLE_PREFIX_DELEGATION=true

daemonset.apps/aws-node env updated

1.b Because I'm deploying a self-managed node group with an AMI ID, I need to determine the Amazon EKS recommended maximum number of Pods for the EKS worker nodes. Follow the instructions in Amazon EKS recommended maximum Pods for each Amazon EC2 instance type, adding --cni-prefix-delegation-enabled to step 3. Note the output for use in a later step.

Since each Pod is assigned its own IP address, the number of IP addresses supported by an instance type is a factor in determining the number of Pods that can run on the instance. Amazon EKS provides a script that you can download and run to determine the Amazon EKS recommended maximum number of Pods to run on each instance type. The script uses hardware attributes of each instance, and configuration options, to determine the maximum Pods number. You can use the number returned in these steps to enable capabilities such as assigning IP addresses to Pods from a different subnet than the instance's and significantly increasing the number of IP addresses for your instance. If you're using a managed node group with multiple instance types, use a value that would work for all instance types.

Download a script that you can use to calculate the maximum number of Pods for each instance type.
curl -O https://raw.githubusercontent.com/awslabs/amazon-eks-ami/master/files/max-pods-calculator.sh

Mark the script as executable on your computer.
chmod +x max-pods-calculator.sh

Run the script, replacing t4g.medium with your own instance type and 1.16.0-eksbuild.1 with your Amazon VPC CNI add-on version.
./max-pods-calculator.sh --instance-type t4g.medium --cni-version 1.16.0-eksbuild.1 --cni-prefix-delegation-enabled
An example output is as follows.

110

NOTE
The following option has been added to the script to see the maximum Pods supported when using optional capabilities.
--cni-prefix-delegation-enabled – Use this option when you want to assign significantly more IP addresses to each elastic network interface.

1.c Specify the parameters in one of the following options. To determine which option is right for you and what value to provide for it, see WARM_PREFIX_TARGET, WARM_IP_TARGET, and MINIMUM_IP_TARGET on GitHub.
WARM_IP_TARGET or MINIMUM_IP_TARGET – If either value is set, it overrides any value set for WARM_PREFIX_TARGET
% kubectl set env ds aws-node -n kube-system WARM_IP_TARGET=5

daemonset.apps/aws-node env updated

% kubectl set env ds aws-node -n kube-system MINIMUM_IP_TARGET=30

daemonset.apps/aws-node env updated
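To get a feel for what these two values mean with prefixes, here is a simplified model of my own (not the actual VPC CNI implementation): the node keeps at least MINIMUM_IP_TARGET addresses, and at least WARM_IP_TARGET free addresses beyond the running Pods, rounded up to whole /28 prefixes of 16 addresses. The real allocator may behave differently in edge cases.

```shell
# prefixes_needed PODS WARM_IP_TARGET MINIMUM_IP_TARGET   (hypothetical helper)
prefixes_needed() {
  want=$(( $1 + $2 ))          # addresses in use plus the warm buffer
  if [ "$want" -lt "$3" ]; then
    want=$3                    # never go below the minimum target
  fi
  echo $(( (want + 15) / 16 )) # round up to whole /28 prefixes
}

# With WARM_IP_TARGET=5 and MINIMUM_IP_TARGET=30 as set above:
prefixes_needed 0 5 30    # prints 2 (30 addresses -> 2 prefixes)
prefixes_needed 28 5 30   # prints 3 (33 addresses -> 3 prefixes)
```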

1.d Configure the node group with an Amazon EC2 Nitro Amazon Linux 2 instance type. For a list of Nitro instance types, see Instances built on the Nitro System in the Amazon EC2 User Guide for Linux Instances. This capability is not supported on Windows. For the max-pods option, replace 110 with either the value from step 1.b (recommended) or your own value. In my case, the value is 110 and I'm using a self-managed node group, so I specify the following text for the BootstrapArguments parameter in the EC2 user data:

...
/etc/eks/bootstrap.sh blog \
--kubelet-extra-args  '... --max-pods=110' \
...

Note
If you also want to assign IP addresses to Pods from a different subnet than the instance's, then you need to enable the capability in this step. For more information, see Custom networking for pods.

2. Describe one of the nodes to determine the value of max-pods for the node and the number of available IP addresses.
% NODE=ip-10-0-x-yyy.us-west-2.compute.internal
% kubectl describe node ${NODE} | grep 'pods\|PrivateIPv4Address'
An example output is as follows.

  pods:               110
  pods:               110

In the previous output, 110 is the maximum number of Pods that Kubernetes will deploy to the node.

The Pods that previously could not be allocated an IP address have now been created successfully.

References

Increase the amount of available IP addresses for your Amazon EC2 nodes

Instances built on the Nitro System

Amazon EKS recommended maximum Pods for each Amazon EC2 instance type

Kubernetes Scalability thresholds

Category: AWS Tags: K8s EKS public

Sky Cone