Configuring EKS VPC in AWS: Tackling IP Exhaustion Head-On

Özkan Poyrazoğlu
AWS in Plain English
7 min read · Jul 11, 2023


Hello everyone. As the adoption of AWS Elastic Kubernetes Service (EKS) continues to rise, many organizations find themselves grappling with a common challenge: IP exhaustion within their Virtual Private Cloud (VPC) infrastructure. As the number of EKS clusters, worker nodes, and services grows, the available IP addresses within the VPC can become scarce, hindering the ability to scale applications effectively. In this article, I will discuss the problem of IP exhaustion in a VPC and the quickest solutions that can be implemented.

Subnet configuration for worker nodes

What is IP Exhaustion?

Normally, VPC CIDRs are organized according to the RFC 1918 ranges, as in the default VPC. However, when setting the subnet CIDRs, the range in which EKS will run may not be sized properly and may be left too small. For example, if a subnet reserved for pods is given a 255.255.255.0 netmask (/24), you have only 251 usable IPs per AZ (AWS reserves five addresses in every subnet), and EKS can consume those very quickly.

To briefly describe the EKS networking scheme: the VPC CNI plugin assigns each pod a private IP address from a network interface attached to the node it runs on, so every pod consumes a routable VPC IP and behaves much like an EC2 instance inside the VPC.

When the subnets EKS runs in are exhausted and the number of available IPs drops to zero, new pods fail to start. Describing a stuck pod shows the underlying problem in its events.
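A quick way to spot the symptom (the pod name and namespace below are placeholders):

kubectl get pods --all-namespaces --field-selector=status.phase=Pending
kubectl describe pod <stuck-pod> -n <namespace>   # look for IP-assignment errors from the CNI in the Events section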

You can check the available IPv4 address count on the Subnets page of the VPC console, as in the example below:

Available IPv4 addresses in VPC
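The same information is available from the AWS CLI (the VPC ID below is a placeholder for your own):

aws ec2 describe-subnets \
  --filters Name=vpc-id,Values=vpc-0123456789abcdef0 \
  --query 'Subnets[].{Subnet:SubnetId,AZ:AvailabilityZone,FreeIPs:AvailableIpAddressCount}' \
  --output table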

Precautions for IP Exhaustion

The best action is always the one taken before the problem occurs. To keep track of the available IPv4 count of your subnets, you can use a simple Lambda function based on Python (boto3).

We created a simple Python script that checks the subnets every hour and, if IPs are running low, sends a notification via Slack. You can check the Lambda function via this link in my GitHub repository.

The subnet-check script needs three main environment variables: SLACK_WEBHOOK_URL, REGION, and VPC_ID. To invoke it periodically, you can use EventBridge, as in the CLI sketch after the screenshot below.

EventBridge for triggering Lambda function
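As a sketch, the hourly trigger can also be wired up from the AWS CLI; the function name, rule name, region, and account ID below are placeholders:

# Hourly schedule rule
aws events put-rule \
  --name subnet-check-hourly \
  --schedule-expression "rate(1 hour)"

# Allow EventBridge to invoke the function
aws lambda add-permission \
  --function-name subnet-check \
  --statement-id subnet-check-hourly \
  --action lambda:InvokeFunction \
  --principal events.amazonaws.com \
  --source-arn arn:aws:events:eu-west-1:111111111111:rule/subnet-check-hourly

# Point the rule at the Lambda function
aws events put-targets \
  --rule subnet-check-hourly \
  --targets "Id"="1","Arn"="arn:aws:lambda:eu-west-1:111111111111:function:subnet-check"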

These regular checks stay comfortably within the AWS Lambda free tier: one request per hour is roughly 720 invocations per month, nowhere near the 1 million-request monthly limit.

Configuring Secondary CIDR for VPC

The main point of this article is adding a secondary CIDR to the existing VPC to extend the IPv4 pool. In this section I will briefly explain how to add the second CIDR to the VPC.

First of all, I should mention that the AWS documentation recommends 100.64.0.0/10 and 198.19.0.0/16 as secondary CIDRs that can be added to the VPC for EKS. But in my tests, both when I stayed within the 10.0.0.0/8 block and when I used the 172.32.0.0/16 CIDR shown in this walkthrough, I didn’t face any problems with EKS.

You can use the AWS CLI to add a new CIDR to an existing VPC, but I did it in the AWS Console to make the demonstration easier to follow.
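For reference, this is a sketch of the equivalent AWS CLI call (the VPC ID is a placeholder); the console steps follow.

aws ec2 associate-vpc-cidr-block \
  --vpc-id vpc-0123456789abcdef0 \
  --cidr-block 172.32.0.0/16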

  • On the VPC page, right-click the VPC and select “Edit CIDRs”.
Editing the CIDR’s
  • You can see the existing VPC CIDR and the “Add new IPv4 CIDR” button. Click that button to add a new block.
Existing IPv4 CIDR
  • In this window, enter the IPv4 block you want to add, in CIDR notation as shown in the image.
Our new IPv4 CIDR
  • In this window you can verify that the new IPv4 CIDR has been added.
The new CIDR (172.32.0.0/16) was added

Creating Private Subnets for new CIDR

After editing the VPC, we need to create subnets for the CIDR we just added. I have chosen the default VPC to keep the example simple and universal; keep this in mind when adapting it to your own environment.

I divided the new IPv4 CIDR into three equal /19 subnets, one per availability zone, and kept the remaining space as a free IPv4 pool. You can divide the new CIDR however you need.

Additional subnets divided for AZ’s

As a result of this process, we will have a total of 24,561 usable IPv4 addresses across the 3 AZs (each /19 holds 8,192 addresses, minus the 5 that AWS reserves per subnet, giving 8,187 × 3 = 24,561).
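Assuming that split, a minimal AWS CLI sketch for creating the three subnets looks like this (the VPC ID, availability zones, and Name tags are illustrative placeholders based on the examples in this article):

aws ec2 create-subnet --vpc-id vpc-0123456789abcdef0 \
  --availability-zone eu-west-1a --cidr-block 172.32.0.0/19 \
  --tag-specifications 'ResourceType=subnet,Tags=[{Key=Name,Value=container-subnet-1a}]'

aws ec2 create-subnet --vpc-id vpc-0123456789abcdef0 \
  --availability-zone eu-west-1b --cidr-block 172.32.32.0/19 \
  --tag-specifications 'ResourceType=subnet,Tags=[{Key=Name,Value=container-subnet-1b}]'

aws ec2 create-subnet --vpc-id vpc-0123456789abcdef0 \
  --availability-zone eu-west-1c --cidr-block 172.32.64.0/19 \
  --tag-specifications 'ResourceType=subnet,Tags=[{Key=Name,Value=container-subnet-1c}]'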

After creating the subnets, create a route table to provide routes for the new subnets and associate it with them. Although I do not recommend it for production, in this example I created a single NAT gateway for the three container subnets and added a route that sends internet-bound traffic through it. When working across three AZs, best practice is to create three NAT gateways in the three public subnets (the ones routed to the internet gateway) and three route tables, one per zone.

Route table that redirects outgoing traffic to NatGateway
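As a rough sketch of the single-NAT layout described above, with all resource IDs as placeholders:

# NAT gateway in an existing public subnet (Elastic IP allocation ID is a placeholder)
aws ec2 create-nat-gateway \
  --subnet-id subnet-0publicaaaaaaaaaa \
  --allocation-id eipalloc-0123456789abcdef0

# Route table for the three container subnets
aws ec2 create-route-table --vpc-id vpc-0123456789abcdef0

# Default route through the NAT gateway
aws ec2 create-route \
  --route-table-id rtb-0123456789abcdef0 \
  --destination-cidr-block 0.0.0.0/0 \
  --nat-gateway-id nat-0123456789abcdef0

# Associate the route table with each container subnet
aws ec2 associate-route-table --route-table-id rtb-0123456789abcdef0 --subnet-id subnet-0container1aaaaaa
aws ec2 associate-route-table --route-table-id rtb-0123456789abcdef0 --subnet-id subnet-0container1bbbbbb
aws ec2 associate-route-table --route-table-id rtb-0123456789abcdef0 --subnet-id subnet-0container1cccccc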

Security groups are another key point when using the secondary subnets with EKS. The new CIDR should be allowed in existing security groups (for example, if an RDS security group allows the old CIDR range instead of referencing a security group ID, you should add a rule for the new CIDR as well).
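For example, a hypothetical rule that lets the new CIDR reach a PostgreSQL RDS instance whose security group currently allows only the old range (the group ID and port are placeholders):

aws ec2 authorize-security-group-ingress \
  --group-id sg-0123456789abcdef0 \
  --protocol tcp \
  --port 5432 \
  --cidr 172.32.0.0/16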

Configuring VPC-CNI Plugin in EKS

Up to this point, I assume that the subnets for the new CIDR have been defined on the VPC side and the route tables are working well. If there is any trouble, or you suspect something is not working, you can use the AWS VPC Reachability Analyzer to find restrictions on the virtual network side. Now we can move forward and configure the VPC CNI (Container Network Interface) plugin in AWS EKS.

VPC-CNI (the Amazon VPC Container Network Interface plugin) is the networking plugin used by Amazon EKS to integrate Kubernetes pods with your VPC. It gives pods routable VPC IP addresses and facilitates communication between pods and other resources within the VPC.

We will continue with the kubectl commands for the rest of the post.

The first step in EKS is to check the installed version of the VPC-CNI plugin. To do this, you can simply look at the image version of the aws-node DaemonSet, as below.

kubectl describe daemonset aws-node --namespace kube-system | grep Image

The suffix of the image tag gives the plugin’s version (for example v1.13.2-eksbuild.1). You can check the latest available version in the Amazon EKS documentation; if the plugin is not installed in the cluster, the same documentation covers installation.
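If the plugin is installed as an EKS managed add-on, you can also query the installed version from the AWS CLI (the cluster name is a placeholder):

aws eks describe-addon \
  --cluster-name my-cluster \
  --addon-name vpc-cni \
  --query addon.addonVersion \
  --output text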

Enable custom network configuration for the VPC-CNI plugin by setting the AWS_VPC_K8S_CNI_CUSTOM_NETWORK_CFG environment variable:

kubectl set env daemonset aws-node -n kube-system AWS_VPC_K8S_CNI_CUSTOM_NETWORK_CFG=true

Then tell the plugin which node label to use when selecting an ENIConfig; here, the availability-zone label (on newer Kubernetes versions the equivalent label is topology.kubernetes.io/zone):

kubectl set env daemonset aws-node -n kube-system ENI_CONFIG_LABEL_DEF=failure-domain.beta.kubernetes.io/zone
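To verify that both environment variables landed on the DaemonSet:

kubectl describe daemonset aws-node -n kube-system | grep -E 'AWS_VPC_K8S_CNI_CUSTOM_NETWORK_CFG|ENI_CONFIG_LABEL_DEF'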

Before moving forward, a brief overview of ENIConfig is in order.

ENIConfig is a custom resource used by the VPC-CNI plugin in Amazon EKS. Each ENIConfig tells the plugin which subnet and security groups to use for the secondary Elastic Network Interfaces (ENIs) it attaches to worker nodes in a given availability zone, so pods on those nodes get their IPs from that subnet.

In the next step we will create an ENIConfig per availability zone, pointing at the new subnets and the existing security group. Each ENIConfig must be named after the availability zone it serves (since ENI_CONFIG_LABEL_DEF is the zone label), and spec.subnet expects a subnet ID rather than a subnet name. You can create a YAML file containing the following:

apiVersion: crd.k8s.amazonaws.com/v1alpha1
kind: ENIConfig
metadata:
  name: eu-west-1a
spec:
  subnet: subnet-0aaaaaaaaaaaaaaaa   # container-subnet-1a
  securityGroups:
    - sg-1231231231212
---
apiVersion: crd.k8s.amazonaws.com/v1alpha1
kind: ENIConfig
metadata:
  name: eu-west-1b
spec:
  subnet: subnet-0bbbbbbbbbbbbbbbb   # container-subnet-1b
  securityGroups:
    - sg-1231231231212
---
apiVersion: crd.k8s.amazonaws.com/v1alpha1
kind: ENIConfig
metadata:
  name: eu-west-1c
spec:
  subnet: subnet-0cccccccccccccccc   # container-subnet-1c
  securityGroups:
    - sg-1231231231212
You should replace the security group ID sg-1231231231212 with your own EKS cluster’s node security group ID, and the subnet IDs with the IDs of the container subnets you created for the new CIDR. Then you can apply the manifest with kubectl.
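Assuming the manifest is saved as eniconfig.yaml (the file name is arbitrary):

kubectl apply -f eniconfig.yaml
kubectl get eniconfigs   # should list eu-west-1a, eu-west-1b, eu-west-1c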

Finally, if everything is done successfully, you can create a new node group in your new subnets and test with a sample deployment.
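As a quick sanity check (the deployment name and image are arbitrary), you can launch a few pods and confirm that their IPs come from the new CIDR:

kubectl create deployment ip-test --image=nginx --replicas=6
kubectl get pods -l app=ip-test -o wide   # pod IPs should fall within 172.32.0.0/16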

In this article, I tried to briefly explain how to add an additional CIDR to an AWS EKS cluster’s VPC. If you have any questions, feel free to ask in the comments. Thank you for reading.
