nSights Talks

Amazon EKS Optimizations (Part II)

Tutorial Highlights & Transcript

00:00 - EKS Optimization Problems
Hello everyone. Today I’m going to give a demo on EKS optimization, part two. This optimization addresses two problems that we are going to solve today.

The first problem is that EKS has low pod density by default. Pod density is the number of pods per node. Every EC2 instance type has a default pod density defined, and we need to increase it. The second problem is slow pod startup times. We’ll solve both of these problems.

00:54 - EKS Optimization Solutions
The next question is how we are going to solve these problems. The solution is AWS VPC CNI IP prefix delegation. This feature is only available in VPC CNI versions greater than or equal to 1.9.0. How does it work? In EKS, all pods are assigned IP addresses directly from the VPC, and those addresses have to be attached to an ENI on the node. With prefix delegation, /28 IP prefixes are assigned to each ENI instead of individual IP addresses. Because each /28 prefix contains 16 addresses, this increases the pod density per EC2 instance. It also decreases pod startup times, because the IP addresses are already assigned to the ENI. So a pod does not have to wait for an IP address to be assigned to that particular ENI. That’s why it is faster.
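
The /28 arithmetic behind the density gain can be checked with a short Python sketch. The prefix `10.0.1.16/28` and the helper name `addresses_from_prefixes` are hypothetical and chosen only for illustration; a real prefix would come from the node's VPC subnet.

```python
import ipaddress

# Hypothetical /28 prefix, used only to show the arithmetic.
prefix = ipaddress.ip_network("10.0.1.16/28")

# A /28 leaves 32 - 28 = 4 host bits, so it holds 2 ** 4 = 16 addresses.
ADDRESSES_PER_PREFIX = prefix.num_addresses


def addresses_from_prefixes(prefix_count: int) -> int:
    """Pod IP addresses available from a number of /28 prefixes on one ENI."""
    return prefix_count * ADDRESSES_PER_PREFIX


if __name__ == "__main__":
    print(ADDRESSES_PER_PREFIX)        # 16
    print(addresses_from_prefixes(2))  # 32
```

Each slot on an ENI that would otherwise hold one secondary IP address can hold one such prefix, which is where the higher pod density comes from.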
02:11 - Installation & Configuration
Let’s move to the installation part. There is some major configuration required. As stated earlier, the VPC CNI version should be greater than or equal to 1.9.0. In the VPC CNI daemon set, we need to enable prefix delegation and set the warm prefix target env variable to 1. On the worker node side, we need to set use-max-pods to false and then set max pods according to the maximum number of pods assignable to our EC2 instance type. The max pods value goes in the kubelet additional configuration arguments, and use-max-pods goes in the node bootstrap arguments, in the configuration for worker nodes joining the cluster. These are some reference links that you can use for IP prefix delegation and the VPC CNI.
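
As a rough sketch of the settings described above, the Python below assembles the two pieces of configuration as strings. It assumes the environment variable names `ENABLE_PREFIX_DELEGATION` and `WARM_PREFIX_TARGET` from the VPC CNI documentation and the `--use-max-pods` and `--kubelet-extra-args` options of the EKS AMI bootstrap script; the talk does not show the exact names, and the helper functions are hypothetical.

```python
import shlex

# Assumed VPC CNI settings for the aws-node daemon set (names taken from
# the VPC CNI documentation, not from the talk itself).
CNI_ENV = {
    "ENABLE_PREFIX_DELEGATION": "true",
    "WARM_PREFIX_TARGET": "1",
}


def cni_env_command(env: dict) -> str:
    """Build a kubectl command that sets the env variables on aws-node."""
    pairs = " ".join(f"{key}={value}" for key, value in env.items())
    return f"kubectl set env daemonset aws-node -n kube-system {pairs}"


def bootstrap_arguments(max_pods: int) -> str:
    """Build node bootstrap arguments that disable the default pod limit
    and pass an explicit --max-pods value through to the kubelet."""
    kubelet_extra_args = f"--max-pods={max_pods}"
    return " ".join(
        ["--use-max-pods", "false",
         "--kubelet-extra-args", shlex.quote(kubelet_extra_args)]
    )


if __name__ == "__main__":
    print(cni_env_command(CNI_ENV))
    print(bootstrap_arguments(110))
```

The value 110 is the max pods figure used later in the demo; in practice it would come from the max pods calculator output for the chosen instance type.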
03:23 - Deploying the VPC CNI Method
Let’s move to the demo. Other than IP prefix delegation, there is another method, custom networking in the VPC CNI, which allows you to add a secondary CIDR to the VPC and assign pods IP addresses from that secondary CIDR only. However, it reduces the number of IP addresses assignable to pods. Generally, it’s not a preferred method, but depending on compliance requirements, we can use that method as well.

Currently, I’m using this default worker node group, which uses MCI-based instances. At this reference URL, AWS provides a shell script. With the shell script, we can find out how many pods we can assign to an EC2 instance by default. The script URL is here, and we can download the script. I already downloaded it beforehand. This is the command. Here we have to provide the CNI version and the instance type so that we can get the details. I will run that command here.

By default, an MCI large instance supports only 29 pods on a single EC2 instance. If we use IP prefix delegation, this is the number: we can assign 110 pods to a single EC2 instance. It increases the pod density from 29 to 110, which is totally free, of course. We can reduce cost by reducing the number of EC2 instances assigned to the EKS cluster.

For the demo, I deployed the default worker nodes without any changes. Let’s see how many pods they can take. This is the command I am using to deploy the example application. As you can see, the application is deployed, and by default there are currently four pods running in the kube-system namespace. Out of the 29, that leaves 25 pods that can be deployed on the same EC2 instance. Currently, as you can see, all 25 pods are deployed. If I scale the deployment to 26, that would take us to 30 pods on a single EC2 instance. Let’s see the behavior and whether the pod can be scheduled or not.

As you can see, there is an error saying there are too many pods on a single node, and it is not able to deploy this pod.
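
The 29 and 110 figures can be reproduced with the arithmetic commonly documented for the VPC CNI. This is a minimal sketch, not the AWS script itself: the function name `max_pods` is hypothetical, and it assumes an instance type with 3 ENIs and 10 IPv4 addresses per ENI (values picked because they reproduce the 29 quoted in the demo) and a cap of 110 pods for smaller instances.

```python
ADDRESSES_PER_PREFIX = 16  # a /28 prefix holds 16 addresses


def max_pods(enis, ips_per_eni, prefix_delegation=False, cap=None):
    """Estimate the maximum number of pods on a node.

    One address per ENI is the ENI's own primary address, so it is not
    available to pods. With prefix delegation, every remaining slot holds
    a /28 prefix instead of a single address. The +2 accounts for pods
    that use host networking.
    """
    slots_per_eni = ips_per_eni - 1
    if prefix_delegation:
        slots_per_eni *= ADDRESSES_PER_PREFIX
    total = enis * slots_per_eni + 2
    return total if cap is None else min(total, cap)


# Assumed instance limits (not stated in the talk).
ENIS, IPS_PER_ENI = 3, 10
SYSTEM_PODS = 4  # pods already running in kube-system during the demo

default_limit = max_pods(ENIS, IPS_PER_ENI)
prefix_limit = max_pods(ENIS, IPS_PER_ENI, prefix_delegation=True, cap=110)

if __name__ == "__main__":
    print(default_limit, default_limit - SYSTEM_PODS)  # 29 25
    print(prefix_limit, prefix_limit - SYSTEM_PODS)    # 110 106
```

Without the cap, the prefix delegation figure for these assumed limits would be 3 * 9 * 16 + 2 = 434, so the recommended limit of 110 is what bounds the node, not the address space.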
06:48 - Deploying the IP Prefix Enabled Method
Now we will deploy with the IP prefix-enabled method. For that, I need to destroy the current worker node groups. I have deleted the example application so that the deletion of the node groups will be faster. Let’s wait until the default worker node groups are destroyed.

In the meantime, I have created this optimized worker node group configuration. In this configuration, I set use-max-pods to false and assign a max pods limit of 110, based on the output of the max pods calculator shell script. Here, I’m deploying a kubectl manifest, which is a CNI patch. I’m using version 1.10.2 of the VPC CNI here. We can download the VPC CNI manifest from this URL on this website. I have downloaded that YAML file.

One thing to note is that, by default, the manifests published on the website use the us-west-2 region for the Docker images. If you want to use this VPC CNI in multiple regions, you need to take the region as a variable. That’s what I have done. I have taken this file as a Terraform template and I’m passing the region value here. I’m also passing the bootstrap arguments and kubelet arguments here, which are defined in this local file.

All the worker nodes are destroyed, so now I will comment this out so that it does not deploy a second time, and I will enable this code with the optimization. Now I deploy again. Our new optimized worker node groups are deployed.

I am deploying the example application. By default, it deploys a defined number of pods. As the limit is 110 pods and kube-system uses four pods by default, we can assign 106 pods to the test application. I will scale it to 100 at this moment, which the slider lets us do. Now let’s see whether we are able to run the pods or not. As you can see, as soon as the IP prefix is assigned, lots of pods get IP addresses quickly. Once the IP prefix is assigned, the pod startup time will reduce from two minutes. As you can see, 100 pods are deployed.

Currently, all the IP prefixes are assigned to the single EC2 instance. Now let’s try to increase the pods to 106 and see the pod startup time with IP prefix delegation. So three seconds, four seconds, right? In 3-4 seconds, pods are assigned IP addresses and scheduled directly, so it also decreased the pod startup time.
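
The region substitution mentioned in this section was done with a Terraform template in the demo. The same idea can be sketched in Python with `string.Template`, which is a swapped-in technique for illustration only. The image line, the account ID `111122223333`, and the function name `render_manifest` are hypothetical placeholders, not the contents of the real manifest.

```python
from string import Template

# Hypothetical image line with the region turned into a template variable.
# The account ID is a placeholder; the real manifest defines its own value.
MANIFEST_TEMPLATE = Template(
    "image: 111122223333.dkr.ecr.${region}.amazonaws.com/amazon-k8s-cni:v1.10.2"
)


def render_manifest(region: str) -> str:
    """Fill in the region so the image is pulled from the chosen region."""
    return MANIFEST_TEMPLATE.substitute(region=region)


if __name__ == "__main__":
    print(render_manifest("eu-west-1"))
```

Keeping the region as the only variable means the same manifest file can be reused for clusters in different regions without editing it by hand.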
Jasmeet Singh

Parth Vyas

DevOps Engineer


Parth is a DevOps Engineer at nClouds with multiple AWS certifications including AWS Certified Solutions Architect - Professional, AWS Certified DevOps Engineer - Professional, AWS Certified Developer - Associate, and AWS Certified SysOps Administrator - Associate.