Kubernetes CNI — deconstructed

A few months ago, I had to understand in detail how the Container Network Interface (CNI) is implemented in order to, well, simply get a chaos testing solution working on a bare-metal installation of Kubernetes.

At that time, I found a few resources that helped me understand how this was implemented, mainly the official Kubernetes documentation on the topic and the official CNI specification. And yes, that specification simply consists of a Markdown document, which took a considerable amount of energy to digest and process.

I did not, however, find a step-by-step guide explaining how a CNI practically works: does it run as a daemon? Does it communicate over a socket? Where are its configuration files?

As it turned out, the answer to these three questions is not binary: a typical CNI is both a binary and a daemon, communication happens over a socket and over another (unexpected, more on that later) channel, and its configuration files are stored in multiple locations!

What is a CNI after all?

A CNI is a Network Plugin, installed and configured by the cluster administrator. It receives instructions from Kubernetes in order to set up the network interface(s) and IP address(es) for the pods.

It is important to highlight right away that a CNI plugin is an executable, as stated in the CNI specification.

How does the CNI plugin know which interface type to configure, which IP address to set, and so on? It receives instructions from Kubernetes, more specifically from the kubelet daemon running on the worker and master nodes, and these instructions are sent through environment variables (such as CNI_COMMAND, CNI_CONTAINERID, CNI_NETNS and CNI_IFNAME) and through a JSON network configuration passed on the plugin's standard input (stdin).
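
To make this concrete, here is a minimal, hand-written sketch of such an exchange, based on the CNI specification (the bridge plugin, the container ID, the netns path and the subnet below are illustrative, not taken from this story):

# Hedged sketch: manually invoking a CNI plugin the way the kubelet/CRI does.
# The CNI_* variables and the JSON on stdin are defined by the CNI specification;
# the plugin (bridge), container ID, netns path and subnet are illustrative.
CNI_COMMAND=ADD \
CNI_CONTAINERID=example-container-id \
CNI_NETNS=/var/run/netns/example \
CNI_IFNAME=eth0 \
CNI_PATH=/opt/cni/bin \
/opt/cni/bin/bridge <<'EOF'
{
  "cniVersion": "0.4.0",
  "name": "example-net",
  "type": "bridge",
  "bridge": "cni0",
  "ipam": { "type": "host-local", "subnet": "10.22.0.0/16" }
}
EOF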

To better understand what happens when a new pod is created, I put together the following sequence diagram:

Sequence diagram highlighting the exchanges between the Container Runtime Interface (CRI) and the Container Networking Interface (CNI)

We have a number of steps happening:
1. The kubelet asks the container runtime (through the CRI) to create the pod sandbox.
2. The container runtime creates the sandbox (pause) container and its network namespace (netns).
3. The container runtime executes the CNI plugin binary with the ADD command, passing it the pod's netns, the environment variables and a JSON configuration on stdin.
4. The CNI plugin sets up the interface(s) and IP address(es) inside that netns and reports the result on stdout.

netns — network namespace?

Any idea how pods get their network interfaces and IP addresses? And how pods are isolated from the node (server) on which they run?

This is achieved through a Linux feature called namespaces: “Linux Namespaces are a feature of the Linux kernel that partitions kernel resources […]”
Namespaces are used extensively in containerization, to partition the network of a Linux host, the process IDs, the mount paths, etc.
If you want to learn more about this, then this article is definitely a good read.
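
As a quick, purely illustrative demonstration of what a network namespace provides, you can create one by hand and observe that it starts out with nothing but a loopback interface:

# Quick illustration of network namespace isolation (run as root; illustrative only)
unshare --net bash   # start a shell in a brand new network namespace
ip link              # only the loopback interface (lo) is visible from here
exit                 # back to the host's network namespace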

In our case, we simply need to understand that a pod is associated with a network namespace (netns), and that the CNI plugin, knowing this network namespace, can for example attach a network interface and configure IP addresses for our pod. It will do so with commands that look roughly like the sketch below.
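
This is a hand-written sketch only: the interface names and addresses are illustrative, and real plugins do this programmatically rather than by shelling out.

# Sketch of the kind of plumbing a CNI plugin performs for a pod.
# "example-netns", "veth-host", "veth-pod" and 10.22.0.5/24 are illustrative.
ip netns add example-netns                           # normally created by the container runtime
ip link add veth-host type veth peer name veth-pod   # create a veth pair
ip link set veth-pod netns example-netns             # move one end into the pod's netns
ip netns exec example-netns ip addr add 10.22.0.5/24 dev veth-pod
ip netns exec example-netns ip link set veth-pod up
ip link set veth-host up                             # the host side stays in the root netns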

ℹ️ If you find this interesting, please note that working with the network namespace of your pod can greatly help you debug networking problems: you will be able to run any executable available on your host, but restricted to the network namespace of your pod. Concretely, you could do:

nsenter --net=$netns tcpdump -i any icmp

to only debug/intercept the traffic that your pod sees. You can even use this command if your pod's image doesn't include the tcpdump executable. More info on this debugging technique in this article.
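
How do you find $netns in the first place? One possible approach (among others) on a Docker-based node is to go through the pod's pause container; "my-pod" and the helper variables below are illustrative:

# Hedged sketch: locate the network namespace of a pod on a Docker/dockershim node.
# "my-pod" is an illustrative pod name; k8s_POD_ is the prefix dockershim gives to pause containers.
PAUSE_ID=$(docker ps --filter "name=k8s_POD_my-pod" -q | head -n1)
PID=$(docker inspect -f '{{.State.Pid}}' "$PAUSE_ID")
netns=/proc/$PID/ns/net
nsenter --net=$netns tcpdump -i any icmp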

Down the rabbit hole: intercepting calls to a CNI plugin

Let's recapitulate what we know: a CNI plugin is an executable, it is invoked by the kubelet (through the container runtime) every time a pod is created or deleted, it receives its instructions through environment variables and a JSON configuration on stdin, and it does its work inside the pod's network namespace.

If you have a CNI that doesn't behave properly (I had an issue with Multus not correctly handling its sub-plugins, which I documented here and for which I submitted a fix that was merged in August 2020), being able to watch/intercept the exchanges between the Container Runtime (CRI) and the Network Plugin (CNI) becomes extremely handy.

For that reason, I spent some time creating an interception script that you can simply install in place of the real CNI executable. This script intercepts and logs the environment variables, as well as the three standard file descriptors (stdin, stdout, and stderr), but it does not prevent the real CNI from doing its job and attaching the correct network interfaces for our pods.
Concretely, to intercept calls to e.g. the calico CNI, you need to rename the real /opt/cni/bin/calico binary (for instance to calico.real) and install the interception script in its place, under the original name.
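
A minimal sketch of what such a wrapper could look like (this is not my exact script; it assumes the real binary was renamed to /opt/cni/bin/calico.real and that logging happens under /tmp/cni_logging/):

#!/bin/bash
# Hedged sketch of a CNI interception wrapper, installed as /opt/cni/bin/calico.
# Assumes the real plugin was renamed to /opt/cni/bin/calico.real.
LOGDIR=/tmp/cni_logging/$(date +%s%N)
mkdir -p "$LOGDIR"
env > "$LOGDIR/env"                       # the CNI_* environment variables set by the CRI
cat > "$LOGDIR/stdin"                     # the JSON network configuration sent on stdin
/opt/cni/bin/calico.real "$@" < "$LOGDIR/stdin" \
    > "$LOGDIR/stdout" 2> "$LOGDIR/stderr"
rc=$?
logger -t cni "CNI call logged in $LOGDIR (exit code $rc)"
cat "$LOGDIR/stdout"                      # forward the real plugin's answer to the CRI
cat "$LOGDIR/stderr" >&2
exit $rc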

Once this is done, you will see many entries (depending on the number of pods) tagged with the cni tag in your journal, each corresponding to a CRI/CNI exchange. You can list them with journalctl -t cni:

These journal entries correspond to all the exchanges that happened between the CRI and the CNI; here we see pod additions (ADD) and deletions (DEL).

Your /tmp/cni_logging/ directory will also contain a lot of log files:

For each CRI/CNI exchange, four files are indeed created: env, stdin, stdout, and stderr.

Finally, for the curious reader who made it this far 😜, an example output of our interceptor script can be seen in this GitHub Gist (not embedded here, as it would otherwise make this story even longer).

Having a look at it, you see what happens when Kubernetes/Kubelet creates a pod: the plugin is invoked with CNI_COMMAND=ADD, receives the network configuration on stdin, and returns, on stdout, the interfaces and IP addresses it configured.
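
As a rough indication (a sketch based on the CNI specification, not the exact content of the Gist), the stdout file of an ADD call contains a result along these lines:

# Illustrative result returned by the plugin on stdout for an ADD operation
# (the values are made up; the structure follows the CNI specification):
cat "$LOGDIR/stdout"
{
  "cniVersion": "0.4.0",
  "interfaces": [ { "name": "eth0", "sandbox": "/var/run/netns/example" } ],
  "ips": [ { "version": "4", "address": "10.22.0.5/32", "interface": 0 } ],
  "dns": {}
}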

We have up until now ignored how the kubelet and the Container Runtime Interface choose which CNI is to be used. When the container runtime is Docker (and the CRI implementation is dockershim), the CRI scans the /etc/cni/net.d/ folder and chooses the first (in alphabetical order) .conf file (e.g. 00-multus.conf). You can read more about this in the dockershim code: K8s/dockershim/network/cni/cni.go::getDefaultCNINetwork
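
For example, on a node with the following (illustrative) configuration files, 00-multus.conf would be selected as the default network, simply because it comes first alphabetically:

# Illustrative content of /etc/cni/net.d/ — dockershim picks the first
# configuration file in alphabetical order, here 00-multus.conf:
ls /etc/cni/net.d/
00-multus.conf  10-calico.conf  99-loopback.conf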

You hopefully now have a concrete understanding of the way CNIs (i.e. Network Plugins) are implemented and used within Kubernetes. If you have questions or comments, please contact me, I'd be happy to clarify parts of this story.


