If you've ever worked with Kubernetes, you might have wanted to run it locally to save time and money and to reduce debugging complexity. For that, we have solutions like Minikube, which lets developers develop and run Kubernetes locally with ease.
But does it work exactly like a real Kubernetes environment? That depends on the features you need.
In this blog, I will examine a single difference between a real Kubernetes environment and Minikube, which I stumbled upon while developing our own eBPF-based traffic capturing tool.
Our use case
At Seekret, we developed an eBPF-based traffic capturing tool that is loaded into a machine’s kernel and tracks different protocols like HTTP, gRPC, Kafka, and more. When we capture traffic, we also care about the process that manages the connection that gave us the traffic payload. For example, when we capture an HTTP payload (a request and its response), we want to know which server it came from. We call this a “friendly name”; it gives us meaningful references and shows the relationship between the client and server in our system.
How are we able to get that friendly name? It's a two-step process.
First, our kernel eBPF program attaches the process ID (a.k.a. PID) of the process handling the connection to each captured payload. For example, if we have an HTTP server that receives requests, and its PID is 1337, then for every HTTP request or response the server handles, our kernel program will capture the traffic, attach the PID (1337 in our example) to the payload, and send it to the user-mode module for further processing.
Then, our user-mode module will look up the given PID in the /proc directory and extract the process name from /proc/<PID>/cmdline, which is what we use as our “friendly name”.
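The user-mode step above can be sketched in a few lines of Python. This is an illustration of the technique, not Seekret's actual code; the function names are mine. The key detail is that /proc/<PID>/cmdline stores the argv vector as NUL-separated bytes, so the first entry is the command itself.

```python
from pathlib import Path


def parse_cmdline(raw: bytes) -> str:
    """Return the first argv entry from a /proc/<PID>/cmdline payload.

    The file contains the argv vector joined with NUL bytes, e.g.
    b"tini\x00--\x00/run.sh\x00".
    """
    return raw.split(b"\x00", 1)[0].decode(errors="replace")


def friendly_name(pid: int, proc_root: str = "/proc") -> str:
    """Resolve a PID to its command name by reading cmdline under proc_root."""
    raw = Path(proc_root, str(pid), "cmdline").read_bytes()
    return parse_cmdline(raw)
```

For a PID whose cmdline is `tini -- /run.sh`, this yields the friendly name `tini`.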
A variety of friendly names
In the previous section, I gave an example of a friendly name: the command line of a given PID. In different environments, we can use different friendly names. For instance, in Docker-based environments, we can use the Docker container name or the Docker image. In Kubernetes-based environments, we can use the pod name or the name of the service attached to the pod (if there is one).
In all use cases, we start by having the PID that handled a connection, and for every environment, we are able to convert that PID to the best friendly name possible.
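As a hedged example of one such PID-to-environment conversion (this is a common technique, not necessarily the one our tool uses): in Docker-based environments, the container a PID belongs to can be recovered from /proc/<PID>/cgroup, where cgroup v1 lines look like `12:pids:/docker/<container-id>`.

```python
import re
from typing import Optional

# Matches the container ID in a cgroup v1 path such as
# "12:pids:/docker/0123abcd...". Docker IDs are 64 hex chars
# (often abbreviated to 12).
DOCKER_RE = re.compile(r"/docker/([0-9a-f]{12,64})")


def container_id_from_cgroup(cgroup_text: str) -> Optional[str]:
    """Return the Docker container ID for a process, if any."""
    match = DOCKER_RE.search(cgroup_text)
    return match.group(1) if match else None
```

Once we have the container ID, asking the Docker daemon for the container or image name gives a much friendlier name than a bare command line.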
Containers and resources
In most cases, our traffic-capturing tool runs as a container (a Docker container in Docker-based environments, or a DaemonSet in Kubernetes environments). If you are not familiar with containers, you can find a very good explanation here:
Containers are a form of operating system virtualization. A single container might be used to run anything from a small microservice or software process to a larger application. Inside a container are all the necessary executables, binary code, libraries, and configuration files.
Each container simulates an operating system, meaning that each container has its own resources (files, processes, traffic), and there is strict isolation between container resources, so one container cannot access the resources of another container, including its processes.
Every container has its own processes, but eventually, every process in a container is a process on the host machine. For example, if your container is running a server, you’ll see that server when you run the ps command inside the container, but you’ll also see the same process when running ps on the host machine. Inside the container it will have one PID (for example, 4), while on the host machine it will have a different PID.
I ran a Kafka Docker image on my machine; the entrypoint of that image is tini -- /run.sh. When I ran ps aux | grep tini on the host, I saw that the Kafka process ID was 49868.
Then, I ran the same command line inside the container, and the result was process ID 1.
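This host-PID/container-PID relationship is directly observable: since Linux 4.1, /proc/<PID>/status on the host contains an "NSpid:" line listing the process's PID in every nested PID namespace, outermost first. For the Kafka example, the host's view would include "NSpid: 49868 1". A small sketch (function name is mine) that extracts that list:

```python
from typing import List


def nspid(status_text: str) -> List[int]:
    """Extract the per-namespace PID list from /proc/<PID>/status content.

    Returns e.g. [49868, 1] for a process whose host PID is 49868 and
    whose PID inside its container is 1; returns [] if the kernel did
    not emit an NSpid line.
    """
    for line in status_text.splitlines():
        if line.startswith("NSpid:"):
            return [int(tok) for tok in line.split()[1:]]
    return []
```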
Containers and eBPF
How is our eBPF capturing tool able to capture traffic and find the friendly name?
Although containers are “virtualizations of operating systems”, they share the same kernel, so when our eBPF program is loaded into the host machine’s kernel, we are able to capture traffic from the entire host machine.
Which PID does our eBPF program see in the kernel?
As we run in the kernel, and every container process is eventually a process on the host machine, we see the PIDs of the processes as the host machine sees them.
How is our eBPF program able to convert a PID if we ourselves run as a container?
We mount the /proc directory from the host machine into our container, and then we are able to read any attribute of the processes. We rely on the host machine's /proc directory as our “oracle” and “source of truth”.
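Concretely, the container-side convention might look like the sketch below (the /host/proc mount point and function name are assumptions for illustration; in Kubernetes this mount would typically come from a hostPath volume in the DaemonSet spec):

```python
import os

HOST_PROC = "/host/proc"  # assumed mount point of the host's /proc inside our container


def resolve_cmdline(pid: int, proc_root: str = HOST_PROC) -> bytes:
    """Read the raw cmdline bytes for a kernel-reported PID via the
    mounted host /proc, failing loudly if the PID is not visible there."""
    pid_dir = os.path.join(proc_root, str(pid))
    if not os.path.isdir(pid_dir):
        raise LookupError(f"PID {pid} not visible under {proc_root}")
    with open(os.path.join(pid_dir, "cmdline"), "rb") as f:
        return f.read()
```

The explicit LookupError matters for the rest of this story: it is exactly the failure mode we hit in Minikube below.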
Minikube vs Kubernetes
Recently, I was asked whether our eBPF traffic capturing tool supports Minikube, and I immediately answered yes, as Minikube is just a “local Kubernetes”. Soon, though, we learned that it has some differences we didn’t expect.
The “friendly name” feature didn’t work, although it had run thousands of times in Kubernetes environments.
We set up a Minikube environment and we were able to reproduce the same behavior.
We saw that the eBPF kernel program was attaching high PIDs, and the user-mode module wasn’t able to locate those PIDs under the mounted /proc directory. For example, an attached PID was 140123, while the highest PID in the mounted /proc directory was 5000.
Although it might vary from distribution to distribution, PIDs for new processes are assigned as monotonically increasing numbers: a process created now will get PID 1000, and the next one will get PID 1001 (wrapping around only when the kernel's pid_max is reached). So seeing a PID of 140123 while the highest PID in the /proc directory was 5000 was weird.
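The check we did amounts to something like this small sketch (names are illustrative): scan a proc directory for numeric entries and report the highest PID visible there. Run against the mounted /proc, it makes a "highest visible PID is 5000" situation easy to spot.

```python
import os


def max_visible_pid(proc_root: str = "/proc") -> int:
    """Return the largest numeric entry name (i.e. PID) under proc_root.

    /proc contains one numerically named directory per live process,
    alongside non-numeric entries like "sys" and "meminfo", which we skip.
    """
    pids = [int(name) for name in os.listdir(proc_root) if name.isdigit()]
    return max(pids) if pids else 0
```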
A quick check on the host machine showed that the PID existed, and its cmdline attribute showed that the process was a server running as a container in Minikube.
So what is the difference?
In a real Kubernetes environment, the host machine is the one that runs all the containers, so we can rely on its /proc directory containing all the processes. In Minikube (with the Docker driver), however, there is a single Docker container that runs all the other containers (a.k.a. the mother of all containers). So when our eBPF container mounts the /proc directory, it actually mounts the /proc directory of the mother of all containers, which is different from the /proc directory of the host machine.
We tried to mount /proc from the host machine into the mother of all containers, so we would then be able to mount it properly into our own container. A quick check of Minikube’s documentation revealed that the minikube mount command allows us to add mounts.
But when we tried to run it, it ended with weird errors. For example, sudo ls -lh /proc/self on our host machine gave us:
In Minikube, we got:
An internal directory was mounted as a file? That was unexpected…
We checked more directories and saw the same weird behavior.
Getting back to the documentation, we saw that:
9P mounts are flexible and work across all hypervisors, but suffers from performance and reliability issues when used with large folders (>600 files).
The /proc directory is too big for Minikube to mount over 9P, and thus internal directories are not mounted properly.
Further checks led us to run Minikube with a different command line: minikube start --mount --mount-string /proc:/host-proc --driver=docker. Mounting /host-proc from the mother of all containers into our container gave us the correct behavior we were expecting.
Although Minikube is a “local Kubernetes” environment, there is still a big difference between Minikube and Kubernetes. For most users and use cases it won’t be harmful, but if you rely on mounts or use eBPF, be aware of those differences.