
Handling the Challenge of Deploying eBPF into the Wild

By Guy Arbitman and Omid Azizi

This is the second in a series of posts in which we explain how to ship your eBPF program as a single, portable container. The first post was written by my collaborator Omid Azizi from New Relic (Pixie Labs), who covered building an eBPF program with the BCC toolkit and the portability problems that result. In this post, we continue the discussion with a focus on libbpf and CO-RE.

Brief recap

Before CO-RE was introduced, deploying an eBPF program on a user’s machine using a framework like BCC required the following:

  • The eBPF program

  • The clang compiler, to compile the eBPF code on the host machine

  • Linux kernel headers for the host machine’s kernel

In the BCC flow, when a user runs the eBPF program, BCC compiles the eBPF source code on the target using Clang and the host’s kernel headers, and then loads the eBPF program into the kernel. This process needs to be done on the target host to make sure any Linux structs accessed by the eBPF program are consistent with the running kernel.

The major pitfall is, of course, the dependence on kernel headers, which might be missing.

The previous blog post suggested several solutions to handling this problem, but without the kernel headers, it’s hard to reliably deploy an eBPF program.

So how can we deal with the portability issues?

libbpf+CO-RE to the rescue

To resolve the challenge of having to compile the eBPF code on the target machine, eBPF developers created CO-RE. 

CO-RE (Compile Once - Run Everywhere) is a relatively new approach in the eBPF world in which we compile our eBPF module only once. The compiled binary is then shipped to users, and all we need on the target is BTF (BPF Type Format) information, which tells us how to adjust our eBPF bytecode to match the underlying kernel.

BPF Type Format

BTF stands for BPF Type Format, which the kernel documentation describes as:

The metadata format which encodes the debug info related to BPF program/map.

In other words, a kernel can expose BTF data describing each of its structs. That data is enough to locate a specific field within a struct, so the same compiled code can be shipped from kernel to kernel without changes.

Every kernel that supports BTF exposes the data using a file located at /sys/kernel/btf/vmlinux.
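As a minimal sketch, here is how a loader might check whether the host exposes BTF at that location. The function names are ours (hypothetical), and the check assumes a little-endian host. The magic number 0xEB9F and version 1 come from the kernel's struct btf_header.

```python
import struct

BTF_MAGIC = 0xEB9F  # magic value at the start of struct btf_header
BTF_VERSION = 1     # the only BTF version defined so far


def looks_like_btf(blob: bytes) -> bool:
    """Return True if blob starts with a BTF header (little-endian assumed)."""
    if len(blob) < 4:
        return False
    # Header layout begins with: u16 magic, u8 version, u8 flags
    magic, version, _flags = struct.unpack_from("<HBB", blob, 0)
    return magic == BTF_MAGIC and version == BTF_VERSION


def host_has_btf(path: str = "/sys/kernel/btf/vmlinux") -> bool:
    """Check whether the running kernel exposes BTF at the given path."""
    try:
        with open(path, "rb") as f:
            return looks_like_btf(f.read(4))
    except OSError:
        # Missing file or no permission: treat as "no built-in BTF"
        return False
```

Calling host_has_btf() with no arguments checks the default location; passing a path lets the same check validate a BTF file obtained some other way.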

With CO-RE, instead of needing the host's Linux headers, we need the host's BTF file. Now you might think we've just replaced one problem with another, but there's light at the end of the tunnel.

BTFHub initiative

So far, we have established that BTF files are crucial for the CO-RE mechanism. Unfortunately, they are only built in on relatively recent kernels that were compiled with BTF support enabled. So what can we do if our machine does not have a built-in BTF file? We can try to generate one!

What do we need to do to generate a BTF file for a certain kernel?

  • Find the FTP server holding the debug packages for your OS

  • Download the index file which has the metadata of each package

    • Name

    • Location in the server

  • Look for your kernel debug symbols package in the index file

  • If you have a match, download the debug symbols

  • Extract from the debug symbols the vmlinux file

  • Use pahole to extract the BTF file
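The last step above can be sketched as follows. This assumes pahole 1.22 or newer, which provides the --btf_encode_detached option; the helper names and file paths are hypothetical, and locating and unpacking the debug symbols package is distribution-specific and left out.

```python
import subprocess
from typing import List


def pahole_btf_command(vmlinux_path: str, btf_out_path: str) -> List[str]:
    """Build the pahole command that writes BTF into a standalone file.

    Assumption: pahole >= 1.22, which supports --btf_encode_detached.
    """
    return ["pahole", f"--btf_encode_detached={btf_out_path}", vmlinux_path]


def generate_btf(vmlinux_path: str, btf_out_path: str) -> None:
    """Run pahole on an unstripped vmlinux extracted from the debug package."""
    subprocess.run(pahole_btf_command(vmlinux_path, btf_out_path), check=True)
```

For example, generate_btf("vmlinux", "5.11.0-1021-gcp.btf") would produce a BTF file for that kernel, provided the vmlinux file still carries its DWARF debug info.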

The process might take a while, and it is not easy for most users. Thankfully, there is a wonderful initiative called BTFHub, powered by Aqua Security and maintained by Rafael David Tinoco. BTFHub maintains scripts that download and generate BTF files for the kernels of Ubuntu, Debian, Amazon Linux 2, CentOS, and Fedora. Furthermore, the project maintains a dedicated repository that stores the generated BTF files!

So problem solved! Let's download the BTFHub-archive repository, put it in our Docker image, and voilà!

But each BTF file weighs approximately 3-4MB in compressed form (and ~25MB after decompression), and there are thousands of BTFs as of today.

So we can't actually bundle the entire hub. Fortunately, the BTFHub initiative handles that problem too.

Instead of keeping each BTF file in full, we can extract only the data our BPF program actually uses. The resulting BTFs are much smaller, so we can ship a whole directory of them in our Docker image at a small cost.

In the BTFHub repository, you'll find a bash script called btfgen.sh, which receives a BPF file and generates custom BTF files that hold only the data relevant to your BPF.

In our scenario, each custom BTF weighs 1.3KB instead of 4MB.

Bottom line: by using BTFHub, we can access BTFs for a large number of kernels, and we can pack a directory with all the BTFs we know about into less than 3MB (1.3KB per file × 2000 BTFs that exist today).
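To make the saving concrete, here is the arithmetic behind those numbers as a small sketch. It uses the figures quoted in this post and decimal units (1MB = 1000KB); the function name is ours.

```python
def total_size_mb(per_file_kb: float, count: int) -> float:
    """Total size in MB for `count` files of `per_file_kb` each (1 MB = 1000 KB)."""
    return per_file_kb * count / 1000


BTF_COUNT = 2000  # approximate number of BTFs mentioned in this post

custom_mb = total_size_mb(1.3, BTF_COUNT)       # trimmed BTFs, 1.3KB each
full_low_mb = total_size_mb(3000, BTF_COUNT)    # full BTFs at 3MB compressed
full_high_mb = total_size_mb(4000, BTF_COUNT)   # full BTFs at 4MB compressed

print(f"custom BTFs: {custom_mb:.1f} MB")
print(f"full BTFs:   {full_low_mb / 1000:.0f}-{full_high_mb / 1000:.0f} GB")
```

That is roughly 2.6MB for the trimmed directory versus 6-8GB for the full compressed archive.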

A real-life use case

After discovering BTFHub and adding a step to our Docker build process that generates a custom BTF directory, we started shipping our new version. Everything worked great for a couple of days, until we noticed that our product had stopped running. A short debug session revealed that our product could no longer find a suitable BTF. It was weird: a few days earlier it had found its BTF, so why was it failing now?

We checked the kernel version and found that it had been upgraded to a newer one.

For all affected clients, the root cause was the same: they had released a new version of their own product, which triggered a re-creation of the deployment environment, and new machines were created from scratch in the process. Those machines ran the latest kernel versions, which had been released between our product's installation and the day it stopped working.

We considered how to handle such a scenario in the future and came up with two solutions:

  • Rebuild our Docker images every day

    • Allows us to create updated images with all recent BTFs

    • Problematic, as we would need to rebuild every image version, including old ones still in use by some customers

  • Create an online server that holds the BTFHub repo and updates it every day

We chose the second solution, and we called it BTFHub online.

BTFHub Online

The solution is very simple: we create an online server that exposes a single REST API for pulling a customized BTF.

The API expects to receive both the BPF CO-RE binary, and the exact BTF version (distribution name, distribution version, arch, and kernel version). If the BTF exists, we try to create a trimmed BTF for the given BPF binary.

We periodically update the BTFs, thus keeping the server as up-to-date as possible. The outcome is an online and updated server, able to create a custom BTF for a given BPF binary.
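A client call to such a server could look like the sketch below. The host name, the /api/v1/customize path, and the query parameter names are all hypothetical placeholders, not the actual btfhub-online API; the code only builds the request and does not send it.

```python
from urllib.parse import urlencode
from urllib.request import Request

# Hypothetical server address and endpoint, for illustration only.
BTFHUB_ONLINE_URL = "https://btfhub-online.example.com"
CUSTOMIZE_PATH = "/api/v1/customize"


def build_custom_btf_request(bpf_object: bytes, distro: str, distro_version: str,
                             arch: str, kernel: str,
                             base_url: str = BTFHUB_ONLINE_URL) -> Request:
    """Build a POST request carrying the BPF CO-RE binary and the BTF identity."""
    query = urlencode({
        "distro": distro,                  # e.g. "ubuntu"
        "distro_version": distro_version,  # e.g. "20.04"
        "arch": arch,                      # e.g. "x86_64"
        "kernel": kernel,                  # e.g. "5.11.0-1021-gcp"
    })
    return Request(
        url=f"{base_url}{CUSTOMIZE_PATH}?{query}",
        data=bpf_object,  # the compiled BPF object, sent as the request body
        headers={"Content-Type": "application/octet-stream"},
        method="POST",
    )
```

Passing the returned object to urllib.request.urlopen would send it; on success, the response body would be the trimmed BTF for that kernel.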

So what is our full approach regarding the BTF?

Our recipe for an eBPF module

As you can tell by now, BTF is powerful, but we cannot guarantee that the host machine supports it out of the box. We have therefore built a BTF search-and-load mechanism with four layers, to make sure we have tried everything before we give up.

Configurable location

The idea here is to have a configurable field specifying the location (on the local disk) of a BTF to use.

It exists for the day we need to manually tell our product to trust us and use a specific BTF.

When should it be applied? For example, when there is no BTF for your specific kernel version, but there is one for the previous minor release. This one is tricky: you need to be certain that the two BTFs do not differ in any way that matters for your usage!

Default Linux location

As we mentioned earlier, some operating systems (starting from a certain release) support BTF out of the box by exposing a BTF file at /sys/kernel/btf/vmlinux. So our product checks that location, and if the BTF exists, we use it!

Local BTFHub directory in our docker

During our Docker build process, we generate a directory with custom BTFs for our current BPF binary.

Thanks to the BTFHub initiative, the custom BTFs are relatively small (about 1.3KB per file), so we can store ~2000 BTFs in our container at a cost of under 3MB. That's amazing!

So our product extracts four fields from the host machine:

  • Distribution name (Ubuntu, Debian, Amazon, COS, CentOS, Fedora, etc.)

  • Distribution version (20.04 or 18.04 for Ubuntu, 2 for Amazon, 89 for COS, etc.)

  • Kernel arch (x86_64 or arm64)

  • Kernel version (5.11.0-1021-gcp for example)

We look for the relevant BTF in the local BTFHub directory (at <distro name>/<distro version>/<arch>/<kernel version>.btf). If it exists, we use it.
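Here is a minimal sketch of that lookup. It assumes the distribution name and version come from the ID and VERSION_ID fields of /etc/os-release, and that the directory uses the same naming; the function names and the arch alias table are ours (hypothetical).

```python
import platform
from pathlib import Path
from typing import Dict

# Assumption: the BTF directory uses "arm64" where the kernel reports "aarch64".
ARCH_ALIASES = {"aarch64": "arm64"}


def parse_os_release(text: str) -> Dict[str, str]:
    """Parse KEY=value lines from os-release content, stripping quotes."""
    fields = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        fields[key] = value.strip().strip('"').strip("'")
    return fields


def local_btf_path(root: str, distro: str, distro_version: str,
                   arch: str, kernel: str) -> Path:
    """Build <root>/<distro name>/<distro version>/<arch>/<kernel version>.btf."""
    return Path(root) / distro / distro_version / arch / f"{kernel}.btf"


def host_btf_path(root: str, os_release_file: str = "/etc/os-release") -> Path:
    """Derive the four fields from the running host and build the lookup path."""
    info = parse_os_release(Path(os_release_file).read_text())
    machine = platform.machine()
    return local_btf_path(
        root,
        info.get("ID", ""),
        info.get("VERSION_ID", ""),
        ARCH_ALIASES.get(machine, machine),
        platform.release(),  # kernel version, same as `uname -r`
    )
```

On an Ubuntu 20.04 host running kernel 5.11.0-1021-gcp on x86_64, host_btf_path("btfs") would point at btfs/ubuntu/20.04/x86_64/5.11.0-1021-gcp.btf.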

Fetching new versions from online BTFHub

If we reach this layer, we have not found a suitable BTF yet. So we query the online, up-to-date BTFHub server for a BTF that matches the distribution name, distribution version, arch, and kernel version.

If the online BTFHub has that BTF, we pull it and use it.
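Putting the four layers together, the search order can be sketched like this. The function and parameter names are hypothetical, and the online step is passed in as a callback so the sketch stays independent of any particular server API.

```python
import os
from typing import Callable, Optional


def resolve_btf(configured_path: Optional[str],
                local_candidate: Optional[str],
                fetch_online: Callable[[], Optional[str]],
                system_path: str = "/sys/kernel/btf/vmlinux") -> Optional[str]:
    """Return the path of the first BTF found, trying the four layers in order."""
    # Layer 1: a location set explicitly in the configuration
    if configured_path and os.path.isfile(configured_path):
        return configured_path
    # Layer 2: the default location exposed by BTF-enabled kernels
    if os.path.isfile(system_path):
        return system_path
    # Layer 3: the custom BTF bundled in the container for this exact kernel
    if local_candidate and os.path.isfile(local_candidate):
        return local_candidate
    # Layer 4: ask the online server; returns a downloaded path or None
    return fetch_online()
```

A None result means every layer failed, and the caller can then report that no BTF is available for this kernel.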

Source code

btfhub-online

Why do we use the CO-RE mechanism?

We’ll keep it simple:

  1. Container size, which might be critical for some customers and environments

  2. Portability issues

    1. Consider your customer's experience when you find out that your product cannot run on their machine and you need to add more code to support their environment.

    2. Consider the time spent to find the issues and fix them for each kernel.

  3. libbpf+CO-RE adoption is growing, and there are initiatives, such as the L3AF project, that aim to use CO-RE modules only.

References

This article does not cover all aspects of BTF and CO-RE, so I suggest you read the great articles below.

BTFHub source code and archive repository: