XDP and eBPF for Network Observability with Python

I’ve been experimenting with XDP and eBPF in my lab to see whether it is possible to create NetFlow/IPFIX style flow logs for network observability purposes. Of course this is possible, but is it achievable in a few hours for the average Joe?

In my previous article I discussed what eBPF and XDP are and how they can be used for observability, security, and kernel behavioural changes. This article takes that further with some working code. I chose Python for my examples because it makes clear which parts of the code are user-land code, and which parts are eBPF (kernel-land code).

The example in this article is an expansion of Liz Rice’s eBPF beginners guide and book Learning eBPF (chapter 8, page 147). In this example we are going to extend the packet counter, which keeps a record of how many ICMP, TCP, and UDP packets have arrived on a hard-coded interface.

Enhancements:

  1. Separate Python and C into separate files for clarity.
  2. List interfaces available for selection (as opposed to hard-coding eth0).
  3. Detach the XDP/eBPF code from the interface on ctrl-c (keyboard interrupt).
  4. Display a list of source IP addresses that have sent IP traffic.
  5. Create an additional BPF_HASH to store a packet counter per source IP address.
  6. Create a function to extract the source IP address from the packet header.
  7. Convert an unsigned integer to a dotted decimal address.

This code is for learning purposes and there is no guarantee that it will not break your system, although safety is one of the premises of eBPF. I’ve used Debian Bookworm in a VM (KVM) with only a minimal set of packages installed. You will need at least the following additional packages for development purposes:

  1. python3-bpfcc
  2. linux-headers-6.1.0-17-amd64
  3. bpftool
  4. xdp-tools
  5. vim
  6. sudo

Sudo isn’t strictly required, but XDP code must be run as the super-user or with the correct capabilities (CAP_SYS_ADMIN, CAP_BPF, etc.). Setting capabilities is beyond the scope of this article, so we will run our code using sudo.

We are going to have three parts to our code split into three different files (packet.py, packet.c, and packet.h) to make the purpose of each piece of code clear. When packet.py is run, it will automatically compile packet.c/packet.h to eBPF byte-code and ensure that passes validation (safety check) prior to attaching the code to the selected interface.

Example Run

leigh@ebpf:~/ebpf/ebpf-xdp-example$ sudo python3 packet.py
[sudo] password for leigh:
Select your interface:
lo
ens18
Interface name: ens18
Protocol 1: counter 2,Protocol 17: counter 2,
Sources 192.168.20.97: counter 1,Sources 192.168.20.12: counter 2,Sources 192.168.20.1: counter 1,
Protocol 1: counter 4,Protocol 6: counter 2,Protocol 17: counter 4,
Sources 192.168.20.97: counter 3,Sources 192.168.20.12: counter 6,Sources 192.168.20.1: counter 1,
Protocol 1: counter 6,Protocol 6: counter 4,Protocol 17: counter 5,
Sources 192.168.20.97: counter 3,Sources 192.168.20.12: counter 10,Sources 192.168.20.1: counter 1,Sources 192.168.20.139: counter 1,
Protocol 1: counter 8,Protocol 6: counter 5,Protocol 17: counter 6,
Sources 192.168.20.23: counter 1,Sources 192.168.20.97: counter 3,Sources 192.168.20.12: counter 13,Sources 192.168.20.1: counter 1,Sources 192.168.20.139: counter 1,
Protocol 1: counter 10,Protocol 6: counter 6,Protocol 17: counter 6,
Sources 192.168.20.23: counter 1,Sources 192.168.20.97: counter 3,Sources 192.168.20.12: counter 16,Sources 192.168.20.1: counter 1,Sources 192.168.20.139: counter 1,
Protocol 1: counter 12,Protocol 6: counter 7,Protocol 17: counter 9,
Sources 192.168.20.23: counter 1,Sources 192.168.20.97: counter 3,Sources 192.168.20.37: counter 2,Sources 192.168.20.12: counter 20,Sources 192.168.20.1: counter 1,Sources 192.168.20.139: counter 1,
^CDetaching
Exiting
leigh@ebpf:~/ebpf/ebpf-xdp-example$

packet.py

This file is the orchestrator and user-land component of our code. The get_interfaces code asks the user which interface to use. The start_monitoring code attaches (and detaches) the code to the selected interface and subsequently checks the eBPF maps for the latest data from the kernel-land code.

Source

#!/usr/bin/python3
import socket
import struct
import sys
from time import sleep

from bcc import BPF


def get_interfaces():
    """Print the available interfaces and return the one the user selects."""
    print("Select your interface:")
    # socket.if_nameindex() returns a list of (index, name) tuples
    interfaces = socket.if_nameindex()
    for _, name in interfaces:
        print(name)
    val = input("Interface name: ")
    if val in (name for _, name in interfaces):
        return val
    print("invalid interface name")
    sys.exit(1)


def start_monitoring(interface):
    b = BPF(src_file="packet.c")
    b.attach_xdp(dev=interface, fn=b.load_func("packet_counter", BPF.XDP))
    try:
        while True:
            sleep(2)
            s = ""
            for k, v in b["packets"].items():
                s += "Protocol {}: counter {},".format(k.value, v.value)
            print(s)
            source = ""
            for k, v in b["sources"].items():
                source += "Sources {}: counter {},".format(socket.inet_ntoa(struct.pack('<L', k.value)), v.value)
            print(source)
    except KeyboardInterrupt:
        print("Detaching")
        # Detach from the interface passed in, not the module-level variable
        b.remove_xdp(interface, 0)
        print("Exiting")

if __name__ == "__main__":
    iface = get_interfaces()
    start_monitoring(iface)
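
The address conversion inside start_monitoring is worth isolating. The kernel code stores iph->saddr in network byte order, and on a little-endian host such as x86-64 it arrives in Python as an integer whose little-endian packing restores the original four bytes. A minimal sketch, where u32_to_dotted is a hypothetical helper name rather than part of packet.py, and a little-endian host is assumed:

```python
import socket
import struct


def u32_to_dotted(value):
    """Convert a saddr value read on a little-endian host to dotted decimal."""
    # Mask to 32 bits because the map key is a u64 holding a 32-bit address
    packed = struct.pack("<L", value & 0xFFFFFFFF)
    return socket.inet_ntoa(packed)


print(u32_to_dotted(0x6114A8C0))  # 192.168.20.97
```

On a big-endian host the pack format would need to be '>L' instead, which is why the byte order is spelled out explicitly here.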

packet.c

This file contains the XDP function that will be compiled to eBPF byte code and run in kernel space attached to an interface. For every packet that comes into this interface, the code will check that the EtherType is IPv4 and, if so, record the next-protocol byte (TCP, UDP, ICMP), which is the field matched by the Wireshark display filter ip.proto.
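
The protocol byte is just a number, so the user-land side may want to translate it before printing. A small sketch of that idea, assuming only the three protocols counted in this article matter; PROTOCOL_NAMES and protocol_name are hypothetical names and are not used by packet.py:

```python
# IANA-assigned IP protocol numbers for the three protocols counted here
PROTOCOL_NAMES = {1: "ICMP", 6: "TCP", 17: "UDP"}


def protocol_name(number):
    """Return a readable name, falling back to the raw number."""
    return PROTOCOL_NAMES.get(number, "proto-{}".format(number))


print(protocol_name(17))  # UDP
```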

Two BPF hashes are defined. These are the eBPF maps we use to pass kernel-land data to the user-land Python code.
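
On the Python side, each map behaves like a dictionary whose keys and values are ctypes integers exposing a .value attribute, which is why packet.py reads k.value and v.value. The sketch below sorts the entries by counter. Here sorted_counters is a hypothetical helper, and plain ctypes.c_uint64 objects stand in for a real BCC table so that the example runs without loading any eBPF code:

```python
import ctypes


def sorted_counters(items):
    """Return (key, count) tuples sorted by count, highest first."""
    pairs = [(k.value, v.value) for k, v in items]
    return sorted(pairs, key=lambda pair: pair[1], reverse=True)


# Stand-in for b["packets"].items(); values taken from the example run
fake_items = [
    (ctypes.c_uint64(1), ctypes.c_uint64(12)),
    (ctypes.c_uint64(6), ctypes.c_uint64(7)),
    (ctypes.c_uint64(17), ctypes.c_uint64(9)),
]
print(sorted_counters(fake_items))  # [(1, 12), (17, 9), (6, 7)]
```

With a loaded program, the same function could presumably be called as sorted_counters(b["sources"].items()).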

Source

#include "packet.h"

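// BPF_HASH defaults to u64 keys and u64 values
BPF_HASH(packets);  // protocol number -> packet count
BPF_HASH(sources);  // source IPv4 address -> packet count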

int packet_counter(struct xdp_md *ctx) {
    u64 counter = 0;
    u64 source_counter = 0;
    u64 key = 0;
    u64 source = 0;
    u64 *p;
    u64 *s;

    key = lookup_protocol(ctx);
    if (key != 0) {
        p = packets.lookup(&key);
        if (p != 0) {
            counter = *p;
        }
        counter++;
        packets.update(&key, &counter);

        source = lookup_source(ctx);
        s = sources.lookup(&source);
        if (s != 0) {
            source_counter = *s;
        }
        source_counter++;
        sources.update(&source, &source_counter);

    }

    return XDP_PASS;
}

packet.h

packet.h is the C header file which contains two inline functions to assist with decoding the IP headers. This file is far from efficient, since both functions repeat the same bounds checks, but it demonstrates how to dissect data from an xdp_md struct.

Source

#include <linux/ip.h>

#define IP_ADDRESS(x) (unsigned int)(172 + (17 << 8) + (0 << 16) + (x << 24))

// Returns the protocol byte for an IP packet, 0 for anything else
static __always_inline u64 lookup_protocol(struct xdp_md *ctx)
{
    u64 protocol = 0;

    void *data = (void *)(long)ctx->data;
    void *data_end = (void *)(long)ctx->data_end;
    struct ethhdr *eth = data;
    if (data + sizeof(struct ethhdr) > data_end)
        return 0;

    // Check that it's an IP packet
    if (bpf_ntohs(eth->h_proto) == ETH_P_IP)
    {
        // Return the protocol of this packet
        // 1 = ICMP
        // 6 = TCP
        // 17 = UDP
        struct iphdr *iph = data + sizeof(struct ethhdr);
        if (data + sizeof(struct ethhdr) + sizeof(struct iphdr) <= data_end)
            protocol = iph->protocol;
    }
    return protocol;
}

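// Returns the source address (network byte order) of an IPv4 packet.
// Returns 0 if the Ethernet header is truncated, 1 for anything else.
static __always_inline u64 lookup_source(struct xdp_md *ctx)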
{
    u64 source = 1;

    void *data = (void *)(long)ctx->data;
    void *data_end = (void *)(long)ctx->data_end;
    struct ethhdr *eth = data;
    if (data + sizeof(struct ethhdr) > data_end)
        return 0;

    // Check that it's an IP packet
    if (bpf_ntohs(eth->h_proto) == ETH_P_IP)
    {
        struct iphdr *iph = data + sizeof(struct ethhdr);
        if (data + sizeof(struct ethhdr) + sizeof(struct iphdr) <= data_end)
            source = iph->saddr;
    }
    return source;
}
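
To make the offsets used by the two helpers concrete, the same dissection can be modelled in user-land Python on a raw byte string. This is only an illustrative sketch: parse_frame and the hand-built test frame are hypothetical, and the constants assume an untagged Ethernet frame carrying an IPv4 header, as packet.h does.

```python
import socket
import struct

ETH_HLEN = 14        # sizeof(struct ethhdr)
IP_MIN_HLEN = 20     # sizeof(struct iphdr) without options
ETH_P_IP = 0x0800    # EtherType for IPv4


def parse_frame(frame):
    """Return (protocol, source address) for an IPv4 frame, else None."""
    # Same bounds check as data + sizeof(struct ethhdr) > data_end
    if len(frame) < ETH_HLEN:
        return None
    (ethertype,) = struct.unpack("!H", frame[12:14])
    if ethertype != ETH_P_IP:
        return None
    # Same bounds check as the one guarding the iphdr access
    if len(frame) < ETH_HLEN + IP_MIN_HLEN:
        return None
    ip = frame[ETH_HLEN:ETH_HLEN + IP_MIN_HLEN]
    protocol = ip[9]                      # iph->protocol
    source = socket.inet_ntoa(ip[12:16])  # iph->saddr
    return protocol, source


# Hand-built test frame: zeroed MAC addresses, then a minimal IPv4 header
ip_header = struct.pack("!BBHHHBBH4s4s", 0x45, 0, 20, 0, 0, 64, 17, 0,
                        socket.inet_aton("192.168.20.12"),
                        socket.inet_aton("192.168.20.1"))
frame = bytes(12) + struct.pack("!H", ETH_P_IP) + ip_header
print(parse_frame(frame))  # (17, '192.168.20.12')
```

The length checks mirror the comparisons against data_end in the C code, which the eBPF verifier requires before any packet bytes are read.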

Manually detaching XDP code

If the script exits without detaching, the XDP program can be removed manually using xdp-loader (from the xdp-tools package).

leigh@ebpf:~/ebpf/ebpf-xdp-example$ sudo xdp-loader status
CURRENT XDP PROGRAM STATUS:

Interface        Prio  Program name      Mode     ID   Tag               Chain actions
--------------------------------------------------------------------------------------
lo                     <No XDP program loaded!>
ens18                  packet_counter    native   81   10540f65ac5626d6

leigh@ebpf:~/ebpf/ebpf-xdp-example$ sudo xdp-loader unload ens18 -a
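
The same clean-up could be scripted from Python. The sketch below simply wraps the command shown above with subprocess. The helper names unload_command and unload_xdp are hypothetical, and the script is assumed to run with sufficient privileges already:

```python
import subprocess


def unload_command(interface):
    """Build the xdp-loader command used above for the given interface."""
    return ["xdp-loader", "unload", interface, "-a"]


def unload_xdp(interface):
    """Run the unload command; raises CalledProcessError on failure."""
    subprocess.run(unload_command(interface), check=True)


print(" ".join(unload_command("ens18")))  # xdp-loader unload ens18 -a
```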

Summary

In the next XDP article we will move closer to collecting a 5-tuple (src-ip, dest-ip, src-port, dest-port, protocol) which could be used to export to a NetFlow/IPFIX collector. If you made it this far, thanks for reading.

