The future of Web Acceleration and Security #2: DPDK

Most of you will have noticed by now that my content is heavily focused on computer networks and systems. And yes, I love these subjects (although my grades might say otherwise). I started this blog because the technology I deal with on a day-to-day basis usually does not have a tutorial aimed at people who do not already have several years of relevant industry experience under their belts. It is not trivial for most newbies (including myself) to understand concepts like SmartNICs, eBPF and the like. And while writing this blog, I try to be as accurate as my knowledge allows me to be.

Source and Attribution

That being said, let us first understand the limitations that made DPDK necessary.

Pre-Requisites

Before we begin, you should already know the basics of what a network packet is and how it is transmitted. If you are not familiar with these, please visit the links below (both are necessary):

Now that you have understood how a packet goes around the internet, let us look at the limitations of that process. To understand them, we will consider a device that is currently receiving data. Here is what happens:

1. The data is travelling in some physical form. It could be electrical signals on a copper wire, light switching on and off in a fibre optic cable or electromagnetic waves for WiFi.
2. This 'signal' is detected by your network card (NIC).
3. The NIC's first job is to parse the signal and convert it into digital packets. This is hardware accelerated and hardly takes any time.
4. Once that is done, the NIC checks the packet's L2 (data link/MAC layer) header to determine whether the packet is meant for itself or for someone else. And unlike the porch pirates that Mark Rober came across, these cards are good and don't touch packets intended for someone else.
5. If the packet is meant for itself, the NIC raises an interrupt, informing the CPU that a new packet is here! (By this point the packet has been moved from the NIC's buffer into main memory via DMA.)
6. The CPU, running kernel code, de-packetises the packet from L2 to L4.
7. Once the packet has been de-packetised up to the transport layer, the data is copied into the intended application's memory.
8. The application processes the L5-L7 data and does whatever it was designed to do.
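To make the L2 check in step 4 a little more concrete, here is a minimal sketch in Python. It assumes a plain Ethernet II frame (6-byte destination MAC, 6-byte source MAC, 2-byte EtherType) and only models unicast and broadcast matching; real NICs also handle multicast and promiscuous mode. The names `parse_ethernet_header` and `nic_accepts` are made up for this illustration and are not part of any driver.

```python
import struct

# The all-ones MAC address is the Ethernet broadcast address.
BROADCAST = bytes.fromhex("ffffffffffff")


def parse_ethernet_header(frame: bytes):
    """Split the 14-byte Ethernet II header into (dst_mac, src_mac, ethertype)."""
    if len(frame) < 14:
        raise ValueError("frame is shorter than an Ethernet header")
    # "!" = network byte order, two 6-byte MACs, one 16-bit EtherType.
    return struct.unpack("!6s6sH", frame[:14])


def nic_accepts(frame: bytes, own_mac: bytes) -> bool:
    """Toy version of the NIC's filter: keep frames addressed to us or to everyone."""
    dst_mac, _src_mac, _ethertype = parse_ethernet_header(frame)
    return dst_mac == own_mac or dst_mac == BROADCAST


# Usage: a frame addressed to our (made-up) MAC, carrying EtherType 0x0800 (IPv4).
own_mac = bytes.fromhex("020000000001")
other_mac = bytes.fromhex("020000000002")
frame = own_mac + other_mac + struct.pack("!H", 0x0800) + b"payload"
print(nic_accepts(frame, own_mac))
```

On real hardware this comparison happens in the NIC itself, which is why packets meant for someone else never cost the CPU anything.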

I hope you found at least one limitation in the above steps! If not, that's okay. That's what learning is all about! So here are the primary limitations:

1. In #5 above, if you have too many packets incoming, you will raise too many interrupts. It's like that annoying over-excited kid in the classroom who constantly keeps raising their hand (you are lucky if you never came across anyone like that) or like the group chat that blew up while you were offline for a few hours.
2. In #7 above, you are essentially copying the data again, which is also very inefficient.
3. A third inefficiency comes from the CPU constantly switching between Ring 0 and Ring 3. Each switch has a roughly fixed cost, so it hurts most when packets are small and the switching overhead outweighs the actual work, as with the packets used for Voice over LTE (VoLTE) on your phones. More information on CPU Rings
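The cost of "copying the data again" can be seen in miniature in Python. This is only an analogy, not what the kernel does: a `bytearray` stands in for the kernel's buffer, `bytes(...)` stands in for the copy into the application's memory, and a `memoryview` stands in for letting the application look at the same memory. The function names are invented for this sketch.

```python
def deliver_by_copy(kernel_buffer: bytearray) -> bytes:
    """Like step 7: the application gets its own duplicate of the data."""
    return bytes(kernel_buffer)


def deliver_by_view(kernel_buffer: bytearray) -> memoryview:
    """The application gets a window onto the same memory; nothing is duplicated."""
    return memoryview(kernel_buffer)


kernel_buffer = bytearray(b"hello from the wire")
copied = deliver_by_copy(kernel_buffer)
view = deliver_by_view(kernel_buffer)

# Overwrite part of the original buffer (same length, so the view stays valid).
kernel_buffer[0:5] = b"HELLO"

print(copied)          # the copy is unaffected because it lives in separate memory
print(view.tobytes())  # the view reflects the change because it shares the memory
```

The copy costs time and memory proportional to the packet size on every single packet; the view costs the same small amount no matter how big the packet is.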

So, what can be done about these?

1. The most obvious solution to the first problem is to yell "Shut up" at the NIC. But Linux (and computers in general) is much more polite. So, instead of yelling "Shut up", the kernel has a feature, known as NAPI (New API), that tells the NIC to send the interrupt only once. The NIC then puts all the upcoming packets in a queue. The CPU reads the data at its discretion, and when it is done reading all the data, it informs the NIC that it can raise the interrupt once again if any new packets come. Wikipedia and Documentation.
2. The second problem has a non-trivial solution called zero-copy. As the name suggests, it avoids copying the data. There are several techniques for this, such as mapping the physical memory where the data is stored into the application's virtual address space. The kernel can then notify the application to read the data from that location.
3. The third problem is the most difficult to solve. The CPU will always switch between Ring 0 and Ring 3 whenever it transitions from executing the kernel's code to the user application's code. And the only way to avoid that is to avoid the kernel altogether.
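The "interrupt once, then read the queue at your own pace" idea from the first solution can be sketched as a toy simulation. This is not kernel code: `ToyNic`, `receive` and `poll` are names invented for the example, and the `budget` parameter is a simplified stand-in for the limit on how many packets the kernel processes per poll.

```python
from collections import deque


class ToyNic:
    """Toy model of a NIC whose receive interrupt is masked while the kernel polls."""

    def __init__(self):
        self.rx_queue = deque()
        self.interrupts_enabled = True
        self.interrupts_raised = 0

    def receive(self, packet):
        """NIC side: queue the packet and interrupt only if interrupts are enabled."""
        self.rx_queue.append(packet)
        if self.interrupts_enabled:
            self.interrupts_raised += 1
            # The kernel's handler masks further receive interrupts and switches to polling.
            self.interrupts_enabled = False

    def poll(self, budget):
        """Kernel side: drain up to `budget` packets; unmask interrupts once the queue is empty."""
        batch = []
        while self.rx_queue and len(batch) < budget:
            batch.append(self.rx_queue.popleft())
        if not self.rx_queue:
            self.interrupts_enabled = True
        return batch


# Usage: a burst of 1000 packets arrives before the kernel gets around to polling.
nic = ToyNic()
for i in range(1000):
    nic.receive(i)

processed = 0
while nic.rx_queue:
    processed += len(nic.poll(budget=64))

print(nic.interrupts_raised, processed)
```

In this toy run the whole burst costs a single interrupt, where the naive scheme from step 5 would have raised one per packet.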

But to realistically avoid the kernel, you have to implement driver code for each NIC in userspace, which involves a lot of low-level instructions. You also have to allow the hardware to talk directly to userspace, which is not very secure. In addition, userspace has no access to interrupts, so you need another method of determining when a packet arrives. You also need to re-create (or copy) the kernel code that de-packetises packets up to L4. These are just some of the things you need to do to solve issue #3.
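One common "other method" of noticing new packets without interrupts is busy polling: the application keeps checking a receive ring that the NIC fills. DPDK's poll mode drivers are built around this idea, but the sketch below is only a toy model in Python and does not use or imitate the DPDK API. `RxRing`, `nic_write`, `rx_burst` and `poll_loop` are hypothetical names chosen for this illustration.

```python
class RxRing:
    """Toy receive ring: the 'NIC' fills slots, the application polls them."""

    def __init__(self, size):
        self.slots = [None] * size
        self.head = 0  # next slot the NIC writes to
        self.tail = 0  # next slot the application reads from

    def nic_write(self, packet) -> bool:
        """NIC side: store a packet, or drop it (return False) if the ring is full."""
        next_head = (self.head + 1) % len(self.slots)
        if next_head == self.tail:
            return False  # one slot is kept empty to tell "full" from "empty"
        self.slots[self.head] = packet
        self.head = next_head
        return True

    def rx_burst(self, max_packets):
        """Application side: take up to `max_packets` packets without waiting."""
        burst = []
        while self.tail != self.head and len(burst) < max_packets:
            burst.append(self.slots[self.tail])
            self.slots[self.tail] = None
            self.tail = (self.tail + 1) % len(self.slots)
        return burst


def poll_loop(ring, handle, iterations, burst_size=32):
    """Poll the ring a fixed number of times; a real loop would spin forever on a dedicated core."""
    handled = 0
    for _ in range(iterations):
        for packet in ring.rx_burst(burst_size):
            handle(packet)
            handled += 1
    return handled


# Usage: the 'NIC' offers ten packets to a ring that can hold seven.
ring = RxRing(8)
accepted = [ring.nic_write(i) for i in range(10)]
seen = []
print(sum(accepted), poll_loop(ring, seen.append, iterations=3, burst_size=4))
```

The trade-off is that a polling loop keeps a CPU core busy even when no packets arrive, which is the price paid for never taking an interrupt or crossing into the kernel.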

Intel realised this problem long ago and came up with the Data Plane Development Kit (DPDK). DPDK addresses all of the problems stated above. I found it easier to explain DPDK in a video, so here is the link to the same.
