Most of you will have noticed by now that my content is heavily focused on computer networks and systems. And yes, I love these subjects (although my grades might say otherwise). I started this blog because the technology I deal with on a day-to-day basis usually does not have a tutorial aimed at people who do not already have several years of relevant industry experience under their belts. It is not trivial for most newbies (including myself) to understand concepts like SmartNICs and eBPF. And while writing this blog, I try to be as accurate as my knowledge allows.
Pre-Requisites
1. The data is travelling in some physical form. It could be electrons on a wire, light switching on and off in a fibre optic cable, or electromagnetic waves for WiFi.
2. This 'signal' is detected by your network card (NIC).
3. The NIC's first job is to decode the signal and convert it into digital packets. This is done in hardware and hardly takes any time.
4. Once that is done, the NIC checks the packet's L2 (data link/MAC layer) header to determine whether the packet is meant for itself or for someone else. And unlike the porch pirates that Mark Rober came across, these cards are good and don't touch packets intended for someone else.
5. If the packet is meant for this NIC, it raises an interrupt to inform the CPU that a new packet is here! (Of course, the packet first lands in the NIC's buffer and is then DMA'd into main memory.)
6. Now the CPU, running kernel code, decapsulates the packet from L2 up to L4.
7. Once the packet has been decapsulated up to the transport layer, the data is copied into the intended application's memory.
8. The application processes the L5-L7 data and does whatever it was designed to do.
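To make the MAC check, the decapsulation and the copy concrete, here is a minimal sketch in Python. It is only an illustration of the steps above, not how the kernel actually does it: the function names (`build_frame`, `receive`), the MAC addresses, IP addresses and ports are all made up for the example, it only understands Ethernet + IPv4 + UDP, and it does not verify any checksums.

```python
import struct

BROADCAST_MAC = b"\xff" * 6
ETHERTYPE_IPV4 = 0x0800
IPPROTO_UDP = 17


def build_frame(dst_mac: bytes, src_mac: bytes, dst_port: int, payload: bytes) -> bytes:
    """Build a toy Ethernet/IPv4/UDP frame (hypothetical helper, checksums left at 0)."""
    udp = struct.pack("!HHHH", 12345, dst_port, 8 + len(payload), 0) + payload
    ip = struct.pack(
        "!BBHHHBBH4s4s",
        0x45, 0, 20 + len(udp),   # version/IHL, DSCP, total length
        0, 0,                     # identification, flags/fragment offset
        64, IPPROTO_UDP, 0,       # TTL, protocol, header checksum (not computed)
        bytes([10, 0, 0, 1]), bytes([10, 0, 0, 2]),
    )
    eth = struct.pack("!6s6sH", dst_mac, src_mac, ETHERTYPE_IPV4)
    return eth + ip + udp


def receive(frame: bytes, my_mac: bytes):
    """Return (dst_port, payload) for a UDP packet addressed to us, else None."""
    # L2: is this frame for us? (the check the NIC does in hardware)
    dst_mac, _src_mac, ethertype = struct.unpack("!6s6sH", frame[:14])
    if dst_mac not in (my_mac, BROADCAST_MAC):
        return None  # not ours, leave it alone
    if ethertype != ETHERTYPE_IPV4:
        return None

    # L3: read the IPv4 header length and the protocol field
    ip_start = 14
    ihl_bytes = (frame[ip_start] & 0x0F) * 4
    if frame[ip_start + 9] != IPPROTO_UDP:
        return None

    # L4: read the UDP header
    udp_start = ip_start + ihl_bytes
    _src_port, dst_port, udp_len, _csum = struct.unpack(
        "!HHHH", frame[udp_start:udp_start + 8]
    )

    # Slicing bytes makes a new object: this is the "copy into the application" step
    payload = frame[udp_start + 8:udp_start + udp_len]
    return dst_port, payload


MY_MAC = bytes.fromhex("020000000001")
OTHER_MAC = bytes.fromhex("020000000002")

print(receive(build_frame(MY_MAC, OTHER_MAC, 5000, b"hello"), MY_MAC))
print(receive(build_frame(OTHER_MAC, MY_MAC, 5000, b"hello"), MY_MAC))
```

The first call returns the destination port and the payload, because the frame is addressed to `MY_MAC`. The second returns `None`, because the frame is for someone else.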
- In step 5 above (the interrupt), if you have too many incoming packets, you will raise too many interrupts. It's like that annoying over-excited kid in the classroom who constantly keeps raising their hand (you are lucky if you never came across anyone like that), or like the group chat that blew up while you were offline for a few hours.
- In step 7 above (the copy into the application's memory), you are essentially copying the data a second time, which is also very inefficient.
- A third inefficiency comes from the CPU constantly switching between Ring 0 (kernel mode) and Ring 3 (user mode). The cost of a switch is the same no matter how big the packet is, so it hurts the most with small packets, such as the ones used for Voice over LTE (VoLTE) on your phones. More information on CPU Rings
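To get a feel for why per-packet costs matter so much with small packets, here is a quick back-of-the-envelope calculation. The 10 Gb/s link speed is my assumption for the example. The 64-byte minimum frame and the 20 extra bytes on the wire per frame (8 bytes of preamble plus 12 bytes of inter-frame gap) are standard Ethernet numbers.

```python
def max_packets_per_second(link_bps: float, frame_bytes: int, wire_overhead_bytes: int = 20) -> float:
    """Upper bound on frames per second for a link carrying back-to-back frames."""
    bits_on_wire = (frame_bytes + wire_overhead_bytes) * 8
    return link_bps / bits_on_wire


LINK_BPS = 10e9  # assumed 10 Gb/s link

for frame_bytes in (64, 1518):
    pps = max_packets_per_second(LINK_BPS, frame_bytes)
    ns_per_packet = 1e9 / pps
    print(f"{frame_bytes:>5}-byte frames: {pps / 1e6:6.2f} Mpps, {ns_per_packet:8.1f} ns per packet")
```

With minimum-size frames, that works out to roughly 14.88 million packets per second, or about 67 ns per packet. If every one of those packets costs an interrupt, a copy and a ring switch, there is very little time left for actual work.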
- The most obvious solution to the first problem is to yell "Shut up" at the NIC. But Linux (and computers in general) is much more polite. So, instead of yelling "Shut up", the kernel has a feature (known as NAPI, the "New API") that tells the NIC to send the interrupt only once. The NIC then puts all the subsequent packets in a queue. The CPU reads the data at its discretion, and when it is done reading all of it, it tells the NIC that it can raise an interrupt again when new packets arrive. See Wikipedia and the kernel documentation for more.
- The second problem has a non-trivial solution called zero-copy. As the name suggests, it avoids copying the data. There are various ways to do this, such as extending the virtual memory space given to an application and mapping it to the physical location where the data is already stored. The kernel can then notify the application to read the data from that location.
- The third problem is the most difficult to solve. The CPU will always switch between Ring 0 and Ring 3 whenever it transitions from executing the kernel's code to the user application code. And the only way to avoid that is to avoid the kernel altogether.
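The first two solutions can be sketched as a toy model in Python. This is nowhere near the real kernel code, and everything in it is an assumption made for the illustration: `count_interrupts` is a made-up simulation of "interrupt once, then poll until the queue is empty", and the `memoryview` part is only an analogy for zero-copy (two parties looking at the same bytes instead of each holding their own copy).

```python
from collections import deque


def count_interrupts(arrivals_per_tick, poll_budget, mitigated):
    """Return (interrupts_raised, packets_processed) for a toy NIC/CPU model.

    arrivals_per_tick: packets arriving in each time step (hypothetical workload)
    poll_budget: packets the CPU drains per time step (must be at least 1)
    mitigated: False = one interrupt per packet, True = interrupt once, then poll
    """
    queue = deque()
    irq_enabled = True
    interrupts = processed = tick = 0
    while tick < len(arrivals_per_tick) or queue:
        arriving = arrivals_per_tick[tick] if tick < len(arrivals_per_tick) else 0
        for _ in range(arriving):
            queue.append(object())
            if not mitigated:
                interrupts += 1          # every packet interrupts the CPU
            elif irq_enabled:
                interrupts += 1          # first packet interrupts the CPU...
                irq_enabled = False      # ...and the NIC is told to stay quiet
        for _ in range(min(poll_budget, len(queue))):
            queue.popleft()              # CPU reads packets at its own pace
            processed += 1
        if mitigated and not queue:
            irq_enabled = True           # queue drained: NIC may interrupt again
        tick += 1
    return interrupts, processed


workload = [4, 0, 0, 3]  # made-up burst pattern
print("per-packet interrupts:", count_interrupts(workload, 2, mitigated=False))
print("interrupt once + poll:", count_interrupts(workload, 2, mitigated=True))

# Zero-copy analogy: a copy is a separate object, a memoryview shares the buffer.
rx_buffer = bytearray(b"HDRpayload")     # pretend this is where the packet was DMA'd
copied = bytes(rx_buffer[3:])            # the application gets its own copy
mapped = memoryview(rx_buffer)[3:]       # the application looks at the same memory
rx_buffer[3] = ord("P")                  # change the underlying buffer
print(copied, bytes(mapped))             # only the mapped view sees the change
```

For this workload, the per-packet version raises 7 interrupts for 7 packets, while the "interrupt once, then poll" version raises only 2, one for each burst. In the second half, `copied` still holds the old bytes while `mapped` sees the modified buffer, which shows that no second copy of the payload was ever made.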