The small click returns (with results) (Part 2/2)

Let us continue from where we left off. That is from the time when your request's packets were delivered to Google's Data Centre.

At Google's Data Centre (Cloud Computing)

  1. Your request has now travelled through multiple routers and switches across the internet and reached a Google Data Centre (DC).
    • This is a specific request, one asking for a search result, and hence it goes all the way to a DC.
    • Suppose this was a generic request, like the homepage of Google or the login page of Facebook, which is static (constant) and does not vary. In that case, your request may not even go all the way to a DC of Google/Facebook and may instead be served by CDNs (Content Delivery Networks, also called Content Distribution Networks).
    • CDNs are servers that store and serve static files like HTML, CSS etc. So, if you open Facebook and log in, the HTML, CSS and JS of the page, i.e. content like the formatting, background colour, feed blocks, sections, icons like home and settings etc., are served by CDNs.
    • The loaded JavaScript then sends requests to the Facebook server for the posts you see in your feed. This content is dynamic: it changes with time or is user-specific, and hence CDNs can't serve it.
  2. At the DC, Google has a load balancer that will determine which server your request will go to.
  3. The DC contains multiple physical servers, most of which run some flavour of Linux. There are distributed computing systems like Kubernetes (k8s for short) and Borg that run on and manage these physical servers.
    • Containers are packages that contain all the components necessary to run a particular program or service. So there would be a container running the 'search' service of
    • Multiple such containers are managed by k8s and are scheduled on the physical servers. There can be 100,000 containers all doing just the one specific task of serving search results.
    • The load balancer determines which container in which physical server will serve your request. It balances the load across all the physical servers so that your request gets served as quickly as possible.
  4. There is another service (maybe another 100,000 containers?) that runs on those servers. This service crawls through the internet, then indexes and stores information about the websites it finds.
    • There can be separate devices with multiple hard drives and SSDs to store a vast amount of information. 
    • They can also have a separate microcontroller that serves the content of those drives over the network to the service that requests it. 
    • That device keeps track of which track and sector each piece of data is stored in. As the image below shows, each drive is divided into layers, tracks and sectors.
    • Another microcontroller, inside the physical hard disk, will do the actual job of reading the bits from each disk-track-sector and sending them to the cluster's microcontroller mentioned above.
  5. At the server, your packets are de-packetised all the way back up to the top layer of the OSI model, and the request is thus determined. The search service will work out which of the hard disk clusters contains the information you have searched for, and ask that cluster's microcontroller to fetch and send the data.
  6. There would be yet another service, this time running some sort of Machine Learning algorithm that would notice your current query and prioritise the results based on your previous queries. 
    • For example, if I have searched a lot about insects and plants in the past, searching 'bugs' will yield results about insects.
    • Instead, if my search history contains a lot of queries regarding computer programming, searching 'bugs' might yield results more relevant to computers and code.
  7. The prioritised results will then be packetised the same way as described in part 1 of the post, and sent back to you the same way. Your request carried a source IP address, that of your computer or of the last router on your path, and that address becomes the destination of the response. Hence the service knows where to respond to.
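
The load-balancing step above (picking which container on which physical server gets your request) can be sketched in a few lines of Python. This is only an illustration under assumptions: it uses a "least busy container wins" rule, which is one common strategy and not necessarily what Google's load balancer does, and the container and server names are made up.

```python
from dataclasses import dataclass


@dataclass
class Container:
    name: str                 # hypothetical container name
    server: str               # hypothetical physical server it is scheduled on
    active_requests: int = 0  # how many requests it is serving right now


def pick_container(containers):
    """Return the container currently serving the fewest requests."""
    return min(containers, key=lambda c: c.active_requests)


def route(containers, query):
    """Send one query to the least busy container and record the extra load."""
    target = pick_container(containers)
    target.active_requests += 1
    return f"{query!r} -> {target.name} on {target.server}"


# Three containers of the 'search' service spread over two physical servers.
pool = [
    Container("search-1", "server-a", active_requests=2),
    Container("search-2", "server-a", active_requests=0),
    Container("search-3", "server-b", active_requests=0),
]

first = route(pool, "bugs")
second = route(pool, "bugs")
print(first)
print(second)
```

The second request lands on a different container than the first because the first one is no longer idle. That is the "balancing" part: no single container (or server) piles up work while others sit free.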

Showing the results to you (Computer Graphics)

  1. This time, the response is received by your WiFi/LAN hardware and stored in its receive buffer. The hardware also raises an interrupt.
  2. This interrupt is served the same way as the previous one, except that the ISR and the Driver will be different this time.
  3. Now, the transport layer (refer to the OSI model) of both the request and the response contains a port number, which identifies the requesting program on your machine. Hence the OS knows exactly which program/process to forward the data to.
  4. Now that Chrome has received your search result, it will parse the HTML, CSS and JS associated with it.
  5. HTML and CSS, as mentioned before, tell the browser what to display and how. So, your browser works out the layout of the page from them.
  6. But your display is just a collection of Red, Green and Blue (RGB) pixels. It does not understand anything except which colour to show and at what intensity (brightness).
  7. So your processor, or better yet, your GPU, determines exactly which pixels need to show which colour and at what intensity, which in itself is a complex task.
    • It is complex because HTML just says "show this text in black colour and font size 12". But which exact pixels to light up, so that a 'd' looks different from a 'cl', is something the CPU/GPU has to determine.
  8. Then those instructions on which pixels to light up are sent to and stored in another piece of hardware called a display buffer, which is then used by a controller attached to the display to actually light up those pixels and display the results.
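
To make the last two steps a little more concrete, here is a minimal Python sketch of turning "show a black 'd'" into pixel values in a display buffer. It is a heavy simplification under assumptions: the 5x7 bitmap for 'd' is hand-drawn for this example, and real browsers use vector fonts, anti-aliasing and the GPU rather than a fixed bitmap like this.

```python
# Hypothetical 5x7 bitmap for the letter 'd': '#' means "light this pixel".
GLYPH_D = [
    "....#",
    "....#",
    ".####",
    "#...#",
    "#...#",
    "#...#",
    ".####",
]

BLACK = (0, 0, 0)        # (R, G, B) intensities, 0 to 255
WHITE = (255, 255, 255)


def make_framebuffer(width, height, colour=WHITE):
    """A display buffer here is just a grid holding one RGB value per pixel."""
    return [[colour for _ in range(width)] for _ in range(height)]


def draw_glyph(fb, glyph, left, top, colour=BLACK):
    """Copy the glyph into the buffer with its top-left corner at (left, top)."""
    for row, line in enumerate(glyph):
        for col, cell in enumerate(line):
            if cell == "#":
                fb[top + row][left + col] = colour


fb = make_framebuffer(width=8, height=9)
draw_glyph(fb, GLYPH_D, left=1, top=1)

# Crude stand-in for the display controller: show dark pixels as '#'.
for row in fb:
    print("".join("#" if pixel == BLACK else "." for pixel in row))
```

Every entry in `fb` is exactly what the display understands: a colour and an intensity for one pixel. Everything else (fonts, sizes, layout) has to be resolved before this point.
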

So, this is how you get a search result back. As you see, there are many steps involving a lot of complexity. I always admire how easy it is to google something and get the results back in a few milliseconds, especially when I am aware of the complex stuff in the background.

P.S. If you are interested in knowing more about the click's journey, it left a travelogue too.

