How does the Interwebz really work?
Ahh, the interwebz, or the internet for the common folk. This magical system was put in place to connect the entire world. The majority of the world uses the internet in a plethora of ways throughout their daily lives. So much so that the United Nations has treated access to the internet as a human rights issue. So of course, with the internet being so important, the majority of the populace must have a grasp of how it works, RIGHT? Well, sadly that's not the case.
My goal is to explain how exactly the internet works in simple words and with a simple analogy: how we receive mail in real life. I will incorporate some of the buzzwords (HTTP, TCP, IP, Ethernet, etc.) so that the explanation will be relatable to common folk.
So let's think about it:
- What's the purpose of the internet?
- To keep us connected.
- How does it keep us connected?
- Via Information! (Let's think of these as the note we want to send to our buddy, or the TV we just ordered)
- How do we represent this Information?
- The information is represented via data (1's and 0's) put into special formats (packets). (Let's think of these as packages & envelopes)
- How does this information move around the internet?
- Via a special set of rules that these packages must obey in order for them to move fast. (How does the post office ensure that our package is sent in the most timely and efficient manner?)
- How do we assure that the information will reach my specific computer?
- Via another special set of rules specific to your local area. (We need to make sure that our package is put in our mailbox and not our neighbor's!)
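To make the "1's and 0's" idea concrete, here is a minimal sketch of how a short note could be turned into bits and back again. The function names are made up for this illustration, and UTF-8 is assumed as the text encoding.

```python
def to_bits(message: str) -> str:
    """Encode text as UTF-8 bytes, then show each byte as eight 1's and 0's."""
    return " ".join(f"{byte:08b}" for byte in message.encode("utf-8"))


def from_bits(bits: str) -> str:
    """Reverse the process: turn groups of eight bits back into text."""
    data = bytes(int(group, 2) for group in bits.split())
    return data.decode("utf-8")


note = "Hi"
bits = to_bits(note)
print(bits)             # 01001000 01101001
print(from_bits(bits))  # Hi
```

Everything we send, whether a note or a TV-sized video, ends up as a long run of bits like this before it is packaged.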
There are many different things that were defined. In order to make things simpler, we'll break them up into smaller logical "layers" and see how these layers connect!
Let me tell you the funniest joke about the internet. There is no real clear-cut "defined" model (some might argue the OSI model, but that is a different discussion). There exists a set of rules and layers which are generally agreed upon. The names might differ from person to person but the ideas stay the same. So with that: The BIG BAD TCP/IP Model.
So far we have defined:
- Information (What are we sending?)
The Application Layer
- Packages (How are we sending it?)
The Transport Layer
- Route (Where is it going?)
The IP (Internet Protocol) Layer
- Your Mailbox (Goes to the right place)
The Link Layer
The beauty is that one layer need not worry about the others. It just needs a general idea. The packager need not worry about the route the package will take or what kind of information it will carry. It just needs to make sure that it follows the general rules and regulations put in place.
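Here is a rough sketch of that layering idea: each layer simply wraps whatever it is handed in its own label and passes it on. The text-based headers, the function names, and the addresses (taken from ranges reserved for documentation) are all invented for illustration; real headers are compact binary structures.

```python
def application_layer(note: str) -> bytes:
    """What are we sending? Just the note itself, as bytes."""
    return note.encode("utf-8")


def transport_layer(payload: bytes, seq: int) -> bytes:
    """How are we sending it? Add a (made-up) sequence label."""
    return f"TRANSPORT seq={seq}|".encode() + payload


def ip_layer(payload: bytes, src: str, dst: str) -> bytes:
    """Where is it going? Add source and destination addresses."""
    return f"IP {src}->{dst}|".encode() + payload


def link_layer(payload: bytes, mac: str) -> bytes:
    """Which mailbox? Add the hardware address of the next stop."""
    return f"LINK to={mac}|".encode() + payload


def wrap(note: str) -> bytes:
    """Pass the note down through all four layers."""
    data = application_layer(note)
    data = transport_layer(data, seq=1)
    data = ip_layer(data, src="192.0.2.10", dst="198.51.100.7")
    return link_layer(data, mac="aa:bb:cc:dd:ee:ff")


def unwrap(frame: bytes) -> str:
    """The receiver peels off the three outer labels, one layer at a time."""
    for _ in range(3):
        _header, _sep, frame = frame.partition(b"|")
    return frame.decode("utf-8")


frame = wrap("hello")
print(frame)
print(unwrap(frame))  # hello
```

Notice that no function looks inside the payload it receives. That is the "need not worry about the others" part.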
So now we have a general idea of the environment and landscape. Let's dig deep into some of the details on the different layers and really see how this crazy thing called the internet works.
Application Layer
This is the layer where the average coder gets to have the most fun. The other layers are already defined and just look for input from the application layer. Here we define what we are sending, or rather, what is in the package. Is it a letter, a note, a TV, a car? The most commonly delivered items are HTTP messages. HTTP (HyperText Transfer Protocol) is responsible for sending simple information (like a webpage written in HTML (HyperText Markup Language), or plain text) and for gathering resources (an image or a video).
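An HTTP message is really just carefully formatted text. As a minimal sketch, here is what a request for a front page could look like; `build_http_get` is a name made up for this example and `www.example.com` is a placeholder host.

```python
def build_http_get(host: str, path: str = "/") -> bytes:
    """Build the text of a simple HTTP/1.1 GET request."""
    lines = [
        f"GET {path} HTTP/1.1",  # what we want
        f"Host: {host}",         # who we are asking
        "Connection: close",     # hang up once the reply is sent
        "",                      # a blank line marks the end of the headers
        "",
    ]
    return "\r\n".join(lines).encode("ascii")


request = build_http_get("www.example.com")
print(request.decode("ascii"))
```

This is the "note" in our mail analogy. Everything below this layer only cares about delivering it, not about what it says.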
Transport Layer
Here we define what the packaging is going to look like. For a letter, a simple envelope will do, but what about for a TV? Hopefully, there will be some bubble wrap. For every package we set limits on things like size. If we didn't, we could clog the internet by sending grossly large packages, which is inefficient and costly.
But what if the mail truck is too small and your package is too big? Good question! The answer is packets! We break up our cargo into smaller packages spread across multiple trucks, and we number them in sequence so that whoever receives our payload can put it back together in an order that makes sense.
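The break-it-up-and-number-it idea can be sketched in a few lines. This is only an illustration: the function names and the tiny 8-byte size limit are made up, and real protocols use far more elaborate bookkeeping.

```python
import random


def split_into_packets(data: bytes, max_size: int) -> list[tuple[int, bytes]]:
    """Cut the data into chunks and number each one in order."""
    return [
        (seq, data[start:start + max_size])
        for seq, start in enumerate(range(0, len(data), max_size))
    ]


def reassemble(packets: list[tuple[int, bytes]]) -> bytes:
    """Sort by sequence number, then glue the chunks back together."""
    ordered = sorted(packets, key=lambda packet: packet[0])
    return b"".join(chunk for _seq, chunk in ordered)


message = b"A TV is too big for one mail truck."
packets = split_into_packets(message, max_size=8)

# Different trucks take different roads, so arrival order is not guaranteed.
random.Random(42).shuffle(packets)

print(reassemble(packets) == message)  # True
```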
When it comes to the internet there are two packaging schemes which shine. Good old reliable TCP (Transmission Control Protocol) packets and fast, quick & easy UDP (User Datagram Protocol) packets.
TCP packets give assurance that they will go wherever they say they are going. They also guarantee that all of the broken-up segments will be put back together into one complete package. These are mission-critical payloads! Think of this as an Olympic relay race, where the baton needs to be passed to the next participant in an efficient manner. If not, the next participant will wait and whine! If we have a web page with text, we need to ensure we receive ALL the text or it won't make sensXXXXXXXXX. (See what I did there? ;D).
Then we have our lousy UDP packets. These are unreliable packages that go fast but might not always make it. Think of these as motorcycles that go a little too fast on the highway and are carrying some part of an image. We'll be pretty mad if we don't get the whole image but we'll be happy if we have a general idea of what the picture is trying to depict. If we have a Skype conversation, it isn't mission critical to have all the pixels of the other attendee's face, and that is why there are blurry images and broken-up audio on occasion.
There are many different types of packages for different situations. For real-time audio and video, such as a live video call, you might use RTP (Real-time Transport Protocol) packets, which usually ride on top of UDP. Though there are other kinds of packets, they aren't as prevalent as the mighty UDP or TCP packets. There is a purpose for every packaging scheme, with some packages needing more priority.
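The difference between the two schemes can be imitated with a pretend network that randomly loses packages. This is a toy simulation, not real TCP or UDP: the function names, the 30% loss rate, and the simple "resend until it arrives" loop are all assumptions made for illustration. Real TCP uses acknowledgements, timers, and much more.

```python
import random


def lossy_send(packet, rng: random.Random, loss_rate: float):
    """Return the packet, or None if the pretend network dropped it."""
    return None if rng.random() < loss_rate else packet


def send_udp_style(packets, rng, loss_rate=0.3):
    """Fire and forget: whatever gets lost stays lost."""
    received = []
    for packet in packets:
        arrived = lossy_send(packet, rng, loss_rate)
        if arrived is not None:
            received.append(arrived)
    return received


def send_tcp_style(packets, rng, loss_rate=0.3, max_tries=50):
    """Resend each packet until it arrives (a stand-in for acknowledgements)."""
    received = []
    for packet in packets:
        for _attempt in range(max_tries):
            arrived = lossy_send(packet, rng, loss_rate)
            if arrived is not None:
                received.append(arrived)
                break
        else:
            raise ConnectionError("gave up after too many retries")
    return received


words = "we need to receive ALL the text".split()
print(" ".join(send_udp_style(words, random.Random(1))))
print(" ".join(send_tcp_style(words, random.Random(1))))
```

The UDP-style sender finishes sooner but may leave holes; the TCP-style sender spends extra trips to make sure nothing is missing.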
Internet Protocol Layer
We have said that packages need a route (the Internet part). Let's figure out how to make them move as fast and as efficiently as possible. Within the IP layer, the rules for how these packages can merge, move, and where they can go are defined. The main points of this layer are:
- Host Addressing and Identification: Every host gets an address (its IP address), but when a package starts moving it has no way of knowing the whole route. Rather, as it moves along, it gains more and more intel on where it should go until finally it is given an exit. Like a parcel, it is sent to a post office that is closer to its actual destination. After that, it is that office's duty to continue the process until the final host is found.
- Packet Routing: Remember how we break the package up if it is too big? It is the IP layer's responsibility to make sure the right packets go to the right hosts (computers or devices). Every package has a sticker on it giving details such as the packet number, where it came from, where it needs to go, and a plethora of other information. This is called the Header of the packet. Whatever is in the package is referred to as the Payload.
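The post-office-to-post-office hand-off can be sketched as a set of small lookup tables, where each stop only knows the next hop. The router names, tables, and addresses below are hypothetical (the addresses come from ranges reserved for documentation), and the packet is a plain dictionary rather than a real IP packet.

```python
import ipaddress

# Hypothetical routing tables: each "post office" only knows its next hop.
ROUTERS = {
    "home-router": {"0.0.0.0/0": "isp-router"},
    "isp-router": {"203.0.113.0/24": "datacenter-router", "0.0.0.0/0": "backbone"},
    "datacenter-router": {"203.0.113.5/32": "DELIVERED"},
}


def next_hop(table: dict, destination: str) -> str:
    """Pick the most specific matching entry (longest prefix wins)."""
    dest = ipaddress.ip_address(destination)
    candidates = [
        (ipaddress.ip_network(prefix), hop)
        for prefix, hop in table.items()
        if dest in ipaddress.ip_network(prefix)
    ]
    if not candidates:
        raise LookupError(f"no route to {destination}")
    _network, hop = max(candidates, key=lambda pair: pair[0].prefixlen)
    return hop


def route(packet: dict, start: str, max_hops: int = 16) -> list:
    """Follow the packet hop by hop and return the path it took."""
    path = [start]
    current = start
    for _ in range(max_hops):
        hop = next_hop(ROUTERS[current], packet["header"]["dst"])
        if hop == "DELIVERED":
            return path
        path.append(hop)
        current = hop
    raise RuntimeError("too many hops, giving up")


packet = {
    "header": {"seq": 0, "src": "192.0.2.10", "dst": "203.0.113.5"},  # the sticker
    "payload": b"GET / HTTP/1.1",                                     # the contents
}
print(route(packet, "home-router"))
# ['home-router', 'isp-router', 'datacenter-router']
```

No single router knows the whole route. Each one reads the header, ignores the payload, and passes the packet one step closer.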
Link Layer
This is when we are closest to the final destination. After the routing (IP Layer), we use technologies such as Ethernet and Wi-Fi, along with switching techniques, to take our payload to the final destination. With the IP layer we have brought our piece of information to the specific post office it needed to go to, and with the Link layer we bring that piece of information to the computer that it needs to be delivered to.
The Link Layer is the only layer which actually talks to hardware and has to worry about things such as network cards, switches, and Ethernet cables. In our analogy, this is where the postman actually comes to your neighborhood and puts your package in the correct mailbox or house.
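One way to picture the postman learning the neighborhood is a simplified network switch: it remembers which port each hardware (MAC) address lives on, and only knocks on every door when it has not seen the address before. The `Switch` class, port numbers, and MAC addresses below are invented for this sketch.

```python
class Switch:
    """A toy learning switch: maps MAC addresses to physical ports."""

    def __init__(self, ports):
        self.ports = list(ports)
        self.mac_table = {}  # MAC address -> port it was last seen on

    def receive(self, in_port, frame):
        """Learn where the sender lives, then decide where the frame goes."""
        self.mac_table[frame["src_mac"]] = in_port
        out_port = self.mac_table.get(frame["dst_mac"])
        if out_port is None:
            # Unknown mailbox: try every port except the one it came from.
            return [port for port in self.ports if port != in_port]
        return [out_port]


switch = Switch(ports=[1, 2, 3])
laptop, printer = "aa:aa:aa:aa:aa:aa", "bb:bb:bb:bb:bb:bb"

print(switch.receive(1, {"src_mac": laptop, "dst_mac": printer}))  # [2, 3]
print(switch.receive(2, {"src_mac": printer, "dst_mac": laptop}))  # [1]
print(switch.receive(1, {"src_mac": laptop, "dst_mac": printer}))  # [2]
```

After one exchange the switch knows both mailboxes, so later frames go straight to the right port.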
That was a lot to digest, but it is truly important to have a conscious understanding of why the interwebz works the way it does. We now know that if we want to go to www.Google.ca, our computer will:
- Make a Note telling Google that we want their front page (Application layer)
- Put it in an appropriate packaging scheme (Transport layer)
- Get it to the closest post office (IP)
- Put the package on Google's doorstep (Link)
- Google will repeat the first four steps with the appropriate response, but now with your computer as the destination
- The www.Google.ca homepage will magically appear on your screen
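The whole round trip can be tried without leaving your own machine. In this minimal sketch a tiny web server on 127.0.0.1 stands in for Google (the `FrontPage` handler and its HTML are made up), and we send it the note by hand over a TCP connection. The operating system quietly takes care of the transport, IP, and link layers for us.

```python
import socket
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer


class FrontPage(BaseHTTPRequestHandler):
    """A stand-in web server that always answers with the same page."""

    def do_GET(self):
        body = b"<h1>Hello from the front page</h1>"
        self.send_response(200)
        self.send_header("Content-Type", "text/html")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, format, *args):
        pass  # keep the example quiet


def fetch_front_page() -> bytes:
    server = HTTPServer(("127.0.0.1", 0), FrontPage)  # port 0: pick any free port
    threading.Thread(target=server.serve_forever, daemon=True).start()
    host, port = server.server_address
    try:
        # Transport layer: open a TCP connection to the server.
        with socket.create_connection((host, port), timeout=5) as conn:
            # Application layer: the note asking for the front page.
            conn.sendall(b"GET / HTTP/1.1\r\nHost: localhost\r\nConnection: close\r\n\r\n")
            chunks = []
            while True:
                chunk = conn.recv(4096)
                if not chunk:  # the server closed the connection: reply complete
                    break
                chunks.append(chunk)
    finally:
        server.shutdown()
        server.server_close()
    return b"".join(chunks)


response = fetch_front_page()
print(response.decode("utf-8"))
```

The printed reply has the same shape as our request: a few header lines, a blank line, and then the payload, which here is the HTML for the page.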
This was a very high-level explanation of how the internet works. There are many different mechanisms and fail-safes along the way which make things more reliable and efficient, but I have neglected them for the sake of simplicity.
The internet is a beautiful system in which we all remain connected regardless of geography. There exist some flaws and inefficiencies which are being resolved as we go (IPv6, HTML5, etc.), but for something that was developed ~45 years ago, it is simply mind-boggling how a simple set of rules has defined our culture and our society.