What happens in that funny little bar up top?
There aren’t many things that ninety-odd percent of Americans can agree on anymore, but the necessity of the internet might be one of them. From communication to information, the internet has become indispensable to work, entertainment, and home life over the last two decades. So how does it work? What happens when you type something into your address bar?
Step 1: DNS cache request
The first thing your browser does after you type, say, https://www.holbertonschool.com into the address bar and press enter is cycle through the various caches (storage repositories, like a weapons cache or survival cache, only digital) it has access to in order to match the domain name with a numerical IP address. This is called a Domain Name System (DNS) lookup. The first cache is the browser cache, which stores DNS records of websites you have previously visited. Second, it checks the cache on your operating system. Next, it checks the cache stored on your router, and finally, if it can’t find the record anywhere on your local network, it asks the DNS resolver run by your Internet Service Provider (ISP). If the ISP’s resolver doesn’t have the record in its cache, it will in turn query other DNS servers until it finds the IP address.
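To make the cache-by-cache lookup concrete, here is a minimal Python sketch. It is only a toy model: the cache names, the `resolve` function, the `fake_isp_resolver` stand-in, and the address 192.0.2.10 (taken from a range reserved for documentation) are all assumptions made up for illustration, not how any real browser stores its records.

```python
def resolve(hostname, caches, upstream):
    """Check each cache in order; fall back to the upstream resolver on a miss."""
    for cache_name, cache in caches:
        if hostname in cache:
            return cache[hostname], cache_name
    # Nothing local knew the answer, so ask the (hypothetical) ISP resolver.
    address = upstream(hostname)
    # Remember the answer everywhere so the next lookup is a cache hit.
    for _cache_name, cache in caches:
        cache[hostname] = address
    return address, "upstream"


def fake_isp_resolver(hostname):
    # Stand-in for a real DNS query; the address is illustrative only.
    return "192.0.2.10"


# Lookup order from the text: browser, then operating system, then router.
caches = [("browser", {}), ("os", {}), ("router", {})]

first = resolve("www.holbertonschool.com", caches, fake_isp_resolver)
second = resolve("www.holbertonschool.com", caches, fake_isp_resolver)
```

The first call has to go all the way upstream, while the second is answered straight from the browser cache. In real Python code you would normally just call `socket.getaddrinfo`, which hands the whole job to the operating system.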
Step 2: TCP connection
Once the correct destination is found, your machine and the server you are trying to access shake hands through TCP, or Transmission Control Protocol. Your system asks the server to synchronize on an open port (SYN). If the server agrees, it synchronizes and acknowledges the request (SYN-ACK), and your machine then sends its own acknowledgment back (ACK). If everything’s successful, data can now be transferred.
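You rarely see the handshake itself, because the operating system performs it for you whenever a program opens a TCP connection. The sketch below, which assumes nothing beyond Python's standard `socket` module, opens a listener on your own machine and connects to it; the function name `handshake_demo` is just an illustrative label.

```python
import socket


def handshake_demo():
    # A listening socket on the loopback address; port 0 lets the OS pick a free port.
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    listener.bind(("127.0.0.1", 0))
    listener.listen(1)
    port = listener.getsockname()[1]

    # create_connection() returns only after the kernel has completed
    # the SYN, SYN-ACK, ACK exchange with the listener.
    client = socket.create_connection(("127.0.0.1", port), timeout=5)
    server_side, _client_address = listener.accept()

    # With the connection established, data can flow.
    client.sendall(b"hello")
    received = server_side.recv(1024)

    client.close()
    server_side.close()
    listener.close()
    return received
```

Calling `handshake_demo()` should hand back the bytes the client sent, which is only possible once the three-step handshake has succeeded.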
Step 3: The Firewall
Now that a connection is established, the data must pass through the firewall, going both ways. The firewall examines the data passed from your machine to the server, or vice versa, for suspicious content. It might also cut things off if the server tries to connect to a non-standard port or if the transfer protocol is being abused in some way.
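Real firewalls are far more elaborate, but the core idea of checking traffic against a list of rules can be sketched in a few lines of Python. Everything here is hypothetical: the `allow_packet` function, the allowed ports, and the blocked address (from a range reserved for documentation) are invented for illustration.

```python
# Hypothetical rule set: only the standard web ports are open,
# and one known-bad address is blocked outright.
ALLOWED_PORTS = {80, 443}
BLOCKED_ADDRESSES = {"203.0.113.7"}


def allow_packet(source_ip, dest_port):
    """Return True if the traffic may pass, False if the firewall drops it."""
    if source_ip in BLOCKED_ADDRESSES:
        return False
    return dest_port in ALLOWED_PORTS
```

Under these made-up rules, ordinary HTTPS traffic on port 443 gets through, while a connection to an unusual port or from the blocked address is dropped.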
Step 4: HTTPS/SSL
Now that a connection is established and being monitored, we need to transfer data. This takes the form of GET or POST requests sent by your machine to the server. These requests are generally sent over HTTP or HTTPS, that is, HyperText Transfer Protocol (Secure). They operate on similar principles, but HTTPS, somewhat obviously, is more secure and less risky for your machine and the server. HTTPS requires the server to present an SSL, or Secure Sockets Layer, certificate. SSL, along with its modern successor TLS (Transport Layer Security), is a cryptographic protocol that encrypts the data in transit, while the certificate proves the server is who it claims to be. Now we’re transferring data securely.
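For the curious, a GET request is just a few lines of text. The Python sketch below builds one by hand and then shows, roughly, how it could be sent through an encrypted connection using the standard `ssl` module. The function names `build_get_request` and `fetch_over_tls` are made up for this example, and real browsers send many more headers than this.

```python
import socket
import ssl


def build_get_request(host, path="/"):
    """Build the raw text of a minimal HTTP/1.1 GET request."""
    lines = [
        f"GET {path} HTTP/1.1",
        f"Host: {host}",
        "Connection: close",
        "",  # a blank line marks the end of the headers
        "",
    ]
    return "\r\n".join(lines).encode("ascii")


def fetch_over_tls(host, path="/", port=443):
    """Send the request over an encrypted connection and return the raw response."""
    # The default context verifies the server's certificate and hostname.
    context = ssl.create_default_context()
    with socket.create_connection((host, port), timeout=10) as raw_sock:
        with context.wrap_socket(raw_sock, server_hostname=host) as tls_sock:
            tls_sock.sendall(build_get_request(host, path))
            chunks = []
            while True:
                data = tls_sock.recv(4096)
                if not data:
                    break
                chunks.append(data)
    return b"".join(chunks)
```

Calling `fetch_over_tls("www.holbertonschool.com")` needs a working internet connection; `build_get_request` on its own does not, so it is a safe way to see exactly what the request looks like.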
Step 5: Load Balancer
Up until this point, the website you’ve been accessing has been referred to as the server, which isn’t quite accurate, at least not usually. Most websites you visit won’t be a single server. This isn’t the ’90s, when half the internet was home-hosted dialup. In all likelihood, you’re trying to access an entire network of servers. The reason for this? There are somewhere around four BILLION internet users, and a vast number of them are trying to access the exact same websites as you. So a single server isn’t going to cut it. In order to access this impressive network of machines hosting huge numbers of web servers, application servers, and databases, you (or your browser) will have to work through the load balancer. The load balancer, well, balances the incoming traffic across the various machines of the server network, compensating for failures and avoiding an overload of the system.
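One simple way a load balancer can share out traffic is round robin: hand each new request to the next server in line, skipping any that are down. The Python sketch below shows that one strategy only; real load balancers offer several. The class name and the server names `web-01` through `web-03` are hypothetical.

```python
class RoundRobinBalancer:
    """Hand out servers in turn, skipping any that are marked as down."""

    def __init__(self, servers):
        self.servers = list(servers)
        self.down = set()
        self._next = 0  # index of the server to try next

    def mark_down(self, server):
        self.down.add(server)

    def mark_up(self, server):
        self.down.discard(server)

    def pick(self):
        # Try each server at most once per call.
        for _ in range(len(self.servers)):
            server = self.servers[self._next]
            self._next = (self._next + 1) % len(self.servers)
            if server not in self.down:
                return server
        raise RuntimeError("no healthy servers available")


balancer = RoundRobinBalancer(["web-01", "web-02", "web-03"])
first_four = [balancer.pick() for _ in range(4)]

# Simulate a failure: web-02 drops out and is skipped from now on.
balancer.mark_down("web-02")
after_failure = [balancer.pick() for _ in range(2)]
```

The first four requests cycle through all three servers and wrap back around to `web-01`. Once `web-02` is marked down, the balancer quietly routes around it.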
Step 6: Web Server
Once your traffic is allocated, your request is answered by a program called the web server, a piece of software designed for the task. It handles the HTTP requests and hands the webpages off to your browser: the HTML and supporting files that your browser uses to make these lovely colors flash on your screen.
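Python ships with a tiny web server in its standard library, which is enough to sketch the idea. The example below serves one fixed page on your own machine and then requests it, playing both server and browser. The page contents and the names `StaticHandler` and `serve_and_fetch` are assumptions made for illustration; production sites use dedicated web server software instead.

```python
import http.client
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer

PAGE = b"<html><body><h1>Hello from the web server</h1></body></html>"


class StaticHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        # Answer every GET request with the same HTML page.
        self.send_response(200)
        self.send_header("Content-Type", "text/html; charset=utf-8")
        self.send_header("Content-Length", str(len(PAGE)))
        self.end_headers()
        self.wfile.write(PAGE)

    def log_message(self, format, *args):
        pass  # keep the demo quiet


def serve_and_fetch():
    # Port 0 asks the OS for any free port on the loopback address.
    server = HTTPServer(("127.0.0.1", 0), StaticHandler)
    port = server.server_address[1]
    threading.Thread(target=server.serve_forever, daemon=True).start()
    try:
        # Play the role of the browser and request the page.
        connection = http.client.HTTPConnection("127.0.0.1", port, timeout=5)
        connection.request("GET", "/")
        response = connection.getresponse()
        status, body = response.status, response.read()
        connection.close()
        return status, body
    finally:
        server.shutdown()
        server.server_close()
```

`serve_and_fetch()` returns the status code and the HTML that came back, which is the same raw material your browser turns into a rendered page.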
Step 7: Application Server
But say you’re not visiting your favorite cooking blog or just falling down the Wikipedia rabbit hole. A lot of websites provide complex services that really can’t be accomplished by simple HTML files alone. This is where application servers come in. Working in tandem with the web server, the application server provides the web application and runs it for you. Rest in peace Adobe Flash, you will be missed.
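The difference from a plain web server is that the response is computed on the spot rather than read from a file. In Python, web applications are commonly written against the WSGI interface, where the application is a function that the application server calls once per request. The sketch below is a hypothetical greeting app; the `name` query parameter is invented for the example.

```python
import html
from urllib.parse import parse_qs


def application(environ, start_response):
    """A minimal WSGI application that builds its page from the request."""
    query = parse_qs(environ.get("QUERY_STRING", ""))
    name = query.get("name", ["stranger"])[0]
    # Escape user input before putting it into HTML.
    body = f"<h1>Hello, {html.escape(name)}!</h1>".encode("utf-8")
    start_response(
        "200 OK",
        [
            ("Content-Type", "text/html; charset=utf-8"),
            ("Content-Length", str(len(body))),
        ],
    )
    return [body]


# Call the app directly, the way an application server would for each request.
captured = {}


def fake_start_response(status, headers):
    captured["status"] = status


page = b"".join(application({"QUERY_STRING": "name=Ada"}, fake_start_response))
```

Two visitors asking for the same address with different names get different pages, which is something a folder of static HTML files cannot do.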
Step 8: Database
But wait, where is all of this coming from? How does this explain things like user accounts and other data? Is that stored on my machine? No, it isn’t. It’s stored in the database held by the server itself. This database is a second program or set of programs that allows easy access to data by the web and application servers so that it can be served to you!
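As a rough sketch of what "stored in the database" means, the example below uses SQLite, a small database engine bundled with Python, to save a user account and look it up again. The `users` table, its columns, and the sample account are all made up for illustration; large sites typically run a separate database server rather than an in-memory SQLite file.

```python
import sqlite3

# An in-memory database keeps the demo self-contained; nothing touches disk.
connection = sqlite3.connect(":memory:")
connection.execute(
    "CREATE TABLE users (id INTEGER PRIMARY KEY, username TEXT UNIQUE, email TEXT)"
)

# The application server would run something like this when you sign up...
connection.execute(
    "INSERT INTO users (username, email) VALUES (?, ?)",
    ("ada", "ada@example.com"),
)
connection.commit()


def find_email(conn, username):
    """...and something like this when it needs your details later."""
    row = conn.execute(
        "SELECT email FROM users WHERE username = ?", (username,)
    ).fetchone()
    return row[0] if row else None


email = find_email(connection, "ada")
missing = find_email(connection, "nobody")
```

The question marks in the queries are placeholders that the database fills in safely, which keeps user-supplied text from being mistaken for SQL commands.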
Congratulations, you can now browse in peace, knowing that the internet is a super complex thing and safe in the knowledge that no one really understands it.