System Architecture Flashcards
What is RESTful?
Representational State Transfer. It is a set of design principles for making network communication more scalable and flexible.
Discuss RESTful Fielding Constraint: Client-server
The first Fielding Constraint specifies that the network must be made up of clients and servers. A server is a computer that has resources of interest, and a client is a computer that wants to interact with the resources stored on the server. When you browse the Internet, your computer is acting as a client and sends HTTP requests to a server in order to access and manipulate information. A RESTful system has to operate in the client-server model, even if a component sometimes acts like a client and sometimes acts like a server.
A non-RESTful alternative to client-server architecture is event-based integration architecture. In this model, each component continuously broadcasts events while listening for pertinent events from other components. There’s no one-to-one communication, only broadcasting and eavesdropping. REST requires one-to-one communication, so event-based integration architecture would not be RESTful.
Discuss RESTful Fielding Constraint: Stateless
Stateless does not mean that servers and clients have no state. It simply means that they do not need to keep track of each other’s state. When a client is not interacting with the server, the server has no idea of its existence. The server also does not keep a record of past requests. Each request is treated as a standalone request that carries everything the server needs to handle it.
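A minimal sketch of what a stateless handler can look like, assuming a request is modelled as a plain dictionary. The function name, the dictionary layout, and the "Bearer demo-token" credential are all hypothetical. The point is only that the server consults nothing except the request in front of it.

```python
def handle_request(request):
    """Answer a request using only the information the request itself carries."""
    token = request.get("headers", {}).get("Authorization")
    # Hypothetical credential check: no session store is consulted, so the
    # client has to send its credentials again with every request.
    if token != "Bearer demo-token":
        return {"status": 401, "body": "missing or invalid credentials"}
    return {"status": 200, "body": f"resource at {request['path']}"}
```

Because nothing is remembered between calls, sending the same request twice produces the same response, and any server instance could answer it.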
Discuss RESTful Fielding Constraint: Uniform Interface
The “uniform interface” constraint ensures that there is a common language between servers and clients that allows each part to be swapped out or modified without breaking the entire system. This is achieved through 4 sub-constraints: identification of resources, manipulation of resources through representations, self-descriptive messages, and hypermedia.
Discuss RESTful Fielding Constraint: Caching
Caching refers to the constraint that server responses should be labelled as either cacheable or non-cacheable. Caching occurs when the client stores previous responses it received from the server, so that when that data is needed again, it can save a round trip over the network by using the cached data. The ability to cache is made possible by the interface constraint of “self-descriptive messages”, since the client knows that all the relevant data about a single resource is being sent in one response. It doesn’t have to worry about accidentally only caching part of the information it needs, and missing other parts.
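As a rough illustration of a client acting on cacheability labels, here is a small sketch that honours only two Cache-Control directives, max-age and no-store. The class name and its methods are assumptions made for this example, and real HTTP caching has many more rules than this.

```python
import time


class ResponseCache:
    """Tiny client-side cache keyed by URL (illustrative only)."""

    def __init__(self, clock=time.monotonic):
        self._entries = {}   # url -> (expiry time, body)
        self._clock = clock  # injectable so expiry can be tested without sleeping

    def store(self, url, headers, body):
        directives = [d.strip().lower()
                      for d in headers.get("Cache-Control", "").split(",")]
        if "no-store" in directives:
            return  # the server labelled this response as non-cacheable
        for directive in directives:
            if directive.startswith("max-age="):
                try:
                    max_age = int(directive.split("=", 1)[1])
                except ValueError:
                    return  # malformed directive: play it safe and skip caching
                self._entries[url] = (self._clock() + max_age, body)
                return

    def lookup(self, url):
        entry = self._entries.get(url)
        if entry is None:
            return None
        expires_at, body = entry
        if self._clock() >= expires_at:
            del self._entries[url]  # stale: the client must go back to the server
            return None
        return body
```

A lookup that returns a body saves the round trip. A lookup that returns None means the client has to ask the server again.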
Discuss RESTful Fielding Constraint: Layered System
Layered system refers to the fact that there can be more components than just servers and clients. This means there can be more than one layer in the system. However, each component is constrained to only see and interact with the very next layer. A proxy is an additional component, and it relays HTTP requests to servers or other proxies. Proxies can be useful for load balancing and security checks. A proxy acts like a server to the initial client that sends the request, and then acts like a client when it relays that request. A gateway is another additional component and it translates an HTTP request into another protocol, propagates that request, and then translates the response it receives back into HTTP. A client can simply treat a gateway as a regular server. An example use case for gateways is a system that needs to download a file from an FTP server.
Discuss RESTful Fielding Constraint: Code on demand
Code on demand is the only optional constraint and refers to the ability for a server to send executable code to the client. This is what happens with HTML’s <script> tag. When the HTML document is loaded, the browser automatically fetches the JavaScript from the server and executes it locally.
RESTful Fielding Constraint: Uniform Interface - Identification of resources
The first sub-constraint of “uniform interface” affects the way that resources are identified. In REST terminology, a resource could be anything: an HTML document, an image, information about a particular user, etc. Each resource must be uniquely identified by a stable identifier. A “stable” identifier means that it does not change across interactions, and it does not change even when the state of the resource changes. If a resource does get moved to another identifier, the server should give the client a response indicating that the resource has moved (on the Web, a redirect such as 301 Moved Permanently), and give it a link to the new location of the resource.
The Web uses URIs to identify resources, and HTTP as its communication standard. To get a resource stored on a server, a client makes an HTTP GET request to the URI that identifies that resource. Every time you type an address into your browser, your browser makes a GET request to that URI. If it receives a 200 OK response and an HTML document back, then it renders the page in the window so that you can view it.
RESTful Fielding Constraint: Uniform Interface - Manipulation of resources through representations
The second sub-constraint of “uniform interface” says that the client manipulates resources through sending representations to the server–usually a JSON object containing the content that it would like to add, delete, or modify. In REST, the server has full control of the resources, and is responsible for making any changes. When a client wishes to make changes to resources, it sends the server a representation of what it would like the resulting resource to look like. The server takes the request as a suggestion, but still has ultimate control.
Let’s think about the case of a blog on the Web. When a user makes a new blog post, their computer wants to tell the server that a new blog post needs to be added. To do this, it sends an HTTP POST or PUT request with the content for the new blog post. The server sends back a response indicating whether the post was created, or if there was a problem. In a non-REST world, the client may literally be sending instructions for operations such as add a new line and make the title of the blog “What RESTful Actually Means”, instead of simply sending a representation of what it would like the final product to look like.
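The blog example can be sketched as follows, assuming the server keeps posts in an in-memory dictionary and the client sends a JSON representation in a PUT body. The function name, the field names ("title", "content"), and the validation rule are hypothetical. The sketch is only meant to show the server treating the representation as a suggestion.

```python
import json


def put_post(posts, post_id, request_body):
    """Handle a PUT whose body is a JSON representation of the desired post."""
    try:
        desired = json.loads(request_body)
    except json.JSONDecodeError:
        return 400, None
    if not isinstance(desired, dict):
        return 400, None

    title = str(desired.get("title", "")).strip()
    if not title:
        return 400, None  # the server may refuse a representation it dislikes

    status = 200 if post_id in posts else 201  # replaced vs. newly created
    # The server stores only the fields it recognises; anything else the
    # client suggested (an "admin" flag, say) is silently dropped.
    posts[post_id] = {"title": title, "content": str(desired.get("content", ""))}
    return status, posts[post_id]
```

The client never tells the server which operations to run. It only describes the end result it would like, and the server decides what actually gets stored.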
RESTful Fielding Constraint: Uniform Interface - Self-descriptive messages
Self-descriptive messages are another constraint that ensures a uniform interface between clients and servers. A self-descriptive message is one that contains all the information that the recipient needs to understand it. There should not be additional information in a separate documentation or in another message.
To understand how this applies to the Web, let’s analyze a set of HTTP requests and responses.
When a user types http://www.example.com in the address bar of their web browser, the browser sends the following HTTP request:
GET / HTTP/1.1
Host: www.example.com
This message is self-descriptive because it tells the server which HTTP method is being used and which protocol version applies (HTTP/1.1).
The server’s response is self-descriptive in the same way: it tells the client how to interpret the message body by indicating that the Content-Type is text/html. The client has everything it needs in this single message to handle it appropriately.
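To make the client side of this concrete, here is a small sketch that parses a raw HTTP response and reads the Content-Type header. The sample response text and the function name are made up for illustration. Only the 200 status and the text/html content type come from the discussion above.

```python
# Hypothetical response text, used only to exercise the parser below.
SAMPLE_RESPONSE = (
    "HTTP/1.1 200 OK\r\n"
    "Content-Type: text/html\r\n"
    "\r\n"
    "<html><body>Hello</body></html>"
)


def parse_response(raw):
    """Split a raw HTTP/1.1 response into status code, headers, and body."""
    head, _, body = raw.partition("\r\n\r\n")
    status_line, *header_lines = head.split("\r\n")
    _version, status_code, _reason = status_line.split(" ", 2)
    headers = {}
    for line in header_lines:
        name, _, value = line.partition(":")
        headers[name.strip().lower()] = value.strip()  # header names are case-insensitive
    return int(status_code), headers, body
```

Everything the client needs in order to decide how to treat the body (render it as HTML, parse it as JSON, and so on) is available from the headers of the same message.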
RESTful Fielding Constraint: Uniform Interface - Hypermedia
Hypermedia is a fancy word for data sent from the server to the client that contains information about what the client can do next–in other words, what further requests it can make. In REST, servers should be sending only hypermedia to clients.
HTML is a type of hypermedia. To understand this better, let’s look again at the server response discussed above.
- <a href="http://www.recurse.com"> Check out the Recurse Center! </a> tells the client that it should make a GET request to http://www.recurse.com if the user clicks on the link.
- <img src="http://www.example.com/awesome-pic.jpg"></img> tells the client to immediately make a GET request to http://www.example.com/awesome-pic.jpg so it can display the image to the user.
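The way a client discovers its next possible requests from hypermedia can be sketched with Python's standard html.parser module. The class name, the helper function, and the "on click" / "immediately" labels are assumptions for this example. It looks only at the two tags described above.

```python
from html.parser import HTMLParser


class LinkCollector(HTMLParser):
    """Collect the follow-up requests that an HTML document advertises."""

    def __init__(self):
        super().__init__()
        self.next_requests = []

    def handle_starttag(self, tag, attrs):
        attributes = dict(attrs)
        if tag == "a" and attributes.get("href"):
            # A link: fetched only if the user chooses to follow it.
            self.next_requests.append(("on click", attributes["href"]))
        elif tag == "img" and attributes.get("src"):
            # An image: fetched right away so it can be displayed.
            self.next_requests.append(("immediately", attributes["src"]))


def find_next_requests(html):
    collector = LinkCollector()
    collector.feed(html)
    collector.close()
    return collector.next_requests
```

The client needs no prior knowledge of the site's URLs. Whatever it can do next is spelled out in the document it just received.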
When a system has identifiers for each resource, manipulates them through sending representations from the client to the server, and has messages that are self-descriptive and composed of hypermedia, it is said to have a uniform interface. This is perhaps the most important attribute of a RESTful system, as it allows clients to intelligently adapt to changes. A server can change the underlying implementation without breaking all the clients that interact with it, because each interaction is self-contained, identifiers do not change when the underlying state or implementation changes, and hypermedia gives clients instructions for the state transitions they can perform next. The server does not need to remember anything about the client or do anything special to cater to it, and vice versa.
What happens when you type in “maps.google.com” - 8 key points:
- You type maps.google.com into the address bar of your browser.
- The browser checks the cache for a DNS record to find the corresponding IP address of maps.google.com.
- If the requested URL is not in the cache, the ISP’s DNS server initiates a DNS query to find the IP address of the server that hosts maps.google.com.
- The browser initiates a TCP connection with the server.
- The browser sends an HTTP request to the web server.
- The server handles the request and sends back a response.
- The server sends out an HTTP response.
- The browser displays the HTML content (for HTML responses, which are the most common).
The browser checks the cache for a DNS record to find the corresponding IP address of maps.google.com. What does this mean?
DNS (Domain Name System) is a distributed database that maps the name of a website (its domain name) to the IP address it points to. Every domain name that is reachable on the internet resolves to at least one IP address. The IP address belongs to the computer which hosts the server of the website we are requesting to access. For example, www.google.com has at times resolved to 209.85.227.104, in which case you could reach www.google.com by typing http://209.85.227.104 in your browser. DNS is a list of domain names and their IP addresses, just like a phone book is a list of names and their corresponding phone numbers.
The main purpose of DNS is human-friendly navigation. You can access a website by typing its IP address into your browser, but imagine having to remember a different set of numbers for every website you regularly visit. It is easier to remember the name of the website and let DNS do the work of mapping it to the correct IP address.
In order to find the DNS record, the browser checks four caches.
● First, it checks the browser cache. The browser maintains a repository of DNS records for a fixed duration for websites you have previously visited. So, it is the first place to run a DNS query.
● Second, the browser checks the OS cache. If the record is not found in the browser cache, the browser makes a system call (e.g. gethostbyname or getaddrinfo) to your underlying computer OS to fetch the record, since the OS also maintains a cache of DNS records.
● Third, it checks the router cache. If the record is not found on your computer, the browser communicates with the router, which maintains its own cache of DNS records.
● Fourth, it checks the ISP cache. If all of the previous steps fail, the browser moves on to the ISP. Your ISP maintains its own DNS server, which includes a cache of DNS records that the browser checks as a last hope of finding your requested URL.
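The four-level lookup above can be sketched as a walk over an ordered list of caches, falling back to a full DNS query only when every cache misses. The function name, the use of dictionaries as caches, and the 203.0.113.10 address (from a range reserved for documentation) are assumptions made for illustration.

```python
def resolve_with_caches(hostname, caches, query_dns):
    """Return (ip_address, source), checking each cache in order first.

    caches is an ordered list of (name, dict) pairs, nearest cache first.
    query_dns is a callable used only when every cache misses.
    """
    for name, cache in caches:
        if hostname in cache:
            return cache[hostname], name
    address = query_dns(hostname)
    for _, cache in caches:
        cache[hostname] = address  # remember the answer at every level
    return address, "dns query"


# Hypothetical usage: the record is only present in the router's cache.
caches = [("browser", {}), ("os", {}),
          ("router", {"maps.google.com": "203.0.113.10"}), ("isp", {})]
print(resolve_with_caches("maps.google.com", caches, query_dns=lambda host: None))
```

The nearer the cache that answers, the fewer network hops the lookup costs, which is why the caches are consulted from closest to farthest.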
You may wonder why there are so many caches maintained at so many levels. Although our information being cached somewhere doesn’t make us feel very comfortable when it comes to privacy, caches are important for regulating network traffic and improving data transfer times.
If the requested URL is not in the cache, the ISP’s DNS server initiates a DNS query to find the IP address of the server that hosts maps.google.com. Que?
As mentioned earlier, in order for my computer to connect with the server that hosts maps.google.com, I need the IP address of maps.google.com. The purpose of a DNS query is to search multiple DNS servers on the internet until it finds the correct IP address for the website. This type of search is called a recursive search since the search will continue repeatedly from DNS server to DNS server until it either finds the IP address we need or returns an error response saying it was unable to find it.
In this situation, we would call the ISP’s DNS server a DNS recursor whose responsibility is to find the proper IP address of the intended domain name by asking other DNS servers on the internet for an answer. The other DNS servers are called name servers since they perform a DNS search based on the domain architecture of the website domain name.
Many website URLs we encounter today contain a third-level domain, a second-level domain, and a top-level domain. Each of these levels has its own name server, which is queried during the DNS lookup process.
For maps.google.com, the DNS recursor will first contact the root name server. The root name server will redirect it to the .com domain name server. The .com name server will redirect it to the google.com name server. The google.com name server will find the matching IP address for maps.google.com in its DNS records and return it to your DNS recursor, which will send it back to your browser.
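The referral chain described above can be simulated with a few in-memory tables. This is a toy model: the server names, the table layout, and the 203.0.113.7 address (from a range reserved for documentation) are all hypothetical, and no real DNS traffic is involved.

```python
# Each toy name server either knows the final record or refers the
# recursor to the server responsible for a more specific domain.
NAME_SERVERS = {
    "root": {"referrals": {"com": "com-tld"}, "records": {}},
    "com-tld": {"referrals": {"google.com": "google-ns"}, "records": {}},
    "google-ns": {"referrals": {}, "records": {"maps.google.com": "203.0.113.7"}},
}


def resolve(hostname, servers=NAME_SERVERS, max_hops=10):
    """Follow referrals from the root; return (ip_address, servers_contacted)."""
    labels = hostname.split(".")
    # Candidate suffixes, most specific first: maps.google.com, google.com, com
    suffixes = [".".join(labels[i:]) for i in range(len(labels))]
    current, contacted = "root", []
    for _ in range(max_hops):
        contacted.append(current)
        server = servers[current]
        if hostname in server["records"]:
            return server["records"][hostname], contacted
        referral = next((server["referrals"][s] for s in suffixes
                         if s in server["referrals"]), None)
        if referral is None:
            raise LookupError(f"{current} has no answer for {hostname}")
        current = referral
    raise LookupError(f"gave up on {hostname} after {max_hops} hops")
```

Calling resolve("maps.google.com") visits the root, the .com server, and the google.com server in that order, mirroring the steps in the paragraph above.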
These requests are sent using small data packets which contain information such as the content of the request and the IP address it is destined for (the IP address of the DNS recursor). These packets travel through multiple pieces of networking equipment between the client and the server before they reach the correct DNS server. This equipment uses routing tables to figure out the fastest possible way for each packet to reach its destination. If these packets get lost, you’ll get a request failed error. Otherwise, they will reach the correct DNS server, grab the correct IP address, and come back to your browser.
The browser initiates a TCP connection with the server. Huh?
Once the browser receives the correct IP address, it will build a connection with the server that matches that IP address in order to transfer information. Browsers use internet protocols to build such connections. There are a number of different internet protocols which can be used, but TCP is the most common transport protocol for HTTP requests.
In order to transfer data packets between your computer (the client) and the server, it is important to have a TCP connection established. This connection is established using a process called the TCP/IP three-way handshake. This is a three-step process where the client and the server exchange SYN (synchronize) and ACK (acknowledge) messages to establish a connection.
- Client machine sends a SYN packet to the server over the internet asking if it is open for new connections.
- If the server has open ports that can accept and initiate new connections, it’ll respond with an ACKnowledgment of the SYN packet using a SYN/ACK packet.
- The client will receive the SYN/ACK packet from the server and will acknowledge it by sending an ACK packet.
Then a TCP connection is established for data transmission!
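A minimal sketch of where the handshake happens in practice, assuming Python's standard socket module and a throwaway listener on the loopback address (both are illustrative choices). The operating system exchanges the SYN, SYN/ACK, and ACK packets during the connect step. Application code only ever sees the connected socket.

```python
import socket


def open_loopback_connection():
    """Connect to a throwaway local listener and report both views of the client."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as listener:
        listener.bind(("127.0.0.1", 0))  # port 0 lets the OS pick a free port
        listener.listen(1)               # the server is now open for connections
        server_address = listener.getsockname()

        # create_connection() calls connect(); the OS performs the three-way
        # handshake (SYN, SYN/ACK, ACK) before this call returns.
        with socket.create_connection(server_address, timeout=5) as client:
            connection, peer_address = listener.accept()
            with connection:
                # The client's own address and the address the server sees
                # for its peer should match once the connection exists.
                return client.getsockname(), peer_address
```

If nothing were listening on the target port, the connect step would fail instead, which is the client's signal that the server is not open for new connections.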
The browser sends an HTTP request to the web server.
Once the TCP connection is established, it is time to start transferring data! The browser will send a GET request asking for the maps.google.com web page. If you’re entering credentials or submitting a form, this could be a POST request. This request will also contain additional information such as browser identification (User-Agent header), the types of responses that it will accept (Accept header), and connection headers asking the server to keep the TCP connection alive for additional requests. It will also pass information taken from cookies the browser has in store for this domain.
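As an illustration of what such a request looks like on the wire, the sketch below assembles an HTTP/1.1 GET request by hand. The function name, the User-Agent string, and the cookie value are made up. A real browser sends many more headers than these.

```python
def build_get_request(host, path="/", extra_headers=None):
    """Assemble the bytes of a simple HTTP/1.1 GET request."""
    headers = {
        "Host": host,
        "User-Agent": "example-browser/1.0",  # hypothetical browser identification
        "Accept": "text/html",                # response types the client will accept
        "Connection": "keep-alive",           # ask to reuse the TCP connection
    }
    headers.update(extra_headers or {})
    lines = [f"GET {path} HTTP/1.1"]
    lines += [f"{name}: {value}" for name, value in headers.items()]
    # A blank line marks the end of the headers.
    return ("\r\n".join(lines) + "\r\n\r\n").encode("ascii")


request = build_get_request("maps.google.com",
                            extra_headers={"Cookie": "session=abc"})
print(request.decode("ascii"))
```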
The server handles the request and sends back a response.
The server contains a web server (e.g. Apache, IIS) which receives the request from the browser and passes it to a request handler to read it and generate a response. The request handler is a program (written in ASP.NET, PHP, Ruby, etc.) that reads the request, its headers, and its cookies to check what is being requested, and also updates the information on the server if needed. Then it will assemble a response in a particular format (JSON, XML, HTML).
The server sends out an HTTP response.
The server response contains the web page you requested as well as the status code, compression type (Content-Encoding), how to cache the page (Cache-Control), any cookies to set, privacy information, etc.
Example HTTP server response:
The first line of an HTTP response shows a status code. This is quite important, as it tells us the status of the response. There are five classes of status, each identified by the first digit of the numerical code.
● 1xx indicates an informational message only
● 2xx indicates success of some kind
● 3xx redirects the client to another URL
● 4xx indicates an error on the client’s part
● 5xx indicates an error on the server’s part
So, if you encounter an error, you can take a look at the HTTP response to check what type of status code you have received.
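The five classes above can be derived from the first digit of the code. The sketch below does this for a status line. The function name and the class labels are chosen for this example.

```python
STATUS_CLASSES = {
    1: "informational",
    2: "success",
    3: "redirection",
    4: "client error",
    5: "server error",
}


def classify_status(status_line):
    """Return (code, class name) for a status line like 'HTTP/1.1 404 Not Found'."""
    parts = status_line.split(" ", 2)  # protocol version, code, reason phrase
    code = int(parts[1])
    return code, STATUS_CLASSES.get(code // 100, "unknown")


print(classify_status("HTTP/1.1 404 Not Found"))
```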
The browser displays the HTML content (for HTML responses, which are the most common).
The browser displays the HTML content in phases. First, it will render the bare-bones HTML skeleton. Then it will check the HTML tags and send out GET requests for additional elements on the web page, such as images, CSS stylesheets, JavaScript files, etc. These static files are cached by the browser so it doesn’t have to fetch them again the next time you visit the page. At the end, you’ll see maps.google.com appearing in your browser.
That’s it!
Although this seems like a very tedious, prolonged process, we know that it takes no more than a few seconds for a web page to render after we hit enter on our keyboard. All of these steps happen within milliseconds, before we can even notice. I sincerely hope this article helps you answer the question “What happens when you type a URL in the browser and press enter?”.
What is DNS and how does it work?
The Domain Name System (DNS) is often referred to as the backbone of the internet. It’s run by many engineers and their organizations, and it ultimately shapes the future of the internet.
The internet is more of a design philosophy … if all parties agree on a protocol, the data gets sent seamlessly.
All devices on the internet have a number: an IP address.
How it works:
A user asks their browser to visit freecodecamp.com
The browser queries a DNS Resolver (usually their ISP) “where’s freecodecamp.com?”
DNS Resolver queries the Root servers (which have a big important list that keeps this information) “where is .COM?” Replies with Verisign.
DNS Resolver then queries Verisign — “where is freecodecamp.com?” Verisign replies with the nameservers ns1.cloudflare.com and the IP address 192.168.178.1
Hosting servers are queried with the IP address. “Give me the files for IP address 192.168.178.1 (please)”
Website files are delivered and rendered on the page so the user can learn to code…or whatever they were doing.