Basic Networking Flashcards

1
Q

How does a session get established in TCP?

A

In TCP, this process is known as the three-way handshake. It establishes a reliable connection between a client and a server:
1. SYN (Synchronize): The client initiates the connection by sending a SYN packet to the server. This packet carries the client’s initial sequence number (ISN), a randomly chosen number from which the client’s sequence numbers start. The SYN flag is set to 1 to indicate that this is a connection request.
2. SYN-ACK: Upon receiving the SYN packet, the server responds with a SYN-ACK packet. This packet contains the server’s own initial sequence number and an acknowledgment number that is one more than the initial sequence number received from the client. The SYN and ACK flags are set to 1 to indicate that this is a response to the connection request and to acknowledge the client’s SYN packet.
3. ACK: The client receives the SYN-ACK packet and responds with an ACK packet. Its sequence number is the client’s ISN plus one (the same value as the acknowledgment number it just received), and its acknowledgment number is one more than the server’s initial sequence number. The ACK flag is set to 1 to indicate that this is an acknowledgment of the server’s SYN-ACK packet.
https://youtu.be/LyDqA-dAPW4?si=TaZ9JHfHxxdjK4T6
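
The sequence and acknowledgment arithmetic of the handshake can be sketched in a few lines. This is a toy model only: the function name `three_way_handshake` and the dictionary layout are assumptions made for illustration, and a real TCP stack does this inside the operating system.

```python
MOD = 2**32  # TCP sequence numbers are 32-bit values and wrap around


def three_way_handshake(client_isn: int, server_isn: int) -> list:
    """Return the three handshake segments as plain dictionaries (toy model)."""
    # 1. SYN: the client announces its initial sequence number (ISN).
    syn = {"flags": {"SYN"}, "seq": client_isn % MOD}
    # 2. SYN-ACK: the server announces its own ISN and acknowledges client ISN + 1.
    syn_ack = {
        "flags": {"SYN", "ACK"},
        "seq": server_isn % MOD,
        "ack": (syn["seq"] + 1) % MOD,
    }
    # 3. ACK: the client continues at its ISN + 1 (the value the server just
    #    acknowledged) and acknowledges server ISN + 1.
    ack = {
        "flags": {"ACK"},
        "seq": syn_ack["ack"],
        "ack": (syn_ack["seq"] + 1) % MOD,
    }
    return [syn, syn_ack, ack]
```

For example, `three_way_handshake(1000, 5000)` ends with an ACK segment whose `seq` is 1001 and whose `ack` is 5001.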

2
Q

What is TCP/IP?

A

Transmission Control Protocol / Internet Protocol is the fundamental suite of protocols that forms the basis of the Internet. Its four-layer model is often described as a condensed version of the seven-layer OSI model.

TCP/IP specifies how devices exchange data with one another over the internet. It defines how the data should be broken down, addressed, transmitted, routed and received.

TCP defines how applications can create communication channels. It manages how a message is broken down into smaller segments and how it is put back together.

IP defines how to address and route these packets to make sure they reach their destination.

3
Q

What are TCP/IP layers?

A

Application Layer: Application-specific protocols are defined here. HTTP, HTTPS, FTP, SMTP and DNS are at this layer.

Transport Layer: It ensures that data is transferred reliably and efficiently between hosts. Common protocols here are TCP and UDP.

Internet Layer: This layer is responsible for routing packets of data from source to destination across multiple networks. It uses the IP protocol to provide an addressing system and makes routing decisions to forward packets toward their destination.

Link Layer: This is the lowest layer of the TCP/IP model, responsible for interfacing with the physical network hardware. Protocols at this layer include Ethernet, ARP (Address Resolution Protocol) and PPP (Point-to-Point Protocol).

4
Q

What is each layer responsible for in the TCP/IP model?

A

Application Layer: Application-specific protocols are defined here. HTTP, HTTPS, FTP, SMTP and DNS are at this layer.

Transport Layer: It provides end-to-end communication. Common protocols here are TCP and UDP.

Internet Layer: This layer is responsible for routing packets of data from source to destination across multiple networks. It uses the IP protocol to provide an addressing system and makes routing decisions to forward packets toward their destination.

Link Layer: This is the lowest layer of the TCP/IP model, responsible for interfacing with the physical network hardware. Protocols at this layer include Ethernet, ARP (Address Resolution Protocol) and PPP (Point-to-Point Protocol).

5
Q

Describe OSI Model

A

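The OSI (Open Systems Interconnection) model is a reference model that describes how applications communicate over a computer network. It has 7 layers. Two mnemonics help to remember them, one from the top layer down and one from the bottom layer up: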

All : Application Layer
People : Presentation Layer
Seem : Session Layer
To : Transport Layer
Need : Network Layer
Data : Data Link Layer
Processing : Physical Layer

Please : Physical Layer
Do : Data Link Layer
Not : Network Layer
Throw : Transport Layer
Sausage : Session Layer
Pizza : Presentation Layer
Away : Application Layer

Its primary purpose is educational, and even though the layers don’t fit real-world use cases perfectly, they are still widely used by networking vendors and cloud providers.

6
Q

Describe the 7 layers of the OSI model

A

Please Do Not Throw Sausage Pizza Away
All People Seem To Need Data Processing

Physical Layer: It is responsible for transmitting raw bits of data across a physical connection (hubs, fiber, etc.).

Data Link Layer: It takes the raw bits from the physical layer and organizes them into frames. It ensures that the frames are delivered to the correct destination on the local network. Ethernet primarily lives in this layer.

Network Layer: It is responsible for routing data packets across different networks. The IP part of the TCP/IP model is a well-known example of this layer. IPv4 and IPv6 belong here.

Transport Layer: This layer is responsible for end-to-end communication. TCP and UDP live here. Data at this layer is called segments.

There are 3 more layers in the OSI model but they are a bit too fine-grained and do not really reflect reality.

Session Layer: This layer controls the dialogue between computers. It establishes, maintains and terminates sessions between processes. (The TCP three-way handshake and its ACK packets are sometimes placed here, although TCP itself is a Transport Layer protocol.)

Presentation Layer: Formatting, character encoding (such as UTF-8) and encryption belong here.

Application Layer: SMTP, HTTP, FTP, HTTPS and DNS are here.

So it’s useful to collapse these three layers into one and consider application protocols like HTTP as Layer 7 protocols.

7
Q

What is DNS?

A

It stands for Domain Name System. It is the backbone of the internet and acts as the internet’s directory. It translates human-readable domain names, such as google.com, into machine-readable IP addresses.

8
Q

What happens when you type www.amazon.com in your browser?

A

When a URL is entered, the browser automatically generates a DNS query and sends it to our configured DNS server (the resolver), asking for the IP address of www.amazon.com. No single DNS server holds a database of every domain; the resolver answers from its cache when it can. If it doesn’t have the answer, it queries other DNS servers, ending at the authoritative DNS server for the domain, until it finds the IP address.

9
Q

How does the DNS resolver find the authoritative name server?

A

This is where DNS gets interesting. There are 3 main levels of DNS servers in the hierarchy: root name servers, top-level domain (TLD) name servers, and authoritative name servers.

Root name servers store the IP addresses of the TLD name servers. (Imagine the root at the top and, underneath it, .com, .org, .edu, .de, .ch, .uk etc.) There are 13 logical root name servers (a.root-servers.net, b.root-servers.net, c.root-servers.net etc.). Each logical root name server has its own IP address, but many physical servers around the world share that IP. This is done with anycast, which lets one IP address be assigned to multiple servers in different locations so that you are routed to the server closest to you.

The TLD servers store the IP addresses of the authoritative name servers for all the domains under them (.com stores amazon.com and google.com, .org stores wikipedia.org etc.).

Authoritative name servers give the final answers to DNS queries. When we register a domain, we specify its authoritative name servers, and this is where its DNS records live.
This design makes DNS highly decentralized and robust.
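
The three levels can be pictured as three lookups in a row. The sketch below uses in-memory dictionaries instead of real servers; every server name and IP address in it is made up for illustration (the IPs come from ranges reserved for documentation), and the function name `resolve` is an assumption.

```python
# Toy data only: all server names and IP addresses here are hypothetical.
ROOT = {"com": "tld-com.example", "org": "tld-org.example"}
TLD = {
    "tld-com.example": {"amazon.com": "ns.amazon.example"},
    "tld-org.example": {"wikipedia.org": "ns.wikipedia.example"},
}
AUTHORITATIVE = {
    "ns.amazon.example": {"www.amazon.com": "192.0.2.10"},
    "ns.wikipedia.example": {"www.wikipedia.org": "198.51.100.20"},
}


def resolve(hostname: str) -> str:
    """Walk root -> TLD -> authoritative and return the IP address."""
    name = hostname.rstrip(".")
    labels = name.split(".")
    tld = labels[-1]                      # e.g. "com"
    domain = ".".join(labels[-2:])        # e.g. "amazon.com"
    tld_server = ROOT[tld]                # root: who handles this TLD?
    auth_server = TLD[tld_server][domain]  # TLD: who is authoritative for the domain?
    return AUTHORITATIVE[auth_server][name]  # authoritative: the actual record
```

With this toy data, `resolve("www.amazon.com")` returns `"192.0.2.10"`.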

10
Q

Can you explain a little bit of the lifecycle of a DNS query?

A

When you type google.com into your browser:
1. The browser first checks its own cache.
2. If it has no answer, it makes an operating system call to get the answer. The OS likely has its own cache too. If the answer isn’t there, the OS reaches out to the DNS resolver.
3. The DNS resolver first checks its cache. If the answer is not there or has expired, it asks the root name server.
4. The root name server responds with the list of the .com TLD name servers. In practice the resolver has usually cached this list already.
5. The DNS resolver then reaches out to the .com TLD name server, which returns the authoritative name servers for google.com.
6. Finally, the DNS resolver reaches out to google.com’s authoritative name server, which returns the IP address of google.com.
7. The DNS resolver then returns the IP address to the operating system, and the operating system returns it to the browser.
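
The caching behaviour in step 3 can be sketched as a small TTL cache. This is a simplified illustration, not how any particular resolver is implemented: the class name `CachingResolver`, the `upstream` callable (standing in for the root, TLD and authoritative lookups) and the injectable `clock` are all assumptions.

```python
import time


class CachingResolver:
    """Toy resolver cache: an answer is reused until its TTL expires."""

    def __init__(self, upstream, clock=time.monotonic):
        self._upstream = upstream    # callable: name -> (ip, ttl_seconds)
        self._clock = clock          # injectable so the example can be tested
        self._cache = {}             # name -> (ip, expires_at)
        self.upstream_queries = 0    # how often we had to ask upstream

    def resolve(self, name: str) -> str:
        now = self._clock()
        cached = self._cache.get(name)
        if cached is not None and cached[1] > now:
            return cached[0]         # cache hit and still fresh
        # Cache miss or expired entry: ask upstream and remember the answer.
        self.upstream_queries += 1
        ip, ttl = self._upstream(name)
        self._cache[name] = (ip, now + ttl)
        return ip
```

Two lookups within the TTL cause one upstream query; a lookup after the TTL has passed causes a second one.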

11
Q

How do you update DNS records for a live, high-traffic, production website?

A

Some default TTLs (time to live) are quite long, and not every DNS resolver actually honors them. There are 2 practical steps we can take:
1. Reduce the TTL to something like 30 seconds well before changing the DNS record, so that cached copies carrying the old, longer TTL have time to expire.
2. Leave the servers running on the old IP addresses for a while after the change, to serve clients that still hold the old record.
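
The timing behind these two steps is simple arithmetic. The sketch below assumes every resolver honors TTLs, which is not always true, so the resulting times are lower bounds rather than guarantees; the function name and the dictionary keys are made up for illustration.

```python
def migration_timeline(old_ttl: int, new_ttl: int) -> dict:
    """Return offsets in seconds, counted from the moment the TTL is lowered."""
    return {
        "lower_ttl_at": 0,
        # A resolver may have cached the record with the old TTL just before
        # it was lowered, so wait one full old TTL before changing the IP.
        "earliest_ip_change_at": old_ttl,
        # After the change, a well-behaved resolver keeps the old IP for at
        # most new_ttl seconds, so the old servers must stay up at least that long.
        "earliest_old_server_shutdown_at": old_ttl + new_ttl,
    }
```

For a record that used to have a one-day TTL (86400 seconds) and is lowered to 30 seconds, the IP change should wait at least 86400 seconds, and the old servers should stay up until at least 86430 seconds, in practice considerably longer.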

12
Q

What is a firewall?

A

A firewall is a hardware device or a piece of software that secures a network by allowing or blocking incoming and outgoing traffic based on a set of rules.

13
Q

List a few common firewall types…

A
  1. Packet filtering firewall: The most common type of firewall. It analyzes packets and lets them pass only if they match the rule set. (Analyzing packets means checking the source IP, destination IP, port numbers and protocol.) These firewalls cannot do deep packet inspection, i.e. they do not look at the payload. They work at the network and transport layers of the OSI model.
  2. Proxy firewall: These work at Layer 7 of the OSI model. They deal with application-level protocols such as HTTP, HTTPS, FTP and SMTP.
  3. Stateful multi-layer inspection (SMLI) firewalls: They filter packets at the network, transport and application layers.
  4. Next-Generation Firewall (NGFW): Incorporates features of the traditional firewall along with additional functionalities like application awareness and control, integrated intrusion prevention, and cloud-delivered threat intelligence.
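
A packet filtering firewall (type 1 above) can be sketched as a list of rules checked in order. This is a minimal illustration under assumptions: the `Rule` fields, the first-match-wins policy, the default-deny fallback and the sample rules are all made up for the example and do not describe any specific product.

```python
import ipaddress
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Rule:
    action: str               # "allow" or "deny"
    src: str                  # source network in CIDR notation
    dst: str                  # destination network in CIDR notation
    protocol: str             # "tcp", "udp" or "any"
    dst_port: Optional[int]   # None matches any port


def matches(rule: Rule, src_ip: str, dst_ip: str, protocol: str, dst_port: int) -> bool:
    """Check only header fields: addresses, protocol and port (no payload)."""
    return (
        ipaddress.ip_address(src_ip) in ipaddress.ip_network(rule.src)
        and ipaddress.ip_address(dst_ip) in ipaddress.ip_network(rule.dst)
        and rule.protocol in ("any", protocol)
        and rule.dst_port in (None, dst_port)
    )


def filter_packet(rules, src_ip, dst_ip, protocol, dst_port, default="deny"):
    """Return the action of the first matching rule, or the default."""
    for rule in rules:
        if matches(rule, src_ip, dst_ip, protocol, dst_port):
            return rule.action
    return default


# Hypothetical rule set: block one network, allow HTTPS to one server.
RULES = [
    Rule("deny", "203.0.113.0/24", "0.0.0.0/0", "any", None),
    Rule("allow", "0.0.0.0/0", "10.0.0.5/32", "tcp", 443),
]
```

With these rules, HTTPS traffic to 10.0.0.5 is allowed unless it comes from 203.0.113.0/24, and everything else falls through to the default deny.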
14
Q

What is a VPN?

A

Virtual Private Network. It is a secure, encrypted tunnel across the internet between a VPN client and a VPN server. The user has a VPN client installed on their machine, which creates an encrypted tunnel to the VPN server. The VPN server then reaches out to the internet on the user’s behalf, receives the response packets, encrypts them and sends them back to the user through the tunnel.

15
Q

What are ipconfig and ifconfig?

A

ipconfig -> Windows
ifconfig -> Linux

These commands display the network interfaces (adapters) of a machine and their configuration, such as their IP addresses.

16
Q

What is data encapsulation?

A

It refers to the process of adding headers and trailers to the data as it moves down through the layers.
(De-encapsulation is the process of removing those headers and trailers on the receiving side.)

https://en.wikipedia.org/wiki/Encapsulation_(networking)#/media/File:UDP_encapsulation.svg

17
Q

What is a NAT / NAT tunneling?

A

A NAT (Network Address Translation) tunnel is a technology used in computer networking to enable devices on a private network to communicate with external networks (like the Internet) using a single public IP address. The main functions of NAT tunneling are to provide security by hiding internal network addresses, conserve the number of public IP addresses needed, and allow for the routing of traffic between different network domains.
1. Translation: When a device on a private network wants to communicate with the outside world, the NAT device (often a router) replaces the device’s private IP address in the outgoing packets with its own public IP address. This process is known as “source NAT” (SNAT).
2. Tracking Connections: The NAT device keeps a translation table to track each active connection. This table contains mappings of the original private IP addresses and assigned ports to the corresponding public IP address and port. This way, the NAT device can identify and properly route the returning traffic.
3. Routing Incoming Traffic: For incoming traffic, “destination NAT” (DNAT) can be used. This is where the NAT device redirects incoming traffic addressed to a specific public IP address and port to the corresponding private IP address and port on the internal network, based on predefined rules or entries in the translation table.
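
The translation table can be sketched as two dictionaries, one per direction. This is a simplified model under assumptions: the class name `Nat`, the starting port 40000 and the sequential port allocation are made up for illustration, and real NAT devices also track protocol, remote endpoint and timeouts.

```python
class Nat:
    """Toy source NAT with a translation table keyed by port."""

    def __init__(self, public_ip: str, first_port: int = 40000):
        self.public_ip = public_ip
        self._next_port = first_port
        self._outgoing = {}   # (private_ip, private_port) -> public_port
        self._incoming = {}   # public_port -> (private_ip, private_port)

    def translate_outgoing(self, private_ip: str, private_port: int):
        """Rewrite a private source address to the public IP and a mapped port."""
        key = (private_ip, private_port)
        if key not in self._outgoing:
            self._outgoing[key] = self._next_port
            self._incoming[self._next_port] = key
            self._next_port += 1
        return self.public_ip, self._outgoing[key]

    def translate_incoming(self, public_port: int):
        """Map returning traffic back to the private host, or None if unknown."""
        return self._incoming.get(public_port)
```

Two internal hosts share the single public IP but get different public ports, which is how the returning traffic finds its way back to the right host.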
18
Q

How does DHCP work and what is it?

A

Dynamic Host Configuration Protocol.
Its main goal is to assign unique IP addresses to hosts. It also provides other network settings, such as the subnet mask, default gateway and DNS server address. It comes in two flavors: a client and a server.

When you turn on your PC and it doesn’t have an address, it needs a unique IP on the network. Each computer runs a DHCP client, which allows it to ask for a unique IP address. Somewhere on the network there is a DHCP server where IP addresses are managed.

The DHCP server role is handled by routers or dedicated servers. At home it sits on the router, but in enterprise networks it usually sits on a server.

The DHCP server keeps track of each assigned IP address, the MAC address of the device that holds it, and the lease expiration time.

How does DHCP work:
Step 1. DHCPDiscover: The host broadcasts a message looking for a DHCP server
Step 2. DHCPOffer: The DHCP server offers an IP address
Step 3. DHCPRequest: The host requests the offered IP address
Step 4. DHCPACK: The DHCP server confirms the lease of the IP address to the host

DHCP uses UDP.
Client: on port 68
Server: on port 67
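
The four steps can be sketched as a tiny in-memory server. This is a toy model under assumptions: the class and method names are made up, no real UDP packets are sent, and a real server would also handle lease expiry, renewal and an exhausted address pool.

```python
import ipaddress


class DhcpServer:
    """Toy DHCP server that hands out addresses from one network."""

    def __init__(self, network: str, lease_seconds: int = 3600):
        # Usable host addresses of the network, handed out in order.
        self._free = [str(h) for h in ipaddress.ip_network(network).hosts()]
        self._offers = {}   # mac -> ip that was offered but not yet confirmed
        self.leases = {}    # mac -> (ip, lease_seconds)
        self._lease_seconds = lease_seconds

    def handle_discover(self, mac: str) -> str:
        """Steps 1 and 2: answer a DHCPDiscover with an offered IP address."""
        if mac not in self._offers:
            self._offers[mac] = self._free.pop(0)
        return self._offers[mac]

    def handle_request(self, mac: str, ip: str):
        """Steps 3 and 4: confirm the lease if the request matches the offer."""
        if self._offers.get(mac) != ip:
            return None   # a real server would reply with a negative acknowledgment
        del self._offers[mac]
        self.leases[mac] = (ip, self._lease_seconds)
        return {"ip": ip, "lease_seconds": self._lease_seconds}
```

A client first calls `handle_discover` with its MAC address, then passes the offered address to `handle_request` to obtain the lease.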

19
Q

What is a proxy?

A

A proxy, or proxy server, is an intermediary server that separates end users from the websites they browse. It provides various functions such as web requests forwarding, data caching, and anonymity, by handling internet traffic on behalf of users.

20
Q

What is a reverse proxy?

A

Nginx is a well-known example of a reverse proxy.
What does a reverse proxy do? It sits between the users and the application servers (in cloud systems there are usually more layers than that).
1. Protects the web servers: the web servers’ IP addresses are hidden behind the reverse proxy, which makes it much harder to target them with a DDoS attack.
2. Load balancing: it can distribute a large number of requests across several servers.
3. Caches static content: it keeps static content for a period of time and serves it to users without having to go to the web servers.
4. Handles SSL encryption: the SSL handshake is computationally expensive, so we offload it to the reverse proxy or other servers on the network instead of our very busy web servers.

21
Q

What is a modern reverse proxy design like?

A

The first layer could be an edge service like Cloudflare, whose reverse proxies are deployed to hundreds of locations worldwide, close to users. The second layer could be an API gateway or a load balancer at the hosting provider. Many cloud providers combine these two into a single ingress service. The user enters the cloud network at the edge location closest to them; from there the reverse proxy connects over a fast fiber network to the load balancer, where requests are evenly distributed over a cluster of web servers.

22
Q

What is a forward proxy?

A

A forward proxy is a server that sits between a group of client machines and the internet. When those machines make a request to the internet, the forward proxy acts as a middleman: it intercepts the requests and talks to the web servers on behalf of the clients.
Why would you want to do that?
1. A forward proxy protects the client’s online identity: it hides the client’s IP address, and only the forward proxy’s IP is visible.
2. A forward proxy can be used to bypass browsing restrictions. Organizations often block access to certain websites, and a proxy can help get around that. It doesn’t always work, because a firewall sitting in the same network may block those websites as well.
3. It also helps schools and governments block access to certain content. This normally requires each client machine on the network to be configured to point to the proxy server. Large institutions usually apply a technique called transparent proxy to streamline the process. A transparent proxy works with Layer 4 switches to redirect certain traffic to the proxy automatically. There is no need to configure the clients, and it is hard to bypass when it is in the network.

23
Q

Can you tell me about Load Balancing algorithms?

A

Yes. Load balancing is a critical component of any large-scale web application. By distributing load across web servers, it ensures high availability, responsiveness and scalability. There are two main types of load balancing algorithms:
- Static
- Dynamic

Static LB algorithms distribute the application load without taking into account the servers’ real-time conditions, metrics and performance. Their advantage is simplicity; their disadvantage is lower adaptivity and precision.

Round Robin is the simplest approach. It distributes the requests evenly, sending request 1 to server A, request 2 to server B and so on. It is easy to implement and understand. But it can potentially overload servers if they are not properly monitored.

Sticky Round Robin is an improved version of Round Robin. It sends the requests coming from the same client to the same server, to improve performance by keeping related data on the same server. But uneven load can occur, as new users are assigned randomly.

Weighted Round Robin allows admins to assign different weights or priorities to different servers, so it can take the heterogeneous capabilities of the servers into account. The higher the weight, the more requests a server receives. But it requires manual configuration of the weights and it is less adaptive to real-time change.

Hash-based algorithms use a hash function to map incoming requests to the backend servers. The hash function often uses the client’s IP address or the requested URL as input for determining where to route each request. It can distribute requests evenly if the function is chosen wisely, but it is challenging to pick the optimal hash function.

Dynamic LB algorithms adapt in real time by taking into account the active performance metrics and server conditions.

The Least Connections algorithm sends requests to the server that currently has the fewest active connections or open requests. This requires actively tracking the open connections of each backend server. But if connections pile up unevenly, it is still possible to overload some servers.

Least Response Time algorithms send incoming requests to the server with the fastest response time or lowest current latency. The latency of each server is continuously monitored. This approach is highly adaptive and responsive, but it brings high complexity and overhead. It also doesn’t take into account how many requests each server is dealing with.

So all these algorithms come with clear tradeoffs.
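
Several of the algorithms above fit in a few lines each. The sketch below is illustrative only: the class names, the `pick` and `release` methods and the choice of SHA-256 for the hash-based variant are assumptions, and production load balancers add health checks, timeouts and concurrency control.

```python
import hashlib
import itertools


class RoundRobin:
    """Static: hand out servers in a fixed rotation."""

    def __init__(self, servers):
        self._cycle = itertools.cycle(list(servers))

    def pick(self):
        return next(self._cycle)


class WeightedRoundRobin:
    """Static: a server with weight w appears w times per rotation."""

    def __init__(self, weights):   # dict: server -> integer weight
        expanded = [s for s, w in weights.items() for _ in range(w)]
        self._cycle = itertools.cycle(expanded)

    def pick(self):
        return next(self._cycle)


class HashBased:
    """Static: the same client IP always maps to the same server."""

    def __init__(self, servers):
        self._servers = list(servers)

    def pick(self, client_ip: str):
        digest = hashlib.sha256(client_ip.encode()).digest()
        index = int.from_bytes(digest[:8], "big") % len(self._servers)
        return self._servers[index]


class LeastConnections:
    """Dynamic: choose the server with the fewest active connections."""

    def __init__(self, servers):
        self.active = {s: 0 for s in servers}

    def pick(self):
        server = min(self.active, key=self.active.get)
        self.active[server] += 1   # the new request opens a connection
        return server

    def release(self, server):
        self.active[server] -= 1   # the request finished
```

For example, `WeightedRoundRobin({"A": 2, "B": 1})` sends two requests to A for every one sent to B, while `LeastConnections` needs `release` to be called when a request completes so that its counters stay accurate.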

24
Q

What are the types of load balancers?

A

Types based on configurations:
1a. Software Load balancers
1b. Hardware Load balancers
1c. Virtual Load balancers

Types based on functions
2a. Layer 4 load balancers: Work at the Transport Layer of the OSI model. They balance load based on IP addresses and port numbers.
2b. Layer 7 load balancers: Operate at the Application Layer. They can make intelligent load balancing decisions based on URLs, HTTP headers, or cookies.

GSLB (Global Server Load Balancer) a.k.a. Multi-site Load Balancer
This type of load balancer goes beyond traditional local load balancing and is designed for distributing traffic across multiple data centers or geographically distributed servers.

A GSLB is concerned with global or wide-area load balancing.
It takes into account factors such as server proximity, server health, and geographic location to intelligently distribute traffic across multiple locations.

25
Q

Describe the differences between HTTP/1, HTTP/1.1, HTTP/2 and HTTP/3

A

HTTP/1: It was released in 1996. It was built on top of TCP. Every request to the same server required a separate TCP connection.

HTTP/1.1: It soon followed in 1997. It introduced a keep-alive mechanism so a connection could be reused for more than a single request. Persistent connections reduce latency because the client can reuse the existing connection and does not have to repeat the three-way handshake, which costs an extra round trip. It also introduced pipelining, which let a client send further requests without waiting for earlier responses. But the responses had to be returned in the order the requests were sent, many proxy servers couldn’t implement it properly, and eventually support for it was removed from many web browsers.

HTTP/2: It was released in 2015 and introduced HTTP streams. Multiple streams of requests can be sent to the same server over one single TCP connection. Unlike in HTTP/1.1, each stream is independent, so responses don’t need to follow a specific order. But there is still a problem called head-of-line blocking: because all streams share one TCP connection, a single lost packet stalls every stream until it is retransmitted. HTTP/2 also introduced a push capability, which allows servers to send resources to the client before the client requests them.

HTTP/3: It began as a draft in 2020 and was published as a standard in June 2022. It uses a protocol called QUIC instead of TCP. QUIC is based on UDP.

26
Q

API gateway vs Load balancer vs Reverse Proxy

A

So an API gateway without API management and auth is just a load balancer, and a load balancer that doesn’t need to balance load across multiple servers is just a reverse proxy that hides the servers’ addresses? How fun

27
Q

TCP vs UDP

A

TCP provides reliable, end-to-end communication between devices. It does this by splitting data into small, manageable segments, and sending each segment individually. Each segment has a sequence number attached to it, so the receiving end can check this number and reassemble data in the correct order. TCP also provides error checking to make sure the data is not corrupted during transmission.
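
The role of sequence numbers in reassembly can be sketched as follows. This is a toy model under assumptions: the function names are made up, each segment is a `(sequence_number, payload)` pair, and real TCP additionally handles acknowledgments, retransmission and sequence number wraparound.

```python
def split_into_segments(data: bytes, size: int, isn: int = 0):
    """Cut data into segments; each sequence number is the byte offset plus isn."""
    return [
        (isn + offset, data[offset:offset + size])
        for offset in range(0, len(data), size)
    ]


def reassemble(segments) -> bytes:
    """Rebuild the data from segments that may be out of order or duplicated."""
    unique = dict(segments)   # a duplicate has the same sequence number
    return b"".join(unique[seq] for seq in sorted(unique))
```

Even if the segments arrive reversed and one of them arrives twice, `reassemble` returns the original bytes.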

UDP is simpler and faster than TCP. Unlike TCP, UDP does not provide reliability: there are no acknowledgments, retransmissions or ordering guarantees. It simply sends packets of data (datagrams) from one device to another. It carries only a basic checksum; if the receiver detects a corrupted packet, it simply discards it, and any recovery is left to the application.

UDP suits anything where you don’t care too much whether you always get all the data:
  • Domain Name System (DNS)
  • Tunneling/VPN (lost packets are ok - the tunneled protocol takes care of it)
  • Media streaming (lost frames are ok)
  • Games that don’t care if you get every update
  • Local broadcast mechanisms (same application running on different machines “discovering” each other)
28
Q

Can you give an example of how data moves through the OSI layers when it is transmitted over the network?

A

When a user sends an HTTP request to a web server over the network:
1. An HTTP header is added to the data at the Application Layer.
2. A TCP header is then added at the Transport Layer, encapsulating the data into TCP segments. This header contains the source port, destination port and sequence number.
3. The segments are then encapsulated with an IP header at the Network Layer. The IP header contains the source and destination IP addresses.
4. A source and a destination MAC address are added at the Data Link Layer. (These are usually not the MAC addresses of the original source and final destination but those of the devices on either end of the next hop of a usually long journey.)
5. The encapsulated data is then sent over the network as raw bits at the Physical Layer.

When the web server receives the raw bits from the network, it reverses the process. The headers are removed layer by layer. Eventually the web server processes the HTTP request.
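
The wrapping and unwrapping can be sketched with readable placeholder headers. This is not the real binary layout of TCP, IP or Ethernet headers: the bracketed text headers, the `[FCS]` trailer stand-in, and all ports and addresses are made up purely to show the nesting order.

```python
FCS = b"[FCS]"   # placeholder for the Ethernet trailer (frame check sequence)


def encapsulate(http_request: bytes) -> bytes:
    """Wrap an HTTP request in toy TCP, IP and Ethernet headers."""
    segment = b"[TCP src_port=51000 dst_port=80 seq=1]" + http_request
    packet = b"[IP src=192.168.1.10 dst=203.0.113.5]" + segment
    frame = b"[ETH src=aa:aa:aa:aa:aa:aa dst=bb:bb:bb:bb:bb:bb]" + packet + FCS
    return frame


def strip_header(data: bytes) -> bytes:
    """Remove one toy header, which ends at the first closing bracket."""
    return data[data.index(b"]") + 1:]


def decapsulate(frame: bytes) -> bytes:
    """Reverse the process on the receiving side, layer by layer."""
    packet = strip_header(frame[:-len(FCS)])   # Data Link Layer: drop header and trailer
    segment = strip_header(packet)             # Network Layer: drop the IP header
    return strip_header(segment)               # Transport Layer: drop the TCP header
```

Calling `decapsulate(encapsulate(request))` returns the original request bytes, mirroring what the web server does before it processes the HTTP request.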

29
Q

What is the device that a computer needs to connect to a network?

A

A NIC (Network Interface Card).

30
Q

What are 10.0.0.0, 192, and 172 addresses used for?

A

Private IP addresses are used within private networks and are not routable on the internet. They are defined by the Internet Engineering Task Force (IETF) in RFC 1918 and fall within the following ranges:

IPv4 Private Address Ranges:
10.0.0.0 to 10.255.255.255 (10.0.0.0/8)
172.16.0.0 to 172.31.255.255 (172.16.0.0/12)
192.168.0.0 to 192.168.255.255 (192.168.0.0/16)
These ranges are designated for private use and are commonly used in home, office, and enterprise networks.
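
Whether an address falls in one of these three ranges can be checked with Python's standard `ipaddress` module. The function name `is_private_ipv4` is made up for this sketch; it checks exactly the three ranges listed above and nothing else.

```python
import ipaddress

# The three private IPv4 ranges listed above.
PRIVATE_RANGES = [
    ipaddress.ip_network("10.0.0.0/8"),
    ipaddress.ip_network("172.16.0.0/12"),
    ipaddress.ip_network("192.168.0.0/16"),
]


def is_private_ipv4(address: str) -> bool:
    """Return True if the address lies in one of the three private ranges."""
    ip = ipaddress.ip_address(address)
    return any(ip in network for network in PRIVATE_RANGES)
```

Note that 172.31.255.255 is private while 172.32.0.1 is not, because the /12 range ends at 172.31.255.255.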

31
Q

What is a 127 IP address?

A

The IP address range starting with 127.x.x.x is designated for loopback addresses, not private IP addresses. Loopback addresses are used for testing and diagnostics within a single host.

127.0.0.0 to 127.255.255.255 (127.0.0.0/8):

This range is reserved for loopback addresses.
The most commonly used address within this range is 127.0.0.1, which is referred to as “localhost.”
Loopback addresses allow a device to send and receive network traffic to itself, which is useful for testing and troubleshooting network applications and services on the local machine.
Traffic sent to a loopback address is not sent over any physical network interface but is instead looped back by the operating system.
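
As a quick check of the range, Python's standard `ipaddress` module exposes an `is_loopback` property; the wrapper function name below is made up for this sketch.

```python
import ipaddress


def is_loopback(address: str) -> bool:
    """Return True if the address is a loopback address (127.0.0.0/8 for IPv4)."""
    return ipaddress.ip_address(address).is_loopback
```

Any address from 127.0.0.0 to 127.255.255.255 is reported as loopback, not only 127.0.0.1.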

32
Q

What is 255.255.255.255 used for?

A

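255.255.255.255 is the limited broadcast address. It is used to send packets to all hosts on the local network, and routers do not forward it.

By contrast, each network also has its own network-specific (directed) broadcast address (e.g., 192.168.1.255 for the network 192.168.1.0/24).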
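
The network-specific broadcast address is the last address of the network, and it can be computed with Python's standard `ipaddress` module. The function name `broadcast_address` is made up for this sketch.

```python
import ipaddress


def broadcast_address(cidr: str) -> str:
    """Return the broadcast address (all host bits set to 1) of a network."""
    return str(ipaddress.ip_network(cidr).broadcast_address)
```

For example, `broadcast_address("192.168.1.0/24")` returns `"192.168.1.255"`.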

33
Q

What is 0.0.0.0 for?

A

0.0.0.0: Used to denote an unknown or non-specific address (e.g., default route).