Part 1 - Internet and HTTP Flashcards
First quarter of CMPUT 404, basic understanding of how the web works
What is a web application?
A graphical computer program that a user interacts with in a web browser. It often also has a server-side component that runs on a web server.
It also relies on hypertext.
Why do we use the web?
We use the web to request, search, navigate and share info. Also to access and operate software.
Give an example use case of using the web to request, search, navigate and share info.
A social media website: we can “search” for users, “request” to follow them, and “share” our own posts.
Give me an example use case of accessing and operating software
VS Code can run in a web browser (e.g. vscode.dev)! It is software used to create programs, i.e. an IDE (integrated development environment).
What is Ethernet?
A fundamental networking protocol that frames data for transmission over physical media (cables).
Ethernet works within a single local network, e.g. a laptop uses an Ethernet cable to talk to a router.
In the context of web applications and architectures, how does Ethernet work? Give an example.
Ethernet frames carry the link-level data that sits underneath higher-level protocols like IP, TCP and HTTP.
When you send an HTTP request in your app, it travels through multiple layers of protocols. At the lowest level, it’s carried inside an Ethernet frame, which is like a delivery box that helps it reach its destination.
What is an ethernet frame?
Structured packets of data transmitted over ethernet.
What are the key components of an Ethernet Frame?
Preamble - syncs sender and receiver by providing a signal to start processing the frame
Start Frame Delimiter (SFD) - marks the start of the Ethernet frame
Destination MAC address - unique address of recipient hardware device
Source MAC address - unique address of sender hardware device
Type/Length Field - Specifies protocol (IPv4, ARP) or payload size
Payload - data being transmitted
Frame Check Sequence (FCS) - ensures integrity by detecting errors in transmission
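As a rough illustration of the layout above, this sketch unpacks the 14-byte header (destination MAC, source MAC, Type/Length) from a hand-built frame. The sample bytes and the helper name parse_ethernet_header are made up for this card. The preamble, SFD and FCS are normally handled by the network hardware, so they are left out here.

```python
import struct

def parse_ethernet_header(frame: bytes) -> dict:
    """Split the first 14 bytes of a frame into its three header fields."""
    dst, src, type_or_len = struct.unpack("!6s6sH", frame[:14])
    return {
        "dst_mac": dst.hex(":"),
        "src_mac": src.hex(":"),
        "type_or_length": type_or_len,
        "payload": frame[14:],
    }

# Hypothetical frame: broadcast destination, made-up source, EtherType 0x0800 (IPv4)
sample = bytes.fromhex("ffffffffffff" "020000000001" "0800") + b"payload bytes"
header = parse_ethernet_header(sample)
print(header["dst_mac"], header["src_mac"], hex(header["type_or_length"]))
```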
Ethernet frames are important to the web. Why is knowing the minimum packet size that can be sent over an ethernet frame important?
Ethernet frames have a minimum size, so smaller data must be padded to meet the requirement.
Knowing this helps us transmit efficiently: padding is pure overhead, so where possible we can send more useful data in the same frame instead of wasting that space.
Ethernet frames are important to the web. What do we know about the potential waste in a transmission?
Transmitting small amounts of data (1 byte) incurs overhead from headers and trailers.
Headers can be disproportionately large compared to the actual payload.
What is fragmentation in ethernet frames?
If a message exceeds the maximum transmission unit (MTU), it must be fragmented (split) into multiple frames.
What does fragmentation do to latency? Why does this happen?
Splitting large messages across frames adds latency because the pieces must be reassembled at the destination.
Ensuring payloads fit within a single frame minimizes these delays.
Why is ethernet crucial to understand when optimizing web applications? What can we do to keep things optimal?
Ethernet is prevalent, so understanding its limitations (a message exceeding the MTU causes extra latency) is crucial.
Keeping data smaller than about 1.5 KB (the typical 1500-byte MTU) keeps it within a single frame, avoiding fragmentation and in turn reducing latency.
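A quick sketch of the arithmetic, assuming a 1500-byte MTU. The helper name frames_needed is made up, and the sketch ignores the IP/TCP headers that also take up part of each frame in practice.

```python
import math

MTU = 1500  # typical Ethernet payload limit in bytes (assumed default)

def frames_needed(message_bytes: int, mtu: int = MTU) -> int:
    """Number of frames a message needs if each frame carries at most `mtu` bytes."""
    return max(1, math.ceil(message_bytes / mtu))

for size in (200, 1500, 1501, 6000):
    print(size, "bytes ->", frames_needed(size), "frame(s)")
```

Going one byte over the limit already doubles the number of frames.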
If we just need to keep data small, why not just send lots of small data?
Sending minimal data (e.g., a 1-byte payload) still incurs the full size of Ethernet headers (14 bytes), frame check sequence (4 bytes), and potentially IP/TCP overhead.
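To put rough numbers on this, the sketch below computes how much of a frame is useful data. It uses the 14-byte header and 4-byte FCS from the card, plus the standard 46-byte minimum payload (not stated on the card). The helper name frame_efficiency is made up, and IP/TCP headers and the preamble are ignored.

```python
HEADER = 14       # destination MAC + source MAC + Type/Length
FCS = 4           # frame check sequence
MIN_PAYLOAD = 46  # shorter payloads are padded up to this size

def frame_efficiency(data_bytes: int) -> float:
    """Fraction of the frame (header + payload + FCS) that is useful data."""
    payload = max(data_bytes, MIN_PAYLOAD)  # padding is added if needed
    return data_bytes / (HEADER + payload + FCS)

for size in (1, 46, 1500):
    print(f"{size:>4} data bytes -> {frame_efficiency(size):.1%} useful")
```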
What are the problems with ethernet frames in terms of communication? (Hint: Is ethernet routable?)
Ethernet is not routable.
Ethernet frames are limited to communication within a local network.
To communicate across networks (between a computer and a server on the internet), we need a routable protocol like IP.
How did we fix the problem of ethernet not being able to communicate between different networks?
By introducing IP (IPv4), which uniquely identifies devices on a network and enables efficient communication across billions of devices.
What is IPv4?
The core protocol that allows devices on different networks to communicate globally.
It assigns addresses to devices within a network.
What kind of addresses does IPv4 give?
32-bit addresses: represented as 4 decimal octets
e.g. 192.168.0.1
Around 4.3 billion unique addresses
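A small sketch using Python's standard ipaddress module to show that the dotted form is just a 32-bit number split into four octets. The variable names are illustrative only.

```python
import ipaddress

addr = ipaddress.IPv4Address("192.168.0.1")
as_int = int(addr)          # the same address as one 32-bit number
octets = list(addr.packed)  # the four octets as integers
total = 2 ** 32             # size of the IPv4 address space

print(as_int, octets, total)
```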
Is IP stateless?
Yes, IP does not maintain any connection state; it simply routes each packet independently.
What was the problem with IPv4?
We ran out of addresses. Due to the rapid growth in the number of devices (e.g. IoT), IPv4 ran out of available addresses, leading to the adoption of IPv6 as a successor with a vastly larger address space.
How is IPv6 an improvement on IPv4?
IPv6 expands the address space of IPv4, allowing for a practically inexhaustible number of addresses and solving the IPv4 exhaustion issue.
Increased address size from 32 to 128 bits
What kind of protocols do IPv6 and IPv4 support?
They both support TCP/UDP, ensuring compatibility with the existing transport protocols used in web apps.
How are IPv6 addresses different from IPv4?
Written in hexadecimal separated by colons.
ex. 2001:0db8:0000:0000:0000:0000:0000:0001
How can the long addresses of IPv6 be abbreviated?
By omitting the leading zeros in each block and replacing one run of consecutive all-zero blocks with "::"
e.g.
2001:0db8:0000:0000:0000:0000:0000:0001
goes to
2001:db8:0:0:0:0:0:1
which can go to:
2001:db8::1
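The same abbreviation can be checked with Python's standard ipaddress module. This is only a sketch; the variable names are illustrative.

```python
import ipaddress

full = "2001:0db8:0000:0000:0000:0000:0000:0001"
addr = ipaddress.IPv6Address(full)

short = addr.compressed  # leading zeros dropped, zero run replaced by "::"
long = addr.exploded     # back to the full eight-block form
print(short, long)
```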
How are the host and port representations different in IPv4 and IPv6?
IPv4: 192.168.1.1:443 (host:port)
IPv6: https://[2001:db8::1]:443/
[host]:port
Colons are part of the address in IPv6, so you have to wrap the host in square brackets to separate it from the port.
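A short sketch with the standard urllib.parse module, which understands the bracket notation. The two URLs are the ones from this card.

```python
from urllib.parse import urlsplit

v4 = urlsplit("https://192.168.1.1:443/")
v6 = urlsplit("https://[2001:db8::1]:443/")

print(v4.hostname, v4.port)
print(v6.hostname, v6.port)  # hostname comes back without the brackets
```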
What is UDP?
User Datagram Protocol, a lightweight transport protocol. It is connectionless, meaning there is no session between sender and receiver. It is designed for applications that can handle communication on their own without requiring guarantees from the transport layer.
What are checksums?
A checksum is a small value computed from the data so the receiver can detect corruption. UDP includes a checksum field for basic error checking and data integrity.
It does not guarantee reliability but can detect some corruption.
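A minimal sketch of the 16-bit ones' complement checksum used by UDP (and IP/TCP). Real UDP also covers a pseudo-header and the UDP header, which this sketch skips. The function name and sample data are made up.

```python
def internet_checksum(data: bytes) -> int:
    """16-bit ones' complement of the ones' complement sum of the data."""
    if len(data) % 2:
        data += b"\x00"  # pad odd-length data with a zero byte
    total = 0
    for i in range(0, len(data), 2):
        total += (data[i] << 8) | data[i + 1]  # add each 16-bit word
    while total >> 16:
        total = (total & 0xFFFF) + (total >> 16)  # fold carries back in
    return ~total & 0xFFFF

data = b"hello!"
checksum = internet_checksum(data)

# Receiver side: data plus its checksum sums to zero when nothing was corrupted
intact = internet_checksum(data + checksum.to_bytes(2, "big"))
corrupted = internet_checksum(b"hellp!" + checksum.to_bytes(2, "big"))
print(intact == 0, corrupted == 0)
```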
What do the port numbers UDP provides do?
It provides port numbers for application level multiplexing, enabling multiple applications to use the same network connection.
Is UDP stateless?
Yes, unlike TCP, UDP does not maintain a connection state between the sender and the receiver.
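A small loopback sketch with Python's standard socket module: the sender just fires a datagram at an address, with no connection set up first. On the loopback interface the datagram normally arrives, but UDP itself does not promise that, hence the timeout.

```python
import socket

# Receiver: bind to any free port on the loopback interface
receiver = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
receiver.bind(("127.0.0.1", 0))
receiver.settimeout(2.0)  # do not wait forever if the datagram is lost
address = receiver.getsockname()

# Sender: no connect() or handshake, just send a datagram to the address
sender = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
sender.sendto(b"hello over UDP", address)

data, source = receiver.recvfrom(1024)
print(data, "from", source)

sender.close()
receiver.close()
```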
What does it mean when it is said that “UDP is lossy and unordered”?
Data packets (datagrams) sent over UDP can be
- Lost (never received by destination)
- Received out of order (Sent: 0,1,2,3; Received: 0,2,1,3)
UDP does not retransmit lost packets or reorder them
What does UDP guarantee?
Nothing; UDP does not guarantee
- delivery of packets
- order of packets
- protection against duplication
Why is UDP used in real-time applications?
Prioritizes low latency over reliability, making it suitable for video streaming and gaming
What are the key characteristics of UDP?
Stateless, lossy, connectionless, unordered, no guarantees.
What is DNS?
Domain Name System
System that maps human-readable domain names (google.com) to IP addresses (172.217.164.110)
What can DNS bind to?
Can bind:
a name to an IP address.
(e.g. google.com => its IP address)
a name to another name or a set of IP addresses.
(ai.com => chatgpt.com)
What are DNS records?
A records: Map a domain name to an IP address.
CNAME Records: Point one domain name to another (ai=>chatgpt)
How does DNS work with IPv4 and IPv6?
Seamlessly: AAAA records map names to IPv6 addresses,
and A records map names to IPv4 addresses.
What are tools we can use to check DNS?
host: simple tool for DNS lookups
dig: Detailed DNS query tool
nslookup: Legacy tool for DNS troubleshooting
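Besides the command-line tools, a lookup can be sketched from code with socket.getaddrinfo, which asks the system resolver (this may also consult the local hosts file, not only DNS). The helper name resolve is made up. IPv4 results roughly correspond to A records and IPv6 results to AAAA records.

```python
import socket

def resolve(name: str) -> dict:
    """Return the IPv4 and IPv6 addresses the system resolver finds for a name."""
    found = {"IPv4": set(), "IPv6": set()}
    for family, _type, _proto, _canon, sockaddr in socket.getaddrinfo(name, None):
        if family == socket.AF_INET:
            found["IPv4"].add(sockaddr[0])
        elif family == socket.AF_INET6:
            found["IPv6"].add(sockaddr[0])
    return found

# An IP literal needs no lookup, so this line works even without network access
print(resolve("127.0.0.1"))
# With network access, try a real name: resolve("google.com")
```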
What is TCP?
A transport protocol used for reliable communications between devices over the internet.
Ensures data is delivered in order, without loss, and without duplication.
Is TCP connection oriented?
Yes, TCP is connection-oriented, meaning it establishes a connection before data transfer begins.
How does TCP establish a connection?
Establishes a connection by using a 3-way handshake:
SYN: Client requests a connection
SYN-ACK: Server acknowledges the request and sends its own request to connect.
ACK: Client acknowledges the server’s request, completing the handshake
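In application code the handshake is done by the operating system. A loopback sketch with Python's standard socket module: create_connection() returns only after SYN, SYN-ACK and ACK have been exchanged. The message text is made up.

```python
import socket

# Server side: listen on any free loopback port
server = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
server.bind(("127.0.0.1", 0))
server.listen(1)

# Client side: create_connection() returns once the 3-way handshake is done
client = socket.create_connection(server.getsockname(), timeout=2.0)
conn, peer = server.accept()  # hands over the already-established connection
conn.settimeout(2.0)

client.sendall(b"hello over TCP")
data = conn.recv(1024)
print(data, "from", peer)

for s in (client, conn, server):
    s.close()
```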
Does TCP maintain order?
Yes, data is reassembled in the correct order at the receiving end.
0, 1, 2, 3, 4 -> 0, 1, 2, 3, 4
Is TCP prevalent?
Yes, TCP is widely used and is the backbone of most internet applications including:
HTTP (web browsing)
FTP (file transfer)
SMTP (email)
IMAP/POP3 (email retrieval)
What do firewalls do?
They usually prevent hosts from communicating on certain ports or from hosting services.
What is the impact of firewalls on HTTP and web clients?
Firewalls often block inbound traffic unless explicitly allowed.
For HTTP and web clients:
Web clients (like browsers) are generally not allowed to act as web servers.
Communication must be initiated by clients (e.g., browsers making HTTP requests) rather than the web server pushing data.
What is HTTP?
Hypertext Transfer Protocol
What is Hypertext?
Text that contains links to other texts or resources.
HTTP facilitates the transmission of hypertext across the web, allowing users to navigate between resources (e.g. web pages)
How does HTTP transport?
HTTP serves as the application-layer protocol to transport requests and responses between clients (browsers) and servers.
Relies on lower-layer protocols for the actual data transfer: TCP for HTTP/1.1 and HTTP/2, and UDP (via QUIC) for HTTP/3.
What does Protocol mean in terms of HTTP?
A set of rules that define how communication occurs between systems
Is HTTP stateless?
Yes! Each request-response cycle is independent of the others.
What kind of headers does HTTP allow the use of?
Custom Headers, allowing for extendable functionality.
e.g. Developers can define headers for specific use cases (X-Custom-Header) or adopt new features without modifying the protocol itself
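A sketch of what such a request looks like on the wire, built by hand. The helper name build_request and the header value are made up for illustration; real clients (browsers, HTTP libraries) assemble this text for you.

```python
def build_request(method: str, path: str, host: str, headers: dict) -> bytes:
    """Assemble a minimal HTTP/1.1 request as the bytes sent over the wire."""
    lines = [f"{method} {path} HTTP/1.1", f"Host: {host}"]
    lines += [f"{name}: {value}" for name, value in headers.items()]
    return ("\r\n".join(lines) + "\r\n\r\n").encode("ascii")

request = build_request("GET", "/resource1", "example.com",
                        {"X-Custom-Header": "demo-value"})  # value is made up
print(request.decode("ascii"))
```

The custom header is just one more "Name: value" line, which is why adding one does not require changing the protocol.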
What kind of request/command pattern does HTTP use?
HTTP uses a request/command oriented pattern, where the client sends requests (GET/POST etc) and servers respond with appropriate actions and data
What kind of pairing does HTTP rely on?
HTTP relies on the client-server model, i.e. on interaction between web clients (browsers) and web servers.
Clients initiate requests.
Servers process them and return responses.
How does HTTP identify resources on the web?
HTTP uses URIs to uniquely identify resources on the web.
URIs allow multiple resources to be hosted on a single server by distinguishing them using paths.
e.g. example.com/resource1
example.com/resource2
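A tiny sketch of the idea: one host, several resources, told apart by the path part of the URI. The resources dictionary and the lookup helper are hypothetical stand-ins for what a real web server does.

```python
from urllib.parse import urlsplit

# Hypothetical resources hosted on one server, keyed by path
resources = {"/resource1": "first resource", "/resource2": "second resource"}

def lookup(uri: str) -> str:
    """Pick the resource a URI names by looking at its path component."""
    parts = urlsplit(uri)
    return resources.get(parts.path, "404 Not Found")

print(lookup("http://example.com/resource1"))
print(lookup("http://example.com/resource2"))
```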