Part 1 - Internet and HTTP Flashcards

First quarter of CMPUT 404, basic understanding of how the web works

1
Q

What is a web application?

A

A graphical computer program that a user interacts with in a web browser. Web applications often also have a server-side component that runs on a web server.
They also rely on hypertext.

2
Q

Why do we use the web?

A

We use the web to request, search, navigate and share info. Also to access and operate software.

3
Q

Give an example use case of using the web to request, search, navigate and share info.

A

A social media website, we can “search” for users on the social media website, “request” to follow them, and share our own posts on the website.

4
Q

Give me an example use case of accessing and operating software

A

VS Code can run as a website (e.g. vscode.dev). It is an IDE (integrated development environment): software used to write programs.

5
Q

What is Ethernet?

A

A fundamental networking protocol that frames data for transmission over physical media (cables).

Ethernet works within a single network, e.g. a laptop cabled to a Wi-Fi router uses Ethernet to talk to it.

6
Q

In the context of web-applications and architectures, how does ethernet work? Give an example

A

Ethernet frames carry the network-level data that is under higher-level protocols like HTTP and web requests.

When you send an HTTP request in your app, it travels through multiple layers of protocols. At the lowest level, it’s carried inside an Ethernet frame, which is like a delivery box that helps it reach its destination.

7
Q

What is an ethernet frame?

A

Structured packets of data transmitted over ethernet.

8
Q

What are the key components of an Ethernet Frame?

A

Preamble - syncs sender and receiver by providing a signal to start processing the frame
Start Frame Delimiter - start of ethernet frame
Destination MAC address - unique address of recipient hardware device
Source MAC address - unique address of sender hardware device
Type/Length Field - Specifies protocol (IPv4, ARP) or payload size
Payload - data being transmitted
Frame Check Sequence (FCS) - ensures integrity by detecting errors in transmission

9
Q

Ethernet frames are important to the web. Why is knowing the minimum packet size that can be sent over an ethernet frame important?

A

Ethernet frames have a minimum size, so smaller payloads must be padded to meet the requirement.

Knowing this helps avoid unnecessary overhead: the padding bytes are transmitted anyway, so you can send more useful data in the same frame at no extra cost.

10
Q

Ethernet frames are important to the web. What do we know about the potential waste in a transmission?

A

Transmitting small amounts of data (1 byte) incurs overhead from headers and trailers.

Headers can be disproportionately large compared to the actual payload.

11
Q

What is fragmentation in ethernet frames?

A

If a message exceeds the maximum transmission unit (MTU) it must be fragmented (split) into multiple frames.

12
Q

What does fragmentation do to latency?
Why does this happen?

A

Splitting large messages across frames adds latency because you need to reconstruct them at the destination.

Ensuring payloads fit within a single frame minimizes these delays.

13
Q

Why is ethernet crucial to understand when optimizing web applications? What can we do to keep things optimal?

A

Ethernet is prevalent, so understanding its limitations (e.g. a message exceeding the MTU is fragmented, which adds latency) is crucial.

Keeping data smaller than about 1.5 KB (the standard Ethernet MTU is 1500 bytes) keeps it within a single frame, avoiding fragmentation and in turn reducing latency.

14
Q

If we just need to keep data small, why not just send lots of small data?

A

Sending minimal data (e.g., a 1-byte payload) still incurs the full size of Ethernet headers (14 bytes), frame check sequence (4 bytes), and potentially IP/TCP overhead.
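
The overhead can be worked out with a little arithmetic. The following is a minimal sketch, assuming typical sizes that the card does not state: a 46-byte minimum Ethernet payload, and 20-byte IPv4 and TCP headers without options. The preamble and start frame delimiter are not counted, and the function name is illustrative.

```python
ETH_HEADER = 14       # destination MAC + source MAC + type/length
ETH_FCS = 4           # frame check sequence
ETH_MIN_PAYLOAD = 46  # assumed minimum payload; shorter payloads are padded
IP_HEADER = 20        # assumed IPv4 header without options
TCP_HEADER = 20       # assumed TCP header without options


def frame_efficiency(app_bytes: int) -> float:
    """Fraction of the frame's bytes that are application data."""
    payload = IP_HEADER + TCP_HEADER + app_bytes
    padded = max(payload, ETH_MIN_PAYLOAD)  # pad up to the minimum size
    frame = ETH_HEADER + padded + ETH_FCS
    return app_bytes / frame


print(f"{frame_efficiency(1):.1%}")     # 1 byte of data in a 64-byte frame
print(f"{frame_efficiency(1460):.1%}")  # a full 1500-byte payload
```

Under these assumptions a 1-byte message uses 1 of 64 bytes, while a full frame is mostly application data.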

15
Q

What are the problems with ethernet frames in terms of communication? (Hint: Is ethernet routable?)

A

Ethernet is not routable.

Ethernet frames are limited to communication with a local network.

To communicate across networks (between a computer and a server on the internet), we need a routable protocol like IP.

16
Q

How did we fix the problem of ethernet not being able to communicate between different networks?

A

By introducing IP (IPv4).

IP uniquely identifies devices across networks, enabling efficient communication among billions of devices.

17
Q

What is IPv4?

A

The core protocol that allows devices on different networks to communicate globally.
It assigns an address to each device on a network.

18
Q

What kind of addresses does IPv4 give?

A

32-bit addresses: represented as 4 decimal octets

e.g. 192.168.0.1

Around 4.3 billion unique addresses
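
To make the "32 bits as 4 octets" idea concrete, here is a small sketch using Python's standard ipaddress module. The variable names are illustrative.

```python
import ipaddress

addr = ipaddress.IPv4Address("192.168.0.1")

as_int = int(addr)          # the same address as one 32-bit number
octets = list(addr.packed)  # the four 8-bit octets
total = 2 ** 32             # size of the IPv4 address space

print(as_int)  # 3232235521
print(octets)  # [192, 168, 0, 1]
print(total)   # 4294967296, i.e. around 4.3 billion
```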

19
Q

Is IP stateless?

A

Yes, IP does not maintain any connection state; it simply routes packets independently.

20
Q

What was the problem with IPv4?

A

We ran out of addresses. Due to the rapid growth in the number of devices (e.g. IoT), IPv4 ran out of available addresses, leading to the adoption of IPv6 as a successor with a vastly larger address space.

21
Q

How is IPv6 an improvement over IPv4?

A

IPv6 expands the address space of IPv4, allowing for a practically inexhaustible number of addresses (2^128) and solving the IPv4 exhaustion issue.

The address size increased from 32 to 128 bits.

22
Q

What kinds of protocols do IPv6 and IPv4 support?

A

They both support TCP and UDP, ensuring compatibility with the existing transport protocols used in web apps.

23
Q

How are IPv6 addresses different from IPv4?

A

IPv6 addresses are written as eight groups of four hexadecimal digits, separated by colons.

e.g. 2001:0db8:0000:0000:0000:0000:0000:0001

24
Q

How can the long addresses of IPv6 be abbreviated?

A

By omitting leading zeros in each block and replacing one run of consecutive all-zero blocks with "::" (this can be done only once per address).

e.g.

2001:0db8:0000:0000:0000:0000:0000:0001
goes to
2001:db8:0:0:0:0:0:1
which can go to:
2001:db8::1
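
The abbreviation rules can be checked with Python's standard ipaddress module. This is only a sketch to confirm the example above; the variable names are illustrative.

```python
import ipaddress

full = "2001:0db8:0000:0000:0000:0000:0000:0001"
addr = ipaddress.IPv6Address(full)

short = addr.compressed   # leading zeros dropped, zero run replaced by "::"
restored = addr.exploded  # back to the full eight-group form

print(short)     # 2001:db8::1
print(restored)  # 2001:0db8:0000:0000:0000:0000:0000:0001

# The intermediate form from the card is the same address.
same = ipaddress.IPv6Address("2001:db8:0:0:0:0:0:1") == addr
print(same)      # True
```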

25
How are the host and port representations different in IPv4 and IPv6?
IPv4: 192.168.1.1:443 (host:port)
IPv6: https://[2001:db8::1]:443/ ([host]:port)
Colons are part of the address in IPv6, so you have to wrap the host in square brackets to separate it from the port.
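A short sketch of how a URL parser separates host and port in both cases, using urlsplit from Python's standard library. The IPv4 URL is the card's address with an https:// scheme added for illustration.

```python
from urllib.parse import urlsplit

v6 = urlsplit("https://[2001:db8::1]:443/")
v4 = urlsplit("https://192.168.1.1:443/")

# hostname strips the square brackets; port is parsed as an integer.
print(v6.hostname, v6.port)  # 2001:db8::1 443
print(v4.hostname, v4.port)  # 192.168.1.1 443
```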
26
What is UDP?
User Datagram Protocol, a lightweight transport protocol. It is connectionless, meaning there is no session between sender and receiver. It is designed for applications that can handle communication independently, without requiring guarantees from the transport layer.
27
What are checksums?
UDP includes a checksum field for basic error checking and data integrity. It does not guarantee reliability but can detect some corruption.
28
What do the port numbers UDP provides do?
It provides port numbers for application level multiplexing, enabling multiple applications to use the same network connection.
29
Is UDP stateless?
Yes, unlike TCP, UDP does not maintain a connection state between the sender and the receiver.
30
What does it mean when it is said that "UDP is lossy and unordered"?
Data packets (datagrams) sent over UDP can be:
- Lost (never received by the destination)
- Received out of order (Sent: 0,1,2,3; Received: 0,2,1,3)
UDP does not retransmit lost packets or reorder them.
31
What does UDP guarantee?
Nothing; UDP does not guarantee:
- delivery of packets
- order of packets
- protection against duplication
32
Why is UDP used in real-time applications?
Prioritizes low latency over reliability, making it suitable for video streaming and gaming
33
What are the key characteristics of UDP?
Stateless, lossy, connectionless, unordered, no guarantees.
34
What is DNS?
Domain Name System: a system that maps human-readable domain names (google.com) to IP addresses (172.217.164.110).
35
What can DNS bind to?
DNS can bind:
- a name to an IP address (e.g. google.com to its address)
- a name to another name or to a set of IP addresses (e.g. ai.com => chatgpt.com)
36
What are DNS records?
A records: map a domain name to an IP address.
CNAME records: point one domain name to another (ai => chatgpt).
37
How does DNS work with IPv4 and IPv6?
Seamlessly: AAAA records hold IPv6 addresses and A records hold IPv4 addresses.
38
What tools can we use to check DNS?
- host: simple tool for DNS lookups
- dig: detailed DNS query tool
- nslookup: legacy tool for DNS troubleshooting
39
What is TCP?
A transport protocol used for reliable communications between devices over the internet. Ensures data is delivered in order, without loss, and without duplication.
40
Is TCP connection oriented?
Yes, TCP is connection oriented, meaning it establishes a connection before data transfer begins
41
How does TCP establish a connection?
Establishes a connection by using a 3-way handshake:
- SYN: Client requests a connection.
- SYN-ACK: Server acknowledges the request and sends its own request to connect.
- ACK: Client acknowledges the server's request, completing the handshake.
42
Does TCP maintain order?
Yes, data is reassembled in the correct order at the receiving end. 0, 1, 2, 3, 4 -> 0, 1, 2, 3, 4
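A minimal sketch of a TCP connection on one machine, assuming the loopback interface (127.0.0.1) is available. The operating system performs the 3-way handshake inside connect() and accept(); the function name is illustrative.

```python
import socket


def tcp_round_trip(message: bytes) -> bytes:
    """Send message from a client to a server over loopback TCP."""
    # Server side: listen on an OS-assigned free port on localhost.
    server = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    server.bind(("127.0.0.1", 0))
    server.listen(1)
    port = server.getsockname()[1]

    # connect() returns once the OS has completed SYN, SYN-ACK, ACK.
    client = socket.create_connection(("127.0.0.1", port), timeout=5)
    conn, _peer = server.accept()
    conn.settimeout(5)

    client.sendall(message)
    client.shutdown(socket.SHUT_WR)  # tell the server no more data is coming

    # Read until the client's end-of-stream; bytes arrive in the order sent.
    received = b""
    while True:
        chunk = conn.recv(1024)
        if not chunk:
            break
        received += chunk

    for sock in (conn, client, server):
        sock.close()
    return received


print(tcp_round_trip(b"0, 1, 2, 3, 4"))
```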
43
Is TCP prevalent?
Yes, TCP is widely used and is the backbone of most internet applications, including:
- HTTP (web browsing)
- FTP (file transfer)
- SMTP (email)
- IMAP/POP3 (email retrieval)
44
What do firewalls do?
Usually prevent hosts from communicating on certain ports, or hosting services.
45
What is the impact of firewalls on HTTP and web clients?
Firewalls often block inbound traffic unless explicitly allowed. For HTTP and web clients: Web clients (like browsers) are generally not allowed to act as web servers. Communication must be initiated by clients (e.g., browsers making HTTP requests) rather than the web server pushing data.
46
What is HTTP?
Hypertext Transfer Protocol
47
What is Hypertext?
Text that contains links to other texts or resources. HTTP facilitates the transmission of hypertext across the web, allowing users to navigate between resources (e.g. web pages).
48
How does HTTP transport?
HTTP serves as the application-layer protocol to transport requests and responses between clients (browsers) and servers. Relies on lower layer protocols (TCP/UDP) for actual data transfer
49
What does Protocol mean in terms of HTTP?
A set of rules that define how communication occurs between systems
50
Is HTTP stateless?
Yes! Each request-response cycle is independent of the others.
51
What kind of headers does HTTP allow the use of?
Custom Headers, allowing for extendable functionality. e.g. Developers can define headers for specific use cases (X-Custom-Header) or adopt new features without modifying the protocol itself
52
What kind of request/command pattern does HTTP use?
HTTP uses a request/command oriented pattern, where the client sends requests (GET/POST etc) and servers respond with appropriate actions and data
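To show what a request and a response look like on the wire, here is a minimal sketch that builds an HTTP/1.1 request message and parses a response status line. The helper names are hypothetical, and real clients send more headers than this.

```python
def build_request(method: str, host: str, path: str = "/") -> bytes:
    """Build a bare-bones HTTP/1.1 request with no body."""
    lines = [
        f"{method} {path} HTTP/1.1",  # request line: method, target, version
        f"Host: {host}",              # Host is required in HTTP/1.1
        "Connection: close",
        "",                           # blank line ends the headers
        "",
    ]
    return "\r\n".join(lines).encode("ascii")


def parse_status_line(response: bytes) -> tuple:
    """Return (version, status code, reason phrase) from a raw response."""
    status_line = response.split(b"\r\n", 1)[0].decode("ascii")
    version, code, reason = status_line.split(" ", 2)
    return version, int(code), reason


print(build_request("GET", "example.com", "/index.html"))
print(parse_status_line(b"HTTP/1.1 404 Not Found\r\nContent-Length: 0\r\n\r\n"))
```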
53
What kind of pairing does HTTP rely on?
HTTP relies on the client-server model: the interaction between web clients (browsers) and web servers. Clients initiate requests; servers process them and return responses.
54
How does HTTP identify resources on the web?
HTTP uses URIs to uniquely identify resources on the web. URIs allow multiple resources to be hosted on a single server by distinguishing them using paths, e.g.
example.com/resource1
example.com/resource2
55
What does a GET request do?
- Retrieves information from the server
- Used for reading data
- Idempotent: repeated requests yield the same result
56
What does a POST request do?
- Sends data to the server for processing (form submission, creating a resource)
- Commonly used for login actions or appending data
- Non-idempotent: repeated requests may have different effects (e.g. creating multiple resources)
57
What does a HEAD request do?
Similar to GET, but retrieves only the headers, not the body. Useful for checking metadata (caching or file size).
58
What does a PUT request do?
- Stores or replaces a resource at the specified URI, e.g. uploading or updating a file
- Idempotent: repeated requests result in the same resource state
- A PUT request explicitly specifies the resource's URI
59
What does a DELETE request do?
- Removes the resource identified by the URI, e.g. deleting a user account
- Idempotent: multiple DELETE requests have the same effect
- The method may be overridden by human intervention on the origin server, so the client cannot be sure the deletion was actually carried out
60
What does a PATCH request do?
Partially modifies a resource. E.g. updating only a user's email in their profile
61
What does an OPTIONS request do?
Returns the communication options available for a resource e.g. checking if a server supports specific methods (e.g. CORS preflight requests).
62
What does a TRACE request do?
Debugging method to see the request as received by the server. Used for diagnostic purposes only (not commonly used).
63
What does a CONNECT request do?
Establishes a tunnel to the server. Commonly used for HTTPS traffic through proxies.
64
What is a URI?
A generic term for identifying a resource on the web. Can be a URL or a URN.
URL: Uniform Resource Locator, e.g. http://ualberta.ca
URN: Uniform Resource Name, e.g. urn:ietf:rfc:3986
Most are URLs.
65
What is a URL?
Uniform Resource Locator. Tells you how to get to the resource.
66
What is a URN?
Uniform Resource Name. Tells you the unique name or number given to a resource by some organization.
67
What are the two main parts of a URL
Scheme and everything else
68
What are all of the components of a URL?
- scheme
- authority: username:password@ (optional), hostname, port; overall syntax: [username:password@]hostname[:port]
- path
- query
- fragment (optional)
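The components can be pulled apart with urlsplit from Python's standard library. A small sketch; the URL below is made up for illustration and combines every component.

```python
from urllib.parse import urlsplit

# Hypothetical URL containing every component.
url = "https://user:secret@ualberta.ca:8443/search?query=students#results"
parts = urlsplit(url)

print(parts.scheme)    # https
print(parts.username)  # user
print(parts.password)  # secret
print(parts.hostname)  # ualberta.ca
print(parts.port)      # 8443
print(parts.path)      # /search
print(parts.query)     # query=students
print(parts.fragment)  # results
```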
69
What does the scheme do in the URL?
Specifies the protocol used to access the resource.
Common schemes: https, http, mailto, file, data
Syntax: scheme://
70
What does the host do in the url?
The domain or IP address of the server
71
What does the port do?
The port number to connect to, e.g. :80 (the default for http), :443 (the default for https), or :8080 (a common alternative for http). Optional.
72
What does the username:password@ do in the URL?
Optional. Specifies credentials for accessing the resource.
73
What does the path do in the url?
Specifies the location of the resource on the server, e.g. /index.html in https://ualberta.ca/index.html.
74
What does the query do in the URL?
Provides additional parameters for the request, e.g. ?query=students in https://ualberta.ca/search?query=students
75
What does the fragment do in the url?
Identifies a specific section of the resource, e.g. #section1 in https://ualberta.ca/docs#section1, or #Applications in https://en.wikipedia.org/wiki/Methanol#Applications
76
What is an absolute url?
Specifies complete location of a resource, including: Scheme, authority, path
77
What is a relative url?
Omits part of the absolute URL and is interpreted relative to the context.
78
In terms of relative URLs, what is an implied-authority, absolute-path URL?
It starts with /, indicating that the path starts at the root of the domain, e.g. /images/web-server.svg
79
In terms of relative URLs, what is an implied-authority, relative-path URL?
It does not start with /, indicating that the path is relative to the current directory, e.g. images/web-server.svg
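Both kinds of relative URL can be resolved against a base with urljoin from Python's standard library. A small sketch; the base URL is a made-up page used only as the "current context".

```python
from urllib.parse import urljoin

# Hypothetical page the relative URLs appear on.
base = "https://ualberta.ca/docs/page.html"

# Absolute path: resolved from the root of the domain.
from_root = urljoin(base, "/images/web-server.svg")

# Relative path: resolved from the current directory (/docs/).
from_dir = urljoin(base, "images/web-server.svg")

print(from_root)  # https://ualberta.ca/images/web-server.svg
print(from_dir)   # https://ualberta.ca/docs/images/web-server.svg
```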
80
Why are URIs percent-encoded?
URIs may contain only a limited set of ASCII characters, yet they must accommodate all kinds of characters. Percent-encoding (%XX for each byte) ensures diverse resources can be represented on the web.
81
How are URLs encoded in HTTP?
Characters are encoded as UTF-8 (Unicode) bytes, which are then percent-encoded where needed.
82
What encoding do we use for domain names?
punycode encoding
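A brief sketch of both encodings using Python's standard library: quote percent-encodes the UTF-8 bytes of a path segment, and the built-in "idna" codec applies Punycode to a domain name. The sample strings are illustrative.

```python
from urllib.parse import quote, unquote

text = "café au lait"
encoded = quote(text)       # é becomes the UTF-8 bytes C3 A9, space becomes %20
decoded = unquote(encoded)  # round-trips back to the original text

# Domain names use Punycode (via IDNA) rather than percent-encoding.
ascii_domain = "münchen.de".encode("idna").decode("ascii")

print(encoded)       # caf%C3%A9%20au%20lait
print(decoded)       # café au lait
print(ascii_domain)  # xn--mnchen-3ya.de
```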
83
What are the use cases for HTTP POST?
- Submitting HTML forms: the standard method for sending form data (login, registration)
- Adding or mutating data: generally used when the server is expected to change state (e.g. adding a record to a database, uploading a file)
84
Can HTTP POST include query parameters?
Yes, similar to GET. However, the body (payload) of the POST request is where most of the data is sent.
85
How are POST parameters sent in a POST request?
In the POST request body, as application/x-www-form-urlencoded, or as multipart/form-data when uploading files.
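A small sketch of what an application/x-www-form-urlencoded body looks like, using urlencode and parse_qs from Python's standard library. The field names and values are made up.

```python
from urllib.parse import urlencode, parse_qs

# Hypothetical form fields.
form = {"username": "alice", "comment": "a b&c"}

body = urlencode(form)  # spaces become +, reserved characters become %XX
fields = parse_qs(body) # what the server recovers from the body

print(body)    # username=alice&comment=a+b%26c
print(fields)  # {'username': ['alice'], 'comment': ['a b&c']}
```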
86
What is multipart/form-data?
A MIME (Multipurpose Internet Mail Extensions) type specified in RFC 2388.
Used to send data, especially when a form contains file uploads or binary data.
Breaks the form data into parts, each with its own content headers, allowing binary data and plain text to coexist in one request.
87
What is multipart/form-data useful for?
file uploads
88
What is the process of using multipart/form-data?
The client first sends the request headers, including the content size (and Expect: 100-continue), so the server knows what to expect. The server responds with HTTP/1.1 100 Continue if it can handle the specified size. Once acknowledged, the client sends the data in the body of the request.
89
What tradeoff does multipart/form-data introduce?
Increased latency due to the validation step before sending the data body.
90
What are the differences between PUT and POST?
POST: the server decides how to process the request and often assigns a new URI to the resource. Commonly used for adding new resources or triggering a server-side process.
PUT: the URI itself specifies the resource being modified or created. The client takes full control over the resource's location.
91
How does the DELETE request work?
- The client sends a DELETE request to the server.
- The server removes the resource or marks it for deletion.
- A successful DELETE response doesn't always guarantee that the resource was deleted; it only indicates that the server intends to delete it or make it inaccessible.
92
What are the possible server responses to a DELETE request?
- 200 OK: resource deleted successfully; the server may include a message in the response body
- 204 No Content: resource deleted successfully; no response body included
- 202 Accepted: the server has acknowledged the request but has not yet completed the deletion
- 404/403: the resource doesn't exist or cannot be deleted
93
Why is POST less suitable for replacing resources compared to PUT?
POST lacks clear semantics for replacing resources and is not idempotent, unlike PUT.
94
What is the advantage of using DELETE instead of POST for removing resources?
DELETE provides semantic clarity and is idempotent, ensuring repeated requests have the same result.
95
How does PUT align with REST principles?
PUT is tied to the "update/replace" operation, making RESTful APIs more predictable and intuitive.
96
What is WebDAV, and how does it use PUT and DELETE?
WebDAV is an HTTP extension for collaborative file management, relying on PUT to upload files and DELETE to remove them.
97
What is the key distinction between PUT and POST?
PUT targets a specific resource identified by its URI, while POST allows the server to determine the resource's URI.
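The idempotency difference between PUT, POST and DELETE can be illustrated with a toy in-memory store. This is only a sketch of the semantics, not a real server; the class, its methods and the URIs are hypothetical.

```python
import itertools


class Store:
    """Toy resource store keyed by URI."""

    def __init__(self):
        self.resources = {}
        self._ids = itertools.count(1)

    def put(self, uri, body):
        # The client names the URI; repeating the call changes nothing.
        self.resources[uri] = body
        return uri

    def post(self, collection, body):
        # The server picks a new URI each time, so repeats create duplicates.
        uri = f"{collection}/{next(self._ids)}"
        self.resources[uri] = body
        return uri

    def delete(self, uri):
        # Deleting twice leaves the same state as deleting once.
        self.resources.pop(uri, None)


store = Store()
store.put("/notes/a", "hello")
store.put("/notes/a", "hello")           # still one resource
first = store.post("/notes", "hello")    # /notes/1
second = store.post("/notes", "hello")   # /notes/2, a second copy
store.delete(first)
store.delete(first)                      # no further effect
print(sorted(store.resources))           # ['/notes/2', '/notes/a']
```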
98
What is an HTTP user agent?
The user agent is the software acting on behalf of the user in HTTP communications. Commonly refers to:
- Web browsers (e.g., Chrome, Firefox, Safari)
- HTTP clients (e.g., curl, Python's requests library, Postman)
In RFCs, "user agent" often explicitly means a browser but can also include any HTTP client.
99
What is the role of the User Agent?
The user agent initiates HTTP requests to servers and processes their responses.
100
Why is the user-agent important?
It determines the format and features used in HTTP requests. HTTP headers like User-Agent identify the software making the request, enabling servers to adapt responses accordingly.
101
Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/71.0.3578.98 Safari/537.36 What is this user agent?
Chrome 71.0 on Windows 10 on a PC
102
Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:64.0) Gecko/20100101 Firefox/64.0 What is this user agent?
Firefox 64.0 on Windows 10 on a PC
103
Mozilla/5.0 (Macintosh; Intel Mac OS X 10_14_2) AppleWebKit/605.1.15 (KHTML, like Gecko) Version/12.0.2 Safari/605.1.15 What is this user agent?
Safari 12 on OSX 10.14
104
Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/64.0.3282.140 Safari/537.36 Edge/17.17134 What is this user agent?
MS Edge 17
105
What is the general structure of a user-agent string?
Product/version (system information) platform (platform details) extensions
Example: Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/71.0.3578.98 Safari/537.36
106
What do the 1xx HTTP status codes do?
Indicates the server has received the request and is processing it. Example: 100 Continue — The server has received the request headers, and the client can proceed with sending the request body.
107
What do the 2xx HTTP status codes do?
Confirms the request was successfully processed. Example: 200 OK — The request succeeded, and the server returned the requested data.
108
What do the 3xx HTTP status codes do?
Indicates the client must take additional action (e.g., follow a new URL). These are the redirect status codes. Example: 301 Moved Permanently (the resource has been permanently moved to a new location).
109
What do the 4xx HTTP status codes do?
Indicates an error on the client’s side (e.g., bad request). Example: 404 Not Found — The server cannot find the requested resource.
110
5XX — Server Errors:
Indicates an error on the server’s side while processing a valid request. Example: 500 Internal Server Error — The server encountered an unexpected condition.
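The five classes are determined by the first digit of the status code. A minimal sketch using HTTPStatus from Python's standard library for the reason phrases; the category labels and the function name are illustrative.

```python
from http import HTTPStatus

CLASSES = {
    1: "Informational",
    2: "Success",
    3: "Redirection",
    4: "Client error",
    5: "Server error",
}


def describe(code: int) -> str:
    """Return the code, its standard reason phrase, and its class."""
    category = CLASSES[code // 100]   # the first digit selects the class
    phrase = HTTPStatus(code).phrase  # standard reason phrase
    return f"{code} {phrase} ({category})"


for code in (100, 200, 301, 404, 500):
    print(describe(code))
```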
111
HTTP/1.1 201 Created:
Indicates that the request has succeeded, and a new resource has been created as a result. Typically used with POST or PUT requests.
112
HTTP/1.1 202 Accepted:
The request has been accepted for processing, but the processing is not yet complete. Example use case: A server needs time to process an uploaded file or run a long calculation. The response does not guarantee that the request will ultimately be fulfilled.
113
HTTP/1.1 203 Non-Authoritative Information:
Rarely used. Indicates that the returned information came from a proxy or intermediary, not the origin server. Example use case: A proxy server modifies or filters content before passing it to the client.
114
HTTP/1.1 204 No Content
Indicates that the request was successful, but the server is not returning a message body in the response. Only headers are returned. Useful for reducing bandwidth.
115
HTTP/1.1 205 Reset Content (rare)
Like 204 No Content but the browser should clear the form/page
116
HTTP/1.1 206 Partial Content
Indicates that the server is delivering part of the resource as requested by the client. Typically used in scenarios like resumable downloads or streaming large files.
117
What does 300 Multiple Choices mean?
The server provides multiple options for the resource, and the client must choose one.
118
When should you use 301 Moved Permanently?
When a resource has been permanently moved to a new URI.
119
What’s the difference between 301 and 302 redirects?
301 indicates a permanent move, while 302 is for temporary redirects.
120
How does a 302 Found redirect affect the address bar?
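In practice, browsers follow the redirect and show the new URI in the address bar. Because the move is temporary, the client should keep using the original URI for future requests.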
121
Provide an example of when to use 302 Found.
Redirecting users to a backup server during maintenance.
122
What is the main purpose of 303 See Other?
Designed to solve the forum-posting problem, where a POST request (e.g., submitting a form) could accidentally be re-submitted when a user refreshes the page.
123
What does 303 See Other do?
The server sends a 303 See Other response with a Location header indicating the new URI. The client is instructed to perform a GET request to the provided URI, instead of resubmitting the original POST request.
124
What happens to the URI in the location bar when a 303 See Other is issued?
The location bar updates to the new URI provided in the Location header.
125
What method is used in the redirected request of a 303 See Other response?
The GET method.
126
HTTP/1.1 307 Temporary Redirect
- Go to the URI mentioned in the Location header
- Keep making future requests to the URI you originally requested, in case the server needs to redirect you somewhere else next time
- Cache the redirection using standard caching headers and rules
- URI in the location bar is updated
127
HTTP/1.1 308 Permanent Redirect
- Go to the URI mentioned in the Location header
- Similar to 301 Moved Permanently
- Unlike 301, the client must repeat the same request (same method and body) at the new location
128
HTTP/1.1 400 Bad Request
Hey buddy, I can't read this garbage. Don't send it again. Request has bad format
129
HTTP/1.1 401 Unauthorized
You have to send authentication information to see this URI. The headers and entity (response body) explain to the browser and user how to log in. Mostly useful for authentication via the HTTP Authorization: header.
130
HTTP/1.1 402 Payment Required (rare)
Pay up, buttercup! Supposedly reserved, but some services use it anyway, e.g.
- MobileMe (the predecessor to iCloud) used it
- Google APIs use it
- YouTube will use it to force you to solve a CAPTCHA
131
HTTP/1.1 403 Forbidden
The web server will never respond to this request, no matter who you log in as. Maybe it could answer your request, but an administrator disabled that ability.
132
HTTP/1.1 404 Not Found
You've got the wrong resource or path. Can't find what you're looking for. Droids? What droids? Resource cannot be found
133
HTTP/1.1 405 Method Not Allowed
Whatever method you used (GET/HEAD/POST/PUT/DELETE/...) doesn't work on this URI
134
HTTP/1.1 502 Bad Gateway
The server talks to another HTTP server to fulfill this request and that other server isn't working.
135
HTTP/1.1 503 Service Unavailable
The service is temporarily down. Something's broken and we'll bring it back up eventually. Also used when servers are undergoing maintenance
136
HTTP/1.1 504 Gateway Timeout
The server talks to another process to fulfill this request and that other process isn't responding fast enough. Very common when a webapp is overloaded. Similar to 502, except in this case the packets between the reverse proxy and the origin webserver are just vanishing...