Glossary Flashcards
DNS
Domain Name System (DNS) is the phone book of the internet.
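Web browsers reach servers through IP addresses. DNS translates domain names into IP addresses so that browsers can load internet resources. Two kinds of DNS servers take part: recursive resolvers, which look up the answer on the client's behalf, and authoritative name servers, which hold the actual records for a domain.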
Crisp:-
https://d1.awsstatic.com/Route53/how-route-53-routes-traffic.8d313c7da075c3c7303aaef32e89b5d0b7885e7c.png
Detailed:-
https://s905060.gitbooks.io/site-reliability-engineer-handbook/content/a_day_in_the_life_of_a_web_page_request.html
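The recursive/authoritative split can be sketched with a toy in-memory model. Everything here is an assumption made for illustration: the server names, the zone data, and the address 192.0.2.10 (taken from a range reserved for documentation). Real resolution happens over the network and involves TTLs, multiple record types, and more.

```python
# Toy model of DNS resolution. All zone data below is made up for illustration.
ROOT_SERVER = {"com": "tld-com"}  # the root knows which server handles each TLD
TLD_SERVERS = {"tld-com": {"example.com": "ns-example"}}  # TLD server -> authoritative server per domain
AUTHORITATIVE_SERVERS = {"ns-example": {"www.example.com": "192.0.2.10"}}  # holds the actual records


class RecursiveResolver:
    """Walks root -> TLD -> authoritative on the client's behalf, then caches the answer."""

    def __init__(self):
        self.cache = {}

    def resolve(self, hostname):
        if hostname in self.cache:
            return self.cache[hostname]
        labels = hostname.split(".")
        tld = labels[-1]
        domain = ".".join(labels[-2:])
        tld_server = ROOT_SERVER[tld]                    # step 1: ask the root
        auth_server = TLD_SERVERS[tld_server][domain]    # step 2: ask the TLD server
        ip_address = AUTHORITATIVE_SERVERS[auth_server][hostname]  # step 3: ask the authoritative server
        self.cache[hostname] = ip_address
        return ip_address


resolver = RecursiveResolver()
print(resolver.resolve("www.example.com"))
```

A second call for the same name is answered from the resolver's cache without walking the hierarchy again.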
Firewall
What :- A firewall is a hardware device or software program that monitors incoming and outgoing network traffic and decides whether to allow or block specific traffic based on a defined set of security rules.
Why :- First line of defense. Protects the network from malicious outside attacks and unnecessary traffic.
Where :- Between the network and the gateway.
A firewall works like a traffic guard at your computer’s entry point, or port. Only trusted sources, or IP addresses, are allowed in. IP addresses are important because they identify a computer or source, just like your postal address identifies where you live.
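Types :-
- Proxy firewall
- Packet-filtering firewall
- Stateful inspection firewall
- Unified threat management (UTM) firewall
- Next-generation firewall (NGFW)
- NAT firewall - allows incoming traffic only if it was solicited by a device on the private network.
- Threat-focused firewall
- Virtual firewall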
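As a rough illustration of rule-based filtering, the sketch below checks a packet's source IP and destination port against an ordered rule list. The rule set, the "first match wins" policy, and the default-block behavior are assumptions chosen for the example, not a description of any particular firewall product.

```python
import ipaddress

# Hypothetical rule set: the first matching rule wins; unmatched traffic is blocked.
RULES = [
    {"action": "block", "network": "203.0.113.0/24", "port": None},  # None = any port
    {"action": "allow", "network": "10.0.0.0/8", "port": 22},
    {"action": "allow", "network": "0.0.0.0/0", "port": 443},
]


def decide(source_ip, dest_port):
    """Return 'allow' or 'block' for a packet from source_ip to dest_port."""
    address = ipaddress.ip_address(source_ip)
    for rule in RULES:
        in_network = address in ipaddress.ip_network(rule["network"])
        port_matches = rule["port"] is None or rule["port"] == dest_port
        if in_network and port_matches:
            return rule["action"]
    return "block"  # default deny


print(decide("198.51.100.5", 443))
print(decide("203.0.113.7", 443))
```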
API gateway
What:- An API gateway is an API management tool that sits between a client and a collection of backend services.
Where :- After the firewall, before the backend APIs.
Why :-
- Protect APIs from overuse and abuse by applying authentication and rate limiting.
- Analytics and API monitoring.
- Monetize APIs - connect to billing system.
- Microservices - reverse proxy.
- Add new APIs / retire old ones.
- Policies, alerts, security.
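The authentication and rate-limiting duties listed above can be sketched as follows. This is a minimal illustration that assumes a token-bucket limiter per API key. The class names, the default limits, and the choice to pass the clock in as `now` are all hypothetical, made so the example is deterministic.

```python
class TokenBucket:
    """Allows bursts up to `capacity`, refilled at `refill_per_second` tokens per second."""

    def __init__(self, capacity, refill_per_second):
        self.capacity = capacity
        self.refill_per_second = refill_per_second
        self.tokens = float(capacity)
        self.last_refill = 0.0

    def allow(self, now):
        elapsed = now - self.last_refill
        self.tokens = min(self.capacity, self.tokens + elapsed * self.refill_per_second)
        self.last_refill = now
        if self.tokens >= 1:
            self.tokens -= 1
            return True
        return False


class Gateway:
    """Checks the API key, then the caller's rate limit, before forwarding."""

    def __init__(self, api_keys, capacity=2, refill_per_second=1.0):
        self.buckets = {key: TokenBucket(capacity, refill_per_second) for key in api_keys}

    def handle(self, api_key, now):
        bucket = self.buckets.get(api_key)
        if bucket is None:
            return 401  # unknown key: not authenticated
        if not bucket.allow(now):
            return 429  # too many requests
        return 200      # a real gateway would forward to the backend here


gateway = Gateway({"key-a"})
print([gateway.handle("key-a", now=0.0) for _ in range(3)])
```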
Load Balancer
What :- Load balancing is the process of distributing a set of tasks over a set of resources, with the aim of making their overall processing more efficient.
Why:- Optimize response time and avoid overloading some compute nodes while other compute nodes are left idle.
Where :- Before a server farm where the same functionality is present on multiple nodes.
How:- (Weighted) round robin, (weighted) least connection, resource based, IP / URL hash.
Types:-
Hardware - fast, efficient, costly.
Software - slower, cheaper, customizable.
L4 - Transport level. Routes on TCP / UDP ports without inspecting packet contents.
L7 - Application level. Routes on HTTP headers and session data.
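Three of the selection strategies named under How (round robin, least connection, IP hash) can be sketched in a few lines. The class and function names are hypothetical, and the server names are placeholders. Weighted variants and health checks are left out.

```python
import hashlib
import itertools


class RoundRobinBalancer:
    """Hands out servers in a fixed rotating order."""

    def __init__(self, servers):
        self._cycle = itertools.cycle(servers)

    def pick(self):
        return next(self._cycle)


class LeastConnectionBalancer:
    """Picks the server with the fewest active connections."""

    def __init__(self, servers):
        self.active = {server: 0 for server in servers}

    def pick(self):
        server = min(self.active, key=self.active.get)
        self.active[server] += 1
        return server

    def release(self, server):
        self.active[server] -= 1


def ip_hash_pick(client_ip, servers):
    """Maps a client IP to the same server on every call (while the server list is unchanged)."""
    digest = hashlib.sha256(client_ip.encode("utf-8")).digest()
    return servers[int.from_bytes(digest[:8], "big") % len(servers)]


round_robin = RoundRobinBalancer(["app-1", "app-2", "app-3"])
print([round_robin.pick() for _ in range(4)])
```

IP hashing keeps a client pinned to one node, which helps when session data lives on that node.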
API gateway vs Load balancer
API Gateway provides a single entry point for a client to a number of different underlying APIs (system interfaces, web services, REST APIs, etc.).
Load Balancer distributes load across your application servers (where you may have deployed your microservices or REST APIs).
Web server
A web server accepts and fulfills requests from clients for static content (e.g., HTML pages, files, images, and videos) from a website. Web servers handle HTTP requests and responses only.
Application server
An application server exposes business logic to the clients, which generates dynamic content. It is a software framework that transforms data to provide the specialized functionality offered by a business, service, or application. Application servers enhance the interactive parts of a website that can appear differently depending on the context of the request.
Web server vs Application server
Web Server:-
- Delivers static content.
- Content is delivered using the HTTP protocol only.
- Serves only web-based applications.
- Multi-threading is usually not essential (though many web servers do use it).
- Facilitates web traffic that is not very resource intensive.
Application Server:-
- Delivers dynamic content.
- Provides business logic to application programs using several protocols (including HTTP).
- Can serve web and enterprise-based applications.
- Uses multi-threading to support multiple requests in parallel.
- Facilitates longer running processes that are very resource-intensive.
Proxy server
What :- Proxy means “to act on behalf of another”
If you’re using a proxy server, traffic flows through the proxy server on its way to the address you requested. The response then comes back through that same proxy server (there are exceptions to this rule), and the proxy server forwards the data received from the website to you.
Why :- Acts as a firewall and traffic filter, provides shared network connections, and caches data.
Forward proxy :- Positioned in front of clients. Organizations use it. Hides the client IP, much as a VPN does.
Reverse proxy :- Positioned in front of web servers. Hides the backend.
VPN or Proxy
If you need to constantly access the internet to send and receive data that should be encrypted, or if your company handles data that must stay hidden from hackers and corporate spies, a VPN would be a better choice.
If an organization merely needs to allow its users to browse the internet anonymously, a proxy server may do the trick. This is the better solution if you simply want to know which websites team members are using or you want to make sure they have access to sites that block users from your country.
A VPN is better suited for business use because users usually need secure data transmission in both directions. Company information and personnel data can be very valuable in the wrong hands, and a VPN provides the encryption you need to keep it protected. For personal use where a breach would only affect you, a single user, a proxy server may be an adequate choice. You can also use both technologies simultaneously, particularly if you want to limit the websites that users within your network visit while also encrypting their communications.
WebLogic server
Oracle WebLogic Server is a scalable, enterprise Java platform application server for Java-based web applications. WebLogic allows users to develop and deploy an application that has business logic and allows the application to access other services like database, messaging, or other enterprise systems.
CDN
What :- A content delivery network (CDN) refers to a geographically distributed group of servers which work together to provide fast delivery of Internet content.
Why :-
Improves website load times, reduces bandwidth costs, increases availability and redundancy, and improves security.
Where :- Across the globe, close to users.
Types:-
Pull zone CDN :- A pull CDN automatically gets assets and files from your platform. It’s a reliable solution for storing small and medium-sized files.
Push zone CDN :- A more fitting solution for the storage of larger files and static content. The main difference is that you have to upload the files to the CDN yourself.
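The difference between the two zone types can be shown with a toy model. The `ORIGIN` dictionary stands in for your platform, and the class names are hypothetical. A real CDN adds expiry, invalidation, and many edge locations.

```python
# Stand-in for the origin platform (made-up content).
ORIGIN = {"/logo.png": b"logo-bytes"}


class PullEdge:
    """Pull zone: the edge fetches a file from the origin on the first request, then caches it."""

    def __init__(self, origin):
        self.origin = origin
        self.cache = {}
        self.origin_fetches = 0

    def get(self, path):
        if path not in self.cache:
            self.cache[path] = self.origin[path]
            self.origin_fetches += 1
        return self.cache[path]


class PushEdge:
    """Push zone: nothing is served until the owner uploads it to the edge."""

    def __init__(self):
        self.storage = {}

    def upload(self, path, content):
        self.storage[path] = content

    def get(self, path):
        return self.storage.get(path)  # None if the file was never uploaded


pull_edge = PullEdge(ORIGIN)
pull_edge.get("/logo.png")
pull_edge.get("/logo.png")
print(pull_edge.origin_fetches)
```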
DDoS Attack
What :-
A distributed denial-of-service (DDoS) attack is a malicious attempt to disrupt the normal traffic of a targeted server, service or network by overwhelming the target or its surrounding infrastructure with a flood of Internet traffic.
Why :- Malicious intent.
How:-
The attacker controls a network of compromised machines (a botnet). When a victim’s server or network is targeted by the botnet, each bot sends requests to the target’s IP address, potentially causing the server or network to become overwhelmed, resulting in a denial-of-service to normal traffic.
Performance vs Scalability
Performance is an indicator of how well a software system or component meets its requirements for timeliness. Timeliness is measured in terms of response time or throughput.
A system has a performance issue if it is slow for a single user.
Scalability is the property of a system to handle a growing amount of work by adding resources to the system.
A system has a scalability issue if it is fast for a single user but slow under heavy load.
Latency and throughput
Latency is the delay incurred between a request and it being fulfilled by a responding party.
Throughput expresses the number of requests a responding party can handle per unit of time. Latency and throughput relate to each other much as performance and scalability do.
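A small worked example may help separate the two measures. It assumes each request is recorded as a hypothetical (start, end) pair in seconds. The two traces below have the same latency per request but different throughput, because the second one handles requests in parallel.

```python
def average_latency(requests):
    """Mean time between a request starting and being fulfilled."""
    return sum(end - start for start, end in requests) / len(requests)


def throughput(requests):
    """Requests completed per second over the whole observation window."""
    window = max(end for _, end in requests) - min(start for start, _ in requests)
    return len(requests) / window


# Made-up traces: four requests, each taking 2 seconds.
sequential = [(0.0, 2.0), (2.0, 4.0), (4.0, 6.0), (6.0, 8.0)]
parallel = [(0.0, 2.0), (0.0, 2.0), (0.0, 2.0), (0.0, 2.0)]

print(average_latency(sequential), throughput(sequential))
print(average_latency(parallel), throughput(parallel))
```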
Availability
Availability refers to the percentage of time that the infrastructure, system, or solution remains operational under normal circumstances in order to serve its intended purpose.
The mathematical formula for Availability is:
Percentage of availability = (total elapsed time – sum of downtime) / total elapsed time × 100
The measurement of Availability is driven by time loss.
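As a worked example of the availability formula, expressed as a percentage by multiplying the ratio by 100: the figures below (a 720-hour month with 3.6 hours of downtime) are made up for illustration.

```python
def availability_percent(total_hours, downtime_hours):
    """(total elapsed time - sum of downtime) / total elapsed time, as a percentage."""
    return (total_hours - downtime_hours) / total_hours * 100


# Hypothetical month: 720 hours elapsed, 3.6 hours of downtime.
print(round(availability_percent(720, 3.6), 2))
```

With these numbers the system was available for about 99.5% of the month.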
Reliability
Reliability refers to the probability that the system will meet certain performance standards in yielding correct output for a desired time duration.
A common measure is the mean time between failures (MTBF):
MTBF = (total elapsed time – sum of downtime) / number of failures
The measurement of Reliability is driven by the frequency and impact of failures.
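A worked example of the MTBF formula, using made-up figures (720 hours elapsed, 3.6 hours of downtime, 3 failures):

```python
def mtbf_hours(total_hours, downtime_hours, failures):
    """(total elapsed time - sum of downtime) / number of failures."""
    return (total_hours - downtime_hours) / failures


# Hypothetical month: 720 hours elapsed, 3.6 hours down, spread over 3 failures.
print(round(mtbf_hours(720, 3.6, 3), 1))
```

Two systems with the same total downtime can have very different MTBF values: one long outage gives a higher MTBF than many short ones.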
Tips to improve scalability
- Clones
- Database - Sharding.
- Cache
- Asynchronism.
- CDN
- Scheduling / Map-Reduce
Bandwidth
The maximum amount of data that can be transferred per unit of time.
Bandwidth vs Speed
Bandwidth is how much information you receive every second, while speed is how fast that information is received or downloaded. Let’s compare it to filling a bathtub. If the bathtub faucet has a wide opening, more water can flow at a faster rate than if the pipe was narrower. Think of the water as the bandwidth and the rate at which the water flows as the speed.
Bandwidth vs Latency
Latency is sometimes referred to as delay or ping rate. It’s the lag you experience while waiting for something to load. If bandwidth is the amount of information sent per second, latency is the amount of time it takes that information to get from its source to you.
Bandwidth vs throughput
Throughput is how much information actually gets delivered in a certain amount of time. So if bandwidth is the max amount of data, throughput is how much of that data makes it to its destination – taking latency, network speed, packet loss and other factors into account.
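A simplified calculation shows how latency pulls throughput below bandwidth. The model is an assumption made for illustration: one fixed latency delay followed by a transfer at full bandwidth, ignoring packet loss and protocol overhead. The function names and figures are hypothetical.

```python
def transfer_time_seconds(size_megabits, bandwidth_mbps, latency_seconds):
    """Simplified model: wait out the latency once, then send at full bandwidth."""
    return latency_seconds + size_megabits / bandwidth_mbps


def effective_throughput_mbps(size_megabits, bandwidth_mbps, latency_seconds):
    """Data actually delivered per second, including the latency delay."""
    return size_megabits / transfer_time_seconds(size_megabits, bandwidth_mbps, latency_seconds)


# Made-up link: 100 megabits of data, 100 Mbps bandwidth, 0.25 s latency.
print(transfer_time_seconds(100, 100, 0.25))
print(effective_throughput_mbps(100, 100, 0.25))
```

Under this model the link's bandwidth is 100 Mbps but the throughput achieved is 80 Mbps, and the gap widens for smaller transfers.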
Database partitioning and sharding
What :-
Sharding and partitioning are both about breaking up a large data set into smaller subsets. The difference is that sharding implies the data is spread across multiple computers while partitioning does not. Partitioning is about grouping subsets of data within a single database instance.
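Why :-
Improve performance.
When:-
When the data grows too large for a single database instance to handle efficiently.
How:-
Horizontal partitioning (split by rows; sharding is horizontal partitioning spread across machines) and vertical partitioning (split by columns).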
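Both directions of splitting can be sketched briefly. The shard names, the hash-based routing rule, and the sample row are assumptions for illustration. Real systems also use range-based or directory-based routing.

```python
import hashlib

SHARDS = ["shard-0", "shard-1", "shard-2"]  # hypothetical database servers


def shard_for(key, shards=SHARDS):
    """Horizontal split: route a whole row to one shard based on a hash of its key."""
    digest = hashlib.sha256(key.encode("utf-8")).hexdigest()
    return shards[int(digest, 16) % len(shards)]


def vertical_split(row, key, hot_columns):
    """Vertical split: keep frequently read columns apart from the rest; both parts keep the key."""
    hot = {name: value for name, value in row.items() if name == key or name in hot_columns}
    cold = {name: value for name, value in row.items() if name == key or name not in hot_columns}
    return hot, cold


print(shard_for("user-42"))
print(vertical_split({"id": 1, "name": "Ann", "bio": "long text"}, "id", {"name"}))
```

A plain modulo over the shard count moves most keys when a shard is added, which is one reason production systems often prefer consistent hashing.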
Indexes
What:-
Indexes are a powerful tool used in the background of a database to speed up querying. Indexes power queries by providing a method to quickly lookup the requested data.
Simply put, an index is a pointer to data in a table. An index in a database is very similar to an index in the back of a book.
Why:-
Indexes serve as lookup tables that efficiently store data for quicker retrieval.
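The lookup-table idea can be illustrated with a dictionary that maps a column value to a row position. This is only a sketch with made-up rows and hypothetical function names. It behaves like a hash index, whereas many databases use tree-based indexes by default.

```python
# Made-up table rows.
ROWS = [
    {"id": 1, "email": "ann@example.com"},
    {"id": 2, "email": "bob@example.com"},
    {"id": 3, "email": "cy@example.com"},
]


def find_by_scan(rows, email):
    """Without an index: inspect every row until one matches."""
    for row in rows:
        if row["email"] == email:
            return row
    return None


def build_index(rows, column):
    """The index stores pointers (row positions), not copies of the rows."""
    return {row[column]: position for position, row in enumerate(rows)}


def find_by_index(rows, index, value):
    """With an index: jump straight to the row position."""
    position = index.get(value)
    return rows[position] if position is not None else None


EMAIL_INDEX = build_index(ROWS, "email")
print(find_by_index(ROWS, EMAIL_INDEX, "bob@example.com"))
```

The trade-off is that the index takes extra space and must be updated on every insert, update, or delete.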
Public Key
What :- Public keys have been described by some as being like a business’ address on the web: it’s public and anyone can look it up and share it widely. In asymmetric encryption, public keys can be shared with everyone in the system. Once the sender has the recipient’s public key, they use it to encrypt their message.
Why :-
When :-
Where :-
How
Private key
Each public key comes paired with a unique private key. Think of a private key as akin to the key to the front door of a business where only you have a copy. This defines one of the main differences between the two types of keys. The private key ensures only you can get through the front door. In the case of encrypted messages, you use this private key to decrypt messages that were encrypted with the matching public key.
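The public/private pairing can be demonstrated with a toy RSA key built from tiny primes. This is an illustration only: the primes, exponent, and function names are chosen for the example, and it is not secure. Real keys are thousands of bits long and should come from a vetted cryptography library.

```python
# Toy RSA with tiny primes. Illustration only, NOT secure.
P, Q = 61, 53
N = P * Q                      # modulus shared by both keys
PHI = (P - 1) * (Q - 1)
PUBLIC_EXPONENT = 17
PRIVATE_EXPONENT = pow(PUBLIC_EXPONENT, -1, PHI)  # modular inverse (Python 3.8+)

PUBLIC_KEY = (PUBLIC_EXPONENT, N)    # safe to share with anyone
PRIVATE_KEY = (PRIVATE_EXPONENT, N)  # kept secret by the owner


def encrypt(message, public_key):
    """Anyone holding the public key can encrypt (message must be an integer below N)."""
    exponent, modulus = public_key
    return pow(message, exponent, modulus)


def decrypt(ciphertext, private_key):
    """Only the private key holder can recover the message."""
    exponent, modulus = private_key
    return pow(ciphertext, exponent, modulus)


ciphertext = encrypt(65, PUBLIC_KEY)
print(decrypt(ciphertext, PRIVATE_KEY))
```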
SSL protocol
- A browser requests a secure page (usually https://).
- The web server sends its public key with its certificate.
- The browser checks that the certificate was issued by a trusted party (usually a trusted root CA), that the certificate is still valid and that the certificate is related to the site contacted.
- The browser then uses the public key to encrypt a random symmetric encryption key and sends it to the server, along with the encrypted URL and other encrypted HTTP data.
- The web server decrypts the symmetric encryption key using its private key and uses the symmetric key to decrypt the URL and http data.
- The web server sends back the requested html document and http data encrypted with the symmetric key.
- The browser decrypts the http data and html document using the symmetric key and displays the information.
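From the client side, the steps above can be exercised with Python's standard `ssl` module, which implements TLS, the successor to SSL. This is a sketch: the function names are hypothetical, and `describe_connection` needs network access, so it is defined but not called here. Current TLS versions may agree on the symmetric key through a different exchange than the one listed, but the certificate check works as described.

```python
import socket
import ssl


def build_client_context():
    """Client settings that make the library perform the certificate checks.

    create_default_context() loads the system's trusted root CAs and turns on
    both certificate verification and hostname checking.
    """
    return ssl.create_default_context()


def describe_connection(host, port=443, timeout=5.0):
    """Open a verified TLS connection and report what was negotiated.

    The handshake (certificate exchange, verification, key agreement) happens
    inside wrap_socket(); it raises ssl.SSLError if verification fails.
    """
    context = build_client_context()
    with socket.create_connection((host, port), timeout=timeout) as raw_socket:
        with context.wrap_socket(raw_socket, server_hostname=host) as tls_socket:
            return {
                "protocol": tls_socket.version(),
                "cipher": tls_socket.cipher()[0],
                "subject": tls_socket.getpeercert().get("subject"),
            }


context = build_client_context()
print(context.verify_mode == ssl.CERT_REQUIRED, context.check_hostname)
```

Calling `describe_connection` with a host name of your choice would show the negotiated protocol version and cipher for that server.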