System Design Primer Flashcards
Performance vs scalability
A service is scalable if it results in increased performance in a manner proportional to resources added. Generally, increasing performance means serving more units of work, but it can also mean handling larger units of work, such as when datasets grow.
Another way to look at performance vs scalability:
If you have a performance problem, your system is slow for a single user.
If you have a scalability problem, your system is fast for a single user but slow under heavy load.
Latency vs throughput
Latency is the time to perform some action or to produce some result.
Throughput is the number of such actions or results per unit of time.
Generally, you should aim for maximal throughput with acceptable latency.
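As a toy illustration (the 10 ms service time and worker count below are assumed numbers, not from any benchmark), the two metrics can move independently:

```python
# Toy model: each request takes a fixed 10 ms of service time (assumed).
service_time_s = 0.010                      # latency: time for one result
workers = 8                                 # requests handled in parallel

latency_ms = service_time_s * 1000
throughput_rps = workers / service_time_s   # results per unit of time
print(f"latency={latency_ms:.0f} ms, throughput={throughput_rps:.0f} req/s")
```

Adding workers raises throughput without touching latency; shaving service time improves both.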
Define Elements of CAP
Briefly explain it
Discuss dimensions
In a distributed computer system, you can only support two of the following guarantees:
Consistency - Every read receives the most recent write or an error
Availability - Every request receives a response, without guarantee that it contains the most recent version of the information
Partition Tolerance - The system continues to operate despite arbitrary partitioning due to network failures
Networks aren’t reliable, so you’ll need to support partition tolerance. You’ll need to make a software tradeoff between consistency and availability.
CP - consistency and partition tolerance
Waiting for a response from the partitioned node might result in a timeout error. CP is a good choice if your business needs require atomic reads and writes.
AP - availability and partition tolerance
Responses return the most readily available version of the data available on any node, which might not be the latest. Writes might take some time to propagate when the partition is resolved.
AP is a good choice if the business needs to allow for eventual consistency or when the system needs to continue working despite external errors.
Consistency Patterns
Recall the definition of consistency from the CAP theorem - Every read receives the most recent write or an error.
Weak consistency
After a write, reads may or may not see it. A best effort approach is taken.
This approach is seen in systems such as memcached. Weak consistency works well in real-time use cases such as VoIP, video chat, and real-time multiplayer games. For example, if you are on a phone call and lose reception for a few seconds, when you regain connection you do not hear what was spoken during the connection loss.
Eventual consistency
After a write, reads will eventually see it (typically within milliseconds). Data is replicated asynchronously.
This approach is seen in systems such as DNS and email. Eventual consistency works well in highly available systems.
Strong consistency
After a write, reads will see it. Data is replicated synchronously.
This approach is seen in file systems and RDBMSes. Strong consistency works well in systems that need transactions.
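A minimal sketch of the contrast, using two in-memory dicts as stand-in replicas (real systems replicate over a network; the 50 ms lag is an assumed value):

```python
import threading
import time

primary, replica = {}, {}

def write_strong(key, value):
    # Strong consistency: replicate synchronously before acknowledging.
    primary[key] = value
    replica[key] = value                # visible to all reads on return

def write_eventual(key, value):
    # Eventual consistency: acknowledge now, replicate asynchronously.
    primary[key] = value
    def replicate():
        time.sleep(0.05)                # simulated replication lag (assumed)
        replica[key] = value
    threading.Thread(target=replicate).start()

write_eventual("x", 1)
print(replica.get("x"))                 # likely None: the read raced replication
time.sleep(0.1)
print(replica.get("x"))                 # 1: the replicas have converged
```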
Discuss Availability Patterns with Disadvantages
Fail-over
Active-passive
With active-passive fail-over, heartbeats are sent between the active and the passive server on standby. If the heartbeat is interrupted, the passive server takes over the active’s IP address and resumes service.
The length of downtime is determined by whether the passive server is already running in ‘hot’ standby or whether it needs to start up from ‘cold’ standby. Only the active server handles traffic.
Active-passive failover can also be referred to as master-slave failover.
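A sketch of the heartbeat check as it might run on the passive server (the interval, threshold, and promote() hook are illustrative assumptions):

```python
import time

HEARTBEAT_INTERVAL = 1.0   # seconds between expected heartbeats (assumed)
FAILURE_THRESHOLD = 3      # missed beats before promoting the passive node

last_heartbeat = time.monotonic()

def on_heartbeat():
    """Called whenever a heartbeat arrives from the active server."""
    global last_heartbeat
    last_heartbeat = time.monotonic()

def monitor(promote):
    """Promote the passive server once heartbeats stop arriving."""
    while True:
        time.sleep(HEARTBEAT_INTERVAL)
        missed = (time.monotonic() - last_heartbeat) / HEARTBEAT_INTERVAL
        if missed >= FAILURE_THRESHOLD:
            promote()   # e.g. take over the active's IP and resume service
            return
```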
Active-active
In active-active, both servers are managing traffic, spreading the load between them.
If the servers are public-facing, the DNS would need to know about the public IPs of both servers. If the servers are internal-facing, application logic would need to know about both servers.
Active-active failover can also be referred to as master-master failover.
Disadvantage(s): failover
Fail-over adds more hardware and additional complexity.
There is a potential for loss of data if the active system fails before any newly written data can be replicated to the passive.
What is DNS?
A Domain Name System (DNS) translates a domain name such as www.example.com to an IP address.
DNS is hierarchical, with a few authoritative servers at the top level. Your router or ISP provides information about which DNS server(s) to contact when doing a lookup. Lower level DNS servers cache mappings, which could become stale due to DNS propagation delays. DNS results can also be cached by your browser or OS for a certain period of time, determined by the time to live (TTL).
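The lookup itself is a standard-library call; the OS resolver and upstream DNS servers handle the hierarchy and caching described above:

```python
import socket

# Resolve a name to its addresses; results may be served from a cache
# until the record's TTL expires.
for family, _, _, _, sockaddr in socket.getaddrinfo(
        "www.example.com", 443, proto=socket.IPPROTO_TCP):
    print(family.name, sockaddr[0])
```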
What are different types of DNS records?
NS record (name server) - Specifies the DNS servers for your domain/subdomain.
MX record (mail exchange) - Specifies the mail servers for accepting messages.
A record (address) - Points a name to an IP address.
CNAME (canonical) - Points a name to another name or CNAME (example.com to www.example.com) or to an A record.
Name a few patterns for DNS routing
Weighted round robin
Prevent traffic from going to servers under maintenance
Balance between varying cluster sizes
A/B testing
Latency-based
Geolocation-based
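A minimal sketch of the weighted choice behind weighted round robin (the addresses and weights are made-up values):

```python
import random

# weight ~ share of traffic each server should receive (assumed values)
servers = {"10.0.0.1": 70, "10.0.0.2": 20, "10.0.0.3": 10}

def pick_server():
    hosts, weights = zip(*servers.items())
    return random.choices(hosts, weights=weights, k=1)[0]

print(pick_server())   # "10.0.0.1" roughly 70% of the time
```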
DNS disadvantages
Accessing a DNS server introduces a slight delay, although this is mitigated by the caching described above.
DNS server management could be complex and is generally managed by governments, ISPs, and large companies.
DNS services have come under DDoS attack, preventing users from accessing websites such as Twitter without knowing Twitter's IP address(es).
What is a CDN?
A content delivery network (CDN) is a globally distributed network of proxy servers, serving content from locations closer to the user. Generally, static files such as HTML/CSS/JS, photos, and videos are served from a CDN, although some CDNs such as Amazon's CloudFront support dynamic content. The site's DNS resolution will tell clients which server to contact.
How does a CDN improve performance?
Serving content from CDNs can significantly improve performance in two ways:
Users receive content from data centers close to them
Your servers do not have to serve requests that the CDN fulfills
What are the CDN types? Explain
Push CDNs
Push CDNs receive new content whenever changes occur on your server. You take full responsibility for providing content, uploading directly to the CDN and rewriting URLs to point to the CDN. You can configure when content expires and when it is updated. Content is uploaded only when it is new or changed, minimizing traffic, but maximizing storage.
Sites with a small amount of traffic or sites with content that isn’t often updated work well with push CDNs. Content is placed on the CDNs once, instead of being re-pulled at regular intervals.
Pull CDNs
Pull CDNs grab new content from your server when the first user requests the content. You leave the content on your server and rewrite URLs to point to the CDN. This results in a slower request until the content is cached on the CDN.
A time-to-live (TTL) determines how long content is cached. Pull CDNs minimize storage space on the CDN, but can create redundant traffic if files expire and are pulled before they have actually changed.
Sites with heavy traffic work well with pull CDNs, as traffic is spread out more evenly with only recently-requested content remaining on the CDN.
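Both types share the "rewrite URLs to point to the CDN" step; a minimal sketch (cdn.example.com is a placeholder hostname):

```python
CDN_HOST = "https://cdn.example.com"   # placeholder CDN hostname

def cdn_url(path: str) -> str:
    """Map a static asset path to its CDN equivalent."""
    return f"{CDN_HOST}/{path.lstrip('/')}"

print(cdn_url("/static/css/site.css"))
# -> https://cdn.example.com/static/css/site.css
```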
CDN disadvantages?
CDN costs could be significant depending on traffic, although this should be weighed with additional costs you would incur not using a CDN.
Content might be stale if it is updated before the TTL expires it.
CDNs require changing URLs for static content to point to the CDN.
What is a load balancer?
Load balancers distribute incoming client requests to computing resources such as application servers and databases. In each case, the load balancer returns the response from the computing resource to the appropriate client.
Load balancers can be implemented with hardware (expensive) or with software such as HAProxy.
What are the benefits of load balancer?
Load balancers are effective at:
Preventing requests from going to unhealthy servers
Preventing overloading resources
Helping to eliminate a single point of failure
Additional benefits include:
SSL termination - Decrypt incoming requests and encrypt server responses so backend servers do not have to perform these potentially expensive operations
Removes the need to install X.509 certificates on each server
Session persistence - Issue cookies and route a specific client's requests to the same instance if the web apps do not keep track of sessions
How to protect load balancers against failures?
To protect against failures, it’s common to set up multiple load balancers, either in active-passive or active-active mode.
Load Balancer traffic routing patterns?
Load balancers can route traffic based on various metrics (a sketch of two of these follows the list), including:
Random
Least loaded
Session/cookies
Round robin or weighted round robin
Layer 4
Layer 7
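Before the Layer 4/7 details, a minimal sketch of two of the simpler strategies above (server names are illustrative):

```python
import itertools

servers = ["app1", "app2", "app3"]   # illustrative backend names

# Round robin: cycle through the servers in order.
round_robin = itertools.cycle(servers)

# Least loaded: track open connections and pick the minimum.
connections = {s: 0 for s in servers}

def next_least_loaded():
    server = min(connections, key=connections.get)
    connections[server] += 1
    return server

print(next(round_robin), next_least_loaded())
```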
Layer 4 Load Balancing
Operates at: Transport layer (OSI model)
Decisions based on: Source/destination IP addresses and ports (TCP/UDP segment headers)
Analogy: Mailroom sorting packages by address
Characteristics:
Fast and efficient
Simple to configure
Limited functionality (no content awareness)
Use Cases:
Balancing traffic for TCP/UDP applications (web, email, database servers)
Improving performance and availability
Providing failover
Benefits:
Improved performance
Increased availability
Enhanced scalability
Simplified management
Limitations:
No content awareness
Limited traffic management features
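A sketch of a layer 4 decision: the backend is chosen from the TCP/UDP 4-tuple alone, with no visibility into the payload (addresses are illustrative):

```python
import hashlib

backends = ["10.0.1.1", "10.0.1.2", "10.0.1.3"]   # illustrative pool

def pick_backend(src_ip, src_port, dst_ip, dst_port):
    # Hash the connection 4-tuple so a given connection sticks to one backend.
    key = f"{src_ip}:{src_port}->{dst_ip}:{dst_port}".encode()
    digest = int(hashlib.sha256(key).hexdigest(), 16)
    return backends[digest % len(backends)]

print(pick_backend("203.0.113.5", 51234, "198.51.100.10", 443))
```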
Layer 7 Load Balancing
Operates at: Application layer (OSI model)
Decisions based on: HTTP headers, URLs, cookies, application-specific data
Analogy: Intelligent receptionist directing visitors based on their purpose and appointment details
Characteristics:
Advanced traffic management
Content-aware
More resource-intensive
Use Cases:
Routing traffic based on user location, type of content, or application
Implementing security policies (e.g., web application firewall)
Optimizing content delivery (e.g., caching)
Benefits:
Increased flexibility and control
Improved security
Optimized performance
Enhanced user experience
Limitations:
Increased complexity
Higher resource consumption
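By contrast, a layer 7 decision can inspect the request itself; a sketch routing on path and cookies (pool names and rules are assumptions for illustration):

```python
def route(path: str, headers: dict) -> str:
    # Content-aware routing: the request, not just the 4-tuple, is visible.
    if path.startswith("/api/"):
        return "api-pool"        # application servers
    if path.startswith("/static/"):
        return "static-pool"     # static file / cache servers
    if "session=" in headers.get("Cookie", ""):
        return "sticky-pool"     # session-persistent instances
    return "default-pool"

print(route("/api/users", {}))   # -> api-pool
```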
Load Balancer Disadvantages
The load balancer can become a performance bottleneck if it does not have enough resources or if it is not configured properly.
Introducing a load balancer to help eliminate a single point of failure results in increased complexity.
A single load balancer is itself a single point of failure; configuring multiple load balancers further increases complexity.
Disadvantages of horizontal scaling
Scaling horizontally introduces complexity and involves cloning servers
Servers should be stateless: they should not contain any user-related data like sessions or profile pictures
Sessions can be stored in a centralized data store such as a database (SQL, NoSQL) or a persistent cache (Redis, Memcached); see the sketch after this list
Downstream servers such as caches and databases need to handle more simultaneous connections as upstream servers scale out
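A sketch of the centralized session store mentioned above, assuming the redis-py client and a Redis instance on localhost:

```python
import json

import redis  # assumes the redis-py package is installed

r = redis.Redis(host="localhost", port=6379)

def save_session(session_id: str, data: dict, ttl_s: int = 3600) -> None:
    # Any stateless web server can write the session...
    r.setex(f"session:{session_id}", ttl_s, json.dumps(data))

def load_session(session_id: str):
    # ...and any other server can read it back.
    raw = r.get(f"session:{session_id}")
    return json.loads(raw) if raw else None
```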
Reverse Proxy + Benefits
A reverse proxy is a web server that centralizes internal services and provides unified interfaces to the public. Requests from clients are forwarded to a server that can fulfill them before the reverse proxy returns the server's response to the client. A minimal sketch follows the list of benefits below.
Additional benefits include:
Increased security - Hide information about backend servers, blacklist IPs, limit number of connections per client
Increased scalability and flexibility - Clients only see the reverse proxy’s IP, allowing you to scale servers or change their configuration
SSL termination - Decrypt incoming requests and encrypt server responses so backend servers do not have to perform these potentially expensive operations
Removes the need to install X.509 certificates on each server
Compression - Compress server responses
Caching - Return the response for cached requests
Static content - Serve static content directly
HTML/CSS/JS
Photos
Videos
Etc
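A minimal sketch of the forwarding behavior, assuming a single hypothetical backend on port 9000 (production deployments would use NGINX or HAProxy instead):

```python
from http.server import BaseHTTPRequestHandler, HTTPServer
from urllib.request import Request, urlopen

BACKEND = "http://127.0.0.1:9000"   # hypothetical internal app server

class ReverseProxy(BaseHTTPRequestHandler):
    def do_GET(self):
        # Forward the client's request to the backend, then relay the
        # response; the client only ever sees the proxy's address.
        with urlopen(Request(BACKEND + self.path)) as upstream:
            body = upstream.read()
            status = upstream.status
        self.send_response(status)
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

HTTPServer(("0.0.0.0", 8080), ReverseProxy).serve_forever()
```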
How to remember:
S.C.O.R.F.S
Security - Hide information about backend servers, blacklist IPs, limit number of connections per client
Compression - Compress server responses to improve load times.
Offloading - SSL termination; the reverse proxy handles the encryption/decryption workload so backend servers do not have to.
Resilience - Increased scalability and flexibility; clients only see the reverse proxy’s IP, allowing you to scale servers or change their configuration without affecting them.
Fast Content - Caching (Return the response for cached requests)
Static Content - Serve static content directly (HTML/CSS/JS, Photos, Videos, etc.)
Load balancer vs reverse proxy
Deploying a load balancer is useful when you have multiple servers. Often, load balancers route traffic to a set of servers serving the same function.
Reverse proxies can be useful even with just one web server or application server, opening up the benefits described in the previous section.
Solutions such as NGINX and HAProxy can support both layer 7 reverse proxying and load balancing.