System Design and Distributed Systems Flashcards

1
Q

What is availability?

A

The likelihood of your system being operational and accessible to users when needed. It's a measure of uptime, often expressed as a percentage. A Google Cloud architect would highlight that Google Cloud offers high availability through features like regions and zones, with Service Level Agreements (SLAs) guaranteeing a specific uptime target for various services.

2
Q

What is latency?

A

It refers to the time it takes for data to travel between two points in your system. Think of it like how long it takes for a knight’s message to get from your castle (user) to the king’s advisors (server) and back. Lower latency means messages get delivered faster, resulting in a snappier user experience.

A Google Cloud architect would emphasize that Google Cloud prioritizes minimizing latency with features like its global network infrastructure and regional deployments. You can even use tools like Cloud Monitoring to track and optimize latency within your applications.

3
Q

What is RPC?

A

Imagine you, the programmer, are a knight calling upon a powerful API (Application Programming Interface) in another castle (server). RPC, or Remote Procedure Call, is like your trusty squire.

The squire races to the castle (server) with your request (function call), waits for the API (server) to complete the task, and then sprints back with the results, all without you having to leave your comfy coding zone.

Note: The RPC runtime is responsible for transmitting messages between the client and the server over the network. Its responsibilities also include retransmission, acknowledgment, and encryption.
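
A minimal sketch of the idea using Python's standard-library xmlrpc modules (the add function, host, and port are made up for illustration; production systems typically use a framework such as gRPC):

```python
# Client calls a remote function as if it were local.
import threading
import xmlrpc.client
from xmlrpc.server import SimpleXMLRPCServer

def add(a, b):
    # Runs on the server; the client never executes this locally.
    return a + b

# Start the "remote" server in a background thread so the sketch is self-contained.
server = SimpleXMLRPCServer(("localhost", 8000), logRequests=False)
server.register_function(add, "add")
threading.Thread(target=server.serve_forever, daemon=True).start()

# The client-side stub (the "squire") marshals the call over the network.
proxy = xmlrpc.client.ServerProxy("http://localhost:8000/")
print(proxy.add(2, 3))  # looks like a local call, executes remotely -> 5
```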

4
Q

RPC Summary

A

The RPC method is similar to calling a local procedure, except that the called procedure is usually executed in a different process and on a different computer.

RPC allows developers to build applications on top of distributed systems. Developers can use the RPC method without knowing the network communication details. As a result, they can concentrate on the design aspects, rather than the machine and communication-level specifics.

5
Q

Local Procedure Call (LPC)

A

Local Procedure Call (LPC): This refers to a mechanism for communication between different parts of a program running on the same computer. It allows them to exchange data and synchronize their actions. Think of it as two colleagues working on the same project within the same office, easily passing information back and forth.

6
Q

RPC + LPC For an app

A
1. Remote Procedure Call (RPC):

Local: The mobile app (client).
Remote: The machine learning model server.
The process:

User takes a picture.
The mobile app (client) sends an RPC containing the image data to the machine learning model server (remote). This RPC acts as a messenger, carrying the image data across the network to the server.
The server receives the RPC, processes the image data using the machine learning model, and identifies the objects.
The server sends a response back to the mobile app through the same RPC channel, containing the identified objects.
The mobile app receives the response and displays the identified objects to the user.
2. Local Procedure Call (LPC):

Local: The mobile app itself.
An LPC might come into play within the mobile app’s image processing pipeline before the RPC is sent:

The app receives the image from the camera.
An LPC might be used to call a local image pre-processing function within the app. This function could resize the image, convert it to the format expected by the server, or perform other necessary transformations.
The pre-processed image data is then packaged and sent through the RPC to the server for object identification.
In essence, RPC facilitates communication between the app (client) and the separate server hosting the machine learning model, while LPC enables communication between different parts of the app itself running on the same device.

7
Q

What is ACID consistency?

A

Imagine you’re playing a game with your friends where you all take turns adding stickers to a picture (database). ACID helps make sure the picture doesn’t get messed up:

Atomicity: It’s like adding all your stickers at once (transaction). You either finish adding them all or none at all, so the picture doesn’t end up half-decorated and confusing.
Consistency: It’s like having rules about the picture (data stays valid). Maybe you can only use specific colors or shapes, so the picture always looks good and makes sense.
Isolation: Even if your friends (other transactions) try to add stickers at the same time, ACID makes sure each person only sees the picture one way at a time (transaction isolation). This avoids any sticker fights!
Durability: Once you stick on your stickers (update the data), they stay stuck forever (data persistence). Even if you accidentally knock over the picture (system crash), the stickers stay put!
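
A minimal sketch of atomicity using Python's built-in sqlite3 (the accounts table and amounts are made up for illustration):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE accounts (name TEXT PRIMARY KEY, balance INTEGER)")
conn.execute("INSERT INTO accounts VALUES ('alice', 100), ('bob', 50)")
conn.commit()

try:
    with conn:  # the with-block is one transaction: commit on success, roll back on error
        conn.execute("UPDATE accounts SET balance = balance - 30 WHERE name = 'alice'")
        raise RuntimeError("crash before the matching credit")  # simulate a failure mid-transfer
except RuntimeError:
    pass

# Atomicity: the partial debit was rolled back, so no money disappeared.
print(dict(conn.execute("SELECT name, balance FROM accounts")))  # {'alice': 100, 'bob': 50}
```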

8
Q

Explain the diff between ACID consistency and CAP consistency

A

ACID and CAP deal with consistency in different ways:

ACID (Atomicity, Consistency, Isolation, Durability):

Focuses on data integrity within a single database.
Ensures reliable updates, like your favorite game keeping your high score safe.
Imagine it as a strict teacher in the classroom (database) making sure everyone follows the rules (data stays valid) when updating the board (data).
CAP (Consistency, Availability, Partition Tolerance):

Deals with distributed systems where data is spread across multiple locations.
Focuses on trade-offs between keeping data consistent everywhere (Consistency), being always available (Availability), and tolerating network problems (Partition Tolerance).
Imagine a game with multiple scoreboards (data) in different schools (servers). During a network problem between schools, you can't have everything at once:
Waiting until every scoreboard agrees before answering keeps the data consistent but makes the system slow or unavailable (sacrificing Availability).
Answering immediately keeps the system available but might show a stale score (sacrificing Consistency).
Since network partitions are unavoidable in a distributed system, Partition Tolerance is effectively required, and the real trade-off during a partition is between Consistency and Availability.

9
Q

Eventual Consistency

A

Eventual consistency is like waiting for the mail to deliver gossip in a big town. Updates are sent out (replicated) but might take a while to reach everyone (all servers). Eventually, everyone will have the latest news (consistent data), but there might be a short delay.

10
Q

Which consistency model offers the highest availability?

A

Eventual consistency

11
Q

What is the weakest consistency model?

A

Eventual consistency

12
Q

SQL v NoSQL

A

Imagine your data is a collection of items in a classroom. SQL databases are like filing cabinets with neat rows and columns, perfect for things that fit in folders, like names and grades (structured data). NoSQL databases are like big boxes where you can store all sorts of things, like drawings, projects, and maybe even a toy robot (unstructured data)! They’re more flexible for messy data that doesn’t fit neatly in rows and columns.

13
Q

SQL v NoSQL as explained by solution architect

A

Structure:

SQL: Enforces a predefined schema with rigid table structures and data types. Think of it as a strictly organized library with specific sections for books, DVDs, and audiobooks.
NoSQL: Offers flexible schema with various data models like documents, key-value pairs, or graphs. Imagine a modern library with designated areas for different media, but items within each section can be diverse.
Scalability:

SQL: Primarily scales vertically by adding more processing power to a single server. It can become expensive for massive datasets. Think of adding more shelves to a single, overflowing bookcase.
NoSQL: Scales horizontally by adding more servers to distribute the data load. Ideal for handling constantly growing datasets. Imagine adding more bookcases to a library as the collection expands.
Use Cases:

SQL: Excellent for structured data with complex queries and transactional consistency (think banking or e-commerce). It’s the go-to for relational data with established schemas.
NoSQL: Perfect for unstructured or semi-structured data with high availability and performance needs (think social media or IoT sensor data). Ideal for large, evolving datasets where flexibility is crucial.
Choosing the Right Tool:

Consider data structure, scalability requirements, and query patterns. If data is relational and requires complex joins, SQL might be ideal. For vast, evolving data with high availability needs, NoSQL could be a better fit.
Ultimately, the best choice depends on the specific needs of your application and data.

14
Q

Strong Consistency:

A

This is the gold standard, guaranteeing that all reads always reflect the latest write across all replicas of the data. Imagine a single source of truth, like a master document everyone can access simultaneously. It offers the highest data integrity but can impact performance and scalability.

15
Q

Read Your Writes Consistency:

A

This model ensures that a client can always read its own successful writes immediately. Think of it like writing a note and then immediately being able to read it back yourself. However, other clients might not see the update yet. This model offers a balance between availability and consistency and is suitable for scenarios where immediate access to self-generated data is important.

16
Q

How do I choose my consistency model?

A

Data Integrity: How critical is it for all data to be immediately consistent across all replicas?
Availability: Can the system tolerate any downtime or lag in data updates?
Performance: How important are fast read and write operations?
Scalability: Will your data volume grow significantly over time?
Architecting with Consistency Models:

Strong consistency might be ideal for financial transactions or critical real-time systems requiring absolute data accuracy.
Eventual consistency is well-suited for social media platforms or e-commerce sites where immediate data updates are less crucial than high availability.

17
Q

What are some types of failures?

A

Single Point of Failure (SPOF): This occurs when a single component’s failure cripples the entire system. Imagine a bridge with only one lane – if that lane collapses, the entire bridge is unusable. Solutions include redundancy, like building additional lanes or finding alternative routes.
Cascading Failure: This occurs when the failure of one component triggers failures in other dependent components, creating a domino effect. Think of a power outage that shuts down critical servers, leading to data loss and service disruptions throughout the system. Mitigation strategies involve isolating components, designing graceful degradation, and implementing fault tolerance mechanisms.
Resource Exhaustion: This occurs when a system runs out of critical resources like CPU, memory, or storage, causing performance degradation or complete system crashes. Imagine a car running out of gas – it simply stops functioning. Solutions involve resource monitoring, auto-scaling capabilities, and capacity planning.
Byzantine Failures: This complex model describes situations where failing components can exhibit unpredictable behavior, sending misleading or inconsistent information. Imagine a group of unreliable witnesses to an event, each providing conflicting accounts. Byzantine fault tolerance is a challenging area of distributed systems design.

18
Q

How do you mitigate failures?

A

Redundancy: Introduce backups, failover mechanisms, or load balancing to avoid SPOFs.
Isolation: Design your system with loosely coupled components to limit the impact of cascading failures.
Monitoring and resource management: Proactively monitor resource usage and implement scaling mechanisms to prevent exhaustion.
Error handling and recovery: Build robust error handling and recovery routines to gracefully handle failures and minimize downtime.
Test and validate: Regularly test your system under simulated failure conditions to verify the effectiveness of your mitigation strategies.

19
Q

Availability: What is availability?

A

Availability is the percentage of time that some service or infrastructure is accessible to clients and is operated upon under normal conditions. For example, if a service has 100% availability, it means that the said service functions and responds as intended (operates normally) all the time.

20
Q

Non-functional Sys Char: How do we measure availability?

A

Availability (%) = ((total time − time the service was down) / total time) × 100
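
A quick worked example (the downtime figure is made up): a service down for 43.2 minutes in a 30-day month.

```python
def availability_pct(total_minutes, downtime_minutes):
    return (total_minutes - downtime_minutes) / total_minutes * 100

total = 30 * 24 * 60                   # 43,200 minutes in a 30-day month
print(availability_pct(total, 43.2))   # 99.9 -> "three nines" for that month
```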

21
Q

The nines of availability

A
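
Availability is often quoted in "nines." Approximate allowed downtime per year: 99% (two nines) is about 3.65 days, 99.9% (three nines) about 8.76 hours, 99.99% (four nines) about 52.6 minutes, and 99.999% (five nines) about 5.26 minutes.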
22
Q

Non-func req: What is reliability?

A

The probability that the service will perform its functions for a specified period of time. Reliability measures how the service performs under varying operating conditions.

23
Q

Metrics to measure R

A
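
Common metrics are MTBF (mean time between failures), MTTF (mean time to failure), and MTTR (mean time to repair); roughly, MTBF = MTTF + MTTR. A longer MTTF/MTBF and a shorter MTTR indicate a more reliable service.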
24
Q

What is the measurement of availability driven by?

A

time loss

25
Q

What is the measurement of reliability driven by?

A

frequency and impact of failures

26
Q

Scalability

A

Ability to handle an increase in amount of workload without compromising performance. A search engine, for example, must accommodate increasing numbers of users, as well as the amount of data it indexes.

27
Q

What are the two types of workload?

A

Request workload: This is the number of requests served by the system.

Data/storage workload: This is the amount of data stored by the system.

28
Q

Dimensions of scalability

A

Size scalability: A system is scalable in size if we can simply add additional users and resources to it.
Administrative scalability: This is the capacity for a growing number of organizations or users to share a single distributed system with ease.
Geographical scalability: This relates to how easily the program can cater to other regions while maintaining acceptable performance constraints. In other words, the system can readily service a broad geographical region, as well as a smaller one.

29
Q

Vertical Scaling

A

Vertical scaling, also known as “scaling up,” refers to scaling by providing additional capabilities (for example, additional CPUs or RAM) to an existing device. Vertical scaling allows us to expand our present hardware or software capacity, but we can only grow it to the limitations of our server. The dollar cost of vertical scaling is usually high because we might need exotic components to scale up.

30
Q

Horizontal Scaling

A

Horizontal scaling, also known as “scaling out,” refers to increasing the number of machines in the network. We use commodity nodes for this purpose because of their attractive dollar-cost benefits. The catch here is that we need to build a system such that many nodes could collectively work as if we had a single, huge server.

31
Q

What is maintainability?

A

Maintainability refers to the ease with which a system can be modified, extended, and debugged throughout its lifecycle.

32
Q

What is the concept of operability in maintainability?

A

This is the ease with which we can ensure the system’s smooth operational running under normal circumstances and achieve normal conditions under a fault.

33
Q

What is the concept of lucidity in maintainability?

A

This refers to the simplicity of the code. The simpler the code base, the easier it is to understand and maintain it, and vice versa.

34
Q

What is the concept of modifiability in maintainability?

A

This is the capability of the system to integrate modified, new, and unforeseen features without any hassle.

35
Q

How does Google define maintainability?

A

From a Google Solutions Architect’s perspective, maintainability refers to the ease with which a system can be modified, extended, and debugged throughout its lifecycle. It encompasses various aspects that contribute to keeping the system efficient, adaptable, and cost-effective in the long run.

Here are some key principles of maintainability as emphasized by Google Cloud Architects:

Modular design: Breaking down the system into well-defined, independent modules with clear interfaces promotes easier isolation of issues and facilitates targeted modifications.
Code clarity and documentation: Writing clean, well-commented code and maintaining comprehensive documentation improves understanding for future developers, reducing maintenance effort.
Automated testing: Implementing unit, integration, and system tests ensures code quality, catches regressions early, and automates repetitive tasks, streamlining maintenance processes.
Version control and configuration management: Using version control systems like Git and configuration management tools like Terraform enables tracking changes, reverting to previous states if needed, and simplifies managing infrastructure configurations.
Infrastructure as code: Treating infrastructure as code using tools like Cloud Deployment Manager allows for automated provisioning and configuration, making deployments and updates more consistent and less error-prone.
Observability and monitoring: Implementing proper monitoring and logging solutions provides deep insights into system health and performance, enabling proactive identification and resolution of potential issues.
By prioritizing these principles, Google Solutions Architects aim to build and deploy systems that are not only functional but also sustainable and manageable in the long term. This translates to:

Reduced development and maintenance costs: Easier modifications and fewer errors lead to faster development cycles and lower maintenance overheads.
Improved scalability and adaptability: Modular design and automated processes make it easier to adapt the system to changing requirements or scale it efficiently as needed.
Reduced risk of downtime: Proactive monitoring and well-documented code minimize the risk of introducing regressions or encountering unexpected issues during maintenance activities.
Overall, maintainability is a crucial aspect of building robust and sustainable solutions on Google Cloud, ensuring their long-term viability and minimizing the burden on future developers and administrators.

36
Q

How do you measure maintainability?

A

Maintainability, M, is the probability that the service will restore its functions within a specified time of fault occurrence. M measures how conveniently and swiftly the service regains its normal operating conditions.

Mean time to repair.

37
Q

Diff between maintainability and reliability

A

Maintainability can be defined more clearly in close relation to reliability. The only difference between them is the variable of interest: maintainability refers to time-to-repair, whereas reliability refers to both time-to-repair and time-to-failure. Combining maintainability and reliability analysis can help us gain availability, downtime, and uptime insights.

38
Q

What is fault tolerance?

A

Fault tolerance refers to the ability of a system to remain operational and functional even in the presence of faults or failures.

39
Q

What are the key aspects of fault tolerance as defined by Google?

A

Redundancy: Implementing redundant components like servers, network connections, or storage resources allows the system to automatically switch to a backup if a primary component fails. This can be achieved through services like Google Cloud Load Balancing and regional deployments of resources.
Failover mechanisms: Designing automatic failover mechanisms ensures the system gracefully switches to a secondary resource when a primary component encounters an issue. This minimizes service interruption and user impact. Google Cloud offers features like Cloud Spanner and Cloud SQL failover groups for automated failover capabilities.
Self-healing: Building self-healing capabilities allows the system to automatically detect and recover from failures without manual intervention. This can involve restarting failed processes, re-initializing connections, or automatically scaling resources in response to increased load. Google Cloud offers tools like Cloud Monitoring and Cloud Functions to automate recovery actions.
Isolation: Designing the system with loosely coupled components helps prevent cascading failures where a single fault brings down the entire system. This allows other parts of the system to continue functioning even if one component encounters an issue.
Error handling and recovery: Implementing robust error handling mechanisms ensures the system gracefully handles errors and attempts to recover from failures without crashing. This includes logging errors for further analysis and providing meaningful feedback to users.
Testing and monitoring: Regularly testing the system under simulated failure conditions and implementing comprehensive monitoring practices are crucial for identifying potential weaknesses and vulnerabilities. This allows proactive measures to be taken to strengthen the system’s fault tolerance.
By prioritizing these aspects, Google Solutions Architects aim to build systems that are resilient and adaptable to unexpected events. This translates to:

Reduced downtime and data loss: Fault tolerance mechanisms minimize service disruptions and ensure data integrity even in the face of failures.
Improved user experience: By minimizing downtime and errors, fault tolerance helps maintain a consistent and reliable user experience.
Increased operational efficiency: By automating recovery processes and proactively identifying potential issues, fault tolerance reduces the need for manual intervention and optimizes operational efficiency.
Overall, fault tolerance is a cornerstone of building reliable and robust systems on Google Cloud. By incorporating these principles, Google Solutions Architects create solutions that can withstand unexpected challenges and deliver a high level of availability and service continuity to their users.

40
Q

Fault tolerance techniques:

A

Replication: replicate services and data. Swap out a failed node or a failed data store with its replica.

41
Q

Why is updating data in replicas challenging?

A

Updating data in replicas is a challenging job. When a system needs strong consistency, we can synchronously update data in replicas. However, this reduces the availability of the system.

42
Q

Async rep data tradeoff

A

We can also asynchronously update data in replicas when we can tolerate eventual consistency, resulting in stale reads until all replicas converge.

43
Q

Example of replicating database with cloud storage:

A

Example 1: Replicating Cloud SQL Database to Cloud Storage using Cloud Functions

Scenario: You have a Cloud SQL database containing important application data and want to replicate it to another region for backup and disaster recovery purposes.

Services Used:

Cloud SQL: Stores the source database.
Cloud Functions: Triggers the replication process based on changes in the database.
Cloud Storage: Stores the replicated data in a bucket.

44
Q

Say you're given a task where the platform needs to handle:
1. An increasing number of requests from different parts of the world
2. Users needing 24/7 access

What are the two most important non-functional requirements?

A

Availability - The service needs to be up whenever users need it.
Scalability - The platform must be able to scale to the growing number of requests coming in.

45
Q

Consistency definition

A

From a Google Solutions Architect’s perspective, consistency refers to the guarantee of data integrity across all replicas within a distributed system. This ensures everyone sees the same “truth” when accessing the data, whether it’s strong consistency with immediate updates or eventual consistency where updates eventually propagate. Choosing the right model depends on your needs for data accuracy versus availability.

46
Q

What is low latency?

A

For a Google Solutions Architect, low latency translates to minimized delays in data transfer and processing across a system. It’s about ensuring responsiveness and fast user experiences, like serving search results or loading content near-instantaneously. They strive to optimize network infrastructure, utilize regional deployments, and leverage caching mechanisms to achieve this goal.

46
Q

Imagine a banking application for financial transactions and buying online products. This platform allows users to obtain their account status, transfer money, pay utility bills, and generate bank statements.

List the following non-functional requirements in the correct order, starting from the most important non-functional requirement to the least important non-functional requirement:

Low latency

Consistency

Security

A

Security -> Consistency -> Low Latency

47
Q

A space agency relies on critical systems to operate spacecraft, conduct space missions, and gather valuable data for scientific research. Imagine a scenario where a spacecraft is on a mission to explore a distant planet, and it encounters a hardware malfunction or a communication disruption with the control center on Earth.

State one of the most important and relevant non-functional requirements from the list provided below such that its inclusion in the design would enable us to recover from the scenario mentioned above. Please also provide proper reasoning behind your decision:

A

Fault Tolerance: if one component fails (a hardware malfunction or a communication disruption), the system should be able to fail over to redundant components or recover automatically, so communications can be restored and the mission can continue.

48
Q

Google defining non-func requirements

A

Reliability: Ensuring your system consistently delivers accurate and expected results, even under challenging conditions, like a reliable car that always starts.
Maintainability: Designing systems that are easy to modify, troubleshoot, and extend over time, like a modular house that’s easy to add rooms to.
Consistency: Guaranteeing data integrity across all copies in a distributed system, ensuring everyone sees the same “truth,” with choices based on the trade-off between immediate updates (strong consistency) and eventual consistency (updates propagate eventually).
Fault Tolerance: Building systems that can withstand failures and continue operating, like a car with redundant brakes that can still stop even if one fails.
Availability: Ensuring your system is accessible and operational for users when they need it, like a store that’s always open for business.
Scalability: Designing systems that can adapt to changing demands by efficiently adding or removing resources, like a house party that can accommodate more guests by moving furniture around.

49
Q

Load balancers as explained by google

A

A Google Cloud Architect would explain a load balancer like a traffic cop for your cloud applications. Imagine a busy intersection with cars trying to reach different buildings (your application servers). The load balancer directs incoming user traffic (cars) across multiple available application servers (buildings) to ensure smooth operation and prevent any one server from getting overloaded. This keeps response times fast and your application highly available.

50
Q

Back of the envelope calcs - why do we use them?

A

To focus less on the nitty-gritty details of a system and quickly sanity-check whether a design is feasible.

Examples:

The number of concurrent TCP connections a server can support.
The number of requests per second (RPS) a web, database, or cache server can handle.
The storage requirements of a service.
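
A minimal sketch of the style of estimate (every number below is an assumption for illustration):

```python
dau = 100_000_000            # assumed daily active users
requests_per_user = 10       # assumed requests per user per day
seconds_per_day = 86_400

avg_rps = dau * requests_per_user / seconds_per_day
peak_rps = avg_rps * 2       # assume peak traffic is roughly 2x the average

print(round(avg_rps))   # ~11,574 requests per second on average
print(round(peak_rps))  # ~23,148 at peak
```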

51
Q

Data Center

A
52
Q

What do Web servers do?

A

Decoupled from application servers.

A Google Cloud Architect would describe a web server as the behind-the-scenes engine that delivers your website content to users. Imagine a restaurant kitchen (web server) that receives customer orders (user requests) and prepares the food (processes the request) using recipes (web applications). Finally, it delivers the prepared dishes (web content) to the waiters (web browsers) who serve them to the customers (users).

Here at Google Cloud, we offer several options to get your web server up and running quickly:

Compute Engine: This provides you with virtual machines (VMs) where you can install and configure any web server software you prefer, like Apache or Nginx. It offers full control and flexibility, but requires more manual setup and management.

App Engine: This is a fully managed platform where you deploy your web application code (recipes) and Google handles the underlying infrastructure, including the web server (kitchen). It’s ideal for simple to complex web applications and offers automatic scaling and load balancing.

Cloud Run: This serverless offering lets you deploy containerized web applications (pre-packaged recipes) without managing any servers yourself. Google automatically provisions resources based on traffic, making it ideal for cost-effective and scalable web applications.

53
Q

What do web servers need?

A

Good computational (CPU) capacity, since they handle many concurrent client requests.

54
Q

What is an application server

A

A Google Cloud Architect would describe an application server as the chef in the kitchen of your web application. While the web server delivers the content (like the waiter bringing the food), the application server prepares the content based on the user’s request. Imagine the chef receiving orders (requests) and using the kitchen (application server) with its ingredients and tools (databases, application logic) to cook the food (process the request and generate the response). Finally, the prepared dish (processed response) is sent back to the web server for delivery to the user.

Here at Google Cloud, we offer a few options to get your application server up and running:

Compute Engine: Similar to web servers, you can use Compute Engine VMs to install and configure any application server software you prefer, like Tomcat or WildFly. This offers full control and flexibility but requires more manual setup and management.

App Engine: While primarily focused on web applications, App Engine can also handle some application server functionalities. You can deploy your application code containing business logic alongside your web application, simplifying deployment and management.

Cloud Run: Like with web servers, Cloud Run allows deploying containerized application server code alongside your web application. This serverless approach offers automatic scaling and eliminates server management, making it cost-effective and scalable.

Kubernetes Engine (GKE): This managed Kubernetes service allows deploying and managing containerized applications at scale. You can use GKE to deploy containerized application servers alongside your web applications, offering flexibility and control over the environment.

Choosing the right service depends on your specific needs:

For full control and customization: Choose Compute Engine or GKE.
For ease of use and some application server functionality: Choose App Engine.
For serverless and cost-effective deployments: Choose Cloud Run.

55
Q

What are microservices?

A

As a Google Cloud Architect, I’d explain microservices like this:

Imagine a complex restaurant operation (your application). Traditionally, everything might be done in one giant kitchen (monolithic architecture). With microservices, we break it down into specialized stations (individual services) like appetizers, entrees, and desserts.

Here’s the breakdown:

Independent services: Each microservice focuses on a specific task (preparing appetizers, grilling steaks) and operates independently.
Clear communication: Services communicate with each other through well-defined APIs (like waitstaff taking orders from tables).
Faster development: You can develop and deploy individual services faster, like adding a new dessert station without affecting the entire kitchen.
Scalability: You can independently scale up specific services (adding more chefs to the grill station during peak hours) based on their needs.
Fault tolerance: If one service has an issue (grill malfunctions), it doesn’t bring down the whole operation (other services like appetizers can still function).
Here at Google Cloud, we offer several tools to help you build and deploy microservices:

Cloud Functions: For small, event-driven tasks.
App Engine: For simple to complex web applications with built-in scaling.
Cloud Run: Serverless platform for deploying containerized microservices.
Kubernetes Engine (GKE): For managing and orchestrating containerized microservices at scale.
Choosing the right tools depends on your project’s complexity and needs.

By adopting microservices, you can build more agile, scalable, and maintainable applications, just like a well-organized restaurant kitchen can deliver delicious food efficiently.

56
Q

Application servers need

A

High computation and storage capacity, since they execute the application's business logic. They typically generate dynamic (changing) content, whereas a web server usually serves static content.

57
Q

Storage servers

A

Typically have loads of hard drive memory and can store a ton of data.

58
Q

Diff between RAM and hard drive memory

A

RAM (Random Access Memory):

Imagine RAM as your workstation desk. It’s where you keep the things you’re currently working on, like open documents, folders, and tools.
Faster access: RAM is super fast, like having everything you need right at your fingertips. You can access information instantly.
Volatile memory: RAM is like a whiteboard – information is erased once you turn off your computer (or clear the desk).
Smaller capacity: RAM is typically smaller in size compared to a hard drive. It’s designed to hold what you’re actively using, not everything you own.
Hard Drive Memory (HDD) or Solid State Drive (SSD):

Think of the hard drive as your room’s storage closet. It’s where you keep all your stuff, from clothes and books (documents, music, movies) to long-term projects (archived files).
Slower access: HDDs are slower than RAM, like rummaging through a closet to find something specific. SSDs are a faster type of hard drive but still slower than RAM.
Non-volatile memory: Hard drives retain information even when you turn off your computer (or close the closet door). Your stuff stays there until you take it out.
Larger capacity: Hard drives have a much larger storage capacity compared to RAM. You can store a vast amount of information there.

59
Q

Define throughput

A

Throughput is the amount of work a system completes per unit of time: for example, requests per second handled by a web server, queries per second processed by a database, or bits per second transferred over a network link. Latency measures how long a single operation takes; throughput measures how many operations complete in a given window.

60
Q

Examples of resource estimation:

How would you calculate the number of servers?

A

A simple approach: estimate requests per second from the daily active users (DAU). Assume each DAU issues some number of requests per day, convert that to requests per second (there are 86,400 seconds in a day), apply a peak factor, and divide by the requests per second a single server can handle; see the sketch below.
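
A minimal sketch with assumed numbers (real per-server capacity comes from load testing, not guesses):

```python
peak_rps = 23_000            # assumed peak requests per second for the service
rps_per_server = 500         # assumed capacity of one commodity server

servers = -(-peak_rps // rps_per_server)   # ceiling division
print(servers)               # 46 servers, before adding headroom for redundancy
```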

61
Q

How do you estimate storage requirements?

A
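
It depends on the workload, but the usual back-of-the-envelope arithmetic is: new items per day × average item size × replication factor × retention period. A minimal sketch with assumed figures:

```python
new_items_per_day = 500_000_000   # assumed items (e.g., messages) written per day
avg_item_size_bytes = 1_000       # assumed average size of one item
replication_factor = 3            # each item stored on three servers
retention_days = 5 * 365          # keep data for five years

total_bytes = new_items_per_day * avg_item_size_bytes * replication_factor * retention_days
print(total_bytes / 1e15, "PB")   # ~2.74 PB
```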
62
Q

Sys Design Building Blocks: Domain Name System

A

This building block focuses on how to design hierarchical and distributed naming systems for computers connected to the Internet via different Internet protocols.

63
Q

Sys design building block: Load balancer

A

Here, we'll understand the design of a load balancer, which is used to fairly distribute incoming clients' requests among a pool of available servers. It also reduces load and can bypass failed servers.
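
A minimal sketch of the core idea, round-robin distribution with a health check (the server list and is_healthy logic are made up for illustration):

```python
import itertools

servers = ["app-1:8080", "app-2:8080", "app-3:8080"]
rotation = itertools.cycle(servers)

def is_healthy(server):
    # Placeholder; a real load balancer probes servers periodically.
    return server != "app-2:8080"          # pretend app-2 has failed

def pick_server():
    # Round-robin over the pool, skipping servers that fail their health check.
    for _ in range(len(servers)):
        candidate = next(rotation)
        if is_healthy(candidate):
            return candidate
    raise RuntimeError("no healthy servers available")

print([pick_server() for _ in range(4)])   # requests spread across app-1 and app-3
```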

64
Q

BB: Database

A

This building block enables us to store, retrieve, modify, and delete data in connection with different data-processing procedures. Here, we’ll discuss database types, replication, partitioning, and analysis of distributed databases.

65
Q

BB: Key-value store

A

It is a non-relational database that stores data in the form of a key-value pair. Here, we’ll explain the design of a key-value store along with important concepts such as achieving scalability, durability, and configurability

66
Q

BB: Content Delivery Network

A

In this chapter, we'll design a content delivery network (CDN) that's used to cache and serve popular content such as videos, images, audio, and webpages close to end users. It efficiently delivers content while reducing latency and the burden on the data centers.

67
Q

BB: Sequencer

A

In this building block, we'll focus on the design of a unique ID generator, with a major focus on maintaining causality. It also explains three different methods for generating unique IDs.
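
One widely used approach (not necessarily the one this chapter settles on) packs a timestamp, a worker ID, and a per-millisecond sequence into a single 64-bit integer, Twitter-Snowflake style; a minimal sketch with illustrative bit widths:

```python
import time

WORKER_BITS, SEQUENCE_BITS = 10, 12
worker_id = 7            # assumed unique ID of this machine
sequence, last_ms = 0, -1

def next_id():
    global sequence, last_ms
    now_ms = int(time.time() * 1000)
    if now_ms == last_ms:
        sequence = (sequence + 1) & ((1 << SEQUENCE_BITS) - 1)   # wraps within the same millisecond
    else:
        sequence, last_ms = 0, now_ms
    # Timestamp in the high bits keeps IDs roughly ordered by creation time (causality).
    return (now_ms << (WORKER_BITS + SEQUENCE_BITS)) | (worker_id << SEQUENCE_BITS) | sequence

print(next_id(), next_id())
```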

68
Q

BB: Service Monitoring

A

Monitoring systems are critical in distributed systems because they help analyze the system and alert the stakeholders if a problem occurs. Monitoring is often useful to get early warning systems so that system administrators can act ahead of an impending problem becoming a huge issue. Here, we’ll build two monitoring systems, one for the server-side and the other for client-side errors.

69
Q

Distributed Caching

A

In this building block, we’ll design a distributed caching system where multiple cache servers coordinate to store frequently accessed data.
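
Each cache server needs an eviction policy; a minimal sketch of LRU (least recently used) eviction with Python's OrderedDict (capacity and keys are made up):

```python
from collections import OrderedDict

class LRUCache:
    def __init__(self, capacity=2):
        self.capacity = capacity
        self.data = OrderedDict()

    def get(self, key):
        if key not in self.data:
            return None
        self.data.move_to_end(key)            # mark as most recently used
        return self.data[key]

    def put(self, key, value):
        self.data[key] = value
        self.data.move_to_end(key)
        if len(self.data) > self.capacity:
            self.data.popitem(last=False)     # evict the least recently used entry

cache = LRUCache(capacity=2)
cache.put("user:1", "Alice")
cache.put("user:2", "Bob")
cache.get("user:1")                            # touch user:1
cache.put("user:3", "Carol")                   # evicts user:2
print(list(cache.data))                        # ['user:1', 'user:3']
```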

70
Q

BB Distributed Messaging Queue:

A

In this building block, we’ll focus on the design of a queue consisting of multiple servers, which is used between interacting entities called producers and consumers. It helps decouple producers and consumers, results in independent scalability, and enhances reliability.

71
Q

Publish-Subscribe System

A

In this building block, we’ll focus on the design of an asynchronous service-to-service communication method called a pub-sub system. It is popular in serverless, microservices architectures and data processing systems.

72
Q

Rate Limiter

A

Here, we'll design a system that throttles incoming requests for a service based on a predefined limit. It is generally used as a defensive layer for services to avoid excessive usage, whether intended or unintended.
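
One common algorithm is the token bucket; a minimal single-node sketch (the rate and capacity are assumed values):

```python
import time

class TokenBucket:
    def __init__(self, rate_per_sec, capacity):
        self.rate = rate_per_sec            # tokens refilled per second
        self.capacity = capacity            # maximum burst size
        self.tokens = capacity
        self.last = time.monotonic()

    def allow(self):
        now = time.monotonic()
        self.tokens = min(self.capacity, self.tokens + (now - self.last) * self.rate)
        self.last = now
        if self.tokens >= 1:
            self.tokens -= 1
            return True                     # request passes
        return False                        # request is throttled

limiter = TokenBucket(rate_per_sec=5, capacity=10)
print([limiter.allow() for _ in range(12)])  # roughly the first 10 pass, the rest are throttled
```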

73
Q

Blob Store:

A

This building block focuses on a storage solution for unstructured data—for example, multimedia files and binary executables.

74
Q

Distributed Search

A

A search system takes a query from a user and returns relevant content in a few seconds or less. This building block focuses on the three integral components: crawl, index, and search.

75
Q

Distributed Logging

A

Logging is an I/O intensive operation that is time-consuming and slow. Here, we’ll design a system that allows services in a distributed system to log their events efficiently. The system will be made scalable and reliable.

76
Q

Distributed Task Scheduling

A

We'll design a distributed task scheduler system that mediates between tasks and resources. It intelligently allocates resources to tasks to meet task-level and system-level goals. It's often used to offload background processing to be completed asynchronously.

77
Q

Sharded Counters

A

This building block demonstrates an efficient distributed counting system to deal with millions of concurrent read/write requests, such as likes on a celebrity’s tweet.
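
A minimal sketch of the idea: spread writes across several shard counters so concurrent writers don't contend on a single hot row, and sum the shards on read (the shard count is arbitrary):

```python
import random

NUM_SHARDS = 8
shards = [0] * NUM_SHARDS              # in practice, each shard is a separate row or key

def increment():
    # Writers pick a random shard, so concurrent likes hit different rows.
    shards[random.randrange(NUM_SHARDS)] += 1

def read_total():
    # Reads aggregate all shards (often cached, since reads can tolerate slight staleness).
    return sum(shards)

for _ in range(1_000):
    increment()
print(read_total())                    # 1000
```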

78
Q

From the system design building blocks, which will you typically encounter building LLM and ML systems?

A

Databases: Storing, retrieving, and managing training data, model parameters, and output results.
Key-Value Stores: Storing and retrieving training data, model configurations, and intermediate results, especially when dealing with large datasets.
Distributed Caching: Caching frequently accessed data like model parameters or pre-computed results to improve training and inference performance.
Distributed Messaging Queues: Facilitating communication and asynchronous execution of tasks in various stages of the ML pipeline, such as data pre-processing, training, and evaluation.
Blob Store: Storing large datasets of various formats, including images, text, and audio, which are commonly used for training generative AI models.
Distributed Search: Efficiently searching and retrieving relevant data points from large datasets used for training or evaluating models.
Distributed Logging: Recording and managing logs from various components of the ML system for monitoring, debugging, and troubleshooting purposes.
Distributed Task Scheduling: Scheduling and managing the execution of computationally expensive tasks involved in training and deploying ML models, especially in large-scale systems.
These building blocks play crucial roles in building scalable, efficient, and robust ML and generative AI systems.

79
Q

How to set up the conventions?

A

Conventions
For elaboration, we’ll use a “Requirements” section whenever we design a building block (and a design problem). The “Requirements” section will highlight the deliverables we expect from the developed design. “Requirements” will have two sub-categories:

Functional requirements: These represent the features a user of the designed system will be able to use. For example, the system will allow a user to search for content using the search bar.

Non-functional requirements (NFRs): The non-functional requirements are criteria based on which the user of a system will consider the system usable. NFR may include requirements like high availability, low latency, scalability, and so on.
Let’s start with our building blocks.

80
Q

Building blocks for online predictions with LLMs

A

RAG Architecture:

Databases: Storing retrieved documents, LLM prompts, and generated responses for training and future reference.
Distributed Search: Efficiently searching the document repository for relevant information during the retrieval stage of the RAG process.
LLMs: The core component responsible for generating text based on retrieved information and prompts.
Orchestration Layer: Manages communication between the various components like retrieval models, LLMs, and potentially user interfaces.
LLM-based Recommendation Engine:

Databases: Storing user data, item information, and potentially historical interactions or feedback.
LLMs: Generating personalized recommendations based on user data and item information. This might involve tasks like summarizing item descriptions, generating personalized messages, or tailoring recommendations to specific user preferences.
Content Delivery Network (CDN): Efficiently delivering LLM-generated content to users, especially if the recommendations involve text, images, or audio.
Monitoring and Logging: Monitoring the performance and effectiveness of the recommendation engine, including user interactions and feedback loops, to improve future recommendations.