Database Design & Scaling Flashcards

1
Q

NoSQL or SQL?

Structured, tabular data with a predefined schema.

A

SQL (MySQL, PostgreSQL):

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
2
Q

NoSQL or SQL?

Strong adherence to data types and relationships using foreign keys, constraints, and normalization.

A

SQL (MySQL, PostgreSQL):

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
3
Q

NoSQL or SQL?

Best suited for applications where data consistency and relationships are vital, such as transactional systems (e.g., banking).

A

SQL (MySQL, PostgreSQL):

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
4
Q

NoSQL or SQL?

Pros: Data integrity, ACID compliance, strong consistency.

A

SQL (MySQL, PostgreSQL):

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
5
Q

NoSQL or SQL?

Cons: Limited flexibility if the schema needs frequent changes. You have to alter tables and perform migrations when adding new fields.

A

SQL (MySQL, PostgreSQL):

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
6
Q

NoSQL or SQL?

Schema-less or flexible schema, with hierarchical or unstructured data.

A

NoSQL (MongoDB, Cassandra):

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
7
Q

NoSQL or SQL?

Supports key-value pairs, document-based storage, wide-column stores, or graph data structures.

A

NoSQL (MongoDB, Cassandra):

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
8
Q

NoSQL or SQL?

Best for applications where flexibility, scalability, or storing large volumes of unstructured or semi-structured data is more important than strong consistency (e.g., social networks, big data applications).

A

NoSQL (MongoDB, Cassandra):

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
9
Q

NoSQL or SQL?

Pros: Dynamic schema allows you to evolve and scale without altering tables or migrating data.

A

NoSQL (MongoDB, Cassandra):

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
10
Q

NoSQL or SQL?

Cons: Less rigid data integrity, and query complexity increases as you scale.

A

NoSQL (MongoDB, Cassandra):

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
11
Q

NoSQL or SQL?

These databases typically focus on strong consistency (ACID properties: Atomicity, Consistency, Isolation, Durability), ensuring every transaction is executed exactly once and in a correct state.

A

SQL (MySQL, PostgreSQL):

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
12
Q

NoSQL or SQL?

Availability can be a concern when scaling horizontally (across servers), as these distributed database systems can introduce complexity and latency to maintain consistency across nodes.

A

NoSQL ( Cassandra, MongoDB )

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
13
Q

NoSQL or SQL?

These systems prioritize availability and partition tolerance over strong consistency (i.e., eventual consistency model). This is crucial for distributed systems where uptime and scaling are more important than immediate consistency.

A

NoSQL (MongoDB, Cassandra):

Especially Cassandra

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
14
Q

NoSQL or SQL?

This DB offers tunable consistency levels (strong or eventual), while a DB of the same type defaults to eventual consistency.

A

NoSQL, specifically MongoDB

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
15
Q

Define ACID compliance

A

ACID compliance refers to a set of properties that guarantee reliable processing of database transactions in SQL (relational) databases. It ensures that transactions are processed consistently, even in the event of errors, power failures, or other issues.

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
16
Q

The A in ACID

A

Atomicity

Ensures that each transaction is treated as a single, indivisible unit. Either all operations within the transaction are successfully completed, or none of them are. If any part of the transaction fails, the entire transaction is rolled back to its previous state.

Short: All or nothing.

Example: In a banking application, transferring money from Account A to Account B should either debit Account A and credit Account B entirely, or not happen at all.

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
17
Q

The C in ACID

A

Consistency

Guarantees that a transaction brings the database from one valid state to another, maintaining database rules such as constraints, triggers, and referential integrity. After the transaction, the database will always be in a valid state, ensuring data integrity.

Short: Valid States Only

Example: If a transaction violates a foreign key constraint (e.g., referencing a non-existent user), it will fail, preventing invalid data from being inserted.

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
18
Q

The I in ACID

A

Isolation

Ensures that transactions are executed independently of one another. The intermediate states of a transaction should not be visible to other concurrent transactions, preventing issues like dirty reads, lost updates, and phantom reads.

Short: Transactions Don’t Intefere

Example: If two users are buying the last product from an online store, isolation ensures that only one transaction will succeed in updating the inventory, even if both are processed concurrently.

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
19
Q

The D in ACID

A

Durability

Guarantees that once a transaction is committed, it will remain committed, even in the event of a system failure. The changes made by the transaction are permanently recorded to stable storage, typically by using techniques like logging and backups.

Short: Changes are permanent after commit.

Example: After a bank transfer is confirmed, the transaction will persist even if the system crashes immediately after.

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
20
Q

NoSQL or SQL?

Scaling relational databases is typically vertical (scaling up by adding more power to a single machine).

A

SQL (MySQL, PostgreSQL):

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
21
Q

NoSQL or SQL?

Horizontal scaling (across multiple machines) is possible but more complex due to the inherent need for transactional consistency and managing joins across nodes (sharding is complex).

A

SQL (MySQL, PostgreSQL):

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
22
Q

NoSQL or SQL?

Performance can be limited by this scalability bottleneck for high-traffic applications.

A

SQL (MySQL, PostgreSQL):

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
23
Q

NoSQL or SQL?

Designed for horizontal scaling (sharding or partitioning data across many nodes) from the ground up.

A

NoSQL (MongoDB, Cassandra):

24
Q

NoSQL or SQL?

This database, for example, can scale across clusters more easily than its opposite databases, but has trade-offs in terms of how data relationships are managed (denormalization).

A

MongoDB specifically

25
Q

NoSQL or SQL?

This database is a peer-to-peer system designed for massive horizontal scalability and high write throughput, making it ideal for distributed workloads like real-time analytics or event logging.

A

Cassandra specifically

26
Q

CAP Theorem

A

AKA Brewer’s Theorem.

Consistency - Every read from the system returns the most recent write. All nodes in the system have the same, up-to-date data at all times.

Availability - Every request (read or write) receives a response, even if one or more nodes are down. The system remains operational, but it may not return the most recent data in the case of failures.

Partition Tolerance - The system continues to operate even if there is a network partition (communication between nodes is disrupted). Nodes may be unable to communicate with each other, but the system as a whole keeps running.

27
Q

Microservice Pro or Con?

Strong Module Boundaries

A

PRO.

Microservices reinforce modular structure, which is particularly important for larger teams.

28
Q

Microservice Pro or Con?

Independent Deployment

A

PRO

Simple services are easier to deploy, and since they are autonomous, are less likely to cause system failures when they go wrong.

29
Q

Microservice Pro or Con?

Technology Diversity

A

PRO

With microservices you can mix multiple languages, development frameworks and data-storage technologies.

30
Q

Microservice Pro or Con?

Distribution

A

Distributed systems are harder to program, since remote calls are slow and are always at risk of failure.

31
Q

Microservice Pro or Con?

Eventual Consistency

A

Maintaining strong consistency is extremely difficult for a distributed system, which means everyone has to manage eventual consistency.

32
Q

Microservice Pro or Con?

Operational Complexity

A

You need a mature operations team to manage lots of services, which are being redeployed regularly.

33
Q

NoSQL or SQL?

This language provides rich query languages that allow for complex joins, filtering, and aggregation. These queries can leverage indexes, foreign keys, and other constraints to ensure data consistency and optimize query performance.

A

SQL (MySQL, PostgreSQL):

34
Q

NoSQL or SQL?

Indexes and query optimizers in these databases are sophisticated and tuned for relational data.

A

SQL (MySQL, PostgreSQL):

35
Q

NoSQL or SQL?

Cons: For large datasets, complex joins can become inefficient as data scales.

A

SQL (MySQL, PostgreSQL):

36
Q

NoSQL or SQL?

These systems provide simpler query languages and rely more on the application logic to join data or process complex queries.

A

NoSQL (MongoDB, Cassandra):

37
Q

This DB supports secondary indexes but has limitations when handling deeply nested queries compared to SQL.

A

Specifically MongoDB

38
Q

This DB is more limited in terms of query flexibility and is built for high-speed writes with simple, predictable reads, but complex filtering or aggregation is challenging.

A

Specifically Cassandra

39
Q

What is horizontal scaling?

A

Horizontal scaling (scaling out) involves adding more servers or instances to distribute the load. Each server or node operates independently but serves the same application.

40
Q

What is vertical scaling?

A

Vertical scaling (scaling up) involves increasing the capacity of a single server by adding more CPU, memory, or storage. It improves performance without adding additional machines.

41
Q

What are the pros and cons of horizontal scaling?

A

Pros:

Better for handling large traffic surges.
More fault-tolerant.
Easier to add capacity incrementally.

Cons:

More complex system architecture.
Higher coordination overhead.

42
Q

What are the pros and cons of vertical scaling?

A

Pros:

Simpler to implement.
No need to rearchitect the application.

Cons:

Limited by hardware constraints.
Single point of failure.

43
Q

What is the role of a load balancer in scaling?

A

A load balancer distributes incoming traffic across multiple servers to ensure no single server is overwhelmed, improving availability and reliability.

44
Q

What are the types of load balancers?

A

Layer 4 Load Balancer: Works at the transport layer (TCP/UDP), distributing traffic based on IP addresses and ports.

Layer 7 Load Balancer: Works at the application layer (HTTP/HTTPS), distributing traffic based on content and application data.

45
Q

What is application sharding?

A

Sharding is the process of splitting data across multiple databases or application instances based on certain criteria, such as user ID or geographic region, to improve performance and scalability.

46
Q

What is database replication?

A

Database replication involves copying and maintaining database data across multiple servers, ensuring availability and redundancy. Common types include master-slave and master-master replication.

47
Q

What is master-slave replication?

A

Master-slave replication involves a master database where all writes occur, and slave databases that replicate the master’s data and handle read requests. This improves read performance and availability.

48
Q

What is master-master replication?

A

Master-master replication allows writes to occur on any node, and each node syncs with the others. It is more complex but improves both read and write availability.

49
Q

What is database partitioning?

A

Database partitioning divides a large database into smaller, more manageable pieces, either by rows (horizontal partitioning) or columns (vertical partitioning), to improve performance and scalability.

50
Q

What is horizontal partitioning (sharding) in databases?

A

Horizontal partitioning, or sharding, involves dividing a table’s rows into separate tables (shards) that can be distributed across different servers to handle more load.

51
Q

What is vertical partitioning in databases?

A

Vertical partitioning involves splitting a table’s columns into different tables based on their usage to optimize performance, typically storing frequently accessed columns separately from less-used ones.

52
Q

What is auto-scaling?

A

Auto-scaling automatically adjusts the number of active servers based on traffic demands, scaling up during peak times and scaling down during low usage to optimize resource usage and cost.

53
Q

What are common load balancing algorithms?

A

Round Robin: Distributes requests in a circular order.

Least Connections: Routes requests to the server with the fewest active connections.

IP Hash: Routes traffic based on the client’s IP address.

54
Q

What are the key challenges in scaling applications?

A
  • Maintaining consistency in distributed databases.
  • Managing state between servers.
  • Efficiently distributing load without creating bottlenecks.
  • Handling failures and ensuring high availability.
55
Q

What is the difference between stateless and stateful systems in scaling?

A

Stateless systems don’t store client state between requests, making them easier to scale horizontally.

Stateful systems require data about previous interactions, making scaling more complex as the state must be shared or stored persistently.

56
Q

What is Elastic Load Balancing (ELB)?

A

Elastic Load Balancing is a cloud service (like AWS ELB) that automatically distributes incoming application traffic across multiple targets, such as EC2 instances, containers, and IP addresses, adjusting to traffic changes.

57
Q

What is the purpose of a circuit breaker in scaling systems?

A

A circuit breaker is a design pattern that helps prevent system overload by stopping requests to an unresponsive service, allowing for fallback behavior and ensuring overall system stability.