Course4-M1 Flashcards
Introducing Relational Database Products
Explain single tier, two tier and three tier database architectures in simple terms
Single-Tier Architecture
What it is: The database, application logic, and user interface are all in one place.
Where it’s used: Small applications, personal projects, and simple setups.
Pros: Simple and easy to manage since everything is in one place.
Cons: Not scalable or secure for larger applications; if one part fails, everything fails.
Example: A desktop application with an embedded database like Microsoft Access.
Two-Tier Architecture
What it is: The user interface runs on a client, and the database is on a server. The application logic can be either on the client or the server.
Where it’s used: Medium-sized applications, enterprise software.
Pros: Better performance and scalability than single-tier; can handle more users.
Cons: Still limited in scalability; managing business logic can be challenging if it’s split between client and server.
Example: A client-server application where users interact with a front-end application (like a desktop app) that communicates directly with a back-end database server (like an SQL server).
Three-Tier Architecture
What it is: Separates the user interface, application logic, and database into three distinct layers.
Where it’s used: Large, complex applications, web applications.
Pros: Highly scalable, secure, and manageable; each tier can be developed, maintained, and updated independently.
Cons: More complex and expensive to set up and maintain.
Example: A web application where:
Presentation tier: The web browser (user interface).
Application tier: A web server running business logic (e.g., a server with Java, Python, or .NET).
Data tier: A database server (e.g., MySQL, PostgreSQL).
Visual Summary:
Single-Tier: All-in-one place (user interface + logic + data).
Two-Tier: Client (user interface) ↔ Server (logic + data).
Three-Tier: Client (user interface) ↔ Web/App Server (logic) ↔ Database Server (data).
These architectures offer different levels of complexity, scalability, and maintenance, making each suitable for different types of applications and organizational needs.
Application logic, also known as business logic, is the part of a software application that handles the data processing, business rules, and decision-making. It is responsible for implementing the core functionality of the application.
Example:
E-commerce Website:
Business Rule: Validate that a user’s credit card has sufficient funds before processing a purchase.
Data Processing: Calculate the total cost of items in the shopping cart, including taxes and shipping fees.
Workflow Management: Update inventory levels after a successful purchase.
What is shared disk database architecture?
Shared Disk Architecture
What it is: All the nodes in the cluster share the same storage disks.
How it works: Each node can access the same data stored on a common disk system.
Pros: Easier to manage and can balance loads effectively since all nodes see the same data.
Cons: Can become a bottleneck if too many nodes try to access the disk at once; potential single point of failure if the shared disk system goes down.
Example: Oracle RAC (Real Application Clusters).
What is nothing shared database arhcitecture?
Shared Nothing Architecture
What it is: Each node has its own private storage and doesn’t share it with others.
How it works: Nodes are responsible for their own data. They don’t depend on a central disk, which means they don’t compete for the same resources.
Pros: Scales well because adding more nodes doesn’t cause competition for the same disk; no single point of failure.
Cons: Can be more complex to manage because data needs to be partitioned and managed across multiple nodes.
Example: Google’s Bigtable, Cassandra.
What is sharding in database architecture?
Sharding
What it is: A way to split and distribute data across multiple nodes so that each node only holds a portion of the data.
How it works: The database is divided into smaller, more manageable pieces called shards. Each shard is stored on a different node.
Pros: Allows for horizontal scaling, meaning you can handle more data by adding more nodes; improves performance because each node only handles a subset of the data.
Cons: Can be complex to set up and manage; handling queries that need data from multiple shards can be challenging.
Example: MongoDB, MySQL with sharding.
What are functional dependencies?
Functional Dependencies (FDs) are fundamental building blocks for ensuring data integrity and consistency in relational databases. They represent a specific type of relationship between attributes in a relation, where the value of one attribute (determinant) uniquely determines the value of another attribute (dependent). In simpler terms, if you know the value of the determinant, you can always identify the exact value of the dependent attribute.
Notation: FDs are typically written as X -> Y, where X is the determinant and Y is the dependent attribute.
CustomerID -> (CustomerName, PhoneNumber) is another FD, as the customer ID uniquely identifies their name and phone number.
Properties:
Reflexivity: X -> X (every attribute determines itself).
Transitivity: If X -> Y and Y -> Z, then X -> Z (dependencies can chain together).
Closure: The minimal set of FDs that implies all other FDs in a relation.
Maintaining FDs help prevent data inconsistencies and duplication. Violating an FD could lead to inaccurate data and unreliable queries.
What are Multi-Valued Dependencies (MVDs)?
MVDs are more complex than FDs. With an MVD one attribute, the determinant, determines a set of possible values for another attribute, the dependent. In other words, knowing the value of the determinant narrows down the potential values the dependent attribute can hold.
Here are some key points about MVDs:
Notation: MVDs are written as X -> {Y1, Y2, …, Yn}, where X is the determinant and Y1, Y2, …, Yn are the possible values for the dependent attribute.
Example: In the “Car Dealership” schema, a potential MVD could be Model -> {Color1, Color2, …, Colorn}, where the car model determines the set of available color options.
Properties: MVDs have similar properties to FDs, like reflexivity and transitivity. However, they lack the closure property.
Identifying MVDs helps understand the complex relationships between attributes and ensures data consistency within those relationships. Violations of MVDs can lead to invalid data entries.
What are candidate keys?
Candidate keys uniquely identify each row in a relation. They represent a minimal set of attributes that collectively guarantee no duplicates exist in the table. In other words, if you know the values of the candidate key, you can pinpoint a specific row and only that row. The keys are unique within that table.
Here are some key properties of candidate keys:
Uniqueness: Each combination of values in the candidate key must uniquely identify a distinct row.
Minimality: No proper subset of the candidate key should be able to uniquely identify a row. This ensures that the candidate key is the smallest possible set of attributes needed for unique identification.
Performance: Having well-defined candidate keys significantly improves query performance. Queries searching for specific rows can utilize indexing on the candidate key for faster data retrieval. It also helps maintain data integrity by preventing duplicate entries.
Multiple keys: A relation can have multiple candidate keys, meaning different minimal sets of attributes can uniquely identify each row.