First Midterm Flashcards
What does NoSQL stand for?
Not Only SQL
It refers to a set of modern databases that don’t use traditional relational models.
What is a key trait of NoSQL databases?
Schema-less
Data can be stored as key-value pairs, documents, columns, or graphs.
What is the main purpose of NoSQL databases?
Designed to handle Big Data and unstructured/semi-structured data.
Why are NoSQL databases needed?
Traditional SQL databases can’t scale for volume, variety, and velocity of modern data.
What are the 7 V’s of Big Data?
- Volume: Enormous amounts of data (ZB, YB)
- Velocity: Speed of incoming data (real-time)
- Variety: Different formats (text, video, images, JSON, etc.)
- Variability: Data meaning can change with time
- Veracity: Trustworthiness and quality of data
- Visualization: Displaying complex data clearly (charts, graphs)
- Value: Extracting useful insights from data
Who defined Relational Databases (RDBMS) and when?
Edgar Codd in 1970.
What are the ACID properties in RDBMS?
- Atomicity: Transactions are all-or-nothing.
- Consistency: Data remains valid after a transaction.
- Isolation: Transactions don’t interfere.
- Durability: Once saved, data won’t be lost.
What are the limitations of RDBMS?
- Fixed schemas
- Hard to scale horizontally
- Not suitable for massive, real-time, or semi-structured data.
What are the main characteristics of NoSQL databases?
- Horizontal Scalability
- Schema-less
- High Performance
- Open-source and cost-effective
- Eventual Consistency
What is eventual consistency in NoSQL?
Data updates are not always immediately visible across all nodes but will eventually become consistent.
What are the types of NoSQL databases?
- Key-Value Stores
- Document Databases
- Column-Family Stores
- Graph Databases
What is a Key-Value Store?
The simplest form of a NoSQL database, storing data as (key, value) pairs.
What are the basic operations of Key-Value Stores?
- put(key, value)
- get(key)
- delete(key)
What is the BASE model in NoSQL?
- Basically Available: System is always available (even if partially)
- Soft state: Data can change over time, not always stable
- Eventual consistency: Data will become consistent… eventually
What is the CAP theorem?
- Consistency (C): All users see the same data.
- Availability (A): Every request gets a response, even if it’s partial.
- Partition Tolerance (P): The system continues to work even if network failures occur.
What are the two types of scalability?
- Vertical Scaling (Scaling up)
- Horizontal Scaling (Scaling out)
What is sharding in NoSQL?
Splits data across multiple machines based on a key.
What is the difference between replication and sharding?
- Replication: Duplicates the same data across multiple nodes.
- Sharding: Splits data across multiple machines.
What does durability mean in the context of ACID?
Once a transaction is committed, it won’t be lost.
What is the MapReduce programming model?
A three-phase model consisting of Map, Shuffle, and Reduce phases.
What does the Map function do in MapReduce?
Processes each record and outputs zero or more (key, value) pairs.
What is the Shuffle phase in MapReduce?
Groups intermediate pairs with the same key, managed by the framework.
What is the Reduce phase in MapReduce?
Applies the reduce function to each unique key to generate final output.
What is Hadoop?
An open-source implementation of MapReduce designed to process large datasets.