CHAPTER 1 Flashcards
What does NoSQL stand for?
Not Only SQL
Refers to a set of modern databases that do not use traditional relational models.
What is a key trait of NoSQL databases?
Schema-less
Data can be stored as key-value pairs, documents, columns, or graphs.
Why are NoSQL databases needed?
Traditional SQL databases struggle to scale to the volume, variety, and velocity of modern data.
Typical sources of such data include social media, IoT, and e-commerce.
What are the 7 V’s of Big Data?
- Volume
- Velocity
- Variety
- Variability
- Veracity
- Visualization
- Value
Who defined relational databases (RDBMS) and when?
Edgar F. Codd, who introduced the relational model in 1970.
What are the ACID properties in RDBMS?
- Atomicity
- Consistency
- Isolation
- Durability
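A minimal sketch of atomicity, using Python's built-in sqlite3 module as a stand-in for an RDBMS. The accounts table, the CHECK constraint, and the helper names make_bank, transfer, and balances are assumptions made for illustration. A transfer that would overdraw an account fails midway and is rolled back, so neither balance changes.

```python
import sqlite3
from typing import Dict


def make_bank() -> sqlite3.Connection:
    """Create an in-memory database with two hypothetical accounts."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE accounts ("
        "name TEXT PRIMARY KEY, "
        "balance INTEGER NOT NULL CHECK (balance >= 0))"
    )
    conn.executemany(
        "INSERT INTO accounts VALUES (?, ?)", [("alice", 100), ("bob", 50)]
    )
    conn.commit()
    return conn


def transfer(conn: sqlite3.Connection, src: str, dst: str, amount: int) -> bool:
    """Move money between accounts as one all-or-nothing transaction."""
    try:
        with conn:  # commits if the block succeeds, rolls back on an exception
            conn.execute(
                "UPDATE accounts SET balance = balance + ? WHERE name = ?",
                (amount, dst),
            )
            # If this debit violates the CHECK constraint, the credit above
            # is undone as well.
            conn.execute(
                "UPDATE accounts SET balance = balance - ? WHERE name = ?",
                (amount, src),
            )
        return True
    except sqlite3.IntegrityError:
        return False


def balances(conn: sqlite3.Connection) -> Dict[str, int]:
    """Return the current balance of every account."""
    return dict(conn.execute("SELECT name, balance FROM accounts"))


bank = make_bank()
overdraft_ok = transfer(bank, "bob", "alice", 80)  # bob only has 50
after_failed_transfer = balances(bank)
```

The credit is deliberately issued before the debit so that the failure happens after a partial change, which is the case atomicity exists to handle.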
What is a limitation of RDBMS?
Fixed schemas that are costly to change, difficulty scaling horizontally, and a poor fit for massive or semi-structured data.
What are the main characteristics of NoSQL?
- Horizontal Scalability
- Schema-less
- High Performance
- Open-source and cost-effective
- Eventual Consistency
What are the challenges of NoSQL?
- Immature technology
- Lack of standards
- Limited professional support
- Complex administration
- Fewer experts compared to SQL
What is a Key-Value Store?
The simplest form of a NoSQL database, storing data as (key, value) pairs.
Name some examples of Key-Value Stores.
Redis, Amazon DynamoDB, Project Voldemort.
What are the basic operations of a Key-Value Store?
- put(key, value)
- get(key)
- delete(key)
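The three operations above can be sketched with a toy in-memory store. The class name KeyValueStore and the return conventions (None for a missing key, a boolean from delete) are assumptions for illustration, not the API of any particular product.

```python
from typing import Any, Dict, Optional


class KeyValueStore:
    """Toy in-memory key-value store (hypothetical, for illustration only)."""

    def __init__(self) -> None:
        self._data: Dict[str, Any] = {}

    def put(self, key: str, value: Any) -> None:
        # Store the pair, overwriting any existing value for the key.
        self._data[key] = value

    def get(self, key: str) -> Optional[Any]:
        # Return the stored value, or None when the key is absent.
        return self._data.get(key)

    def delete(self, key: str) -> bool:
        # Remove the key and report whether it was present.
        if key in self._data:
            del self._data[key]
            return True
        return False


store = KeyValueStore()
store.put("session:42", {"user": "alice", "cart": ["book"]})
session = store.get("session:42")
```

The value is opaque to the store: it can only be fetched by its key, which is why queries based on values are a poor fit.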
When is it appropriate to use Key-Value Stores?
For storing data accessed only by key, such as session data, user profiles, and shopping cart data.
When should you avoid using Key-Value Stores?
- You need relationships between data items
- You need multi-key transactions
- You need to query by value rather than by key
- You need to operate on many keys at once (batch operations)
What is Redis?
An open-source, in-memory, key-value store.
What are some key features of Redis?
- Data types: strings, lists, sets, sorted sets, hashes
- Atomic operations
- Persistence options
- Pub/Sub messaging
What is the CAP Theorem?
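Consistency, Availability, Partition Tolerance. A distributed system can guarantee at most two of the three at once; because network partitions cannot be ruled out, the practical choice during a partition is between consistency and availability.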
What does the BASE model stand for?
- Basically Available
- Soft state
- Eventual consistency
What is the difference between vertical and horizontal scaling?
- Vertical Scaling: Add more power to a single machine
- Horizontal Scaling: Add more machines
What is sharding in NoSQL?
Data partitioning across multiple machines based on a key.
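A minimal sketch of hash-based sharding, one common way to pick a machine from a key. The function names shard_for and partition, and the choice of SHA-256 with a modulo, are assumptions for illustration; real systems may use range-based or consistent-hashing schemes instead.

```python
import hashlib
from typing import Dict, List


def shard_for(key: str, num_shards: int) -> int:
    """Map a key to a shard index using a hash that is stable across runs."""
    # Python's built-in hash() is randomized per process for strings,
    # so a cryptographic digest is used to get a repeatable result.
    digest = hashlib.sha256(key.encode("utf-8")).hexdigest()
    return int(digest, 16) % num_shards


def partition(keys: List[str], num_shards: int) -> Dict[int, List[str]]:
    """Group keys by the shard each one is assigned to."""
    shards: Dict[int, List[str]] = {i: [] for i in range(num_shards)}
    for key in keys:
        shards[shard_for(key, num_shards)].append(key)
    return shards


user_keys = [f"user:{i}" for i in range(10)]
layout = partition(user_keys, num_shards=3)
```

With plain modulo hashing, changing the number of shards reassigns most keys, which is one reason consistent hashing is often preferred in practice.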
What is eventual consistency in NoSQL?
Data updates are not always immediately visible across all nodes but will eventually become consistent.
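A toy simulation of this behaviour, assuming a hypothetical EventuallyConsistentStore in which a write lands on one replica immediately and reaches the others only when sync() is called. Real systems propagate updates in the background; the explicit sync() call stands in for that delay.

```python
from collections import deque
from typing import Any, Deque, Dict, List, Optional, Tuple


class EventuallyConsistentStore:
    """Toy model of replicas that converge after delayed propagation."""

    def __init__(self, num_replicas: int) -> None:
        self.replicas: List[Dict[str, Any]] = [{} for _ in range(num_replicas)]
        # Updates that have been accepted but not yet applied everywhere,
        # stored as (replica_index, key, value).
        self.pending: Deque[Tuple[int, str, Any]] = deque()

    def write(self, key: str, value: Any) -> None:
        # Replica 0 sees the write at once; the rest are queued.
        self.replicas[0][key] = value
        for index in range(1, len(self.replicas)):
            self.pending.append((index, key, value))

    def read(self, replica_index: int, key: str) -> Optional[Any]:
        # A read is served by a single replica and may return stale data.
        return self.replicas[replica_index].get(key)

    def sync(self) -> None:
        # Deliver queued updates in order, so later writes win.
        while self.pending:
            index, key, value = self.pending.popleft()
            self.replicas[index][key] = value


cluster = EventuallyConsistentStore(num_replicas=3)
cluster.write("likes", 10)
stale_read = cluster.read(2, "likes")   # replica 2 has not seen the write yet
cluster.sync()
fresh_read = cluster.read(2, "likes")   # all replicas now agree
```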
What is MapReduce?
A programming model for processing large datasets across many machines.
What are the phases of MapReduce?
- Map Phase
- Shuffle Phase
- Reduce Phase
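The three phases can be sketched as a single-process word count. The function names map_phase, shuffle_phase, reduce_phase, and word_count are assumptions for illustration; a real framework would run the map and reduce steps on many machines in parallel.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Tuple


def map_phase(document: str) -> List[Tuple[str, int]]:
    """Map: emit a (word, 1) pair for every word in one input document."""
    return [(word.lower(), 1) for word in document.split()]


def shuffle_phase(pairs: Iterable[Tuple[str, int]]) -> Dict[str, List[int]]:
    """Shuffle: group all emitted values by key."""
    grouped: Dict[str, List[int]] = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return dict(grouped)


def reduce_phase(grouped: Dict[str, List[int]]) -> Dict[str, int]:
    """Reduce: collapse each key's list of values into a single result."""
    return {key: sum(values) for key, values in grouped.items()}


def word_count(documents: List[str]) -> Dict[str, int]:
    """Run the three phases in order over a list of documents."""
    mapped = [pair for doc in documents for pair in map_phase(doc)]
    return reduce_phase(shuffle_phase(mapped))


counts = word_count(["big data big ideas", "data beats ideas"])
```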
What is a combiner in MapReduce?
An optional optimization that aggregates data locally before shuffling.
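A minimal sketch of the effect of a combiner on a word count, assuming hypothetical helpers map_only, map_with_combiner, and reduce_partials. The combiner sums counts on the mapper's side, so fewer pairs have to cross the network during the shuffle.

```python
from collections import Counter, defaultdict
from typing import Dict, List, Tuple


def map_only(document: str) -> List[Tuple[str, int]]:
    """Plain map output: one (word, 1) pair per word occurrence."""
    return [(word.lower(), 1) for word in document.split()]


def map_with_combiner(document: str) -> List[Tuple[str, int]]:
    """Map followed by a local combine: one (word, count) pair per distinct word."""
    local_counts = Counter(word for word, _ in map_only(document))
    return list(local_counts.items())


def reduce_partials(outputs: List[List[Tuple[str, int]]]) -> Dict[str, int]:
    """Sum the counts for each word across the output of all mappers."""
    totals: Dict[str, int] = defaultdict(int)
    for pairs in outputs:
        for word, count in pairs:
            totals[word] += count
    return dict(totals)


doc = "to be or not to be"
pairs_without_combiner = len(map_only(doc))            # 6 pairs to shuffle
pairs_with_combiner = len(map_with_combiner(doc))      # 4 pairs to shuffle
totals = reduce_partials([map_with_combiner(doc), map_with_combiner("to do")])
```

This works for word count because addition is associative and commutative; a combiner is only safe when partial aggregation does not change the final result.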