College 10: NoSQL Flashcards
NoSQL
A NoSQL database is a type of database designed to handle and store a wide variety of data models, particularly suited for large-scale, distributed data storage and retrieval. Unlike traditional relational databases (RDBMS), which use structured query language (SQL) and table-based schemas, NoSQL databases offer more flexibility and scalability to accommodate diverse data types and large volumes of data.
Characteristics: non-relational, mostly open-source, cluster-friendly, built for the 21st-century web, schemaless
Components of data model
- Structures
- Constraints
- Operations
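Example: a minimal sketch of the three components in Python. The `Account` class, its fields, and its rules are hypothetical and only illustrate the idea: the fields are the structure, the validation is a constraint, and the method is an operation.

```python
from dataclasses import dataclass


@dataclass
class Account:
    # Structure: which fields exist and what type each one has.
    owner: str
    balance: int = 0

    def __post_init__(self):
        # Constraint: a rule every valid instance must satisfy.
        if self.balance < 0:
            raise ValueError("balance must not be negative")

    def deposit(self, amount):
        # Operation: a permitted way to read or change the data.
        if amount <= 0:
            raise ValueError("amount must be positive")
        self.balance += amount
        return self.balance


account = Account("ada", 10)
new_balance = account.deposit(5)
```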
Pros and cons of a relational database
Pros: good at managing structured data (reliable transactions, data integrity)
Cons: hard to scale horizontally, resource-intensive, poorly suited to graph data
Pros and cons of NoSQL
Pros: can scale both vertically and horizontally; the workload can be spread across servers (partitions)
Cons: limited data retrieval -> often only by primary key, with few or no filtering possibilities
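Example: a toy sketch of both points, assuming a simple hash-partitioned key-value store. `PartitionedKeyValueStore` and its methods are made up for illustration and do not correspond to any real product. A lookup by key touches exactly one partition, while a filter on any other field has to scan all of them.

```python
import hashlib


class PartitionedKeyValueStore:
    """Toy key-value store that spreads keys over in-memory partitions."""

    def __init__(self, num_partitions=3):
        self.partitions = [{} for _ in range(num_partitions)]

    def _partition_for(self, key):
        # Hash the key so that keys spread evenly over the partitions.
        digest = hashlib.sha256(key.encode("utf-8")).hexdigest()
        return int(digest, 16) % len(self.partitions)

    def put(self, key, value):
        self.partitions[self._partition_for(key)][key] = value

    def get(self, key):
        # Fast path: only one partition is consulted.
        return self.partitions[self._partition_for(key)].get(key)

    def scan(self, predicate):
        # No secondary index: filtering means visiting every partition.
        return [
            value
            for partition in self.partitions
            for value in partition.values()
            if predicate(value)
        ]


kv_store = PartitionedKeyValueStore()
kv_store.put("user:1", {"name": "Ada", "city": "London"})
kv_store.put("user:2", {"name": "Linus", "city": "Helsinki"})
kv_store.put("user:3", {"name": "Grace", "city": "New York"})

by_key = kv_store.get("user:2")
in_london = kv_store.scan(lambda user: user["city"] == "London")
```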
NoSQL types
Aggregate oriented:
Document (MongoDB), Column-family (HBase), Key-value
Schemaless: graph databases (Neo4j) and the rest
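Example: a rough sketch of how one order might look in each family, using plain Python structures as stand-ins. All names and values are invented for illustration; real systems such as MongoDB, HBase, or Neo4j each have their own APIs and storage formats.

```python
import json

# Document: the whole order is one nested aggregate.
order_document = {
    "_id": "order:1",
    "customer": "Ada",
    "items": [{"sku": "A1", "qty": 2}, {"sku": "B7", "qty": 1}],
}

# Key-value: the store only sees a key and an opaque value.
key_value = {"order:1": json.dumps(order_document)}

# Column-family: row key -> column family -> columns.
column_family = {
    "order:1": {
        "info": {"customer": "Ada"},
        "items": {"A1": 2, "B7": 1},
    }
}

# Graph: small records (nodes) connected by relationships (edges).
nodes = {
    "ada": {"label": "Customer"},
    "order:1": {"label": "Order"},
    "A1": {"label": "Product"},
    "B7": {"label": "Product"},
}
edges = [
    ("ada", "PLACED", "order:1"),
    ("order:1", "CONTAINS", "A1"),
    ("order:1", "CONTAINS", "B7"),
]


def neighbours(node, relation):
    """Follow all outgoing edges of one relation type from a node."""
    return [dst for src, rel, dst in edges if src == node and rel == relation]
```

The first three keep the order together as one unit (an aggregate); the graph form splits it into nodes so that relationships can be traversed, for example `neighbours("order:1", "CONTAINS")`.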
Schemaless
Items in a database do not need to be in the same format/structure
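Example: a small sketch of a schemaless collection, modelled here as a list of Python dicts. The field names are hypothetical. Each item carries its own set of fields, so the reading code has to cope with fields that are missing.

```python
users = [
    {"_id": 1, "name": "Ada", "email": "ada@example.com"},
    {"_id": 2, "name": "Linus", "languages": ["C", "Shell"]},
    {"_id": 3, "name": "Grace", "address": {"city": "New York"}},
]


def field_names(collection):
    """Collect every field name that appears in at least one item."""
    names = set()
    for item in collection:
        names.update(item.keys())
    return names


def emails(collection):
    """Read an optional field; items without it yield None."""
    return [item.get("email") for item in collection]
```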
ACID
- Atomic: Each transaction is all-or-nothing. It either completes fully or, if it fails, the database goes back to its state before the transaction started.
- Consistent: After a transaction, the database remains in a valid state, following all rules and constraints.
- Isolated: Transactions run independently. One transaction can’t affect others that are happening at the same time.
- Durable: Once a transaction is complete, the data is saved permanently, even if there’s a system crash or power failure.
These properties are primarily associated with traditional relational database management systems (RDBMS) and are essential for ensuring reliable transactions and data integrity.
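Example: a minimal sketch of atomicity and consistency using Python's built-in `sqlite3` module with an in-memory database. The `accounts` table, the names, and the amounts are assumptions made up for the illustration. The second transfer violates the `CHECK` constraint, so the whole transaction is rolled back, including the credit that had already been applied.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE accounts ("
    "name TEXT PRIMARY KEY, "
    "balance INTEGER NOT NULL CHECK (balance >= 0))"
)
conn.execute("INSERT INTO accounts VALUES ('alice', 100), ('bob', 50)")
conn.commit()


def transfer(conn, src, dst, amount):
    """Move money between accounts as one all-or-nothing transaction."""
    try:
        with conn:  # commits on success, rolls back if an exception is raised
            conn.execute(
                "UPDATE accounts SET balance = balance + ? WHERE name = ?",
                (amount, dst),
            )
            conn.execute(
                "UPDATE accounts SET balance = balance - ? WHERE name = ?",
                (amount, src),
            )
        return True
    except sqlite3.IntegrityError:
        return False


def balances(conn):
    return dict(conn.execute("SELECT name, balance FROM accounts ORDER BY name"))


ok = transfer(conn, "alice", "bob", 30)       # succeeds
failed = transfer(conn, "alice", "bob", 500)  # would overdraw alice: rolled back
```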
BASE
- Basically Available: Data is always accessible because it is spread out and duplicated across many nodes in the database.
- Soft State: Data might change over time because the database doesn’t enforce immediate consistency, leaving it to developers to manage consistency.
- Eventually Consistent: The database will become consistent eventually, but in the meantime, you can still read data, even if it’s not completely up-to-date.
These properties are often associated with NoSQL databases and systems designed for high availability and scalability, which may relax some ACID properties to achieve better performance and availability.
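Example: a toy simulation of eventual consistency, assuming three in-memory replicas and a replication queue. `EventuallyConsistentStore` is hypothetical and leaves out everything a real system needs (conflict resolution, failures, ordering). A write is accepted at once on one replica, other replicas may serve stale data for a while, and all replicas agree after replication catches up.

```python
from collections import deque


class EventuallyConsistentStore:
    def __init__(self, num_replicas=3):
        self.replicas = [{} for _ in range(num_replicas)]
        self.pending = deque()  # queued (replica_index, key, value) updates

    def write(self, key, value):
        # Basically available: accept the write on replica 0 right away.
        self.replicas[0][key] = value
        # Replication to the other replicas happens later.
        for index in range(1, len(self.replicas)):
            self.pending.append((index, key, value))

    def read(self, key, replica):
        # Soft state: the answer depends on which replica is asked and when.
        return self.replicas[replica].get(key)

    def sync(self):
        # Eventually consistent: apply all queued updates.
        while self.pending:
            index, key, value = self.pending.popleft()
            self.replicas[index][key] = value


ec_store = EventuallyConsistentStore()
ec_store.write("x", 1)
stale = ec_store.read("x", replica=2)  # replica 2 has not seen the write yet
ec_store.sync()
fresh = ec_store.read("x", replica=2)  # now up to date
```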
CAP theorem: can’t have them all
- Consistency (C): do all applications see the same data?
  o Every read receives the most recent write or an error.
  o All nodes in the system see the same data at the same time.
- Availability (A): if some node fails, does everything still work?
  o Every request receives a response, without guarantee that it contains the most recent write.
  o The system remains operational 100% of the time.
- Partition Tolerance (P): if your nodes can’t talk to each other, does everything still work?
  o The system continues to operate despite arbitrary partitioning due to network failures.
  o The system can handle network failures or message losses between nodes.
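Example: a simplified sketch of the trade-off during a network partition, assuming a hypothetical two-node cluster where writes arrive at node `a` and reads go to node `b`. In "CP" mode the cluster refuses requests it cannot keep consistent; in "AP" mode it keeps answering but may return stale data.

```python
class TwoNodeCluster:
    def __init__(self, mode):
        if mode not in ("CP", "AP"):
            raise ValueError("mode must be 'CP' or 'AP'")
        self.mode = mode
        self.a = {}  # data held by node a
        self.b = {}  # data held by node b
        self.partitioned = False  # True when a and b cannot communicate

    def write(self, key, value):
        if self.partitioned:
            if self.mode == "CP":
                # Consistency first: refuse rather than let the nodes diverge.
                raise RuntimeError("unavailable: cannot replicate the write")
            # Availability first: accept locally; node b is now stale.
            self.a[key] = value
            return
        self.a[key] = value
        self.b[key] = value

    def read_from_b(self, key):
        if self.partitioned and self.mode == "CP":
            raise RuntimeError("unavailable: cannot confirm the latest value")
        return self.b.get(key)


def demo(mode):
    cluster = TwoNodeCluster(mode)
    cluster.write("x", 1)       # replicated to both nodes
    cluster.partitioned = True  # the network link between a and b fails
    try:
        cluster.write("x", 2)
        return cluster.read_from_b("x")
    except RuntimeError:
        return "error"
```

Here `demo("AP")` returns the stale value 1, while `demo("CP")` returns `"error"`: with a partition in place, the cluster gives up either consistency or availability.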