NoSQL Flashcards
what are the 4 types of data
structured and unstructured
dynamic and static
dynamic
changing frequently
static
never changes
structured
follows a formal, predefined format
easy to store and process
unstructured
e.g. audio, image, music
usually still has internal structural properties
sharding data
splitting the data to allow concurrent/parallel access using multiple machines
can simultaneously access each shard
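A minimal sketch of how records might be routed to shards, assuming hash-based sharding (the cards do not say which strategy is used). The shard count and the names `shard_for`, `put` and `get` are hypothetical, and each shard is modelled as an in-memory dict rather than a separate machine.

```python
import hashlib

NUM_SHARDS = 4  # assumed shard count, for illustration only


def shard_for(key: str, num_shards: int = NUM_SHARDS) -> int:
    """Map a record key to a shard index using a stable hash."""
    digest = hashlib.md5(key.encode("utf-8")).hexdigest()
    return int(digest, 16) % num_shards


# Each shard is a plain dict here; in a real system each would be its own machine.
shards = [dict() for _ in range(NUM_SHARDS)]


def put(key: str, value) -> None:
    """Store a value on the shard that owns the key."""
    shards[shard_for(key)][key] = value


def get(key: str):
    """Read a value by asking only the shard that owns the key."""
    return shards[shard_for(key)].get(key)


put("user:1", {"name": "Ada"})
put("user:2", {"name": "Grace"})
```

Because a key always hashes to the same shard, reads for different keys can be served by different shards at the same time.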
in which two ways can we scale databases
vertically and horizontally
vertical scaling
upgrading hardware e.g. increasing memory
what is the limitation of vertical scaling
limited by the amount of CPU, RAM, disk, etc. that can be configured on a single machine
horizontal scaling
adding more machines, which requires sharding and replication so you can work with them simultaneously
what is the limitation of horizontal scaling
read-to-write ratio and communication overhead
in which three ways can we benefit from parallelisation
maximise the fraction of the program that can be parallelised
balance the workload across the parallel processes
minimise the time spent on communication
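The first point can be made concrete with Amdahl's law, which is not named in the cards but is the usual way to quantify it: with a parallelisable fraction p and n workers, speedup = 1 / ((1 - p) + p / n). The sketch below ignores communication cost, and the function name is hypothetical.

```python
def amdahl_speedup(parallel_fraction: float, workers: int) -> float:
    """Ideal speedup when only part of a program can run in parallel."""
    if not 0.0 <= parallel_fraction <= 1.0:
        raise ValueError("parallel_fraction must be between 0 and 1")
    if workers < 1:
        raise ValueError("workers must be at least 1")
    serial_fraction = 1.0 - parallel_fraction
    # The serial part takes the same time no matter how many workers there are.
    return 1.0 / (serial_fraction + parallel_fraction / workers)


# Raising the parallel fraction helps more than adding workers to a mostly serial program.
mostly_serial = amdahl_speedup(0.5, 16)
mostly_parallel = amdahl_speedup(0.95, 16)
```

With half the program serial, the speedup can never reach 2 however many workers are added.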
how does the two phase commit protocol work
the coordinator requests a vote to commit and each participant either approves or rejects
if all participants accept then everything gets committed at the same time; if any participant rejects, the transaction is aborted
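The two phases can be sketched as a small in-memory simulation. The `Participant` class and `two_phase_commit` function are hypothetical names, there is no real network or failure handling, and the abort branch follows standard two-phase commit behaviour.

```python
from typing import List


class Participant:
    """A server taking part in the transaction."""

    def __init__(self, name: str, can_commit: bool = True) -> None:
        self.name = name
        self.can_commit = can_commit
        self.state = "init"

    def prepare(self) -> bool:
        """Phase 1: vote yes (ready to commit) or no."""
        self.state = "ready" if self.can_commit else "aborted"
        return self.can_commit

    def commit(self) -> None:
        self.state = "committed"

    def abort(self) -> None:
        self.state = "aborted"


def two_phase_commit(participants: List[Participant]) -> bool:
    """Run both phases as the coordinator; return True if committed."""
    # Phase 1: the coordinator collects a vote from every participant.
    votes = [p.prepare() for p in participants]
    # Phase 2: commit only on a unanimous yes, otherwise abort everywhere.
    if all(votes):
        for p in participants:
            p.commit()
        return True
    for p in participants:
        p.abort()
    return False
```

A single participant that cannot commit is enough to abort the transaction for everyone.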
what is the issue with two phase commit
hard to find a time where all servers are ready to commit
what is the CAP theorem
any distributed database with shared data can guarantee at most 2 of the 3 properties (consistency, availability, partition tolerance)
usually sacrificing consistency
what are the three components of cap theorem
consistency; every node always sees the same data at the same time
availability; the system continues operating even if nodes crash or software or hardware is down
partition tolerance; the system keeps working even when the network splits and some nodes cannot communicate with each other
what are the BASE properties
basically available; the system guarantees availability
soft state; system state may change over time
eventual consistency; will eventually become consistent
what does it mean for a db to be eventually consistent
all replicas will gradually become consistent in the absence of updates
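A toy illustration of replicas converging once updates stop. It assumes a last-write-wins policy based on a version number, which is only one of several ways real systems reconcile replicas; `Replica` and `sync` are hypothetical names.

```python
class Replica:
    """One copy of a single value, tagged with a version number."""

    def __init__(self) -> None:
        self.version = 0
        self.value = None

    def write(self, value, version: int) -> None:
        # Keep only the newest version (last-write-wins, an assumed policy).
        if version > self.version:
            self.version = version
            self.value = value


def sync(replicas) -> None:
    """One synchronisation round: copy the newest version to every replica."""
    newest = max(replicas, key=lambda r: r.version)
    for r in replicas:
        r.write(newest.value, newest.version)


replicas = [Replica(), Replica(), Replica()]
replicas[0].write("v1", version=1)  # the update reaches only one replica
stale_read = replicas[2].value      # another replica has not seen it yet
sync(replicas)                      # no further updates arrive, so replicas converge
```

Between the write and the sync, a read can return stale data; that window is what is traded for availability.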
what makes NoSQL NoSQL
no strict schema requirements
no strict adherence to ACID properties
consistency is traded in favour of availability
document database/store
loosely structured sets of key-value pairs stored in documents, which encapsulate and encode data in a standard format/encoding (e.g. JSON, BSON, XML)
treated as a whole
query languages can help retrieve documents based on their contents
addressed in the db via the unique key
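A minimal sketch of the idea, assuming JSON as the encoding and a plain dict as the store. The helper names `put_doc`, `get_doc` and `find_docs` are hypothetical and are not any real database's API.

```python
import json

store = {}  # unique key -> JSON-encoded document


def put_doc(key: str, doc: dict) -> None:
    """Encode the whole document and store it under its unique key."""
    store[key] = json.dumps(doc)


def get_doc(key: str) -> dict:
    """Address a document directly via its key."""
    return json.loads(store[key])


def find_docs(field: str, value) -> list:
    """Retrieve the keys of documents whose contents match field == value."""
    return sorted(k for k, raw in store.items() if json.loads(raw).get(field) == value)


put_doc("u1", {"name": "Ada", "city": "London"})
put_doc("u2", {"name": "Grace"})                  # no strict schema: fields can differ
put_doc("u3", {"name": "Alan", "city": "London"})
```

`get_doc` is the lookup by key, while `find_docs` plays the role of a content-based query.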
in mongo what is the primary key
the “_id” field
sorted ordered column-oriented stores
columns are grouped into column families, and data is stored in these rather than in tables
each unit of data is a set of key value pairs identified by row-key
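One way to picture this layout is as nested dictionaries: row-key, then column family, then column. The table contents and helper names below are made up for illustration.

```python
# row-key -> column family -> {column: value}
table = {
    "user2": {"profile": {"name": "Grace"}, "activity": {}},
    "user1": {
        "profile": {"name": "Ada", "city": "London"},
        "activity": {"last_login": "2024-01-01"},
    },
}


def read_family(row_key: str, family: str) -> dict:
    """Return the key-value pairs of one column family for one row."""
    return table.get(row_key, {}).get(family, {})


def rows_in_order() -> list:
    """Row-keys in sorted order, as a sorted ordered store would keep them."""
    return sorted(table)
```

Rows do not need the same columns: `user2` has no `city` and an empty `activity` family.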
graph db
everything is stored as an edge, a node or an attribute
each node and edge can have any number of attributes and can be labelled, which narrows searches
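A small sketch of labelled nodes and edges with attributes, using plain Python structures. The data and the helpers `nodes_with_label` and `neighbours` are hypothetical; real graph databases provide their own query languages.

```python
nodes = {
    "n1": {"label": "Person", "attrs": {"name": "Ada"}},
    "n2": {"label": "Person", "attrs": {"name": "Grace"}},
    "n3": {"label": "City", "attrs": {"name": "London"}},
}

edges = [
    {"src": "n1", "dst": "n2", "label": "KNOWS", "attrs": {"since": 2020}},
    {"src": "n1", "dst": "n3", "label": "LIVES_IN", "attrs": {}},
]


def nodes_with_label(label: str) -> list:
    """Labels narrow a search to one kind of node."""
    return [node_id for node_id, node in nodes.items() if node["label"] == label]


def neighbours(node_id: str, edge_label: str) -> list:
    """Follow only the outgoing edges that carry the given label."""
    return [e["dst"] for e in edges if e["src"] == node_id and e["label"] == edge_label]
```

Filtering by edge label means a search for who `n1` knows never touches the `LIVES_IN` edge.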
what do document dbs use instead of a foreign key (fk)
embedded documents and referencing
what can be a value in document db
any data type
references
storing links to one document inside another, which normalises the db
what are some benefits of using referencing
can represent more complex many-to-many relationships
good for large hierarchical datasets
what is a negative of using referencing
requires follow up queries to find all the data you need
embedded data
nesting a doc inside another, e.g. via an array
embedded data positive
can get all the data in one call, using fewer queries
negative of embedded data
the db isn’t normalised and not all values are atomic
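The trade-off between the two approaches can be sketched with plain dicts standing in for collections. All names and data here are made up, and each dict lookup stands in for one query.

```python
# Referencing: the book stores only the author's key.
authors = {"a1": {"name": "Ada"}}
books_ref = {"b1": {"title": "Notes", "author_id": "a1"}}


def book_with_author_ref(book_id: str) -> dict:
    book = books_ref[book_id]            # first query
    author = authors[book["author_id"]]  # follow-up query to resolve the reference
    return {"title": book["title"], "author": author["name"]}


# Embedding: the author documents live inside the book, in an array.
books_embedded = {"b1": {"title": "Notes", "authors": [{"name": "Ada"}]}}


def book_with_author_embedded(book_id: str) -> dict:
    book = books_embedded[book_id]       # a single query returns everything
    return {"title": book["title"], "author": book["authors"][0]["name"]}
```

With referencing, the author's name is stored once and shared; with embedding, it would be repeated in every book by that author.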
data model
displays a set of tables and the relationship between them providing a blueprint so you can identify which data is important and what should be maintained
which two parts of the CAP approach does mongodb focus on
consistency and partition tolerance
what are the different parts of the mongodb structure and how do they relate to each other
an instance has 0/more databases
a database has 0/more collections
a collection has 0/more documents
a document has 1/more fields/attributes
what is mongosh
an interactive shell that is a fully functional JavaScript interpreter
in mongo what happens when you USE a db but it doesn't exist
mongo switches to it anyway and creates it once data is first stored in it
db.dropDatabase()
deletes the current db