Managing and querying DB Flashcards
Concurrency control schemes
Control the interactions among the concurrent transactions in order to prevent them from destroying the consistency of the database
ACID transactions
- Atomic: Either the whole transaction is run, or nothing
- Consistent: Database constrainst are preserved
- Isolated: Different transactions may not interact with each other
- Durable: Effects of a transactions are not lost in case of a system crash
Transaction concept
- A transaction is a unit of pgrogram execution that accesses and possibly updates various data items.
- A transaction must see a consistent databse
- During transaction execution the database may be inconsistent
- When the transaction is committed, the database must be consistent
- Two main issues to deal with
- Failures of various kinds such as hardware failures and system crashes
- Concurrent exection of multiple transaction
Implicit schema
- document can be stored as it is without having to define a schema for it
- without checking that it conforms to a schema.
- An explicit schema requires the schema to be defined and that documents are compliant before they can be stored successfull
Charactaristics of NoSQL
- Non-relational
- Cluster-friendly
- schema-less
Impedance mismatch problem
- The programming model of data doesn’t match the database model of
- This led to the rise of object-oriented databases in the 1990s.
Aggregate-orientation
- Nested hierarchical structure
- Aggregate orientation fits naturally with clusters
- Can store a whole aggreagte on a single node
- For applications where we need to slice and dice data in diferent ways, RDBMS and graph databases are more appropriate

Which of the NoSQL is most different ot the others ?
- Graph database
- Other are aggregate oriented (Column-family, Document databases, and key-value)
polyglot persistence
- Different kinds of data are best dealt with different data storage technologies.

Sharding
- Taking one copy of the data and splitting it across many machines
- Nothing is shared
Technical debt
Implied cost of additional rework caused by choosing an easy (limited) solution now instead of using a better approach that would take longer.
CAP theorem
it is impossible for a distributed data store to simultaneously provide more than two out of the following three guarantees:
- Consistency: Every read receives the most recent write or an error
- Availability: Every request receives a (non-error) response, without the guarantee that it contains the most recent write
- Partition tolerance: The system continues to operate despite an arbitrary number of messages being dropped (or delayed) by the network between nodes
- Single-server RDBMS is a “CA” system
Difference between relation and relationship
Relation refers to a table in a relational model based database while relationship refers to how two tables are connected
Relational DBMS
- Tabular structure
- Foundation in set theory
Database implementation topics
- Database sytem architecutre
- Block buffer
- Transactions and concurrency
- Recovery
- Indexes
- Query processing and optimisation
Three-level schema
- External:
- Conceptual: Logical level
- Interal: Storage/physical level
Data independence (Logical and physical)
- Logical data independence
- Conceptual schema must be able to evolve without having to change external application program
- Physical data independence
- Must be able to substitute a different physical sotrage schema for an existing one without having to change external application programs
Data models
- “No data model”
- Flat files
- “Classical” data models
- Hierarchical
- Network
- Relational
- Semantic data models
- Entity-Relatoinship model
- Functional dat model
- SDM
- Objet oriented
- NoSQL
Failure classifcation in DB
-
Transaction failure
- Logical errors: transaction cannot complete due to some internal error condition
- System errors: the DB system must terminate an active transaction due to an error condition (e.g deadlock)
-
System crash: a power failure or other hardware or software failure causes the system to crash
- fail-stop assumption: non-volatile storage contens are assumed to not be corrupted by system crash
- Disk failure: a head crash or similiar disk failre destroy all or part of disk storage

D. input() and output()

C. Main memory
Locking
- A lock is a mechanism to control concurrent access to a data item.
- There is potential for deadlock; transactions(s) must be rolled back to release lock(s) and resolve the deadlock
-
Starvation is also possible e.g
- A transaction may be waiting for an exclusive lock on an item, while a sequence of other transactions request and are granted shared locks on the same item.
Database system architecture


B. False
























