L1: Principles of Database Systems Flashcards
Major Disadvantages
of a Traditional File-Processing System
for Storing Data
- Data redundancy and Inconsistency
- Difficulty in Accessing Data
- Data Isolation
- Integrity Problems
- Atomicity Problems
- Security Problems
- Concurrent-Access Anomalies
Database Consistency
Consistency:
An action results in a state which conforms to all integrity constraints
- Transactions must leave the database in a consistent state
- Users write integrity constraints that restrict possible transactions
- DBMS still doesn’t understand the meaning of the restraints,
- Responsibility rests with the user
Atomicity
An action that is atomic must either:
- Complete entirely or
- Not complete at all
(Original Pokemon cloning was possible due to trading NOT being an atomic operation)
Data Manipulation Language (DML) :
Two Basic Types
- Procedural DMLs
- Require a user to specify:
- What data is needed
- How to get the data
- Require a user to specify:
- Declarative DMLs
- Require a user to specify only what data is needed
- Do not need to specify how to get it
- Also called “Non-Procedural DMLs”
Types of
Schemas
-
Physical Schema
- Describes data layout
- Relations as unordered files
- Some data is in sorted order(index)
-
Logical Schema
- Defines Tables of Relations and how each relation is structured
- Example:
- students( sid : string, name : string, gpa : float)
-
External Schema(Views)
- How data might be viewed
- Derived from other tables
Two Major Components
of a
DBMS
- Entities
- Real world objects, such as students, courses, products, facilities, etc
- Relationships
- Connections that exist between the entities
- Examples:
- Product A is stored in Facility F
- Professor Smith teaches Physics II
Concurrent Transactions:
Locking
DBMS ensures that the
execution of multiple transactions, { T1, …, Tn}
is equivalent to some serial execution of the transactions.
Locking:
Before reading/writing, a transaction requires a lock from the DBMS, which it holds until it is finished.
Other transactions are blocked, and cannot read or write to the locked items until the first transaction is finished.
Write-Ahead Logging (WAL)
- One way that a DBMS ensures atomicity when a transaction crashes
- Key Idea:
- Keep a log of all the writes done
- If a crash occurs, partially executed transactions are undone using the log
- Before any action is finalized, a corresponding log entry is forced to disk
Data Independence:
Basic Idea
Applications do not need to worry about how the data is structured and stored.
The database can be structured or stored in many ways.
The application still accesses data the same way.
Data Independence:
Two Types
- Logical Data Independence
- Protection from changes in the logical structure of the data
- Should not need to ask:
- Can a new entity be written without rewriting the application?
- Physical Data Independence
- Protection from physical changes in layout
- Should not need to ask:
- Which disks are the the data stored on?
- Is the data indexed?
DBMS
Challenges with
Many Users
- Security
- Different users with different roles
- Performance
- Need to provide concurrent access
- Disk access is slow
- DBMS hide latency by doing more CPU work concurrently
- Consistency
- Concurrence can lead to update problems
- DBMS allows a user to write programs as if they were the only user
What is a
Transaction?
A Transaction (TXN)
is an Atomic sequence of database actions(read/writes)
- When transferring data from one place to another, the transaction must be completed, or it will not happen at all (atomic)
- Leaves the Database in a consistent state
Schema:
Definition
Schema
A description of a particular collection of data,
using the given data model
- Every relation in a relational data model has a schema which describes its types and their names
DBMS:
Overview
Database Management System
- A piece of software designed to store and manage databases
- A large, integrated collection of data
- Models a real-world enterprise
Data Model
A collection of concepts for describing data
The Relational Model of Data
is the most widely used data model
-Main concept is the relation (basically a table)