Exam 1 Flashcards
Functions of a Database Management System (DBMS)
Enables users to create and maintain a database. Allows for:
- Defining: specifying data types, structures and constraints
- Constructing: storing the data on some storage medium controlled by the DBMS
- Manipulating: querying the database to retrieve data, updating to reflect changes in the miniworld, generating reports
- Sharing:allows multiple users or programs access
Query vs. transaction
Query causes data to be retrieved
Transaction causes data to be read or written
Other important functions of a DBMS
Protection: System and security protection
Maintenance: Allow the system to evolve as requirements change
Database System
Database + DBMS
Database Structure
File = Collection of records
Data records = Collection of data elements
Data elements = the different pieces of data
Data types = the specific type of each piece
Database Approach vs. File-Processing Approach
File-processing refers to old practice of each user or application having its own data.
Database approach is better because:
- Self-describing nature of a database system
- Insulation between programs and data, and data abstraction (i.e. program-data independence)
- Support of multiple views of the data
- Sharing of data and multiuser transaction processing
Meta-data
The complete definition of database structure and constraints.
Structure of each file, the type and storage format for each data item, various constraints on the data.
Self-describing data
Data stored as item names and values in one structure. Used in NOSQL
Program-Data Independence
The structure of the data files is stored in the DBMS catalog, separately from the access programs (so a change to the data structure doesn’t break the access programs functioning)
Program-Operation Independence
An operation = method or function
An “interface” to an operation = the name of the method and parameters it takes.
The “implementation” of the operation = The part that takes action. Because the operation can change like a black box, you can change the workings but use the same interface. This is basically data-abstraction.
Data Abstraction allows Program-Operation Independence.
What allows program-data independence and program-operation independence?
Data Abstraction
Conceptual Representation (i.e. how the user might imagine the data or operations to work) ignores the implementation details required to actually perform operations.
Data Model
The conceptual representation of the data that hides implementation details
Catalog
Data might be stored by length and byte references.
Catalog turns that into the rows and columns we understand.
Online Transaction Processing (OLTP)
Ensure that, booking airline seats, say, an agent can only access the same seat on at a time.
Is basically concurrency control software.
Transactions and their 2 properties
Accessing or updating the database.
Must execute correctly while thousands of other transactions take place.
“Isolation” ensures each transaction holds, separate from others.
“Atomicity” ensures either all operations in the transaction takes place or none do.
Redundancy - problems with
1) Have to perform updates on multiple systems
2) Wastes extra storage space
3) “Inconsistency” - two pieces of data that should be the same are different, because one was updated differently from the other
Data Normalization
Norm of storing a giving piece of information (like name) in only one place in the database.
Controlled Redundance
“Denormalization” of data (put in multiple places) so that it can be retrieved all from one file, for example.
Can control against inconsistency by running checks the data hasn’t drifted.
Database Schema
Description of the database.
Specified during design, it is expected not to change.
Steps of database design
1) Collect and analyse requirements - Getting user-defined operations/transactions that will be applied to the database
2) Create a conceptual schema - concise description of the data requirements
3) Logical design/data model mapping - actual implementation of the database
4) Physical design - internal storage structures, file organizations indexes, access paths and physical design parameters
Entity
A thing or object with real world and independent existence.
*Attributes* describe properties of an entity. e.g. EMPLOYE entity maybe described by name, age, address, etc.
Types of Entity Attributes
Simple (or atomic) = can’t be broken down vs…
Composite = Can be divided into parts (e.g. parts of an address)
Single Valued = age (only one value at a time) vs…
Multi-valued = colors on a car (i.e. can hav multiple values at once)
Stored (birth date) vs…
Derived (age derived by birth date)
Entity Type vs. Entity Set/Collection
Entity type defines (is the name of) a collection of entities with the same attributes (the intension).
Entity set/collection is a set of actual entities that are part of the same entity type (the extension).
Key Attribute
The attributes that, together, establish the uniqueness constraint (i.e. that an entity is the one they say it is).
No concept of a “primary key” like in relational database (even though these keys can perform that role).
Meta data
The structure and constraint info that is stored in the catalog so the dbms can refer to the schema
Intension vs extension
Intension is Schema.
Extension is DB state.
Three-Schema architecture
Internal Level or Schema = physical storage and access paths.
Conceptual Level = structure of DB for users (e.g. columns and types).
View or external level = hides part of db so user only sees what’s needed.
The DBMS does “mapping’s” between these layers
Logical data independence
The ability to change the conceptual Schema (i.e. the columns and types) without changing the view or application schema.
Hard, because the application usually is referring to things as defined in the conceptual Schema.
Physical Data Independence
Capacity to change the internal Schema (e.g. how the data is laid out on the drive) without changing the conceptual schema.
Easier because data abstraction allows conceptual schema to stay the same. If conceptual schema is the same, an application doesn’t have to change.
How to achieve logical and physical data independence?
The levels don’t have to change, just the *mappings*
Data Definition Language (DDL)
If no clear levels, DDL defines conceptual and internal schemes.
If there are clear levels, it defines conceptual.
Impedance Mismatch Problem
Data structure of the DBMS not matching that of the programming language
Steps of database design
1) Collect and analyse requirements - Getting user-defined operations/transactions that will be applied to the database
2) Create a conceptual schema - concise description of the data requirements
3) Logical design/data model mapping - actual implementation of the database
4) Physical design - internal storage structures, file organizations indexes, access paths and physical design parameters
Functional Data Model
When two entities represent their relationship by having each other as attributes.
For example: DEPARTMENT might have employee and EMPLOYEE might have department.
These have to be inverses of each other.
Recursive or self-referencing relationships
When the same entity is on both sides of a relationship. For example: SUPERVISION has two employee entities involved: one for employee and one for supervisor
Add elements on diagram:
1) Add arrows to show direction of the relationship
2) Add roles names to name what role each party plays in that relationship
Cardinality Ratios
The constraint on a binary relationship that mimics the “miniworld” constraint. For example: in WORKSFOR the ratio of DEPARTMENT:EMPLOYEE of 1:N means that an employee can only work for one department but many employees can work for a department.
It is always the max number.