TOPIC 2 - Databases Flashcards
what are the 2 ways to store data digitally?
flat files (one large table) relational database (many smaller tables)
benefits of a flat file
simple to create
easy to understand
all data in one place
good for one person with small amount of data
key question of flat files
how are the needs of storing data for one person diff from needs of storing data for a large company
goal of flat files
accurate, timely, relevant information
limitations of flat files
lack of security lack of multiple access lack of data integrity lack of scalability program data dependance lack of custom formats (view)
what is a database?
a collection of related information stored in a structured form
what does schema describe
structure of a database
what is a database management system
a collection of programs that manipulate a database: updates, queries, etc
why use databases and dbms?
Data integrity -> reduce redundancy and inconsistency
Data independence from the program
Security
Enable data sharing and high availability
Provide centralized data administration
Reduce application development time because standard software packages exist in the market for these tasks
name the 4 common types of databases
hierarchy, network, relational, object oriented
explain a hierarchy model
model data as a tree
Captures relationship among the data (parent can have many children)
explain a network model
model data as a network
A child can have multiple parents
Faster than relational
Not as common
explain object oriented model
model data as objects
Stores both the type of data and procedures that manipulate the data
Becoming more popular
explain inheritance vs polymorphism
Inheritance is when an instance of a sub-class receives the same properties of its super-class
Polymorphism is when a sub-class implements a method which can override the one defined in the super-class
explain relational model
modelled as a table and information is related
what is an attribute/field
single value of a column (headers)
what is a domain?
set of allowed values for an attribute
what are entities?
a table stores information about entities (rows in a table)
people, places, things, or events
what is a record/tuple
a collection of attribute values for an entitity
what is a primary key
A minimum set of attributes whose values are unique in each row of a table
what is a composite key?
when it takes 2 or more attributes to uniquely identify a row
what are candidate keys?
when there is more than one key that can be used as a primary key and one is designated as the primary key
what is a foreign key?
An attribute in a table that is also a primary key in another table and is used to link different tables together
what do you do if your database is growing so large that you need to split it over multiple hard drives?
Separated how the data is stored (physical schema from how the data is used (exetrnal schema)
what do you do if different users are interested in different parts of the database?
Create a single global view of the data (conceptual schema) that feeds into many individual views of the data (external schema) for the different user groups
what are the 3 schema architecture?
external, conceptual, physical
explain external schema
How the data is displayed to a particular user
Different views for different user groups
The rest of the database is hidden from that user
explain conceptual schema
A global description of the whole database
Unbiased towards any particular group of users
explain physical schema
How the data is physically stored and organized
What data is in which file on which disk
why have 3 layers or schema?
data independence: separation of storage, organization and presentation
what is the motivation of 3 layers for schema?
change the data format without changing the software
what is SQL?
structured query language
Most commonly used language to create, manage, and query a database
list the main operations of SQL
select, join, project
explain data manipulation and definition
manipulation: the contents of a database can be accessed using a data manipulation langage which specifies the contents to extract
definition: database contents must be clearly defined using a data defn language which specifies the type of each attribute/field/column heading
what are the limitations of relational databases
multimedia data
arrays of data
unstructured text
hierarchical data
what are the criteria for a good database design
Correctness: info is accurate
Completeness: it characterizes all the data
Minimum redundancy: unfortunately this cannot be completely elimninated in all cases
what are the 3 steps in database design for relational databases
- Identify the entities to store and the relationships between these entities
- Convert the ER diagram into tables
- Fine tune your design
what are the 3 cardinalities of relationships? Where do you put the primary key?
1: 1 relationships: place the primary key from one entity and any attributes of that relationship into the other entity’s table
1: N relationships: place the primary key from the “1” side of the relationship to the “N” entity’s table
N:M relationships: create a new table with composite key (primary keys from both entities) and include associated data
how are relationships represented in a tablle?
foreign keys in one entity’s table
get their own table
main purpose of normalization process?
minimize redundancy
explain Boyce-Codd Normal Form
every attribute for an entity depends only on the candidate key(s)
explain functional dependency A -> (B,C)
The value of A determines the values of B and C
Knowing the value of A, there is only one possible value for B and C in the database
Abilities of a relational database
Record data Search for an item Select and project Join Summarize
Limitations of a relational database
Statistical analysis: what if, forecasting, correlation
Discover relationships, patterns, trends