MGS 351 Exam 2 Databases and Data Warehouses Flashcards
Data Hierarchy
Database Table (File, relation) Records ( Rows, Tubles) Fields (Columns, Attributes) Bytes Bits
Issues in Traditional File Environment (1950’s)
(1) Data redundancy- no right way of managing data
(2) Data Inconsistency- Bad Data
(3) Data Isolation (some departments have certain info while other do not)
(4) Data Integrity (is the information valid?)
(5) Security
(6) Application/Data Dependence
- Programs written on top of data files were very iffy because they had to follow very specific organizational structures and modification would result in error
- basically they had to rewrite programs to add updates
- towards the end— it was hard to keep up- time consuming
Database Approach Advantages
(1) Minimal data redundancy
(2) data consistency
(3) Integration of data
(4) Sharing of data
(5) Uniform security, Privacy, and integrity
(6) ** Data independence (can add fields, tables, etc…)
(7) Centralized Environment
Data Definition Language (DDL)
Creates Structure
- Specifies content and structure of database and defines each data element
- You can use the design view
Data Manipulation language (DML)
Manipulates data in a database
- Inserting records-
- Deleting records
Data Dictionary
Stores definitions of data elements and characteristics
-Keeps track of what the user has built
-Backing up a dictionary creates an empty skeleton
—Having a “blueprint of the structure”
—Does not have the data but it does have the structure of the database.
Example: stores properties and fields.
Does NOT store records
Evolution of Databases
(1) Traditional files (1960’s)
(2) Hierarchal (1970’s)
(3) Objective oriented (new) -people creating private ones
(4) Relational (this is what we use now) its more flexible
Relational Database represents Two-dimensional tables called
Relations
Relational Database relates data across tables based on…
Common data elements (Primary key links to foreign keys)
Examples: DB2, Oracle, MS SQL server
Hierarchal Database
Used to break one component into subgroups
Network Database
Depicts data logically as many-to-many relationships (no flexibility)
“Hard to get data our of there”
Database Design
-Conceptual/Logical Design
Abstract model of database from a business perspective
Database design
-Physical design
How data is arranged and optimized on storage devices
-Database administrators can modify how it is stored
Example: C-drive, Flash drive, etc…
-All about performance. Don’t touch the index options(primary key) because its slows inserting, updating data and query are faster
Business Intelligence need to have knowledge in…
(1) Customers
(2) Competitors
(3) Partners
(4) Competitive Environment
(5) Internal Operations
Database Administrators are responsible for…
Technical operations
Data Administrators are responsible for…
Making sure data is stored correctly and consistently
-Defining and Tracking
Online Transaction Processing (OLTP)
Gathering of input information, Processing the information, and updating existing information to reflect the gathered and processed information
-Operational database for day-to-day operations
Example: who is in a certain class?
Online Analytical Processing (OLAP)
The Manipulation of information to support decision making
-Takes data from OLTP, Summarizes it, and stores it in a data warehouse
Example: How many people in total are in a class?
Data Warehouse
Logical collection of information - Gathered from many different operational databases (extract transformation Load(ETL)) - used to create business intelligence that supports business analysis and decision-making tasks
- Analyzing information by zip code not by address
- pre-summarized information (think of the ruby cube digram–also know as slice and dice)
- **Gets any data from many places–>summarizes it–>Uses it for decision making
Data Marts
Subset of data warehouse in which only a focused portion of the data warehouse information is kept
-Example SAT, GPA, Major —>Created information
Data Mining
Finding hidden patterns and trends in data
Requirements for Data Mining tools
Requirements for searching hidden info is…
-Query tools
-Intelligence agents
-Multidimensional analysis (OLAP “Slice and Dice”)
-Statistical analysis
Example: pattern that was identified showed a trend that baby diapers were commonly purchased with beer
“Duplicate” in a Distributed Database
Make a copy for every location
Downside: a centralized unit is needed to maintain a master file
Upside: Multiple backup files are created
“Partitioned” in a Distributed Database
Split up and distributed to relevant departments
The web is used to query Databases…
- Not fast enough for advanced analysis
- Static page: Server provides webpage from file
- Dynamic page: Server connects to a database, queries data, and custom builds the web page on the fly
Example: Ublearns->Go to a template page, enter username, record is found and the page is automatically updated
PHP
Open source software
ASP
Active server page
CFM
Cold fusion