SQL theory Flashcards
Database the same as excel?
DB not comparable to Excel, because databse tables are related to eachother.
database?
A collection of related tables, deisgned, maintained and utilized by multiple users with software to update and query the data.
consists of: data, hardware, software and users
DBMS
Software that controls the data
Relational database allows?
- Data to be grouped into tables
2. sets relationships between tables
What is superkey
key that uniquely identifies each row
What is a candidate key
superkey without unnecessary attributes
Why choose a seperate data warehouse?
- historic data (normal db do not store this)
- data consolidation (dq requires aggregation and summarization from multiple heterogenous sources
- data quality: different sources use inconsistent data represenations, codes and formats (that why ETL)
A data ware house is: (4)
- subject oriented: not focused on daily operations but on analysis for decision makers
- integrated: constructed by integrating multiple, heterogeneous data sources
- time variant: time horizon for DW is significantly larger than a regular db
- non-volatile: operational updates do not occur in dw
DW development methods:
- data mart approach (bottom up: simple, smaller dw’s)
(independent data marts, data mart bus architecture, canned dw, consistency achieved through conformed dimensions) - enterprise dw (top-down: centralized, more data)
(independent dw, hub and spoke data warehouse, federated dw)
ETl loading methods
- integral load
2. incremental/delta load
ETL generations
- do it yourself
- code generators
- ETL engines
OLAP density
in dw typically 25%
dashboard designed on three levels
- perception of elements in environment (what is happening around the company)
- comprehension of current situation (benchmarking)
- projection of future status
what is data?
Raw facts; internal or external; structured or unstructured
What is information
Organized data that has meaning