Data Management - Important Facts Flashcards
What is a data model?
A simple representation of a complex real-world object or event.
Or a simple representation of a data structre and its relationships.
What is functional dependence?
If attribute A determines the value of all the rows for attribute B.
Emphasis on all.
What is full functional dependence?
If the attribute (B) is functionally dependent on a composite key (A) but not on any subset of that composite key, the attribute (B) is fully functionally dependent on (A).
What is entity integrity?
That a primary key attribute can not have a row with a null value.
Each row is uniquely identified by the primary key.
What is referential integrity?
All values of a foreign key refer to existing values in another table.
What is data normalization?
A process for evaluating and correcting table structures to minimize data redunandcies, thereby reducing the likelihood of data anomalies.
-> Dividing everything into seperate tables as much as possibile to minimize redundancies.
How does a DSS look like from an architectual viewpoint?
- Model based management system that is connected to
- A knowledge engine
- A graphical user interface.
What are mandatory requirements for a datawarehouse process?
To turn operational data into information that can be used to support decision-making?
- Accesibility to users that are not familiar with data
- Intergration of data on the basis of a standard model
- Query flexibility to maximize the obtained advantages from existing information
- Information conciseness to allow for target-oriented and effective analyses
- Multidimensional representation to give users a manageable view of information
- Correctness and completeness of integrated data
What are three features of a data warehouse?
- It is subject-oriented
- It is integrated and consistent
- It shows its evolution over time and it is not volatile
What are the three main differences between an operational database and a datawarehouse?
- Operational data covers a short period of time, whereas a datawarehouse covers a period of time.
- Data in operational databases can be added, removed or modified. This does not happen in a data warehouse.
- Operational queries focus on a small number of tables and their relations. Data warehouse queries focus on huge amounts of data and queries.
What are 5 architecture properties that are essential for a data warehouse system?
- Separation
- Scalability
- Extensibility
- Security
- Administerability
Which main five types of system are distinguished in scientific literature that also include the beforementioned layers?
- Independent data marts architecture
- Bus architecture
- Hub-and-Spoke architecture
- Centralized architecture
- Federated architecture
What are five advantages of the two-layer architecture?
- Good quality information is always available, even if access to the sources is unavailable.
- Analysis queries do not affect the management of transactions
- Data warehouses are logically structured according to the multi dimensional model
- OLAP system have historical and summarized data as well - so you avoid a mismatch in terms of time and level of detail
- Data warehouses can be designed specifically to optimize performance
What are the advantages of the added ‘reconciled data’ layer?
- It creates a common reference data model for a whole enterprise.
- Sometimes the reconciled layer can be used to better accomplish operational tasks (instead of using the operational sources)
When does ETL happen?
When a data warehouse is populated for the first time. After that, it occurs every time the data warehouse is regularly updated.