Week 4 Flashcards by Joost Kok

What is the distinction between data, information and knowledge?

Data: Raw facts, building blocks for information
Information: Processed data so that it has meaning to the user
Knowledge: How to interpret information and make decisions based on it.

Data: Air pressure at a moment
Information: Time series of air pressure (shows how it evolves)
Knowledge: Deciding to shut off a gauge, but also knowing which gauge.

How can you obtain knowledge?

Training
Experience
Learning from others

What is a database?

A shared, integrated computer structure that stores data.

What two (simple) types of data can a database store?

End-user data
Metadata

What is metadata?

Data about the data, such as timestamps or who last modified a record.

What is a database management system (DBMS)?

Collection of programs that manages one or more databases
Also manages the structure of these databases
Example: Oracle Database

What is the main purpose of a DBMS?

To make data management more efficient and effective.

What are the advantages of a DBMS?

Better access to better-structured data for end-users
Integrated view of operations
Fewer data inconsistencies, thanks to normalization
Quick answers through ad-hoc queries

What are the disadvantages of a DBMS?

Everything needs to be integrated
Might not be efficient with big data

What are two types of databases?

Transactional (operational) databases
Data warehouses

What is a transactional database?

A database that records meaningful business events (transactions)
Used to support daily business operations
Example: SAP

What is a data warehouse?

A database that stores data to generate information to make tactical and strategic decisions

What are the key characteristics of a data warehouse?

Subject oriented
Large
Historic data
De-normalized data
Batch updates
Complex queries

What are the key characteristics of a transactional database?

Transaction oriented
Small
Current data
Normalized data
Continuous updates
Simple to complex queries

What are the main data abstraction levels?

According to the lecture there are three, but only two were discussed:

Physical model
Conceptual model

What is the physical model data abstraction layer?

The data as it is stored: as it was collected and uploaded to the database. At this level the data carries no meaning; it is just a set of tables. The information manager can modify this layer without end-users noticing.

What is the conceptual model data abstraction layer?

The layer that deals with the meaning of the data. As an end-user you can get different interface screens and call up only the information you are interested in. You do not see, or need to care, how the database is structured or stored.

Why is database design important?

It defines the expected use of the database
You can avoid redundant data
Poor design leads to errors, which in turn lead to poor decision making

What is a relational database?

The standard in database design. It is built up from one or more tables, with a focus on the relations between entities to avoid data redundancies.

What are the key components in a relational database?

Entity: A category or object (say: student)
Attribute: A characteristic that describes an entity in a table (say: name, class, age)
Key attribute: The attribute by which individual instances of an entity are told apart (say: student number)

What kind of keys are there in a relational database?

Candidate key: An attribute (or set of attributes) that could serve as the primary key
Primary key: The candidate key chosen to identify an entity
Foreign key: An attribute in table B that is the primary key in table A; this is the way to link tables
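
The three kinds of keys can be tried out with Python's built-in sqlite3 module. This is only a minimal sketch: the student and enrollment tables, the email column and the sample values are made up for illustration and do not come from the lecture.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# SQLite only enforces foreign keys when this pragma is switched on.
conn.execute("PRAGMA foreign_keys = ON")

conn.execute("""
    CREATE TABLE student (
        student_number INTEGER PRIMARY KEY,  -- primary key
        email TEXT NOT NULL UNIQUE,          -- candidate key: unique, but not chosen
        name TEXT NOT NULL
    )
""")
conn.execute("""
    CREATE TABLE enrollment (
        student_number INTEGER NOT NULL
            REFERENCES student(student_number),  -- foreign key linking the tables
        course TEXT NOT NULL,
        PRIMARY KEY (student_number, course)
    )
""")

conn.execute("INSERT INTO student VALUES (1001, 'ann@example.org', 'Ann')")
conn.execute("INSERT INTO enrollment VALUES (1001, 'Databases')")


def try_insert(sql):
    """Return True if the insert succeeds, False if a key constraint rejects it."""
    try:
        conn.execute(sql)
        return True
    except sqlite3.IntegrityError:
        return False


# Reusing primary key 1001 is rejected.
duplicate_key_ok = try_insert(
    "INSERT INTO student VALUES (1001, 'bob@example.org', 'Bob')"
)
# An enrollment for a student that does not exist is rejected.
orphan_row_ok = try_insert("INSERT INTO enrollment VALUES (9999, 'Databases')")

# The foreign key is what makes the join between the two tables possible.
joined = conn.execute("""
    SELECT s.name, e.course
    FROM student s JOIN enrollment e ON e.student_number = s.student_number
""").fetchall()
```

Both rejected inserts leave the tables unchanged, so the join still returns only Ann's enrollment.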

What are the advantages of data integration?

It is always best (golden rule)
Reduces maintenance and overhead costs compared to having many separate databases

What is a disadvantage of data integration?

You take away data ownership, which can make people unwilling to cooperate with the integration

What is a problem with IoT and data integration?

IoT gathers too much data to integrate it all in a central database

What is data normalization?

Methodology for organizing attributes into tables to eliminate data redundancy among non-key attributes

What is the result of data normalization?

Properly structured relational database

What is your input for data normalization?

All attributes that need to be incorporated
A list of all functional dependencies between attributes

What is a functional dependency?

If attribute A has a specific value, that value determines the value of attribute B (zipcode -> city)
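
A functional dependency can be checked against sample data with a few lines of Python. This is a sketch under assumptions: the helper name holds and the rows are invented for illustration. Note that sample data can only refute a dependency; it cannot prove that the dependency holds in general.

```python
def holds(rows, determinant, dependent):
    """Return True if determinant -> dependent holds in every given row."""
    seen = {}
    for row in rows:
        key, value = row[determinant], row[dependent]
        # setdefault stores the first value seen for this key and returns it;
        # a later row with a different value breaks the dependency.
        if seen.setdefault(key, value) != value:
            return False
    return True


# Hypothetical sample rows.
rows = [
    {"zipcode": "2333", "city": "Leiden", "name": "Ann"},
    {"zipcode": "2333", "city": "Leiden", "name": "Bob"},
    {"zipcode": "1012", "city": "Amsterdam", "name": "Cas"},
    {"zipcode": "1017", "city": "Amsterdam", "name": "Dee"},
]

zip_determines_city = holds(rows, "zipcode", "city")  # each zipcode has one city
city_determines_zip = holds(rows, "city", "zipcode")  # Amsterdam has two zipcodes
```

In this sample zipcode -> city holds, while city -> zipcode does not, which shows that a dependency only runs in one direction.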

What is a determinant?

An attribute whose value determines the value of another attribute (employee number -> employee name)

What are the steps (just name) of the data normalization process?

1. First normal form (1NF)
2. Second normal form (2NF)
3. Third normal form (3NF)

What is the first normal form in data normalization?

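Starting point: the data in a single table, not yet organized to remove redundancy. Strictly speaking, a table is only in 1NF once multi-value attributes (repeating groups) have been eliminated and a primary key has been identified.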

What is the second normal form in data normalization?

There are no partial functional dependencies; if there are, you need to split the table up into more tables
Every non-key attribute must depend on the whole key
Usually results in multiple tables

What is the third normal form in data normalization?

End stage
No more transitive dependencies: a non-key attribute may not depend on other non-key attributes
No data redundancy
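
Removing a transitive dependency can be sketched in plain Python. The example assumes a made-up flat table in which student_number -> zipcode and zipcode -> city, so city depends on the key only through the non-key attribute zipcode. All names and values are illustrative.

```python
# Flat table: city is repeated for every student with the same zipcode.
flat = [
    {"student_number": 1, "name": "Ann", "zipcode": "2333", "city": "Leiden"},
    {"student_number": 2, "name": "Bob", "zipcode": "2333", "city": "Leiden"},
    {"student_number": 3, "name": "Cas", "zipcode": "1012", "city": "Amsterdam"},
]

# Table 1: attributes that depend directly on the key student_number.
students = {
    r["student_number"]: {"name": r["name"], "zipcode": r["zipcode"]}
    for r in flat
}

# Table 2: the transitive dependency zipcode -> city gets its own table,
# so each city is stored once per zipcode instead of once per student.
zipcodes = {r["zipcode"]: r["city"] for r in flat}


def city_of(student_number):
    """Look up a student's city by following the zipcode link."""
    return zipcodes[students[student_number]["zipcode"]]


# Joining the two tables again gives back the original rows: nothing was lost.
rebuilt = [
    {
        "student_number": number,
        "name": s["name"],
        "zipcode": s["zipcode"],
        "city": zipcodes[s["zipcode"]],
    }
    for number, s in students.items()
]
```

The zipcode column in the student table plays the role of a foreign key into the zipcode table.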

What is a drawback of data normalization?

Queries can be slower or less efficient because the data is split across tables. However, modern databases are fast enough that this is rarely a big problem.

What is meant by "data normalization is progressive"?

Each normal form satisfies the rules of the previous form and builds on it: a table in the second normal form is also in the first normal form.

What is data mining?

Process of looking for patterns and relationships in large data sets

What is business intelligence?

Analyzing collected data in the hope of gaining a competitive advantage
