Week 4 Flashcards

1
Q

What is the distinction between data, information and knowledge?

A

Data: Raw facts, building blocks for information
Information: Processed data so that it has meaning to the user
Knowledge: How to interpret information and make decisions based on it.

Data: Air pressure at a moment
Information: Time series of air pressure, showing how it evolves
Knowledge: Deciding to shut off a gauge, but also knowing which gauge.
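
The air pressure example can be sketched in a few lines of Python. This is only an illustration: the readings, the 0.2 bar limit and the name should_shut_off are made up and do not come from the lecture.

```python
from statistics import mean

# Data: raw pressure readings in bar (made-up values for illustration).
readings = [2.0, 2.1, 2.3, 2.6, 3.0]

# Information: the readings processed into something meaningful,
# here the average change between consecutive measurements.
changes = [later - earlier for earlier, later in zip(readings, readings[1:])]
average_rise = mean(changes)

# Knowledge: a rule, learned from training or experience, that turns
# the information into a decision. The 0.2 bar limit is hypothetical.
MAX_SAFE_RISE = 0.2


def should_shut_off(rise_per_step: float) -> bool:
    """Decide whether the pressure is rising too fast."""
    return rise_per_step > MAX_SAFE_RISE


decision = should_shut_off(average_rise)
```

The readings alone say nothing; the trend gives them meaning, and the rule is what allows a decision to be made.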

2
Q

How can you obtain knowledge?

A
  • Training
  • Experience
  • Others
3
Q

What is a database?

A

A shared integrated computer structure that stores data.

4
Q

What two (simple) types of data can a database store?

A
  • End-user data
  • Metadata

5
Q

What is metadata?

A

Data about the data, such as timestamps or who last modified a record.

6
Q

What is a database management system (DBMS)?

A
  • A collection of programs that manages one or more databases
  • Also manages the structure of these databases
  • Example: Oracle
7
Q

What is the main purpose of a DBMS?

A

To make data management more efficient and effective.

8
Q

What are advantages of a DBMS?

A
  • Better access to better structured data for end-users;
  • Integrated view on operations;
  • Reduced number of data inconsistencies due to normalization;
  • Quick answers with ad-hoc queries
9
Q

What are the disadvantages of a DBMS?

A
  • Everything needs to be integrated
  • Might not be efficient with big data

10
Q

What are two types of databases?

A
  1. Transactional databases (operational)
  2. Data warehouses

11
Q

What is a transactional database?

A
  • A database containing meaningful business events.
  • Used to support daily business operations
  • Think of SAP
12
Q

What is a data warehouse?

A

A database that stores data to generate information to make tactical and strategic decisions

13
Q

What are the key characteristics of a data warehouse?

A
  • Subject oriented
  • Large
  • Historic data
  • De-normalized data
  • Batch Updates
  • Complex queries
14
Q

What are the key characteristics of a transactional database?

A
  • Transaction oriented
  • Small
  • Current data
  • Normalized data
  • Continuous updates
  • Simple to complex queries
15
Q

What are the main data abstraction levels?

A

According to the lecture there are three levels, but only two were discussed:

  1. Physical model
  2. Conceptual model
16
Q

What is the physical model data abstraction layer?

A

The data as it is stored: the form in which it was collected and uploaded to the database. At this level the data carries no meaning; it is simply a set of tables. The information manager can modify this layer without end users noticing.

17
Q

What is the conceptual model data abstraction layer?

A

The layer that deals with the meaning of the data. As an end user you work through interface screens and call up only the information you are interested in, without seeing or caring how the database is structured or stored.

18
Q

Why is database design important?

A
  • It defines the expected use of the database
  • You can avoid redundant data
  • Poor design leads to errors which leads to poor decision making
19
Q

What is a relational database?

A

The standard in database design. It is built up of one or more tables, with a focus on the relations between entities to avoid data redundancies.

20
Q

What are the key components in a relational database?

A

Entity: A category or object (say: student)
Attribute: A characteristic that describes an entity in a table (say: name, class, age)
Key attribute: The attribute that uniquely identifies each instance of an entity (say: student number)

21
Q

What kind of keys are there in a relational database?

A

Candidate key: An attribute (or combination of attributes) that could serve as the primary key
Primary key: The candidate key chosen to identify each entity
Foreign key: An attribute in table B that is the primary key in table A -> a way to link tables
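
A minimal sketch of these three kinds of keys, using Python's built-in sqlite3 module. The student and enrollment tables, their columns and the sample rows are assumptions made up for this illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# SQLite only enforces foreign keys when this pragma is switched on.
conn.execute("PRAGMA foreign_keys = ON")

conn.execute("""
    CREATE TABLE student (
        student_number INTEGER PRIMARY KEY,   -- the chosen primary key
        email          TEXT NOT NULL UNIQUE,  -- another candidate key
        name           TEXT NOT NULL
    )
""")
conn.execute("""
    CREATE TABLE enrollment (
        student_number INTEGER NOT NULL
            REFERENCES student(student_number),  -- foreign key
        course         TEXT NOT NULL,
        PRIMARY KEY (student_number, course)
    )
""")

conn.execute("INSERT INTO student VALUES (1001, 'ana@example.org', 'Ana')")
conn.execute("INSERT INTO enrollment VALUES (1001, 'Databases')")

# An enrollment that points at a non-existent student is rejected.
try:
    conn.execute("INSERT INTO enrollment VALUES (9999, 'Databases')")
    orphan_rejected = False
except sqlite3.IntegrityError:
    orphan_rejected = True

# The foreign key is what lets the two tables be linked in a join.
rows = conn.execute("""
    SELECT s.name, e.course
    FROM enrollment AS e
    JOIN student AS s ON s.student_number = e.student_number
""").fetchall()
```

Both student_number and email could identify a student, so both are candidate keys; student_number is the one picked as primary key, and enrollment refers back to it.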

22
Q

What are the advantages of data integration?

A
  • According to the golden rule, integration is always the best option
  • Reduces maintenance and overhead cost compared to having many separate databases

23
Q

What is a disadvantage of data integration?

A
  • You take away data ownership, which can make the owners unwilling to cooperate with the integration
24
Q

What is a problem with IoT and data integration?

A

IoT devices gather too much data to integrate all of it into one central database.

25
Q

What is data normalization?

A

Methodology for organizing attributes into tables to eliminate data redundancy among non-key attributes

26
Q

What is the result of data normalization?

A

Properly structured relational database

27
Q

What is your input for data normalization?

A
  • All attributes that need to be incorporated
  • A list of all functional dependencies between attributes

28
Q

What is a functional dependency?

A

Attribute B is functionally dependent on attribute A if each value of A determines exactly one value of B (zipcode -> city).
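
A small sketch of what it means for a dependency such as zipcode -> city to hold in a set of rows. The helper name holds and the sample addresses are made up for this example.

```python
def holds(rows, determinant, dependent):
    """Return True if `determinant -> dependent` holds in every row."""
    seen = {}
    for row in rows:
        key = row[determinant]
        value = row[dependent]
        # setdefault stores the first value seen for this key and
        # returns the stored value on later visits.
        if seen.setdefault(key, value) != value:
            return False  # same determinant value, different dependent value
    return True


addresses = [
    {"zipcode": "1011", "city": "Amsterdam", "name": "Ana"},
    {"zipcode": "1011", "city": "Amsterdam", "name": "Ben"},
    {"zipcode": "3011", "city": "Rotterdam", "name": "Cas"},
]

zip_determines_city = holds(addresses, "zipcode", "city")  # True
zip_determines_name = holds(addresses, "zipcode", "name")  # False
```

A check like this can only show that a dependency is violated in the sample. Whether a dependency really holds is a business rule that has to come from knowledge of the domain.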

29
Q

What is a determinant?

A

An attribute whose value determines the value of another attribute, e.g. employee number -> employee name.

30
Q

What are the steps (just name them) of the data normalization process?

A
  1. First normal form
  2. Second normal form
  3. Third normal form
31
Q

What is the first normal form in data normalization?

A

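Starting point of the process. The raw, unorganized data may contain multi-value attributes; in first normal form these are removed, so each field holds a single value and every row is identified by a key. Partial and transitive dependencies may still be present.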

32
Q

What is the second normal form in data normalization?

A
  • There are no partial functional dependencies. If there are, the table must be split into more tables.
  • Every non-key attribute must depend on the whole key, not just on part of it.
  • Results in multiple tables
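
A sketch of removing a partial dependency, using plain Python dictionaries in place of real tables. The enrollments data and its column names are invented for this illustration.

```python
# One table with the composite key (student_number, course).
# student_name depends only on student_number: a partial dependency.
enrollments = [
    {"student_number": 1, "course": "DB", "student_name": "Ana", "grade": 8},
    {"student_number": 1, "course": "OR", "student_name": "Ana", "grade": 7},
    {"student_number": 2, "course": "DB", "student_name": "Ben", "grade": 6},
]

# Split: the attribute that depends on part of the key gets its own table.
students = {row["student_number"]: row["student_name"] for row in enrollments}

# What remains depends on the whole key (student_number, course).
grades = {
    (row["student_number"], row["course"]): row["grade"] for row in enrollments
}
```

After the split, the name "Ana" is stored once instead of once per course, so it can no longer become inconsistent between rows.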
33
Q

What is the third normal form in data normalization?

A
  • End stage
  • No more transitive dependencies: a non-key attribute may not depend on another non-key attribute
  • No data redundancy
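
A sketch of removing a transitive dependency, again with plain Python structures standing in for tables. The employees data, the column names and the helper city_of are assumptions made up for this example.

```python
# Key: employee_number. city depends on zipcode, which is itself a
# non-key attribute: a transitive dependency.
employees = [
    {"employee_number": 10, "name": "Ana", "zipcode": "1011", "city": "Amsterdam"},
    {"employee_number": 11, "name": "Ben", "zipcode": "1011", "city": "Amsterdam"},
    {"employee_number": 12, "name": "Cas", "zipcode": "3011", "city": "Rotterdam"},
]

# Move the dependency zipcode -> city into its own table.
zipcodes = {row["zipcode"]: row["city"] for row in employees}

# The employee table keeps only attributes that depend directly on the key.
employees_3nf = [
    {name: value for name, value in row.items() if name != "city"}
    for row in employees
]


def city_of(employee_number):
    """Look the city up again by following zipcode, like a join would."""
    for row in employees_3nf:
        if row["employee_number"] == employee_number:
            return zipcodes[row["zipcode"]]
    raise KeyError(employee_number)
```

No information is lost: the city can still be found through the zipcode, but each zipcode and city pair is now stored only once.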
34
Q

What is a drawback of data normalization?

A
  • Queries can be slower or less efficient because the data is split across many tables. However, modern database systems are fast enough that this matters less than it used to.
35
Q

What is meant by "data normalization is progressive"?

A

Each normal form builds on the previous one: a table in second normal form also adheres to the rules of the first normal form, and so on.

36
Q

What is data mining?

A

Process of looking for patterns and relationships in large data sets

37
Q

What is business intelligence?

A

Analyzing collected data in the hope of gaining a competitive advantage