Data Science Flashcards

1
Q

What is a Business Intelligence (BI) System?

A

Business intelligence systems are information systems that assist managers and other professionals

1) To analyse current and past activities
2) To predict future events

2
Q

2 characteristics of BI systems?

A

1) Reporting (RFM Analysis + OLAP)

2) Data Mining

3
Q

Operational DB vs Dimensional DB?

A

Operational DB

1) Used for structured transaction data processing
2) Current data is used
3) Data are inserted, updated and deleted by the user

Dimensional DB
1) Used for unstructured analytical data processing
2) Current and historical data are used
3) Data are loaded and updated systematically, not by the user

4
Q

What is RFM Analysis?

A

RFM Analysis analyses and ranks customers according to their purchasing patterns.
R => Recency (how recently the customer placed an order)
F => Frequency (how often the customer places an order)
M => Money (how much money the customer spends)

Customers are sorted into 5 groups on each of R, F and M
Each group is scored from 1 (Highest/Best) to 5 (Lowest/Worst)
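
As a rough illustration of the ranking above, the sketch below sorts customers on R, F and M and splits each ordering into five equal groups. The customer data and the quintile_scores helper are hypothetical; real RFM tools may handle ties and uneven group sizes differently.

```python
from datetime import date

# Hypothetical customers: (name, most recent order date, order count, total spent)
customers = [
    ("Ajax",    date(2024, 6, 30), 40, 9000.0),
    ("Bloom",   date(2024, 6, 1),  25, 4000.0),
    ("Carver",  date(2024, 3, 15), 12, 2500.0),
    ("Dune",    date(2023, 11, 2),  5,  800.0),
    ("Eastman", date(2022, 7, 20),  1,   50.0),
]


def quintile_scores(values, best_is_high=True):
    """Return a 1-5 score per value: 1 for the best fifth, 5 for the worst."""
    # Positions of the values, ordered from best to worst.
    order = sorted(range(len(values)), key=lambda i: values[i], reverse=best_is_high)
    scores = [0] * len(values)
    for rank, idx in enumerate(order):
        # Map rank 0..n-1 onto groups 1..5 of roughly equal size.
        scores[idx] = rank * 5 // len(values) + 1
    return scores


r = quintile_scores([c[1] for c in customers])  # a more recent order is better
f = quintile_scores([c[2] for c in customers])  # more orders is better
m = quintile_scores([c[3] for c in customers])  # more money spent is better

# RFM score per customer as an (R, F, M) tuple.
rfm = {c[0]: (r[i], f[i], m[i]) for i, c in enumerate(customers)}
for name, score in rfm.items():
    print(name, score)
```

A customer scoring (1, 1, 1) ordered recently, orders often and spends the most; a customer scoring (5, 5, 5) is at the opposite end on all three.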

5
Q

What is Online Analytical Processing (OLAP)?

A

An OLAP report has measures and dimensions
Measure - data item of interest
Dimension - a characteristic of a measure
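
To make the two terms concrete, here is a small sketch in which net sales is the measure and product family and store type are the dimensions. The rows and the olap_report helper are hypothetical, made up for illustration only.

```python
from collections import defaultdict

# Hypothetical sales rows: (product_family, store_type, net_sales)
sales = [
    ("Drink", "Supermarket", 120.0),
    ("Drink", "Deluxe",       80.0),
    ("Food",  "Supermarket", 300.0),
    ("Food",  "Deluxe",      150.0),
    ("Food",  "Supermarket",  50.0),
]


def olap_report(rows):
    """Sum the measure (net sales) for each combination of the two dimensions."""
    cells = defaultdict(float)
    for family, store_type, net_sales in rows:
        cells[(family, store_type)] += net_sales
    return dict(cells)


report = olap_report(sales)
for (family, store_type), total in sorted(report.items()):
    print(family, store_type, total)
```

Each key of the report is one combination of dimension values, and each value is the measure summed over the matching rows.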

6
Q

What is an OLAP Cube?

A

A presentation of a measure with associated dimensions:

1) An OLAP cube can have any number of axes
2) The terms OLAP cube and OLAP report are synonymous

7
Q

What does OLAP allow?

A

OLAP allows drill down: a further division of the data into more detail

8
Q

What is a Star Schema?

A

A data modelling technique used to map multidimensional decision support data into a relational database
Creates a near equivalent of a multidimensional database schema from an existing relational database

Four components:

1) Facts
2) Dimensions
3) Attributes
4) Attribute Hierarchy

9
Q

What are Facts?

A

Facts contain numeric measurements (values) that represent a specific business aspect or activity

Normally stored in a fact table that is the centre of the star schema

10
Q

What is a FACT table?

A

Fact tables contain facts that are linked through their dimensions via keys.

Metrics are facts computed or derived at runtime

11
Q

What are Dimension tables?

A

Dimensions describe the business objects that contribute keys to the fact table
1:N relationship between dimension tables and fact tables
Dimensional tables are denormalized to maximize performance
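
A minimal sketch of how a fact table and its dimension tables might be laid out, using Python's built-in sqlite3 module. The table names, columns and rows (dim_product, dim_store, fact_sales) are hypothetical and chosen only for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Dimension tables: one row per business object.
    CREATE TABLE dim_product (product_id INTEGER PRIMARY KEY, name TEXT, category TEXT);
    CREATE TABLE dim_store   (store_id   INTEGER PRIMARY KEY, city TEXT, state TEXT);

    -- Fact table: numeric measurements plus a key into each dimension.
    CREATE TABLE fact_sales (
        product_id INTEGER REFERENCES dim_product(product_id),
        store_id   INTEGER REFERENCES dim_store(store_id),
        units      INTEGER,
        revenue    REAL
    );

    INSERT INTO dim_product VALUES (1, 'Cola', 'Drink'), (2, 'Bread', 'Food');
    INSERT INTO dim_store   VALUES (1, 'Seattle', 'WA'), (2, 'Spokane', 'WA');
    INSERT INTO fact_sales  VALUES (1, 1, 10, 15.0), (1, 2, 4, 6.0), (2, 1, 3, 9.0);
""")

# Classify the facts by joining each dimension to the fact table on its key.
query = """
    SELECT p.category, s.city, SUM(f.revenue)
    FROM fact_sales AS f
    JOIN dim_product AS p ON p.product_id = f.product_id
    JOIN dim_store   AS s ON s.store_id   = f.store_id
    GROUP BY p.category, s.city
    ORDER BY p.category, s.city
"""
result = conn.execute(query).fetchall()
for row in result:
    print(row)
```

One row in a dimension table (for example the Seattle store) can match many rows in the fact table, which is the 1:N relationship noted above.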

12
Q

What are Attributes in terms of Star Schema?

A

Used to search, filter or classify facts

Dimensions provide descriptive characteristics about the facts through their attributes

13
Q

What is Attribute Hierarchy in terms of Star Schema?

A

The attribute hierarchy allows the end user to perform drill-down and roll-up searches
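
One way to picture this is an ordered list of attributes within a dimension, where drilling down moves to a more detailed level and rolling up moves back to a coarser one. The sketch below assumes a hypothetical time hierarchy (year, quarter, month) and made-up sales rows.

```python
from collections import defaultdict

# Hypothetical time hierarchy, from least to most detailed.
HIERARCHY = ["year", "quarter", "month"]

# Hypothetical facts: each row carries its dimension attributes and one measure.
facts = [
    {"year": 2023, "quarter": "Q1", "month": "Jan", "sales": 100.0},
    {"year": 2023, "quarter": "Q1", "month": "Feb", "sales": 120.0},
    {"year": 2023, "quarter": "Q2", "month": "Apr", "sales": 90.0},
    {"year": 2024, "quarter": "Q1", "month": "Jan", "sales": 110.0},
]


def aggregate(level):
    """Total sales grouped by every attribute down to `level` in the hierarchy."""
    attrs = HIERARCHY[: HIERARCHY.index(level) + 1]
    totals = defaultdict(float)
    for row in facts:
        totals[tuple(row[a] for a in attrs)] += row["sales"]
    return dict(totals)


def drill_down(level):
    """Return the next, more detailed level (unchanged at the bottom)."""
    i = HIERARCHY.index(level)
    return HIERARCHY[min(i + 1, len(HIERARCHY) - 1)]


def roll_up(level):
    """Return the previous, less detailed level (unchanged at the top)."""
    i = HIERARCHY.index(level)
    return HIERARCHY[max(i - 1, 0)]


level = "year"
print(level, aggregate(level))
level = drill_down(level)   # year -> quarter
print(level, aggregate(level))
level = roll_up(level)      # quarter -> year
print(level, aggregate(level))
```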

14
Q

What is the snowflake schema?

A

Logical arrangement of tables in a multidimensional database that resembles a snowflake shape

Extension of star schema, adds additional dimensions

Dimensions are normalized
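
As a sketch of what normalizing a dimension can look like, the example below moves a category attribute out of a product dimension into its own table. It uses Python's built-in sqlite3 module; all table names, columns and rows are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- The category attribute lives in its own table instead of in dim_product.
    CREATE TABLE dim_category (category_id INTEGER PRIMARY KEY, category TEXT);
    CREATE TABLE dim_product (
        product_id  INTEGER PRIMARY KEY,
        name        TEXT,
        category_id INTEGER REFERENCES dim_category(category_id)
    );
    CREATE TABLE fact_sales (
        product_id INTEGER REFERENCES dim_product(product_id),
        revenue    REAL
    );

    INSERT INTO dim_category VALUES (1, 'Drink'), (2, 'Food');
    INSERT INTO dim_product  VALUES (1, 'Cola', 1), (2, 'Juice', 1), (3, 'Bread', 2);
    INSERT INTO fact_sales   VALUES (1, 15.0), (2, 5.0), (3, 9.0);
""")

# Reaching the category takes one more join than with a single product table.
revenue_by_category = conn.execute("""
    SELECT c.category, SUM(f.revenue)
    FROM fact_sales AS f
    JOIN dim_product  AS p ON p.product_id  = f.product_id
    JOIN dim_category AS c ON c.category_id = p.category_id
    GROUP BY c.category
    ORDER BY c.category
""").fetchall()
print(revenue_by_category)
```

Each category name is stored once, at the cost of an extra join when querying by category.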

15
Q

Inmon Vs Kimball Architecture?

A

Inmon
1) Top-down approach
2) Data warehouse designed using a normalized enterprise data model (typically 3NF) that contains atomic data at the lowest level of detail

Kimball
1) Bottom-up approach
2) Data warehouse is a set of data marts designed around business processes and joined together using dimensions conformed across those processes

16
Q

Pros and cons of Inmon Architecture?

A

Pro:
Very useful for creating highly consistent dimensional views (data marts)

Con:
The cost and time of creating an organization-wide central repository of data are high, and doing so may be difficult

17
Q

Pros of Kimball Architecture using dimensional modelling?

A

1) Provides easier and simplified data access for analysis
2) Provides faster query performance