01 Data Science for Business Flashcards

1
Q

What is Big Data?

A
  • Big Data is high-volume, high-velocity and/or high-variety information assets that demand cost-effective, innovative forms of information processing that enable enhanced insight, decision making and process automation.
  • The three V’s: volume, velocity and variety.
2
Q

What is the general procedure in data science for extracting knowledge from data?

A
  • It is assumed that extracting useful knowledge from data to solve business problems can be done systematically.
  • Information technology is used to compile large data sets, which are then analyzed to identify correlations that help predict variables of interest.
  • Results of the analysis need to generalize beyond the data at hand (i.e., avoid overfitting the dataset).
  • Meaningful decisions require identifying the contexts in which data are created, analyzed and used.
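As a rough illustration of the correlation and overfitting points above, the sketch below fits a straight line to hypothetical ad-spend and sales figures and compares it with a model that simply memorizes the training data. The data values and the names `fit_line`, `mse`, `linear` and `memorize` are assumptions made up for this example, not part of the original card.

```python
from statistics import mean


def fit_line(xs, ys):
    """Ordinary least squares fit of y = a + b * x."""
    x_bar, y_bar = mean(xs), mean(ys)
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    b = sxy / sxx
    a = y_bar - b * x_bar
    return a, b


def mse(predict, xs, ys):
    """Mean squared error of a prediction function on (xs, ys)."""
    return mean((predict(x) - y) ** 2 for x, y in zip(xs, ys))


# Hypothetical data: ad spend (x) vs. sales (y), roughly y = 2x + 1 plus noise.
train_x = [1, 2, 3, 4, 5, 6]
train_y = [3.2, 4.8, 7.1, 9.0, 10.9, 13.1]
# Held-out data the models never saw during fitting.
test_x = [1.5, 3.5, 5.5]
test_y = [4.1, 7.9, 12.0]

a, b = fit_line(train_x, train_y)


def linear(x):
    """Simple model: captures the overall trend."""
    return a + b * x


lookup = dict(zip(train_x, train_y))


def memorize(x):
    """Overfit model: returns the stored y of the nearest training x."""
    nearest = min(train_x, key=lambda t: abs(t - x))
    return lookup[nearest]


for name, model in [("linear", linear), ("memorize", memorize)]:
    print(name,
          "train MSE:", round(mse(model, train_x, train_y), 4),
          "test MSE:", round(mse(model, test_x, test_y), 4))
```

The memorizing model has zero error on the training data but a clearly larger error on the held-out data than the fitted line, which is the pattern "overfitting" refers to.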
3
Q

Describe the relationship between decision making and data:

A
  • Traditionally, many decisions were based on the “gut feeling” of executives.
  • Nowadays, data is available in sufficient quantity and granularity to let more decisions be based on facts.
  • Top-down decision making is supplemented by bottom-up data analytics.
4
Q

Why are Data Science capabilities regarded as a strategic asset?

A
  • Data science (compiling and analyzing data) has strategic value because it can:
    • Transform existing operations to be more efficient
    • Transform entire business models and generate new ways to earn profit/market share
5
Q

Give some examples of data science solutions for business problems:

A
  • Classification and class probability estimation (predict which class an individual belongs to, and how likely)
  • Regression (estimate numerical values for an individual)
  • Similarity matching (identify similar individuals)
  • Clustering (group individuals, e.g. into customer segments)
  • Profiling (typical behaviours of individuals/groups)
  • Link prediction (connections among data items)
  • Data reduction (replace large data sets with smaller datasets)
  • Causal modeling (what events influence each other)
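To make one of these tasks concrete, here is a minimal sketch of similarity matching: finding the customer most similar to a given one by Euclidean distance. The customer names, the two features (age, monthly spend) and the function `most_similar` are hypothetical and chosen only for illustration.

```python
from math import dist

# Hypothetical customers described by (age, monthly spend).
# In practice features on different scales are usually normalized first;
# that step is skipped here to keep the sketch short.
customers = {
    "alice": (25, 40.0),
    "bob": (27, 42.0),
    "carol": (55, 120.0),
    "dave": (52, 110.0),
}


def most_similar(name, data):
    """Return the other customer with the smallest Euclidean distance."""
    target = data[name]
    others = (n for n in data if n != name)
    return min(others, key=lambda n: dist(data[n], target))


print(most_similar("alice", customers))
print(most_similar("carol", customers))
```

The same distance measure is also the usual starting point for clustering, where individuals are grouped instead of matched one at a time.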
6
Q

Name five approaches to data science:

A
  • Statistics
  • Database querying
  • Data Warehousing
  • Regression analysis
  • Machine learning, data mining
How well did you know this?
1
Not at all
2
3
4
5
Perfectly
7
Q

Briefly describe the evolution of analytical information systems:

A

1960: MIS (Management IS): Efficient data processing, integrated information systems, vision of automated decision making.

1970: DSS (Decision Support System): Statistical algorithms, “what if” analysis, complex and rigid structures, databases.

1980: EIS (Executive IS): Multidimensional modeling, transaction processing and decision support, top management decisions.

1990: DWH (Data Warehouse): Integration of diverse data, interactive and customized reports/OLAP, historical data.

2000: BI (Business Intelligence): KPI systems, balanced scorecards, analytical applications, data mining.

8
Q

What are the key differences between model-driven and data-driven decision making?

A
  • Model-driven decision making relies on predictions derived from statistical and operations research techniques.
  • Data-driven decision making relies on consolidating data in one place (in a data warehouse, DWH).