Introduction Data Science Terminology Flashcards

1
Q

CRISP-DM

A

Cross-industry standard process for data mining

2
Q

CRISP-DM

6 steps

A

Business Understanding

Data Understanding

Data Preparation

Modeling

Evaluation

Deployment

3
Q

KDD (Knowledge Discovery in Databases)

Process

A
Data 
--> Selection
Target Data
--> Preprocessing
Preprocessed Data
--> Transformation
Transformed Data
--> Data Mining
Patterns
--> Interpretation/Evaluation
Knowledge
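
The KDD stages can be walked through on a toy dataset. This is only a minimal sketch: the records, the field names and every function name below are hypothetical and chosen for illustration, and the "data mining" step is reduced to a simple group average.

```python
from statistics import mean

# Hypothetical customer records standing in for the initial "Data".
DATA = [
    {"customer": "a", "age": 25, "spend": 120.0},
    {"customer": "b", "age": None, "spend": 80.0},
    {"customer": "c", "age": 41, "spend": 300.0},
    {"customer": "d", "age": 38, "spend": 260.0},
]

def select(data):
    """Selection: keep only the attributes of interest (Target Data)."""
    return [{"age": r["age"], "spend": r["spend"]} for r in data]

def preprocess(target):
    """Preprocessing: drop records with missing values (Preprocessed Data)."""
    return [r for r in target if r["age"] is not None]

def transform(clean):
    """Transformation: derive an age-group feature (Transformed Data)."""
    return [("under_30" if r["age"] < 30 else "30_plus", r["spend"]) for r in clean]

def mine(transformed):
    """Data Mining: average spend per age group (Patterns)."""
    groups = {}
    for group, spend in transformed:
        groups.setdefault(group, []).append(spend)
    return {group: mean(values) for group, values in groups.items()}

def interpret(patterns):
    """Interpretation/Evaluation: turn the pattern into a statement (Knowledge)."""
    top = max(patterns, key=patterns.get)
    return f"Highest average spend: {top}"

patterns = mine(transform(preprocess(select(DATA))))
knowledge = interpret(patterns)
```

Each function consumes the output of the previous stage, mirroring the arrows in the process above.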
4
Q

PDCA

A

PDCA (Plan–Do–Check–Act)

Methodology popularized by W. Edwards Deming (also known as the Deming cycle)

5
Q

DMAIC

A

DMAIC (Define, Measure, Analyze,
Improve and Control) methodology
used in Six Sigma projects

6
Q

Extract, Transform, Load (ETL)

A
IS
--> extract
raw data
--> transform
data warehouse (predefined structure)
--> load
analytics
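
As a rough illustration of the ETL order, the sketch below cleans the data before it reaches the warehouse. It assumes an in-memory SQLite database as a stand-in for the data warehouse; the `RAW_ROWS` records, the `sales` table and all function names are hypothetical.

```python
import sqlite3

# Hypothetical raw records, standing in for rows extracted from a source system.
RAW_ROWS = [
    {"id": "1", "amount": " 10.50 "},
    {"id": "2", "amount": "n/a"},
    {"id": "3", "amount": "7"},
]

def extract():
    """Extract: pull raw data out of the source system."""
    return list(RAW_ROWS)

def transform(rows):
    """Transform: coerce types and drop rows that cannot be parsed."""
    clean = []
    for row in rows:
        try:
            clean.append((int(row["id"]), float(row["amount"].strip())))
        except ValueError:
            continue  # skip unparseable records such as "n/a"
    return clean

def load(rows, conn):
    """Load: write the cleaned rows into a table with a predefined structure."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS sales (id INTEGER PRIMARY KEY, amount REAL)"
    )
    conn.executemany("INSERT INTO sales VALUES (?, ?)", rows)
    conn.commit()

def run_etl():
    conn = sqlite3.connect(":memory:")  # stand-in for the data warehouse
    load(transform(extract()), conn)
    return conn
```

Because the transformation runs before loading, only rows that fit the predefined schema ever reach the warehouse table.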
7
Q

Extract, Load, Transform (ELT)

A
IS
--> extract & load
transform (data lake raw data & prepared data on demand)
-->
analytics
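
For contrast, a minimal ELT sketch loads the raw data unchanged and transforms it only when an analysis asks for it. It assumes an in-memory SQLite database as a stand-in for the data lake; the `RAW_ROWS` records, the `raw_sales` table and all function names are hypothetical.

```python
import sqlite3

# Hypothetical raw records, kept as untyped text exactly as extracted.
RAW_ROWS = [("1", " 10.50 "), ("2", "n/a"), ("3", "7")]

def extract_and_load(conn):
    """Extract & load: store the raw data as-is, with no cleaning."""
    conn.execute("CREATE TABLE raw_sales (id TEXT, amount TEXT)")
    conn.executemany("INSERT INTO raw_sales VALUES (?, ?)", RAW_ROWS)
    conn.commit()

def transform_on_demand(conn):
    """Transform: prepare the data only when an analysis requests it."""
    rows = conn.execute(
        "SELECT id, amount FROM raw_sales ORDER BY rowid"
    ).fetchall()
    prepared = []
    for raw_id, raw_amount in rows:
        try:
            prepared.append((int(raw_id), float(raw_amount.strip())))
        except ValueError:
            continue  # unparseable rows remain in the lake, just not in this view
    return prepared

def run_elt():
    conn = sqlite3.connect(":memory:")  # stand-in for the data lake
    extract_and_load(conn)
    return conn, transform_on_demand(conn)
```

All raw rows stay available in storage, so a different analysis could apply a different transformation to the same data later.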
8
Q

Another 80/20 rule

A

• 80% of the data scientist’s time is spent on finding,
cleaning, preprocessing and organizing data, leaving
only 20% to actually perform an analysis.

• However, the 20% effort determines 80% of the final
result.
