Data Analytics Flashcards

1
Q

Conditional probability

A

Probability of event A occurring, given that event B occurs
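
A minimal sketch of the definition, assuming equally likely outcomes: P(A | B) = P(A and B) / P(B). The die roll and the two events below are made up for illustration and are not part of the original card.

```python
from fractions import Fraction

# Hypothetical sample space: one roll of a fair six-sided die.
outcomes = range(1, 7)

def is_even(n):          # event A (assumed for illustration)
    return n % 2 == 0

def greater_than_3(n):   # event B (assumed for illustration)
    return n > 3

def conditional_probability(event_a, event_b, space):
    """P(A | B) = P(A and B) / P(B), for equally likely outcomes."""
    b_outcomes = [x for x in space if event_b(x)]
    if not b_outcomes:
        raise ValueError("P(B) is zero, so P(A | B) is undefined")
    both = [x for x in b_outcomes if event_a(x)]
    return Fraction(len(both), len(b_outcomes))

print(conditional_probability(is_even, greater_than_3, outcomes))  # 2/3
```

Restricting the sample space to the outcomes where B occurs (4, 5, 6) and counting how many of those are also in A (4, 6) gives 2/3.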

2
Q

Association analysis

A

Task of finding interesting relationships in large datasets
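
One common way to quantify "interesting" is with support and confidence for a rule such as "bread implies milk". The sketch below assumes a tiny, made-up list of market-basket transactions; the function names are illustrative only.

```python
# Hypothetical market-basket transactions (made up for illustration).
transactions = [
    {"bread", "milk"},
    {"bread", "butter"},
    {"bread", "milk", "butter"},
    {"milk"},
]

def support(itemset, transactions):
    """Fraction of transactions that contain every item in itemset."""
    hits = sum(1 for t in transactions if itemset <= t)
    return hits / len(transactions)

def confidence(antecedent, consequent, transactions):
    """support(antecedent and consequent) / support(antecedent)."""
    joint = support(antecedent | consequent, transactions)
    return joint / support(antecedent, transactions)

print(support({"bread", "milk"}, transactions))       # 0.5
print(confidence({"bread"}, {"milk"}, transactions))  # share of bread baskets that also hold milk
```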

3
Q

Hadoop

A

Open-source framework written in Java; allows for distributed processing of large datasets across clusters of computers

4
Q

3 V’s of Big Data

A

Volume; Variety; Velocity

5
Q

Kano Analysis

A

Classifies product features by their impact on customer satisfaction

6
Q

SaaS

A

Software as a Service

7
Q

Statistical significance

A

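Indicates whether an observed result is unlikely enough under the null hypothesis (typically a p-value below a chosen significance level such as 0.05) for the null hypothesis to be rejected; otherwise we fail to reject it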

8
Q

Quantitative

A

Numbers based, countable, measurable

9
Q

Qualitative

A

Interpretation based, descriptive, relating to language

10
Q

Type I error

A

False positive: rejecting the null hypothesis when it is true

11
Q

Type II error

A

False negative: failing to reject the null hypothesis when it is false

12
Q

Type III error

A

Correctly rejecting the null hypothesis for the wrong reason

13
Q

Nominal data

A

Categories with no inherent order, e.g., male vs. female

14
Q

Ordinal data

A

Categories with a meaningful order, e.g., 1st, 2nd, 3rd

15
Q

Mann-Whitney U test

A

Nonparametric test of whether two independent samples are likely to derive from the same population
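
A rough sketch of how the U statistic can be computed by direct pair counting: for every pair (x, y), count 1 when x exceeds y and one half when they tie. The sample values are made up, and this sketch stops at the statistic; it does not compute a p-value.

```python
def mann_whitney_u(xs, ys):
    """Return (U_x, U_y) by direct pair counting; ties count one half."""
    u_x = 0.0
    for x in xs:
        for y in ys:
            if x > y:
                u_x += 1.0
            elif x == y:
                u_x += 0.5
    # The two statistics always sum to len(xs) * len(ys).
    u_y = len(xs) * len(ys) - u_x
    return u_x, u_y

# Hypothetical samples (made up for illustration).
a = [1.1, 2.3, 3.0]
b = [2.0, 4.5, 5.2, 6.1]
print(mann_whitney_u(a, b))  # (2.0, 10.0)
```

A U value close to 0 for one sample (here 2 out of a possible 12) suggests its values tend to be smaller than the other sample's; whether that difference is significant depends on the sample sizes and a reference table or library routine.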

16
Q

Wilcoxon Test

A

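The Wilcoxon rank-sum test compares two independent samples (equivalent to the Mann-Whitney U test); the Wilcoxon signed-rank test compares two paired (dependent) samples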

17
Q

IMPACT cycle

A

Identify questions
Master the data
Perform test plan
Address and refine results
Communicate insights
Track outcomes

18
Q

ETL process

A

Extract, Transform and Load data.

Goal is to identify and obtain the data needed to solve the problem
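
The three steps can be sketched with the Python standard library alone. The CSV text, the `sales` table, and the function names below are hypothetical, chosen only to make the example self-contained; a real pipeline would read from actual source systems.

```python
import csv
import io
import sqlite3

# Hypothetical raw export; in practice this would come from a file or source system.
RAW_CSV = """customer,amount
alice, 10.50
bob,7
alice,4.50
"""

def extract(text):
    """Extract: read rows from the CSV source."""
    return list(csv.DictReader(io.StringIO(text)))

def transform(rows):
    """Transform: trim whitespace and convert amounts to numbers."""
    return [(r["customer"].strip(), float(r["amount"])) for r in rows]

def load(records, conn):
    """Load: write the cleaned records into a database table."""
    conn.execute("CREATE TABLE IF NOT EXISTS sales (customer TEXT, amount REAL)")
    conn.executemany("INSERT INTO sales VALUES (?, ?)", records)
    conn.commit()

conn = sqlite3.connect(":memory:")
load(transform(extract(RAW_CSV)), conn)
totals = conn.execute(
    "SELECT customer, SUM(amount) FROM sales GROUP BY customer ORDER BY customer"
).fetchall()
print(totals)  # [('alice', 15.0), ('bob', 7.0)]
```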

19
Q

Multiple series with closely related data - what graph?

A

Line graph

20
Q

Single data series - what graph?

A

Bar graph

21
Q

Two data series - what graph?

A

Combo chart

22
Q

Relationship between 2 data series and determining their correlation - what graph?

A

Scatter plot
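
The correlation that a scatter plot shows visually can also be computed as a number. A minimal sketch of the Pearson correlation coefficient follows; the two series are made up for illustration.

```python
from math import sqrt

def pearson_r(xs, ys):
    """Pearson correlation: covariance divided by the product of the spreads."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)

# Hypothetical data series (made up for illustration).
ad_spend = [1, 2, 3, 4]
sales = [2, 4, 6, 8]
print(pearson_r(ad_spend, sales))  # close to 1: points fall on a rising straight line
```

Values near +1 or -1 correspond to points lying close to a straight line on the scatter plot; values near 0 correspond to no linear pattern.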

23
Q

Variance analysis; explaining how “actual” result is different to budget - what graph?

A

Waterfall chart

24
Q

Distribution of dataset - what graph?

A

Histogram
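
A histogram groups values into equal-width bins and counts how many fall in each. The sketch below assumes integer data and a bin width of 10; the ages are made up for illustration.

```python
def histogram_counts(values, bin_width):
    """Map each bin's starting value to the number of values in that bin."""
    counts = {}
    for v in values:
        bin_start = (v // bin_width) * bin_width
        counts[bin_start] = counts.get(bin_start, 0) + 1
    return dict(sorted(counts.items()))

# Hypothetical dataset (made up for illustration).
ages = [21, 25, 29, 33, 38, 41, 45, 47, 52]
for start, count in histogram_counts(ages, 10).items():
    # Print a simple text bar for each bin, e.g. "20-29: ###".
    print(f"{start}-{start + 9}: {'#' * count}")
```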

25
Q

Descriptive Analytics

A

Tells you what happened in the past