FA4 + M4 - Sheet1 Flashcards

1
Q

Which of the following activities are associated with Data Exploration?

Data cleaning
Data augmentation and transformation
Exploratory data analysis
Feature selection
Identify data dependencies and correlations
Identify trends or anomalies in the data

A

Exploratory data analysis
Identify data dependencies and correlations
Identify trends or anomalies in the data

2
Q

Which of the following activities are associated with Data Exploration?

Group of answer choices

Identify data dependencies and correlations

Identify trends or anomalies in the data

Exploratory data analysis

Data cleaning

Feature selection

Data augmentation and transformation

A

Identify data dependencies and correlations

Identify trends or anomalies in the data

Exploratory data analysis

3
Q

Which of the following activities are associated with Data Modification?

Group of answer choices

Data cleaning

Data augmentation and transformation

Exploratory data analysis

Feature selection

Identify data dependencies and correlations

Identify trends or anomalies in the data

A

Data cleaning

Data augmentation and transformation

Feature selection

4
Q

Which activity involves adding new data points or modifying existing ones to improve the dataset?

Group of answer choices

Data augmentation

Data cleaning

Exploratory data analysis

Feature selection

A

Data augmentation

5
Q

Which of the following is NOT typically a part of Data Exploration?

Group of answer choices

Cleaning the data

Identifying data dependencies

Identifying trends in the data

Exploratory data analysis

A

Cleaning the data

6
Q

Which activity is crucial for understanding the relationships between different variables in a dataset?

Group of answer choices

Identifying data dependencies and correlations

Data cleaning

Data augmentation

Feature selection

A

Identifying data dependencies and correlations
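
One common way to quantify a linear dependency between two variables is the Pearson correlation coefficient. The sketch below is a minimal pure-Python version; the `ad_spend` and `sales` figures are made-up illustration data, not values from the course material.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Co-variation of x and y around their means
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)

# Hypothetical data: advertising spend vs. sales
ad_spend = [1.0, 2.0, 3.0, 4.0, 5.0]
sales = [2.1, 3.9, 6.2, 8.1, 9.8]
r = pearson_r(ad_spend, sales)
print(f"r = {r:.4f}")  # values near +1 or -1 suggest a strong linear relationship
```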

7
Q

What does the data say will happen?

A

Predictive Analytics

8
Q

What has happened or what is happening now?

A

Descriptive Analytics

9
Q

Why did it happen?

A

Diagnostic Analytics

10
Q

What will likely happen?

A

Predictive Analytics

11
Q

Predictive Analytics Process:

A

Project Design
Data Sampling
Data Exploration
Data Modification
Model Development
Model Validation
Project Design

12
Q

Project Design:

A

Kickoff meeting
Understand modeling objective
Define acceptance criteria
Document data and deployment requirements

13
Q

Data Sampling

A

Data extraction
Apply filters and exclusions
Identify external data sources

14
Q

Data Exploration

A

Exploratory data analysis
Identify data dependencies and correlations
Identify trends or anomalies in the data

15
Q

Data Modification

A

Data Cleaning
Data augmentation and transformation
Feature selection
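
The three Data Modification activities can be sketched on a toy dataset. This is only an illustration under assumptions: the `income` and `age` fields, the log transform, and the chosen features are hypothetical, and real projects would pick cleaning rules and transformations to suit their data.

```python
import math

# Hypothetical raw records with one missing value and one duplicate
raw_rows = [
    {"income": 52000.0, "age": 34},
    {"income": None, "age": 41},
    {"income": 87000.0, "age": 29},
    {"income": 52000.0, "age": 34},
]

def clean(rows):
    """Data cleaning: drop rows with missing values and exact duplicates."""
    seen = set()
    cleaned = []
    for row in rows:
        if any(value is None for value in row.values()):
            continue
        key = tuple(sorted(row.items()))
        if key in seen:
            continue
        seen.add(key)
        cleaned.append(row)
    return cleaned

def add_log_income(rows):
    """Data transformation: add a log-scaled copy of income."""
    return [{**row, "log_income": math.log(row["income"])} for row in rows]

def select_features(rows, keep):
    """Feature selection: keep only the named columns."""
    return [{name: row[name] for name in keep} for row in rows]

cleaned = clean(raw_rows)
transformed = add_log_income(cleaned)
model_input = select_features(transformed, ["log_income", "age"])
print(model_input)
```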

16
Q

Model Validation

A

Model performance review
Feedback based on business knowledge and inputs from subject matter experts (SMEs)

17
Q

Model Development

A

Apply different modeling techniques and select final methodology

18
Q

Dependent Variable (Value to be predicted)

A

y

19
Q

Beta coefficient (Rate multiplied to X)

A

β

20
Q

Independent variable (Value driving prediction)

A

x

21
Q

Alpha intercept (Baseline figure for y)

A

α

22
Q

Error term (Balancing figure)

A

ε

23
Q

To account for unexplained variability in the dependent variable for other relevant independent variables, which may not have been included in the model

A

Inclusion for the Error Term

24
Q

To capture measurement error in both the dependent and independent variables

A

Inclusion for the Error Term
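
Putting the pieces together, the simple linear regression model is y = α + βx + ε. The sketch below simulates data from that model and recovers α and β by ordinary least squares. The true parameter values, the noise level, and the sample size are all assumptions chosen for illustration.

```python
import random

def fit_simple_ols(xs, ys):
    """Least-squares estimates of alpha (intercept) and beta (slope)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    s_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    s_xx = sum((x - mean_x) ** 2 for x in xs)
    beta = s_xy / s_xx
    alpha = mean_y - beta * mean_x
    return alpha, beta

random.seed(0)
true_alpha, true_beta = 2.0, 0.5  # hypothetical "true" parameters
xs = [float(i) for i in range(1, 51)]
# The error term adds variability that x alone does not explain
ys = [true_alpha + true_beta * x + random.gauss(0, 1) for x in xs]

alpha_hat, beta_hat = fit_simple_ols(xs, ys)
# Residuals are the sample estimates of the error term
residuals = [y - (alpha_hat + beta_hat * x) for x, y in zip(xs, ys)]
print(f"alpha_hat = {alpha_hat:.3f}, beta_hat = {beta_hat:.3f}")
```

With an intercept in the model, the least-squares residuals sum to (numerically) zero, which is one way to read the "balancing figure" description.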

25
Q

You can have more than one predictor variable (x1 - xn)

A

Multiple Linear Regression
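
With several predictors the model becomes y = α + β1·x1 + ... + βn·xn + ε. A minimal sketch of the prediction step follows; the coefficient values and the observation are hypothetical, and estimating the coefficients is not shown here.

```python
def predict(alpha, betas, xs):
    """Predicted y for one observation: alpha + sum of beta_i * x_i."""
    if len(betas) != len(xs):
        raise ValueError("need exactly one beta per predictor")
    return alpha + sum(b * x for b, x in zip(betas, xs))

alpha = 10.0                   # hypothetical intercept
betas = [2.0, -0.5, 1.5]       # hypothetical coefficients for x1, x2, x3
observation = [3.0, 4.0, 2.0]  # hypothetical predictor values

y_hat = predict(alpha, betas, observation)
print(y_hat)  # 10 + 6 - 2 + 3 = 17.0
```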

26
Q

Training vs. Validation vs. Test Data

A

Splitting the Dataset
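
A dataset split can be sketched as a shuffle followed by slicing. The 60/20/20 proportions and the fixed seed below are assumptions for illustration only; the cards do not prescribe specific fractions.

```python
import random

def split_dataset(rows, train_frac=0.6, val_frac=0.2, seed=42):
    """Shuffle the rows, then cut them into training, validation, and test sets."""
    shuffled = list(rows)
    random.Random(seed).shuffle(shuffled)  # seeded for reproducibility
    n_train = round(len(shuffled) * train_frac)
    n_val = round(len(shuffled) * val_frac)
    train_set = shuffled[:n_train]
    val_set = shuffled[n_train:n_train + n_val]
    test_set = shuffled[n_train + n_val:]  # whatever remains
    return train_set, val_set, test_set

rows = list(range(100))  # stand-in for 100 records
train_set, val_set, test_set = split_dataset(rows)
print(len(train_set), len(val_set), len(test_set))  # 60 20 20
```

The training set is used to fit the model, the validation set to compare or tune candidate models, and the test set for a final check on data the model has not seen.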

27
Q

Can I use the model already for prediction purposes?

You still need to investigate the model’s ______

You need to prove if your predictors are ____

A

goodness-of-fit.
significant

28
Q

The ________ is a goodness-of-fit measure

A

coefficient of multiple determination, R^2
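
R^2 can be computed as 1 minus the ratio of the residual sum of squares to the total sum of squares. The sketch below uses made-up `actual` and `predicted` values; in practice the predictions would come from a fitted regression model.

```python
def r_squared(actual, predicted):
    """Coefficient of determination: 1 - SS_residual / SS_total."""
    mean_y = sum(actual) / len(actual)
    ss_total = sum((y - mean_y) ** 2 for y in actual)
    ss_residual = sum((y - p) ** 2 for y, p in zip(actual, predicted))
    return 1 - ss_residual / ss_total

actual = [3.0, 5.0, 7.0, 9.0]     # hypothetical observed responses
predicted = [2.8, 5.3, 6.9, 9.0]  # hypothetical model predictions

r2 = r_squared(actual, predicted)
print(f"R^2 = {r2:.1%}")  # share of response variability explained
```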

29
Q

___ is a figure of merit

A

R^2

30
Q

The ____ the R^2, the more successful the model is in explaining the variation in the response using the set of predictors

A

higher

31
Q

___ is normally expressed as a percentage and is interpreted as the amount of variability in the response explained by the independent variables

A

R^2

32
Q

The _____ is a decomposition of the total variation in the response into explained (pattern) and unexplained (error) parts

A

ANOVA table

33
Q

ANOVA meaning:

A

Analysis of Variance

34
Q

The ____ variability is the amount of variation in the response variable that may be attributed to the predictors explicitly stated in the model

A

explained

35
Q

The _____ variability is the amount of variation attributed to random error

A

unexplained

36
Q

SS refers to

A

Sum of Squares

37
Q

There is good fit if the Regression Sum of Squares is ____ than the Residual Sum of Squares

A

much larger

38
Q

The df column refers to the ____

A

degrees of freedom

39
Q

The df for Regression is always the ________

A

number of regression parameters minus one

40
Q

The df for Residual is the sample size minus the _____

A

number of regression parameters

41
Q

The total df is the _____

A

sum of those two degrees of freedom
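
The sums of squares and degrees of freedom can be sketched for a simple regression with two parameters (α and β). The data are made up for illustration; the function assumes a least-squares fit with an intercept, which is what makes the regression and residual parts add up to the total.

```python
def anova_decomposition(xs, ys):
    """SS and df breakdown for the least-squares line y = alpha + beta * x."""
    n = len(xs)
    n_params = 2  # alpha and beta
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    s_xx = sum((x - mean_x) ** 2 for x in xs)
    s_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    beta = s_xy / s_xx
    alpha = mean_y - beta * mean_x
    fitted = [alpha + beta * x for x in xs]
    return {
        "ss_regression": sum((f - mean_y) ** 2 for f in fitted),        # explained
        "ss_residual": sum((y - f) ** 2 for y, f in zip(ys, fitted)),   # unexplained
        "ss_total": sum((y - mean_y) ** 2 for y in ys),
        "df_regression": n_params - 1,
        "df_residual": n - n_params,
        "df_total": n - 1,
    }

xs = [1.0, 2.0, 3.0, 4.0, 5.0]  # hypothetical predictor values
ys = [2.1, 3.9, 6.2, 8.1, 9.8]  # hypothetical responses
table = anova_decomposition(xs, ys)
for name, value in table.items():
    print(name, round(value, 4))
```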

42
Q

MS refers to _____.

A

Mean Squares

43
Q

The values in this column are the ratio of each sum of squares to its respective degrees of freedom.

A

Mean Squares

44
Q

____ have no physical meaning but are instrumental in computing the F-statistic

A

Mean Squares

45
Q

Mean squares have no physical meaning but are instrumental in computing the _____

A

F-statistic
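
The mean squares and the F-statistic follow directly from the SS and df columns: each mean square is SS divided by df, and F is the regression mean square divided by the residual mean square. The input numbers below are hypothetical.

```python
def f_statistic(ss_regression, ss_residual, df_regression, df_residual):
    """Return (MS regression, MS residual, F) from sums of squares and df."""
    ms_regression = ss_regression / df_regression
    ms_residual = ss_residual / df_residual
    return ms_regression, ms_residual, ms_regression / ms_residual

# Hypothetical ANOVA table entries
ms_reg, ms_res, f_value = f_statistic(
    ss_regression=80.0, ss_residual=20.0, df_regression=2, df_residual=10
)
print(ms_reg, ms_res, f_value)  # 40.0 2.0 20.0
```

Turning F into a p-value requires the F distribution with these two degrees of freedom, which is not computed in this sketch.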

46
Q

The ____ determines if regression is meaningful for the data at hand

A

F-test

47
Q

When the ____ is small, it means that there is at least one significant predictor in the analysis

A

p-value

48
Q

When the p-value is _____, it means that there is at least one significant predictor in the analysis

A

small

49
Q

When p is ___, Ho must

50
Q

The p-value is _____ than the significance level α

A

low if it is less

51
Q

The ___ helps in assessing if an individual predictor is significant

52
Q

If p < 0.05:

A

significant predictor

53
Q

If p > 0.05:

A

insignificant predictor
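
The decision rule on the last two cards can be sketched as a simple comparison against the significance level. The predictor names and p-values below are hypothetical; in practice they would come from the regression output.

```python
def classify_predictors(p_values, significance_level=0.05):
    """Label each predictor by comparing its p-value with the significance level."""
    return {
        name: "significant" if p < significance_level else "insignificant"
        for name, p in p_values.items()
    }

# Hypothetical p-values for three predictors
p_values = {"price": 0.003, "season": 0.21, "promo": 0.049}
labels = classify_predictors(p_values)
print(labels)
```

The cards do not say how to treat a p-value exactly equal to 0.05; this sketch assumes it counts as insignificant.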