Unit 2 Flashcards

1
Q

What is a scatterplot used for

A

Immediate visual impression of a possible relationship between 2 variables.

2
Q

What is the correlation coefficient

A

A quantitative measure of the strength and direction of a linear relationship between 2 variables.
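
A minimal Python sketch of how r can be computed by hand, assuming made-up data. The `hours` and `scores` lists and the `correlation` helper are illustrative names, not from the cards. It sums the products of z-scores and divides by n - 1, which is one common textbook form of the formula.

```python
from statistics import mean, stdev

def correlation(xs, ys):
    """Pearson's r: sum of products of z-scores, divided by n - 1."""
    n = len(xs)
    x_bar, y_bar = mean(xs), mean(ys)
    s_x, s_y = stdev(xs), stdev(ys)  # sample standard deviations
    return sum(((x - x_bar) / s_x) * ((y - y_bar) / s_y)
               for x, y in zip(xs, ys)) / (n - 1)

hours = [1, 2, 3, 4, 5]           # made-up explanatory values
scores = [52, 55, 61, 64, 70]     # made-up response values
r = correlation(hours, scores)
```

Here r comes out close to +1, which would be described as a strong positive linear association. Flipping the sign of every y value flips the sign of r but not its size.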

3
Q

Inequality of value of correlation coefficient and symbol

A

-1 ≤ r ≤ 1 (symbol: r)

4
Q

What is r^2

A

Coefficient of determination - % of variation in the resp var that is explained by the explanatory var

5
Q

How is r^2 represented

A

As a %

6
Q

what does 100% r^2 mean

A

Perfect fit

All variation in y (resp var) is explained by the linear relationship with x

7
Q

How to describe a scatter plot and eg

A

Direction
Form
Strength

Strong positive linear association

8
Q

What are the boundaries of r (what value of r is strong, etc)

A

|r| > 0.8 = strong
0.7 < |r| < 0.8 = moderately strong
|r| < 0.7 = weak
(rule-of-thumb cutoffs; the sign of r gives the direction, not the strength)

9
Q

What is a residual

A

Difference between an observed and predicted value

= actual - predicted

10
Q

Sum and mean of the residuals from the LSRL are always equal to

A

0
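
A short Python sketch that checks this on made-up data, assuming the line is the least-squares line with an intercept. The data and variable names are illustrative only. Each residual is actual - predicted, and their sum comes out to zero up to floating-point rounding.

```python
from statistics import mean

xs = [1, 2, 3, 4, 5]            # made-up explanatory values
ys = [52, 55, 61, 64, 70]       # made-up response values

x_bar, y_bar = mean(xs), mean(ys)

# Least-squares slope and intercept
b = (sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
     / sum((x - x_bar) ** 2 for x in xs))
a = y_bar - b * x_bar

# residual = actual - predicted
residuals = [y - (a + b * x) for x, y in zip(xs, ys)]
total = sum(residuals)
```

Individual residuals are positive where the line underestimates and negative where it overestimates, and these cancel out exactly for the least-squares line.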

11
Q

Line of regression eqn and what each variable is

A

y hat = a + bx

y hat = predicted value of the resp var
x = value of the explanatory var
a = y intercept
b = slope
12
Q

How to calculate b (slope)

what does each variable mean

A

b = r * sy/sx

r = correlation coeff
sy = SD of y
sx = SD of x
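
A minimal Python sketch of the slope formula, assuming made-up data and a hypothetical `lsrl` helper. The intercept formula a = ybar - b * xbar is not on the card; it is the standard companion formula, since the LSRL passes through (xbar, ybar).

```python
from statistics import mean, stdev

def lsrl(xs, ys):
    """Return (a, b, r) for the line y_hat = a + b*x."""
    n = len(xs)
    x_bar, y_bar = mean(xs), mean(ys)
    s_x, s_y = stdev(xs), stdev(ys)          # sample SDs of x and y
    r = (sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
         / ((n - 1) * s_x * s_y))            # correlation coefficient
    b = r * s_y / s_x                        # slope from the card's formula
    a = y_bar - b * x_bar                    # intercept (standard formula)
    return a, b, r

xs = [1, 2, 3, 4, 5]            # made-up explanatory values
ys = [52, 55, 61, 64, 70]       # made-up response values
a, b, r = lsrl(xs, ys)
```

Because sy and sx are both positive, b always has the same sign as r.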
13
Q

How to check if linear model is a good fit

A

r^2 should be high

Residual plot shouldn’t show any pattern

14
Q

How and why to transform a graph

A

Apply log or ln to one (or both) of the variables; this can straighten a curved relationship so that a linear model can be used

15
Q

How to assess the effectiveness of transformation

A

Checking if randomness in residual plot has increased

Checking if r^2 value has increased
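
A small Python sketch of the r^2 comparison, assuming made-up exponential data and a hypothetical `r_squared` helper. It takes ln of the response and compares r^2 before and after. The residual plot would still need to be checked separately, which this sketch does not do.

```python
import math
from statistics import mean

def r_squared(xs, ys):
    """r^2 for the least-squares line of ys on xs (squared correlation)."""
    x_bar, y_bar = mean(xs), mean(ys)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    syy = sum((y - y_bar) ** 2 for y in ys)
    return sxy ** 2 / (sxx * syy)

xs = [1, 2, 3, 4, 5, 6, 7, 8]
ys = [3 * 2 ** x for x in xs]          # made-up exponential growth
log_ys = [math.log(y) for y in ys]     # ln transform of the response

r2_raw = r_squared(xs, ys)             # linear model on the curved data
r2_log = r_squared(xs, log_ys)         # linear model after the transform
```

For this data the transformed r^2 is essentially 1 while the raw r^2 is noticeably lower, which is the kind of increase the card describes.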

16
Q

What is the LSRL

Full form

A

Least-squares regression line

Linear model that minimizes the sum of squared residuals

17
Q

If a residual is negative then

if a residual is positive then

A

Negative - model overestimated the response var (predicted > actual)

Positive - model underestimated the response var (predicted < actual)

18
Q

what is the response var

A

The dependent var

19
Q

What is the explanatory var

A

the independent var

20
Q

How to use residual plot to assess good fit (criteria)

A

Random
Centered at 0
No clear patterns

21
Q

What does a residual plot do

A

Helps decide if you should use a linear model or should consider others.

22
Q

What is extrapolation

why is it bad

A

Prediction made for an x value outside the interval of x values in the data.
The trend may not continue beyond the observed x values, so the prediction may not be accurate.
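
A minimal Python sketch of flagging extrapolation, assuming a hypothetical `predict` helper and made-up values for the intercept, slope and data. It only checks whether x lies outside the range the model was fitted on.

```python
def predict(a, b, x, xs):
    """Return (y_hat, is_extrapolation) for the line y_hat = a + b*x."""
    y_hat = a + b * x
    # Outside the observed x range means the prediction is an extrapolation
    is_extrapolation = x < min(xs) or x > max(xs)
    return y_hat, is_extrapolation

xs = [1, 2, 3, 4, 5]        # made-up x values the model was fitted on
a, b = 46.9, 4.5            # hypothetical intercept and slope

inside = predict(a, b, 3, xs)     # x within the data: interpolation
outside = predict(a, b, 12, xs)   # x beyond the data: extrapolation
```

The model still returns a number for x = 12, but nothing in the data supports the trend continuing that far.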

23
Q

How to interpret slope

A

For every 1-unit increase in ‘explanatory var’, the model predicts an average change of ‘slope value’ ‘resp var units’ in ‘resp var’

24
Q

How to interpret y intercept

A

When ‘explanatory var’ = 0, the model predicts that the ‘resp var’ would be ‘y int value’

25
Q

How to interpret r^2

A

‘r^2’ percent of the variation in ‘resp var’ can be explained by the linear relationship with the ‘explanatory var’
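
A short Python sketch of where that percentage comes from, assuming made-up data and illustrative variable names (hours studied and test score are hypothetical). It computes r^2 as 1 - SSE/SST, the share of the total variation in y that is not left over in the residuals, then fills in the interpretation sentence.

```python
from statistics import mean

xs = [1, 2, 3, 4, 5]            # made-up explanatory values (hours studied)
ys = [52, 55, 61, 64, 70]       # made-up response values (test score)

x_bar, y_bar = mean(xs), mean(ys)
b = (sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
     / sum((x - x_bar) ** 2 for x in xs))
a = y_bar - b * x_bar

sse = sum((y - (a + b * x)) ** 2 for x, y in zip(xs, ys))  # unexplained variation
sst = sum((y - y_bar) ** 2 for y in ys)                    # total variation in y
r2 = 1 - sse / sst

sentence = (f"{r2:.1%} of the variation in test score can be explained "
            "by the linear relationship with hours studied.")
```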

26
Q

If r^2 is closer to 0 then

if r^2 is closer to 1 then

A

r^2 closer to 0 = weak relationship

r^2 closer to 1 = strong relationship.

27
Q

If slope is positive then r is

A

Positive

28
Q

What is a high leverage point

if removed then what
what does it change

A

Points with unusually large or small x values far away from xbar (mean of x values)

Cause a substantial shift in the model being used

Slope or y int could be changed.

29
Q

What does an outlier affect

A

r
r^2
Strength of the model

30
Q

What is an influential point
3 types
what do they affect

A

Points that when removed change the slope, y int and/or correlation coeff substantially

Outliers (change the correlation)
High-leverage points (change slope/y int)
Points that are both of the above
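
A minimal Python sketch of checking whether a point is influential, assuming made-up data and a hypothetical `fit` helper. It fits the least-squares line with and without one high-leverage point (an x value far from xbar) and compares the slopes.

```python
from statistics import mean

def fit(xs, ys):
    """Return (a, b) for the least-squares line y_hat = a + b*x."""
    x_bar, y_bar = mean(xs), mean(ys)
    b = (sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
         / sum((x - x_bar) ** 2 for x in xs))
    return y_bar - b * x_bar, b

xs = [1, 2, 3, 4, 5]            # made-up data
ys = [52, 55, 61, 64, 70]

# Add one made-up point with an x value far from the rest
xs_with = xs + [15]
ys_with = ys + [60]

a_without, b_without = fit(xs, ys)
a_with, b_with = fit(xs_with, ys_with)
slope_change = b_with - b_without
```

Here the single extra point pulls the slope from 4.5 down to roughly 0.3, so removing it would change the model substantially, which is what makes it influential.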