Topic 2: Correlation, Simple, and Multiple Linear Regression Flashcards

1
Q

correlation

A

displays the form, direction, and strength of a relationship

2
Q

pearson’s correlation

A

measures the direction & strength of a linear relationship between two quantitative variables
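As a quick sketch (illustrative made-up data; NumPy assumed), Pearson's r is the covariance divided by the product of the two standard deviations:

```python
import numpy as np

# Illustrative data (made up for this sketch)
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([2.0, 4.0, 5.0, 4.0, 6.0])

# Pearson's r = cov(x, y) / (sd(x) * sd(y))
r = np.cov(x, y, ddof=1)[0, 1] / (np.std(x, ddof=1) * np.std(y, ddof=1))
print(round(r, 3))  # 0.853 — a strong positive linear relationship
```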

3
Q

covariance

A

indicates the degree to which x and y vary together

4
Q

interpreting covariance

A

positive = x and y move in the same direction
negative = x and y move in opposite directions
0 = no linear relationship (zero covariance does not imply independence)
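A minimal NumPy check of that last point: below, y is fully determined by x, yet their covariance is exactly 0 because the relationship is not linear.

```python
import numpy as np

x = np.array([-2.0, -1.0, 0.0, 1.0, 2.0])
y = x ** 2  # y depends entirely on x, but not linearly

cov = np.cov(x, y, ddof=1)[0, 1]
# cov is 0 — no *linear* co-movement despite full dependence
```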

5
Q

when should we not use r?

A
  • when two variables have a non-linear relationship
  • observations aren’t independent
  • outliers exist
  • homoscedasticity is violated
  • the sample size is very small
  • one or both variables are not measured on a continuous scale
6
Q

point-biserial correlation

A

binary & continuous variables
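SciPy exposes this directly via `scipy.stats.pointbiserialr`; the data below are invented for illustration:

```python
import numpy as np
from scipy import stats

group = np.array([0, 0, 0, 1, 1, 1])               # binary variable (e.g. control vs. treatment)
score = np.array([2.0, 3.0, 2.5, 5.0, 5.5, 6.0])   # continuous variable

r_pb, p = stats.pointbiserialr(group, score)
# point-biserial r is numerically just Pearson's r with one binary variable
```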

7
Q

phi coefficient

A

two binary variables

8
Q

spearman’s rho

A
  • two ordinal variables
  • recommended when N > 100
9
Q

kendall’s tau

A
  • two ordinal variables
  • recommended when N < 100
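Both rank correlations are available in SciPy (the two judges' rankings below are invented):

```python
from scipy import stats

# Two judges ranking the same six items (invented ordinal data)
judge_a = [1, 2, 3, 4, 5, 6]
judge_b = [2, 1, 4, 3, 6, 5]

rho, p_rho = stats.spearmanr(judge_a, judge_b)   # Spearman's rho
tau, p_tau = stats.kendalltau(judge_a, judge_b)  # Kendall's tau
# for the same data, tau is typically smaller in magnitude than rho
```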
10
Q

confounder

A

an observed common factor that influences both variables and might explain the association between them

11
Q

lurking factors

A

potential common causes that we don’t measure

12
Q

partial correlation

A

the correlation between two variables after the influence of another variable is removed
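One way to sketch this (simulated data; NumPy assumed): regress x on z and y on z, then correlate the residuals. The helper `partial_corr` below is a hypothetical name for illustration.

```python
import numpy as np

def partial_corr(x, y, z):
    """Correlation between x and y after removing the linear influence of z."""
    rx = x - np.polyval(np.polyfit(z, x, 1), z)  # residuals of x on z
    ry = y - np.polyval(np.polyfit(z, y, 1), z)  # residuals of y on z
    return np.corrcoef(rx, ry)[0, 1]

# z drives both x and y, so their raw correlation is inflated
rng = np.random.default_rng(3)
z = rng.normal(size=100)
x = z + rng.normal(size=100)
y = z + rng.normal(size=100)

raw = np.corrcoef(x, y)[0, 1]
partial = partial_corr(x, y, z)  # near 0 once z is partialled out
```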

13
Q

hypotheses for significance of a correlation coefficient

A
  • H₀: ρ = 0 (ρ = population correlation)
    No linear association between the two variables
  • H₁: ρ ≠ 0 Linear association between variables
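In practice the test comes packaged with the estimate: `scipy.stats.pearsonr` returns both r and the two-sided p-value for H₀: ρ = 0 (data simulated for this sketch):

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
x = rng.normal(size=50)
y = x + 0.1 * rng.normal(size=50)  # strong linear relationship by construction

r, p = stats.pearsonr(x, y)
# small p => reject H0: rho = 0 and conclude a linear association
```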
14
Q

simple linear regression

A
  • used to study an asymmetric linear relationship between x and y
  • describes how the DV changes as a single IV changes
15
Q

β

A

the slope: the effect of x on y (the expected change in y per one-unit change in x)

16
Q

linear regression equation

A
  • Ŷ = ɑ + βX
  • Ŷ = predicted line
  • ɑ = intercept
  • β = slope
17
Q

method of least squares

A
  • makes the sum of the squares of the vertical distances of the data points from the line as small as possible
  • minimizes ss (error)
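Assuming NumPy and made-up data, the least-squares estimates have a simple closed form:

```python
import numpy as np

x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([2.0, 4.0, 5.0, 4.0, 6.0])

# Least-squares solution: beta = cov(x, y) / var(x); alpha = y_bar - beta * x_bar
beta = np.cov(x, y, ddof=1)[0, 1] / np.var(x, ddof=1)
alpha = y.mean() - beta * x.mean()
# beta ≈ 0.8, alpha ≈ 1.8, i.e. Ŷ ≈ 1.8 + 0.8X
```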
18
Q

stating hypotheses for the significance of the slope in simple linear regression

A
  • H₀: β = 0 (There is no linear relationship between x & y)
  • H₁: β ≠ 0 (There is a linear relationship between x & y)
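`scipy.stats.linregress` bundles the fit and this slope test (toy data for the sketch):

```python
from scipy import stats

res = stats.linregress([1, 2, 3, 4, 5], [2, 4, 5, 4, 6])
t = res.slope / res.stderr  # t = sample statistic / standard error
# res.pvalue is the two-sided p-value for H0: beta = 0
```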
19
Q

t-test formula

A

t = sample statistic / standard error

20
Q

standard error

A

the standard deviation of the sampling distribution of a statistic

21
Q

assumptions to apply t-test to slope

A

normality (of residuals) & independence of observations

22
Q

partitioning variance in simple linear regression

A
  • ss (regression) = variation in y explained by the regression line
  • ss (error) = variation in y unexplained by the regression line
23
Q

r² (coefficient of determination)

A

the proportion of total variation in y accounted for by the regression model

24
Q

interpreting r²

A

0 = no explanation at all
1 = perfect explanation
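The partition from card 22 can be verified by hand (same toy data; NumPy assumed):

```python
import numpy as np

x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([2.0, 4.0, 5.0, 4.0, 6.0])

# Fit the least-squares line, then partition the variation in y
beta = np.cov(x, y, ddof=1)[0, 1] / np.var(x, ddof=1)
alpha = y.mean() - beta * x.mean()
y_hat = alpha + beta * x

ss_total = np.sum((y - y.mean()) ** 2)  # total variation
ss_error = np.sum((y - y_hat) ** 2)     # unexplained variation
r2 = 1 - ss_error / ss_total            # proportion explained
# in simple regression, r2 equals (Pearson's r)^2
```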

25
Q

multiple linear regression

A

explains how the DV changes as multiple IVs change

26
Q

regression plane

A

Ŷ = ɑ + β₁X₁ + β₂X₂
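A sketch of fitting that plane by ordinary least squares (simulated data; NumPy assumed):

```python
import numpy as np

rng = np.random.default_rng(1)
n = 30
x1 = rng.normal(size=n)
x2 = rng.normal(size=n)
y = 1.0 + 2.0 * x1 - 0.5 * x2 + 0.1 * rng.normal(size=n)  # true plane + noise

X = np.column_stack([np.ones(n), x1, x2])   # design matrix: intercept, x1, x2
coef, *_ = np.linalg.lstsq(X, y, rcond=None)
# coef ≈ [alpha, beta1, beta2] ≈ [1.0, 2.0, -0.5]
```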

27
Q

how can we compute a and bⱼ?

A

using the least squares method

28
Q

standardized regression coefficient

A
  • the effect of a standardized IV on the standardized DV (z-scores)
  • the change in the standard deviation of the DV that results from a change of one standard deviation in xⱼ
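Standardized coefficients are just the least-squares fit after z-scoring every variable; x2 below is put on a larger raw scale on purpose (all data simulated):

```python
import numpy as np

def standardize(v):
    return (v - v.mean()) / v.std(ddof=1)  # z-scores

rng = np.random.default_rng(2)
n = 40
x1 = rng.normal(size=n)
x2 = 3.0 * rng.normal(size=n)  # much larger raw scale than x1
y = 2.0 * x1 + 1.0 * x2 + rng.normal(size=n)

Z = np.column_stack([standardize(x1), standardize(x2)])
beta_std, *_ = np.linalg.lstsq(Z, standardize(y), rcond=None)
# beta_std[j]: SDs of change in y per one-SD change in x_j
```

Because they are unit-free, standardized coefficients let you compare the relative contribution of IVs measured on different scales.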
29
Q

stating hypotheses for overall significance in multiple regression

A
  • H₀: β₁ = β₂ = … = βⱼ= 0
    (None of the xs are linearly related to y)
  • H₁: at least one coefficient is not 0
    (At least one x is linearly related to y)
30
Q

stating hypotheses for individual regression coefficients in multiple regression

A
  • H₀: βⱼ = 0
    (Xⱼ is not linearly related to y)
  • H₁: βⱼ ≠ 0
    (Xⱼ is linearly related to y)
31
Q

r² in multiple regression

A

If a new IV is added to the model, SS (Error) will always be smaller and SS (Reg) will always become larger, so r² never decreases when another variable is added to the model

32
Q

adjusted r²

A

if no substantial increase in r² is obtained by adding a new IV, adjusted r² tends to decrease
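One common form of the penalty, for n observations and p predictors, is adjusted r² = 1 − (1 − r²)(n − 1)/(n − p − 1); a sketch:

```python
def adjusted_r2(r2, n, p):
    """Adjusted r^2: penalizes r2 for the number of predictors p."""
    return 1 - (1 - r2) * (n - 1) / (n - p - 1)

# Adding a near-useless predictor (r2 barely moves) lowers adjusted r2
base = adjusted_r2(0.50, n=100, p=3)
more = adjusted_r2(0.501, n=100, p=4)  # tiny gain in r2, one more IV
```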

33
Q

f-test vs. t-test in linear regression

A
  • simple: f-test ≡ t-test for the slope (F = t²)
  • multiple: overall significance can only be tested with the f-test
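The equivalence in the simple case can be checked numerically on toy data, where the model F statistic comes out exactly equal to t² for the slope:

```python
import numpy as np
from scipy import stats

x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([2.0, 4.0, 5.0, 4.0, 6.0])

res = stats.linregress(x, y)
t = res.slope / res.stderr                 # t-test for the slope

y_hat = res.intercept + res.slope * x
ss_reg = np.sum((y_hat - y.mean()) ** 2)   # regression SS, 1 df (one predictor)
ss_err = np.sum((y - y_hat) ** 2)          # error SS, n - 2 df
F = (ss_reg / 1) / (ss_err / (len(x) - 2))
# F == t**2, so the two tests agree in simple regression
```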