Multiple Linear Regression (MLR) Flashcards

1
Q

How can we detect variables that co-vary

A

scatter plots; caution over cause and effect; always think of a potential third variable (see the sketch below)
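A minimal sketch in Python of the scatter-plot check; all data and numbers are made up for illustration, not taken from the card:

```python
import numpy as np
import matplotlib.pyplot as plt

# Toy data (hypothetical): two variables that co-vary
rng = np.random.default_rng(7)
x = rng.normal(size=50)
y = 0.6 * x + rng.normal(scale=0.8, size=50)

# The scatter plot is the first check for co-varying variables;
# the covariance is a numeric summary of the same pattern
print(np.cov(x, y, ddof=1)[0, 1])
plt.scatter(x, y)
plt.xlabel("X")
plt.ylabel("Y")
plt.show()
```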

2
Q

Statistical analysis of covariation CAN'T distinguish between which kinds of explanation?

A

spurious, causal, and common-process explanations; statistical analysis of covariation alone can't tell these apart

3
Q

Covariance doesn’t tell us about _________. So always use _________________

A

independence; scatter plots

4
Q

Covariance is _______________ sensitive, so we use the _________________ values in place of the raw data. This makes all scales have mean _________ and standard deviation ____, and the result is called _________________________________

A

size; standardised; 0; 1; Pearson's product-moment correlation coefficient
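A minimal sketch in Python (toy numbers, not from the card) showing that correlating standardised values gives Pearson's r:

```python
import numpy as np

# Toy data (hypothetical values, purely for illustration)
x = np.array([2.0, 4.0, 6.0, 8.0, 10.0])
y = np.array([1.0, 3.0, 2.0, 5.0, 4.0])

# Standardise: subtract the mean and divide by the standard deviation,
# so each variable has mean 0 and standard deviation 1
zx = (x - x.mean()) / x.std(ddof=1)
zy = (y - y.mean()) / y.std(ddof=1)

# Pearson's r is the sum of products of standardised scores over n - 1
r = np.sum(zx * zy) / (len(x) - 1)
print(r, np.corrcoef(x, y)[0, 1])  # the two values agree
```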

5
Q

what is effect size

A

a quantitative measure that allows comparisons between studies; it is given by r squared,
aka the coefficient of determination
r squared is the proportion of variance in one variable that is explained by the other (e.g., r = 0.5 gives r^2 = 0.25, so 25% of the variance is explained)

6
Q

how to find the slope of regression

A

find the slope that gives the minimum error variance: the least squares approach to regression
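A minimal sketch in Python (made-up data) of the least-squares slope; the closed form b = cov(x, y) / var(x) gives the slope that minimises the error variance:

```python
import numpy as np

# Toy data (hypothetical, for illustration only)
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([2.1, 3.9, 6.2, 8.1, 9.8])

# The least-squares slope minimises the sum of squared errors;
# its closed form is b = cov(x, y) / var(x), with a = ybar - b * xbar
b = np.cov(x, y, ddof=1)[0, 1] / np.var(x, ddof=1)
a = y.mean() - b * x.mean()
print(a, b)
```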

7
Q

what is partial correlation

A

the amount of variance in X3 that is related to X1, and X1 alone
'How much of the X3 variance that is not explained by other variables is explained by X1?'
'partial out' the variability explained by X2
then look at how much of the remaining variability in X3 is explained by the remaining variability in X1
take out that variability using simple linear regression
a pure measure, uncontaminated by other variables
the uniqueness of the variance makes it theoretically simpler
can reveal hidden relationships (see the sketch below)
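A minimal sketch in Python (simulated data; the generating coefficients are arbitrary assumptions) computing the partial correlation r13.2 by residualising both X1 and X3 on X2:

```python
import numpy as np

# Simulated toy data (hypothetical effect sizes)
rng = np.random.default_rng(0)
x2 = rng.normal(size=100)
x1 = 0.5 * x2 + rng.normal(size=100)
x3 = 0.7 * x1 - 0.4 * x2 + rng.normal(size=100)

def residuals(y, x):
    # What is left of y after simple linear regression on x
    b = np.cov(x, y, ddof=1)[0, 1] / np.var(x, ddof=1)
    a = y.mean() - b * x.mean()
    return y - (a + b * x)

# Partial correlation r13.2: correlate what remains of X1 and of X3
# once the X2-related variability has been 'partialled out' of both
r13_2 = np.corrcoef(residuals(x1, x2), residuals(x3, x2))[0, 1]
print(r13_2)
```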

8
Q

what is semipartial correlation

A

how much of the total variance of X3 does X1, and X1 alone, explain
a more intuitive baseline
allows easy comparison of coefficients because the baseline (the total variance of X3) is constant (see the sketch below)
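The same simulated setup as the partial-correlation sketch above; the only change is that X3 is left unresidualised, so its total variance stays the baseline:

```python
import numpy as np

# Same simulated toy data as the partial-correlation sketch
rng = np.random.default_rng(0)
x2 = rng.normal(size=100)
x1 = 0.5 * x2 + rng.normal(size=100)
x3 = 0.7 * x1 - 0.4 * x2 + rng.normal(size=100)

def residuals(y, x):
    b = np.cov(x, y, ddof=1)[0, 1] / np.var(x, ddof=1)
    a = y.mean() - b * x.mean()
    return y - (a + b * x)

# Semipartial correlation: residualise only the predictor X1;
# X3 keeps its total variance as the comparison baseline
sr13_2 = np.corrcoef(residuals(x1, x2), x3)[0, 1]
print(sr13_2)
```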

9
Q

Give an example of a misleading schematic (Venn diagram) illustrating partial correlation

A
  1. No correlation (r = 0) between desirability (X1) and frequency of buying (X3)
  2. Positive correlation (r = 0.5, r^2 = 0.25) between amount of pocket money (X2) and buying frequency (X3)
  3. Negative correlation (r = -0.4, r^2 = 0.16) between desirability (X1) and pocket money (X2)
  4. From the Venn diagram, the partial correlation r13.2 should be zero
  5. But calculating it (checked numerically below): r13.2 = (r13 - r12*r32) / sqrt((1 - r12^2)(1 - r32^2)) = 0.252
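A quick numeric check of that formula in Python, with the r values taken straight from the card:

```python
import math

# r13 = 0, r12 = -0.4, r32 = 0.5, as in the example above
r13, r12, r32 = 0.0, -0.4, 0.5
r13_2 = (r13 - r12 * r32) / math.sqrt((1 - r12**2) * (1 - r32**2))
print(round(r13_2, 3))  # 0.252, not zero as the Venn diagram suggests
```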
10
Q

partial correlation uncovers hidden relationships when ____________________________

A

two components / factors influence a third in opposite ways

11
Q

How to regard ANOVA as regression

A

In ANOVA, we look to see if 'knowing' the level of the factor (IV) explains variability in the DV
In regression, we look to see if 'knowing' the score on the IV explains variance in the DV
ANOVA talks about level means, e.g., 3 different levels of coffee consumption (1 cup, 2 cups, ...); to do regression, take a measure of how much coffee each person had, and now you have a continuous variable
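A minimal sketch in Python of the equivalence (the scores are invented): dummy-code the factor levels and fit by least squares, and the regression R^2 equals the ANOVA eta^2:

```python
import numpy as np

# Toy data (hypothetical): DV scores for 3 coffee-consumption levels
y = np.array([4.0, 5.0, 3.5, 6.0, 7.0, 6.5, 8.0, 9.0, 8.5])
group = np.array([0, 0, 0, 1, 1, 1, 2, 2, 2])  # 1 cup, 2 cups, 3 cups

# Dummy-code the levels and fit by least squares: the fitted values
# are the level means, so R^2 = SSbetween / SStotal = eta^2
X = np.column_stack([np.ones_like(y),
                     (group == 1).astype(float),
                     (group == 2).astype(float)])
beta, *_ = np.linalg.lstsq(X, y, rcond=None)
r2 = 1 - np.sum((y - X @ beta) ** 2) / np.sum((y - y.mean()) ** 2)
print(beta, r2)
```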

12
Q

Why can ANCOVA be seen as a kind of partial correlation

A

In ANOVA: we have a DV and a factor; the factor explains some of the DV variance, and the unexplained variance is the SSerror
The idea of a covariate is to 'remove' unwanted variance in the DV: some DV variance is attributed to the CV and hence removed from the SSerror
The covariate plays the role of X2 here

13
Q
Multiple linear regression describes the relationship between _____ variables
Have __________ predictors (X1, X2...) and __________ DV (Y)
Effect size (R^2) is the proportion of variance of Y (the DV) explained by knowing __________________
A

multiple variables
2 or more predictors, 1 DV
all of the predictors

14
Q

Multiple linear regression
with 1 predictor, describe data as _______
with 2 predictors, describe data as ___________
use ___________ method with k predictors

A

1 predictor, describe data as a line
2 predictors, describe data as a surface
use least squares method with k predictors
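A minimal sketch in Python (simulated data with arbitrary true betas) of the least-squares fit with 2 predictors, i.e., describing the data as a surface:

```python
import numpy as np

# Simulated toy data: two predictors, one DV
rng = np.random.default_rng(1)
x1 = rng.normal(size=50)
x2 = rng.normal(size=50)
y = 1.0 + 2.0 * x1 - 0.5 * x2 + rng.normal(size=50)

# Least squares with k predictors: with 2 predictors the
# fitted model describes the data as a plane (a surface)
X = np.column_stack([np.ones(50), x1, x2])
beta, *_ = np.linalg.lstsq(X, y, rcond=None)
print(beta)  # approximately [1.0, 2.0, -0.5]
```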

15
Q

Curvilinear regression

with single predictor, __________________

A

with single predictor, restricted range of curves

16
Q

Curvilinear regression
MLR allows for multiple predictors
e.g., can use X^2, X and a constant
the same predictor appears as X and as X^2. In MLR you are fitting your data from a _________ with a _____________, but your predictors might not be independent; here you are fitting your data with a curve

A

fitting your data from a plane with a line, but your predictors might not be independent (see the sketch below)
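A minimal sketch in Python (simulated curvilinear data, coefficients made up) treating X and X^2 as two ordinary MLR predictors:

```python
import numpy as np

# Simulated toy data with a curvilinear trend
rng = np.random.default_rng(2)
x = np.linspace(-3, 3, 30)
y = 1.0 + 0.5 * x + 2.0 * x**2 + rng.normal(scale=2.0, size=30)

# X and X^2 enter as two MLR predictors: the model is linear in
# the betas even though the fitted curve bends in the (x, y) plane
X = np.column_stack([np.ones_like(x), x, x**2])
beta, *_ = np.linalg.lstsq(X, y, rcond=None)
print(beta)  # approximately [1.0, 0.5, 2.0]
```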

17
Q

What is polynomial regression

A
  1. you fit equations with increasing powers of X
  2. each increase in the order adds a point of inflection (bend) to the curve
  3. as the number of predictors increases, the fit will get better
  4. caution: risk of overfitting
18
Q

What is overfitting? Why not use the Lagrange polynomial

A

1. when the number of betas is the same as the number of data points, the polynomial fit is exact (when all the X's are different): it goes through every data point with R^2 = 1
2. the Lagrange polynomial fits the noise:
we get extremely small or large values between the data points (see the sketch below)
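A minimal sketch in Python (made-up points) comparing an exact Lagrange-style fit with a straight line; the interesting part is how each behaves between the observations:

```python
import numpy as np

# Toy data (hypothetical): 8 noisy points from a straight line
rng = np.random.default_rng(3)
x = np.linspace(0.0, 7.0, 8)
y = 1.0 + 0.5 * x + rng.normal(scale=0.5, size=8)

# A degree-7 polynomial through 8 points is exact (R^2 = 1),
# but it fits the noise and can swing wildly between the points
exact = np.polyfit(x, y, 7)
line = np.polyfit(x, y, 1)
mid = 0.5  # a value between two observations
print(np.polyval(exact, mid), np.polyval(line, mid))
```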

19
Q

when does overfitting occur

A
  1. when regression is fitting noise rather than the underlying trends
20
Q

How to avoid overfitting

A
  1. build up the model step-by-step (Y = a + bX -> Y = a + bX + cX^2, etc.)
  2. each additional term (beta) costs a degree of freedom
    so we can ask if the additional term explains a significant amount of 'extra' variance
  3. calculate Fchange = change in SS / MSerror (see the sketch after this list)
  4. plot R^2 against polynomial order; stop when the extra term no longer gives a significant improvement (p > 0.05, or another significance level)
  5. can get different results if several variables are added at once (several non-significant improvements can, overall, indicate a significant improvement)
  6. use the change in R^2 (Fchange) as a way of evaluating extra predictors
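A minimal sketch in Python of the Fchange calculation when adding an X^2 term; the data are simulated, and taking the MSerror (and its n - 3 df) from the larger model is an assumption, since the card only gives Fchange = change in SS / MSerror:

```python
import numpy as np

# Simulated toy data with a genuine quadratic component
rng = np.random.default_rng(4)
x = np.linspace(-2, 2, 40)
y = 1.0 + 0.8 * x + 1.5 * x**2 + rng.normal(size=40)

def sse(y, X):
    # Error sum of squares after a least-squares fit
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    return np.sum((y - X @ beta) ** 2)

n = len(y)
X1 = np.column_stack([np.ones(n), x])        # Y = a + bX
X2 = np.column_stack([np.ones(n), x, x**2])  # Y = a + bX + cX^2

ms_error = sse(y, X2) / (n - 3)              # MSerror of the larger model
f_change = (sse(y, X1) - sse(y, X2)) / ms_error  # change in SS / MSerror
print(f_change)  # compare against F(1, n - 3)
```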
21
Q

Compression using regression

Number of ___________ is typically much less than the number of _______________

A

Number of parameters is typically much less than the number of data points

22
Q

what is the disadvantage of regression

A

compression of the data leads to loss of information (variation) in the data
there are also open choices: which equation to use,
and how many parameters to keep

23
Q

what are the assumptions of regression

A
  1. uses a pooled estimate of error variance
    assumes independently and identically distributed (iid) errors over the predictor variable
  2. need to check the error deviations
    look at a plot of 'residuals' against the values of the different X's
    the sign of a good model is that the residuals look random (see the sketch below)
    Residual = distance from the regression surface to the data point
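A minimal sketch in Python (simulated data) of the residual check:

```python
import numpy as np
import matplotlib.pyplot as plt

# Simulated toy data: check the residuals after a straight-line fit
rng = np.random.default_rng(5)
x = np.linspace(0, 10, 60)
y = 2.0 + 0.7 * x + rng.normal(scale=1.0, size=60)

b, a = np.polyfit(x, y, 1)   # slope, intercept
residuals = y - (a + b * x)

# A good model leaves residuals scattered randomly around zero,
# with no trend or funnel shape as X changes
plt.scatter(x, residuals)
plt.axhline(0.0, color="grey")
plt.xlabel("X")
plt.ylabel("residual")
plt.show()
```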
24
Q

How to report regression

A

reporting a regression analysis is all about the coefficients
so a standardised slope might be useful
the beta values (an annoyingly subtle name change from b) give the standardised slopes (see the sketch below)
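A minimal sketch in Python (simulated data) showing that fitting standardised scores yields the beta value, which for a single predictor equals b * SDx / SDy:

```python
import numpy as np

# Simulated toy data (arbitrary scale for the predictor)
rng = np.random.default_rng(6)
x = rng.normal(loc=50, scale=10, size=80)
y = 3.0 + 0.2 * x + rng.normal(size=80)

# Standardising both variables first makes the fitted slope the beta value
zx = (x - x.mean()) / x.std(ddof=1)
zy = (y - y.mean()) / y.std(ddof=1)
beta = np.polyfit(zx, zy, 1)[0]  # standardised slope

b = np.polyfit(x, y, 1)[0]       # raw slope, for comparison
print(beta, b * x.std(ddof=1) / y.std(ddof=1))  # these agree
```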