Multiple Linear Regression (MLR) Flashcards
How can we detect variables that co-vary?
scatter plots; be cautious about inferring cause or effect; always think of a potential third variable
statistical analysis of covariation CAN'T distinguish between spurious, causal, and common-process explanations
Covariance doesn’t tell us about _________. So always use _________________
independence; scatter plots
Covariance is _______________ sensitive, so we use the _________________ values in place of the raw data. This makes all scales have mean _____ and standard deviation _____, and the result is called _________________________________
size; standardised; 0; 1; Pearson's product-moment correlation coefficient
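A minimal sketch in Python (the data are invented, not from the cards) of how standardising both variables turns covariance into Pearson's r:

```python
import numpy as np

# Standardise both variables to mean 0, sd 1, then average the products
# of the z-scores -- that average is Pearson's product-moment r.
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([2.1, 3.9, 6.2, 8.1, 9.8])

zx = (x - x.mean()) / x.std()   # standardised: mean 0, sd 1
zy = (y - y.mean()) / y.std()
r = np.mean(zx * zy)            # Pearson's r
print(r, r**2)                  # r and the effect size r^2
# cross-check: np.corrcoef(x, y)[0, 1] gives the same r
```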
What is effect size?
a quantitative measure that allows comparisons between studies; it is given by r squared
aka coefficient of determination
r squared is the proportion of variance that one variable explains in another
How do we find the slope of the regression line?
find the slope that gives the minimum error variance: the least squares approach to regression
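A rough illustration (invented data) of the least-squares slope, using the standard result that the error-minimising slope is cov(X, Y) / var(X):

```python
import numpy as np

# Least-squares fit: the slope that minimises the error variance.
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([2.0, 4.1, 5.9, 8.2, 9.9])

b = np.cov(x, y)[0, 1] / np.var(x, ddof=1)  # slope = cov(x, y) / var(x)
a = y.mean() - b * x.mean()                 # intercept from the means
print(a, b)                                 # agrees with np.polyfit(x, y, 1)
```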
What is partial correlation?
the amount of variance in X3 that is related to X1 and X1 alone
‘How much of the X3 variance that is not explained by other variables is explained by X1’
‘partial out’ the variability explained by X2
look at how much remaining variability in X3 is explained by knowing the remaining variability in X1
take out the variability using simple linear regression
a pure measure uncontaminated by other variables
uniqueness of variance makes the interpretation theoretically simpler
reveal hidden relationships
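As a sketch of the idea (numpy-based, not from the cards), partial correlation can be computed by correlating the residuals left after regressing both X1 and X3 on X2:

```python
import numpy as np

def partial_corr(x1, x3, x2):
    """r(13.2): correlate what remains of X1 and of X3 after simple
    linear regression has 'partialled out' the variability due to X2."""
    def resid(target, predictor):
        coeffs = np.polyfit(predictor, target, 1)      # target = a + b*predictor
        return target - np.polyval(coeffs, predictor)  # keep the unexplained part
    return np.corrcoef(resid(x1, x2), resid(x3, x2))[0, 1]
```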
What is semipartial correlation?
how much of the total variance of X3 does X1 and X1 alone explain
a more intuitive baseline
allows easy comparison of coefficients because X3 is constant
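A companion sketch under the same assumptions as above: for the semipartial correlation only X1 is residualised, while X3 keeps its total variance:

```python
import numpy as np

def semipartial_corr(x1, x3, x2):
    """Semipartial correlation: only X1 is residualised on X2; X3 keeps
    its total variance, so every predictor shares the same baseline and
    coefficients are easy to compare."""
    coeffs = np.polyfit(x2, x1, 1)
    x1_resid = x1 - np.polyval(coeffs, x2)  # X1 with X2's share removed
    return np.corrcoef(x1_resid, x3)[0, 1]
```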
Give an example of a misleading schematic illustrating partial correlation
- No correlation (r=0) between desirability (X1) and frequency of buying (X3)
- Positive correlation (r=0.5, r^2=0.25) between amount of pocket money (X2) and buying frequency (X3)
- Negative correlation (r=-0.4, r^2=0.16) between desirability (X1) and pocket money (X2)
- from the Venn diagram, the partial correlation r13.2 should be zero
- but calculating it:
r13.2 = (r13 - r12*r32) / sqrt((1 - r12^2)(1 - r32^2)) = (0 - (-0.4)(0.5)) / sqrt((1 - 0.16)(1 - 0.25)) = 0.252
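A quick numerical check of the card's calculation:

```python
import numpy as np

# Plug the card's correlations into the partial-correlation formula.
r13, r12, r32 = 0.0, -0.4, 0.5
r13_2 = (r13 - r12 * r32) / np.sqrt((1 - r12**2) * (1 - r32**2))
print(round(r13_2, 3))   # 0.252 -- non-zero even though r13 = 0
```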
partial correlation uncovers hidden relationships when ____________________________
two components / factors influence a third in opposite ways
How can ANOVA be regarded as regression?
In ANOVA, looking to see if ‘knowing’ the level of the factor (IV) explains variability of the DV
In regression, looking to see if ‘knowing’ the score of the IV explains var of DV
ANOVA talks about a level mean, e.g., 3 different levels of coffee consumption (1 cup, 2 cups, ...); to do regression, you take a measure of how much coffee each person had, and now you have a continuous variable
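A toy illustration (hypothetical coffee data): the regression line's predictions at each level play the role of ANOVA's level means:

```python
import numpy as np

# ANOVA would compare the means of the 1/2/3-cup levels; regression fits
# a line to the same measurements treated as a continuous IV.
cups = np.array([1, 1, 2, 2, 3, 3], dtype=float)   # IV: cups of coffee
score = np.array([4.0, 5.0, 6.5, 7.0, 8.5, 9.5])   # DV

b, a = np.polyfit(cups, score, 1)                  # slope, intercept
print(a + b * np.array([1.0, 2.0, 3.0]))           # the line's predicted level means
```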
Why can ANCOVA be seen as a kind of partial correlation?
In ANOVA we have a DV and a factor; the factor explains some variance of the DV, and the unexplained variance is the SSerror
The idea of a covariate is to 'remove' unwanted variance in the DV: some DV variance is attributed to the CV and hence removed from the SSerror
The covariate is the X2 here
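A minimal sketch (invented data) of the "remove the covariate's share" idea:

```python
import numpy as np

# Regress the DV on the covariate first -- exactly like partialling out
# X2 -- and keep the residual DV variance for the ANOVA step.
dv = np.array([4.0, 5.5, 6.0, 7.5, 8.0, 9.5])
cv = np.array([1.0, 2.0, 2.5, 3.5, 4.0, 5.0])   # covariate, the 'X2'

coeffs = np.polyfit(cv, dv, 1)
dv_adj = dv - np.polyval(coeffs, cv)            # DV variance not due to CV
print(np.var(dv_adj) < np.var(dv))              # True: error variance has shrunk
```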
Multiple linear regression describes the relationship between __________ variables.
Have __________ predictors (X1, X2, ...) and __________ DV (Y).
Effect size (R^2) is the proportion of variance of Y (the DV) explained by knowing __________
multiple variables
2 or more predictors, 1 DV
all of the predictors
Multiple linear regression
with 1 predictor, describe data as _______
with 2 predictors, describe data as ___________
use ___________ method with k predictors
1 predictor, describe data as a line
2 predictors, describe data as a surface
use least squares method with k predictors
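A short numpy sketch (coefficients invented for the demo) of least squares with k predictors via the design-matrix form:

```python
import numpy as np

# The same least-squares machinery fits a line (k=1), a surface (k=2),
# or any k predictors.
rng = np.random.default_rng(0)
X = rng.normal(size=(30, 2))                    # two predictors X1, X2
y = 1.0 + 2.0 * X[:, 0] - 0.5 * X[:, 1] + rng.normal(scale=0.1, size=30)

A = np.column_stack([np.ones(len(y)), X])       # prepend an intercept column
betas, *_ = np.linalg.lstsq(A, y, rcond=None)
print(betas)                                    # ~ [1.0, 2.0, -0.5]
```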
Curvilinear regression
with a single predictor, __________________
with a single predictor, you get a restricted range of curves
Curvilinear regression
MLR allows for multiple predictors
e.g., can use X^2, X and a constant
same predictor (X) and X^2. In MLR you are fitting your data from a _________ with a _____________, but your predictors might not be independent; here you are fitting your data with a curve
fitting your data from a plane with a line, but your predictors might not be independent
What is polynomial regression
- you fit equations with increasing powers of X
- each increase in the order adds a point of inflection (bend) to the curve
- as the number of predictors increases, the fit will get better
- caution of overfitting
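A sketch of that caution (synthetic data): R^2 keeps rising with polynomial order even after the true order is passed:

```python
import numpy as np

# Fit increasing polynomial orders to the same data; the true order is 2,
# yet R^2 keeps creeping up -- hence the warning about overfitting.
rng = np.random.default_rng(1)
x = np.linspace(0.0, 1.0, 20)
y = 2 * x - 3 * x**2 + rng.normal(scale=0.05, size=20)

for order in range(1, 6):
    resid = y - np.polyval(np.polyfit(x, y, order), x)
    r2 = 1 - resid.var() / y.var()
    print(order, round(r2, 4))
```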
What is overfitting? Why not use the Lagrange polynomial?
1. When the number of betas is the same as the number of data points, the polynomial fit will be exact (when all the X's are different): it goes through every data point with R^2 = 1
2. The Lagrange polynomial fits the noise:
we get extremely small or large values between the data points
When does overfitting occur?
- when regression is fitting noise rather than the underlying trends
How to avoid overfitting
- build up the model step-by-step (Y = a + bX -> Y = a + bX + cX^2, etc.)
- each additional term (beta) costs a degree of freedom
- so we can ask if the additional term explains a significant amount of 'extra' variance: calculate Fchange = (change in SS) / MSerror
- Plot R^2 against polynomial order; stop when the extra term is no longer significant (p > 0.05, or another chosen significance level)
- Can get different results if you add several variables at once (several non-significant improvements can, overall, indicate a significant improvement)
- Use change in R^2 (Fchange) as a way of evaluating extra predictors
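A hedged sketch of the Fchange computation (numpy/scipy; the function name and interface are my own, not from the cards):

```python
import numpy as np
from scipy import stats

def f_change(y, X_small, X_big):
    """F for the extra predictor(s): (drop in SSerror per extra df)
    divided by the MSerror of the bigger model."""
    def fit_ss(X):
        A = np.column_stack([np.ones(len(y)), X])
        beta, *_ = np.linalg.lstsq(A, y, rcond=None)
        return np.sum((y - A @ beta) ** 2), A.shape[1]
    ss_s, k_s = fit_ss(X_small)
    ss_b, k_b = fit_ss(X_big)
    df1, df2 = k_b - k_s, len(y) - k_b
    F = ((ss_s - ss_b) / df1) / (ss_b / df2)
    return F, stats.f.sf(F, df1, df2)   # F-change and its p-value
```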
Compression using regression
Number of ___________ is typically much less than the number of _______________
Number of parameters is typically much less than the number of data points
What is the disadvantage of regression?
compression of the data leads to loss of information (variation) in the data
which equation?
how many parameters?
What are the assumptions of regression?
- uses a pooled estimate of error variance
assumes errors are independently and identically distributed (iid) across the predictor variables - need to check the error deviations
look at plot of ‘residuals’ against the values of the different X’s
a sign of a good model: residuals should look random
Residual = distance from regression surface to the data point
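A quick residual-check sketch with invented data (matplotlib used for the plot):

```python
import numpy as np
import matplotlib.pyplot as plt

# Plot residuals against X and look for structure: a shapeless cloud
# around zero supports the iid assumption.
rng = np.random.default_rng(2)
x = np.linspace(0.0, 10.0, 50)
y = 3.0 + 0.5 * x + rng.normal(scale=0.3, size=50)

residuals = y - np.polyval(np.polyfit(x, y, 1), x)  # data minus fitted line

plt.scatter(x, residuals)
plt.axhline(0.0, linestyle="--")
plt.xlabel("X")
plt.ylabel("residual")
plt.show()
```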
How to report regression
reporting regression analysis is all about the coefficients
so a standardised slope might be useful
the beta values (annoyingly subtle name change) give standardised slopes
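A final sketch (my own helper, not a standard API): z-score the predictors and the DV, refit, and the slopes become the standardised betas:

```python
import numpy as np

def standardised_betas(X, y):
    """Beta (standardised) slopes: z-score predictors and DV, then refit.
    The slopes are now comparable across predictors on different scales."""
    Z = (X - X.mean(axis=0)) / X.std(axis=0)
    zy = (y - y.mean()) / y.std()
    betas, *_ = np.linalg.lstsq(Z, zy, rcond=None)  # intercept is 0 after z-scoring
    return betas
```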