6. Regression - Categorical variables and BEYOND!!! Flashcards

1
Q

For a variable with g levels, how many dummy variables do we need to capture all the levels?

A

g-1
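
A minimal sketch of why g - 1 columns are enough, assuming a made-up three-level variable. The level names and the helper `dummy_code` are hypothetical, chosen only for illustration; the reference group is the one coded 0 on every dummy.

```python
def dummy_code(values, reference):
    """Return (levels, rows): one 0/1 column per non-reference level."""
    levels = sorted(set(values) - {reference})
    rows = [[1 if v == level else 0 for level in levels] for v in values]
    return levels, rows

# Hypothetical variable with g = 3 levels; "control" is the reference group.
groups = ["control", "drug_a", "drug_b", "drug_a", "control"]
levels, rows = dummy_code(groups, reference="control")

print(levels)   # ['drug_a', 'drug_b']  -> g - 1 = 2 dummy variables
for g, r in zip(groups, rows):
    print(g, r)
```

Every "control" row comes out as `[0, 0]`, so the third level needs no column of its own.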

2
Q

With dummy coded variables, what does the constant represent (assuming nothing else in model)?

A

The mean of the reference group, i.e. the predicted score for someone coded 0 on all the dummy variables.

3
Q

Do you have to control for type I error when using dummy coded variables in regression?

A

Nope.

4
Q

How must dummy variables be interpreted in conclusions?

A

As a mean difference between the coded group and the reference group.
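
A small sketch of how the constant and the dummy coefficients can be read, assuming made-up scores for three groups and a model with nothing but the dummies in it. It uses NumPy's least-squares solver; the data and variable names are hypothetical.

```python
import numpy as np

# Hypothetical scores for three groups; group 0 is the reference group.
group = np.array([0, 0, 0, 1, 1, 1, 2, 2, 2])
y = np.array([4.0, 5.0, 6.0, 7.0, 8.0, 9.0, 1.0, 2.0, 3.0])

# g - 1 = 2 dummy variables, plus a column of ones for the constant.
d1 = (group == 1).astype(float)
d2 = (group == 2).astype(float)
X = np.column_stack([np.ones_like(y), d1, d2])

# Ordinary least squares: coef = [constant, b for d1, b for d2].
coef, *_ = np.linalg.lstsq(X, y, rcond=None)

means = [y[group == g].mean() for g in range(3)]
print("constant:", coef[0], "reference mean:", means[0])
print("b1:", coef[1], "mean difference:", means[1] - means[0])
print("b2:", coef[2], "mean difference:", means[2] - means[0])
```

Up to floating-point rounding, the constant matches the reference-group mean and each b matches that group's mean minus the reference mean.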

5
Q

Is an interaction effect moderation or mediation?

A

Moderation. The relationship between X1 and Y depends on the level of (or is moderated by) X2.

6
Q

What are the three steps in testing interactions?

A
  1. Calculate an interaction variable
  2. Run sequential MR with three predictors: the original variables and the interaction
  3. Interpret either ΔR² or b for the interaction
7
Q

What is the regression formula for an interaction?

A

Y = a + b1X1 + b2X2 + b3X1X2 + e
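
As a rough illustration of this equation, the sketch below builds a noise-free Y from assumed coefficients (a = 2, b1 = 0.5, b2 = 1, b3 = 1.5, all invented) and recovers them with least squares. It then shows that the slope for X1 is b1 + b3 × X2, which is what "moderated by X2" means in practice.

```python
import numpy as np

# Hypothetical predictors on a small grid.
x1 = np.array([1.0, 2.0, 3.0, 4.0, 1.0, 2.0, 3.0, 4.0])
x2 = np.array([0.0, 0.0, 0.0, 0.0, 1.0, 1.0, 1.0, 1.0])

# Noise-free outcome built from assumed coefficients a=2, b1=0.5, b2=1, b3=1.5.
y = 2.0 + 0.5 * x1 + 1.0 * x2 + 1.5 * x1 * x2

# Design matrix: constant, X1, X2 and the product term X1*X2.
X = np.column_stack([np.ones_like(x1), x1, x2, x1 * x2])
a, b1, b2, b3 = np.linalg.lstsq(X, y, rcond=None)[0]

# The slope of X1 is b1 + b3 * X2, so it changes with the level of X2.
slope_when_x2_is_0 = b1 + b3 * 0
slope_when_x2_is_1 = b1 + b3 * 1
print(slope_when_x2_is_0, slope_when_x2_is_1)
```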

8
Q

What is mean-centring when it comes to continuous variables?

A

Subtracting the mean from each score.

9
Q

What are the two steps in calculating an interaction variable?

A
  1. Mean-centre continuous variables
  2. Multiply the two variables
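
The two steps can be sketched as follows, assuming two made-up continuous predictors (the names `ability` and `motivation` are hypothetical):

```python
from statistics import mean

# Hypothetical raw scores on two continuous predictors.
ability = [90.0, 100.0, 110.0, 120.0]
motivation = [2.0, 4.0, 6.0, 8.0]

# Step 1: mean-centre each continuous variable (subtract its mean).
ability_mean = mean(ability)
motivation_mean = mean(motivation)
ability_c = [a - ability_mean for a in ability]
motivation_c = [m - motivation_mean for m in motivation]

# Step 2: multiply the centred variables to form the interaction term.
interaction = [a * m for a, m in zip(ability_c, motivation_c)]

print(ability_c)      # [-15.0, -5.0, 5.0, 15.0]
print(motivation_c)   # [-3.0, -1.0, 1.0, 3.0]
print(interaction)    # [45.0, 5.0, 5.0, 45.0]
```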

10
Q

Why do you mean-centre?

A

It reduces multicollinearity between the interaction term and the variables it is built from.
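
A quick check of this claim on invented data: the sketch compares the correlation between X1 and the product term before and after centring. The scores are hypothetical and deliberately sit far from zero, which is where the raw product term overlaps most with its components.

```python
import numpy as np

# Hypothetical predictors whose raw scores sit far from zero.
x1 = np.array([10.0, 11.0, 12.0, 13.0, 14.0, 15.0])
x2 = np.array([22.0, 20.0, 25.0, 21.0, 24.0, 23.0])

def corr(a, b):
    """Pearson correlation between two 1-D arrays."""
    return np.corrcoef(a, b)[0, 1]

# Correlation of X1 with the product term, before and after centring.
raw_r = corr(x1, x1 * x2)
x1_c, x2_c = x1 - x1.mean(), x2 - x2.mean()
centred_r = corr(x1_c, x1_c * x2_c)

print(round(raw_r, 3), round(centred_r, 3))
```

For these numbers the raw correlation is high and the centred one is close to zero; how large the drop is will depend on the data.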

11
Q

How do you interpret the constant with mean-centred variables?

A

The constant is easier to interpret when variables are not mean-centred; with centred variables it is the predicted value at a centred score of zero, i.e. at the mean of each predictor. Mean-centring only matters for calculating the interaction, so it is best used just for that.

12
Q

Do you need to mean-centre z-scores to calculate an interaction?

A

Nope, they’re already mean-centred.

13
Q

How do you interpret an interaction between two continuous variables?

A

Well, you can say there is an interaction and it’s significant.

One other option is to break one of the variables into groups, e.g. high, medium and low ability.

14
Q

Which model should be interpreted, with or sans interaction?

A

Depends. If the interaction is significant, interpret that model (Model 2). If it’s not, interpret Model 1 and say you tested an interaction, but it was not significant.

15
Q

What is polynomial regression?

A

Interaction between the continuous IV and itself. Effect of variable X on DV depends on level of variable X.

16
Q

How do you transform a variable to test polynomials? Three steps.

A
  1. Mean-centre the variable
  2. Square it
  3. Enter it as the last block in the equation
17
Q

What’s the quadratic regression equation?

A

Y = a + b1X1 + b2X1² + e
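
A minimal sketch of fitting this equation, assuming a made-up predictor and a noise-free outcome built with an invented negative quadratic coefficient (b2 = -0.5). The sign of the fitted b2 is then used to label the shape of the curve.

```python
import numpy as np

# Hypothetical predictor, mean-centred before squaring.
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0])
xc = x - x.mean()

# Noise-free outcome with an assumed negative quadratic coefficient (b2 = -0.5).
y = 10.0 + 1.0 * xc - 0.5 * xc**2

# Fit Y = a + b1*X + b2*X^2 by ordinary least squares.
X = np.column_stack([np.ones_like(xc), xc, xc**2])
a, b1, b2 = np.linalg.lstsq(X, y, rcond=None)[0]

# Positive b2 bends the curve upwards, negative b2 bends it downwards.
shape = "U-shaped" if b2 > 0 else "inverted U-shaped"
print(round(b2, 3), shape)
```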

18
Q

If the quadratic regression coefficient is positive what shape does that indicate in the data?

A

The curve is U-shaped.

19
Q

If the quadratic regression coefficient is negative what shape does that indicate in the data?

A

The curve is an inverted U shape.

20
Q

If quadratic regression coefficient is very small, what shape does that indicate?

A

Almost a straight line (very little curvature).

21
Q

What does -3.8E-02 mean?

A

-3.8 × 10 to the power of -2, i.e. -0.038
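
For a quick check of this notation, Python reads E-notation directly. A tiny sketch:

```python
# The number after E is the power of ten applied to the number before it.
value = float("-3.8E-02")

print(value)             # -0.038
print(f"{value:.1E}")    # -3.8E-02
```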

22
Q

What’s the difference between moderation and mediation?

A
Moderation = interaction
Mediation = indirect effect - mediating variable
23
Q

What’s the difference between partial and total mediation?

A

Partial mediation is when an introduced mediating variable accounts for part, but not all, of the effect of the IV on the DV.

Total mediation is when an introduced mediating variable accounts for all of the effect of the IV on the DV. The effect of the IV is completely explained by the new variable.

24
Q

What are the four assumptions underlying linear regression?

A
  1. Dependent variable is a linear function of the predictors
  2. Each observation is drawn independently
  3. Homogeneity of variance
  4. Errors are normally distributed with a mean of 0
25
Q

How do you test assumption of independence?

A

Look at the variability in box-plots, broken down by clusters.

26
Q

What is the risk of violating the assumption of independence?

A

You may underestimate the standard error, which increases risk of type I error.

27
Q

How do you test the assumption of linearity?

A

Plot the residuals against the predictor and ask for a Loess line. The line should be straight.

28
Q

What is the assumption of homoscedasticity?

A

That variance of errors is not a function of the predictors, i.e. the variance of errors is constant at all values of X.

29
Q

How do you test the assumption of normality of errors?

A

Use a Q-Q plot:

X-axis - observed values of the residual
Y-axis - expected values of the residual if the residuals are normally distributed

If the scatter is close to the ideal line, residuals are normally distributed.
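
A rough sketch of the numbers behind a Q-Q plot, using only the standard library. The residuals are invented, and the plotting position `(i + 0.5) / n` is one common convention assumed here; statistics packages may use a slightly different one.

```python
from statistics import NormalDist, mean, stdev

# Hypothetical residuals from a fitted regression.
residuals = [-1.8, -0.9, -0.4, -0.1, 0.0, 0.2, 0.5, 0.9, 1.6]

n = len(residuals)
observed = sorted(residuals)

# Expected values if the residuals were normal with the same mean and SD.
normal = NormalDist(mean(residuals), stdev(residuals))
expected = [normal.inv_cdf((i + 0.5) / n) for i in range(n)]

# Pairs close to the line observed = expected suggest normal residuals.
for obs, exp in zip(observed, expected):
    print(f"observed {obs:6.2f}   expected {exp:6.2f}")
```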

30
Q

How do you test for multicollinearity?

A

Test tolerance: ranges from 0 to 1; want a score closer to 1.
Test VIF: 1 or more; want it closer to 1. Keith says 10 is large. (If there are lots of predictors, look for one VIF that stands out from the others.)
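
As a sketch of where these two numbers come from: each predictor is regressed on the other predictors, tolerance is 1 - R² from that regression, and VIF is 1 / tolerance. The helper `tolerance_and_vif` and the data are hypothetical; the third column is deliberately built to be close to the sum of the first two.

```python
import numpy as np

def tolerance_and_vif(X, j):
    """Regress column j on the other columns; tolerance = 1 - R^2, VIF = 1 / tolerance."""
    target = X[:, j]
    others = np.delete(X, j, axis=1)
    design = np.column_stack([np.ones(len(target)), others])
    coef, *_ = np.linalg.lstsq(design, target, rcond=None)
    residuals = target - design @ coef
    r_squared = 1 - residuals.var() / target.var()
    tolerance = 1 - r_squared
    return tolerance, 1 / tolerance

# Hypothetical predictors: the third column is close to the sum of the first two.
X = np.array([
    [1.0, 2.0, 3.1],
    [2.0, 1.0, 2.9],
    [3.0, 4.0, 7.2],
    [4.0, 3.0, 6.8],
    [5.0, 6.0, 11.1],
    [6.0, 5.0, 10.9],
])
for j in range(X.shape[1]):
    tol, vif = tolerance_and_vif(X, j)
    print(j, round(tol, 4), round(vif, 1))
```

With data this redundant, the tolerances sit near 0 and the VIFs are far above 10.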

31
Q

What are the two sources of error in classical test theory?

A
Method error (e.g. warm/cold testing room)
Trait error (due to characteristics of individual)
32
Q

What is the conceptual formula for reliability?

A

True score variance / (True score variance + error variance)

The proportion of variance in observed scores that is accounted for by variance in true scores.
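
The conceptual formula as a one-line function, with made-up variance components (the function name and the numbers are assumptions for illustration):

```python
def reliability(true_variance, error_variance):
    """Conceptual reliability: true-score variance over total observed variance."""
    return true_variance / (true_variance + error_variance)

# Hypothetical variance components for a test.
print(reliability(true_variance=80.0, error_variance=20.0))   # 0.8
```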

33
Q

What is the result in regression of assuming that error-laden predictors are free of error?

A

You underestimate the true effects of the predictors on the dependent variable. R² is lower because there is more scatter of scores around the regression line.
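
A small simulation sketch of this effect for a single predictor, assuming an invented true slope of 2 and measurement error as large as the true-score variance (reliability of about 0.5). The seed is fixed only so the run is repeatable; exact values will differ with other seeds or sample sizes.

```python
import numpy as np

rng = np.random.default_rng(0)   # fixed seed so the sketch is repeatable
n = 5000

# Hypothetical true predictor and an outcome with an assumed true slope of 2.
true_x = rng.normal(0.0, 1.0, n)
y = 2.0 * true_x + rng.normal(0.0, 1.0, n)

# Observed predictor = true score + measurement error (reliability about 0.5).
observed_x = true_x + rng.normal(0.0, 1.0, n)

def slope_and_r2(x, y):
    """Slope and R^2 from a simple regression of y on x."""
    slope = np.cov(x, y)[0, 1] / np.var(x, ddof=1)
    r2 = np.corrcoef(x, y)[0, 1] ** 2
    return slope, r2

print("error-free predictor: ", slope_and_r2(true_x, y))
print("error-laden predictor:", slope_and_r2(observed_x, y))
```

Both the slope and R² come out smaller for the error-laden predictor than for the error-free one.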

34
Q

Regression coefficients in simultaneous regression tell us about __________ effects.

A

Regression coefficients in simultaneous regression tell us about direct effects.

35
Q

Regression coefficient entered last into a sequential model tells us about the __________ effect of that variable.

A

Regression coefficient entered last into a sequential model tells us about the total effect of that variable.

36
Q

What effects does the variable entered last in the last model of sequential regression tell us?

A

About the total AND direct effect of that variable. It assumes the variable has no indirect effect (all shared variance already taken by other variables)

37
Q

What are rectangles and ellipses used to signify in structural equation modelling?

A

Rectangles - observed (manifest) variables

Ellipses - unobserved (latent) variables

38
Q

What are exogenous/endogenous variables?

A

Exogenous: have an arrow coming out (a cause)
Endogenous: have an arrow coming in (an effect)

If a variable has one arrow coming out and another coming in, it is still endogenous.

39
Q

What is the name for residual in structural equation modelling?

A

Disturbance

40
Q

What is the formula for disturbance?

A

Square root of (1 - R²)
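
The formula as a short function, with a made-up R² of 0.36 (the function name is an assumption for illustration):

```python
import math

def disturbance(r_squared):
    """Path from the disturbance to an endogenous variable: sqrt(1 - R^2)."""
    return math.sqrt(1.0 - r_squared)

# Hypothetical R^2 of 0.36 for an endogenous variable.
print(round(disturbance(0.36), 3))   # 0.8
```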

41
Q

What kind of regression do you run in SEM?

A

Simultaneous

42
Q

How do you calculate the indirect effect of a variable on a DV?

A

Multiply the variable’s effect on mediating variable BY the mediating variable’s effect on the DV.

43
Q

How do you calculate the total effect of a variable?

A

Add its direct effect and indirect effect.
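
The last two cards can be sketched together on invented data with one IV (`x`), one mediator (`m`) and one DV (`y`). The helper `ols` is hypothetical, and unstandardised coefficients are used here for simplicity; path diagrams often report standardised ones. The final line checks the sum against a regression of y on x alone.

```python
import numpy as np

def ols(predictors, outcome):
    """Least-squares slopes for the given predictors (intercept fitted, then dropped)."""
    X = np.column_stack([np.ones(len(outcome))] + predictors)
    return np.linalg.lstsq(X, outcome, rcond=None)[0][1:]

# Hypothetical data: x is the IV, m the mediator, y the DV.
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0, 6.0])
m = np.array([2.0, 1.0, 4.0, 3.0, 6.0, 5.0])
y = np.array([3.0, 2.0, 6.0, 6.0, 9.0, 10.0])

(a_path,) = ols([x], m)            # effect of x on the mediator
direct, b_path = ols([x, m], y)    # direct effect of x, and effect of m on y

indirect = a_path * b_path         # multiply the two paths
total = direct + indirect          # add direct and indirect effects

(total_check,) = ols([x], y)       # y regressed on x alone
print(round(indirect, 3), round(total, 3), round(total_check, 3))
```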

44
Q

How do you know which model to report in sequential regression?

A

Report the final statistically significant model (based on the Model Summary table). So if the last model is not statistically significant, report the penultimate one, assuming it is statistically significant.

45
Q

What is a recursive model?

A

A recursive model is one in which causal flow travels in only one direction - i.e. no feedback loops or reciprocal causes.