[L11] Multiple Regression Analysis Flashcards

1
Q

___ (rather than correlation) is the term used when we have
the specific aim of predicting values on a ___ variable (or
target) from a "___ variable".

A

Regression; criterion; predictor

2
Q

The square of the ___ gives us an estimate of
the variance in y explained by variance in x.

A

correlation coefficient
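A minimal sketch of this card's idea in Python (illustrative, made-up data; assumes NumPy):

```python
import numpy as np

# Hypothetical data: hours studied (x) and exam score (y)
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([2.0, 4.5, 5.0, 7.5, 9.0])

# Pearson correlation coefficient r
r = np.corrcoef(x, y)[0, 1]

# r squared: proportion of variance in y explained by variance in x
r_squared = r ** 2

print(r, r_squared)
```

With these made-up numbers r ≈ .99, so roughly 97% of the variance in y is accounted for by variance in x.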

3
Q

Because there is a correlation between x and y, we can, to a
certain extent (depending on the size of r²), predict ___
from x scores.

A

y scores

4
Q

The ___ is the line of best fit placed among the
points in a scatterplot.

A

REGRESSION LINE

5
Q

REGRESSION LINE

On this line will lie all our
___,
symbolized as ŷ, made from our knowledge of x values.

A

predicted values for y

6
Q

The vertical distance between an actual y value & its
associated ŷ value is known as the

A

PREDICTION ERROR.

7
Q

But it is better known as a ____ because it
represents how wrong we are in making the prediction for
that particular case.

A

RESIDUAL

8
Q

The ___, then, is the line that minimizes these
residuals.

A

regression line

9
Q

___ – it is the number of units Ŷ increases
for every unit increase in x.

A

Regression coefficient

10
Q

___ is a constant value. It is the value of Ŷ when x is 0.

A

c, the constant (intercept)
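The line ŷ = bx + c can be fitted by least squares; a minimal sketch with made-up data (assumes NumPy):

```python
import numpy as np

# Hypothetical data: fit the regression line y-hat = b*x + c
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([2.0, 4.5, 5.0, 7.5, 9.0])

# np.polyfit with degree 1 returns [slope, intercept]
b, c = np.polyfit(x, y, 1)

y_hat = b * x + c          # predicted values lying on the regression line
residuals = y - y_hat      # prediction errors (residuals)

print(b, c)
```

Here b is the regression coefficient (units of ŷ per unit of x) and c is the constant: the fitted line is ŷ = 1.7x + 0.5, and the residuals sum to (effectively) zero, which is what "line that minimizes the residuals" delivers.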

11
Q

In regression we also deal with ___ rather than raw
scores.

A

standard scores

12
Q

When scores (x and y) are expressed in standard score form
then the regression coefficient is known as the
__

A

STANDARDIZED REGRESSION COEFFICIENT or
BETA

13
Q

Where there is only one predictor, __ is in fact the correlation
coefficient of x with y.

A

beta
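This single-predictor identity (beta = r) can be checked directly; a short sketch with made-up data, assuming NumPy:

```python
import numpy as np

x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([2.0, 4.5, 5.0, 7.5, 9.0])

# Convert both variables to standard (z) scores
zx = (x - x.mean()) / x.std()
zy = (y - y.mean()) / y.std()

# Slope of the regression of zy on zx = standardized coefficient (beta)
beta = np.polyfit(zx, zy, 1)[0]

# With a single predictor, beta equals Pearson's r
r = np.corrcoef(x, y)[0, 1]
print(beta, r)
```

This equality holds only in the one-predictor case; as the later cards note, with several correlated predictors the betas no longer match each predictor's individual correlation with y.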

14
Q

___
* Can be used when we have a set of variables (x1, x2, x3, etc.),
each of which correlates to some extent with a criterion
variable (y) for which we would like to predict values.

A

MULTIPLE PREDICTIONS

15
Q

___ of two variables (the overlapping, shared
portion of variance) = r²

A

Co-variation

16
Q

Because multiple regression has so much to do with
correlation, it is important that the variables used are
__.

A

continuous

17
Q

That is, they need to incorporate measures on some kind of a
___ scale

A

linear

18
Q

In ___, variables like marital status, where codes 1-4 are
given for single, married, divorced, widowed, etc., cannot be
used.

A

correlation

19
Q

The exception, as with correlation, is the __ variable
which is exhaustive, such as gender.

A

dichotomous

20
Q

Even with these variables, however, it does not make sense to
carry out a __ regression analysis if almost all variables
are dichotomously categorical

A

multiple

21
Q

If almost all variables are dichotomously categorical,
a better procedure would be ___

A

LOGISTIC
REGRESSION

22
Q

__
* Refers to predictor variables that will also correlate
with one another

A

COLLINEARITY

23
Q

If one IV is to be a useful predictor of the DV, independently
of its relationship with another IV (collinearity), we need to
know its unique relationship with the dependent variable.
* This is found using the statistic known as the ___

A

SEMI-PARTIAL
CORRELATION COEFFICIENT

24
Q

___is a way of partialling out the
effect of a third variable (z) on the correlation between two
variables, x and y.

A

PARTIAL CORRELATION

25
Q

In __ correlation we take the residuals of only one of
the two variables involved.

A

semi-partial

26
Q

Semi-partial correlation gives us the ___ shared
only between an IV & the DV, with the variance of another
IV partialled out.

A

common variance
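The residualizing step behind semi-partial correlation can be sketched in Python (simulated data with a made-up seed; the variable names are illustrative, and NumPy is assumed):

```python
import numpy as np

rng = np.random.default_rng(0)
x1 = rng.normal(size=200)
x2 = 0.6 * x1 + rng.normal(size=200)          # x2 correlates with x1
y = 1.0 * x1 + 0.5 * x2 + rng.normal(size=200)

# Residualize x2 on x1: remove x1's influence from x2 ONLY
# (partial correlation would also residualize y; semi-partial does not)
b, c = np.polyfit(x1, x2, 1)
x2_resid = x2 - (b * x1 + c)

# Semi-partial correlation: y (untouched) with the residualized x2
sr = np.corrcoef(y, x2_resid)[0, 1]
print(sr)
```

By construction the residualized x2 is uncorrelated with x1, so sr reflects only the variance x2 uniquely shares with y.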

27
Q

Remember that the explained variance is found by ___

A

squaring the
regression coefficient.

28
Q

Now imagine that for each predictor variable a regression
coefficient is found that, when squared, gives us the unique
variance in the DV explained by that predictor on its own, with
the effect of all other predictors partialled out.

In this way we can improve our prediction of the variance in a
dependent variable by adding in ___, in addition to any
variance already explained by other predictors.

A

predictors that explain
variance in y

29
Q

In multiple regression, then, a ___ of one
variable is made using the correlations of other known
variables with it.

A

statistical prediction

30
Q

For the set of predictor variables used, a particular
combination of __is found that maximizes
the amount of variance in y that can be accounted for.

A

regression coefficients

31
Q

In multiple regression, then, there is an ___ that predicts
y, not just from x as in single-predictor regression, but from
the regression coefficients of X1, X2, X3, and so on, where the
Xs are predictor variables whose correlations with y are
known.

A

equation
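The equation ŷ = b0 + b1x1 + b2x2 + b3x3 can be estimated by least squares; a minimal sketch with simulated data (illustrative coefficients and seed; assumes NumPy):

```python
import numpy as np

rng = np.random.default_rng(1)
n = 100
x1 = rng.normal(size=n)
x2 = rng.normal(size=n)
x3 = rng.normal(size=n)
# True model (made up): y = 2 + 1.5*x1 - 0.8*x2 + 0.3*x3 + noise
y = 2.0 + 1.5 * x1 - 0.8 * x2 + 0.3 * x3 + rng.normal(scale=0.5, size=n)

# Design matrix with a leading column of 1s for the constant b0
X = np.column_stack([np.ones(n), x1, x2, x3])

# Least-squares solution: [b0, b1, b2, b3]
coeffs, *_ = np.linalg.lstsq(X, y, rcond=None)

y_hat = X @ coeffs   # predicted criterion values
print(coeffs)
```

The recovered coefficients land close to the generating values, and each bᵢ is the number of units ŷ changes per unit of xᵢ with the other predictors held constant.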

32
Q

bᵢ are the ___

A

regression coefficients for each of the predictors (xi)

33
Q

b₀ is the ___ (c in the simple example earlier)

A

constant

34
Q

These b values are again the ___ increases for
each unit increase in the predictor (xi), if all the other
predictors are held constant.

A

number of units Ŷ

35
Q

However, in this multiple predictor model, when standardized
values are used, the ___are
not the same value as that predictor’s correlation with y on its
own.

A

standardized regression coefficients

36
Q

What is especially important to understand is that, although a
single predictor variable might have a strong individual
correlation with the criterion variable, acting among a set of
predictors it might have a ___

A

very low regression coefficient.

37
Q

In this case the potential contribution of one predictor variable
to explaining variance in the DV has, as it were, already been
mostly used up by another predictor (IV) with which it shares
a lot of ___

A

common variance

38
Q

The multiple regression procedure produces a __, symbolized by R, which
is the overall correlation of the predictors with the criterion
variable.
* In fact, it is the simple correlation between actual y values &
their estimated ŷ values.

A

MULTIPLE
CORRELATION COEFFICIENT

39
Q

The higher R is, the ___ between actual y values &
estimated ŷ.

A

better is the fit

40
Q

The closer R approaches +1, the ___
the differences between actual & estimated values

A

smaller are the residuals –

41
Q

Although R behaves just like any other correlation coefficient,
we are mostly interested in R² since this gives us the
__ in the criterion variable that has been
accounted for by the predictors taken together.

A

proportion of variance
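Both facts on these cards (R is the correlation of y with ŷ; R² is the proportion of variance accounted for) can be verified in one short sketch (simulated, made-up data; assumes NumPy):

```python
import numpy as np

rng = np.random.default_rng(2)
n = 100
x1 = rng.normal(size=n)
x2 = rng.normal(size=n)
y = 1.0 * x1 + 1.0 * x2 + rng.normal(size=n)

X = np.column_stack([np.ones(n), x1, x2])
coeffs, *_ = np.linalg.lstsq(X, y, rcond=None)
y_hat = X @ coeffs

# Multiple R: the simple correlation between actual y and predicted y-hat
R = np.corrcoef(y, y_hat)[0, 1]

# R squared: proportion of variance in y accounted for by the predictors
R_squared = 1 - np.var(y - y_hat) / np.var(y)

print(R, R_squared)
```

The two routes agree: squaring the correlation between y and ŷ gives exactly the variance-accounted-for figure.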

42
Q

This is overall what we set out to do – to find the ___ to account for variance in the
criterion variable.

A

best
combination of predictors

43
Q

To find an R that is significant is __

A

no big deal

44
Q

This is the same point about __ correlations, that their
strength is usually of greater importance than their
significance, which can be misleading.

A

single

45
Q

However, to check for significance the R² Value can be
converted into an __

A

F value

46
Q

R² has to be adjusted because with small N its value is
artificially __.

A

high

47
Q

This is because, at the extreme, with N = number of predictor
variables (p) + 1, prediction of the criterion variable values is
___ and R² = 1, even though, in the population, prediction
cannot be that perfect.

A

perfect
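One standard adjustment formula (the one typically reported in regression output) makes the small-N inflation on these cards concrete; a sketch with made-up R² and sample sizes:

```python
def adjusted_r_squared(r_squared, n, p):
    """Shrinks R^2 toward 0 as the number of predictors (p)
    approaches the number of cases (n)."""
    return 1 - (1 - r_squared) * (n - 1) / (n - p - 1)

# The same R^2 of .60 looks very different with a small vs a large N
print(adjusted_r_squared(0.60, n=15, p=4))   # heavily shrunk
print(adjusted_r_squared(0.60, n=500, p=4))  # barely changed
```

With N = 15 and 4 predictors the .60 shrinks to .44; with N = 500 it is essentially unchanged, which is why the issue boils down to sampling adequacy.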

48
Q

The issue boils down to one of __

A

sampling adequacy.

49
Q

Various rules of thumb are given for the ___ to produce a meaningful estimate of the relationship
between predictors & criterion in the population – remember
we are still estimating population parameters from samples.

A

minimum number of
cases (N)

50
Q

Although some authors recommend a very high N indeed, some
recommend that the minimum should be ___, and most
accept this as reasonable, though the more general rule is ___

A

p + 50; “as
many as possible”.

51
Q

Effect Size

A

* Small = .02
* Medium = .15
* Large = .35
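These benchmarks match Cohen's f² = R² / (1 - R²) for multiple regression; a small sketch, assuming that is the intended effect-size measure:

```python
def cohens_f_squared(r_squared):
    """Cohen's f^2 effect size for multiple regression,
    computed from the model's R^2."""
    return r_squared / (1 - r_squared)

# Benchmarks: f^2 = .02 (small), .15 (medium), .35 (large)
print(cohens_f_squared(0.26))  # ~ .35, a large effect
```

So, for example, an R² of about .26 already corresponds to a "large" effect on this scale.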

52
Q

The output tables in Multiple Regression Analysis

A
  • 1st table - simple descriptives for each variable.
  • 2nd table - correlations between all variables.
  • 3rd table - which variables have been entered into the equation.
53
Q
  • Model Summary –
A

gives R, R², and adjusted R²

54
Q

Tells whether or not the model accounts for a
significant proportion of the variance in the criterion variable.
It is a comparison of the variance “explained” vs. the variance
“unexplained” (the residuals)

A
  • ANOVA –
55
Q
  • ___ – contains information about all the individual predictors
A

Coefficients

56
Q

___ – are the b weights. This tells us
how many units/points the criterion variable will increase
for every 1 unit/point increase in a particular predictor
variable.

A

Unstandardized Coefficients

57
Q

___ – are the beta values. This tells us
how many SD units the criterion variable will increase
for every 1 SD unit increase in a particular predictor
variable.

A

Standardized Coefficients

58
Q

__– found by dividing the unstandardized b value by its
standard error. If t is significant, it means the predictor is
making a significant contribution to the prediction of the
criterion.

A

t-values

59
Q

___ – when predictor variables correlate too closely
with one another. If tolerance values in the coefficients table
are very low (e.g., under .2) then multicollinearity is present.

A

Collinearity
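Tolerance itself is just 1 - R² from regressing one predictor on the others; a sketch with simulated data (made-up seed and coefficients; assumes NumPy):

```python
import numpy as np

rng = np.random.default_rng(3)
n = 200
x1 = rng.normal(size=n)
x2 = 0.95 * x1 + 0.1 * rng.normal(size=n)   # nearly collinear with x1
x3 = rng.normal(size=n)

def tolerance(target, others):
    """Tolerance = 1 - R^2 from regressing one predictor on the others."""
    X = np.column_stack([np.ones(len(target))] + others)
    coeffs, *_ = np.linalg.lstsq(X, target, rcond=None)
    resid = target - X @ coeffs
    return np.var(resid) / np.var(target)

print(tolerance(x2, [x1, x3]))  # very low -> multicollinearity
print(tolerance(x3, [x1, x2]))  # near 1 -> largely independent
```

A tolerance near 0 means the other predictors already account for almost all of that predictor's variance, which is exactly the "under .2" warning sign on this card.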