Correlation and Multiple Regression Flashcards

1
Q

3 types of multiple regression

A

–simultaneous
–stepwise
–hierarchical

2
Q

What are correlation and regression for?

A

study of the relationship between two or more variables

3
Q

Regression

A

allows prediction of Y on the basis of knowledge of X
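
A minimal sketch, not on the card: for a single predictor, the least-squares regression line is

\hat{Y} = b_0 + b_1 X, \qquad b_1 = \frac{\operatorname{cov}(X, Y)}{\operatorname{var}(X)}, \qquad b_0 = \bar{Y} - b_1 \bar{X}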

4
Q

Correlation

A

measures strength of relationship between X and Y
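
A sketch of the usual index (Pearson's r; an assumption, the card does not name it):

r_{XY} = \frac{\operatorname{cov}(X, Y)}{s_X \, s_Y}, \qquad -1 \le r_{XY} \le +1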

5
Q

Scatter plot

A

–2-D diagram
–1 point for each participant
–coordinates are scores on variables: e.g. (X₁, Y) or (X₂, Y)

6
Q

Correlation and scatter plot

A

–linked to degree to which points cluster around regression line
–value between -1 and +1

7
Q

Venn diagram

A

size of each circle represents the variance of that variable

overlapping circles denote correlated variables

8
Q

What is the relationship between 2 variables once the effect of the other variables has been removed?

A

measures the strength of dependence between 2 variables that is not accounted for by the way in which they both change in response to variations in a selected subset of the other variables
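
For the three-variable case, the standard formula (not given on the card) for the correlation between X and Y with the effect of Z removed is

r_{XY \cdot Z} = \frac{r_{XY} - r_{XZ}\, r_{YZ}}{\sqrt{(1 - r_{XZ}^2)(1 - r_{YZ}^2)}}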

9
Q

What is multiple regression for?

A

–learn about the relationship between several independent variables (predictors) and one dependent variable (criterion)
–predictive tool

•examples
–estate agent analyzes selling price: for each house, he records size, number of bedrooms, average income in neighbourhood, subjective appeal, etc.
 how do these relate to the selling price?

–psychologist studies depression: for each participant, he records age, gender, stress, measure of neuroticism, etc.
 how do these relate to depression? (see the sketch below)
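
A minimal sketch of the estate-agent example, assuming hypothetical data and column names (size, bedrooms, income, appeal, price) and ordinary least squares from statsmodels:

import pandas as pd
import statsmodels.api as sm

# hypothetical records; numbers and column names are illustrative only
houses = pd.DataFrame({
    "size":     [85, 120, 64, 150, 98, 110, 75, 140],    # m^2
    "bedrooms": [2, 3, 1, 4, 3, 3, 2, 4],
    "income":   [32, 41, 28, 55, 38, 44, 30, 50],        # avg neighbourhood income (k)
    "appeal":   [6, 8, 4, 9, 7, 7, 5, 8],                # subjective appeal rating
    "price":    [180, 260, 140, 340, 215, 240, 160, 310],
})

X = sm.add_constant(houses[["size", "bedrooms", "income", "appeal"]])  # predictors (IVs)
y = houses["price"]                                                    # criterion (DV)

model = sm.OLS(y, X).fit()   # simultaneous (standard) entry: all IVs at once
print(model.summary())       # coefficients, R², F-ratio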

10
Q

Assessing goodness of fit

A

–multiple correlation coefficient
correlation between the criterion Y and the best linear combination of the predictors, Ŷ

–coefficient of determination (R²)
•proportion of variability in the data set accounted for by the statistical model
•square of the multiple correlation coefficient

–F-ratio
improvement in prediction of the criterion relative to the inaccuracy of the model (see the formula sketch below)
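
A sketch of the quantities above (notation not from the card), with n cases and k predictors:

R = r_{Y\hat{Y}}, \qquad R^2 = 1 - \frac{SS_{\text{residual}}}{SS_{\text{total}}}, \qquad F = \frac{R^2 / k}{(1 - R^2)/(n - k - 1)}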

11
Q

Multiple regression

Simultaneous (standard)

A

–no a priori model

–enter all IVs at once

12
Q

Multiple regression

Stepwise

A

–no a priori model
–computer chooses, on statistical grounds, an a posteriori model (best subset of IVs)
–capitalises on chance effects

13
Q

Multiple regression

Hierarchical (sequential)

A

–theoretically sound

–a-priori sequence of entry

14
Q

Factors affecting regression

A
  • outliers & influential points
  • homo/hetero-scedasticity
  • singularity & multi-collinearity
  • number of cases vs number of predictors
  • range
  • distribution
15
Q

Outliers and influential points

A
  • points which deviate markedly from others in sample

  • Cook’s distance of 1 or greater flags an influential point (see the sketch below)
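
One way to apply that cut-off, sketched with statsmodels (the simulated data and the planted point are assumptions, not from the card):

import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(0)
x = rng.normal(size=50)
y = 2 * x + rng.normal(size=50)
x[0], y[0] = 4.0, 0.0                     # plant a high-leverage, badly fitting point

results = sm.OLS(y, sm.add_constant(x)).fit()
cooks_d, _ = results.get_influence().cooks_distance   # one distance per case
print("cases with Cook's distance >= 1:", np.where(cooks_d >= 1)[0])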

16
Q

homoscedasticity vs heteroscedasticity

A

•homoscedasticity
variability of scores (errors) on one continuous variable is the same at all levels of the second variable
 uniform scatter or dispersion of data points about the regression line.
(homogeneity of variance)

•heteroscedasticity
non-uniform spread of errors, e.g. when one variable is skewed or the relationship is non-linear (see the sketch below)
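
One common screen (not mentioned on the card) is to test whether the error variance changes across the predictor, e.g. a Breusch–Pagan test from statsmodels; a minimal sketch on simulated data:

import numpy as np
import statsmodels.api as sm
from statsmodels.stats.diagnostic import het_breuschpagan

rng = np.random.default_rng(1)
x = rng.uniform(1, 10, size=200)
y = 3 * x + rng.normal(loc=0.0, scale=x)   # error spread grows with x: heteroscedastic

results = sm.OLS(y, sm.add_constant(x)).fit()

# small p-value suggests the error variance is not constant (heteroscedasticity)
lm_stat, lm_pvalue, f_stat, f_pvalue = het_breuschpagan(results.resid, results.model.exog)
print("Breusch-Pagan p-value:", lm_pvalue)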

17
Q

Singularity and multi-collinearity

A

•singularity
redundancy: one variable is a combination of 2 or more other variables
•multi-collinearity
variables are highly correlated ( > 0.90)

18
Q

Singularity and multi-collinearity: problems and solutions

A

•problems
–logical: don’t want to measure the same thing twice.
–statistical: singularity prevents matrix inversion (division) as the determinant = zero

•screening & solutions
–high bivariate correlations (> 0.9)
 compute correlations amongst IVs, remove appropriate IV
–high multivariate correlations
 examine SMC (squared multiple correlation) of each IV w.r.t. the others
(tolerance = 1 – SMC; see the sketch below)
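
A minimal sketch of the multivariate screen using the variance inflation factor from statsmodels (tolerance = 1 – SMC = 1/VIF); the IVs here are hypothetical, with "size2" built to be nearly redundant with "size":

import pandas as pd
import statsmodels.api as sm
from statsmodels.stats.outliers_influence import variance_inflation_factor

ivs = pd.DataFrame({
    "size":     [85, 120, 64, 150, 98, 110, 75, 140],
    "size2":    [86, 119, 65, 151, 97, 111, 74, 139],   # near-copy of "size"
    "bedrooms": [2, 3, 1, 4, 3, 3, 2, 4],
})

X = sm.add_constant(ivs).values
for i, name in enumerate(ivs.columns, start=1):          # index 0 is the constant
    vif = variance_inflation_factor(X, i)
    print(f"{name}: VIF = {vif:.1f}, tolerance = {1 / vif:.4f}")  # low tolerance = trouble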

19
Q

A small range…

A

Restricts the power of tests (range restriction attenuates the observed correlation)

20
Q

Anscombe’s quartet:

A

four data sets with the same mean, variance, correlation, and regression line, yet very different scatter plots
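
A quick check, assuming seaborn’s bundled copy of the quartet (fetched on first use):

import seaborn as sns

anscombe = sns.load_dataset("anscombe")            # columns: dataset, x, y
for name, d in anscombe.groupby("dataset"):
    print(name,
          round(d["x"].mean(), 2), round(d["y"].mean(), 2),
          round(d["y"].var(), 2), round(d["x"].corr(d["y"]), 2))
# all four sets print (almost) identical numbers, yet their scatter plots look nothing alike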

21
Q

What technique is used for two sets of independent variables?

A

Canonical correlation
(asks: what is common between the two sets?)

22
Q

What technique is used for many independent variables, when asking: what is the relationship between 2 variables once the effect of the others has been removed?

A

Partial correlation

23
Q

One dependent variable, technique used

Predicting the DV from the IVs

A

Multiple regression

24
Q

Relationship between the IVs and the DV, technique used:

A

Multiple correlation