Regression Flashcards

1
Q

What is a correlation?

A

An association or dependency between two independently observed variables

2
Q

What graph/plot can we use to visualise a correlation?

A Bar chart
B Bar Graph
C Q-Q Plot
D Scatterplot

A

D Scatterplot

3
Q

The Pearson correlation coefficient gives an r value ranging between -1 and 1.
A score of 0 indicates what?
What does a score of 1 indicate?
What does a score of -1 indicate?
What do positive and negative scores indicate?

A

0 = no linear relationship between the variables
1 = a perfect positive linear relationship (the variables increase together exactly)
-1 = a perfect negative (inverse) linear relationship
positive scores = the variables are positively correlated
negative scores = the variables are negatively correlated
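A minimal Python sketch of these extremes, assuming numpy and scipy are available (the arrays below are invented purely to produce r values near 1, -1, and 0):

    import numpy as np
    from scipy.stats import pearsonr

    x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
    r_pos, _ = pearsonr(x, 2 * x + 1)       # exact increasing line  -> r = 1
    r_neg, _ = pearsonr(x, -3 * x + 10)     # exact decreasing line  -> r = -1

    rng = np.random.default_rng(0)
    r_zero, _ = pearsonr(rng.normal(size=500), rng.normal(size=500))  # unrelated data -> r near 0
    print(r_pos, r_neg, r_zero)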

4
Q

If variables are both interval/ratio, we use a ___________ coefficient test, giving us an ___ value

A

Pearson’s coefficient
r value

5
Q

If both variables are ordinal, we use either a Spearman's rank, giving us a ____ value, or a Kendall's rank, giving us a ____ value

A

Spearman's = rho (ρ)
Kendall's = tau (τ)

6
Q

If both variables are dichotomous (binary), we use a ____ coefficient.

A

phi coefficient

7
Q

If we have one dichotomous variable, and one interval/ratio variable, we use a _____-______ coefficient, giving us an ___ value.

A

point-biserial coefficient
rpb value
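A minimal sketch of how the tests in cards 5-7 can be run in Python, assuming scipy is available; all data below are invented for illustration, and the phi coefficient is obtained here as Pearson's r on the 0/1 codes (for two binary variables the two are equivalent):

    import numpy as np
    from scipy.stats import spearmanr, kendalltau, pointbiserialr, pearsonr

    # both ordinal -> Spearman's rho or Kendall's tau (card 5)
    ranks_a = [1, 2, 3, 4, 5, 6]
    ranks_b = [2, 1, 4, 3, 6, 5]
    rho, _ = spearmanr(ranks_a, ranks_b)
    tau, _ = kendalltau(ranks_a, ranks_b)

    # both dichotomous -> phi coefficient (card 6)
    yes_no_1 = np.array([0, 1, 1, 0, 1, 0])
    yes_no_2 = np.array([0, 1, 0, 0, 1, 1])
    phi, _ = pearsonr(yes_no_1, yes_no_2)

    # one dichotomous, one interval/ratio -> point-biserial r_pb (card 7)
    group = np.array([0, 0, 0, 1, 1, 1])
    score = np.array([4.1, 5.0, 4.7, 6.2, 6.8, 5.9])
    r_pb, _ = pointbiserialr(group, score)

    print(rho, tau, phi, r_pb)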

8
Q

A partial correlation is used when we have more than __ variables, and we want to test the _________ of a pair, whilst __________ for another variable.

A

partial correlation = when more than 2 variables
want to test correlation/association of one pair whilst accounting for a third
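A minimal sketch of the idea behind a partial correlation, assuming numpy and scipy are available (data invented): remove the influence of the third variable z from both x and y with simple linear fits, then correlate the residuals.

    import numpy as np
    from scipy.stats import pearsonr

    rng = np.random.default_rng(1)
    z = rng.normal(size=200)                 # the variable being controlled for
    x = 0.8 * z + rng.normal(size=200)       # x and y are both partly driven by z
    y = 0.6 * z + rng.normal(size=200)

    def residuals(v, covar):
        # what is left of v after fitting a straight line on covar
        slope, intercept = np.polyfit(covar, v, 1)
        return v - (slope * covar + intercept)

    r_partial, _ = pearsonr(residuals(x, z), residuals(y, z))  # partial r of x and y, controlling for z
    r_simple, _ = pearsonr(x, y)                               # ordinary correlation, for comparison
    print(r_simple, r_partial)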

9
Q

Multiple linear regressions describe/examine what?

A

The relationship between one or more predictor variables and a criterion variable
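A minimal sketch of a multiple linear regression with two invented predictors, assuming numpy is available:

    import numpy as np

    rng = np.random.default_rng(2)
    n = 100
    x1 = rng.normal(size=n)                                        # predictor 1
    x2 = rng.normal(size=n)                                        # predictor 2
    y = 2.0 * x1 - 1.5 * x2 + 3.0 + rng.normal(scale=0.5, size=n)  # criterion variable

    X = np.column_stack([np.ones(n), x1, x2])        # design matrix with an intercept column
    coefs, *_ = np.linalg.lstsq(X, y, rcond=None)    # least-squares fit: [intercept, b1, b2]
    y_hat = X @ coefs                                # predicted criterion values
    print(coefs)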

10
Q

True or false, virtually all statistical models we use (ANOVA, t test, correlations) are special cases of the regression model

A

True

11
Q

The regression line has the equation:

y = ax + b

Where y is the ______, x is the _______, ax is the ____/______, and b is the _-_______

A

y is the height (the predicted/outcome value), x is the length across (the predictor value)
ax = the slope/gradient term, b = the y-intercept
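A quick worked example with invented numbers: if a = 2 and b = 5, then for x = 3 the line predicts y = (2 × 3) + 5 = 11.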

12
Q

A residual error is how far a ____ _____ is from the _____ __ ___.

A

how far a data point lies from the line of best fit

13
Q

SST = ___ + ____

A

SST = SSR + SSM (total sum of squares = residual sum of squares + model sum of squares)
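A minimal numeric check of this identity on invented data, assuming numpy is available (a simple straight-line fit is used here):

    import numpy as np

    x = np.array([1, 2, 3, 4, 5, 6], dtype=float)
    y = np.array([2.1, 2.9, 4.2, 4.8, 6.1, 6.8])

    slope, intercept = np.polyfit(x, y, 1)
    y_hat = slope * x + intercept            # values predicted by the fitted line

    sst = np.sum((y - y.mean()) ** 2)        # total sum of squares
    ssm = np.sum((y_hat - y.mean()) ** 2)    # model (explained) sum of squares
    ssr = np.sum((y - y_hat) ** 2)           # residual sum of squares
    print(np.isclose(sst, ssm + ssr))        # True: SST = SSR + SSM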

14
Q

prediction error is the difference between the ______ value and the _____ value

A

Prediction error (PE) = the difference between the predicted value and the actual value

15
Q

The best fit of the ____ occurs by minimizing _____ ______

A

best fit of model occurs by minimising prediction error

16
Q

The coefficient of determination value is represented as ________

A

R squared

17
Q

The goodness of fit of a model can be assessed using what 3 measures?

A

Multiple correlation coefficient = R
Coefficient of determination = R squared
F-Ratio

18
Q

Similar to ANOVA, F-ratios in regression compare the ________ variance to the ________ variance or the total variance. Higher F-ratios represent a ______ model and a ______ prediction, i.e. a smaller difference between the ______ value and the ________ value.

A

comparing the explained variance to the residual variance or the total variance
higher F-ratios = better model
better prediction, i.e. a smaller difference between the actual value and the predicted value

19
Q

In regressions, we use _____ squares rather than ____ __ squares for F-ratios.

F = ____/____

A

use mean squares rather than sums of squares

F = MSM / MSR

20
Q

MSM = ____/____

MSR = ____/____

A

MSM = SSM / dfM

MSR = SSR / dfR

21
Q

dfM (degrees of freedom M) is the number of _______ _______

dfR (degrees of freedom R) is the number of ____________ minus the number of ________

A

dfM = number of predictor variables
dfR = number of observations (p’s) - number of coefficients
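A worked example with invented numbers, pulling the last three cards together: suppose a model with 2 predictor variables is fitted to 33 observations, giving SSM = 40 and SSR = 60. Then dfM = 2, dfR = 33 - 3 = 30 (two slopes plus the intercept make 3 coefficients), MSM = 40 / 2 = 20, MSR = 60 / 30 = 2, and F = MSM / MSR = 10.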

22
Q

Effect size for regressions uses the value of ___________.
A small effect size is a value of ______
A medium effect size is a value of ______
A large effect size is a value of ______

A

Cohen's f squared (f²)
small effect size = 0.02
medium = 0.15
large = 0.35

23
Q

Cohen's f squared = ________ / (_______)

A

R² / (1 - R²)
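A worked example with an invented value: if a model has R² = 0.26, then f² = 0.26 / (1 - 0.26) ≈ 0.35, which the previous card would class as a large effect.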

24
Q

What are the 3 main types of regression, and how does each work?

A

Simultaneous - no a priori order - all predictor variables are entered together
Stepwise - no a priori order - predictor variables are added/removed one at a time
Hierarchical - based on a priori knowledge - several models are created by adding/removing variables at each step, and the models are compared to see which explains the data best

25
Q

Outliers are points which _______ substantially from others. They can affect the _______ _____ of ______. ______ distance measures the ______ of an outlier, where a value over _ is concerning

A

outliers = points that deviate substantially from other data points
they can affect the linear model of fit
Cook's distance measures the influence (severity) of an outlier; a value over 1 is concerning
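A minimal sketch of checking Cook's distance, assuming statsmodels is available (data invented; one extreme point is planted deliberately):

    import numpy as np
    import statsmodels.api as sm

    rng = np.random.default_rng(3)
    x = rng.normal(size=50)
    y = 1.5 * x + rng.normal(scale=0.5, size=50)
    y[0] = 15.0                                            # plant an extreme outlier

    results = sm.OLS(y, sm.add_constant(x)).fit()
    cooks_d, _ = results.get_influence().cooks_distance    # one distance per observation
    print(np.where(cooks_d > 1)[0])                        # points over the rule-of-thumb cut-off of 1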

26
Q

Scedasticity refers to the distribution of the residual error.

What is homoscedasticity, and how can it be seen?

What is heteroscedasticity, and how can it be seen?

A

homoscedasticity = the spread of the residuals remains constant over the range of the predictor; a plot of the residuals against the predicted values shows no discernible pattern

heteroscedasticity = the spread of the residuals varies systematically over the range of the predictor, forming a pattern in the residual plot
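A minimal sketch of the usual visual check, assuming numpy and matplotlib are available (data invented; the noise here is made to grow with the predictor, so the plot should show a funnel-shaped, heteroscedastic pattern):

    import numpy as np
    import matplotlib.pyplot as plt

    rng = np.random.default_rng(4)
    x = rng.uniform(0, 10, size=200)
    y = 2.0 * x + rng.normal(scale=1 + 0.4 * x, size=200)   # residual spread grows with x

    slope, intercept = np.polyfit(x, y, 1)
    predicted = slope * x + intercept
    residual = y - predicted

    plt.scatter(predicted, residual, s=10)
    plt.axhline(0, color="grey")
    plt.xlabel("Predicted value")
    plt.ylabel("Residual")
    plt.show()           # an even band = homoscedastic; a funnel = heteroscedastic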

27
Q

Multicollinearity refers to what? Is it good or bad?

A

multicollinearity refers to a high correlation (similarity) between two or more predictor variables - we do not want this

28
Q

Singularity refers to a _______ variable. This is when one variable is a _________ of two or more variables or __-______.

A

singularity refers to a redundant variable
when one variable is a combination of two or more other variables or sub-scores/subscales

29
Q

Multicollinearity can be detected using _________ correlations
singularity can be detected using ________ correlations.
Both can also be checked using ________ values

A

multicollinearity detected with bivariate correlations
singularity tested using multivariate correlations
both can be tested with tolerance values
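A minimal sketch of how a tolerance value can be computed, assuming numpy is available (data invented, with x2 built to be nearly redundant with x1): regress each predictor on the remaining predictors and take tolerance = 1 - R² of that regression; low values flag multicollinearity, as the next card states.

    import numpy as np

    rng = np.random.default_rng(5)
    x1 = rng.normal(size=100)
    x2 = 0.9 * x1 + rng.normal(scale=0.2, size=100)   # nearly redundant with x1
    x3 = rng.normal(size=100)
    X = np.column_stack([x1, x2, x3])                 # one column per predictor

    def tolerance(X, j):
        # 1 - R squared from regressing predictor j on the other predictors
        others = np.delete(X, j, axis=1)
        design = np.column_stack([np.ones(len(X)), others])
        coefs, *_ = np.linalg.lstsq(design, X[:, j], rcond=None)
        fitted = design @ coefs
        r2 = 1 - np.sum((X[:, j] - fitted) ** 2) / np.sum((X[:, j] - X[:, j].mean()) ** 2)
        return 1 - r2

    print([round(tolerance(X, j), 3) for j in range(X.shape[1])])  # low values for x1 and x2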

30
Q

Multicollinearity can be detected from ____ tolerance values

A Low
B High
C Varying
D Similar

A

A Low

31
Q

The rule of thumb is that the number of __________ (N) should be high compared to the number of ________ variables (m).

A

N (the number of observations) should be high compared to the number of predictor variables (m)

32
Q

A small range of the predictor variable ______ statistical power.

A

restricts statistical power