Regression Flashcards

1
Q

purpose of ordinary least squares regression

A

a technique for finding the best-fitting straight line for a set of data
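
A minimal sketch of the idea in Python; the data values are made up for illustration:

```python
import numpy as np

# Made-up data for illustration: hours studied vs. exam score.
x = np.array([1, 2, 3, 4, 5, 6, 7, 8], dtype=float)
y = np.array([52, 55, 61, 60, 68, 70, 74, 79], dtype=float)

# Fitting a degree-1 polynomial minimizes the sum of squared residuals,
# which gives the ordinary least squares line.
slope, intercept = np.polyfit(x, y, 1)
print(f"y-hat = {intercept:.2f} + {slope:.2f} * x")
```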

2
Q

why would you use the sum of squared residuals (rather than just summing the residuals)

A

some residuals are positive and some are negative, so they would cancel out if simply summed; squaring them makes every deviation count
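
A quick demonstration, reusing the made-up data from card 1:

```python
import numpy as np

x = np.array([1, 2, 3, 4, 5, 6, 7, 8], dtype=float)
y = np.array([52, 55, 61, 60, 68, 70, 74, 79], dtype=float)
slope, intercept = np.polyfit(x, y, 1)
residuals = y - (intercept + slope * x)

# Raw residuals cancel out (their sum is ~0 for any OLS fit with an
# intercept), so only the squared sum measures distance from the line.
print(np.sum(residuals))        # ~0, tells us nothing about fit
print(np.sum(residuals ** 2))   # sum of squared residuals, SSR
```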

3
Q

simplest model (null model)

A

uses the mean as the model

4
Q

coefficient of determination (R^2)

A

the proportion of variance in the outcome explained by the regression line, compared with the amount explained by the mean
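
The same comparison in code, again with the made-up data from card 1:

```python
import numpy as np

x = np.array([1, 2, 3, 4, 5, 6, 7, 8], dtype=float)
y = np.array([52, 55, 61, 60, 68, 70, 74, 79], dtype=float)
slope, intercept = np.polyfit(x, y, 1)

ss_r = np.sum((y - (intercept + slope * x)) ** 2)  # error around the line
ss_t = np.sum((y - y.mean()) ** 2)                 # error around the mean (null model)
r_squared = 1 - ss_r / ss_t
print(f"R^2 = {r_squared:.3f}")
```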

5
Q

MSM (model mean squares)

A

how much the model has improved the prediction (SSM divided by its degrees of freedom)

6
Q

MSR (residual mean squares)

A

the level of inaccuracy of the model (SSR divided by its degrees of freedom)

7
Q

Spearman’s correlation coefficient

A

a non-parametric statistic based on ranked data
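
A short sketch using scipy on simulated data; the same call pattern covers Kendall’s tau from card 11:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
x = rng.normal(size=50)
y = x ** 3 + rng.normal(scale=0.5, size=50)  # monotonic but not linear

rho, p = stats.spearmanr(x, y)       # Pearson's r computed on the ranks
tau, p_tau = stats.kendalltau(x, y)  # Kendall's tau, also rank-based
print(rho, tau)
```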

8
Q

can the coefficient of determination be used to determine causality?

A

nope

9
Q

square of Pearson’s r gives you what

A

the proportion of variance shared by the two variables

10
Q

square of Spearman’s gives you what

A

the proportion of variance in the ranks that the two variables share

11
Q

can you square Kendall’s tau?

A

nope (its square doesn’t represent a proportion of variance)

12
Q

outcome variable

A

dependent variable

13
Q

predictor variable

A

independent variable

14
Q

simple regression

A

1 predictor

15
Q

multiple regression

A

multiple predictors

16
Q

residuals

A

the differences between the values the model predicts and the observed data

17
Q

how to assess error in a regression model

A

sum of squared residuals

18
Q

F-tests are based on what

A

the ratio of the improvement due to the model (MSM) to the error remaining in the model (MSR)
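
A sketch tying cards 5, 6, and 18-20 together, reusing the made-up data from card 1:

```python
import numpy as np
from scipy import stats

x = np.array([1, 2, 3, 4, 5, 6, 7, 8], dtype=float)
y = np.array([52, 55, 61, 60, 68, 70, 74, 79], dtype=float)
slope, intercept = np.polyfit(x, y, 1)
y_hat = intercept + slope * x

n, k = len(y), 1                        # 8 observations, 1 predictor
ss_m = np.sum((y_hat - y.mean()) ** 2)  # improvement over the mean
ss_r = np.sum((y - y_hat) ** 2)         # error remaining in the model

ms_m = ss_m / k              # df = number of variables in the model (card 19)
ms_r = ss_r / (n - k - 1)    # df = observations - parameters estimated (card 20)
f = ms_m / ms_r
p = stats.f.sf(f, k, n - k - 1)  # p-value from the F distribution
print(f, p)
```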

19
Q

degrees of freedom for the sum of squares of model

A

number of variables in the model

20
Q

degrees of freedom for the sum of squared residuals

A

number of observations - number of parameters being estimated

21
Q

standardized residuals

A

residuals converted to z-scores

22
Q

studentized residual

A

the unstandardized residual divided by an estimate of its standard deviation that varies point by point

23
Q

deleted residual

A

adjusted predicted value - original observed value

24
Q

Cook’s distance

A

considers the effect of a single case on the model as a whole

25
Q

Mahalanobis distance

A

measures the distance of cases from the mean(s) of the predictor variable(s)

26
Q

what type of distribution does Mahalanobis distance have

A

chi-squared

27
Q

how do you determine degrees of freedom for Mahalanobis distance

A

number of predictors
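
A sketch of cards 25-27 on simulated predictors; the 0.999 cutoff probability is just one common choice, not the only one:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)
X = rng.normal(size=(100, 3))  # 100 cases, 3 predictors

diff = X - X.mean(axis=0)
cov_inv = np.linalg.inv(np.cov(X, rowvar=False))
# Squared Mahalanobis distance of each case from the predictor means.
d2 = np.einsum('ij,jk,ik->i', diff, cov_inv, diff)

# d^2 is approximately chi-squared with df = number of predictors,
# which gives a cutoff for flagging outlying cases.
cutoff = stats.chi2.ppf(0.999, df=X.shape[1])
print(np.where(d2 > cutoff)[0])
```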

28
Q

DFBeta

A

the difference between a parameter estimated using all cases and the same parameter estimated when one case is excluded

29
Q

DFFit

A

the difference between the predicted value for a case when the model is calculated including that case and when it is calculated excluding it

30
Q

if a case is not influential what would DFFit be

A

0

31
Q

Covariance ratio

A

measures whether a case influences the variance of the regression parameters
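
statsmodels exposes the case diagnostics from cards 21-24 and 28-31 in one place; a sketch on simulated data:

```python
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(2)
X = sm.add_constant(rng.normal(size=(100, 2)))  # intercept + 2 predictors
y = X @ np.array([1.0, 2.0, -1.0]) + rng.normal(size=100)

fit = sm.OLS(y, X).fit()
infl = fit.get_influence()

z = infl.resid_studentized_internal  # standardized residuals (card 21)
t = infl.resid_studentized_external  # studentized residuals (card 22)
cooks_d, _ = infl.cooks_distance     # effect on the model as a whole (card 24)
dfbeta = infl.dfbeta                 # parameter change when a case is dropped (card 28)
dffits, _ = infl.dffits              # prediction change when a case is dropped (cards 29-30)
cov_ratio = infl.cov_ratio           # effect on parameter variances (card 31)
```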

32
Q

assumptions of linear model

A

additivity and linearity
independent errors
homoscedasticity
normally distributed errors
predictors are uncorrelated with external variables
variable types

33
Q

additivity

A

the combined effect of the predictors is best described by adding their effects together

34
Q

Durbin-Watson test

A

tests for serial correlation between errors (assumption of independent errors)

35
Q

what range does the Durbin-Watson test statistic take

A

0 to 4

36
Q

Durbin-Watson test statistic of 2

A

residuals are uncorrelated

37
Q

Durbin-Watson test statistic > 2

A

negative correlation

38
Q

Durbin-Watson test statistic < 2

A

positive correlation
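
A quick check of cards 34-38 with statsmodels' durbin_watson on simulated residuals:

```python
import numpy as np
from statsmodels.stats.stattools import durbin_watson

rng = np.random.default_rng(3)
independent = rng.normal(size=200)          # uncorrelated errors
drifting = np.cumsum(rng.normal(size=200))  # strong positive serial correlation

print(durbin_watson(independent))  # close to 2
print(durbin_watson(drifting))     # well below 2
```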

39
Q

What happens to a linear model if the independent errors assumption is broken

A

CIs and significance tests are invalid

method of least squares still ok

40
Q

types of predictor variables allowed in a linear regression

A

quantitative or categorical (dichotomous)

41
Q

types of outcome variables allowed in a linear regression

A

quantitative, continuous, and unbounded

42
Q

unbounded variable

A

no constraints on the variability of the outcome

43
Q

no perfect multicollinearity

A

no perfect linear relationship between 2+ predictors

44
Q

purpose of cross-validation

A

assesses the accuracy of a model across different samples

45
Q

adjusted R^2

A

tells us how much of the variance in Y would be accounted for if the model had been derived from the population from which the sample was taken
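
A minimal sketch; the formula is Wherry's adjustment (the one most software reports), which is an assumption on my part since the card doesn't name one:

```python
def adjusted_r_squared(r2: float, n: int, k: int) -> float:
    """Wherry's adjustment: n = sample size, k = number of predictors."""
    return 1 - (1 - r2) * (n - 1) / (n - k - 1)

# With few cases and many predictors, R^2 shrinks noticeably.
print(adjusted_r_squared(0.60, n=30, k=5))   # ~0.52
print(adjusted_r_squared(0.60, n=300, k=5))  # barely changes
```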

46
Q

Data splitting

A

split the data in half, run the regression equation on both halves, then compare the two models
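
A split-half sketch on simulated data using statsmodels:

```python
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(4)
X = sm.add_constant(rng.normal(size=(200, 3)))
y = X @ np.array([2.0, 1.0, -0.5, 0.8]) + rng.normal(size=200)

idx = rng.permutation(len(y))
a, b = idx[:100], idx[100:]

fit_a = sm.OLS(y[a], X[a]).fit()
fit_b = sm.OLS(y[b], X[b]).fit()
# Similar coefficients and R^2 in both halves suggest the model generalizes.
print(fit_a.params.round(2), fit_b.params.round(2))
print(round(fit_a.rsquared, 3), round(fit_b.rsquared, 3))
```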

47
Q

sample size needed for an expected large effect

A

77

48
Q

sample size needed for an expected medium effect

A

160

49
Q

b-value

A

tells us the gradient (slope) of the regression line and the strength of the relationship between a predictor and the outcome

50
Q

F

A

tells us how much variability the model can explain vs. what it doesn’t explain

51
Q

hierarchical regression

A

predictors are selected based on past work, and the researcher decides the order in which to enter them into the model (known predictors are entered first)

52
Q

forced entry

A

all predictors are forced into model simultaneously

53
Q

stepwise regression

A

decisions about the order in which predictors are entered are purely mathematical

54
Q

suppressor effects

A

a predictor has a significant effect, but only when another variable is held constant

55
Q

forward method has a higher risk of what type of errors

A

type II

56
Q

Akaike information criterion (AIC)

A

a measure of fit that penalizes the model for having more variables
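
A sketch on simulated data: an extra predictor that explains nothing barely reduces error, so AIC's penalty usually leaves the smaller model with the lower (better) value:

```python
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(5)
x1 = rng.normal(size=100)
junk = rng.normal(size=100)    # predictor unrelated to the outcome
y = 2 * x1 + rng.normal(size=100)

small = sm.OLS(y, sm.add_constant(x1)).fit()
big = sm.OLS(y, sm.add_constant(np.column_stack([x1, junk]))).fit()

# statsmodels reports AIC on every fitted OLS result.
print(small.aic, big.aic)
```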

57
Q

perfect collinearity

A

at least one predictor is a perfect linear combination of the others (e.g., a correlation coefficient of 1 between two predictors)

58
Q

As collinearity increases what else increases

A

standard errors

59
Q

test that looks at collinearity

A

variance inflation factor (VIF)

60
Q

tolerance

A

1/VIF

61
Q

When should we be concerned about VIF

A

when the largest VIF is greater than 10, or when the average VIF is substantially greater than 1

62
Q

When should we be concerned about tolerance

A

below 0.2 is a potential problem

below 0.1 is a serious problem
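
A sketch of cards 59-62 using statsmodels' variance_inflation_factor on simulated predictors:

```python
import numpy as np
import statsmodels.api as sm
from statsmodels.stats.outliers_influence import variance_inflation_factor

rng = np.random.default_rng(6)
x1 = rng.normal(size=200)
x2 = x1 + rng.normal(scale=0.1, size=200)  # nearly collinear with x1
x3 = rng.normal(size=200)
X = sm.add_constant(np.column_stack([x1, x2, x3]))

# VIF per predictor (column 0 is the constant, so start at 1).
vifs = [variance_inflation_factor(X, i) for i in range(1, X.shape[1])]
tolerances = [1 / v for v in vifs]
print(vifs)        # x1 and x2 should be far above 10
print(tolerances)  # and correspondingly below 0.1
```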