Regression Flashcards

1
Q

purpose of ordinary least squares regression

A

technique for finding the best fitting straight line for a set of data
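
For concreteness, a minimal sketch of that idea in Python (made-up data; the closed-form slope and intercept for a single predictor):

```python
# Minimal sketch: simple OLS by hand with NumPy (made-up data).
import numpy as np

x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([2.1, 3.9, 6.2, 8.1, 9.8])

# Closed-form solution: b1 = cov(x, y) / var(x), b0 = mean(y) - b1 * mean(x)
b1 = np.sum((x - x.mean()) * (y - y.mean())) / np.sum((x - x.mean()) ** 2)
b0 = y.mean() - b1 * x.mean()

print(f"best-fitting line: y = {b0:.2f} + {b1:.2f}x")
```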

2
Q

why do we use the sum of squared residuals

A

some residuals are positive and some are negative, so they would cancel out if simply summed; squaring them first removes the sign
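
A quick numeric illustration of why the residuals are squared before summing (made-up numbers):

```python
# Raw residuals from a least-squares fit sum to (about) zero, so the
# positives and negatives cancel; squaring makes every term positive.
import numpy as np

residuals = np.array([1.5, -2.0, 0.5, -1.0, 1.0])
print(residuals.sum())         # 0.0  -- no information about total error
print((residuals ** 2).sum())  # 8.5  -- a usable measure of total error
```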

3
Q

simplest model (null model)

A

uses the mean of the outcome as the model

4
Q

coefficient of determination (R^2)

A

proportion of variance in the outcome that is explained by the regression line compared to that explained by the mean
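
In symbols, with SS_T the total sum of squares around the mean (the null model), SS_M the model sum of squares, and SS_R the residual sum of squares:

```latex
R^2 = \frac{SS_M}{SS_T} = 1 - \frac{SS_R}{SS_T}
```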

5
Q

MSm

A

How much the model has improved the prediction

6
Q

MSr

A

level of inaccuracy of the model

7
Q

Spearman's correlation coefficient

A

non-parametric statistic based on ranked data

8
Q

can the coefficient of determination be used to determine causality?

A

nope

9
Q

square of Pearson's r gives you what

A

proportion of variance in one variable that is shared by the other

10
Q

square of Spearman's coefficient gives you what

A

proportion of variance in the ranks that the two variables share

11
Q

can you square Kendall's tau

A

nope

12
Q

outcome variable

A

dependent variable

13
Q

predictor variable

A

independent variable

14
Q

simple regression

A

1 predictor

15
Q

multiple regression

A

multiple predictors

16
Q

residuals

A

difference between what the model predicts and the observed data

17
Q

how to assess error in a regression model

A

sum of squared residuals

18
Q

F-tests are based on what

A

the ratio of the improvement due to the model (MSm) to the error remaining in the model (MSr)

19
Q

degrees of freedom for the model sum of squares

A

number of variables in the model

20
Q

degrees of freedom for the residual sum of squares

A

number of observations - number of parameters being estimated
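
Putting the last few cards together (assuming k predictors and n observations, so the parameters estimated are the k slopes plus the intercept):

```latex
MS_M = \frac{SS_M}{k}, \qquad
MS_R = \frac{SS_R}{n - k - 1}, \qquad
F = \frac{MS_M}{MS_R}
```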

21
Q

standardized residuals

A

residuals converted to z-scores

22
Q

studentized residual

A

unstandardized residual divided by an estimate of its standard deviation that varies point by point
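
One common form of that estimate (assuming e_i is the unstandardized residual, s the estimated standard deviation of the residuals, and h_i the leverage of case i, which is what makes the denominator vary point by point):

```latex
r_i = \frac{e_i}{s\sqrt{1 - h_i}}
```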

23
Q

deleted residual

A

adjusted predicted value minus the original observed value

24
Q

Cook's distance

A

considers the effect of a single case on the model as a whole

25
Q

Mahalanobis distance

A

measures the distance of cases from the mean(s) of the predictor variable(s)

26
Q

what type of distribution does Mahalanobis distance have

A

chi-squared

27
Q

how do you determine degrees of freedom for Mahalanobis distance

A

number of predictors

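A minimal sketch of the last three cards in Python (made-up data; the 0.99 chi-squared cutoff is just one convention):

```python
# Mahalanobis distances of cases from the predictor means, flagged
# against a chi-squared cutoff with df = number of predictors.
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
X = rng.normal(size=(50, 2))        # made-up data: 50 cases, 2 predictors

diff = X - X.mean(axis=0)           # deviation of each case from the means
cov_inv = np.linalg.inv(np.cov(X, rowvar=False))
d2 = np.einsum("ij,jk,ik->i", diff, cov_inv, diff)  # squared distances

cutoff = stats.chi2.ppf(0.99, df=X.shape[1])        # df = number of predictors
print(np.where(d2 > cutoff)[0])     # indices of potential multivariate outliers
```
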
28
Q

DFBeta

A

parameter estimated using all cases minus the parameter estimated when one case is excluded

29
Q

DFFit

A

predicted value for a case from the model including that case minus the predicted value from the model excluding that case

30
Q

if a case is not influential what would DFFit be

A

0

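A minimal sketch of those definitions, literally refitting the model with each case excluded (made-up data, simple regression):

```python
# DFBeta: change in each parameter estimate when one case is excluded.
# DFFit: change in a case's predicted value when that case is excluded.
import numpy as np

rng = np.random.default_rng(1)
x = rng.normal(size=30)
y = 2.0 * x + rng.normal(size=30)

b_all = np.polyfit(x, y, 1)              # [slope, intercept] from all cases
for i in range(len(x)):
    keep = np.arange(len(x)) != i        # leave case i out
    b_i = np.polyfit(x[keep], y[keep], 1)
    dfbeta = b_all - b_i                 # one value per parameter
    dffit = np.polyval(b_all, x[i]) - np.polyval(b_i, x[i])
    # dffit stays near 0 for cases that are not influential
```
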
31
Q

covariance ratio

A

measures whether a case influences the variance of the regression parameters

32
Q

assumptions of the linear model

A

additivity and linearity
independent errors
homoscedasticity
normally distributed errors
predictors are uncorrelated with external variables
variable types

33
Q

additivity

A

combined effect of predictors is best described by adding their effects together

34
Q

Durbin-Watson test

A

tests for serial correlation between errors (assumption of independent errors)

35
Q

what range does the Durbin-Watson test statistic vary over

A

0 to 4

36
Q

Durbin-Watson test statistic of 2

A

residuals are uncorrelated

37
Q

Durbin-Watson test statistic > 2

A

negative correlation

38
Q

Durbin-Watson test statistic < 2

A

positive correlation

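The statistic behind the last few cards, with e_t the residual for observation t. Because d is approximately 2(1 - r) for the lag-1 correlation r between residuals, d = 2 means uncorrelated residuals and the possible range runs from 0 to 4:

```latex
d = \frac{\sum_{t=2}^{n} \left(e_t - e_{t-1}\right)^2}{\sum_{t=1}^{n} e_t^2}
```
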
39
Q

what happens to a linear model if the independent errors assumption is broken

A

confidence intervals and significance tests are invalid | method of least squares estimates are still OK

40
Q

types of predictor variables allowed in a linear regression

A

quantitative or categorical (dichotomous)

41
Q

types of outcome variables allowed in a linear regression

A

quantitative, continuous, and unbounded

42
Q

unbounded variable

A

no constraints on the variability of the outcome

43
Q

no perfect multicollinearity

A

no perfect linear relationship between 2+ predictors

44
Q

function of cross-validation

A

assesses the accuracy of a model across different samples

45
Q

adjusted R^2

A

tells us how much of the variance in Y would be accounted for if the model had been derived from the population from which the sample was taken

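One common version of the adjustment (Wherry's formula, assuming n observations and k predictors):

```latex
\text{adjusted } R^2 = 1 - \left(1 - R^2\right)\frac{n - 1}{n - k - 1}
```
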
46
Q

data splitting

A

randomly split your data, run a regression on both halves, then compare the resulting models

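A minimal sketch on made-up data: if the model generalizes, the two halves should produce similar estimates.

```python
# Split-half cross-validation: fit the same regression on two random
# halves of the data and compare the resulting coefficients.
import numpy as np

rng = np.random.default_rng(2)
x = rng.normal(size=100)
y = 1.5 * x + rng.normal(size=100)

idx = rng.permutation(len(x))            # random split into two halves
half_a, half_b = idx[:50], idx[50:]

b_a = np.polyfit(x[half_a], y[half_a], 1)
b_b = np.polyfit(x[half_b], y[half_b], 1)
print(b_a, b_b)  # similar slopes/intercepts suggest the model cross-validates
```
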
47
Q

sample size needed for an expected large effect

A

77

48
Q

sample size needed for an expected medium effect

A

160

49
Q

b-value

A

tells us the gradient of the regression line and the strength of the relationship between a predictor and the outcome

50
Q

F

A

tells us how much variability the model can explain relative to how much it cannot explain

51
Q

hierarchical regression

A

predictors are selected based on past work, and the researcher decides the order in which to enter them into the model (known predictors are entered first)

52
Q

forced entry

A

all predictors are forced into the model simultaneously

53
Q

stepwise regression

A

decisions about the order in which predictors are entered are purely mathematical

54
Q

suppressor effects

A

a predictor has a significant effect, but only when another variable is held constant

55
Q

forward method has a higher risk of what type of errors

A

type II

56
Q

Akaike information criterion (AIC)

A

measure of fit that penalizes the model for having more variables

57
Q

perfect collinearity

A

at least one predictor is a perfect linear combination of the others (correlation coefficient of 1)

58
Q

as collinearity increases what else increases

A

standard errors

59
Q

test that looks at collinearity

A

variance inflation factor (VIF)

60
Q

tolerance

A

1/VIF

61
Q

when should we be concerned about VIF

A

when the largest VIF is > 10 or the average VIF is substantially > 1

62
Q

when should we be concerned about tolerance

A

< 0.2 is a potential problem | < 0.1 is a serious problem

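A minimal sketch of the idea behind the last four cards (made-up data): regress each predictor on all the others, then VIF = 1 / (1 - R^2) and tolerance is its reciprocal.

```python
# VIF for each predictor, computed from a least-squares fit of that
# predictor on the remaining predictors (intercept included).
import numpy as np

rng = np.random.default_rng(3)
X = rng.normal(size=(100, 3))
X[:, 2] = X[:, 0] + 0.1 * rng.normal(size=100)   # make two predictors collinear

for j in range(X.shape[1]):
    others = np.delete(X, j, axis=1)
    A = np.column_stack([np.ones(len(X)), others])
    coef, *_ = np.linalg.lstsq(A, X[:, j], rcond=None)
    resid = X[:, j] - A @ coef
    r2 = 1 - resid.var() / X[:, j].var()
    vif = 1 / (1 - r2)
    print(f"predictor {j}: VIF = {vif:.1f}, tolerance = {1 / vif:.2f}")
```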