Regression: The Linear Model Flashcards

1
Q

What does a linear model with several predictors look like on a graph?

A

With two predictors, a 3-D regression plane; with more predictors, a higher-dimensional plane that cannot be drawn directly.

2
Q

SSR

A

Residual sum of squares: the sum of the squared differences between the observed values and the model's predicted values. It measures how well a linear model fits the data (smaller = better fit).
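As a minimal sketch (with made-up data and coefficients, not taken from the deck), SSR for a one-predictor model can be computed like this:

```python
# Residual sum of squares (SSR) for a simple linear model y-hat = b0 + b1*x.
def ss_residual(x, y, b0, b1):
    """Sum of squared differences between observed y and predicted y-hat."""
    return sum((yi - (b0 + b1 * xi)) ** 2 for xi, yi in zip(x, y))

# A perfect fit gives SSR = 0; larger SSR means a worse fit.
print(ss_residual([0, 1, 2], [1, 3, 5], b0=1, b1=2))  # → 0
```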

3
Q

What is cross-validation of a linear regression model?

A

Checks that the model accurately predicts the same outcome in a different sample of people, i.e. that it generalises.

4
Q

Methods of cross-validation

A

Adjusted R squared
Stein's formula
Data splitting

5
Q

What does adjusted R squared do?

A

Estimates how much variance in Y the model would account for if it had been derived from the population rather than from the sample.
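A sketch of the adjustment most software reports (the formula is standard; the example numbers are hypothetical):

```python
def adjusted_r_squared(r2, n, k):
    """Adjusted R^2 for a model fit on n cases with k predictors."""
    return 1 - (1 - r2) * (n - 1) / (n - k - 1)

# With the same R^2, adding predictors lowers the adjusted value,
# penalising models that chase sample-specific variance.
```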

6
Q

What does Stein's formula do?

A

Estimates how well the model would cross-validate (i.e. predict the outcome in a different sample).
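One commonly quoted version of Stein's formula, sketched in code (a hedged illustration, not the deck's own derivation):

```python
def stein_adjusted_r_squared(r2, n, k):
    """Stein's estimate of R^2 on cross-validation, for n cases and k predictors."""
    factor = ((n - 1) / (n - k - 1)) * ((n - 2) / (n - k - 2)) * ((n + 1) / n)
    return 1 - factor * (1 - r2)

# The result is smaller than the sample R^2: it anticipates shrinkage
# when the model is applied to new data.
```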

7
Q

What are 2 common (but oversimplified) rules of thumb for sample size when using a linear model?

A

10 cases per predictor, or 15 cases per predictor

8
Q

What is a good method of deciding desired sample size?

A

The desired effect size

The amount of statistical power wanted to detect that effect

9
Q

Sample size if a large effect is expected

A

77 participants with up to 20 predictors

10
Q

If a medium effect is expected, use a sample size of

A

55–150 (with up to 20 predictors)

11
Q

If a small effect is expected, use a sample size of

A

1043 cases with 20 predictors

12
Q

3 main stages in fitting a linear model

A

Initial data checks
Run initial regression
Check residuals

13
Q

4 steps in the initial checks when fitting a linear model

A

Check linearity and look for unusual cases
Use graphs (e.g. scatterplots)
If there is a lack of linearity:
Transform the data

14
Q

Fitting linear regression model: run initial regression

A

Save diagnostic statistics

15
Q

Fitting linear regression model: check residuals

A
Use ZPRED vs. ZRESID plots to check 3 things:
Linearity
Homoscedasticity
Independence
Check normality with a histogram
16
Q

Fit general linear model: if GLM assumptions are met and there is no bias

A

The model can be generalised beyond the sample

17
Q

Fit general linear model: if heteroscedasticity is found, use either

A

Weighted least squares regression
OR
Bootstrap and transform data

18
Q

Fit general linear model: if normality is violated

A

Bootstrap and transform
OR
Use a multi level model

19
Q

Fit general linear model: If data lacks independence

A

Use a multi level model

20
Q

GLM: multicollinearity definition

A

Strong correlation between two or more predictor variables

21
Q

Is less than perfect collinearity avoidable?

A

No. It is virtually unavoidable

22
Q

What does perfect collinearity mean?

A

One predictor variable has a perfect linear correlation (r = 1) with another predictor variable.

23
Q

GLM: multicollinearity and untrustworthy b's

A

As the correlation between two predictor variables increases, the standard error of the b's increases. This raises the chance that a b is unrepresentative of the population.

24
Q

Comparing GLMs with one predictor: what does a large value of R squared mean?

A

Better fit of model

25
Q

Comparing GLMs with one predictor: what test statistic assesses the significance of R squared?

A

The F statistic

How much variability the model explains relative to what it leaves unexplained.
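The F statistic is the ratio of mean squares; a minimal sketch with hypothetical sums of squares:

```python
def f_statistic(ss_model, ss_residual, n, k):
    """F = mean squares for the model / mean squares for the residuals,
    for n cases and k predictors."""
    ms_model = ss_model / k                   # explained variance per predictor df
    ms_residual = ss_residual / (n - k - 1)   # unexplained variance per residual df
    return ms_model / ms_residual
```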

26
Q

What is b value?

A

The gradient of the regression line: the strength of the relationship between a predictor and the outcome.

27
Q

What does multicollinearity do to R?

A

It limits the size of R

28
Q

Why does multicollinearity limit the size of R?

A

Correlated predictor variables account for much of the same variance in the outcome, so each contributes less unique variance to R squared.

29
Q

Why multicollinearity is a problem

A

It makes it difficult to assess the importance of each predictor variable

30
Q

Two steps for identifying multicollinearity

A

Correlation matrix

Variance inflation factor

31
Q

Identifying multicollinearity: scan the correlation matrix

A

Find highly correlated predictor variables: r ≥ .8 or .9

32
Q

Identifying multicollinearity: variance inflation factor indicates

A

Whether a predictor variable has a strong linear correlation with another predictor variable

33
Q

Identifying multicollinearity: interpretation of the variance inflation factor

A

Largest VIF ≥ 10 (tolerance ≤ .10): a serious problem is indicated
Average VIF substantially greater than 1: the regression may be biased
Tolerance below .2: a potential problem
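VIF and tolerance follow directly from the R squared obtained when regressing one predictor on all the others; a hedged sketch (the input value is hypothetical):

```python
def vif(r2_j):
    """Variance inflation factor for predictor j, where r2_j is the R^2
    from regressing x_j on all the other predictors."""
    return 1 / (1 - r2_j)

def tolerance(r2_j):
    """Tolerance is the reciprocal of VIF."""
    return 1 - r2_j

# A predictor 90% explained by the others hits the classic VIF = 10 threshold.
print(vif(0.9), tolerance(0.9))
```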

34
Q

What is an eigenvalue

A

The length of a line (an eigenvector) that goes from each side of an ellipse drawn around the scatterplot of the data.

35
Q

What is an eigenvector?

A

The two lines that go from each side of an ellipse drawn around the scatterplot of the data.

36
Q

What happens to residuals when a model is a poor fit?

A

Residuals will be large.

37
Q

What are three types of residuals?

A

Standardised, unstandardised, and studentised.

38
Q

What are unstandardised residuals?

A

The raw difference between predicted and observed scores.

39
Q

What are standardised residuals?

A

Unstandardised residuals converted to z-scores.
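A minimal sketch of the conversion, using made-up residuals (real software also uses the model's residual standard error, so treat this as an approximation):

```python
import statistics

def standardised_residuals(raw):
    """Convert unstandardised residuals to z-scores (mean 0, sd 1)."""
    m = statistics.mean(raw)
    sd = statistics.stdev(raw)  # sample standard deviation of the residuals
    return [(r - m) / sd for r in raw]

# Standardised residuals far beyond about +/-2 or +/-3 flag potential outliers.
print(standardised_residuals([-2.0, 0.0, 2.0]))  # → [-1.0, 0.0, 1.0]
```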

40
Q

What are studentised residuals?

A

Unstandardised residuals divided by an estimate of their standard deviation that varies from point to point.

41
Q

Name six ways to assess influential cases.

A

Mahalanobis distance, Cook's distance, deleted residuals, studentised deleted residuals, leverage (hat) values, and DFFit.

42
Q

What is adjusted predicted value?

A

the predicted value of the outcome for a case from a model where the case has been deleted.

43
Q

What is the deleted residual

A

The difference between the adjusted predicted value and the observed value.

44
Q

What is the studentised deleted residual?

A

A deleted residual divided by the standard error.

45
Q

What is the leverage (hat) value?

A

The influence of the observed value of the outcome variable over the predicted values.

46
Q

What is mahalanobis distance?

A

The distance of a case from the mean(s) of the predictor variable(s).

47
Q

What is Cook's distance?

A

A measure of the overall influence of a case on the model.

48
Q

What is DFFit?

A

The difference between the adjusted predicted value and the original predicted value.

49
Q

What is DFBeta?

A

The difference between a parameter estimated using all cases and estimated when one case is excluded.
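DFBeta can be illustrated on the slope of a simple regression by refitting with one case excluded (the data here are invented to make the last case influential):

```python
def slope(x, y):
    """Least-squares slope of y on x."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    num = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    den = sum((xi - mx) ** 2 for xi in x)
    return num / den

def dfbeta_slope(x, y, i):
    """Change in the slope estimate when case i is excluded."""
    x2, y2 = x[:i] + x[i + 1:], y[:i] + y[i + 1:]
    return slope(x, y) - slope(x2, y2)

# The outlying point (3, 10) drags the slope from 1.0 up to 3.1,
# so its DFBeta for the slope is 2.1.
print(dfbeta_slope([0, 1, 2, 3], [0.0, 1.0, 2.0, 10.0], 3))  # → 2.1
```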

50
Q

What is covariance ratio (CVR)?

A

quantifies the degree to which a case influences the variance of the regression parameters.

51
Q

Check the assumptions of homoscedasticity and linearity of residuals by?

A

Plot standardised predicted values vs. standardised residual values: a random array indicates the data are linear and homoscedastic.
Partial plots: residuals of the outcome variable vs. each predictor variable; evenly spaced dots around the line indicate homoscedasticity.

52
Q

Test normality of residuals?

A

A histogram and a probability plot

53
Q

What statistics does Bayesian regression give?

A

An estimate of b; 95% credible intervals for the model parameters, e.g. a 95% probability that the population value of b lies between …