Regression: The Linear Model Flashcards
What does a linear model with several predictors look like on a graph?
A regression plane in 3D (with two predictors); with more than two predictors it becomes a hyperplane that cannot be visualised directly.
SSR
The residual sum of squares: the sum of squared differences between the observed and predicted values. It measures the error in a linear model, so smaller values indicate a better fit.
What is cross-validation of a linear regression model?
Assessing whether the model accurately predicts the same outcome in a different sample of people (i.e., whether it generalises).
Methods of cross validation
Adjusted R squared
Stein's formula
data splitting
What does adjusted R squared do?
Tells us how much variance in Y would be accounted for if the model had been derived from the population from which the sample was taken.
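As a sketch, Wherry's adjusted R squared (the version most software reports) can be computed directly from R squared, the sample size n, and the number of predictors k; the example values below are made up for illustration:

```python
# Wherry's adjusted R^2: shrinks R^2 to estimate the variance the model
# would account for had it been derived from the population.
# n = sample size, k = number of predictors (example values are made up).
def adjusted_r_squared(r_squared, n, k):
    return 1 - (1 - r_squared) * (n - 1) / (n - k - 1)

print(round(adjusted_r_squared(0.60, 50, 5), 3))  # 0.555
```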
What does Steins formula do?
Tells us how well the model cross-validates: how much variance it would account for if applied to a different sample.
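One common statement of Stein's formula (the form presented in Field's Discovering Statistics) can be sketched as follows; the example values are illustrative:

```python
# Stein's formula: estimates the R^2 we would expect if the model were
# applied to an entirely different sample, i.e. how well it cross-validates.
def stein_adjusted_r_squared(r_squared, n, k):
    shrinkage = ((n - 1) / (n - k - 1)) * ((n - 2) / (n - k - 2)) * ((n + 1) / n)
    return 1 - shrinkage * (1 - r_squared)

# With R^2 = 0.60, n = 50, k = 5 this gives roughly 0.49, a harsher
# (and more realistic) estimate than Wherry's adjustment.
```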
What are 2 oversimplified but common rules of thumb for sample size when using a linear model?
10 cases per predictor, or 15 cases per predictor
What is a good method of deciding desired sample size?
Desired effect size
Amount of power wanted for statistical significance
Size of sample needed for a large effect
77 participants (with up to 20 predictors)
If a medium effect is expected, use a sample size of
55-150 cases (with up to 20 predictors)
If small effect expected use sample size of
1043 cases with 20 predictors
3 main stages in fitting a linear model
Initial data checks
Run initial regression
Check residuals
4 steps in the initial checks when fitting a linear model
Check linearity and look for unusual cases
Use graphs: scatterplots
If there is a lack of linearity
Transform the data
Fitting linear regression model: run initial regression
Save diagnostic statistics
Fitting linear regression model : check residuals
Plot ZPRED against ZRESID to check 3 things: linearity, homoscedasticity, and independence. Check normality of the residuals with a histogram.
Fit general linear model: If glm assumptions met and no bias
Model can be generalized
Fit general linear model: if heteroscedasticity is found, use either
Weighted least squares regression
OR
Bootstrap and transform data
Fit general linear model: If no normality
Bootstrap and transform
OR
Use a multi level model
Fit general linear model: If data lacks independence
Use a multi level model
Glm: multicollinearity defn
A strong correlation between two or more predictor variables
Is less than perfect collinearity avoidable?
No. It is virtually unavoidable
What does perfect collinearity mean?
One predictor variable has a perfect linear correlation (r = 1 or -1) with another predictor
Glm: multicollinearity and untrustworthy b values
As the correlation between two predictor variables increases, the standard errors of the b values increase. This increases the chance that a b is unrepresentative of its population value.
Compare glm with one predictor: what does large value of r squared mean
Better fit of model
Compare glm with one predictor: test statistic for assessing the significance of R squared
F statistic
It reflects how much variability the model explains relative to how much it leaves unexplained.
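The F-statistic for the overall model can be computed from R squared alone; a minimal sketch (the example values are illustrative, not from a real dataset):

```python
# F-statistic for the overall model fit, computed from R^2,
# sample size n and number of predictors k.
def f_from_r_squared(r_squared, n, k):
    explained_per_df = r_squared / k                     # per model df
    unexplained_per_df = (1 - r_squared) / (n - k - 1)   # per residual df
    return explained_per_df / unexplained_per_df

print(round(f_from_r_squared(0.60, 50, 5), 1))  # 13.2
```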
What is b value?
Gradient of regression line. Strength of relationship.
What does multicollinearity do to R?
It limits the size of R.
Why does multicollinearity limit the size of r
When two or more predictor variables are correlated, they account for overlapping portions of the same variance, so each contributes less unique variance to R squared.
Why multicollinearity is a problem
It makes it difficult to assess the importance of each predictor variable
Two steps for Identifying multicollinearity
Correlation matrix
Variance inflation factor
Identify multicollinearity by scanning the correlation matrix
Find highly correlated predictor variables:
r >= .8 or .9
Identifying multicollinearity: variance inflation factor indicates
Whether a predictor variable has a strong linear correlation with the other predictor variables
Identifying multicollinearity : interpretation of variance inflation factor
If the largest VIF is 10 or more (tolerance .10 or less), a serious problem is indicated.
If the average VIF is substantially greater than 1, the regression may be biased.
Tolerance
The reciprocal of VIF (1/VIF). Tolerance below .1 indicates a serious problem; below .2 a potential problem.
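A minimal sketch of the VIF/tolerance arithmetic. In the two-predictor case, R_j squared is just the squared correlation between the two predictors; with more predictors you would regress each predictor on all the others to obtain R_j squared:

```python
# VIF and tolerance for predictor j, given R_j^2 (the R^2 from
# regressing predictor j on the remaining predictors).
def vif(r_j_squared):
    return 1 / (1 - r_j_squared)

def tolerance(r_j_squared):
    return 1 - r_j_squared  # equivalently 1 / vif(r_j_squared)

r = 0.9                       # correlation between two predictors
print(round(vif(r ** 2), 2))  # 5.26: high, but below the VIF = 10 threshold
```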
What is an eigenvalue
The length of one axis of the ellipse drawn around a scatterplot of the data.
What is an eigenvector?
The two lines (axes) that run across an ellipse drawn around a scatterplot of the data.
What happens to residuals when a model is a poor fit?
Residuals will be large.
What are three types of residuals?
standardised, unstandardised, and studentised.
What are unstandardised residuals?
The raw difference between predicted and observed scores.
What are standardised residuals?
unstandardised residuals converted to z scores.
What are studentised residuals?
Unstandardised residuals divided by an estimate of their standard deviation that varies from case to case (adjusting for each case's leverage).
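The three residual types can be sketched as follows, assuming we already have the observed values, predicted values, the residual standard deviation s, and each case's leverage (hat) value; all names are illustrative:

```python
import math

# Sketch of the three residual types. y = observed values, y_hat =
# predicted values, s = residual standard deviation, h = leverage values.
def residual_types(y, y_hat, s, h):
    unstandardised = [yi - yhi for yi, yhi in zip(y, y_hat)]
    standardised = [e / s for e in unstandardised]  # converted to z-scores
    studentised = [e / (s * math.sqrt(1 - hi))      # case-by-case SD estimate
                   for e, hi in zip(unstandardised, h)]
    return unstandardised, standardised, studentised
```

Note that a high-leverage case (h close to 1) gets a larger studentised residual than its standardised one, because its standard-deviation estimate shrinks.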
Name six ways to assess influential cases.
Mahalanobis distance, Cook's distance, deleted residuals, studentised deleted residuals, leverage (hat) values, and DFFit.
What is adjusted predicted value?
the predicted value of the outcome for a case from a model where the case has been deleted.
What is the deleted residual
The difference between the adjusted predicted value and the observed value.
What is the studentised deleted residual?
A deleted residual divided by the standard error.
What is the leverage (hat) value?
The influence of the observed value of the outcome variable over the predicted values.
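A quick sketch of a common rule of thumb for leverage: the average leverage is (k + 1)/n, and cases with more than two or three times the average are worth examining (the example values are illustrative):

```python
# Average leverage for k predictors and n cases; cases with leverage
# more than two or three times this average deserve scrutiny.
def average_leverage(k, n):
    return (k + 1) / n

avg = average_leverage(5, 50)
print(avg, 2 * avg)  # 0.12 0.24 -> investigate cases with hat > 0.24
```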
What is mahalanobis distance?
The distance of a case from the means of the predictor variables.
What is Cook's distance?
A measure of the overall influence of a case on the model.
What is DFFit?
The difference between the adjusted predicted value and the original predicted value.
What is DFBeta?
The difference between a parameter estimated using all cases and estimated when one case is excluded.
What is covariance ratio (CVR)?
quantifies the degree to which a case influences the variance of the regression parameters.
Check the assumptions of homoscedasticity and linearity of residuals by?
Plotting the standardised predicted values against the standardised residuals: a random array of dots indicates the data are linear and homoscedastic.
Partial plots: residuals of the outcome variable against each predictor variable; evenly spaced dots around the line indicate homoscedasticity.
Test normality of residuals?
Histogram and normal probability (P-P) plot
What statistics does Bayesian regression give?
An estimate of b and 95% credible intervals for the model parameters, e.g. there is a 95% probability that the population value of b lies within the interval.