Regression Flashcards

1
Q

purpose of ordinary least squares regression

A

technique for finding the best fitting straight line for a set of data
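
For concreteness, a minimal sketch of that idea in Python (made-up data; the closed-form slope and intercept for a single predictor):

```python
# Minimal sketch: simple OLS by hand with NumPy (made-up data).
import numpy as np

x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([2.1, 3.9, 6.2, 8.1, 9.8])

# Closed-form solution: b1 = cov(x, y) / var(x), b0 = mean(y) - b1 * mean(x)
b1 = np.sum((x - x.mean()) * (y - y.mean())) / np.sum((x - x.mean()) ** 2)
b0 = y.mean() - b1 * x.mean()

print(f"best-fitting line: y = {b0:.2f} + {b1:.2f}x")
```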

2
Q

why do we use the sum of squared residuals

A

some residuals are positive and some are negative, so they would cancel out if simply summed; squaring them first removes the sign
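
A quick numeric illustration of why the residuals are squared before summing (made-up numbers):

```python
# Raw residuals from a least-squares fit sum to (about) zero, so the
# positives and negatives cancel; squaring makes every term positive.
import numpy as np

residuals = np.array([1.5, -2.0, 0.5, -1.0, 1.0])
print(residuals.sum())         # 0.0  -- no information about total error
print((residuals ** 2).sum())  # 8.5  -- a usable measure of total error
```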

3
Q

simplest model (null model)

A

uses the mean of the outcome as the model

4
Q

coefficient of determination (R^2)

A

proportion of variance in the outcome that is explained by the regression line compared to that explained by the mean
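
In symbols, with SS_T the total sum of squares around the mean (the null model), SS_M the model sum of squares, and SS_R the residual sum of squares:

```latex
R^2 = \frac{SS_M}{SS_T} = 1 - \frac{SS_R}{SS_T}
```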

5
Q

MSm

A

How much the model has improved the prediction

6
Q

MSr

A

level of inaccuracy of the model

7
Q

Spearman's correlation coefficient

A

non-parametric statistic based on ranked data

8
Q

can the coefficient of determination be used to determine causality?

A

nope

9
Q

square of Pearson's r gives you what

A

proportion of variance in one variable that is shared by the other

10
Q

square of Spearman's coefficient gives you what

A

proportion of variance in the ranks that the two variables share

11
Q

can you square Kendall's tau

A

nope

12
Q

outcome variable

A

dependent variable

13
Q

predictor variable

A

independent variable

14
Q

simple regression

A

1 predictor

15
Q

multiple regression

A

multiple predictors

16
Q

residuals

A

difference between what the model predicts and the observed data

17
Q

how to assess error in a regression model

A

sum of squared residuals

18
Q

F-tests are based on what

A

the ratio of the improvement due to the model (MSm) to the error remaining in the model (MSr)

19
Q

degrees of freedom for the model sum of squares

A

number of variables in the model

20
Q

degrees of freedom for the residual sum of squares

A

number of observations - number of parameters being estimated
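
Putting the last few cards together (assuming k predictors and n observations, so the parameters estimated are the k slopes plus the intercept):

```latex
MS_M = \frac{SS_M}{k}, \qquad
MS_R = \frac{SS_R}{n - k - 1}, \qquad
F = \frac{MS_M}{MS_R}
```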

21
Q

standardized residuals

A

residuals converted to z-scores

22
Q

studentized residual

A

unstandardized residual divided by an estimate of its standard deviation that varies point by point
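
One common form of that estimate (assuming e_i is the unstandardized residual, s the estimated standard deviation of the residuals, and h_i the leverage of case i, which is what makes the denominator vary point by point):

```latex
r_i = \frac{e_i}{s\sqrt{1 - h_i}}
```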

23
Q

deleted residual

A

adjusted predicted value minus the original observed value

24
Q

Cook's distance

A

considers the effect of a single case on the model as a whole

25
Q

Mahalanobis distance

A

measures the distance of cases from the mean(s) of the predictor variable(s)

26
Q

what type of distribution does Mahalanobis distance have

A

chi-squared

27
Q

how do you determine degrees of freedom for Mahalanobis distance

A

number of predictors

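A minimal sketch of the last three cards in Python (made-up data; the 0.99 chi-squared cutoff is just one convention):

```python
# Mahalanobis distances of cases from the predictor means, flagged
# against a chi-squared cutoff with df = number of predictors.
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
X = rng.normal(size=(50, 2))        # made-up data: 50 cases, 2 predictors

diff = X - X.mean(axis=0)           # deviation of each case from the means
cov_inv = np.linalg.inv(np.cov(X, rowvar=False))
d2 = np.einsum("ij,jk,ik->i", diff, cov_inv, diff)  # squared distances

cutoff = stats.chi2.ppf(0.99, df=X.shape[1])        # df = number of predictors
print(np.where(d2 > cutoff)[0])     # indices of potential multivariate outliers
```
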
28
Q

DFBeta

A

parameter estimated using all cases minus the parameter estimated when one case is excluded

29
Q

DFFit

A

predicted value for a case from the model including that case minus the predicted value from the model excluding that case

30
Q

if a case is not influential what would DFFit be

A

0

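A minimal sketch of those definitions, literally refitting the model with each case excluded (made-up data, simple regression):

```python
# DFBeta: change in each parameter estimate when one case is excluded.
# DFFit: change in a case's predicted value when that case is excluded.
import numpy as np

rng = np.random.default_rng(1)
x = rng.normal(size=30)
y = 2.0 * x + rng.normal(size=30)

b_all = np.polyfit(x, y, 1)              # [slope, intercept] from all cases
for i in range(len(x)):
    keep = np.arange(len(x)) != i        # leave case i out
    b_i = np.polyfit(x[keep], y[keep], 1)
    dfbeta = b_all - b_i                 # one value per parameter
    dffit = np.polyval(b_all, x[i]) - np.polyval(b_i, x[i])
    # dffit stays near 0 for cases that are not influential
```
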
31
Q

covariance ratio

A

measures whether a case influences the variance of the regression parameters

32
Q

assumptions of the linear model

A

additivity and linearity
independent errors
homoscedasticity
normally distributed errors
predictors are uncorrelated with external variables
variable types

33
Q

additivity

A

combined effect of predictors is best described by adding their effects together

34
Q

Durbin-Watson test

A

tests for serial correlation between errors (assumption of independent errors)

35
Q

what range does the Durbin-Watson test statistic vary over

A

0 to 4

36
Q

Durbin-Watson test statistic of 2

A

residuals are uncorrelated

37
Q

Durbin-Watson test statistic > 2

A

negative correlation

38
Q

Durbin-Watson test statistic < 2

A

positive correlation

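The statistic behind the last few cards, with e_t the residual for observation t. Because d is approximately 2(1 - r) for the lag-1 correlation r between residuals, d = 2 means uncorrelated residuals and the possible range runs from 0 to 4:

```latex
d = \frac{\sum_{t=2}^{n} \left(e_t - e_{t-1}\right)^2}{\sum_{t=1}^{n} e_t^2}
```
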
39
Q

what happens to a linear model if the independent errors assumption is broken

A

confidence intervals and significance tests are invalid | method of least squares estimates are still OK

40
Q

types of predictor variables allowed in a linear regression

A

quantitative or categorical (dichotomous)

41
Q

types of outcome variables allowed in a linear regression

A

quantitative, continuous, and unbounded

42
Q

unbounded variable

A

no constraints on the variability of the outcome

43
Q

no perfect multicollinearity

A

no perfect linear relationship between 2+ predictors

44
Q

function of cross-validation

A

assesses the accuracy of a model across different samples

45
Q

adjusted R^2

A

tells us how much of the variance in Y would be accounted for if the model had been derived from the population from which the sample was taken

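One common version of the adjustment (Wherry's formula, assuming n observations and k predictors):

```latex
\text{adjusted } R^2 = 1 - \left(1 - R^2\right)\frac{n - 1}{n - k - 1}
```
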
46
Q

data splitting

A

randomly split your data, run a regression on both halves, then compare the resulting models

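A minimal sketch on made-up data: if the model generalizes, the two halves should produce similar estimates.

```python
# Split-half cross-validation: fit the same regression on two random
# halves of the data and compare the resulting coefficients.
import numpy as np

rng = np.random.default_rng(2)
x = rng.normal(size=100)
y = 1.5 * x + rng.normal(size=100)

idx = rng.permutation(len(x))            # random split into two halves
half_a, half_b = idx[:50], idx[50:]

b_a = np.polyfit(x[half_a], y[half_a], 1)
b_b = np.polyfit(x[half_b], y[half_b], 1)
print(b_a, b_b)  # similar slopes/intercepts suggest the model cross-validates
```
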
47
Q

sample size needed for an expected large effect

A

77

48
Q

sample size needed for an expected medium effect

A

160

49
Q

b-value

A

tells us the gradient of the regression line and the strength of the relationship between a predictor and the outcome

50
Q

F

A

tells us how much variability the model can explain relative to how much it cannot explain

51
Q

hierarchical regression

A

predictors are selected based on past work, and the researcher decides the order in which to enter them into the model (known predictors are entered first)

52
Q

forced entry

A

all predictors are forced into the model simultaneously

53
Q

stepwise regression

A

decisions about the order in which predictors are entered are purely mathematical

54
Q

suppressor effects

A

a predictor has a significant effect, but only when another variable is held constant

55
Q

forward method has a higher risk of what type of errors

A

type II

56
Q

Akaike information criterion (AIC)

A

measure of fit that penalizes the model for having more variables

57
Q

perfect collinearity

A

at least one predictor is a perfect linear combination of the others (correlation coefficient of 1)

58
Q

as collinearity increases what else increases

A

standard errors

59
Q

test that looks at collinearity

A

variance inflation factor (VIF)

60
Q

tolerance

A

1/VIF

61
Q

when should we be concerned about VIF

A

when the largest VIF is > 10 or the average VIF is substantially > 1

62
Q

when should we be concerned about tolerance

A

< 0.2 is a potential problem | < 0.1 is a serious problem

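A minimal sketch of the idea behind the last four cards (made-up data): regress each predictor on all the others, then VIF = 1 / (1 - R^2) and tolerance is its reciprocal.

```python
# VIF for each predictor, computed from a least-squares fit of that
# predictor on the remaining predictors (intercept included).
import numpy as np

rng = np.random.default_rng(3)
X = rng.normal(size=(100, 3))
X[:, 2] = X[:, 0] + 0.1 * rng.normal(size=100)   # make two predictors collinear

for j in range(X.shape[1]):
    others = np.delete(X, j, axis=1)
    A = np.column_stack([np.ones(len(X)), others])
    coef, *_ = np.linalg.lstsq(A, X[:, j], rcond=None)
    resid = X[:, j] - A @ coef
    r2 = 1 - resid.var() / X[:, j].var()
    vif = 1 / (1 - r2)
    print(f"predictor {j}: VIF = {vif:.1f}, tolerance = {1 / vif:.2f}")
```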