Linear Regression Flashcards

1
Q

Linear Regression

A

Examines the linear relationship between one or more independent (predictor) variables and a continuous dependent (outcome) variable

2
Q

Simple Linear Regression

A

1 independent/predictor variable

3
Q

Multiple Linear Regression

A

> 1 independent/predictor variable

4
Q

Fitted Line (best fit) Equation

A

Y = b0 + b1X
b0 = intercept
b1 = slope
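
As an illustration of how b0 and b1 are obtained, here is a minimal sketch of an ordinary least-squares fit in plain Python. The helper name `fit_line` and the toy data are assumptions made for this example, not part of the original card.

```python
def fit_line(x, y):
    """Return (b0, b1) for the least-squares line Y = b0 + b1*X."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Slope: co-variation of X and Y divided by the variation of X.
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    b1 = sxy / sxx
    # Intercept: the fitted line passes through (mean_x, mean_y).
    b0 = mean_y - b1 * mean_x
    return b0, b1


# Hypothetical toy data
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]

b0, b1 = fit_line(x, y)
print(f"b0 = {b0:.2f}, b1 = {b1:.2f}")  # b0 = 2.20, b1 = 0.60
```

Here each one-unit increase in X is associated with an average increase of 0.6 in Y.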

5
Q

Slope

A

The average amount of change in the outcome variable for each one-unit increase in the predictor variable
-Characterizes the relationship or marginal effect

6
Q

Residual

A

The vertical distance between an individual data point and the trend line (observed value minus predicted value); it measures how far each observation falls from the fitted line

7
Q

Residual is also known as what?

A

Estimated Error (u)

8
Q

Can residuals take both negative and positive values?

A

YES.
Positive = above the trend line
Negative = below the trend line
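
The sign convention can be checked with a short sketch. The toy data and the coefficient values (the least-squares estimates for that data) are assumptions for illustration only.

```python
# Hypothetical toy data and its least-squares line Y = 2.2 + 0.6*X
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]
b0, b1 = 2.2, 0.6

# Residual = observed value minus the value predicted by the trend line.
residuals = [yi - (b0 + b1 * xi) for xi, yi in zip(x, y)]

for xi, ui in zip(x, residuals):
    position = "above" if ui > 0 else "below" if ui < 0 else "on"
    print(f"x = {xi}: residual = {ui:+.1f} ({position} the trend line)")
```

For a least-squares line with an intercept, the positive and negative residuals cancel, so they sum to (approximately) zero.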

9
Q

What does the Regression Equation represent?

A

The sum of the fitted line and the error term: Y = b0 + b1X + u

10
Q

Sum of Squares Regression

A

How far the predicted values on the fitted line differ from the overall mean of the outcome (summed as squared differences)
Analogous to the between-groups sum of squares (SSB) in ANOVA

11
Q

Sum of Squares Residual

A

How far the original data differ from the predicted values on the fitted line (summed as squared differences)
Analogous to the within-groups sum of squares (SSW) in ANOVA
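
A minimal sketch of both sums of squares, assuming hypothetical toy data and its least-squares coefficients. It also shows that the two pieces add up to the total sum of squares around the mean.

```python
# Hypothetical toy data and its least-squares line Y = 2.2 + 0.6*X
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]
b0, b1 = 2.2, 0.6

mean_y = sum(y) / len(y)
predicted = [b0 + b1 * xi for xi in x]

# Regression SS: predicted values vs. the overall mean.
ss_regression = sum((p - mean_y) ** 2 for p in predicted)
# Residual SS: original data vs. predicted values.
ss_residual = sum((yi - p) ** 2 for yi, p in zip(y, predicted))
# Total SS: original data vs. the overall mean.
ss_total = sum((yi - mean_y) ** 2 for yi in y)

print(f"SS regression = {ss_regression:.2f}")  # 3.60
print(f"SS residual   = {ss_residual:.2f}")    # 2.40
print(f"SS total      = {ss_total:.2f}")       # 6.00
```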

12
Q

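What do the Residual and Regression Sums of Squares demonstrate?

A

Variance: together they split the total variation in the outcome into an explained part and an unexplained part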

13
Q

For a regression calculation, what must be done first?

A

ANOVA and F-statistic
- Confirm or deny significance of the model before proceeding with the rest of the regression calculations
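
A rough sketch of the F-statistic calculation from the two sums of squares. The input values are assumptions carried over from a hypothetical five-point, one-predictor example; turning F into a p-value needs an F table or a statistics package, which is left out here.

```python
# Hypothetical inputs: sums of squares from a fit with n = 5 and k = 1 predictor
ss_regression = 3.6
ss_residual = 2.4
n = 5  # number of observations
k = 1  # number of predictors

df_regression = k
df_residual = n - k - 1

# Mean squares = sums of squares divided by their degrees of freedom.
ms_regression = ss_regression / df_regression
ms_residual = ss_residual / df_residual

f_stat = ms_regression / ms_residual
print(f"F({df_regression}, {df_residual}) = {f_stat:.2f}")  # F(1, 3) = 4.50
```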

14
Q

ANOVA and F-Stat state whether or not the regression model is significant, but what does it NOT tell us?

A
  1. Positive or Negative Relationship
  2. Extent of change in Dependent Variable based on Independent Variable
15
Q

b1, the SLOPE, is what?

A

UNSTANDARDIZED estimate of the slope coefficient

16
Q

If the p-value for a regression t-statistic is less than alpha, what do you do?

A

REJECT the null hypothesis that b = 0

17
Q

The 95% CI for a regression coefficient cannot include what value if you want to reject the null hypothesis?

A

ZERO
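
A minimal sketch of the confidence-interval check for a slope. The toy inputs are hypothetical, and the critical t value (3.182 for 3 degrees of freedom, two-sided 95%) is hard-coded from a t table rather than computed.

```python
import math

# Hypothetical inputs from a simple regression on five points
x = [1, 2, 3, 4, 5]
b1 = 0.6           # estimated slope
ss_residual = 2.4  # residual sum of squares
n = len(x)

mean_x = sum(x) / n
sxx = sum((xi - mean_x) ** 2 for xi in x)

# Standard error of the slope: sqrt(residual mean square / Sxx).
se_b1 = math.sqrt((ss_residual / (n - 2)) / sxx)

t_crit = 3.182  # from a t table: two-sided 95%, df = n - 2 = 3
lower = b1 - t_crit * se_b1
upper = b1 + t_crit * se_b1
includes_zero = lower <= 0 <= upper

print(f"95% CI for b1: ({lower:.2f}, {upper:.2f})")  # (-0.30, 1.50)
print("Fail to reject H0: b1 = 0" if includes_zero else "Reject H0: b1 = 0")
```

In this example the interval contains zero, so the null hypothesis is not rejected.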

18
Q

b1 the Slope tells you the direction of the relationship (+/-), and extent of change in dependent variable based on independent variable, but it does NOT tell us what?

A
  1. Proportion of variance in dependent variable explained by independent variable
19
Q

Coefficient of Determination

A

Proportion of the variance in the dependent variable explained by the independent variable

20
Q

R^2 is the Coefficient of Determination (in simple linear regression, the square of the Pearson correlation coefficient). How is the R^2 value interpreted?

A

R^2 = proportion of variance in the outcome explained by the exposure

21
Q

1-R^2 = what?

A

Proportion of variance in outcome NOT explained by exposure
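
A short sketch of R^2 and 1 - R^2 on hypothetical toy data. It also compares R^2 with the squared Pearson correlation, which coincide in simple linear regression.

```python
import math

# Hypothetical toy data
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]
n = len(x)

mean_x = sum(x) / n
mean_y = sum(y) / n
sxx = sum((xi - mean_x) ** 2 for xi in x)
syy = sum((yi - mean_y) ** 2 for yi in y)  # total sum of squares
sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))

# Least-squares fit and its predictions.
b1 = sxy / sxx
b0 = mean_y - b1 * mean_x
predicted = [b0 + b1 * xi for xi in x]

ss_regression = sum((p - mean_y) ** 2 for p in predicted)
r_squared = ss_regression / syy   # explained proportion
unexplained = 1 - r_squared       # proportion NOT explained

pearson_r = sxy / math.sqrt(sxx * syy)

print(f"R^2 = {r_squared:.2f}, 1 - R^2 = {unexplained:.2f}")  # 0.60, 0.40
print(f"Pearson r = {pearson_r:.4f}, r^2 = {pearson_r ** 2:.2f}")
```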

22
Q

R^2 is bounded between 0 and 1. What do the values imply?

A

1 = perfect model fit (all of the variance is explained)
0 = no model fit (none of the variance is explained)

23
Q

Adjusted R^2

A
  1. Never higher than R^2 (lower whenever R^2 < 1 and the model has at least one predictor)
  2. Adjusts for the inherent increase in R^2 that occurs every time we add an independent variable to the regression equation
  3. Preferable when comparing models and in multiple linear regression
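
The adjustment can be sketched with the usual formula, adjusted R^2 = 1 - (1 - R^2)(n - 1)/(n - k - 1), where n is the sample size and k the number of predictors. The function name and the example numbers below are hypothetical.

```python
def adjusted_r_squared(r_squared, n, k):
    """Adjusted R^2 for n observations and k independent variables."""
    return 1 - (1 - r_squared) * (n - 1) / (n - k - 1)


# Hypothetical comparison: four extra predictors nudge R^2 from 0.60 to 0.61.
small_model = adjusted_r_squared(0.60, n=20, k=1)
large_model = adjusted_r_squared(0.61, n=20, k=5)

print(f"1 predictor:  R^2 = 0.60, adjusted = {small_model:.3f}")  # 0.578
print(f"5 predictors: R^2 = 0.61, adjusted = {large_model:.3f}")  # 0.471
```

R^2 went up, but the adjusted value went down, which suggests the extra predictors did not earn their place.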
24
Q

Multiple Linear Regression

A

Examine linear relationships between two or more independent variables and a continuous dependent variable

25
Q

What does Multiple Linear Regression consider?

A
  1. Accounts for Confounders
  2. Assess for Moderator effects
26
Q

Multiple Linear Regression does not look at a trend line but what?

A

Best Fit PLANE

27
Q

The goal is to remove every possible source of bias. What does this allow for?

A

Estimates that are less biased, more efficient (lower variance), and CLOSER to the true population parameter

28
Q

What type of outcome variable MUST be utilized in a linear regression?

A

CONTINUOUS
The predictor (exposure) variables do not have to be continuous

29
Q

Continuous Regression Coefficient

A

The expected change in the outcome for a one-unit increase in a continuous predictor, with all other control variables held constant
Statistically significant when the 95% confidence interval does NOT include 0

30
Q

Binary Regression Coefficient

A

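Dummy variables are BINARY and take a value of 1 if a particular criterion is met and 0 otherwise
The coefficient is the difference in the outcome between the group coded 1 and the group coded 0, with other variables held constant; if its 95% confidence interval INCLUDES 0, the difference is not statistically significant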
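
A small sketch of how a dummy-variable coefficient reads in the simplest case (one binary predictor, no other variables): the slope equals the difference between the two group means. The data and names are made up for illustration.

```python
def fit_line(x, y):
    """Return (b0, b1) for the least-squares line Y = b0 + b1*X."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    b1 = sxy / sxx
    return mean_y - b1 * mean_x, b1


# Hypothetical data: dummy = 1 if the criterion is met, 0 otherwise
dummy = [0, 0, 0, 1, 1, 1]
outcome = [10, 12, 14, 15, 17, 19]

b0, b1 = fit_line(dummy, outcome)

mean_0 = sum(o for d, o in zip(dummy, outcome) if d == 0) / dummy.count(0)
mean_1 = sum(o for d, o in zip(dummy, outcome) if d == 1) / dummy.count(1)

print(f"b0 = {b0:.1f} (mean of group 0 = {mean_0:.1f})")
print(f"b1 = {b1:.1f} (difference in means = {mean_1 - mean_0:.1f})")
```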

31
Q

Is the Adjusted R^2 preferred or not?

A

YES, adjusted R^2 adds a penalty for incorporating extra control variables into the model
LOWER value due to the penalties

32
Q

Adj R^2 < R^2

A

A large difference between the two values indicates the possibility of superfluous control variables

33
Q

Manifestation of Effect of a Modified Variable

A

Should be tested and, if statistically significant, retained as an independent variable

34
Q

How would you check to see if a Moderator Variable is statistically significant?

A

Two-Way Interaction
1. A model without an interaction term forces the effects to be additive
2. If the effect of the exposure on the outcome depends on the level of another factor, the effects are NOT simply additive but ALSO MULTIPLICATIVE
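
To make the idea of a two-way interaction concrete, here is a sketch with a binary moderator. The data are constructed (an assumption for this example) so that outcome = 1 + 2*exposure + 3*moderator + 4*exposure*moderator; the slope of the exposure is then estimated separately at each moderator level.

```python
def slope(x, y):
    """Least-squares slope of y on x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    return sxy / sxx


# Hypothetical data built from 1 + 2*exposure + 3*moderator + 4*exposure*moderator
exposure = [0, 1, 2, 0, 1, 2]
moderator = [0, 0, 0, 1, 1, 1]
outcome = [1, 3, 5, 4, 10, 16]


def slope_within(level):
    """Slope of outcome on exposure among rows with the given moderator level."""
    xs = [e for e, m in zip(exposure, moderator) if m == level]
    ys = [o for o, m in zip(outcome, moderator) if m == level]
    return slope(xs, ys)


slope_m0 = slope_within(0)
slope_m1 = slope_within(1)
# With a binary moderator, the interaction coefficient equals the slope difference.
interaction = slope_m1 - slope_m0

print(f"slope when moderator = 0: {slope_m0:.1f}")  # 2.0
print(f"slope when moderator = 1: {slope_m1:.1f}")  # 6.0
print(f"interaction effect:       {interaction:.1f}")  # 4.0
```

If the effects were purely additive, the two slopes would be equal and the interaction term would be zero.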

35
Q

b0 and b1 are what type of estimates?

A

UNSTANDARDIZED Estimates of intercept and slope coefficients

36
Q

What is SE?

A

Standard Error of the Coefficients

37
Q

What is B?

A

STANDARDIZED Regression Coefficient in SD UNITS

38
Q

Can you compare between unstandardized and standardized units?

A

NO

39
Q

t = (b - B_H0) / SE

A

For a simple linear regression, the slope's t-statistic is the square root of the ANOVA F statistic (t^2 = F)
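
A sketch that checks the t and F relationship numerically on hypothetical toy data, with the null value B_H0 set to 0.

```python
import math

# Hypothetical toy data
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]
n = len(x)

mean_x = sum(x) / n
mean_y = sum(y) / n
sxx = sum((xi - mean_x) ** 2 for xi in x)
sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))

b1 = sxy / sxx
b0 = mean_y - b1 * mean_x
predicted = [b0 + b1 * xi for xi in x]

ss_regression = sum((p - mean_y) ** 2 for p in predicted)
ss_residual = sum((yi - p) ** 2 for yi, p in zip(y, predicted))
ms_residual = ss_residual / (n - 2)

# t = (b - B_H0) / SE, with B_H0 = 0 under the null hypothesis.
b_h0 = 0
se_b1 = math.sqrt(ms_residual / sxx)
t_stat = (b1 - b_h0) / se_b1

# F = regression mean square (1 df) / residual mean square.
f_stat = (ss_regression / 1) / ms_residual

print(f"t = {t_stat:.2f}, t^2 = {t_stat ** 2:.2f}, F = {f_stat:.2f}")
```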

40
Q

Unstandardized b

A

Unstandardized regression coefficients CANNOT be directly compared to each other, because each is expressed in the units of its own predictor

41
Q

Standardized B

A

Permit direct comparison of regression coefficients to one another (i.e. which independent variable explained more variance in the dependent variable)
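
A sketch of why standardized coefficients are comparable, assuming the common conversion B = b * SD(x) / SD(y) for a single predictor. The variable names and data are hypothetical: the same predictor is recorded in years and in months.

```python
import statistics


def slope(x, y):
    """Unstandardized least-squares slope of y on x."""
    mean_x = statistics.mean(x)
    mean_y = statistics.mean(y)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    return sxy / sxx


def standardized_slope(x, y):
    """Slope in SD units: SDs of change in y per one-SD change in x."""
    return slope(x, y) * statistics.stdev(x) / statistics.stdev(y)


# Hypothetical predictor recorded in two different units
x_years = [1, 2, 3, 4, 5]
x_months = [12 * v for v in x_years]
y = [2, 4, 5, 4, 5]

print(f"b (years)  = {slope(x_years, y):.3f}")   # 0.600
print(f"b (months) = {slope(x_months, y):.3f}")  # 0.050
print(f"B (years)  = {standardized_slope(x_years, y):.3f}")
print(f"B (months) = {standardized_slope(x_months, y):.3f}")
```

The unstandardized slope changes with the unit of measurement, while the standardized value stays the same.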

42
Q

What are the Factors that influence appropriateness of Multiple Linear Regression?

A
  1. Homoscedasticity
  2. Multicollinearity
43
Q

Homoscedasticity of Residuals

A

Variance is the same at ALL points along the regression line

44
Q

If heteroscedasticity is present (the residuals appear in a pattern), what type of test should be used instead?

A

Logistic Regression

45
Q

Multicollinearity

A

Strong correlations between predictor variables (r > 0.90); can RESULT IN TYPE II ERROR
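
A simple screening sketch: compute the pairwise Pearson correlation between predictors and flag any pair above the 0.90 cutoff named on the card. The predictor names and values are hypothetical.

```python
import math
from itertools import combinations


def pearson_r(a, b):
    """Pearson correlation between two equal-length lists."""
    mean_a = sum(a) / len(a)
    mean_b = sum(b) / len(b)
    sab = sum((ai - mean_a) * (bi - mean_b) for ai, bi in zip(a, b))
    saa = sum((ai - mean_a) ** 2 for ai in a)
    sbb = sum((bi - mean_b) ** 2 for bi in b)
    return sab / math.sqrt(saa * sbb)


# Hypothetical predictor variables
predictors = {
    "age": [25, 32, 47, 51, 62],
    "years_employed": [3, 9, 24, 29, 40],
    "weekly_exercise_hours": [4, 1, 5, 2, 3],
}

THRESHOLD = 0.90  # cutoff taken from the card above
flagged = []
for name_a, name_b in combinations(predictors, 2):
    r = pearson_r(predictors[name_a], predictors[name_b])
    print(f"{name_a} vs {name_b}: r = {r:.3f}")
    if abs(r) > THRESHOLD:
        flagged.append((name_a, name_b))

print("Possible multicollinearity:", flagged)
```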

46
Q

If Multicollinearity is present, what test should be used instead?

A

Stepwise Regression