Linear Regression Flashcards

1
Q

Linear Regression

A

Examines the linear relationship between one or more independent (predictor) variables and a continuous dependent (outcome) variable

2
Q

Simple Linear Regression

A

1 independent/predictor variable

3
Q

Multiple Linear Regression

A

> 1 independent/predictor variable

4
Q

Fitted Line (best fit) Equation

A

Y = b0 + b1X
b0 = intercept
b1 = slope
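As a minimal sketch of how b0 and b1 can be obtained, the snippet below computes the least-squares intercept and slope by hand. The data values and the helper name `fit_line` are made up for illustration and are not part of the flashcards.

```python
def fit_line(x, y):
    """Return (b0, b1) for the least-squares line Y = b0 + b1*X."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Slope: co-variation of X and Y divided by the variation of X
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    b1 = sxy / sxx
    # Intercept: the fitted line passes through (mean_x, mean_y)
    b0 = mean_y - b1 * mean_x
    return b0, b1


x = [1, 2, 3, 4, 5]  # made-up predictor values
y = [2, 4, 5, 4, 5]  # made-up outcome values
b0, b1 = fit_line(x, y)
print(round(b0, 2), round(b1, 2))  # intercept about 2.2, slope about 0.6
```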

5
Q

Slope

A

The average amount of change in the outcome variable for each one-unit increase in the predictor variable
-Characterizes the relationship or marginal effect

6
Q

Residual

A

The vertical distance between an individual data point and the trend line (observed value minus predicted value); it measures how far each observation falls from the fitted line

7
Q

Residual is also known as what?

A

Estimated Error (u)

8
Q

Can the residual take both negative and positive values?

A

YES.
Positive = data point above the trend line
Negative = data point below the trend line
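A small sketch of residual signs, assuming made-up data and the least-squares line for those data (b0 = 2.2, b1 = 0.6). Each residual is the observed value minus the value predicted by the line.

```python
x = [1, 2, 3, 4, 5]  # made-up predictor values
y = [2, 4, 5, 4, 5]  # made-up outcome values
b0, b1 = 2.2, 0.6    # least-squares fit for these made-up data

# Residual = observed Y minus predicted Y on the fitted line
residuals = [yi - (b0 + b1 * xi) for xi, yi in zip(x, y)]

for xi, r in zip(x, residuals):
    side = "above" if r > 0 else "below"
    print(f"x={xi}: residual={r:+.1f} ({side} the line)")

# For a least-squares line with an intercept, the residuals sum to zero
print(abs(sum(residuals)) < 1e-9)
```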

9
Q

What does the Regression Equation mean?

A

The sum of the fitted line and the error term: Y = b0 + b1X + u

10
Q

Sum of Squares Regression

A

How far the predicted values on the fitted line differ from the overall mean
Analogous to SSB (between-groups sum of squares) in ANOVA

11
Q

Sum of Squares Residual

A

Sum of the squared differences between the original data and the predicted values on the fitted line
Analogous to SSW (within-groups sum of squares) in ANOVA

12
Q

What do the Residual and Regression Sums of Squares demonstrate?

A

Variance
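The two sums of squares split the total variation in the outcome. The sketch below, using made-up data, computes both and checks that they add up to the total sum of squares.

```python
x = [1, 2, 3, 4, 5]  # made-up predictor values
y = [2, 4, 5, 4, 5]  # made-up outcome values
n = len(x)
mean_x, mean_y = sum(x) / n, sum(y) / n

# Least-squares fit
sxx = sum((xi - mean_x) ** 2 for xi in x)
b1 = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y)) / sxx
b0 = mean_y - b1 * mean_x
predicted = [b0 + b1 * xi for xi in x]

# Regression SS: predicted values versus the overall mean
ss_regression = sum((p - mean_y) ** 2 for p in predicted)
# Residual SS: original data versus predicted values
ss_residual = sum((yi - p) ** 2 for yi, p in zip(y, predicted))
# Total SS: original data versus the overall mean
ss_total = sum((yi - mean_y) ** 2 for yi in y)

print(round(ss_regression, 2), round(ss_residual, 2), round(ss_total, 2))
```

For these data the regression and residual sums of squares come to about 3.6 and 2.4, which together equal the total of 6.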

13
Q

In a regression analysis, what must be done first?

A

ANOVA and F-Stat
-Confirm or deny the overall significance of the model before proceeding with the rest of the regression calculations
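A hedged sketch of the overall F statistic: the regression mean square divided by the residual mean square. The function name `f_statistic` and the input values are illustrative assumptions, not part of the flashcards.

```python
def f_statistic(ss_regression, ss_residual, n, p):
    """Overall F = mean square regression / mean square residual."""
    ms_regression = ss_regression / p        # df = number of predictors
    ms_residual = ss_residual / (n - p - 1)  # df = n - p - 1
    return ms_regression / ms_residual


# Made-up sums of squares for n = 5 observations and p = 1 predictor
f_stat = f_statistic(ss_regression=3.6, ss_residual=2.4, n=5, p=1)
print(round(f_stat, 2))  # about 4.5
```

The resulting F is then compared with a critical value (or converted to a p-value) for p and n - p - 1 degrees of freedom.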

14
Q

ANOVA and F-Stat state whether or not the regression model is significant, but what does it NOT tell us?

A
  1. Positive or Negative Relationship
  2. Extent of change in Dependent Variable based on Independent Variable
15
Q

What is b1, the SLOPE?

A

UNSTANDARDIZED estimate of the slope coefficient

16
Q

If the p-value for the t-stat in a regression is less than alpha, what would you do?

A

REJECT the null hypothesis that b = 0

17
Q

What must the 95% CI for a regression coefficient NOT include if you want to reject the null hypothesis?

A

ZERO
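A minimal sketch of the slope test, using made-up data. The critical value 3.182 is an assumed t-table entry for a two-sided 5% test with n - 2 = 3 degrees of freedom; it is hard-coded because the standard library has no t distribution.

```python
import math

x = [1, 2, 3, 4, 5]  # made-up predictor values
y = [2, 4, 5, 4, 5]  # made-up outcome values
n = len(x)
mean_x, mean_y = sum(x) / n, sum(y) / n
sxx = sum((xi - mean_x) ** 2 for xi in x)
b1 = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y)) / sxx
b0 = mean_y - b1 * mean_x

# Standard error of the slope from the residual mean square
ss_residual = sum((yi - (b0 + b1 * xi)) ** 2 for xi, yi in zip(x, y))
se_b1 = math.sqrt(ss_residual / (n - 2) / sxx)

t_stat = (b1 - 0) / se_b1  # null hypothesis: slope = 0
T_CRIT = 3.182             # assumed table value: two-sided 5%, df = 3
ci_low, ci_high = b1 - T_CRIT * se_b1, b1 + T_CRIT * se_b1

# Reject the null only if the 95% CI excludes zero
reject_null = not (ci_low <= 0 <= ci_high)
print(round(t_stat, 2), round(ci_low, 2), round(ci_high, 2), reject_null)
```

With these made-up values the interval runs from about -0.30 to 1.50, so it includes zero and the null hypothesis is not rejected.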

18
Q

b1, the slope, tells you the direction of the relationship (+/-) and the extent of change in the dependent variable based on the independent variable, but what does it NOT tell us?

A
  1. Proportion of variance in dependent variable explained by independent variable
19
Q

Coefficient of Determination

A

Proportion of the variance in the dependent variable explained by the independent variable

20
Q

R^2 is the Coefficient of Determination (in simple linear regression, the square of the Pearson correlation coefficient). How is the R^2 value interpreted?

A

R^2 = proportion of variance in the outcome explained by the exposure

21
Q

1-R^2 = what?

A

Proportion of variance in outcome NOT explained by exposure

22
Q

R^2 is bounded between 0 and 1. What do the values imply?

A

1 = excellent model fit
0 = no model fit
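As a sketch with made-up data, R^2 can be computed as one minus the residual share of the total variation, and in simple linear regression it matches the squared Pearson correlation.

```python
import math

x = [1, 2, 3, 4, 5]  # made-up predictor values
y = [2, 4, 5, 4, 5]  # made-up outcome values
n = len(x)
mean_x, mean_y = sum(x) / n, sum(y) / n
sxx = sum((xi - mean_x) ** 2 for xi in x)
sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
ss_total = sum((yi - mean_y) ** 2 for yi in y)

b1 = sxy / sxx
b0 = mean_y - b1 * mean_x
ss_residual = sum((yi - (b0 + b1 * xi)) ** 2 for xi, yi in zip(x, y))

r_squared = 1 - ss_residual / ss_total      # proportion explained
unexplained = 1 - r_squared                 # proportion NOT explained
pearson_r = sxy / math.sqrt(sxx * ss_total)

print(round(r_squared, 2), round(unexplained, 2), round(pearson_r ** 2, 2))
```

Here about 60% of the variance in the outcome is explained and about 40% is not.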

23
Q

Adjusted R^2

A
  1. Always lower than R^2
  2. Adjusts for inherent increase in R^2 that occurs every time we add an independent variable to regression equation
  3. Preferable when comparing models and in multiple linear regression
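A short sketch of the usual adjusted R^2 formula, 1 - (1 - R^2)(n - 1)/(n - p - 1), where n is the sample size and p the number of predictors. The function name and the example numbers are illustrative assumptions.

```python
def adjusted_r2(r2, n, p):
    """Adjusted R^2 for n observations and p independent variables."""
    return 1 - (1 - r2) * (n - 1) / (n - p - 1)


# Made-up example: adding a second predictor nudges R^2 from 0.60 to 0.61,
# but the penalty for the extra variable pulls adjusted R^2 down.
one_predictor = adjusted_r2(0.60, n=5, p=1)
two_predictors = adjusted_r2(0.61, n=5, p=2)
print(round(one_predictor, 3), round(two_predictors, 3))
```

In this made-up case adjusted R^2 falls from about 0.467 to about 0.22 even though plain R^2 rose slightly.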
24
Q

Multiple Linear Regression

A

Examine linear relationships between two or more independent variables and a continuous dependent variable

25
What does Multiple Linear Regression consider?
  1. Accounts for confounders
  2. Assesses moderator effects
26
Multiple Linear Regression does not look at a trend line but what?
Best Fit PLANE
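With two predictors, the fitted line generalizes to a plane. In the notation of the simple-regression cards, and assuming the predictors are labeled X_1 and X_2 (labels not in the original cards), the fitted equation can be written as:

```latex
\hat{Y} = b_0 + b_1 X_1 + b_2 X_2
```

Each b_j is the change in the predicted outcome per one-unit increase in X_j with the other predictor held constant.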
27
The goal is to remove each and every possible source of bias. What does this allow for?
Estimates that are less biased, more efficient (lower variance), and CLOSER to the true population parameter
28
What type of outcome variable MUST be utilized in a linear regression?
CONTINUOUS. The predictor variables (exposure) do not have to be.
29
Continuous Regression Coefficient
Interpreted with all other control variables besides the one being analyzed held constant; the coefficient is statistically significant when the 95% confidence interval does NOT include 0
30
Binary Regression Coefficient
Dummy variables are BINARY and take a value of 1 if a particular criterion is met and 0 otherwise. When the 95% confidence interval INCLUDES 0, the coefficient is not statistically significant.
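A sketch of how a dummy-variable coefficient reads in the simplest case: with a single 0/1 predictor, the least-squares slope equals the difference between the two group means, and the intercept equals the mean of the group coded 0. The data are made up for illustration.

```python
group = [0, 0, 0, 1, 1, 1]  # dummy variable: 1 if the criterion is met
y = [3, 4, 5, 6, 7, 8]      # made-up continuous outcome

n = len(y)
mean_x, mean_y = sum(group) / n, sum(y) / n
sxx = sum((g - mean_x) ** 2 for g in group)
sxy = sum((g - mean_x) * (yi - mean_y) for g, yi in zip(group, y))
b1 = sxy / sxx
b0 = mean_y - b1 * mean_x

# Group means computed directly for comparison
mean_0 = sum(yi for g, yi in zip(group, y) if g == 0) / group.count(0)
mean_1 = sum(yi for g, yi in zip(group, y) if g == 1) / group.count(1)

print(b0, b1, mean_0, mean_1 - mean_0)
```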
31
Is the Adjusted R^2 preferred or not?
YES. Adjusted R^2 adds a penalty for incorporating extra control variables into the model; its value is LOWER because of the penalty.
32
Adj R^2 < R^2
A large difference between the two values indicates the possibility of superfluous control variables
33
Effect of a Moderator (Effect-Modifying) Variable
Should be tested and, if statistically significant, remain in the model as an independent variable
34
How would you check to see if a Moderator Variable is statistically significant?
Test a Two-Way Interaction
  1. Without an interaction term, the model forces effects to be additive
  2. If the effect of the exposure on the outcome depends on another factor, the effects are NOT simply additive but ALSO MULTIPLICATIVE
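A sketch of what a two-way interaction term does, assuming a hypothetical model Y = b0 + b1*X + b2*Z + b3*X*Z with made-up coefficients. With the product term included, the effect of a one-unit increase in X depends on the value of the moderator Z.

```python
def predict(x, z, b0=1.0, b1=2.0, b2=0.5, b3=1.5):
    """Hypothetical model with a two-way interaction term (b3 * x * z)."""
    return b0 + b1 * x + b2 * z + b3 * x * z


# Effect of a one-unit increase in X at two levels of the moderator Z
effect_z0 = predict(1, 0) - predict(0, 0)  # equals b1
effect_z1 = predict(1, 1) - predict(0, 1)  # equals b1 + b3
print(effect_z0, effect_z1)  # 2.0 3.5
```

If b3 were 0, both effects would be identical, which is the purely additive case.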
35
b0 and b1 are what type of estimates?
UNSTANDARDIZED estimates of the intercept and slope coefficients
36
What is SE?
Standard Error of the Coefficients
37
What is B?
STANDARDIZED Regression Coefficient in SD UNITS
38
Can you compare between unstandardized and standardized units?
NO
39
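t = (b - β_H0) / SE, where β_H0 is the coefficient value under the null hypothesis (usually 0)
For a simple linear regression, the t-statistic for the slope is the square root of the ANOVA F statistic (in absolute value)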
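The link between t and F in simple linear regression can be checked numerically. The summary values below are made up for illustration (n = 5, one predictor).

```python
import math

# Made-up summary values from a simple linear regression
n = 5
b1 = 0.6             # slope estimate
sxx = 10.0           # sum of squared deviations of X from its mean
ss_regression = 3.6
ss_residual = 2.4

ms_residual = ss_residual / (n - 2)
f_stat = (ss_regression / 1) / ms_residual  # ANOVA F with 1 and n - 2 df
se_b1 = math.sqrt(ms_residual / sxx)        # standard error of the slope
t_stat = (b1 - 0) / se_b1                   # null value of the slope is 0

print(round(t_stat, 3), round(math.sqrt(f_stat), 3))
```

Both printed values are about 2.121, so t squared equals F here.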
40
Unstandardized b
Unstandardized regression coefficients are expressed in the original units of each variable, so their relative magnitudes CANNOT be compared to each other
41
Standardized B
Permit direct comparison of regression coefficients to one another (i.e. which independent variable explained more variance in the dependent variable)
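A sketch of why standardized coefficients are comparable, assuming the common conversion B = b * SD(X) / SD(Y). The data and the unit labels are made up: the same predictor is expressed in metres and in centimetres.

```python
import statistics


def slope(x, y):
    """Unstandardized least-squares slope b."""
    mean_x, mean_y = statistics.mean(x), statistics.mean(y)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    return sxy / sxx


y = [2, 4, 5, 4, 5]                    # made-up outcome values
x_metres = [1, 2, 3, 4, 5]             # made-up predictor, in metres
x_cm = [100 * v for v in x_metres]     # same predictor, in centimetres

results = {}
for label, x in (("metres", x_metres), ("centimetres", x_cm)):
    b = slope(x, y)
    # Standardized coefficient: change in Y (in SDs) per one SD of X
    beta = b * statistics.stdev(x) / statistics.stdev(y)
    results[label] = (b, beta)
    print(label, round(b, 4), round(beta, 4))
```

The unstandardized b changes by a factor of 100 with the units, while the standardized B stays at about 0.77 in both cases.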
42
What are the Factors that influence appropriateness of Multiple Linear Regression?
  1. Homoscedasticity
  2. Multicollinearity
43
Homoscedasticity of Residuals
Variance is the same at ALL points along the regression line
44
If heteroscedasticity is present (residuals appearing in a pattern), what type of test should be used instead?
Logistic Regression
45
Multicollinearity
Strong correlations between predictor variables (r > 0.90); can RESULT IN TYPE II ERROR
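One way to see why strong correlation between predictors raises the risk of a Type II error is the variance inflation factor (VIF), a common diagnostic that is not mentioned in the cards themselves. For a model with exactly two predictors correlated at r, VIF = 1 / (1 - r^2), and the standard error of each coefficient grows by the square root of the VIF. The function name below is an illustrative assumption.

```python
def vif_two_predictors(r):
    """Variance inflation factor when two predictors correlate at r."""
    return 1.0 / (1.0 - r ** 2)


for r in (0.0, 0.5, 0.9, 0.95):
    vif = vif_two_predictors(r)
    # sqrt(VIF) is the factor by which the coefficient SE is inflated
    print(r, round(vif, 2), round(vif ** 0.5, 2))
```

At r = 0.90 the standard errors are already more than doubled, which shrinks the t-statistics and makes real effects harder to detect.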
46
If Multicollinearity is present, what test should be used instead?
Stepwise Regression