3.4: Introduction to Linear Regression Flashcards

1
Q

Correlations

A

Measures the strength and direction of a linear relationship between two variables.

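To make the definition concrete, here is a minimal NumPy sketch, using made-up hours-studied and exam-score numbers, that computes the correlation as the covariance divided by the product of the standard deviations and checks it against np.corrcoef.

```python
import numpy as np

# Made-up example data: hours studied (x) and exam score (y).
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0, 6.0])
y = np.array([52.0, 55.0, 61.0, 60.0, 68.0, 71.0])

# Correlation = covariance of x and y divided by the product of their
# standard deviations; it captures the strength and direction of the
# linear relationship.
r = np.sum((x - x.mean()) * (y - y.mean())) / np.sqrt(
    np.sum((x - x.mean()) ** 2) * np.sum((y - y.mean()) ** 2)
)

print(round(r, 3))                        # computed from the definition
print(round(np.corrcoef(x, y)[0, 1], 3))  # NumPy's built-in value matches
```
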
2
Q

What variable is used to represent correlations?

A

R

3
Q

What is the unit of R?

A

There are no units

4
Q

What does the value of R help show?

A

The effect size

5
Q

What does R being unitless allow us to do?

A

Compare correlations across measures of different scales

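One way to see this is to rescale a variable and watch the correlation stay the same. The sketch below uses made-up height and weight values and converts them to different units; because r is unitless, the value does not change.

```python
import numpy as np

# Made-up example: height in centimetres, weight in kilograms.
height_cm = np.array([150.0, 160.0, 165.0, 172.0, 180.0, 188.0])
weight_kg = np.array([52.0, 60.0, 63.0, 70.0, 79.0, 85.0])

r_metric = np.corrcoef(height_cm, weight_kg)[0, 1]

# Convert to inches and pounds; the correlation is unchanged because r has
# no units, which is what allows comparisons across different scales.
height_in = height_cm / 2.54
weight_lb = weight_kg * 2.205
r_imperial = np.corrcoef(height_in, weight_lb)[0, 1]

print(round(r_metric, 6), round(r_imperial, 6))  # identical values
```
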
6
Q

What are correlations dependent on?

A

2 variables

7
Q

What are 2 flaws of correlations?

A
  1. They do not account for third (confounding) variables.
  2. They do not help predict the actual values of the variables in a new situation (they only describe the given situation).
8
Q

Regression

A

Explores how one variable (dependent) changes in response to another (independent).

9
Q

Least Squares Regression Equation

A

Ŷi = bXi + a (the equation of the regression line)

Ŷi = predicted value of Y for case i
Xi = value of the predictor X for case i
b = estimated slope
a = estimated intercept

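As a worked sketch of the equation above (with made-up data), the least squares slope and intercept can be computed from the usual closed-form formulas and then used to form the predicted values Ŷi.

```python
import numpy as np

# Made-up data: X = hours of practice, Y = test score.
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0, 6.0])
y = np.array([52.0, 55.0, 61.0, 60.0, 68.0, 71.0])

# Least squares estimates:
#   b = sum((x - x_mean)(y - y_mean)) / sum((x - x_mean)^2)   estimated slope
#   a = y_mean - b * x_mean                                   estimated intercept
b = np.sum((x - x.mean()) * (y - y.mean())) / np.sum((x - x.mean()) ** 2)
a = y.mean() - b * x.mean()

y_hat = b * x + a   # predicted values: Y-hat_i = b*X_i + a
print(round(b, 3), round(a, 3))
```
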
10
Q

How are errors in prediction based on the regression line minimized?

A

By using the least squares method

11
Q

How does the least squares method minimize error?

A

Minimizes the sum of the squared deviations from the regression line

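A small numerical check of this idea, using the same made-up data as the sketch above: the sum of squared deviations is smallest at the least squares slope and intercept, and any perturbed line does worse.

```python
import numpy as np

# Made-up data, as in the earlier sketch.
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0, 6.0])
y = np.array([52.0, 55.0, 61.0, 60.0, 68.0, 71.0])

b = np.sum((x - x.mean()) * (y - y.mean())) / np.sum((x - x.mean()) ** 2)
a = y.mean() - b * x.mean()

def sse(slope, intercept):
    """Sum of squared deviations of the points from a candidate line."""
    return np.sum((y - (slope * x + intercept)) ** 2)

print(round(sse(b, a), 3))        # least squares line: smallest possible SSE
print(round(sse(b + 0.5, a), 3))  # steeper line -> larger SSE
print(round(sse(b, a + 2.0), 3))  # shifted line -> larger SSE
```
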
12
Q

Residuals

A

Deviations around the line of best fit

13
Q

What do you assume about the residuals?

A

Homoscedasticity

14
Q

Homoscedasticity

A

The error in predictions (the scatter around the line) is evenly distributed across the range of X and Y

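A rough way to check this in practice, sketched below with simulated data (not taken from the cards): fit the line, compute the residuals, and compare their spread in the lower and upper halves of the X range. Under homoscedasticity the two spreads are roughly equal.

```python
import numpy as np

rng = np.random.default_rng(0)

# Simulated homoscedastic data: the noise has the same spread at every x.
x = np.linspace(0.0, 10.0, 200)
y = 3.0 * x + 5.0 + rng.normal(0.0, 2.0, size=x.size)

# Fit the least squares line and compute the residuals (scatter around the line).
b = np.sum((x - x.mean()) * (y - y.mean())) / np.sum((x - x.mean()) ** 2)
a = y.mean() - b * x.mean()
residuals = y - (b * x + a)

# Compare residual spread in the lower and upper halves of the x range.
print(round(residuals[x < 5].std(), 2), round(residuals[x >= 5].std(), 2))
```
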
15
Q

Homogeneity

A

The idea that the spread of residuals is constant across all levels of the independent variable.

16
Q

Which is more reliable: a regression model or a correlation?

A

regression

17
Q

Heterogeneity

A

The variance of the residuals changes across levels of the independent variable.

18
Q

What can heterogeneity lead to?

A

Biased estimates

19
Q

What does the graph of heterogeneity tend to look like?

A

It funnels (the scatter is wider at some values of X and then narrows)

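The funnel can be reproduced by simulating noise whose spread grows with X. The sketch below (simulated data only) fits the line and shows that the residual spread is much larger at high X than at low X.

```python
import numpy as np

rng = np.random.default_rng(1)

# Simulated heteroscedastic data: the noise widens as x increases (funnel shape).
x = np.linspace(1.0, 10.0, 200)
y = 3.0 * x + 5.0 + rng.normal(0.0, 0.5 * x)

b = np.sum((x - x.mean()) * (y - y.mean())) / np.sum((x - x.mean()) ** 2)
a = y.mean() - b * x.mean()
residuals = y - (b * x + a)

# The residual spread grows across the x range instead of staying constant.
print(round(residuals[x < 4].std(), 2), round(residuals[x > 7].std(), 2))
```
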
20
Q

What does the regression equation describe?

A

The numeric relationship between the variables in the graph.

21
Q

What does the line of best fit help predict?

A

Scores on one variable given the scores on another variable.

22
Q

What can regression equations be expanded to include?

A

Multiple variables

23
Q

In the linear regression equation Y = bX + a, what is the value of b called?

Best fit line

Beta (slope)

X intercept

Correlation between X and Y

A

Beta (Slope)

24
Q

If there is a negative correlation between X and Y, then the linear regression equation Y = bX + a would necessarily have…?

a < 0

a > 0

b < 0

b > 0

A

b < 0 (a negative correlation implies a negative estimated slope)

25
Q

Multivariable Linear Regression

A

Models the relationship between one dependent variable and two or more independent variables.

26
Q

What does the equation for multivariable linear regression look like?

A

Y = b1X1 + b2X2 + b3X3… + a
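
As a minimal sketch of fitting such an equation, the example below uses two made-up predictors (hours studied and hours slept, chosen only for illustration) and NumPy's least squares solver, with a column of ones so the intercept a is estimated alongside b1 and b2.

```python
import numpy as np

rng = np.random.default_rng(2)

# Made-up predictors: X1 = hours studied, X2 = hours slept; Y = exam score.
x1 = rng.uniform(0.0, 10.0, 50)
x2 = rng.uniform(4.0, 9.0, 50)
y = 4.0 * x1 + 2.0 * x2 + 30.0 + rng.normal(0.0, 3.0, 50)

# Design matrix for Y = b1*X1 + b2*X2 + a (the column of ones gives the intercept).
X = np.column_stack([x1, x2, np.ones_like(x1)])
coefs, *_ = np.linalg.lstsq(X, y, rcond=None)
b1, b2, a = coefs

print(round(b1, 2), round(b2, 2), round(a, 2))  # recovered close to 4, 2, 30
```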

27
Q

Coefficient

A

The beta (slope) value in units of the variables.

28
Q

Standardized Coefficient

A

The beta (slope) value in z-score units.

29
Q

What does using the standardized coefficient do?

A

Makes the slope values comparable across all the predictor variables (no units).
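
One common way to obtain standardized coefficients, sketched below with made-up variables on very different scales, is to convert every variable to z-scores before fitting; the resulting betas are unitless and directly comparable.

```python
import numpy as np

rng = np.random.default_rng(3)

# Made-up predictors on very different scales: income (dollars) and age (years).
income = rng.normal(50_000.0, 10_000.0, 100)
age = rng.normal(40.0, 8.0, 100)
y = 0.0004 * income + 0.5 * age + rng.normal(0.0, 2.0, 100)

def zscore(v):
    """Convert a variable to z-scores (mean 0, standard deviation 1)."""
    return (v - v.mean()) / v.std()

# Fitting on z-scored variables yields standardized (unitless) betas,
# so the slopes can be compared across predictors despite different raw units.
X = np.column_stack([zscore(income), zscore(age), np.ones(100)])
std_betas, *_ = np.linalg.lstsq(X, zscore(y), rcond=None)

print(np.round(std_betas[:2], 2))  # comparable standardized slopes
```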

30
Q

Continuous Variables

A

Betas represent slopes.

31
Q

Group Variables

A

Betas represent difference scores relative to a reference level.
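
To see what that means, the sketch below dummy-codes a made-up two-group example (control as the reference level, coded 0): the fitted intercept equals the reference-group mean and the beta equals the difference between the group means.

```python
import numpy as np

# Made-up scores for two groups; "control" is the reference level.
control = np.array([10.0, 12.0, 11.0, 9.0, 13.0])
treatment = np.array([14.0, 16.0, 15.0, 13.0, 17.0])

y = np.concatenate([control, treatment])
group = np.concatenate([np.zeros(control.size), np.ones(treatment.size)])  # 0 = reference

# Regression of Y on the dummy-coded group variable (plus an intercept).
X = np.column_stack([group, np.ones_like(group)])
(b, a), *_ = np.linalg.lstsq(X, y, rcond=None)

print(round(a, 2), round(control.mean(), 2))                     # intercept = reference mean
print(round(b, 2), round(treatment.mean() - control.mean(), 2))  # beta = difference score
```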

32
Q

What do regression lines help predict?

A

Y scores from X scores

33
Q

Predictions are rarely___

A

Perfect (exact); there is almost always some residual error around the regression line.

34
Q

How is residual deviation from the predicted regression line summarized?

A

By the squared deviations from the predicted values on the line.

35
Q

Root Mean Squared Error

A

Provides a measure of the typical deviation of the points from the regression line.

36
Q

Root Mean Squared Error Formula

A

The square root of the sum of squared deviations from the predicted values, divided by the number of samples: √( Σ(Yi − Ŷi)² ÷ N )
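
A direct translation of this formula into NumPy, reusing the made-up data and least squares fit from the earlier sketches:

```python
import numpy as np

# Made-up data and least squares fit, as in the earlier sketches.
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0, 6.0])
y = np.array([52.0, 55.0, 61.0, 60.0, 68.0, 71.0])
b = np.sum((x - x.mean()) * (y - y.mean())) / np.sum((x - x.mean()) ** 2)
a = y.mean() - b * x.mean()
y_hat = b * x + a

# RMSE = sqrt( sum of squared deviations from the predictions / number of samples )
rmse = np.sqrt(np.sum((y - y_hat) ** 2) / y.size)
print(round(rmse, 3))
```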

37
Q

Without X, what is the best prediction method for y?

A

Using the mean of Y (deviations are then measured as distances from the mean)

38
Q

What question does Root Mean Squared Error help answer?

A

After we use X to predict Y, how much variability is left in Y?

39
Q

What 2 questions does deviation squared answer?

A
  1. How much variability in Y did the regression explain with X?
  2. How much better did we do by using X instead of simply the mean of Y?
40
Q

What is the formula for variability explained?

A

SSy (total variability in Y) - SSy/x (unexplained variability)

41
Q

How do you find the unexplained variability?

A

The sum of squared deviations of the observed Y values from the predicted values on the line (SSy/x)

42
Q

What is R^2?

A

Coefficient of Determination

43
Q

What is the formula for R^2?

A

( SSy - SSy/x ) ÷ SSy
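
The same quantities in NumPy, reusing the made-up data from earlier: SSy, SSy/x, and R² computed from the formula above, with a check that R² equals the squared correlation.

```python
import numpy as np

# Made-up data and least squares fit, as in the earlier sketches.
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0, 6.0])
y = np.array([52.0, 55.0, 61.0, 60.0, 68.0, 71.0])
b = np.sum((x - x.mean()) * (y - y.mean())) / np.sum((x - x.mean()) ** 2)
a = y.mean() - b * x.mean()
y_hat = b * x + a

ss_y = np.sum((y - y.mean()) ** 2)   # SSy:   total variability in Y
ss_yx = np.sum((y - y_hat) ** 2)     # SSy/x: unexplained variability around the line

r_squared = (ss_y - ss_yx) / ss_y    # coefficient of determination
print(round(r_squared, 3))
print(round(np.corrcoef(x, y)[0, 1] ** 2, 3))  # equals the squared correlation
```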

44
Q

Coefficient of Determination

A

The ratio of explained variability to total variability

45
Q

What does R^2 indicate?

A

The proportional gain in variability accounted for by predicting Y from X rather than from the mean of Y.

46
Q

What does R^2 = 0 mean?

A

The independent variables in the model explain none of the variation in the dependent variable.

= Model is useless for prediction

47
Q

What does a graph of R^2 = 0 look like?

A

No correlation
- scatter around the center with no pattern

48
Q

What does R^2 = 1 mean?

A

The model perfectly explains the variation in the dependent variable.

= Model is perfect; every point falls on the regression line.

49
Q

What does a graph of R^2 = 1 look like?

A

All points fall exactly on the regression line, whether the line has a positive or a negative slope.