Regression Flashcards

1
Q

Regression is…

A

used to understand the relationship between variables

2
Q

Independent variable (X)

A

Predictor or regressor

3
Q

Dependent variable (Y)

A

Outcome or response

4
Q

Goal of regression

A

Predict changes in Y based on X

5
Q

Correlation vs regression

A

Correlation: measures the strength and direction of a linear relationship

Regression: predicts Y based on X

Shared variance (r^2): proportion of Y's variance explained by X

6
Q

Simple linear Regression Equation

A

Y = B0 + B1 X + e

7
Q

B0

A

Intercept: value of Y when X is 0

8
Q

B1

A

Slope: change in Y per one-unit change in X

9
Q

e

A

Error: Difference between observed and predicted Y

10
Q

Regression using a SAMPLE of the population

A

The sample provides estimates of the intercept, the slope, and the predicted values of Y

11
Q

Predicted values of Y are…

A

points on the regression line that correspond to the given values of X

12
Q

Residuals (ê) are…

A

differences (vertical distances) between the observed and predicted values of Y at each corresponding X
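The cards on the intercept, slope, predicted values, and residuals can be tied together with a small sketch. The data below are made up purely for illustration, and the helper name `fit_line` is an assumption, not something from the deck.

```python
# Toy data, invented for illustration only.
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]


def fit_line(x, y):
    """Return (b0, b1) for the least-squares line through paired data."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sum of cross-products and sum of squares of X around their means.
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    b1 = sxy / sxx               # slope: change in Y per one-unit change in X
    b0 = mean_y - b1 * mean_x    # intercept: predicted Y when X is 0
    return b0, b1


b0, b1 = fit_line(x, y)

# Predicted values are the points on the fitted line at each observed X.
predicted = [b0 + b1 * xi for xi in x]

# Residuals are observed Y minus predicted Y.
residuals = [yi - yhat for yi, yhat in zip(y, predicted)]

print(b0, b1)
print(residuals)
```

With a fitted intercept, the residuals sum to zero (up to rounding), which is a quick sanity check on the fit.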

13
Q

Equations for the Slope (B1)

A
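The standard least-squares expressions for the slope are sketched here in common textbook notation, which may differ from the notation the card originally used. Here cov_XY is the sample covariance, s_X and s_Y are the sample standard deviations, and r_XY is the Pearson correlation.

```latex
b_1 = \frac{\sum_i (X_i - \bar{X})(Y_i - \bar{Y})}{\sum_i (X_i - \bar{X})^2}
    = \frac{\mathrm{cov}_{XY}}{s_X^2}
    = r_{XY}\,\frac{s_Y}{s_X}
```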
14
Q

Equation for the Intercept (B0)

A
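The usual least-squares expression for the intercept is sketched here in common textbook notation, which may differ from the card's original. It follows from the fitted line passing through the point of means.

```latex
b_0 = \bar{Y} - b_1 \bar{X}
```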
15
Q

Correlation equation

A
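The standard Pearson correlation formula is sketched here in common textbook notation, which may differ from the card's original.

```latex
r_{XY} = \frac{\mathrm{cov}_{XY}}{s_X\, s_Y}
       = \frac{\sum_i (X_i - \bar{X})(Y_i - \bar{Y})}
              {\sqrt{\sum_i (X_i - \bar{X})^2 \,\sum_i (Y_i - \bar{Y})^2}}
```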
16
Q

What do you do with r to find the proportion of shared variance?

A

Square it: r_xy^2

17
Q

1 - r_xy^2 is the…

A

Proportion of variance in Y that is independent of X
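A minimal sketch of the shared and unshared proportions of variance, using made-up data. The function name `pearson_r` is an assumption for illustration.

```python
from math import sqrt

# Toy data, invented for illustration only.
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]


def pearson_r(x, y):
    """Pearson correlation between two equal-length lists."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    syy = sum((yi - mean_y) ** 2 for yi in y)
    return sxy / sqrt(sxx * syy)


r = pearson_r(x, y)
shared = r ** 2        # proportion of Y's variance shared with X
unshared = 1 - shared  # proportion of Y's variance independent of X

print(round(r, 4), round(shared, 4), round(unshared, 4))
```

The two proportions always add to 1, so knowing r is enough to state both.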

18
Q

Suppose we observe a high correlation between a child’s weight and their reading ability. This correlation is likely due to age. How can we control for this confound?

A

We can control for the hypothesized influence of age on reading ability by removing the shared variance between age and weight
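One common way to impose this kind of statistical control is a partial correlation. The sketch below uses the first-order partial correlation formula on data constructed so that weight and reading ability are related only through age. All numbers and names are invented for illustration.

```python
from math import sqrt

# Constructed data: weight and reading both rise with age,
# but their deviations from the age trend are unrelated.
age = [6, 6, 7, 7, 8, 8, 9, 9]
weight = [19, 17, 22, 20, 25, 23, 28, 26]
reading = [13, 13, 13, 13, 15, 15, 19, 19]


def pearson_r(x, y):
    """Pearson correlation between two equal-length lists."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    syy = sum((yi - mean_y) ** 2 for yi in y)
    return sxy / sqrt(sxx * syy)


def partial_r(r_xy, r_xz, r_yz):
    """Correlation of X and Y with Z partialled out of both."""
    return (r_xy - r_xz * r_yz) / sqrt((1 - r_xz ** 2) * (1 - r_yz ** 2))


r_wr = pearson_r(weight, reading)   # raw weight-reading correlation
r_wa = pearson_r(weight, age)
r_ra = pearson_r(reading, age)
r_wr_given_age = partial_r(r_wr, r_wa, r_ra)

print(round(r_wr, 3))
print(round(r_wr_given_age, 3))
```

In this constructed case the raw correlation is high, while the correlation with age held constant is essentially zero, which is the pattern the card describes.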

19
Q

Squared Multiple Correlation (R-squared) formula:

A

The SMC represents the proportion of variance in Y shared with (or “explained by”) the set of all X variables

Numerator: proportion of non-redundant variance in Y shared with X1 and X2
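For two predictors, a standard way to write the squared multiple correlation in terms of the pairwise correlations is sketched below. The subscript notation is an assumption and may differ from the original card.

```latex
R^2_{Y.12} = \frac{r_{Y1}^2 + r_{Y2}^2 - 2\, r_{Y1}\, r_{Y2}\, r_{12}}{1 - r_{12}^2}
```

The subtracted term in the numerator removes the variance in Y that X1 and X2 explain redundantly, and the denominator rescales for the overlap between the two predictors.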

20
Q

Shared variance in Prediction

A

In two-predictor regression, we are interested in imposing statistical control over X2 to test the unique effects of X1

21
Q

Goal of Multiple Regression

A
  1. Evaluate the unique effect of X predictors on Y outcomes (holding constant other X)
  2. Determine the incremental contribution of new X predictors to estimating variance in Y (in addition to X already in the model)
  3. Determine the amount of variance explained in Y from a set of X predictors
22
Q

To determine Incremental contribution to the model we use…

A

Squared semipartial correlation
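A minimal sketch of the incremental contribution of X1 in a two-predictor model, computed two equivalent ways. The correlation values are hypothetical and the function names are assumptions for illustration.

```python
from math import sqrt

# Hypothetical pairwise correlations, invented for illustration.
r_y1 = 0.5   # Y with X1
r_y2 = 0.4   # Y with X2
r_12 = 0.3   # X1 with X2


def r_squared_two_predictors(r_y1, r_y2, r_12):
    """Squared multiple correlation of Y with X1 and X2."""
    return (r_y1 ** 2 + r_y2 ** 2 - 2 * r_y1 * r_y2 * r_12) / (1 - r_12 ** 2)


def semipartial_r(r_y1, r_y2, r_12):
    """Correlation of Y with the part of X1 that is independent of X2."""
    return (r_y1 - r_y2 * r_12) / sqrt(1 - r_12 ** 2)


r2_full = r_squared_two_predictors(r_y1, r_y2, r_12)

# Route 1: gain in R^2 when X1 is added to a model that already has X2.
sr1_sq_increment = r2_full - r_y2 ** 2

# Route 2: square the semipartial correlation directly.
sr1_sq_direct = semipartial_r(r_y1, r_y2, r_12) ** 2

print(round(r2_full, 4), round(sr1_sq_increment, 4), round(sr1_sq_direct, 4))
```

Both routes give the same number, which is why the squared semipartial correlation is read as the increment in R-squared.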

23
Q

To determine variance explained in Y we use…

A

Squared multiple correlation

24
Q

Regression is a method of finding an equation to describe…

A

The line of best fit for a set of data

25
Q

How to define “best fitting” line when there are so many possibilities?

A

The best-fitting line is the one that minimizes the prediction errors for the observed data

26
Q

Error of prediction is…

A

the distance each point is from the regression line (Y - Ŷ)

27
Q

Least-squared-error solution

A

Procedure that produces the line that minimizes the sum of squared errors of prediction
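To make the "minimizes squared error" idea concrete, the sketch below compares the sum of squared errors for the least-squares line with a few nearby candidate lines. The data and the candidate offsets are made up for illustration.

```python
# Toy data, invented for illustration only.
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]


def sse(b0, b1, x, y):
    """Sum of squared prediction errors for the line Y-hat = b0 + b1 * X."""
    return sum((yi - (b0 + b1 * xi)) ** 2 for xi, yi in zip(x, y))


# Least-squares estimates of the slope and intercept.
n = len(x)
mean_x, mean_y = sum(x) / n, sum(y) / n
sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
sxx = sum((xi - mean_x) ** 2 for xi in x)
b1 = sxy / sxx
b0 = mean_y - b1 * mean_x

best = sse(b0, b1, x, y)

# Nearby lines obtained by nudging the intercept and slope.
candidates = [(b0 + d0, b1 + d1)
              for d0 in (-0.5, 0.0, 0.5)
              for d1 in (-0.2, 0.0, 0.2)]

for c0, c1 in candidates:
    print(round(c0, 2), round(c1, 2), round(sse(c0, c1, x, y), 3))
```

None of the nudged lines has a smaller sum of squared errors than the least-squares line.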

28
Q

Linear model with several predictors

A

The linear model can be expanded to include as many predictors as you like

Expanded formula:
Y_i = (b0 + b1 X1_i + b2 X2_i) + e_i
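A minimal sketch of estimating the two-predictor model by least squares, solving the normal equations on mean-centered variables. The data are constructed so that Y = 1 + 2 X1 + 3 X2 exactly (every error is zero), which makes the recovered coefficients easy to check. The function name is an assumption for illustration.

```python
# Constructed data with no error term: y = 1 + 2*x1 + 3*x2.
x1 = [1, 2, 3, 4, 5]
x2 = [2, 1, 4, 3, 6]
y = [9, 8, 19, 18, 29]


def fit_two_predictors(x1, x2, y):
    """Return (b0, b1, b2) for the least-squares plane through the data."""
    n = len(y)
    m1, m2, my = sum(x1) / n, sum(x2) / n, sum(y) / n
    # Sums of squares and cross-products around the means.
    s11 = sum((a - m1) ** 2 for a in x1)
    s22 = sum((b - m2) ** 2 for b in x2)
    s12 = sum((a - m1) * (b - m2) for a, b in zip(x1, x2))
    s1y = sum((a - m1) * (c - my) for a, c in zip(x1, y))
    s2y = sum((b - m2) * (c - my) for b, c in zip(x2, y))
    # Solve the 2x2 normal equations (requires X1 and X2 not perfectly correlated).
    det = s11 * s22 - s12 ** 2
    b1 = (s22 * s1y - s12 * s2y) / det
    b2 = (s11 * s2y - s12 * s1y) / det
    b0 = my - b1 * m1 - b2 * m2
    return b0, b1, b2


b0, b1, b2 = fit_two_predictors(x1, x2, y)
print(b0, b1, b2)
```

Each slope here is the effect of one predictor with the other held constant, which is why the overlap between X1 and X2 (the cross-product term) enters both solutions.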

29
Q

r can be thought of as a standardized version of…

30
Q

Model Estimation

31
Q

Residual Sums of Squares (SSR)

A

Gauge of how well a particular line fits the data

32
Q

Sums of Squares Residual (SSR)

A

Tells us how much error there is in the model but not whether it is a better fit than nothing

Need to compare our model against a baseline model

Mean is a model of no relationship

33
Q

Sums of Squares Total (SST)

A

The sum of squared differences between the observed values and the values predicted by the mean

34
Q

Sums of Squares Model (SSM)

A

The difference between SST and SSR

35
Q

SSY and SST notation is

A

Sums of Squares Total
df total = n - 1

Consists of adding
Sums of Squares Regression (model)
df regression = 1
and
Sums of Squares Residual
df residual = n - 2
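The partition of the total sum of squares and its degrees of freedom can be checked numerically. This sketch assumes simple regression with one predictor and uses made-up data. The variable names are illustrative.

```python
# Toy data, invented for illustration only.
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]

n = len(x)
mean_x, mean_y = sum(x) / n, sum(y) / n

# Least-squares line.
b1 = (sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
      / sum((xi - mean_x) ** 2 for xi in x))
b0 = mean_y - b1 * mean_x
predicted = [b0 + b1 * xi for xi in x]

# Total: observed values versus the mean (the "no relationship" model).
ss_total = sum((yi - mean_y) ** 2 for yi in y)
# Model (regression): predicted values versus the mean.
ss_model = sum((yh - mean_y) ** 2 for yh in predicted)
# Residual: observed values versus predicted values.
ss_resid = sum((yi - yh) ** 2 for yi, yh in zip(y, predicted))

# Degrees of freedom for one predictor.
df_total = n - 1
df_model = 1
df_resid = n - 2

print(ss_total, ss_model, ss_resid)
print(df_total, df_model, df_resid)
```

Both the sums of squares and the degrees of freedom add up: total equals model plus residual.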

36
Q

If SSM is large, the regression model…

A

is very different from using the mean to predict the outcome variable

This implies that the regression model has made a big improvement to how well the outcome variable can be predicted. If SSM is small, then using the regression model is only a little better than using the mean

37
Q

Variance explained by the regression model (R^2) formula
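Using this deck's notation (SSM for the model, SSR for the residual, SST for the total), the usual expression is sketched below.

```latex
R^2 = \frac{SS_M}{SS_T} = 1 - \frac{SS_R}{SS_T}
```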

38
Q

Mean Squares Regression Formula

39
Q

Mean Squares Residual Formula

40
Q

F Statistic Formula
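A minimal sketch of the usual mean squares and F ratio: each sum of squares is divided by its degrees of freedom, and F is the model mean square over the residual mean square. The sums of squares passed in are hypothetical values, and the function name and the parameter `k` (number of predictors) are assumptions for illustration.

```python
def f_statistic(ss_model, ss_resid, n, k=1):
    """Return (MS model, MS residual, F) for a regression with k predictors."""
    df_model = k
    df_resid = n - k - 1
    ms_model = ss_model / df_model   # mean squares for the model
    ms_resid = ss_resid / df_resid   # mean squares for the residual
    return ms_model, ms_resid, ms_model / ms_resid


# Hypothetical sums of squares from a one-predictor model with n = 5 cases.
ms_model, ms_resid, f = f_statistic(ss_model=3.6, ss_resid=2.4, n=5)
print(ms_model, ms_resid, f)
```

A large F indicates that the model explains much more variance per degree of freedom than is left unexplained.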

41
Q

Assessing individual predictors

42
Q

Bivariate observations variable measurement scale

43
Q

Different notation for sample and population regression statistics

44
Q

A test of (rho=0)

A

If rho = 0, then the sampling distribution of r is approximately normal, with an expected value of rho and an estimated standard error (s_r), given below (where n is the number of bivariate pairs of observations)…
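The standard textbook expression for this estimated standard error is sketched here. The notation is an assumption and may differ from the original card.

```latex
s_r = \sqrt{\frac{1 - r^2}{n - 2}}
```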

45
Q

To test the hypothesis about rho (formula)
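A common way to test the hypothesis that rho = 0 is a t statistic, r divided by its estimated standard error, evaluated on n - 2 degrees of freedom. The sketch below assumes that approach. The values of r and n are hypothetical and the function name is illustrative.

```python
from math import sqrt


def t_for_r(r, n):
    """Return (s_r, t, df) for testing rho = 0 from a sample correlation."""
    s_r = sqrt((1 - r ** 2) / (n - 2))  # estimated standard error of r
    t = r / s_r
    df = n - 2
    return s_r, t, df


# Hypothetical sample: r = 0.5 from n = 27 pairs of observations.
s_r, t, df = t_for_r(r=0.5, n=27)
print(round(s_r, 4), round(t, 4), df)
```

The resulting t is compared with the critical value of the t distribution for the chosen alpha level and n - 2 degrees of freedom.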

46
Q

Example of

47
Q

Confidence Interval on r (Formula)
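A widely used approach for a confidence interval on r is the Fisher z transformation: transform r, build the interval on the z scale, then transform the limits back. The sketch below assumes that approach, which may not be the exact formula on the original card. The values of r and n are hypothetical.

```python
from math import atanh, sqrt, tanh
from statistics import NormalDist


def fisher_ci(r, n, confidence=0.95):
    """Approximate confidence interval for a correlation via Fisher's z."""
    z = atanh(r)                  # Fisher z transform of r
    se = 1 / sqrt(n - 3)          # standard error on the z scale
    # Two-sided critical value from the standard normal distribution.
    crit = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    lower_z, upper_z = z - crit * se, z + crit * se
    return tanh(lower_z), tanh(upper_z)   # back-transform to the r scale


# Hypothetical sample: r = 0.5 from n = 28 pairs of observations.
lower, upper = fisher_ci(r=0.5, n=28)
print(round(lower, 3), round(upper, 3))
```

The interval is not symmetric around r on the original scale, because the back-transformation compresses values near plus or minus 1.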

48
Q

Confidence Interval on r (example)
