Linear Regression (L9 & 10) Flashcards

1
Q

what are the 2 types of linear regression

A

simple linear regression:
outcome ← one predictor

multiple linear regression:
outcome ← two or more predictors

2
Q

correlation

A
  1. Only provides the direction and magnitude (effect) of a
    relationship: says nothing about causation
  2. The x and y variables are interchangeable
3
Q

correlation vs regression:

A

correlation:
Y (variable 1) ←→ X (variable 2)

Regression:
Y (outcome) ← X (predictor)

Linear regression takes us a step beyond correlation, and a step closer to causation.
It does this by allowing us to predict an ‘outcome’ variable by knowing a ‘predictor’ variable.

4
Q

what formula do almost all statistical tests follow:

A

Outcome = model + error

outcome: the thing you want to know
(information, a prediction, etc.)

model: the formula you will use to find it

error: how much you are "off"
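As a sketch, the decomposition can be written out with made-up numbers (all values here are hypothetical):

```python
# Hypothetical numbers: a predicted vs. observed 1-rep-max squat (kg).
actual_outcome = 150.0                     # the thing we want to know
model_prediction = 145.0                   # what the formula gives us
error = actual_outcome - model_prediction  # how much we are "off"

# outcome = model + error holds by construction:
# 150.0 == 145.0 + 5.0
```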

5
Q

Example Correlation: what is the
relationship between 10 rep max back
squat and 1 rep max?

Example Regression:

A

Can I predict
someone’s max squat by knowing their
10 rep max?

6
Q

Example Correlation: what is the
relationship between number of hours
studied and exam performance?

Example Regression:

A

Can I predict a
person’s test score from the number of
hours they studied?

7
Q

how to find the line that best fits the data?

A

We use the 'ordinary least squares' (OLS) method.

*This is how we find the line that best fits the data.
*The line that goes through, or as close to, as many of the data points as possible.
*It's called least squares because we are summing the squared distances!
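A minimal sketch of OLS in Python, using NumPy's `polyfit` as the least-squares fitter (the data are invented for illustration):

```python
import numpy as np

# Invented data: 10-rep-max (x) vs. 1-rep-max (y) back squat, in kg.
x = np.array([80.0, 90.0, 100.0, 110.0, 120.0])
y = np.array([100.0, 115.0, 124.0, 138.0, 150.0])

# np.polyfit with degree 1 fits a straight line by ordinary least
# squares: it finds the slope and intercept that minimize the sum of
# squared vertical distances from the points to the line.
slope, intercept = np.polyfit(x, y, 1)

predicted = intercept + slope * x
sum_sq_resid = np.sum((y - predicted) ** 2)  # the quantity OLS minimizes
```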

8
Q

outcome = model + error

Yi = (b0 + b1Xi) + εi

A

Yi = outcome
εi = error: the actual score minus the score we predict
b = model:
the intercept (b0) plus the slope (b1) * Xi (the ith person's
score on the predictor variable)
Note: b0 and b1 are typically referred to as parameters or
regression coefficients.
A note here: "i" is just a placeholder for whichever
observation (person) you are dealing with.

b0 = intercept (where y is when x = 0)
b1 = slope (the direction and steepness of the relationship)
The line that goes through, or as close to, as many of the
data points as possible.
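The parameters can also be computed directly with the standard closed-form least-squares formulas, b1 = Σ(xi − x̄)(yi − ȳ) / Σ(xi − x̄)² and b0 = ȳ − b1·x̄ (a sketch with invented data):

```python
import numpy as np

# Invented data: predictor x and outcome y.
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([2.0, 4.0, 5.0, 4.0, 5.0])

x_bar, y_bar = x.mean(), y.mean()

# b1 (slope): how much y changes per one-unit change in x.
b1 = np.sum((x - x_bar) * (y - y_bar)) / np.sum((x - x_bar) ** 2)
# b0 (intercept): where y is when x = 0.
b0 = y_bar - b1 * x_bar

# Prediction and error term for each observation i.
y_hat = b0 + b1 * x
errors = y - y_hat  # the epsilon_i values; they sum to zero under OLS
```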

9
Q

In regression, these differences between the predicted
and actual values, or ε, are referred to as

A

RESIDUALS.

10
Q

How do we check a model?

A

*Just because we found the best possible line for our data
doesn't mean that the line does a great job of fitting our
data, and therefore of making predictions.
*We have to check!
*We compare against the mean of the outcome (dependent) variable.

*We look at the difference between the observed values and the mean of Y.
*Remember, we square each value and then sum!
*This gives the SST (total sum of squares).
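A sketch of this baseline check with invented numbers; the squared-and-summed differences from the mean are the SST used later:

```python
import numpy as np

# Invented outcome values.
y = np.array([2.0, 4.0, 5.0, 4.0, 5.0])

# The baseline "model" is just the mean of y: predict y_bar for everyone.
y_bar = y.mean()

# Square each difference from the mean, then sum: the SST.
sst = np.sum((y - y_bar) ** 2)
```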

11
Q

How good is our regression line?
Calculating the SSR

A

*This is called the SSR, or the sum of squares
of the residuals (ε from our line equation).

*We find the difference between the actual
data and the regression line.

*The degree of inaccuracy when our best
model is fitted to the data.
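A minimal sketch of computing the SSR (invented data; the coefficients b0 = 2.2 and b1 = 0.6 are hypothetical fitted values):

```python
import numpy as np

# Invented data with a hypothetical fitted line.
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([2.0, 4.0, 5.0, 4.0, 5.0])
y_hat = 2.2 + 0.6 * x  # predictions from the regression line

# SSR: squared residuals (actual minus predicted), summed.
# This is the inaccuracy that remains even with the best-fitting line.
ssr = np.sum((y - y_hat) ** 2)
```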

12
Q

The important step: is our model better than the mean?

A

*If the SSM is LARGE, then our model is much better than
the mean; it has reduced the error (residuals) drastically,
and therefore improved prediction!
*If the SSM is SMALL, then our model is no better than the mean.
SSM = SST - SSR
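The SSM = SST − SSR step can be sketched as follows (invented data; the final ratio, not named on the card, is the standard R² formed from these quantities):

```python
import numpy as np

# Invented data for illustration.
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([2.0, 4.0, 5.0, 4.0, 5.0])

# Fit the regression line (np.polyfit returns the slope first).
b1, b0 = np.polyfit(x, y, 1)
y_hat = b0 + b1 * x

sst = np.sum((y - y.mean()) ** 2)  # error around the mean
ssr = np.sum((y - y_hat) ** 2)     # error around the regression line
ssm = sst - ssr                    # how much the model improves on the mean

# Proportion of variance the model explains (commonly called R^2).
r_squared = ssm / sst
```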

13
Q

when comparing groups (rather than two continuous variables),

A

We can have different
groups of people exposed
to different conditions.

We can have the same group of people
measured at different times, and possibly
exposed to different conditions.

We can also have more than two groups, in
either of the conditions mentioned above
(different or same people).

WE CAN USE LINEAR REGRESSION TO DO THIS

14
Q

the coefficient (slope) is

A

the difference between group means
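A sketch of why this works, with invented data: dummy-code group membership as 0/1 and regress the outcome on it; the fitted slope then equals the difference between the group means.

```python
import numpy as np

# Invented data: outcome scores for two groups.
group_a = np.array([10.0, 12.0, 11.0, 13.0])   # coded x = 0
group_b = np.array([15.0, 17.0, 16.0, 18.0])   # coded x = 1

y = np.concatenate([group_a, group_b])
x = np.concatenate([np.zeros(4), np.ones(4)])  # dummy-coded predictor

slope, intercept = np.polyfit(x, y, 1)

# With 0/1 coding: intercept = mean of group A,
# slope = mean of group B minus mean of group A.
```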
