Unit 13a: Regression I - Simple Linear Regression Flashcards

1
Q

The correlation coefficient…
A. Will always fall between 0 and 1
B. Compares the mean of a sample to a population
C. Measures how strongly related two variables are
D. Can only be calculated for 2 continuous (i.e., ratio) variables

A

C. Measures how strongly related two variables are

2
Q

Purpose of Simple (bivariate) Regression

A

To use the relation between two variables to predict outcomes: the stronger the correlation, the more accurate the prediction.
* The value of one correlated variable can be used to predict the value of the other via a simple (bivariate) regression.
* The standard error of the estimate helps us determine how accurate a prediction is.
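A minimal sketch of this in Python (the data values here are invented for illustration, not course data): fit the bivariate regression, then compute the standard error of the estimate from the residuals.

```python
import numpy as np
from scipy import stats

# Hypothetical predictor (x) and outcome (y) values
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0, 6.0])
y = np.array([2.1, 2.9, 4.2, 4.8, 6.1, 6.9])

fit = stats.linregress(x, y)               # simple (bivariate) regression
y_hat = fit.intercept + fit.slope * x      # predicted outcome values

# Standard error of the estimate: typical size of a prediction error
see = np.sqrt(np.sum((y - y_hat) ** 2) / (len(x) - 2))
print(f"r = {fit.rvalue:.3f}, slope = {fit.slope:.3f}, SEE = {see:.3f}")
```

The stronger the correlation (|r| closer to 1), the smaller the residuals, so the standard error of the estimate shrinks and predictions become more accurate.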

3
Q

Does a strong correlation mean an accurate or a weak prediction?

A

accurate

4
Q

correlation

A

If two variables co-vary, they have a relation (correlation).
* Regression extends the correlation to predict one variable from another.
* The accuracy of the prediction depends on the strength of the relation (correlation).
* More information shared between the variables (a higher, stronger correlation) means less error.
* Not to be confused with one variable causing another:
* we cannot tell which variable causes which,
* or whether a third variable accounts for the relation (a confounder).
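As a quick illustration (the paired values below are hypothetical), the correlation itself is easy to compute; note that it measures only how strongly the variables co-vary, not which one causes the other.

```python
import numpy as np

# Hypothetical paired observations on two variables
hours_studied = np.array([2, 4, 5, 7, 8, 10])
exam_score = np.array([55, 62, 70, 74, 81, 92])

r = np.corrcoef(hours_studied, exam_score)[0, 1]
print(f"r = {r:.2f}")   # strength and direction of the linear relation only
```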

5
Q

Simple/Bivariate Regression has how many variables

A

2:

  • 1 outcome variable
  • 1 predictor variable
6
Q

Key Elements of Linear Regression

A
  • F-test: the Omnibus Test
    (Is there any association?)
  • Regression Equation
  • Beta coefficient, the “slope” of the function
    (This is the important element we want because it has a meaningful interpretation.)
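A rough numpy sketch of these elements using the standard formulas for simple regression (the data are made up): the slope and intercept define the regression equation, and the omnibus F-test asks whether there is any association at all.

```python
import numpy as np
from scipy import stats

x = np.array([1.0, 2.0, 3.0, 4.0, 5.0, 6.0])   # hypothetical predictor
y = np.array([2.1, 2.9, 4.2, 4.8, 6.1, 6.9])   # hypothetical outcome
n = len(x)

b1 = np.sum((x - x.mean()) * (y - y.mean())) / np.sum((x - x.mean()) ** 2)  # beta (slope)
b0 = y.mean() - b1 * x.mean()                                               # intercept
y_hat = b0 + b1 * x                                                         # regression equation

ssr = np.sum((y_hat - y.mean()) ** 2)      # variability explained by the line
sse = np.sum((y - y_hat) ** 2)             # leftover (error) variability
F = (ssr / 1) / (sse / (n - 2))            # omnibus F-test: is there any association?
p = stats.f.sf(F, 1, n - 2)

print(f"slope = {b1:.3f}, F(1, {n - 2}) = {F:.2f}, p = {p:.4f}")
```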
7
Q

Regression Equation

A

y = mx + b
* 𝑦 = 𝛽0 + 𝛽1𝑥
* Introduce an index i for each participant or observation (𝑋𝑖, 𝑌𝑖)
𝑌𝑖 = 𝛽0 + 𝛽1𝑋𝑖
* We allow an error in the equation for each observation i
𝑌𝑖 = 𝛽0 + 𝛽1𝑋𝑖 + 𝜀𝑖

The regression equation tells us that y’ (the predicted value of y) is a function of the intercept (𝛽0, sometimes written a), the slope (𝛽1, sometimes written b), and the value of the predictor variable (x).
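To make the error term concrete, here is a small simulation sketch (the parameter values are arbitrary): each observed 𝑌𝑖 equals the line value 𝛽0 + 𝛽1𝑋𝑖 plus its own random error 𝜀𝑖.

```python
import numpy as np

rng = np.random.default_rng(1)
beta0, beta1 = 2.0, 0.5                  # arbitrary intercept and slope

x = rng.uniform(0, 10, size=50)          # predictor values X_i
eps = rng.normal(0, 1.0, size=50)        # random error for each observation
y = beta0 + beta1 * x + eps              # Y_i = beta0 + beta1 * X_i + eps_i

y_prime = beta0 + beta1 * x              # y': the predicted values, i.e. the line without error
print(np.round(y[:3], 2), np.round(y_prime[:3], 2))
```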

8
Q

Regression Equation parts

A

Regression Equation
𝑌𝑖 = 𝛽0 + 𝛽1𝑋𝑖 + 𝜀𝑖
* 𝑌𝑖: value of the outcome (or response or dependent) variable for the ith
observation
* 𝑋𝑖: value of the predictor (or independent) variable for the ith
observation
* 𝛽0 & 𝛽1: regression parameters (the intercept and slope) to be estimated
* 𝛽0 = the intercept
* 𝛽1 = the slope
* 𝜀𝑖: the random error term for the ith observation

9
Q

Regression Analysis Variables

A

In regression language, the criterion variable is regressed on the
predictor variable.
* Criterion variable: the variable to predict
  – The dependent variable; the y axis in a scatter-plot
  – Actual values denoted y
  – Predicted values denoted y’ or ŷ
* Predictor variable: the variable used in the prediction
  – The independent variable; the x axis in a scatter-plot
10
Q

Criterion variable

A

the variable to predict

11
Q

Predictor variable

A

the variable used in the prediction

12
Q

Determining the line of best fit:
the Least Squares Criterion

A

A good fit will limit the divergence between our predicted values and the actual data (the “error”).
* With a single line, we cannot fit the data exactly.
* Some points will be above the line and some below.
* How do we trade off these errors?
* Minimize the sum of the squared errors.
* Squaring treats deviations above and below the line equally.
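A small sketch of the least squares idea (invented data): the sum of squared errors for the least-squares line is lower than for any other line you might draw, such as one with a slightly different slope or intercept.

```python
import numpy as np

x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([1.8, 3.1, 3.9, 5.2, 5.8])

def sse(b0, b1):
    """Sum of squared errors between the data and the line y = b0 + b1*x."""
    return np.sum((y - (b0 + b1 * x)) ** 2)

# Least-squares slope and intercept
b1 = np.sum((x - x.mean()) * (y - y.mean())) / np.sum((x - x.mean()) ** 2)
b0 = y.mean() - b1 * x.mean()

print(round(sse(b0, b1), 3))          # the minimum possible value
print(round(sse(b0, b1 + 0.3), 3))    # any other line gives a larger total
print(round(sse(b0 + 0.5, b1), 3))
```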

13
Q

Evidence of prediction error is known as

A

a residual score.
* The difference between y and y’ (or ŷ).
* 𝑒𝑖 = 𝑌𝑖 − Ŷ𝑖

14
Q

the least squares criterion

A

The sum of the squared differences between the actual (y) and predicted (y’) values must be as small as possible.

Regression is designed to minimize the sum of the squared differences between y and y’.

15
Q

Best Fit Line

A

Ordinary Least Squares (OLS)
* Make the distance between your line and the y value of each point as small as possible.
* The line shows the “predicted” values.
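A minimal sketch (hypothetical data): numpy’s polyfit returns the ordinary least squares line, and evaluating it at each x gives the “predicted” values that sit on that line.

```python
import numpy as np

x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([2.2, 2.8, 4.1, 4.9, 6.2])

slope, intercept = np.polyfit(x, y, deg=1)   # OLS fit of a straight line
predicted = intercept + slope * x            # points on the best-fit line

print(np.round(predicted, 2))   # the vertical gaps y - predicted are what OLS keeps small
```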

16
Q

Goodness of Fit: Coefficient of Determination

A

𝑅²: How much of the variability in Y is explained by the predictor(s).
* Lies between 0 (worst fit; 0% of variance explained) and 1 (best fit; 100% of variance explained).
* For simple regression: 𝑅² = 𝑟(𝑥, 𝑦)².
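A quick numeric check of this identity (made-up numbers): R² computed from explained versus total variability matches the squared Pearson correlation in simple regression.

```python
import numpy as np

x = np.array([1.0, 2.0, 3.0, 4.0, 5.0, 6.0])
y = np.array([1.9, 3.2, 3.8, 5.1, 5.7, 7.2])

slope, intercept = np.polyfit(x, y, deg=1)
y_hat = intercept + slope * x

r_squared = 1 - np.sum((y - y_hat) ** 2) / np.sum((y - y.mean()) ** 2)
r = np.corrcoef(x, y)[0, 1]

print(round(r_squared, 4), round(r ** 2, 4))   # the two values agree
```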

17
Q

Which of the following is correct with respect to the correlation coefficient (r) and the slope of the least-squares regression line?

a. They will always have the same sign.
b. They will have opposite signs.
c. Neither, because they are two different measures that are not related to one another.

A

a. They will always have the same sign.

b = r (sy/sx); because standard deviations are always positive, sy/sx > 0, so b and r must share the same sign.
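A small check of that identity with arbitrary numbers: the slope equals r times the ratio of standard deviations, and since that ratio is always positive, b and r share a sign.

```python
import numpy as np

x = np.array([2.0, 4.0, 5.0, 7.0, 9.0])
y = np.array([8.5, 7.1, 6.8, 5.0, 3.9])    # a negative relation, on purpose

r = np.corrcoef(x, y)[0, 1]
slope, _ = np.polyfit(x, y, deg=1)

# b = r * (s_y / s_x); ddof=1 gives sample standard deviations
print(round(slope, 4), round(r * y.std(ddof=1) / x.std(ddof=1), 4))
```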

18
Q

Least Squares Regression Line

A

The Least Squares Regression Line is the line that minimizes the sum of the squared residuals. In other words, for any line other than the LSRL, the sum of the squared residuals will be greater. This is what makes the LSRL the sole best-fitting line.

19
Q

Interpreting the Slope

A

If the value of the slope (b) is positive, it indicates how much y increases for every one unit (1.0) increase in x.
* If b is negative, it indicates how much y decreases for every one-unit (1.0) increase in the predictor.

20
Q

Interpreting the Intercept (a)

A

The intercept value, a, is the value of y when x = 0.
* For example, if the number of absences is used to predict a student’s test grade (the criterion), a indicates what the student’s score will be if the
student misses 0 days.
* Or… what the predicted Groot 2 score would be if the wizard’s Groot 1 score were 0.

21
Q

Regression Assumptions

A
  • For simple regression, essentially the same as the correlation assumptions
  • The Y variable is continuous and approximately normal
    (less of an issue with a large n)
  • The relation between the predictor(s) and the outcome variable is linear
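A rough sketch of how these assumptions might be checked in practice (my own illustration with invented data, not from the slides): examine the residuals from the fitted line.

```python
import numpy as np
from scipy import stats

x = np.array([1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0])
y = np.array([2.3, 2.9, 4.1, 4.6, 6.2, 6.8, 7.9, 9.1])

slope, intercept = np.polyfit(x, y, deg=1)
residuals = y - (intercept + slope * x)

# Normality: Shapiro-Wilk test on the residuals (less of an issue with large n)
stat, p = stats.shapiro(residuals)
print(f"Shapiro-Wilk p = {p:.3f}")

# Linearity: does allowing a curve (quadratic term) reduce the squared error much?
sse_line = np.sum((y - np.polyval(np.polyfit(x, y, deg=1), x)) ** 2)
sse_curve = np.sum((y - np.polyval(np.polyfit(x, y, deg=2), x)) ** 2)
print(round(sse_line, 3), round(sse_curve, 3))   # similar values suggest a straight line is adequate
```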