Unit 13a: Regression I Simple Linear Regression Flashcards
The correlation coefficient…
A. Will always fall between 0 and 1
B. Compares the mean of a sample to a population
C. Measures how strongly related two variables are
D. Can only be calculated for 2 continuous (i.e., ratio) variables
C. Measures how strongly related two variables are
Purpose of Simple (bivariate) Regression
How relations are used to predict outcomes: the stronger the correlation, the more accurate the prediction.
* Correlated variables can be used to predict each other's values via a simple (bivariate) regression.
* The standard error of the estimate helps us gauge the accuracy of a prediction.
A strong correlation means an accurate or a weak prediction?
accurate
correlation
If two variables co-vary, they have a relation (correlation)
* Regression extends the correlation to make a prediction of one variable on another.
* The accuracy of the prediction depends on the strength of the relation (correlation)
* More information shared between the variables (a higher, stronger correlation) means less error
* Not to be confused with one variable causing another.
* We cannot tell which variable causes which
* Or whether a third variable accounts for the relation (a confounder)
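As a quick sketch of the idea on this card, the correlation coefficient can be computed for two co-varying variables; the data below (hours studied vs. exam score) are made up for illustration.

```python
import numpy as np

# Hypothetical sample data: hours studied (x) and exam score (y)
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([52.0, 55.0, 61.0, 64.0, 70.0])

# Pearson correlation coefficient: how strongly the two variables are related.
# np.corrcoef returns a 2x2 correlation matrix; the off-diagonal entry is r.
r = np.corrcoef(x, y)[0, 1]
```

Here r comes out close to 1, so a regression using x to predict y would have little error; note that a high r still says nothing about which variable causes which.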
Simple/Bivariate Regression has how many variables
2:
- 1 outcome variable
- 1 predictor variable
Key Elements of Linear Regression
- F-test: the Omnibus Test
- Is there any association
- Regression Equation
- Beta coefficient
- “Slope” of the function
- This is the important element we want because it has meaningful interpretation
Regression Equation
y = mx + b
* In regression notation: y = β₀ + β₁x
* Introduce an index i for each participant or observation (Xᵢ, Yᵢ):
  Yᵢ = β₀ + β₁Xᵢ
* We allow an error term in the equation for each observation i:
  Yᵢ = β₀ + β₁Xᵢ + εᵢ
The regression equation tells us that y' (the predicted value of y) is a function of the intercept (β₀, sometimes written a), the slope (β₁, sometimes written b), and a value for the predictor variable (x).
Regression Equation parts
Regression Equation
Yᵢ = β₀ + β₁Xᵢ + εᵢ
* Yᵢ: value of the outcome (or response, or dependent) variable for the ith observation
* Xᵢ: value of the predictor (or independent) variable for the ith observation
* β₀ & β₁: regression parameters (the intercept and slope) to be estimated
  - β₀ = the intercept
  - β₁ = the slope
* εᵢ: the random error term for the ith observation
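The intercept and slope on this card can be estimated with the standard least-squares formulas (β₁ = Σ(x − x̄)(y − ȳ) / Σ(x − x̄)², β₀ = ȳ − β₁x̄); the data below are hypothetical.

```python
import numpy as np

# Hypothetical data for the sketch
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([52.0, 55.0, 61.0, 64.0, 70.0])

x_bar, y_bar = x.mean(), y.mean()

# Least-squares estimates of the slope (b1) and intercept (b0)
b1 = np.sum((x - x_bar) * (y - y_bar)) / np.sum((x - x_bar) ** 2)
b0 = y_bar - b1 * x_bar

# Predicted values y' for each observation: y' = b0 + b1 * x
y_hat = b0 + b1 * x
```

The slope b1 is the element with the meaningful interpretation: the expected change in y for a one-unit change in x.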
Regression Analysis Variables
In regression language, the criterion variable is regressed on the
predictor variable.
* Criterion variable: the variable to predict
  - The dependent variable; the y axis in a scatter-plot
  - Actual values denoted as y
  - Predicted values denoted y' or ŷ
* Predictor variable: the variable used in the prediction
  - The independent variable; the x axis in a scatter-plot
Criterion variable
the variable to predict
Predictor variable
the variable used in the prediction
Determining the line of best fit:
the Least Squares Criterion
A good fit will limit the divergence between our predicted value and
the actual data (the “error”)
* With a single line, we cannot fit the data exactly.
* Some points will be above and some below
* How do we trade-off these errors?
* Minimize the square of the errors
* Squaring treats deviations above and below the line equally
Evidence of prediction error is known as
a residual score.
* The difference between y and y' (or ŷ).
* eᵢ = Yᵢ − Ŷᵢ
the least squares criterion
The sum of the squared differences between the actual (y) and predicted (y') values must be at its lowest possible value.
Regression is designed to minimize the sum of the squared differences between y and y'.
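The least squares criterion on these cards can be illustrated directly: compute the residuals eᵢ = yᵢ − y'ᵢ, sum their squares (SSE), and check that any other line gives a larger SSE than the least-squares line. The data and the alternative line below are made up for the sketch; the standard error of the estimate, s = √(SSE / (n − 2)), is also shown.

```python
import numpy as np

# Hypothetical data
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([52.0, 55.0, 61.0, 64.0, 70.0])

# Least-squares slope and intercept (same formulas as the regression-equation card)
b1 = np.sum((x - x.mean()) * (y - y.mean())) / np.sum((x - x.mean()) ** 2)
b0 = y.mean() - b1 * x.mean()

# Residual scores: e_i = y_i - y'_i (some positive, some negative)
residuals = y - (b0 + b1 * x)
sse = np.sum(residuals ** 2)  # sum of squared errors

# Any other line (here: intercept and slope nudged arbitrarily) yields a larger SSE
other_residuals = y - ((b0 - 1.0) + (b1 + 0.5) * x)
sse_other = np.sum(other_residuals ** 2)

# Standard error of the estimate: the typical size of a prediction error
se_est = np.sqrt(sse / (len(x) - 2))
```

A stronger correlation shrinks the residuals, so SSE and the standard error of the estimate both fall, which is the sense in which more shared information means less error.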