Linear Regression Flashcards
What is a correlation?
A measure of the strength of a linear relationship between two variables.
What are the two key factors of correlations?
- The two variables in a correlation typically reflect numerical values
- It is represented with positive or negative values
What is a correlation coefficient?
The value that indicates the strength of the relationship between two variables.
- Falls between +1 and -1
Linear relationships are typically represented by what?
Pearson’s r
What does Pearson’s r tell us?
Two things: the strength and the direction of the relationship between two variables.
What does regression do?
Expresses the relationship in the form of an equation.
- The equation for the line of best fit
- Can be used to make predictions
How many predictor variables are there in ‘Simple Linear Regression’?
1
How many predictor variables are there in ‘Multiple Linear Regression’?
More than 1 (2 or more)
What is a predictor variable?
The name given to an independent variable in regression analysis
What is the equation for simple regression?
y = a + bx
a = intercept
b = slope
What is meant by intercept?
The point where the line crosses the y-axis, i.e. the value of y when x = 0
- y = 10 + 1x (the line crosses the y-axis at y = 10)
What is meant by slope?
How much y changes for every one-unit increase in x
- y = 0 + 2x (2 y-units increase per 1 x-unit)
- y = 0 - 0.5x (0.5 y-units decrease per 1 x-unit)
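As a rough illustration of how the intercept and slope turn into predictions, here is a minimal sketch that plugs x into y = a + bx. The function name `predict` and the x values are made up for this example; the three lines are the ones used on the cards above.

```python
def predict(x, intercept, slope):
    """Return the predicted y for a given x using y = a + bx."""
    return intercept + slope * x

# y = 10 + 1x: at x = 0 the prediction is the intercept itself
print(predict(0, intercept=10, slope=1))    # 10

# y = 0 + 2x: each extra x-unit adds 2 y-units
print(predict(3, intercept=0, slope=2))     # 6

# y = 0 - 0.5x: each extra x-unit removes 0.5 y-units
print(predict(4, intercept=0, slope=-0.5))  # -2.0
```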
How can we determine the strength of a regression fit?
By calculating the sum of squares error (SSerror).
The smaller the SSerror, the better the regression fit.
How do you calculate the SSerror?
- Calculate the deviation of each data point from its predicted value and square it.
- Add up all the squared deviations.
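The SSerror steps can be sketched in a few lines of Python. The data points and the two candidate lines below are invented purely for illustration, and `sum_of_squares_error` is a hypothetical helper name. The sketch assumes the deviation is taken between each observed y and the y predicted by the line.

```python
def sum_of_squares_error(xs, ys, intercept, slope):
    """Sum the squared deviations between observed and predicted y values."""
    total = 0.0
    for x, y in zip(xs, ys):
        predicted = intercept + slope * x   # value the line predicts for this x
        deviation = y - predicted           # how far the data point is from the line
        total += deviation ** 2             # square it and add it to the running sum
    return total

# Made-up data for illustration only
xs = [1, 2, 3, 4]
ys = [2, 4, 5, 8]

sse_a = sum_of_squares_error(xs, ys, intercept=0, slope=2)  # candidate line y = 0 + 2x
sse_b = sum_of_squares_error(xs, ys, intercept=1, slope=1)  # candidate line y = 1 + 1x

print(sse_a)  # 1.0
print(sse_b)  # 11.0
```

Here the first line has the smaller SSerror, so it fits this made-up data better than the second.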
How do you calculate Pearson’s r?
- Convert the x and y values into z-scores: z = (value - mean) / standard deviation
- Multiply each z-score of x by the matching z-score of y
- Add up the products and divide by the number of participants minus 1: r = Σ(zx * zy) / (n - 1)
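A minimal sketch of the z-score method for Pearson's r, using only Python's standard library. The function names and the data are made up for this example. It assumes the sample standard deviation (the n - 1 version, which is what `statistics.stdev` computes), so that dividing by n - 1 at the end is consistent.

```python
from statistics import mean, stdev

def z_scores(values):
    """Convert raw values to z-scores using the sample standard deviation."""
    m = mean(values)
    s = stdev(values)  # sample SD, divides by n - 1
    return [(v - m) / s for v in values]

def pearson_r(xs, ys):
    """Pearson's r as the sum of z-score products divided by n - 1."""
    zx = z_scores(xs)
    zy = z_scores(ys)
    products = [a * b for a, b in zip(zx, zy)]  # the "column" of zx * zy values
    return sum(products) / (len(xs) - 1)

# Made-up data for illustration only
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]

r = pearson_r(xs, ys)
print(round(r, 3))
```

A positive result indicates that y tends to rise as x rises; a result near +1 or -1 would indicate a strong relationship.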
How do you calculate the regression slope?
You divide the standard deviation of y by the standard deviation of x, and multiply it by Pearson’s r.
b = r (sy/sx)
How do you calculate the regression intercept?
Subtract the slope multiplied by the mean of x from the mean of y.
a = mean of y - (slope * mean of x)
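The slope and intercept formulas can be tried out together in a short sketch. The helper names `pearson_r` and `fit_line` and the data are assumptions made for this example, and the sample standard deviation (n - 1) is assumed throughout.

```python
from statistics import mean, stdev

def pearson_r(xs, ys):
    """Pearson's r from z-score products, divided by n - 1."""
    mx, my = mean(xs), mean(ys)
    sx, sy = stdev(xs), stdev(ys)
    products = [((x - mx) / sx) * ((y - my) / sy) for x, y in zip(xs, ys)]
    return sum(products) / (len(xs) - 1)

def fit_line(xs, ys):
    """Return (intercept, slope) using b = r * (sy / sx) and a = mean_y - b * mean_x."""
    r = pearson_r(xs, ys)
    slope = r * (stdev(ys) / stdev(xs))
    intercept = mean(ys) - slope * mean(xs)
    return intercept, slope

# Made-up data for illustration only
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]

intercept, slope = fit_line(xs, ys)
print(round(intercept, 2), round(slope, 2))  # 2.2 0.6
```

For this data the fitted line is approximately y = 2.2 + 0.6x.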
What is total variance?
How much each data point varies from the mean of y
What is error variance?
How much each data point varies from its predicted value
What is regression variance?
How much each predicted value varies from the mean of y
- more regression variance = stronger relationship
How do you calculate total variance?
- Subtract the mean of y from each y value
- Square each difference from the mean
- Add up all the squared differences
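The total variance steps amount to a sum of squared differences from the mean, which can be sketched as follows. The function name and the data are made up for this example. Note that the steps as listed give a sum of squares; dividing that sum by n - 1 would give the sample variance.

```python
def total_sum_of_squares(ys):
    """Sum of squared differences between each y value and the mean of y."""
    mean_y = sum(ys) / len(ys)
    squared_differences = [(y - mean_y) ** 2 for y in ys]
    return sum(squared_differences)

# Made-up data for illustration only: the mean of y is 4
ys = [2, 4, 5, 4, 5]

ss_total = total_sum_of_squares(ys)
print(ss_total)  # 6.0
```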
How do you calculate error variance?
- Input each x value into the regression equation to get a predicted score
- Subtract the predicted score from the actual score
- Square each difference
- Add up all the squared differences
What does variance explained tell us?
How much of the total variance in y is explained by the regression model (regression variance as a proportion of total variance)
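To see how total, error, and regression variance relate, here is a small sketch that computes all three as sums of squares and then the proportion of variance explained. The data are invented, `variance_breakdown` is a hypothetical helper name, and the line y = 2.2 + 0.6x is assumed to be the least-squares fit for this data. For a least-squares line, total = regression + error, which is why the proportion can be read as "variance explained".

```python
def variance_breakdown(xs, ys, intercept, slope):
    """Return (ss_total, ss_error, ss_regression) for the line y = a + bx."""
    mean_y = sum(ys) / len(ys)
    predicted = [intercept + slope * x for x in xs]

    # Total: each data point versus the mean of y
    ss_total = sum((y - mean_y) ** 2 for y in ys)
    # Error: each data point versus its predicted value
    ss_error = sum((y - p) ** 2 for y, p in zip(ys, predicted))
    # Regression: each predicted value versus the mean of y
    ss_regression = sum((p - mean_y) ** 2 for p in predicted)
    return ss_total, ss_error, ss_regression

# Made-up data for illustration only
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]

ss_total, ss_error, ss_regression = variance_breakdown(xs, ys, intercept=2.2, slope=0.6)
variance_explained = ss_regression / ss_total

print(round(ss_total, 2), round(ss_error, 2), round(ss_regression, 2))
print(round(variance_explained, 2))  # roughly 0.6 for this data
```

In this example about 60% of the variation in y is accounted for by the line, and the remaining 40% is error variance.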