Lecture 20: Correlation & Simple Regression Flashcards
What is the Pearson product-moment correlation coefficient (PPMCC), and what range of values can it take?
It is a measure of the linear correlation between two variables X and Y
It is standardized and thus ranges between -1 and +1, where -1 means a perfect negative linear correlation, 0 means no linear correlation and +1 means a perfect positive linear correlation
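As an illustration of the range described above, here is a minimal sketch that computes the coefficient by hand on made-up data. The function name `pearson_r` and the three small data sets are assumptions chosen for the example, not part of the lecture.

```python
import math

def pearson_r(xs, ys):
    """Pearson r: the sum of cross-products of the deviations,
    divided by the square root of the product of the sums of squares."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    dx = [x - mean_x for x in xs]
    dy = [y - mean_y for y in ys]
    sum_xy = sum(a * b for a, b in zip(dx, dy))
    sum_xx = sum(a * a for a in dx)
    sum_yy = sum(b * b for b in dy)
    return sum_xy / math.sqrt(sum_xx * sum_yy)

x = [1, 2, 3, 4, 5]
r_pos = pearson_r(x, [2, 4, 6, 8, 10])   # points on a rising straight line
r_neg = pearson_r(x, [10, 8, 6, 4, 2])   # points on a falling straight line
r_zero = pearson_r(x, [1, 0, -2, 0, 1])  # symmetric pattern with no linear trend
print(r_pos, r_neg, r_zero)
```

The third data set is clearly patterned (a V shape), yet its coefficient is zero, which is why 0 should be read as "no linear correlation" rather than "no relationship".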
What are the 3 most important assumptions for linear regression with 1 predictor variable, and what do they entail?
- Sensitivity; the results are sensitive to outliers, so check whether any outliers have an undue influence on the fitted line
- Homoscedasticity; the variance of the residuals should be equal across all predicted values (check a scatterplot of standardized predicted values against standardized residuals)
- Linearity; the continuous dependent and predictor variables are linearly related
What are the 6 ways of assessing an association?
- Correlation between X and Y, standardized between -1 and +1
- Covariance between X and Y, unstandardized
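- Regression coefficient in linear regression (b1 itself is in the units of Y per unit of X; it is standardized when reported as the standardized coefficient beta)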
- T-statistic: standardized difference between b1 and 0 (b1 divided by its standard error)
- The correlation between Y and the model's predictions, standardized (between -1 and +1)
- F: signal to noise ratio of a model
-> The last two metrics are more indicative of the model's overall performance
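The six measures above can be computed side by side for a simple regression. The following is a rough sketch on made-up data; the variable names and the data values are assumptions for illustration only, and the formulas are the standard textbook ones for a single predictor.

```python
import math

# Made-up illustration data (not from the lecture)
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.1, 3.9, 6.2, 7.8, 10.1]
n = len(x)

mean_x, mean_y = sum(x) / n, sum(y) / n
sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
sxx = sum((a - mean_x) ** 2 for a in x)
syy = sum((b - mean_y) ** 2 for b in y)

cov_xy = sxy / (n - 1)                  # 1. covariance, unstandardized
r_xy = sxy / math.sqrt(sxx * syy)       # 2. correlation, between -1 and +1
b1 = sxy / sxx                          # 3. regression coefficient (slope)
b0 = mean_y - b1 * mean_x               #    intercept, needed for predictions
beta = b1 * math.sqrt(sxx / syy)        #    standardized slope

pred = [b0 + b1 * a for a in x]
ss_res = sum((b - p) ** 2 for b, p in zip(y, pred))  # unexplained variation
ss_model = syy - ss_res                              # explained variation

se_b1 = math.sqrt(ss_res / (n - 2) / sxx)
t_b1 = b1 / se_b1                       # 4. t-statistic: b1 against 0

mean_p = sum(pred) / n
spp = sum((p - mean_p) ** 2 for p in pred)
r_y_pred = sum((b - mean_y) * (p - mean_p)
               for b, p in zip(y, pred)) / math.sqrt(syy * spp)  # 5. R

f_ratio = (ss_model / 1) / (ss_res / (n - 2))  # 6. F: signal-to-noise ratio

print(cov_xy, r_xy, b1, beta, t_b1, r_y_pred, f_ratio)
```

With only one predictor these measures overlap heavily: the standardized slope equals the correlation, the correlation between Y and the predictions equals the absolute value of r, and F equals the square of the t-statistic.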
What do b1 and b0 stand for in the regression formula?
b1 = the slope (the change in predicted Y for a one-unit increase in X)
b0 = the intercept (the predicted value of Y when X = 0)
What is the least squares estimate?
The regression line that lies closest to all the data points: the line for which the sum of the squared residuals (the vertical distances between each data point and the line) is as small as possible
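To make the "smallest sum of squares" idea concrete, here is a small sketch using the standard closed-form solution for one predictor. The helper names `fit_least_squares` and `sum_squared_residuals` and the data are hypothetical, chosen only for this example.

```python
def fit_least_squares(xs, ys):
    """Return (b0, b1) for the line that minimizes the sum of squared residuals."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    b1 = (sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
          / sum((x - mean_x) ** 2 for x in xs))
    b0 = mean_y - b1 * mean_x
    return b0, b1

def sum_squared_residuals(xs, ys, b0, b1):
    """Sum of squared vertical distances between the points and the line."""
    return sum((y - (b0 + b1 * x)) ** 2 for x, y in zip(xs, ys))

# Made-up illustration data
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]

b0, b1 = fit_least_squares(xs, ys)
best = sum_squared_residuals(xs, ys, b0, b1)

# Nudging either coefficient away from the estimate increases the sum of squares
worse_slope = sum_squared_residuals(xs, ys, b0, b1 + 0.1)
worse_intercept = sum_squared_residuals(xs, ys, b0 + 0.1, b1)
print(b0, b1, best, worse_slope, worse_intercept)
```

Any other line through the same points, whether it differs in slope or in intercept, gives a larger sum of squared residuals than the least squares line.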