W2 Correlations and Predictions Flashcards
What does no correlation look like?
What does positive correlation look like?
What does negative correlation look like?
What does covariance tell us?
Covariance measures how much two random variables vary together. Its sign gives the direction of the relationship (+ or -), and its magnitude reflects how strongly they co-vary, expressed in the units of the two variables.
What is the formula for covariance?
σXY = SUM[i=1..n] (x[i] - μ(x))(y[i] - μ(y)) / n
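A minimal sketch of the covariance formula in plain Python, assuming the population form (dividing by n). The function name and the data are made up for illustration.

```python
def covariance(xs, ys):
    """Population covariance: mean product of paired deviations from the means."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    return sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / n

# Made-up data, for illustration only.
hours = [1, 2, 3, 4, 5]
scores = [2, 4, 5, 4, 5]
print(covariance(hours, scores))  # 1.2 (positive: the variables rise together)
```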
What is the problem with covariance (σXY)?
Covariance centres the variables but does not standardise them, so its value depends on the units of measurement. If Cov(X,Y) = 3.9 and Cov(Z,Q) = 5.2, we know both pairs are positively related, but we cannot tell which relationship is stronger, because the pairs could be measured on different scales/units.
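The scale problem can be seen in a small sketch: the same relationship gives a different covariance when one variable changes units. The data and names below are made up for illustration.

```python
def covariance(xs, ys):
    """Population covariance (divides by n)."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    return sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / n

# Made-up heights and weights; the second list of heights is the same data in cm.
heights_m = [1.5, 1.6, 1.7, 1.8]
weights_kg = [50, 58, 65, 77]
heights_cm = [h * 100 for h in heights_m]

cov_m = covariance(heights_m, weights_kg)
cov_cm = covariance(heights_cm, weights_kg)
# Same people, same relationship, but the covariance is 100 times larger in cm.
print(cov_m, cov_cm)
```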
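How do you scale covariance with z-scores?
Convert each score to a standardized score (z-score) by subtracting the mean and dividing by the standard deviation: z = (x - μ) / σ. The correlation is then the mean product of the paired z-scores: ρ = SUM[i=1..n] z(x[i]) z(y[i]) / n, taken over every (x, y) pair in the data.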
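A sketch of the z-score route, assuming population standard deviations (dividing by n). The function names and data are illustrative, not from the course material.

```python
import math

def z_scores(values):
    """Standardize values: subtract the mean, divide by the population SD."""
    n = len(values)
    mean = sum(values) / n
    sd = math.sqrt(sum((v - mean) ** 2 for v in values) / n)
    return [(v - mean) / sd for v in values]

def correlation_from_z(xs, ys):
    """Correlation as the mean product of paired z-scores."""
    zx, zy = z_scores(xs), z_scores(ys)
    return sum(a * b for a, b in zip(zx, zy)) / len(xs)

# Made-up data, for illustration only.
hours = [1, 2, 3, 4, 5]
scores = [2, 4, 5, 4, 5]
r = correlation_from_z(hours, scores)
print(round(r, 4))  # 0.7746
```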
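How do you scale covariance from raw scores (ρ)?
Start with the covariance formula. Replace (x[i] - μ(x))(y[i] - μ(y)) with ((x[i] - μ(x)) / σx)((y[i] - μ(y)) / σy). Because σx and σy are constants, they factor out of the sum, and it simplifies to: ρ = σxy / (σx σy) = σxy / Sqrt(σx^2 σy^2)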
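The raw-score formula can be sketched directly as covariance divided by the square root of the product of the variances. This assumes population variances (dividing by n); the names and data are made up. Unlike covariance, the result does not change when a variable is rescaled.

```python
import math

def correlation_from_raw(xs, ys):
    """rho = cov(x, y) / sqrt(var(x) * var(y)), using population formulas."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / n
    var_x = sum((x - mean_x) ** 2 for x in xs) / n
    var_y = sum((y - mean_y) ** 2 for y in ys) / n
    return cov_xy / math.sqrt(var_x * var_y)

# Made-up data, for illustration only.
hours = [1, 2, 3, 4, 5]
scores = [2, 4, 5, 4, 5]
minutes = [h * 60 for h in hours]  # same data in different units

rho_hours = correlation_from_raw(hours, scores)
rho_minutes = correlation_from_raw(minutes, scores)
print(round(rho_hours, 4), round(rho_minutes, 4))  # both 0.7746
```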
How do we determine if correlation means causation?
Run a controlled experiment: explicitly manipulate one independent variable at a time while holding the others constant, and observe the effect on the dependent variable.
What is the linear regression formula?
Y[hat] = b[0] + b[1]X, where b[0] is the intercept and b[1] is the slope
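As a small sketch, the regression line is just a function from X to a predicted value. The coefficients below are hypothetical, chosen only to show the arithmetic.

```python
def predict(x, b0, b1):
    """Predicted value from the regression line: intercept plus slope times x."""
    return b0 + b1 * x

# Hypothetical intercept and slope, for illustration only.
b0, b1 = 2.0, 0.5
print(predict(4, b0, b1))  # 2.0 + 0.5 * 4 = 4.0
```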
What’s the difference between Y and Y[hat]?
Y is the actual observed value plotted on the graph; Y[hat] is the value predicted by the regression line
What are Residuals?
The vertical deviations from each observed point (dot) to the regression line: Y[i] - Y[i hat]
What is the formula for SSresidual/SSerror?
SSresidual = SUM[i=1..n] (Y[i] - Y[i hat])^2
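A sketch of the sum of squared residuals for a given line. The data are made up and the two candidate lines are hypothetical; a line that fits better gives a smaller value.

```python
def ss_residual(xs, ys, b0, b1):
    """Sum of squared vertical distances between observed and predicted values."""
    residuals = [y - (b0 + b1 * x) for x, y in zip(xs, ys)]
    return sum(r ** 2 for r in residuals)

# Made-up data, for illustration only.
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]

good_fit = ss_residual(xs, ys, b0=2.2, b1=0.6)  # hypothetical line A
poor_fit = ss_residual(xs, ys, b0=0.0, b1=1.0)  # hypothetical line B
print(good_fit < poor_fit)  # True: line A leaves less squared error
```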
How do we calculate INTERCEPT & SLOPE from SSerror?
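Start with the SSerror formula, then substitute b[0] + b[1]X[i] in place of Y[i hat]: SUM[i=1..n] (Y[i] - Y[i hat])^2 = SUM[i=1..n] (Y[i] - (b[0] + b[1]X[i]))^2. Minimise this by setting the partial derivatives with respect to b[0] and b[1] to zero, then rearrange to make b[1] or b[0] the subject: b[1] = SUM[i=1..n] (X[i] - X[mean])(Y[i] - Y[mean]) / SUM[i=1..n] (X[i] - X[mean])^2 and b[0] = Y[mean] - b[1]X[mean]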
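A sketch of the resulting calculation, using the standard closed-form least-squares solution: the slope is the sum of cross-deviations divided by the sum of squared X deviations, and the intercept is Y[mean] minus the slope times X[mean]. The function name and data are made up for illustration.

```python
def fit_line(xs, ys):
    """Return (b0, b1) that minimise the sum of squared residuals."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    b1 = sxy / sxx              # slope
    b0 = mean_y - b1 * mean_x   # intercept: line passes through the means
    return b0, b1

# Made-up data, for illustration only.
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]
b0, b1 = fit_line(xs, ys)
print(round(b0, 2), round(b1, 2))  # 2.2 0.6
```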
What are the key assumptions of linear regression?
Linear relationship (a straight line, not a curve)
Homoscedasticity (residuals have equal spread across all values of X, not a cone shape)
Normality of residuals (residuals are roughly symmetric around zero, so the extremes on both ends balance out)
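A rough sketch of how the "not a cone" idea could be checked numerically: fit a line, then compare the mean squared residual in the upper half of X to the lower half. This is only a crude heuristic assumed for illustration, not a formal test, and the data are made up so that the spread grows with X.

```python
def fit_line(xs, ys):
    """Closed-form least-squares intercept and slope."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    b1 = sxy / sxx
    return mean_y - b1 * mean_x, b1

def spread_ratio(xs, ys):
    """Mean squared residual for the upper-X half divided by the lower-X half."""
    b0, b1 = fit_line(xs, ys)
    pairs = sorted((x, y - (b0 + b1 * x)) for x, y in zip(xs, ys))
    half = len(pairs) // 2
    low = [r for _, r in pairs[:half]]
    high = [r for _, r in pairs[half:]]
    mean_sq = lambda rs: sum(r ** 2 for r in rs) / len(rs)
    return mean_sq(high) / mean_sq(low)

# Made-up cone-shaped data: the scatter around the line widens as X grows.
xs = [1, 2, 3, 4, 5, 6, 7, 8]
ys = [1.1, 1.9, 3.2, 3.8, 6.0, 5.0, 9.0, 6.0]
ratio = spread_ratio(xs, ys)
print(ratio > 1)  # True: a ratio far from 1 hints at unequal spread
```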