Regression Line Test Flashcards
When are correlation and linear regression models used for analysis
When comparing 2 continuous variables (interval or ratio).
When are linear regression models used for EDA
When the y-variable is continuous
Pearson’s Correlation Coefficient (R) Outline
2 continuous variables. Shows how linearly 2 variables co-vary. The bigger the spread of points, the lower the correlation. Note: only shows linear correlation, not association overall (needs EDA; the relationship might be U-shaped)
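Worked example (a minimal sketch, not part of the original cards): the function name pearson_r and the two small data sets are made up for illustration. It shows that a perfectly U-shaped relationship can still give a linear correlation of zero, which is why the scatter plot has to be checked as well.

```python
import math

def pearson_r(xs, ys):
    """Pearson's correlation coefficient for two equal-length sequences."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    # Sum of cross-products of the deviations from each mean
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    # Sums of squared deviations for each variable
    sxx = sum((x - x_bar) ** 2 for x in xs)
    syy = sum((y - y_bar) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

xs = [-2, -1, 0, 1, 2]
linear = [2 * x + 1 for x in xs]   # every point lies on a straight line
u_shaped = [x ** 2 for x in xs]    # every point lies on a U-shaped curve

r_linear = pearson_r(xs, linear)
r_u = pearson_r(xs, u_shaped)
print(r_linear, r_u)
```

The straight-line data gives a coefficient of 1, while the U-shaped data gives 0 even though y is completely determined by x.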
Covariation Outline
The squared difference between the observed value of x or y and its mean value, x bar or y bar (horizontal/vertical lines on graph). Explains how y changes with respect to x
EDA Test for Association
Scatter plot and observe spread
Correlation Outline
(x - xbar) × (y - ybar). Area of the rectangle formed by the two deviations from the means
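Worked example (a rough sketch with made-up numbers; the variable names are illustrative only): each point contributes a signed area, its x deviation times its y deviation. Points above both means or below both means add positive area; points on opposite sides of the two means would add negative area.

```python
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]

x_bar = sum(xs) / len(xs)
y_bar = sum(ys) / len(ys)

# Signed area for each point: (x - xbar) * (y - ybar)
products = [(x - x_bar) * (y - y_bar) for x, y in zip(xs, ys)]
total = sum(products)

for x, y, p in zip(xs, ys, products):
    print(x, y, p)
print("sum of products:", total)
```

A positive total suggests y tends to rise with x; a negative total suggests it tends to fall.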
Deterministic Outline
Spread around line is minimal (most points on line). Strong association, low probability of results occurring by chance
Probabilistic Outline
Spread around line is significant (most points aren’t on line). Low association, high probability results occurred by chance
Linearity Outline
Constant change in y for every unit change in x
How is variability around line expressed
Probability Distribution
Regression Line Outline
Line of points of all predicted values of y for every value of x
How to evaluate line of best fit for data set
Least Square Distance Criterion. Minimise the total area of the squares formed from the differences between the observed and expected y values.
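Worked example (a minimal sketch, assuming the standard closed-form least squares solution for a single x variable; fit_line, sum_squared_residuals and the data are made up for illustration): the fitted line should have a smaller sum of squared differences than any nearby line.

```python
def fit_line(xs, ys):
    """Return (intercept, slope) of the least squares line."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar  # line passes through (xbar, ybar)
    return intercept, slope

def sum_squared_residuals(xs, ys, intercept, slope):
    """Total area of the squares between observed and expected y."""
    return sum((y - (intercept + slope * x)) ** 2 for x, y in zip(xs, ys))

xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]

b0, b1 = fit_line(xs, ys)
best = sum_squared_residuals(xs, ys, b0, b1)
shifted = sum_squared_residuals(xs, ys, b0 + 0.5, b1)  # line moved up
tilted = sum_squared_residuals(xs, ys, b0, b1 + 0.1)   # line made steeper
print(b0, b1, best, shifted, tilted)
```

Both the shifted and the tilted line give a larger total than the fitted line, which is what the criterion asks for.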
y = mx + c
yi = Beta1(xi) + Beta0 + epsiloni, where epsiloni is the error term for observation i
Beta0 Outline
y-intercept. Value of y when x = 0
Beta1 Outline
Slope of line. How much y changes for every one-unit increase in x. Scale dependent (its value changes with the units of x and y)
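Worked example (a small sketch with made-up numbers; the metre and centimetre units and the function name slope are assumptions for illustration): measuring the same x values in different units changes the slope, even though the data have not changed.

```python
def slope(xs, ys):
    """Least squares slope: cross-products over squared x deviations."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    return sxy / sxx

x_m = [1, 2, 3, 4, 5]            # x measured in metres (hypothetical)
x_cm = [100 * x for x in x_m]    # the same x values in centimetres
ys = [2, 4, 5, 4, 5]

slope_per_m = slope(x_m, ys)
slope_per_cm = slope(x_cm, ys)
print(slope_per_m, slope_per_cm)
```

The slope per metre is 100 times the slope per centimetre, so a slope should always be read together with its units.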