Lecture 6/7 (BIVARIATE REGRESSION ANALYSIS) Flashcards
REGRESSION ANALYSIS
The process of constructing a mathematical model or function that can be used to predict or determine one variable by another variable.
CORRELATION
A measure of the degree of relatedness of two variables.
COEFFICIENT OF CORRELATION (r)
Applicable only if both variables being analysed have at least an interval level of data.
The term r is a measure of the linear correlation of two variables.
The value of r ranges from -1 through 0 to +1.
The closer r is to +1 or -1, the stronger the linear correlation between the dependent and the independent variables.
r < 0: negative correlation
r > 0: positive correlation
r = 0: no linear correlation
PEARSON PRODUCT MOMENT CORRELATION COEFFICIENT
Formula in booklet.
r = SSxy / sqrt(SSxx * SSyy)
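The formula can be checked by hand on a small data set. The sketch below is only an illustration: the function name pearson_r and the sample values are made up for this example, and SSxx and SSyy are assumed to be the sums of squared deviations of x and y from their means.

```python
import math

def pearson_r(x, y):
    """Return r = SSxy / sqrt(SSxx * SSyy) for two equal-length samples."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    # Sum of cross-products of deviations from the means.
    ss_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    # Sums of squared deviations for x and y.
    ss_xx = sum((xi - x_bar) ** 2 for xi in x)
    ss_yy = sum((yi - y_bar) ** 2 for yi in y)
    return ss_xy / math.sqrt(ss_xx * ss_yy)

# Illustrative data (hypothetical, not from the lecture).
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]
r = pearson_r(x, y)
print(round(r, 4))  # 0.7746, a fairly strong positive linear correlation
```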
BIVARIATE (TWO VARIABLES) LINEAR REGRESSION MODEL
The most elementary regression model.
DEPENDENT VARIABLE
The variable to be predicted, usually Y.
INDEPENDENT VARIABLE
The predictor or explanatory variable. Usually X.
DETERMINISTIC REGRESSION MODEL
y = β0 + β1x
β0 and β1 are population parameters
They are estimated by sample statistics b0 and b1
PROBABILISTIC REGRESSION MODEL
y = β0 + β1x + ε
ε is the error term of the model.
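To see how the probabilistic model differs from the deterministic one, the sketch below generates y values with and without an error term. Everything here is an assumption made for illustration: the function name simulate_y, the parameter values, and the choice to draw the error from a normal distribution with mean 0.

```python
import random

def simulate_y(x_values, beta0, beta1, sigma, seed=0):
    """Generate y = beta0 + beta1*x + error, with error ~ Normal(0, sigma)."""
    rng = random.Random(seed)  # fixed seed so the sketch is repeatable
    return [beta0 + beta1 * x + rng.gauss(0.0, sigma) for x in x_values]

x_values = [1, 2, 3, 4, 5]

# With sigma = 0 the error term vanishes and the model is deterministic.
deterministic = simulate_y(x_values, beta0=2.0, beta1=0.5, sigma=0.0)

# With sigma > 0 each y is scattered around the line beta0 + beta1*x.
noisy = simulate_y(x_values, beta0=2.0, beta1=0.5, sigma=1.0)

print(deterministic)  # [2.5, 3.0, 3.5, 4.0, 4.5]
print(noisy)
```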
EQUATION OF THE SIMPLE REGRESSION LINE
yhat = b0 + b1x
b0 = sample intercept
b1 = sample slope
yhat = predicted value of y
LEAST SQUARES REGRESSION ANALYSIS
A process whereby a regression model is developed by producing the minimum sum of the squared error values.
The vertical distance from each point to the line is the error of prediction.
The least squares regression line is the regression line that results in the smallest sum of squared errors.
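A quick numerical check can make the "smallest sum" idea concrete. This sketch uses made-up data and a hypothetical helper, sum_squared_errors; the line with intercept 2.2 and slope 0.6 is the least squares line for this particular data, and the other two lines are arbitrary nearby alternatives.

```python
def sum_squared_errors(b0, b1, x, y):
    """Sum of squared vertical distances from the points to the line b0 + b1*x."""
    return sum((yi - (b0 + b1 * xi)) ** 2 for xi, yi in zip(x, y))

# Illustrative data (hypothetical, not from the lecture).
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]

# Least squares line for this data: intercept 2.2, slope 0.6.
sse_fit = sum_squared_errors(2.2, 0.6, x, y)

# Two nearby lines with a slightly different intercept or slope.
sse_alt_1 = sum_squared_errors(2.0, 0.7, x, y)
sse_alt_2 = sum_squared_errors(2.2, 0.5, x, y)

print(round(sse_fit, 2), round(sse_alt_1, 2), round(sse_alt_2, 2))  # 2.4 2.55 2.95
```

Both alternative lines give a larger sum of squared errors than the least squares line.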
What is the formula for b1 and b0?
b1 = SSxy / SSxx
b0 = ybar - b1 * xbar
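The two formulas can be applied directly to a small sample. This is a minimal sketch with made-up data; the function name least_squares_line is hypothetical, and SSxx is assumed to be the sum of squared deviations of x from xbar.

```python
def least_squares_line(x, y):
    """Return (b0, b1) using b1 = SSxy / SSxx and b0 = ybar - b1 * xbar."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    ss_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    ss_xx = sum((xi - x_bar) ** 2 for xi in x)
    b1 = ss_xy / ss_xx        # sample slope
    b0 = y_bar - b1 * x_bar   # sample intercept
    return b0, b1

# Illustrative data (hypothetical, not from the lecture).
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]
b0, b1 = least_squares_line(x, y)
print(round(b0, 2), round(b1, 2))  # 2.2 0.6, so yhat = 2.2 + 0.6x
```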
RESIDUAL
The difference between the actual value (y) and the value predicted by the regression model (yhat), that is, y - yhat; the error of the regression model in predicting each value of the dependent variable.
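As a small worked example, residuals can be computed point by point once a line has been fitted. The data and the fitted values b0 = 2.2 and b1 = 0.6 below are assumptions for illustration (they are the least squares estimates for this made-up sample).

```python
# Illustrative data (hypothetical, not from the lecture).
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]

# Assumed fitted line for this data: yhat = 2.2 + 0.6x.
b0, b1 = 2.2, 0.6

# Predicted value for each observed x.
y_hat = [b0 + b1 * xi for xi in x]

# Residual = actual value minus predicted value.
residuals = [yi - yh for yi, yh in zip(y, y_hat)]

print([round(e, 2) for e in residuals])  # [-0.8, 0.6, 1.0, -0.6, -0.2]
print(round(sum(residuals), 2) == 0)     # True
```

For a least squares line with an intercept, the residuals sum to zero apart from floating-point rounding.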
ASSUMPTIONS OF THE SIMPLE REGRESSION ANALYSIS
- The model is linear.
- The error terms have constant variance (homoskedasticity).
- The error terms are independent.
- The error terms are normally distributed.
RESIDUAL PLOT
A graph in which the residuals for a particular regression model are plotted against their associated values of x.
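A rough text version of a residual plot can be produced without any plotting library. This is only a sketch: the helper residual_plot_rows, the data, and the fitted values b0 = 2.2 and b1 = 0.6 are assumptions for illustration. Each row corresponds to one x value, "|" marks a residual of zero, and "*" marks where that observation's residual falls.

```python
def residual_plot_rows(x, residuals, width=21):
    """Return one text row per observation; '|' marks zero, '*' marks the residual."""
    max_abs = max(abs(e) for e in residuals) or 1.0  # avoid dividing by zero
    mid = width // 2
    rows = []
    for xi, e in zip(x, residuals):
        col = mid + round(e / max_abs * mid)  # scale residual to a column index
        cells = [" "] * width
        cells[mid] = "|"
        cells[col] = "*"
        rows.append(f"x={xi:>3}  {''.join(cells)}  e={e:+.2f}")
    return rows

# Illustrative data and an assumed fitted line yhat = 2.2 + 0.6x.
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]
b0, b1 = 2.2, 0.6
residuals = [yi - (b0 + b1 * xi) for xi, yi in zip(x, y)]

rows = residual_plot_rows(x, residuals)
for row in rows:
    print(row)
```

Residuals scattered on both sides of the zero mark with no obvious pattern are consistent with the regression assumptions; a curve or a funnel shape would suggest a problem with linearity or constant variance.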