Lecture 6/7 (BIVARIATE REGRESSION ANALYSIS) Flashcards

1
Q

REGRESSION ANALYSIS

A

The process of constructing a mathematical model or function that can be used to predict or determine one variable by another variable.

2
Q

CORRELATION

A

A measure of the degree of relatedness of two variables.

3
Q

COEFFICIENT OF CORRELATION (r)

A

Applicable only if both variables being analysed have at least an interval level of data.

The term r is a measure of the linear correlation of two variables.
The value of r ranges from -1 through 0 to +1.
The closer r is to +1 or -1, the stronger the linear relationship between the two variables.
r < 0: negative correlation
r > 0: positive correlation
r = 0: no linear correlation

4
Q

PEARSON PRODUCT MOMENT CORRELATION COEFFICIENT

A

Formula in booklet.

r = SSxy / sqrt(SSxx * SSyy)
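
The booklet formulas for the sums of squares are not reproduced on this card, so the sketch below assumes the standard definitions: SSxy = Σ(x - xbar)(y - ybar), SSxx = Σ(x - xbar)^2 and SSyy = Σ(y - ybar)^2. The function name `pearson_r` and the five data points are made up for illustration.

```python
import math

def pearson_r(x, y):
    """Pearson r from the sums of squares (standard definitions assumed)."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    ss_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    ss_xx = sum((xi - x_bar) ** 2 for xi in x)
    ss_yy = sum((yi - y_bar) ** 2 for yi in y)
    return ss_xy / math.sqrt(ss_xx * ss_yy)

# Hypothetical sample, not taken from the lecture
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]
print(round(pearson_r(x, y), 4))  # 0.7746
```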

5
Q

BIVARIATE (TWO VARIABLES) LINEAR REGRESSION MODEL

A

The most elementary regression model.

6
Q

DEPENDENT VARIABLE

A

The variable to be predicted, usually Y.

7
Q

INDEPENDENT VARIABLE

A

The predictor or explanatory variable. Usually X.

8
Q

DETERMINISTIC REGRESSION MODEL

A

y = β0 + β1x
β0 and β1 are population parameters
They are estimated by sample statistics b0 and b1

9
Q

PROBABILISTIC REGRESSION MODEL

A

y = β0 + β1x + ε
ε = the error term of the model

10
Q

EQUATION OF THE SIMPLE REGRESSION LINE

A

yhat = b0 + b1x

b0 = sample intercept
b1 = sample slope
yhat = predicted value of y
11
Q

LEAST SQUARES REGRESSION ANALYSIS

A

A process whereby a regression model is developed by producing the minimum sum of the squared error values.
The vertical distance from each point to the line is the error of prediction.
The least squares regression line is the regression line that results in the smallest sum of errors squared.
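
As a rough check of the "smallest sum of errors squared" idea, the sketch below compares the sum of squared errors of a least squares line with two slightly shifted lines. The data are hypothetical, and the coefficients 2.2 and 0.6 are the least squares values for this particular sample; the helper name `sse` is an assumption, not lecture notation.

```python
def sse(x, y, b0, b1):
    """Sum of squared vertical distances from each point to the line b0 + b1*x."""
    return sum((yi - (b0 + b1 * xi)) ** 2 for xi, yi in zip(x, y))

# Hypothetical sample, not taken from the lecture
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]

best = sse(x, y, 2.2, 0.6)           # least squares line for this sample
steeper = sse(x, y, 2.2, 0.7)        # same intercept, slightly larger slope
lower = sse(x, y, 2.0, 0.6)          # same slope, slightly smaller intercept

print(round(best, 2), round(steeper, 2), round(lower, 2))  # 2.4 2.95 2.6
```

Any other line tried on the same data gives a larger sum than the least squares line.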

12
Q

What is the formula for b1 and b0?

A
b1 = SSxy/SSxx
b0 = ybar - b1 * xbar
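
A minimal sketch of these two formulas, assuming the standard definitions SSxy = Σ(x - xbar)(y - ybar) and SSxx = Σ(x - xbar)^2 (the booklet is not shown here). The function name `fit_line` and the data are hypothetical.

```python
def fit_line(x, y):
    """Return (b0, b1) using b1 = SSxy/SSxx and b0 = ybar - b1 * xbar."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    ss_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    ss_xx = sum((xi - x_bar) ** 2 for xi in x)
    b1 = ss_xy / ss_xx
    b0 = y_bar - b1 * x_bar
    return b0, b1

# Hypothetical sample, not taken from the lecture
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]
b0, b1 = fit_line(x, y)
print(round(b0, 4), round(b1, 4))  # 2.2 0.6
```
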
13
Q

RESIDUAL

A

The difference between the actual value and the value predicted by the regression model (y-hat); the error of the regression model in predicting each value of the dependent variable.
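
For illustration, the residuals of a fitted line can be listed point by point. The sketch assumes a hypothetical sample and its least squares line yhat = 2.2 + 0.6x; the function name `residuals` is not from the lecture.

```python
def residuals(x, y, b0, b1):
    """Actual y minus predicted yhat = b0 + b1*x, one residual per point."""
    return [yi - (b0 + b1 * xi) for xi, yi in zip(x, y)]

# Hypothetical sample and its least squares coefficients
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]
res = residuals(x, y, 2.2, 0.6)
print([round(e, 4) for e in res])  # [-0.8, 0.6, 1.0, -0.6, -0.2]
```

For a least squares line the residuals sum to zero, apart from floating-point rounding.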

14
Q

ASSUMPTIONS OF THE SIMPLE REGRESSION ANALYSIS

A
  1. The model is linear.
  2. The error terms have constant variance (homoscedasticity).
  3. The error terms are independent.
  4. The error terms are normally distributed.
15
Q

RESIDUAL PLOT

A

A graph in which the residuals for a particular regression model are plotted along with their associated value of x.

16
Q

STANDARD ERROR OF THE ESTIMATE

A

Residuals represent errors of estimation for individual points.
A more useful measurement of error is the standard error of the estimate.
The standard error of the estimate, denoted Se, is a standard deviation of the error of the regression model.

Se = sqrt(SSE / (n - 2))
SSE formula in booklet
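
A short sketch of the calculation, assuming SSE is the sum of squared residuals Σ(y - yhat)^2 (the booklet formula is not shown on this card). The data, the coefficients 2.2 and 0.6, and the name `standard_error_of_estimate` are hypothetical.

```python
import math

def standard_error_of_estimate(x, y, b0, b1):
    """Se = sqrt(SSE / (n - 2)), with SSE the sum of squared residuals."""
    n = len(x)
    sse = sum((yi - (b0 + b1 * xi)) ** 2 for xi, yi in zip(x, y))
    return math.sqrt(sse / (n - 2))

# Hypothetical sample and its least squares coefficients
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]
se = standard_error_of_estimate(x, y, 2.2, 0.6)
print(round(se, 4))  # 0.8944
```

Note that the divisor is n - 2, not n: the whole quantity n - 2 sits under SSE.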

17
Q

COEFFICIENT OF DETERMINATION

A

r^2
Is the proportion of variability of the dependent variable (y) accounted for or explained by the independent variable (x).
Ranges from 0 to 1.
An r^2 of zero means that the predictor accounts for none of the variability of the dependent variable and that there is no regression prediction of y by x.
An r^2 of 1 means perfect prediction of y by x and that 100% of the variability of y is accounted for by x.

18
Q

FORMULA FOR COEFFICIENT OF DETERMINATION

A

SSyy = explained variation + unexplained variation
SSyy = SSR + SSE
r^2 = SSR/SSyy = 1 - SSE/SSyy
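
The decomposition can be checked numerically. The sketch assumes SSR = Σ(yhat - ybar)^2, SSE = Σ(y - yhat)^2 and SSyy = Σ(y - ybar)^2, and uses a hypothetical sample with its least squares line yhat = 2.2 + 0.6x.

```python
# Hypothetical sample and its least squares coefficients
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]
b0, b1 = 2.2, 0.6

y_bar = sum(y) / len(y)
y_hat = [b0 + b1 * xi for xi in x]

ss_yy = sum((yi - y_bar) ** 2 for yi in y)              # total variation
ssr = sum((yh - y_bar) ** 2 for yh in y_hat)            # explained variation
sse = sum((yi - yh) ** 2 for yi, yh in zip(y, y_hat))   # unexplained variation

r2_from_ssr = ssr / ss_yy
r2_from_sse = 1 - sse / ss_yy
print(round(ssr, 2), round(sse, 2), round(ss_yy, 2))    # 3.6 2.4 6.0
print(round(r2_from_ssr, 4), round(r2_from_sse, 4))     # 0.6 0.6
```

Both forms give the same r^2 because SSR + SSE = SSyy for a least squares line.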

19
Q

HYPOTHESIS TESTS FOR THE SLOPE OF THE REGRESSION MODEL

A

A hypothesis test can be conducted on the sample slope of the regression model to determine whether the population slope is significantly different from zero.

Using the non-regression model (the ybar model) as a worst case, the researcher can analyse the regression line to determine whether it adds a significantly greater amount of predictability of y than the ybar model does.
As the slope of the regression line diverges from zero, the regression model adds predictability that the ybar line does not generate.
Testing whether the slope of the regression line differs from zero is therefore important.
If the slope is not different from zero, the regression line does nothing more than the ybar line (the average of y) in predicting y.

20
Q

What is the hypothesis test for the slope of the regression model?

A
H0 : β1 = 0
H1 : β1 ≠ 0 (two-tailed)
OR
H0 : β1 = 0
H1 : β1 > 0 (upper-tailed)
OR
H0 : β1 = 0
H1 : β1 < 0 (lower-tailed)
t test:
t = (b1 - β1) / Sb
WHERE:
Sb = Se / sqrt(SSxx)
Se = sqrt(SSE / (n - 2))
β1 = the hypothesised slope
df = n - 2
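
A sketch of the test statistic for H0 : β1 = 0, assuming the standard definitions SSxy = Σ(x - xbar)(y - ybar), SSxx = Σ(x - xbar)^2 and SSE = Σ(y - yhat)^2. The function name `slope_t_test` and the data are hypothetical; the p-value or critical value would still come from a t table or statistics package.

```python
import math

def slope_t_test(x, y, beta1_hyp=0.0):
    """Return (t, df) for testing the slope against a hypothesised value."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    ss_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    ss_xx = sum((xi - x_bar) ** 2 for xi in x)
    b1 = ss_xy / ss_xx
    b0 = y_bar - b1 * x_bar
    sse = sum((yi - (b0 + b1 * xi)) ** 2 for xi, yi in zip(x, y))
    se = math.sqrt(sse / (n - 2))      # standard error of the estimate
    sb = se / math.sqrt(ss_xx)         # standard error of the slope
    t = (b1 - beta1_hyp) / sb
    return t, n - 2

# Hypothetical sample, not taken from the lecture
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]
t, df = slope_t_test(x, y)
print(round(t, 4), df)  # 2.1213 3
```

With df = 3, the two-tailed critical value at the 5% level is about 3.182, so this small made-up sample would not reject H0.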