Lecture 8 (MULTIVARIATE REGRESSION ANALYSIS) Flashcards

1
Q

MULTIPLE REGRESSION ANALYSIS

A

Regression analysis with two or more independent variables or with at least one nonlinear predictor.

2
Q

PROBABILISTIC MULTIPLE REGRESSION MODEL

A

y = β0 + β1x1 + β2x2 + β3x3 + … + βkxk + ε

y = the value of the dependent (response) variable
β0 = the regression constant
β1 = the partial regression coefficient of independent variable 1
β2 = the partial regression coefficient of independent variable 2
βk = the partial regression coefficient of independent variable k
k = the number of independent variables
ε = the error of prediction
3
Q

What is the dependent variable y sometimes referred to as?

A

The response variable

4
Q

What does the partial regression coefficient of an independent variable, βi, represent?

A

The change in the value of y that results from a one-unit increase in that independent variable when all other independent variables are held constant.
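As a small illustration of this interpretation, the sketch below uses made-up coefficient values (assumptions, not figures from the lecture) and checks that raising x1 by one unit while x2 stays fixed changes the model value by exactly β1.

```python
def model_mean(beta0, beta1, beta2, x1, x2):
    """Deterministic part of a first-order model with two predictors."""
    return beta0 + beta1 * x1 + beta2 * x2

# Hypothetical coefficients, chosen only for illustration.
beta0, beta1, beta2 = 10.0, 2.5, -1.0

base = model_mean(beta0, beta1, beta2, x1=4.0, x2=3.0)
# x1 goes up by one unit; x2 is held constant.
bumped = model_mean(beta0, beta1, beta2, x1=5.0, x2=3.0)

change = bumped - base  # equal to beta1
print(change)
```

A negative coefficient such as the hypothetical beta2 here would give a decrease rather than an increase under the same experiment.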

5
Q

ESTIMATED REGRESSION MODEL

A

y-hat = b0 + b1x1 + b2x2 + b3x3 + … + bkxk
where:
y-hat = predicted value of y
b0 = estimate of regression constant
b1 = estimate of regression coefficient 1
b2 = estimate of regression coefficient 2
b3 = estimate of regression coefficient 3
bk = estimate of regression coefficient k
k = number of independent variables

6
Q

MULTIPLE REGRESSION MODEL WITH TWO INDEPENDENT VARIABLES (FIRST ORDER)

A

The simplest multiple regression model is one constructed with two independent variables, where the highest power of either variable is 1 (first order regression model).

In multiple regression analysis, the resulting model produces a response surface.
POPULATION MODEL:
y = β0 + β1x1 + β2x2 + ε
ESTIMATED MODEL:
y-hat = b0 + b1x1 + b2x2
7
Q

RESPONSE PLANE FOR FIRST ORDER TWO PREDICTOR MULTIPLE REGRESSION MODEL

A

In multiple regression analysis, the resulting model produces a response surface.
For a first-order model with two independent variables, the response surface is a response plane.
The plane is fitted in three-dimensional space (x1, x2, y).

8
Q

DETERMINING THE MULTIPLE REGRESSION EQUATION

A

The simple regression equations for the sample slope and intercept given in earlier material result from using methods of calculus to minimise the sum of squares of error for the regression model.
The multiple regression formulas are established to meet the same objective: minimising the sum of squares of error for the model.
For that reason the analysis is referred to as least squares analysis. Applying calculus gives k+1 equations in k+1 unknowns for a multiple regression analysis with k independent variables.

9
Q

LEAST SQUARES EQUATIONS FOR k = 2

A

b0n + b1Σx1 + b2Σx2 = Σy
b0Σx1 + b1Σx1^2 + b2Σx1x2 = Σx1y
b0Σx2 + b1Σx1x2 + b2Σx2^2 = Σx2y
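The three equations above form a linear system in b0, b1 and b2, so they can be solved directly. The following is a minimal sketch using NumPy; the data values are invented for illustration, and the cross-check against `np.linalg.lstsq` is there only to confirm that both routes give the same coefficients.

```python
import numpy as np

# Hypothetical data: six observations of two predictors and a response.
x1 = np.array([1.0, 2.0, 3.0, 4.0, 5.0, 6.0])
x2 = np.array([2.0, 1.0, 4.0, 3.0, 6.0, 5.0])
y = np.array([5.1, 5.9, 10.2, 10.8, 15.1, 15.9])
n = len(y)

# Left-hand coefficients and right-hand sides of the three normal equations.
A = np.array([
    [n,        x1.sum(),        x2.sum()],
    [x1.sum(), (x1 ** 2).sum(), (x1 * x2).sum()],
    [x2.sum(), (x1 * x2).sum(), (x2 ** 2).sum()],
])
rhs = np.array([y.sum(), (x1 * y).sum(), (x2 * y).sum()])

# Solve the 3x3 system for the coefficient estimates.
b0, b1, b2 = np.linalg.solve(A, rhs)

# Cross-check with NumPy's least squares routine on the design matrix.
X = np.column_stack([np.ones(n), x1, x2])
b_lstsq, *_ = np.linalg.lstsq(X, y, rcond=None)

print(b0, b1, b2)
```

In practice a least squares routine is usually preferred over forming the normal equations by hand, since it tends to behave better numerically when predictors are strongly correlated.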

10
Q

TESTING THE OVERALL MULTIPLE REGRESSION MODEL

A

H0 : β1 = β2 = β3 = … = βk = 0

Ha : At least one of the regression coefficients is ≠ 0

11
Q

SIGNIFICANCE TESTS FOR INDIVIDUAL REGRESSION COEFFICIENTS

A
H0 : β1 = 0
Ha : β1 ≠ 0
H0 : β2 = 0
Ha : β2 ≠ 0
H0 : β3 = 0
Ha : β3 ≠ 0
H0 : βk = 0
Ha : βk ≠ 0
12
Q

What does it mean if you fail to reject the null hypothesis of a regression model?

A

Failing to reject the null hypothesis amounts to stating that the regression model has no significant predictability for the dependent variable.
Rejecting the null hypothesis indicates that at least one of the independent variables adds significant predictability for y.

13
Q

F-VALUE

A

F = MSR / MSE, where MSR is the mean square for regression and MSE is the mean square for error (the residual mean square).
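The ratio can be sketched in a few lines. The card gives only MSR / MSE; the degrees of freedom used below (k for regression, n - k - 1 for error) are the standard ANOVA definitions rather than something stated on the card, and the sums of squares are hypothetical numbers.

```python
def f_value(ssr, sse, n, k):
    """F statistic for the overall regression model."""
    msr = ssr / k              # mean square regression, k degrees of freedom
    mse = sse / (n - k - 1)    # mean square error, n - k - 1 degrees of freedom
    return msr / mse

# Hypothetical sums of squares for a model with k = 2 predictors and n = 15 observations.
f = f_value(ssr=80.0, sse=20.0, n=15, k=2)
print(f)
```

The resulting value would then be compared with a critical F value for (k, n - k - 1) degrees of freedom, or converted to a p-value, to decide whether to reject H0.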

14
Q

How are residuals calculated in multiple regression analysis?

A

First, a predicted value, y-hat, is determined by entering the values of the independent variables for a given observation into the multiple regression equation and solving for y-hat.
Then the residual for that observation is the observed value minus the predicted value, y - y-hat.
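The two steps can be sketched with NumPy as follows. The data are invented, and the model is fitted with `np.linalg.lstsq` purely for convenience; any fitted coefficients b0, b1, b2 would be used the same way.

```python
import numpy as np

# Hypothetical data: six observations of two predictors and a response.
x1 = np.array([1.0, 2.0, 3.0, 4.0, 5.0, 6.0])
x2 = np.array([2.0, 1.0, 4.0, 3.0, 6.0, 5.0])
y = np.array([5.1, 5.9, 10.2, 10.8, 15.1, 15.9])

# Fit y-hat = b0 + b1*x1 + b2*x2 by least squares.
X = np.column_stack([np.ones(len(y)), x1, x2])
b, *_ = np.linalg.lstsq(X, y, rcond=None)

# Step 1: predicted value for every observation.
y_hat = X @ b

# Step 2: residual = observed minus predicted.
residuals = y - y_hat

# Index of the observation with the largest absolute residual,
# a natural first candidate to inspect when looking for outliers.
largest = int(np.argmax(np.abs(residuals)))
print(residuals, largest)
```

Because the model includes a constant term, the residuals from a least squares fit sum to zero up to rounding error, which is a quick sanity check on the calculation.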

15
Q

What are residuals useful for?

A

Helpful in locating outliers.

16
Q

OUTLIERS

A

Data points that lie apart, or far, from the mainstream of the other data.
They are sometimes data points that were mistakenly recorded or measured.
Because every data point influences the regression model, outliers can exert an overly important influence on the model based on their distance from the other points.

17
Q

What is Se in multiple regression analysis?

A
Se = sqrt(SSE / (n - k - 1))
n = number of observations
k = number of independent variables
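A minimal sketch of this formula, with hypothetical values for SSE, n and k:

```python
import math

def standard_error_of_estimate(sse, n, k):
    """Se = sqrt(SSE / (n - k - 1))."""
    return math.sqrt(sse / (n - k - 1))

# Hypothetical inputs: SSE = 20 from a model with 2 predictors and 15 observations.
se = standard_error_of_estimate(sse=20.0, n=15, k=2)
print(se)
```

Note that the whole quantity n - k - 1 is the divisor; dividing SSE by n alone and then subtracting k and 1 would give a different, incorrect result.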
18
Q

COEFFICIENT OF MULTIPLE DETERMINATION (R^2)

A

Analogous to the coefficient of determination (r^2) in simple regression.
R^2 represents the proportion of variation of the dependent variable, y, accounted for by the independent variables in the regression model.
Ranges between 0 and 1.
R^2 = SSR / SSyy
R^2 = 1 - SSE / SSyy
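The two expressions agree whenever the total variation splits as SSyy = SSR + SSE, which holds for a least squares fit with a constant term. The sketch below checks this with hypothetical sums of squares.

```python
def r_squared_from_ssr(ssr, ss_yy):
    """R^2 as the explained share of total variation."""
    return ssr / ss_yy

def r_squared_from_sse(sse, ss_yy):
    """R^2 as one minus the unexplained share of total variation."""
    return 1 - sse / ss_yy

# Hypothetical sums of squares with SSyy = SSR + SSE.
ss_yy = 100.0
sse = 20.0
ssr = ss_yy - sse

r2_a = r_squared_from_ssr(ssr, ss_yy)
r2_b = r_squared_from_sse(sse, ss_yy)
print(r2_a, r2_b)
```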

19
Q

ADJUSTED R^2

A

Sometimes additional independent variables add no significant information to the regression model, yet R^2 increases. R^2 therefore may yield an inflated figure.
Statisticians have developed an adjusted R^2 to take into consideration both the additional information each new independent variable brings to the regression model and the changed degrees of freedom of regression.

20
Q

adj. R^2

A

adj. R^2 = 1 - [SSE / (n - k - 1)] / [SSyy / (n - 1)]
n = number of observations
k = number of independent variables
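As a worked sketch of the formula, the function below takes SSE, the total sum of squares, n and k; the input numbers are hypothetical and chosen so that the unadjusted R^2 would be 0.8.

```python
def adjusted_r_squared(sse, ss_yy, n, k):
    """Adjusted R^2: both sums of squares are divided by their degrees of freedom."""
    return 1 - (sse / (n - k - 1)) / (ss_yy / (n - 1))

# Hypothetical inputs: SSE = 20, total sum of squares = 100, n = 15, k = 2.
adj_r2 = adjusted_r_squared(sse=20.0, ss_yy=100.0, n=15, k=2)
plain_r2 = 1 - 20.0 / 100.0

print(plain_r2, adj_r2)
```

With these inputs the adjusted figure comes out below the unadjusted one, which reflects the penalty for the degrees of freedom used by the two predictors.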