Simple Linear Regression Flashcards

1
Q

Regression analysis

A

Regression analysis is used to:
predict the value of a dependent variable (Y) based on the value of at least one independent variable (X)
explain the impact of changes in an independent variable on the dependent variable

2
Q

Dependent variable (Y)

A

Dependent variable (Y): the variable we wish to predict or explain (response variable)

3
Q

Independent variable (X)

A

Independent variable (X): the variable used to explain the dependent variable (explanatory variable)

4
Q

Simple linear regression

A

Only one independent variable, X
Relationship between X and Y is described by a linear function
Changes in Y are assumed to be caused by changes in X

5
Q

b0 and b1

A

b0 and b1 are the values that minimise the sum of the squared differences between the actual values (Y) and the predicted values (Ŷ)
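
A minimal sketch of the least-squares calculation, using the standard closed-form solution for one independent variable. The function name and the toy data are illustrative assumptions, not part of the original card.

```python
def fit_simple_linear_regression(x, y):
    """Return (b0, b1) that minimise the sum of squared errors for y = b0 + b1*x."""
    n = len(x)
    x_mean = sum(x) / n
    y_mean = sum(y) / n
    # Sum of cross-products and sum of squares of X about its mean
    sxy = sum((xi - x_mean) * (yi - y_mean) for xi, yi in zip(x, y))
    sxx = sum((xi - x_mean) ** 2 for xi in x)
    b1 = sxy / sxx               # slope
    b0 = y_mean - b1 * x_mean    # intercept
    return b0, b1


# Hypothetical toy data lying exactly on the line Y = 1 + 2X
x = [1, 2, 3, 4, 5]
y = [3, 5, 7, 9, 11]
b0, b1 = fit_simple_linear_regression(x, y)
print(b0, b1)
```

Because the toy points fall exactly on a line, the fitted intercept and slope recover that line; with real data the fit would leave non-zero errors.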

6
Q

b0

A

b0 is the estimated average value of Y when the value of X is zero

7
Q

b1

A

b1 is the estimated change in the average value of Y as a result of a one-unit change in X

8
Q

SST

A

Total Sum of Squares

Measures the variation of the Yi values around their mean, Ȳ

9
Q

SSR

A

Regression Sum of Squares

Explained variation attributable to the relationship between X and Y

10
Q

SSE

A

Error Sum of Squares
Variation attributable to factors other than the relationship between X and Y
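
The three sums of squares (SST, SSR, SSE) can be sketched as follows. This is an illustration under assumptions: the function name and the toy data are hypothetical, and the line is fitted by ordinary least squares with an intercept, in which case SST = SSR + SSE.

```python
def sums_of_squares(x, y):
    """Fit Y on X by least squares and return (SST, SSR, SSE)."""
    n = len(x)
    x_mean = sum(x) / n
    y_mean = sum(y) / n
    b1 = (sum((xi - x_mean) * (yi - y_mean) for xi, yi in zip(x, y))
          / sum((xi - x_mean) ** 2 for xi in x))
    b0 = y_mean - b1 * x_mean
    y_hat = [b0 + b1 * xi for xi in x]                      # predicted values
    sst = sum((yi - y_mean) ** 2 for yi in y)               # total variation
    ssr = sum((yh - y_mean) ** 2 for yh in y_hat)           # explained variation
    sse = sum((yi - yh) ** 2 for yi, yh in zip(y, y_hat))   # unexplained variation
    return sst, ssr, sse


# Hypothetical toy data
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]
sst, ssr, sse = sums_of_squares(x, y)
print(sst, ssr, sse)
```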

11
Q

Coefficient of determination, r²

A

The coefficient of determination is the proportion of the total variation in the dependent variable that is explained by variation in the independent variable
It is also called r-squared and is denoted r²
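
Written in terms of the sums of squares from the SST, SSR and SSE cards, and assuming a least-squares fit with an intercept (so that SST = SSR + SSE), the definition can be sketched as:

```latex
r^2 = \frac{SSR}{SST} = 1 - \frac{SSE}{SST}, \qquad 0 \le r^2 \le 1
```

A value near 1 means most of the variation in Y is explained by X; a value near 0 means very little is.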

12
Q

Assumptions of regression

A

Linearity
Independence of errors
Normality of errors
Equal variance

13
Q

Linearity

A

The underlying relationship between X and Y is linear

14
Q

Independence of errors

A

Error values are statistically independent

15
Q

Normality of errors

A

Error values (ε) are normally distributed for any given value of X

16
Q

Equal variance

A

The probability distribution of the errors has constant variance

17
Q

Residual for observation i

A

The residual for observation i, ei, is the difference between its observed value (Yi) and its predicted value (Ŷi)
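
A short sketch of the residual calculation, ei = Yi − Ŷi. The function name, the toy data, and the coefficient values are assumptions for illustration; b0 = 2.2 and b1 = 0.6 are the least-squares estimates for this particular toy data set.

```python
def residuals(x, y, b0, b1):
    """Return the list of residuals e_i = y_i - (b0 + b1 * x_i)."""
    return [yi - (b0 + b1 * xi) for xi, yi in zip(x, y)]


# Hypothetical toy data with its least-squares coefficients
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]
e = residuals(x, y, b0=2.2, b1=0.6)
print(e)
print(sum(e))  # close to zero for a least-squares fit with an intercept
```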

18
Q

Check the assumptions of regression by examining the residuals:

A

Examine for linearity assumption
Evaluate independence assumption
Evaluate normal distribution assumption
Examine for constant variance for all levels of X (homoscedasticity)

Graphical Analysis of Residuals
Plot the residuals against X

19
Q

Pitfalls of regression analysis

A

Lacking an awareness of the assumptions underlying least-squares regression

Not knowing how to evaluate the assumptions

Not knowing the alternatives to least-squares regression if a particular assumption is violated

Using a regression model without knowledge of the subject matter

Extrapolating outside the relevant range

Concluding that a significant relationship in an observational study is due to a cause-and-effect relationship
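
One way to guard against the extrapolation pitfall listed above is to refuse predictions outside the range of X used to fit the model. The sketch below is an assumption-laden illustration: the function name, the coefficients, and the data are hypothetical.

```python
def predict_within_range(x_new, x_observed, b0, b1):
    """Predict Y at x_new, but only inside the observed (relevant) range of X."""
    lo, hi = min(x_observed), max(x_observed)
    if not lo <= x_new <= hi:
        raise ValueError(
            f"x = {x_new} is outside the relevant range [{lo}, {hi}]; "
            "predicting here would be extrapolation"
        )
    return b0 + b1 * x_new


# Hypothetical fitted model and observed X values
x_observed = [1, 2, 3, 4, 5]
y_inside = predict_within_range(3.5, x_observed, b0=1.0, b1=2.0)

try:
    predict_within_range(10, x_observed, b0=1.0, b1=2.0)
    extrapolation_blocked = False
except ValueError:
    extrapolation_blocked = True
```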

20
Q

Types of relationships

A

1-2

21
Q

Equation

A

3

22
Q

Equation explained

A

4

23
Q

Sample equation and least squares method

A

5-6

24
Q

Example 1

A

7-12

25
Q

Interpolation vs. extrapolation

A

13

26
Q

Measures of variation

A

14-15

27
Q

r-squared

A

16-19

28
Q

Standard error

A

20-22

29
Q

Residual analysis

A

23-28

30
Q

Slope inferences

A

29-31

31
Q

t test

A

32-34

32
Q

F test

A

35-37

33
Q

Confidence interval

A

38