LEC 11a Simple Linear Regression Flashcards

1
Q

How do you decide on the appropriate type of regression?

A

Depends on the type of dependent variable

  1. Continuous variable
    - linear regression
  2. Ordinal variable
    - ordinal regression
  3. Nominal variable
    - logistic regression
2
Q

Simple vs Multiple regression

A

Applies whether the dependent variable is continuous, ordinal or nominal

Simple regression
- only 1 independent variable

Multiple regression
- more than 1 independent variable

3
Q

Correlation vs Simple linear regression (2)

  • definition
  • symmetry
A

Correlation

  • quantifies the degree to which 2 random variables are related, provided that the relationship is linear
  • makes no distinction between the 2 variables (symmetrical)

Simple linear regression

  • determines the best-fitting straight line for a dataset to investigate the change in 1 variable (dependent variable Y) that corresponds to a given change in the other variable (independent variable X), provided that there is significant correlation
  • X and Y are asymmetrical
4
Q

Applications of simple linear regression (2)

A
  1. Describe the linear relationship between the 2 variables
  2. Predict or estimate the value of Y associated with a fixed value of X

5
Q

Can values of Y be extrapolated beyond the observed range of X?

A

No. Do not extrapolate beyond the observed range of X, as the relationship between X and Y may not be linear outside that range
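
A minimal sketch of how this rule could be enforced in code. The helper name `predict` and its parameters `x_min` and `x_max` (the smallest and largest observed X) are hypothetical, chosen only for illustration.

```python
def predict(alpha, beta, x, x_min, x_max):
    """Predict the mean of Y at x, refusing to extrapolate.

    alpha, beta: fitted intercept and slope.
    x_min, x_max: observed range of X in the data used for the fit
    (hypothetical parameter names).
    """
    if not (x_min <= x <= x_max):
        raise ValueError(
            f"x={x} is outside the observed range [{x_min}, {x_max}]"
        )
    return alpha + beta * x
```

Calling `predict(2.2, 0.6, 3, 1, 5)` returns a prediction, while `predict(2.2, 0.6, 10, 1, 5)` raises `ValueError` because 10 lies outside the observed range.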

6
Q

Simple linear regression model

A

Y = alpha + beta(X)

alpha = y-intercept
beta = slope
7
Q

Alpha meaning

A

Mean value of Y when X=0

8
Q

Beta meaning

A

The change in the mean value of Y that corresponds to a one-unit change in X

9
Q

Does linear regression test for a linear relationship between the 2 variables?

A

No.
- it assumes a linear relationship
- it finds the best-fitting straight line, defined by its y-intercept and slope
Hence, always draw a scatter plot first to check whether the relationship is linear

Linear relationship : linear regression
Non-linear relationship : non-linear regression

10
Q

Scatter plot to determine use of linear regression

A

Scatter plot must suggest :

  1. Linear relationship
  2. Significant correlation
11
Q

Assumptions of simple linear regression model (4)

A
  1. There is a linear relationship between the variables
  2. The observations are independent of one another
  3. For any specified value of X, the distribution of the Y values is normal
  4. For any specified value of X, the variance of the Y values is the same (equal variance)
12
Q

How to determine the best-fitting straight line?

A

Method of Least Squares

- best-fitting line = line with the smallest residual sum of squares
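
As an illustration, the least-squares estimates for a single predictor have a closed form: slope = Sxy / Sxx and intercept = mean(Y) - slope * mean(X). The sketch below uses only the standard library; the function name `least_squares_fit` and the data are hypothetical.

```python
def least_squares_fit(xs, ys):
    """Return (alpha, beta) that minimise the residual sum of squares."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of cross-products and sum of squares about the means
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    beta = sxy / sxx                 # slope
    alpha = mean_y - beta * mean_x   # y-intercept
    return alpha, beta


# Hypothetical data, for illustration only
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]
alpha, beta = least_squares_fit(xs, ys)
print(f"Y = {alpha:.1f} + {beta:.1f}(X)")
```

For this data the fitted line is Y = 2.2 + 0.6(X).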

13
Q

Residual plot

A

Residuals plotted against the predicted (fitted) Y values

If the model assumptions hold, the residuals are randomly scattered above and below the line ei = 0
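
A short sketch of how the residuals for such a plot could be computed. The data are hypothetical, and `alpha` and `beta` are assumed to be the least-squares estimates for that data. Each residual is the observed Y minus the fitted Y.

```python
# Hypothetical data; alpha and beta are the least-squares estimates for it
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]
alpha, beta = 2.2, 0.6

fitted = [alpha + beta * x for x in xs]
residuals = [y - f for y, f in zip(ys, fitted)]

# Text version of the plot: fitted value, residual, and which side of e = 0
for f, e in zip(fitted, residuals):
    side = "above" if e > 0 else "below"
    print(f"fitted={f:.1f}  residual={e:+.1f}  ({side} e = 0)")
```

With a least-squares fit that includes an intercept, the residuals always sum to zero, so a pattern in the plot (a curve or a funnel shape) is what signals a violated assumption, not the average level.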

14
Q

Hypothesis test for beta (slope)

H0 & H1

A

Two-tailed test

H0 :

  • there is no effect of the independent variable X on the dependent variable Y
  • beta = 0
  • equivalent to testing correlation = 0

H1 :

  • there is an effect of the independent variable X on the dependent variable Y
  • beta ≠ 0
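
A sketch, on hypothetical data, of why testing beta = 0 is equivalent to testing correlation = 0: the t statistic for the slope, beta / SE(beta), equals the t statistic for the Pearson correlation, r * sqrt(n - 2) / sqrt(1 - r^2). Both are compared against a t distribution with n - 2 degrees of freedom. The p-value lookup is left out to keep the example standard-library only.

```python
import math

# Hypothetical data, for illustration only
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]
n = len(xs)

mean_x = sum(xs) / n
mean_y = sum(ys) / n
sxx = sum((x - mean_x) ** 2 for x in xs)
syy = sum((y - mean_y) ** 2 for y in ys)
sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))

# Least-squares estimates
beta = sxy / sxx
alpha = mean_y - beta * mean_x

# Residual sum of squares and standard error of the slope
sse = sum((y - (alpha + beta * x)) ** 2 for x, y in zip(xs, ys))
se_beta = math.sqrt(sse / (n - 2) / sxx)
t_slope = beta / se_beta

# t statistic for H0: correlation = 0
r = sxy / math.sqrt(sxx * syy)
t_corr = r * math.sqrt(n - 2) / math.sqrt(1 - r ** 2)

print(f"t (slope) = {t_slope:.4f}, t (correlation) = {t_corr:.4f}, df = {n - 2}")
```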
15
Q

Hypothesis test for alpha (constant)

A

Seldom done, as the intercept is rarely of interest

Two-tailed test

H0 :
- alpha = 0

H1 :
- alpha ≠ 0

16
Q

Evaluation of the goodness-of-fit regression model

A

Coefficient of determination (R^2)

In simple linear regression, R^2 = r^2,
r = Pearson product-moment correlation coefficient

17
Q

R^2

  • meaning
  • range
A
  • the proportion of the variability among the observed values of Y that is explained by the linear regression of Y on X
  • range : 0 ≤ R^2 ≤ 1
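
A small sketch, on hypothetical data, that computes R^2 as 1 - (residual sum of squares / total sum of squares) and checks that it matches the square of the Pearson correlation coefficient r, as expected in simple linear regression. The function names are illustrative only.

```python
import math


def pearson_r(xs, ys):
    """Pearson product-moment correlation coefficient."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


def r_squared(xs, ys):
    """Proportion of the variability in Y explained by the fitted line."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    beta = sxy / sxx
    alpha = mean_y - beta * mean_x
    ss_residual = sum((y - (alpha + beta * x)) ** 2 for x, y in zip(xs, ys))
    ss_total = sum((y - mean_y) ** 2 for y in ys)
    return 1 - ss_residual / ss_total


# Hypothetical data, for illustration only
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]
r2 = r_squared(xs, ys)
r = pearson_r(xs, ys)
print(f"R^2 = {r2:.3f}, r^2 = {r ** 2:.3f}")
```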
18
Q

R^2 = 1

A

All data points lie exactly on the best-fitting line

19
Q

R^2 = 0

A

There is no linear relationship between X and Y

20
Q

R^2

A

Coefficient of determination

21
Q

Significance level for constant value in statistical report

A

Not important

22
Q

Significance level for ANOVA report (2)

A
  • p-value for the overall significance of the regression model
  • in simple linear regression, it equals the p-value for the slope in the Coefficients report

23
Q

Sum of squares (regression) in ANOVA report

A
  • variability in Y that is explained by the regression model
24
Q

Sum of squares (residual) in ANOVA report

A
  • variability in Y that is unexplained by the regression model
25
Q

Sum of squares (total) in ANOVA report

A
  • total variability in Y
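
A sketch, on hypothetical data, of how the three sums of squares in the ANOVA report relate: SS(total) = SS(regression) + SS(residual). It also computes the F statistic, which in simple linear regression has 1 and n - 2 degrees of freedom and equals the square of the t statistic for the slope.

```python
# Hypothetical data, for illustration only
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]
n = len(xs)

mean_x = sum(xs) / n
mean_y = sum(ys) / n
sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
sxx = sum((x - mean_x) ** 2 for x in xs)
beta = sxy / sxx
alpha = mean_y - beta * mean_x
fitted = [alpha + beta * x for x in xs]

ss_regression = sum((f - mean_y) ** 2 for f in fitted)          # explained
ss_residual = sum((y - f) ** 2 for y, f in zip(ys, fitted))     # unexplained
ss_total = sum((y - mean_y) ** 2 for y in ys)                   # total

# F = mean square (regression) / mean square (residual)
f_stat = (ss_regression / 1) / (ss_residual / (n - 2))

print(f"SS regression = {ss_regression:.2f}")
print(f"SS residual   = {ss_residual:.2f}")
print(f"SS total      = {ss_total:.2f}")
print(f"F(1, {n - 2}) = {f_stat:.2f}")
```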
26
Q

Steps to analyse linear regression (4)

A
  1. Check that the assumptions of linear regression are fulfilled
    - independent observations
    - for each value of X, the variance of the Y values is equal
    - for each value of X, the distribution of Y values is normal
    - linear relationship between both variables
  2. Draw a scatter plot to check for a linear relationship
  3. Correlation
    - Pearson Product Moment Correlation
    - Spearman Rank Correlation
    - must show significant correlation before proceeding to step 4
  4. Conduct linear regression analysis