Week 9 - Regression models Flashcards

1
Q

Regression

A

Analysis of the relationship between two variables, typically shown on a scatterplot

The regression of Y on x is the conditional mean:
E(Y | x) = µ(x)

2
Q

Simple linear regression

A

Form of a straight line:
E(Y | x) = β0 + β1x

We also assume constant variance:
var(Y | x) = σ^2

There are 3 parameters:
- β0
- β1
- σ^2

Describes how the response variable, y, changes on average (linearly) with an explanatory variable, x

3
Q

Estimation goals for regression

A

We wish to estimate the slope (β1), the intercept (β0) and the variance of the errors (σ^2), find the standard errors of these estimates, and construct confidence intervals for these quantities

4
Q

Least squares estimation

A

We need to find the β0 and β1 that minimise the sum of squared deviations, H(β0, β1) = Σ (yi - β0 - β1xi)^2

This gives the least squares estimators

Method is called ordinary least squares (OLS)

The difference between an actual value and its predicted value is called a residual
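
The minimisation has a closed-form solution, which can be sketched in a few lines of Python. This is an illustration only: the function names and the five data points are made up for the example and are not from the course material.

```python
def fit_ols(x, y):
    """Return (b0, b1) minimising sum((y_i - b0 - b1 * x_i) ** 2)."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    # Sums of squares and cross-products about the means
    s_xx = sum((xi - x_bar) ** 2 for xi in x)
    s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    b1 = s_xy / s_xx           # slope estimate
    b0 = y_bar - b1 * x_bar    # intercept estimate
    return b0, b1


def residuals(x, y, b0, b1):
    """Actual value minus predicted value for each observation."""
    return [yi - (b0 + b1 * xi) for xi, yi in zip(x, y)]


# Hypothetical data, chosen only to keep the arithmetic simple
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]
b0, b1 = fit_ols(x, y)
res = residuals(x, y, b0, b1)
print(b0, b1)   # intercept about 2.2, slope about 0.6
print(res)
```

With an intercept in the model, the residuals from a least squares fit sum to zero, which is a quick sanity check on the result.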

5
Q

Properties of the estimators (β0, β1, σ^2)

A
  • All of the estimators are unbiased (for σ^2, provided the residual sum of squares is divided by n - 2)
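
Unbiasedness can be checked informally by simulation: generate many datasets from a known line, fit each one, and average the estimates. The sketch below assumes normal errors, and the true values (β0 = 1, β1 = 2, σ = 1), the design x = 0, ..., 9 and the seed are all arbitrary choices made for illustration.

```python
import random


def fit_ols(x, y):
    """Return (b0, b1, s2), where s2 = SSE / (n - 2) estimates sigma^2."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    s_xx = sum((xi - x_bar) ** 2 for xi in x)
    s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    b1 = s_xy / s_xx
    b0 = y_bar - b1 * x_bar
    sse = sum((yi - b0 - b1 * xi) ** 2 for xi, yi in zip(x, y))
    return b0, b1, sse / (n - 2)


def simulate(true_b0=1.0, true_b1=2.0, sigma=1.0, reps=2000, seed=0):
    """Average the three estimates over many simulated datasets."""
    rng = random.Random(seed)
    x = list(range(10))
    totals = [0.0, 0.0, 0.0]
    for _ in range(reps):
        y = [true_b0 + true_b1 * xi + rng.gauss(0, sigma) for xi in x]
        for k, estimate in enumerate(fit_ols(x, y)):
            totals[k] += estimate
    return [t / reps for t in totals]


avg_b0, avg_b1, avg_s2 = simulate()
# The averages should land close to the true values 1, 2 and 1
print(avg_b0, avg_b1, avg_s2)
```

Dividing the residual sum of squares by n instead of n - 2 would make the variance estimate too small on average.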
6
Q

Coefficient of determination (R^2)

A

It quantifies the proportion of variation in the response variable (the Yi's) that is explained by the regression model

"This model explains about <50%> of the variation in the data"

R^2 ranges from 0 to 1
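
One common way to compute it is R^2 = 1 - SSE/SST, where SSE is the residual sum of squares and SST is the total sum of squares about the mean of y. A minimal sketch, using made-up data and the fitted values from the least squares line 2.2 + 0.6x at x = 1, ..., 5:

```python
def r_squared(y, fitted):
    """Proportion of the variation in y explained by the fitted values."""
    y_bar = sum(y) / len(y)
    sse = sum((yi - fi) ** 2 for yi, fi in zip(y, fitted))  # unexplained
    sst = sum((yi - y_bar) ** 2 for yi in y)                # total
    return 1 - sse / sst


# Hypothetical responses and their least squares fitted values
y = [2, 4, 5, 4, 5]
fitted = [2.8, 3.4, 4.0, 4.6, 5.2]
r2 = r_squared(y, fitted)
print(r2)   # about 0.6
```

Here the model would be described as explaining about 60% of the variation in the data.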

6
Q

Maximum likelihood estimation on regression

A

Assuming normally distributed errors

The β0 and β1 that maximise the likelihood (equivalently, maximise the log-likelihood) are the same as those that minimise the sum of squared deviations, H(β0, β1)

The OLS estimates of β0 and β1 are the same as the MLEs
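
The equivalence can be seen from the log-likelihood. As a sketch, assume the Yi are independent with Yi ~ N(β0 + β1xi, σ^2), and write H(β0, β1) for the sum of squared deviations:

```latex
\ell(\beta_0, \beta_1, \sigma^2)
  = -\frac{n}{2}\log(2\pi\sigma^2)
    - \frac{1}{2\sigma^2}\sum_{i=1}^{n}\bigl(y_i - \beta_0 - \beta_1 x_i\bigr)^2
  = -\frac{n}{2}\log(2\pi\sigma^2)
    - \frac{H(\beta_0, \beta_1)}{2\sigma^2}
```

For any fixed σ^2, the only term involving β0 and β1 is -H(β0, β1)/(2σ^2), so maximising the log-likelihood over β0 and β1 is the same as minimising H. The MLE of σ^2 is H evaluated at the estimates divided by n, which differs from the unbiased estimator that divides by n - 2.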

7
Q

Predicting a future value

A

The point prediction is given directly by the fitted regression line: substitute the new x value into the estimated β0 + β1x

8
Q

Prediction interval

A

Estimates where a future observation is likely to fall, at a given level of confidence

Similar to a CI, but it is for a random quantity, Y, rather than a fixed quantity, µ(x)

Will be wider than the corresponding confidence interval, because it also includes the variability of the individual observation
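
The difference in width can be illustrated with the standard formulas under normal errors: the prediction interval adds an extra "1 +" under the square root for the new observation's own variance. The sketch below is a hedged illustration with made-up data; the critical value t_crit is passed in by the caller (3.182 is the 97.5% point of the t distribution with 3 degrees of freedom, read from tables), and the function name is hypothetical.

```python
import math


def intervals(x, y, x0, t_crit):
    """Return (point prediction, CI half-width, PI half-width) at x0."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    s_xx = sum((xi - x_bar) ** 2 for xi in x)
    s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    b1 = s_xy / s_xx
    b0 = y_bar - b1 * x_bar
    sse = sum((yi - b0 - b1 * xi) ** 2 for xi, yi in zip(x, y))
    s = math.sqrt(sse / (n - 2))                 # estimate of sigma
    y_hat = b0 + b1 * x0                         # point prediction
    h = 1 / n + (x0 - x_bar) ** 2 / s_xx         # uncertainty in the fitted mean
    ci_half = t_crit * s * math.sqrt(h)          # for the mean response at x0
    pi_half = t_crit * s * math.sqrt(1 + h)      # for a new observation at x0
    return y_hat, ci_half, pi_half


# Hypothetical data; n = 5, so n - 2 = 3 degrees of freedom
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]
y_hat, ci_half, pi_half = intervals(x, y, x0=3, t_crit=3.182)
print(y_hat, ci_half, pi_half)
```

Both intervals are centred on the same point prediction; only the half-widths differ, and the prediction interval is always the wider of the two.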

9
Q

Assumptions of linear regression

A

Linear model for the mean
Equal variance for all observations (homoscedasticity)
Normally distributed errors (assessed using the residuals)

10
Q

Descriptive statistics (4 moments)

A
  • Centre
  • Spread
  • Skew
  • Kurtosis/Outlier
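
Each of the four can be summarised with a moment-based statistic. The sketch below uses population (divide-by-n) moments, which is one convention among several, and a made-up sample; the function name is hypothetical.

```python
def four_moments(data):
    """Return (mean, variance, skewness, kurtosis) using divide-by-n moments."""
    n = len(data)
    mean = sum(data) / n                               # centre
    m2 = sum((d - mean) ** 2 for d in data) / n        # spread (variance)
    m3 = sum((d - mean) ** 3 for d in data) / n
    m4 = sum((d - mean) ** 4 for d in data) / n
    skewness = m3 / m2 ** 1.5    # 0 for a symmetric sample
    kurtosis = m4 / m2 ** 2      # about 3 for normal data; large with outliers
    return mean, m2, skewness, kurtosis


# Hypothetical sample with a longer right tail
sample = [2, 4, 4, 4, 5, 5, 7, 9]
mean, variance, skewness, kurtosis = four_moments(sample)
print(mean, variance, skewness, kurtosis)
```

A positive skewness, as here, indicates a longer right tail.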
11
Q

Characteristics of a regression line

A
  • Direction: Positive or Negative
  • Strength: Strong or Weak (R^2 value)
  • Form: Linear or Non-linear