Topic 8 Linear Regression Flashcards

1
Q

What is the goal of linear regression in statistics?

A

To explain variability in a dependent variable using a linear relationship with an independent variable.

2
Q

True/False
Linear regression only applies to physical processes.

A

False. It applies to any relationship between measurable variables, including random ones.

3
Q

In the linear model Y = a + bX + Z, what is Z?

A

The error term, assumed to be independent of X and with zero mean.

4
Q

The values of a and b are chosen to minimize the sum of _________.

A

squared residuals

5
Q

What are the formulas for the least-squares estimates of the parameters a and b?

A

b̂ = Cov(X, Y) / Var(X)
â = Ȳ − b̂ X̄
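As a minimal sketch of these two formulas, assuming Var and Cov use 1/n normalization (the helper names `mean`, `cov`, and `fit_line` and the data are made up for illustration):

```python
def mean(values):
    return sum(values) / len(values)

def cov(xs, ys):
    """Sample covariance with 1/n normalization (assumed convention)."""
    mx, my = mean(xs), mean(ys)
    return sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / len(xs)

def fit_line(xs, ys):
    """Return (a_hat, b_hat) for the model Y = a + bX + Z."""
    b_hat = cov(xs, ys) / cov(xs, xs)    # Var(X) = Cov(X, X)
    a_hat = mean(ys) - b_hat * mean(xs)
    return a_hat, b_hat

# Illustrative data lying exactly on y = 1 + 2x
xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [1.0, 3.0, 5.0, 7.0, 9.0]
a_hat, b_hat = fit_line(xs, ys)
```

Because the data are noise-free, the fit should recover the intercept 1 and slope 2 up to rounding.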

6
Q

True/False
The regression coefficient estimates â and b̂ are biased.

A

False. They are unbiased under the model assumptions (Z has zero mean and is independent of X).

7
Q

What is the formula for Var(b̂)?

A

Var(b̂) = σ_Z² / (n · Var(X))

8
Q

What is the formula for Var(â)?

A

Var(â) = (1/n + (X̄² / (n · Var(X)))) · σ_Z²

9
Q

What does the coefficient of determination r² represent?

A

The proportion of total variance in Y explained by the linear model.

10
Q

Which of the following is NOT a typical inference question in linear regression?
a) Is slope b ≠ 0?
b) What is the mean Y at X = x₀?
c) Is X normally distributed?
d) What is the predicted Y at X = x₀?

A

c) Is X normally distributed?

11
Q

What is the estimate of the standard deviation of Z?

A

σ̂_Z = sqrt(Var̂(Z))

12
Q

What is the unbiased estimate of the variance of Z (the error term)?

A

Var̂(Z) = n * (var(Y) - cov(X, Y)^2 / var(X)) / (n - 2)

13
Q

What is the coefficient of determination (r-squared)?

A

r² = cov(X, Y)^2 / (var(X) * var(Y))
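A small sketch that may help connect this formula to the "proportion of variance explained" reading: it computes r² from the covariance formula and again as 1 - ssres/sstot. The function names and data are illustrative assumptions, and var/cov use 1/n normalization.

```python
def mean(values):
    return sum(values) / len(values)

def cov(xs, ys):
    """Sample covariance with 1/n normalization (assumed convention)."""
    mx, my = mean(xs), mean(ys)
    return sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / len(xs)

def r_squared_from_cov(xs, ys):
    return cov(xs, ys) ** 2 / (cov(xs, xs) * cov(ys, ys))

def r_squared_from_residuals(xs, ys):
    b_hat = cov(xs, ys) / cov(xs, xs)
    a_hat = mean(ys) - b_hat * mean(xs)
    ssres = sum((y - (a_hat + b_hat * x)) ** 2 for x, y in zip(xs, ys))
    sstot = sum((y - mean(ys)) ** 2 for y in ys)
    return 1.0 - ssres / sstot

# Illustrative data, roughly y = 2x with small noise
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [2.1, 3.9, 6.2, 7.8, 10.1]
r2_cov = r_squared_from_cov(xs, ys)
r2_res = r_squared_from_residuals(xs, ys)
```

The two values should agree up to floating-point rounding.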

14
Q

What is the confidence interval for the slope b when the error variance Var(Z) is known?

A

b̂ ± z_(α/2) * sqrt(Var(Z) / (n * var(X)))
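A hedged sketch of this interval using only the Python standard library. The function name `slope_ci_known_variance`, the data, and the assumed known value Var(Z) = 0.04 are all illustrative; var(X) uses 1/n normalization.

```python
from statistics import NormalDist

def slope_ci_known_variance(xs, ys, var_z, alpha=0.05):
    """Two-sided (1 - alpha) interval for b when Var(Z) is known."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    var_x = sum((x - mx) ** 2 for x in xs) / n
    cov_xy = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / n
    b_hat = cov_xy / var_x
    z = NormalDist().inv_cdf(1 - alpha / 2)       # z_(alpha/2) quantile
    half_width = z * (var_z / (n * var_x)) ** 0.5
    return b_hat - half_width, b_hat + half_width

# Illustrative data; var_z = 0.04 is an assumed known error variance
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [2.1, 3.9, 6.2, 7.8, 10.1]
lo, hi = slope_ci_known_variance(xs, ys, var_z=0.04)
```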

15
Q

What is the t-statistic used for the confidence interval of b when the error variance is unknown?

A

T = (b̂ - b) / sqrt(Var̂(Z) / (n * var(X)))

16
Q

What is the confidence interval for the slope b using Student’s t-distribution?

A

b̂ ± t_(n-2, α/2) * sqrt(Var̂(Z) / (n * var(X)))
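One way to sketch this interval without a statistics library is to pass the critical value in by hand. Here `t_crit = 3.182` is the tabulated t_(3, 0.025) for n = 5 points; the function name and data are illustrative assumptions, and var(X) uses 1/n normalization.

```python
def slope_ci_t(xs, ys, t_crit):
    """Interval for b using the estimated error variance.

    t_crit is t_(n-2, alpha/2), looked up from a t-table by the caller.
    """
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    var_x = sum((x - mx) ** 2 for x in xs) / n
    cov_xy = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / n
    b_hat = cov_xy / var_x
    a_hat = my - b_hat * mx
    ssres = sum((y - (a_hat + b_hat * x)) ** 2 for x, y in zip(xs, ys))
    var_z_hat = ssres / (n - 2)                   # unbiased estimate of Var(Z)
    half_width = t_crit * (var_z_hat / (n * var_x)) ** 0.5
    return b_hat - half_width, b_hat + half_width

# Illustrative data with n = 5, so n - 2 = 3 degrees of freedom
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [2.1, 3.9, 6.2, 7.8, 10.1]
lo, hi = slope_ci_t(xs, ys, t_crit=3.182)
```

With so few points the t critical value is much larger than the normal one (about 1.96), so the interval is noticeably wider than in the known-variance case.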

17
Q

What is the t-statistic for the intercept a?

A

T = (â - a) / sqrt(Var̂(Z) * (1/n + mean(X)^2 / (n * var(X))))

18
Q

How does the prediction interval differ from the mean response interval?

A

It includes an extra variance term for the individual prediction (+1 term), making it wider.

19
Q

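What is the null hypothesis in a t-test on slope?

A

H₀: b = b*, where b* is the hypothesized slope (b* = 0 tests whether Y depends linearly on X at all).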

20
Q

What is the test used to assess linear correlation between X and Y?

A

A correlation test based on the sample correlation coefficient c_xy.

21
Q

What does the F-test evaluate in linear regression?

A

Whether the regression model explains a significant amount of variance in Y.

22
Q

What does MATLAB’s regress(Y, X) return?
a) Only slope
b) Only intercept
c) Regression stats including CIs
d) Histograms

A

c) Regression stats including CIs

23
Q

What is the confidence interval for a mean response at X = x₀?

A

â + b̂ * x₀ ± t_(n-2, α/2) * sqrt(Var̂(Z) * (1/n + (x₀ - mean(X))^2 / (n * var(X))))

24
Q

What is the prediction interval for an individual value of Y at X = x₀ (including Z)?

A

â + b̂ * x₀ ± t_(n-2, α/2) * sqrt(Var̂(Z) * (1/n + (x₀ - mean(X))^2 / (n * var(X)) + 1))
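To make the effect of the extra +1 term concrete, the sketch below computes both half-widths at the same x₀. The function name, data, and the tabulated `t_crit = 3.182` (t_(3, 0.025) for n = 5) are illustrative assumptions; var(X) uses 1/n normalization.

```python
def interval_half_widths(xs, ys, x0, t_crit):
    """Return (center, mean_half, pred_half) at X = x0.

    t_crit is t_(n-2, alpha/2), supplied by the caller from a t-table.
    """
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    var_x = sum((x - mx) ** 2 for x in xs) / n
    cov_xy = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / n
    b_hat = cov_xy / var_x
    a_hat = my - b_hat * mx
    ssres = sum((y - (a_hat + b_hat * x)) ** 2 for x, y in zip(xs, ys))
    var_z_hat = ssres / (n - 2)
    # Shared term: uncertainty of the fitted line at x0
    line_term = 1 / n + (x0 - mx) ** 2 / (n * var_x)
    mean_half = t_crit * (var_z_hat * line_term) ** 0.5
    pred_half = t_crit * (var_z_hat * (line_term + 1)) ** 0.5  # adds Var(Z) itself
    return a_hat + b_hat * x0, mean_half, pred_half

xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [2.1, 3.9, 6.2, 7.8, 10.1]
center, mean_half, pred_half = interval_half_widths(xs, ys, x0=3.0, t_crit=3.182)
```

Both intervals share the same center; the prediction interval is always the wider of the two, and both are narrowest at x₀ = mean(X).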

25
Q

What is the t-statistic for testing the slope (b = b*)?

A

T = (b̂ - b*) / sqrt(Var̂(Z) / (n * var(X)))

26
Q

What is the V-statistic for testing correlation Corr(X,Y) = 0?

A

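V = (sqrt(n - 3) / 2) * ln[((1 + c_xy)(1 - Corr(X, Y))) / ((1 - c_xy)(1 + Corr(X, Y)))]
Under the hypothesis Corr(X, Y) = 0 this reduces to V = (sqrt(n - 3) / 2) * ln[(1 + c_xy) / (1 - c_xy)].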
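A minimal sketch of this statistic in its Fisher z-transform form, sqrt(n - 3) / 2 times the log ratio, which is approximately standard normal under the hypothesis. The function name `v_statistic` and the parameter `rho0` (the hypothesized correlation) are illustrative names, not from the cards.

```python
import math

def v_statistic(c_xy, n, rho0=0.0):
    """Fisher-transform statistic for testing Corr(X, Y) = rho0."""
    ratio = ((1 + c_xy) * (1 - rho0)) / ((1 - c_xy) * (1 + rho0))
    return math.sqrt(n - 3) / 2 * math.log(ratio)

# Illustrative values: sample correlation 0.5 from n = 28 points
v = v_statistic(0.5, 28)
```

With rho0 = 0 the statistic equals sqrt(n - 3) * atanh(c_xy), which is a convenient cross-check.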

27
Q

What is the regression sum of squares (ssreg)?

A

ssreg = sum[(â + b̂ * xᵢ - (â + b̂ * mean(X)))^2] = n * var(â + b̂X)

28
Q

What is the residual sum of squares (ssres)?

A

ssres = sum[(yᵢ - (â + b̂ * xᵢ))^2] = (n - 2) * Var̂(Z)

29
Q

What is the total sum of squares (sstot)?

A

sstot = sum[(yᵢ - mean(Y))^2] = ssreg + ssres = n * var(Y)

30
Q

What is the F-statistic for testing regression significance?

A

F = (ssreg / 1) / (ssres / (n - 2))
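The three sums of squares and the F-statistic can be sketched together as below. The function name, dictionary keys, and data are illustrative assumptions; var(X) uses 1/n normalization.

```python
def regression_anova(xs, ys):
    """Return sums of squares, Var(Z) estimate, F, and the slope t-statistic for b = 0."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    var_x = sum((x - mx) ** 2 for x in xs) / n
    cov_xy = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / n
    b_hat = cov_xy / var_x
    a_hat = my - b_hat * mx
    fitted = [a_hat + b_hat * x for x in xs]
    ssreg = sum((f - my) ** 2 for f in fitted)    # mean of fitted values is mean(Y)
    ssres = sum((y - f) ** 2 for y, f in zip(ys, fitted))
    sstot = sum((y - my) ** 2 for y in ys)
    var_z_hat = ssres / (n - 2)
    f_stat = (ssreg / 1) / var_z_hat
    t_slope = b_hat / (var_z_hat / (n * var_x)) ** 0.5
    return {"ssreg": ssreg, "ssres": ssres, "sstot": sstot,
            "var_z_hat": var_z_hat, "F": f_stat, "T": t_slope}

xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [2.1, 3.9, 6.2, 7.8, 10.1]
out = regression_anova(xs, ys)
```

For a single predictor, F equals the square of the slope t-statistic for b = 0, so the F-test and the two-sided t-test on the slope lead to the same conclusion.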