3 - Regressions Flashcards

1
Q

What is the general form of a simple linear regression model?

A

Yi = β0 + β1 * Xi + εi, where Yi is the dependent variable, β0 is the intercept, β1 is the slope, Xi is the independent variable, and εi is the random error term.

2
Q

What are the two components of the response variable Yi in a simple linear model?

A

Yi is composed of a deterministic component (β0 + β1 * Xi) and a random component (εi), where β0 is the intercept, β1 is the slope, Xi is the independent variable, and εi is the random error term.

3
Q

What does the term E(Yi) represent in the context of linear regression?

A

E(Yi) = β0 + β1 * Xi, where E(Yi) is the expected value of Yi given Xi, β0 is the intercept, and β1 is the slope of the regression.

4
Q

Define the variance Var(Yi) in a simple linear model.

A

Var(Yi) = σ², the constant variance of the random error term εi (the deterministic part contributes no variance). In practice σ² is estimated by σ̂² = RSS/(n - 2), where RSS is the Residual Sum of Squares Σε̂i².

5
Q

What does the intercept β0 signify when X = 0?

A

β0 represents the mean value of Yi when the independent variable Xi equals zero, provided X = 0 lies within the scope of the model (i.e., within or near the range of observed X values).

6
Q

What method is used to estimate the parameters β0 and β1?

A

The least squares method is used to estimate the parameters β0 and β1, minimizing the sum of squared residuals.

7
Q

What are the least square estimators of β0 and β1?

A

The least squares estimators are β̂1 = Σ(Xi - X̄) * (Yi - Ȳ) / Σ(Xi - X̄)² for the slope, and β̂0 = Ȳ - β̂1 * X̄ for the intercept, where X̄ is the mean of Xi and Ȳ is the mean of Yi.

8
Q

What is the Gauss-Markov theorem about?

A

The Gauss-Markov theorem states that under the assumptions of linearity, zero-mean errors, homoscedasticity, and uncorrelated errors, the least squares estimators of β0 and β1 are the best linear unbiased estimators (BLUE), i.e., they have minimum variance among all linear unbiased estimators.

9
Q

What is the formula for the coefficient of determination R²?

A

R² = 1 - (SSRes / SSTot), where SSRes is the residual sum of squares and SSTot is the total sum of squares.

10
Q

In a simple linear regression model, what are residuals?

A

Residuals are the differences between the observed values Yi and the predicted values Ŷi, calculated as ε̂i = Yi - Ŷi; they serve as observable estimates of the unobservable errors εi.

11
Q

What is the formula for the least squares estimator of the slope?

A

β̂1 = (Σ(Xi - X̄) * (Yi - Ȳ)) / Σ(Xi - X̄)², where β̂1 is the estimated slope, Xi is the independent variable, X̄ is the mean of Xi, Yi is the dependent variable, and Ȳ is the mean of Yi.

12
Q

What is the formula for the least squares estimator of the intercept?

A

β̂0 = Ȳ - β̂1 * X̄, where β̂0 is the estimated intercept, Ȳ is the mean of Yi, β̂1 is the estimated slope, and X̄ is the mean of Xi.

13
Q

What is the formula for the fitted regression line?

A

Ŷi = β̂0 + β̂1 * Xi, where Ŷi is the predicted value of Yi, β̂0 is the estimated intercept, β̂1 is the estimated slope, and Xi is the independent variable.

14
Q

What is the formula for the coefficient of determination R²?

A

R² = 1 - (SSRes / SSTot), where R² is the coefficient of determination, SSRes is the residual sum of squares, and SSTot is the total sum of squares.

15
Q

What is the formula for the residual sum of squares (SSRes)?

A

SSRes = Σ(Yi - Ŷi)², where SSRes is the residual sum of squares, Yi is the observed value, and Ŷi is the predicted value from the regression line.

16
Q

What is the formula for the total sum of squares (SSTot)?

A

SSTot = Σ(Yi - Ȳ)², where SSTot is the total sum of squares, Yi is the observed value, and Ȳ is the mean of Yi.

17
Q

What is the formula for the variance of the slope estimator?

A

Var(β̂1) = σ² / Σ(Xi - X̄)², where Var(β̂1) is the variance of the slope estimator and σ² is the variance of the errors. In practice σ² is estimated by σ̂² = RSS/(n - 2), where RSS is the Residual Sum of Squares Σε̂i².

18
Q

What is the formula for the mean squared error (MSE)?

A

MSE = SSRes / (n - 2), where MSE is the mean squared error, SSRes is the residual sum of squares, and n is the number of observations.

19
Q

What is the formula for the confidence interval of the slope coefficient?

A

CI for β1: β̂1 ± t(1-α/2) * SE(β̂1),
where β̂1 is the estimated slope, t(1-α/2) is the critical t-value with n - 2 degrees of freedom, and SE(β̂1) is the standard error of the slope estimate.

20
Q

Explain the difference between the constant term and the random term in the simple linear model.

A

The constant term is the deterministic component β0 + β1 * Xi, where β0 is the intercept and β1 is the slope; the random term is εi, the error term representing the unexplained variation in Yi.

21
Q

How would you interpret the slope β1 in a simple linear regression model?

A

The slope β1 represents the change in the expected value of Yi for a one-unit increase in Xi, assuming all other variables remain constant.

22
Q

Why is the assumption of independent and identically distributed (i.i.d.) errors important in regression analysis?

A

The assumption of i.i.d. errors ensures that each error term εi has the same variance σ² and that errors are independent, which is crucial for the validity of least squares estimators and hypothesis tests.

23
Q

What does it mean when we say that the least squares estimators β̂0 and β̂1 are “unbiased”?

A

The estimators β̂0 and β̂1 are unbiased if their expected values E(β̂0) = β0 and E(β̂1) = β1, meaning that, on average, they correctly estimate the true parameters of the population.

24
Q

How does the Gauss-Markov theorem support the use of least squares estimators in linear regression?

A

The Gauss-Markov theorem states that under the assumptions of linearity, homoscedasticity, and uncorrelated, zero-mean errors, the least squares estimators β̂0 and β̂1 are the best linear unbiased estimators (BLUE), having the minimum variance among all linear unbiased estimators.

25
Q

Describe the role of residuals in determining the fit of a regression model.

A

Residuals, calculated as ε̂i = Yi - Ŷi, represent the difference between observed values Yi and predicted values Ŷi. Smaller residuals indicate a better fit of the regression model to the data.

26
Q

How would you interpret a coefficient of determination R² value of 0.85 in a regression model?

A

An R² value of 0.85 means that 85% of the variance in the dependent variable Yi is explained by the independent variable Xi in the regression model, indicating a strong relationship.

27
Q

In the context of regression, what is the purpose of partitioning the total sum of squares (SSTot) into SSReg and SSRes?

A

Partitioning SSTot into SSReg (explained variance) and SSRes (unexplained variance) helps in understanding how much of the total variation in Yi is explained by the regression model (SSReg) and how much remains unexplained (SSRes).

28
Q

How does the concept of homoscedasticity affect the interpretation of regression results?

A

Homoscedasticity means that the variance of the error terms εi is constant across all levels of Xi. If this assumption is violated, the standard errors of the coefficients β̂0 and β̂1 may be biased, affecting hypothesis tests and confidence intervals.

29
Q

Explain the significance of confidence intervals for the regression coefficients β0 and β1.

A

Confidence intervals for β0 and β1 provide a range of plausible values for these parameters, reflecting the precision of their estimates. A wider interval indicates more uncertainty, while a narrower interval suggests more precise estimates.

30
Q

Given a dataset of paired values (Xi, Yi), how would you apply the method of least squares to estimate the parameters β0 and β1?

A

To apply the least squares method, estimate β1 using β1 = (Σ(Xi - X̄) * (Yi - Ȳ)) / Σ(Xi - X̄)², where Xi is the independent variable, Yi is the dependent variable, X̄ is the mean of Xi, and Ȳ is the mean of Yi. Then, estimate β0 using β0 = Ȳ - β1 * X̄.
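A minimal numerical sketch of this recipe (Python with NumPy; the x and y arrays are made-up illustrative values, not data from the course):

import numpy as np

# Hypothetical paired observations (illustrative only)
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([2.1, 3.9, 6.2, 8.1, 9.8])

x_bar, y_bar = x.mean(), y.mean()

# Slope: β̂1 = Σ(Xi - X̄)(Yi - Ȳ) / Σ(Xi - X̄)²
beta1_hat = np.sum((x - x_bar) * (y - y_bar)) / np.sum((x - x_bar) ** 2)

# Intercept: β̂0 = Ȳ - β̂1 * X̄
beta0_hat = y_bar - beta1_hat * x_bar

print(beta0_hat, beta1_hat)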

31
Q

If you observe a high variance in the residuals after fitting a regression line, how would you modify or interpret your model to account for this?

A

High variance in the residuals suggests heteroscedasticity. To address this, you could apply transformations to the dependent variable Yi, such as logarithmic or square root transformations, or use weighted least squares to account for varying variance.

32
Q

How would you apply the Gauss-Markov theorem to justify using least squares estimates in a scenario where the errors are uncorrelated and have constant variance?

A

The Gauss-Markov theorem justifies the use of least squares estimators because, under the assumptions of linearity, uncorrelated errors, and homoscedasticity (constant variance of εi), the least squares estimators β̂0 and β̂1 are the best linear unbiased estimators (BLUE).

33
Q

How would you compute the predicted value of Y for a given X using the regression equation Ŷ = β̂0 + β̂1 * X?

A

The predicted value of Y for a given X is computed using the regression equation Ŷ = β̂0 + β̂1 * X, where Ŷ is the predicted value of the dependent variable Y, β̂0 is the estimated intercept, and β̂1 is the estimated slope.

34
Q

How would you interpret and apply the coefficient of determination R² to assess the fit of your linear regression model?

A

The coefficient of determination R² = 1 - (SSRes / SSTot) measures the proportion of the variance in Yi explained by Xi. An R² close to 1 indicates a good fit, meaning the model explains most of the variation in Yi, while an R² close to 0 indicates a poor fit.
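A short sketch of computing R² from the fitted line (Python with NumPy; the x and y arrays are made-up illustrative values):

import numpy as np

x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])      # illustrative data
y = np.array([2.1, 3.9, 6.2, 8.1, 9.8])

b1 = np.sum((x - x.mean()) * (y - y.mean())) / np.sum((x - x.mean()) ** 2)
b0 = y.mean() - b1 * x.mean()
y_hat = b0 + b1 * x                            # fitted values

ss_res = np.sum((y - y_hat) ** 2)              # SSRes
ss_tot = np.sum((y - y.mean()) ** 2)           # SSTot
r2 = 1 - ss_res / ss_tot
print(r2)                                      # close to 1 indicates a good fit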

35
Q

If the errors in your model are not normally distributed, how would this affect the interpretation of hypothesis tests on β0 and β1?

A

If the errors εi are not normally distributed, the validity of hypothesis tests on β0 and β1 may be compromised, as the standard t-tests and confidence intervals rely on normality. Non-normal errors may lead to incorrect conclusions about the significance of the estimates.

36
Q

Apply the concept of confidence intervals to evaluate the precision of your estimates for β0 and β1. How would you interpret wide vs. narrow confidence intervals?

A

Confidence intervals for β0 and β1 are calculated as
β̂ ± t(1-α/2) * SE(β̂),
where t(1-α/2) is the critical t-value and SE(β̂) is the standard error. A narrow interval suggests high precision and confidence in the estimate, while a wide interval indicates more uncertainty in the estimate.
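A minimal sketch of computing such an interval for β1 (Python with NumPy and SciPy; the data are made-up illustrative values):

import numpy as np
from scipy import stats

x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])        # illustrative data
y = np.array([2.1, 3.9, 6.2, 8.1, 9.8])
n = len(x)

sxx = np.sum((x - x.mean()) ** 2)
b1 = np.sum((x - x.mean()) * (y - y.mean())) / sxx
b0 = y.mean() - b1 * x.mean()
resid = y - (b0 + b1 * x)

mse = np.sum(resid ** 2) / (n - 2)             # estimate of σ²
se_b1 = np.sqrt(mse / sxx)                     # SE(β̂1)

alpha = 0.05
t_crit = stats.t.ppf(1 - alpha / 2, df=n - 2)  # critical t-value
ci = (b1 - t_crit * se_b1, b1 + t_crit * se_b1)
print(ci)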

37
Q

Given that the residuals are not independent, how would you adapt or reconsider your model to account for this issue?

A

If the residuals εi are not independent, this violates the assumption of independence in the Gauss-Markov theorem. You could use methods such as generalized least squares (GLS) or include autocorrelation models, such as AR(1), to adjust for correlated errors.

38
Q

If the model’s assumptions of homoscedasticity are violated, what steps would you take to apply a different modeling approach, such as weighted least squares?

A

If homoscedasticity is violated (heteroscedasticity), weighted least squares (WLS) can be applied by assigning weights inversely proportional to the variance of the residuals, reducing the impact of observations with higher variance.
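A minimal sketch of closed-form weighted least squares, β̂ = (XᵀWX)⁻¹XᵀWy (Python with NumPy; the data and the assumed per-observation error variances var_i are made-up illustrative values):

import numpy as np

x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])        # illustrative data
y = np.array([2.1, 3.9, 6.2, 8.1, 9.8])
var_i = np.array([0.1, 0.2, 0.4, 0.8, 1.6])    # assumed error variance per observation

X = np.column_stack([np.ones_like(x), x])      # design matrix [1, x]
W = np.diag(1.0 / var_i)                       # weights inversely proportional to variance

# Closed-form WLS estimate: solve (X'WX) β = X'Wy
beta_wls = np.linalg.solve(X.T @ W @ X, X.T @ W @ y)
print(beta_wls)                                # [intercept, slope]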

39
Q

How would you use the regression sum of squares (SSReg) and the residual sum of squares (SSRes) to test whether the explanatory variable X significantly explains the variation in Y?

A

To test whether X significantly explains the variation in Y, use the F-test: F = (SSReg / 1) / (SSRes / (n - 2)), where SSReg is the regression sum of squares, SSRes is the residual sum of squares, and n is the number of observations. A high F-value compared to the critical value indicates that X significantly explains the variation in Y.
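A minimal sketch of this F-test (Python with NumPy and SciPy; the data are made-up illustrative values):

import numpy as np
from scipy import stats

x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])        # illustrative data
y = np.array([2.1, 3.9, 6.2, 8.1, 9.8])
n = len(x)

b1 = np.sum((x - x.mean()) * (y - y.mean())) / np.sum((x - x.mean()) ** 2)
b0 = y.mean() - b1 * x.mean()
y_hat = b0 + b1 * x

ss_reg = np.sum((y_hat - y.mean()) ** 2)       # SSReg, df = 1
ss_res = np.sum((y - y_hat) ** 2)              # SSRes, df = n - 2

f_stat = (ss_reg / 1) / (ss_res / (n - 2))
p_value = stats.f.sf(f_stat, 1, n - 2)         # upper-tail probability
print(f_stat, p_value)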

40
Q

What is the formula for the simple linear regression model Yi?

A

Yi = β0 + β1 * Xi + εi, where Yi is the dependent variable, β0 is the intercept, β1 is the slope, Xi is the independent variable, and εi is the random error term.

41
Q

Write the formula for the least squares estimator of the slope β̂1.

A

β̂1 = (Σ(Xi - X̄) * (Yi - Ȳ)) / Σ(Xi - X̄)², where β̂1 is the slope, Xi is the independent variable, X̄ is the mean of Xi, Yi is the dependent variable, and Ȳ is the mean of Yi.

42
Q

What is the formula for the least squares estimator of the intercept β̂0?

A

β̂0 = Ȳ - β̂1 * X̄, where β̂0 is the intercept, Ȳ is the mean of Yi, β̂1 is the slope, and X̄ is the mean of Xi.

43
Q

Provide the formula for the fitted regression line Ŷ.

A

Ŷ = β̂0 + β̂1 * Xi, where Ŷ is the predicted value of Yi, β̂0 is the intercept, β̂1 is the slope, and Xi is the independent variable.

44
Q

What is the formula for the coefficient of determination R²?

A

R² = 1 - (SSRes / SSTot), where R² is the proportion of variance in Yi explained by Xi, SSRes is the residual sum of squares, and SSTot is the total sum of squares.

45
Q

Write the formula for the residual sum of squares (SSRes).

A

SSRes = Σ(Yi - Ŷi)², where SSRes is the sum of squared residuals, Yi is the observed value, and Ŷi is the predicted value from the regression line.

46
Q

Provide the formula for the total sum of squares (SSTot).

A

SSTot = Σ(Yi - Ȳ)², where SSTot is the total sum of squares, Yi is the observed value, and Ȳ is the mean of Yi.

47
Q

What is the formula for the variance of the slope estimator β̂1?

A

Var(β̂1) = σ² / Σ(Xi - X̄)², where Var(β̂1) is the variance of the slope estimator and σ² is the error variance.
In practice σ² is estimated by σ̂² = RSS/(n - 2), where RSS is the Residual Sum of Squares Σε̂i².

48
Q

Give the formula for the mean squared error (MSE) of the residuals.

A

MSE = SSRes / (n - 2), where MSE is the mean squared error, SSRes is the residual sum of squares, and n is the number of observations.

49
Q

Write the formula for a (1-α)% confidence interval for β1.

A

CI for β1 = β̂1 ± t(1-α/2) * SE(β̂1), where CI is the confidence interval, β̂1 is the slope estimate, t(1-α/2) is the critical t-value, and SE(β̂1) is the standard error of the slope.

50
Q

What is the relationship between the variance of the slope estimator Var(β̂1), the mean squared error (MSE), and the residual sum of squares (SSRes)?

A

Var(β̂1) = σ² / Σ(Xi - X̄)². Substituting the estimate σ̂² = SSRes / (n - 2) = MSE for σ² gives
Var(β̂1) = SSRes / ((n - 2) * Σ(Xi - X̄)²) = MSE / Σ(Xi - X̄)².
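A short numerical sketch of these equivalent expressions (Python with NumPy; made-up illustrative data):

import numpy as np

x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])                # illustrative data
y = np.array([2.1, 3.9, 6.2, 8.1, 9.8])
n = len(x)

sxx = np.sum((x - x.mean()) ** 2)                      # Σ(Xi - X̄)²
b1 = np.sum((x - x.mean()) * (y - y.mean())) / sxx
b0 = y.mean() - b1 * x.mean()

ss_res = np.sum((y - (b0 + b1 * x)) ** 2)              # SSRes
mse = ss_res / (n - 2)                                 # MSE = SSRes / (n - 2)

var_b1 = mse / sxx                                     # = SSRes / ((n - 2) * Sxx)
print(var_b1, ss_res / ((n - 2) * sxx))                # identical values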

51
Q

How would you assess whether the residuals in your regression model are normally distributed, and what are the implications if they are not?

A

To assess normality of the residuals εi, you can use visual methods like Q-Q plots or statistical tests like the Shapiro-Wilk test. If the residuals are not normally distributed, hypothesis tests and confidence intervals based on t-distributions may not be valid, especially in small samples, which could lead to incorrect conclusions about significance.
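A minimal sketch of checking residual normality (Python with NumPy and SciPy; made-up illustrative data; scipy.stats.probplot returns the Q-Q plot coordinates rather than drawing them):

import numpy as np
from scipy import stats

x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])        # illustrative data
y = np.array([2.1, 3.9, 6.2, 8.1, 9.8])

b1 = np.sum((x - x.mean()) * (y - y.mean())) / np.sum((x - x.mean()) ** 2)
b0 = y.mean() - b1 * x.mean()
resid = y - (b0 + b1 * x)

# Shapiro-Wilk test: a small p-value suggests non-normal residuals
w_stat, p_value = stats.shapiro(resid)
print(w_stat, p_value)

# Q-Q plot coordinates (theoretical vs. ordered sample quantiles)
(osm, osr), (slope, intercept, r) = stats.probplot(resid, dist="norm")
print(r)                                        # r close to 1: points lie near a straight line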

52
Q

Compare the method of least squares with the method of maximum likelihood in estimating the parameters β0 and β1. What are the key differences in their assumptions and outcomes?

A

The method of least squares minimizes the sum of squared residuals, assuming homoscedastic and uncorrelated errors. The method of maximum likelihood estimates parameters by maximizing the likelihood of the observed data, assuming the errors εi are normally distributed; under that normality assumption the ML estimates of β0 and β1 coincide with the least squares estimates (only the estimate of σ² differs). Least squares is the simpler method with fewer assumptions, while maximum likelihood can provide more efficient estimates when its distributional assumption holds but is more sensitive to model misspecification.

53
Q

Given the variance formula for the slope β̂1, how does the spread of the Xi values influence the precision of the slope estimate?

A

The variance of the slope estimator β̂1 is Var(β̂1) = σ² / Σ(Xi - X̄)², where σ² is the error variance and Σ(Xi - X̄)² represents the spread of Xi values. A larger spread (greater variance in Xi) reduces the variance of β̂1, leading to more precise (less variable) slope estimates. A narrow spread increases the uncertainty of the slope estimate.

54
Q

Analyze the impact of multicollinearity between predictor variables on the estimates of β0 and β1 in multiple regression. How does it affect the reliability of these estimates?

A

Multicollinearity occurs when two or more predictors are highly correlated, making it difficult to separate their individual effects on Yi. This leads to inflated standard errors of the estimates β̂0 and β̂1, making them less reliable. Small changes in the data can cause large swings in the parameter estimates, reducing the precision and stability of the model.

55
Q

Examine the partitioning of the total sum of squares (SSTot) into SSReg and SSRes. How does the magnitude of SSReg compared to SSRes inform you about the fit of the model?

A

The total sum of squares (SSTot) is partitioned into SSReg, the regression sum of squares, which explains the variation due to the independent variable Xi, and SSRes, the residual sum of squares, which represents unexplained variation. A large SSReg relative to SSRes indicates that the model fits well and explains a large portion of the variance in Yi.

56
Q

How would you evaluate the trade-off between bias and variance in the context of choosing estimators for β0 and β1?

A

In choosing estimators, a low-bias estimator such as least squares provides accurate estimates on average. However, if the estimator has high variance, it may be sensitive to sample fluctuations. A biased estimator may be preferred if it has much lower variance (e.g., ridge regression), balancing the trade-off between bias and variance to minimize mean squared error.

57
Q

If the regression residuals show a pattern or trend, what might this indicate about your model, and how could you address it analytically?

A

A pattern or trend in the residuals indicates that the model is misspecified, possibly due to non-linearity or omitted variables. You can address this by adding polynomial terms, transforming the variables (e.g., log or square root), or including additional relevant predictors to capture the underlying relationship.
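A minimal sketch of addressing curvature in the residuals by adding a quadratic term (Python with NumPy; the data are made-up values constructed to be curved):

import numpy as np

# Hypothetical data with curvature (illustrative only)
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0, 6.0])
y = np.array([1.2, 3.8, 8.9, 16.1, 24.8, 36.2])

# Straight-line fit leaves a systematic pattern in the residuals
b1, b0 = np.polyfit(x, y, 1)
resid_linear = y - (b0 + b1 * x)

# Adding a quadratic term captures the curvature
c2, c1, c0 = np.polyfit(x, y, 2)
resid_quad = y - (c0 + c1 * x + c2 * x ** 2)

print(np.sum(resid_linear ** 2), np.sum(resid_quad ** 2))  # SSRes drops markedly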

58
Q

Analyze how outliers can influence the least squares estimates. How would you detect and address outliers in your regression analysis?

A

Outliers can exert disproportionate influence on least squares estimates, distorting the estimated coefficients β̂0 and β̂1. You can detect outliers using diagnostic tools like residual plots, leverage statistics, or Cook’s distance. To address outliers, you may remove or transform them, or apply robust regression techniques that reduce their impact.
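A minimal sketch of flagging influential points with leverage and Cook's distance in simple linear regression (Python with NumPy; made-up illustrative data whose last observation is a deliberate outlier):

import numpy as np

x = np.array([1.0, 2.0, 3.0, 4.0, 5.0, 6.0])   # illustrative data
y = np.array([2.1, 3.9, 6.2, 8.1, 9.8, 25.0])  # last point is a deliberate outlier
n, p = len(x), 2                                # p = number of estimated parameters

sxx = np.sum((x - x.mean()) ** 2)
b1 = np.sum((x - x.mean()) * (y - y.mean())) / sxx
b0 = y.mean() - b1 * x.mean()
resid = y - (b0 + b1 * x)
mse = np.sum(resid ** 2) / (n - p)

# Leverage and Cook's distance for simple linear regression
h = 1.0 / n + (x - x.mean()) ** 2 / sxx
cooks_d = (resid ** 2 / (p * mse)) * (h / (1 - h) ** 2)
print(cooks_d)                                  # large values flag influential points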

59
Q

Compare and contrast one-sided and two-sided hypothesis tests in linear regression. In which situations would a one-sided test provide more analytical insight than a two-sided test?

A

A one-sided test assesses whether a parameter (e.g., β1) is greater or less than a specified value, providing insight when directionality is of interest. A two-sided test evaluates whether the parameter is different from a value in either direction. A one-sided test is useful when there is a theoretical basis for expecting an effect in a particular direction, such as testing whether an intervention increases performance.

60
Q

What is the formula for the total sum of squares (SSTot) in regression analysis?

A

SSTot = Σ(Yi - Ȳ)²,
where Yi is the observed value and Ȳ is the mean of the observed values.

61
Q

How is the total sum of squares (SSTot) partitioned in regression analysis?

A

SSTot = SSReg + SSRes,
where SSReg is the regression sum of squares and SSRes is the residual sum of squares. This decomposition is called partitioning the total sum of squares.
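A short numerical check of this partition (Python with NumPy; made-up illustrative data):

import numpy as np

x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])        # illustrative data
y = np.array([2.1, 3.9, 6.2, 8.1, 9.8])

b1 = np.sum((x - x.mean()) * (y - y.mean())) / np.sum((x - x.mean()) ** 2)
b0 = y.mean() - b1 * x.mean()
y_hat = b0 + b1 * x

ss_tot = np.sum((y - y.mean()) ** 2)
ss_reg = np.sum((y_hat - y.mean()) ** 2)
ss_res = np.sum((y - y_hat) ** 2)
print(np.isclose(ss_tot, ss_reg + ss_res))      # True: SSTot = SSReg + SSRes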

62
Q

What is the formula for the regression sum of squares (SSReg)?

A

SSReg = Σ(Ŷi - Ȳ)² = (β̂1)² Σ(Xi - X̄)²,
where Ŷi is the predicted value and Ȳ is the mean of the observed values.

63
Q

How is the residual sum of squares (SSRes) calculated?

A

SSRes = Σ(Yi - Ŷi)²,
where Yi is the observed value and Ŷi is the predicted value from the regression model.

64
Q

How is the regression sum of squares (SSReg) expressed in terms of the regression coefficient (β̂1)?

A

SSReg = (β̂1)² Σ(Xi - X̄)²,
where β̂1 is the estimated slope and Xi is the independent variable with mean X̄.

65
Q

What are the degrees of freedom for the regression, residual, and total sum of squares in a regression model?

A

Regression SS: df = 1,
Residual SS: df = n - 2,
Total SS: df = n - 1,
where n is the number of observations in the dataset.

66
Q

How would you derive the formula for β̂1 in simple linear regression using covariance and variance?

A

β̂1 = Cov(X, Y) / Var(X), where Cov(X, Y) is the sample covariance of X and Y and Var(X) is the sample variance of X. Since both use the same divisor, it cancels, giving β̂1 = Σ(Xi - X̄)(Yi - Ȳ) / Σ(Xi - X̄)², the least squares slope.
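A short numerical check of this identity (Python with NumPy; made-up illustrative data; the sample covariance and variance both use ddof=1, so the common divisor cancels):

import numpy as np

x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])        # illustrative data
y = np.array([2.1, 3.9, 6.2, 8.1, 9.8])

# Least-squares slope vs. Cov(X, Y) / Var(X)
b1_ls = np.sum((x - x.mean()) * (y - y.mean())) / np.sum((x - x.mean()) ** 2)
b1_cov = np.cov(x, y, ddof=1)[0, 1] / np.var(x, ddof=1)
print(np.isclose(b1_ls, b1_cov))                # True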

67
Q

t-stat formula for β̂1

A

t* = (β̂1 - β1⁰) / √(Var(β̂1)) ~ t_(n-2),
where β1⁰ is the hypothesized value of β1 under the null hypothesis and Var(β̂1) is computed using the estimate σ̂² = RSS/(n - 2).
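A minimal sketch of computing this t-statistic and its two-sided p-value for H0: β1 = 0 (Python with NumPy and SciPy; made-up illustrative data):

import numpy as np
from scipy import stats

x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])        # illustrative data
y = np.array([2.1, 3.9, 6.2, 8.1, 9.8])
n = len(x)

sxx = np.sum((x - x.mean()) ** 2)
b1 = np.sum((x - x.mean()) * (y - y.mean())) / sxx
b0 = y.mean() - b1 * x.mean()
mse = np.sum((y - (b0 + b1 * x)) ** 2) / (n - 2)

beta1_null = 0.0                                # hypothesized value under H0
t_star = (b1 - beta1_null) / np.sqrt(mse / sxx)
p_value = 2 * stats.t.sf(abs(t_star), df=n - 2) # two-sided p-value
print(t_star, p_value)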

68
Q

t-stat formula for β̂0

A

t* = (β̂0 - β0⁰) / √(Var(β̂0)) ~ t_(n-2),
where β0⁰ is the hypothesized value of β0 under the null hypothesis.

69
Q

What is the general formula for a confidence interval for a parameter θ, based on its estimator θ̂?

A

θ̂ ± q_(1-α/2) * √(Var(θ̂)), where q_(1-α/2) is the (1 - α/2) quantile of the sampling distribution of the estimator (e.g., the t_(n-2) distribution for the regression coefficients).

70
Q

General interpretation of a confidence interval:

A

If we repeatedly draw random samples from the population and compute a 100(1 − α)% confidence interval for the parameter from each sample, then approximately 100(1 − α)% of those intervals will contain the true value of the parameter.

71
Q

What three assumptions are usually made about the unobservable error terms in the classical
linear regression model?

A

E(εi) = 0
Var(εi) = σ² < ∞
Cov(εi, εj) = 0 for all i ≠ j