Regression through MT 1 Flashcards

1
Q

For assessing the normality assumption of the ANOVA model, we can only use the quantile-quantile normal plot of the residuals.

A

False

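A note on card 1: besides the quantile-quantile plot, the histogram of residuals and a formal test such as Shapiro-Wilk can also be used to assess normality. A minimal sketch with simulated stand-in residuals, assuming numpy and scipy are available:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
residuals = rng.normal(loc=0.0, scale=1.0, size=200)  # stand-in for model residuals

# Tool 1: quantile-quantile comparison (the numbers behind a QQ plot);
# r near 1 indicates the points lie close to the reference line
(osm, osr), (slope, intercept, r) = stats.probplot(residuals, dist="norm")

# Tool 2: a formal test of normality (Shapiro-Wilk)
stat, pvalue = stats.shapiro(residuals)

# Tool 3: a histogram of the residuals (here, just the binned counts)
counts, edges = np.histogram(residuals, bins=20)

print(round(r, 3), round(pvalue, 3))
```

Any one of these tools can flag non-normal residuals, which is exactly why the "only" in the card's statement makes it false.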
2
Q

The constant variance assumption is diagnosed using the histogram

A

False

3
Q

The estimator sigma-hat^2 is a random variable

A

True

4
Q

The regression coefficients are used to measure the linear dependence between two variables

A

False

5
Q

The mean sum of square errors in ANOVA measures variability within groups

A

True

6
Q

Beta-hat-1 is an unbiased estimator for Beta-0

A

False

7
Q

Under the normality assumption, the estimator Beta-1 is a linear combination of normally distributed random variables

A

True

8
Q

In simple linear regression models, we lose three degrees of freedom because of the estimation of the three model parameters B-0, B-1, Sigma^2

A

False

9
Q

The assumptions to diagnose with a linear regression model are independence, linearity, constant variance, and normality

A

True

10
Q

The sampling distribution for the variance estimator in ANOVA is chi-square regardless of the assumptions of the data

A

False

11
Q

If the constant variance assumption in ANOVA does not hold, the inference on the equality of the means will not be reliable

A

True

12
Q

The negative value of B-1 is consistent with an inverse relationship between x and y

A

True

13
Q

If one confidence interval in the pairwise comparison does not include zero, we conclude that the two means are plausibly equal

A

False. If it DOES include zero, we conclude the two means are plausibly equal

14
Q

The mean sum of square errors in ANOVA measures variability between groups

A

False (it measures the variability within groups)

15
Q

The linear regression model with a qualitative predicting variable with k levels/classes will have k+1 parameters to estimate

A

True

16
Q

We assess the assumption of constant-variance by plotting the response variable against fitted values

A

True

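The fitted-values diagnostic plots mentioned in cards 16 and 43 can be sketched numerically. A minimal example with simulated data, assuming numpy is available: fit a simple regression by least squares, then examine the residuals against the fitted values, looking for a band of roughly constant width.

```python
import numpy as np

rng = np.random.default_rng(1)
x = rng.uniform(0, 10, size=100)
y = 2.0 + 0.5 * x + rng.normal(scale=1.0, size=100)

# Fit y = b0 + b1*x by least squares
X = np.column_stack([np.ones_like(x), x])
beta, *_ = np.linalg.lstsq(X, y, rcond=None)
fitted = X @ beta
residuals = y - fitted

# For the diagnostic plot, one would scatter `fitted` (x-axis) against
# `residuals` (y-axis). A crude numeric stand-in: compare the residual
# spread in the lower and upper halves of the fitted values.
lo = residuals[fitted < np.median(fitted)].std()
hi = residuals[fitted >= np.median(fitted)].std()
print(round(lo, 2), round(hi, 2))
```

With constant error variance, as here, the two spreads come out similar; a clear difference between them would suggest the constant-variance assumption fails.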
17
Q

The number of degrees of freedom of the chi-square distribution for the variance estimator is N-K+1 where k is the number of samples

A

False (it is N-k)

18
Q

The prediction interval will never be smaller than the confidence interval for data points with identical predictor values

A

True (the prediction interval must account for the additional variability of a new observation's error term, so it is at least as wide as the confidence interval for the mean response)

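Card 18's claim can be verified directly from the simple-regression interval formulas: the prediction interval half-width uses sqrt(1 + 1/n + (x0 - x-bar)^2/Sxx) where the confidence interval uses sqrt(1/n + (x0 - x-bar)^2/Sxx), so the extra 1 makes the PI strictly wider. A sketch with simulated data, assuming numpy and scipy:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(2)
n = 40
x = rng.uniform(0, 10, size=n)
y = 1.0 + 2.0 * x + rng.normal(scale=1.5, size=n)

X = np.column_stack([np.ones(n), x])
beta, *_ = np.linalg.lstsq(X, y, rcond=None)
resid = y - X @ beta
s2 = resid @ resid / (n - 2)            # sigma^2-hat with n-2 df in simple regression
sxx = ((x - x.mean()) ** 2).sum()
t = stats.t.ppf(0.975, df=n - 2)

x0 = np.linspace(0, 10, 11)
leverage = 1.0 / n + (x0 - x.mean()) ** 2 / sxx
ci_half = t * np.sqrt(s2 * leverage)          # CI for the mean response
pi_half = t * np.sqrt(s2 * (1.0 + leverage))  # PI for a new observation

print(bool(np.all(pi_half > ci_half)))  # → True
```

The comparison holds at every x0, not just on average, because the extra "+1" term inside the square root is always positive.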
19
Q

If one confidence interval in the pairwise comparison includes only positive values, we conclude that the difference in means is statistically significantly positive

A

True

20
Q

Conducting t-tests on each beta parameter in a multiple regression model is the best way for testing the overall significance of the model

A

False

21
Q

In the case of a multiple linear regression model containing 6 quantitative predicting variables and an intercept, the number of parameters to estimate is 7

A

False (the parameters are the 6 coefficients, the intercept, and the error variance sigma^2, for a total of 8)

22
Q

The regression coefficient corresponding to one predictor in multiple linear regression is interpreted in terms of the estimated expected change in the response variable when there is a change of one unit in the corresponding predicting variable holding all other predictors fixed

A

True

23
Q

The proportion of variability in the response variable that is explained by the predicting variables is called correlation

A

False (R^2)

24
Q

Predicting values of the response variable for values of the predictors that lie within the data range is known as extrapolation

A

False. It is extrapolation if the predictor values are outside of the known data range

25
Q

In multiple linear regression, we study the relationship between a single response variable and several predicting quantitative and/or qualitative variables

A

True

26
Q

The sampling distribution used for estimating confidence intervals for the regression coefficients is the normal distribution

A

False (confidence interval uses a t-dist)

27
Q

A partial f-test can be used to test whether a subset of regression coefficients are all equal to zero

A

True
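
The partial F-test of card 27 compares the residual sum of squares of the reduced model (subset coefficients set to zero) against the full model. A sketch on simulated data, assuming numpy and scipy; the predictor layout and coefficient values are illustrative only:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(3)
n = 100
X = rng.normal(size=(n, 4))                  # 4 predictors
# y depends strongly on predictors 2 and 3 (the subset we will test)
y = 1.0 + 0.5 * X[:, 0] + 3.0 * X[:, 2] + 3.0 * X[:, 3] + rng.normal(size=n)

def rss(design, y):
    """Residual sum of squares of a least-squares fit."""
    beta, *_ = np.linalg.lstsq(design, y, rcond=None)
    r = y - design @ beta
    return r @ r

full = np.column_stack([np.ones(n), X])            # intercept + all 4 predictors
reduced = np.column_stack([np.ones(n), X[:, :2]])  # drop predictors 2 and 3

q = 2                        # number of coefficients set to zero under H0
df_full = n - full.shape[1]  # n - p - 1
F = ((rss(reduced, y) - rss(full, y)) / q) / (rss(full, y) / df_full)
p = stats.f.sf(F, q, df_full)
print(F > 10, p < 0.01)
```

Since the dropped predictors carry real signal here, the test rejects the null that their coefficients are all zero.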

28
Q

Prediction is the only objective of multiple linear regression

A

False, estimation is also a goal

29
Q

The equation to find the estimated variance of the error terms of a multiple linear regression model with intercept can be obtained by summing up the squared residuals and dividing that by n-p, where n is the sample size and p is the number of predictors

A

False (it is n-p-1)
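
The n-p-1 divisor in card 29 can be checked by simulation: dividing the residual sum of squares by n-p-1 gives an unbiased estimator of the error variance, while dividing by n biases it downward. A sketch assuming numpy; the coefficient values are arbitrary:

```python
import numpy as np

rng = np.random.default_rng(4)
n, p, sigma2 = 30, 3, 4.0
mse_correct, mse_naive = [], []

for _ in range(4000):
    X = np.column_stack([np.ones(n), rng.normal(size=(n, p))])  # intercept + p predictors
    y = X @ np.array([1.0, 2.0, -1.0, 0.5]) + rng.normal(scale=np.sqrt(sigma2), size=n)
    resid = y - X @ np.linalg.lstsq(X, y, rcond=None)[0]
    rss = resid @ resid
    mse_correct.append(rss / (n - p - 1))  # divide by n - p - 1: unbiased
    mse_naive.append(rss / n)              # divide by n: biased downward

print(round(np.mean(mse_correct), 2), round(np.mean(mse_naive), 2))
```

Averaged over many simulations, the n-p-1 version centers on the true variance of 4.0, while the naive version undershoots it.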

30
Q

For a given predicting variable, the estimated coefficient of regression associated with it will likely be different in a model with other predicting variables or in the model with only the predicting variable alone

A

True

31
Q

Observational studies allow us to make causal inference

A

False

32
Q

In the case of multiple linear regression, controlling variables are used to control for sample bias

A

True

33
Q

In the case of a multiple regression model with 10 predictors, the error term variance estimator follows a chi-squared distribution with n-10 degrees of freedom

A

False (it is n-10-1)

34
Q

The estimated coefficients obtained by using the method of least squares are unbiased estimators of the true coefficients

A

True

35
Q

Before making statistical inference on regression coefficients, estimation of the variance of the error terms is necessary

A

True

36
Q

An example of a multiple regression model is Analysis of Variance (ANOVA)

A

True

37
Q

Given a qualitative predicting variable with 7 categories in a linear regression model with intercept, 7 dummy variables need to be included in the model

A

False
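
Cards 37, 48, and 57 all hinge on the same counting rule: with an intercept, a categorical predictor with k levels gets k - 1 dummy variables (one level serves as baseline); without an intercept, all k dummies can be included. A sketch building the design matrix by hand, assuming numpy; the level names are hypothetical:

```python
import numpy as np

# Hypothetical categorical predictor with k = 7 levels
levels = ["A", "B", "C", "D", "E", "F", "G"]
rng = np.random.default_rng(5)
cats = rng.choice(levels, size=20)

# With an intercept, one level (here "A") is the baseline and gets no
# dummy, so only k - 1 = 6 dummy columns are added.
dummies = np.column_stack([(cats == lvl).astype(float) for lvl in levels[1:]])
design = np.column_stack([np.ones(len(cats)), dummies])

print(design.shape)  # 20 rows, 1 intercept column + 6 dummy columns
```

Including all 7 dummies alongside the intercept would make the columns linearly dependent (they would sum to the intercept column), which is why card 37's statement is false.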

38
Q

It is good practice to create a multiple linear regression model using a linearly dependent set of predictor variables

A

False

39
Q

If the confidence interval for a regression coefficient contains the value zero, we interpret that regression coefficient is definitely equal to zero

A

False

40
Q

The larger the coefficient of determination (r-squared), the higher the variability explained by the simple linear regression model

A

True

41
Q

The estimators of the error term variance and of the regression coefficients are random variables

A

True

42
Q

The one-way ANOVA is a linear regression model with one qualitative predicting variable

A

True

43
Q

We can assess the assumption of constant-variance in multiple linear regression by plotting the standardized residuals against fitted values

A

True

44
Q

If one confidence interval in the pairwise comparison includes zero under ANOVA, we conclude that the two corresponding means are plausibly equal

A

True

45
Q

We do not need to assume independence between data points for making inference on the regression coefficients

A

False

46
Q

Assuming the model is a good fit, the residuals in simple linear regression have constant variance

A

True

47
Q

We cannot estimate a multiple linear regression model if the predicting variables are linearly independent

A

False

48
Q

If a predicting variable is categorical with 5 categories in a linear regression model without intercept, we will include 5 dummy variables in the model

A

True

49
Q

In ANOVA, the number of degrees of freedom of the chi-squared distribution for the variance estimator is N-k-1 where k is the number of groups

A

False (it is N-k)
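
The N-k degrees of freedom in card 49 can be checked by simulation: the pooled within-group estimator SSE/(N-k) is unbiased for the common error variance. A sketch assuming numpy; the group means and sizes are arbitrary:

```python
import numpy as np

rng = np.random.default_rng(6)
k, n_per, sigma2 = 3, 50, 1.0
N = k * n_per
means = np.array([0.0, 1.0, 2.0])

mse_vals = []
for _ in range(2000):
    groups = [rng.normal(loc=m, scale=np.sqrt(sigma2), size=n_per) for m in means]
    # Within-group (error) sum of squares, pooled over the k groups
    sse = sum(((g - g.mean()) ** 2).sum() for g in groups)
    mse_vals.append(sse / (N - k))  # divide by N - k degrees of freedom

print(round(np.mean(mse_vals), 2))
```

Estimating one mean per group costs k degrees of freedom out of N, which is where N - k comes from.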

50
Q

The only assumptions for a simple linear regression model are linearity, constant variance, and normality

A

False

51
Q

In simple linear regression, the confidence interval of the response increases as the distance between the predictor values and the mean value of the predictors decreases

A

False

52
Q

The sampling distribution of the estimated variance of the error terms of a multiple linear regression model with k predictors and an intercept is a t-distribution with n-k-1 degrees of freedom

A

False (it is a chi-squared distribution)

53
Q

The assumption of normality in simple linear regression is required for the derivation of confidence intervals, prediction intervals, and hypothesis testing

A

True

54
Q

Outliers will always have a significant influence on the estimated slope in simple linear regression

A

False

55
Q

In simple linear regression we can assess whether the errors in the model are correlated using the plot of residuals vs. fitted values

A

True

56
Q

In the ANOVA test for equal means, the alternative hypothesis is that all means are not equal

A

Ambiguous as written. The correct alternative hypothesis is that not all means are equal (at least one mean differs); if the statement is read as requiring every pair of means to differ, it is False

57
Q

If a predicting variable is a qualitative variable with three categories in a linear regression model with intercept, we should include all three dummy variables

A

False

58
Q

The estimation of the mean response has a wider confidence interval than for the prediction of a new response

A

False

59
Q

The assumption of normality is required in ANOVA in order to assess the null hypothesis of equal means

A

True

60
Q

The Box-Cox transformation is applied to the predicting variables if the normality assumption does not hold

A

False (it is applied to the response variable)
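
Card 60's point can be illustrated with scipy's Box-Cox implementation, which transforms the (strictly positive) response and chooses the power parameter lambda by maximum likelihood. A sketch on simulated skewed data, assuming numpy and scipy:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(7)
x = rng.uniform(1, 10, size=100)
# Skewed, strictly positive response (Box-Cox requires y > 0)
y = np.exp(0.5 + 0.3 * x + rng.normal(scale=0.4, size=100))

# The transformation is applied to the RESPONSE, not the predictors
y_transformed, lam = stats.boxcox(y)

print(y_transformed.shape, round(lam, 2))
```

Because y here is the exponential of a linear signal plus noise, the estimated lambda lands near zero, i.e., close to a log transformation.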

61
Q

If the confidence interval of a regression coefficient does not include zero, we interpret the regression coefficient to be statistically significant

A

True

62
Q

A nonlinear relationship between the response variable and a predicting variable cannot be modeled using regression

A

False

63
Q

R-squared is the best measure to check if a linear regression model is a good fit to the data

A

False (goodness of fit should be judged with residual diagnostics of all the model assumptions, not by R-squared alone)

64
Q

The F-test in ANOVA compares the between variability versus the within variability

A

True
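
The between-versus-within comparison in card 64 can be computed by hand and checked against scipy's one-way ANOVA: F = MSTr/MSE, the between-group mean square over the within-group mean square. A sketch on simulated groups, assuming numpy and scipy:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(8)
groups = [rng.normal(loc=m, scale=1.0, size=25) for m in (0.0, 0.5, 1.5)]
k = len(groups)
N = sum(len(g) for g in groups)
grand = np.concatenate(groups).mean()

# Between-group mean square (treatment) and within-group mean square (error)
mstr = sum(len(g) * (g.mean() - grand) ** 2 for g in groups) / (k - 1)
mse = sum(((g - g.mean()) ** 2).sum() for g in groups) / (N - k)
F_manual = mstr / mse

F_scipy, p = stats.f_oneway(*groups)
print(np.isclose(F_manual, F_scipy))  # → True
```

A large F means the spread between the group means dwarfs the spread within groups, which is the evidence against equal means.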

65
Q

In ANOVA testing, the variance of the response variable is different for each sub-population

A

False

66
Q

If one or more of the regression assumptions does not hold, then the model does not fit the data well, thus it is not useful in modeling the response

A

False