Midterm Flashcards
SSTO
Total Sum of Squares
SSTO=SSE+SSR
SSTO=Observed-Mean
Degrees of freedom = n-1
Distance from observed point to mean
SSTO=Σ(Yi-Ȳ)²
SSTO=Y’(I-(1/n)J)Y (quadratic form)
SSTO=Y’Y-(1/n)Y’JY
SSE
Error Sum of Squares
SSE=Observed - Predicted
Degrees of freedom = n-p
Simple linear regression: degrees of freedom = n-2
Distance from observed point to predicted value
SSE=Σ(Yi-Ŷi)²
SSE=Y’Y-Y’HY
=Y’(I-H)Y
SSR
Regression Sum of Squares
SSR=Predicted-mean
Degrees of freedom = p-1
Simple linear regression: degrees of freedom = 1
Distance from predicted value to mean
SSR=Σ(Ŷi-Ȳ)²
SSR=Y’HY-(1/n)Y’JY
=Y’(H-(1/n)J)Y
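A minimal NumPy sketch (not from the course materials) that checks the quadratic-form identities above on a small made-up dataset; the response vector Y and design matrix X below are illustrative assumptions.

```python
import numpy as np

Y = np.array([3.1, 4.0, 5.2, 6.1, 7.3])              # hypothetical response vector
X = np.column_stack([np.ones(5), [1, 2, 3, 4, 5]])   # design matrix with intercept column

n = len(Y)
I = np.eye(n)
J = np.ones((n, n))                                  # n x n matrix of ones
H = X @ np.linalg.inv(X.T @ X) @ X.T                 # hat matrix

SSTO = Y @ (I - J / n) @ Y                           # Y'(I - (1/n)J)Y
SSE  = Y @ (I - H) @ Y                               # Y'(I - H)Y
SSR  = Y @ (H - J / n) @ Y                           # Y'(H - (1/n)J)Y

assert np.isclose(SSTO, SSE + SSR)                   # SSTO = SSE + SSR
print(SSTO, SSE, SSR)
```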
What are the general assumptions of the linear model? How do you assess them? (A code sketch follows the list below.)
- Linearity: The response and covariates are related in a linear way
- Assess with scatter plots of Y vs X and residual plots
- Normality: Error terms (Ei) or responses Yi are normally distributed
- Assess with a QQ plot and Shapiro-Wilk test of the residuals or semistudentized residuals
- Assess with a histogram of the residuals
- E.g., a QQ plot that trends away from the diagonal line at the tails suggests non-normality
- Constant Variance: Variance of the errors is constant, var(Ei)=σ²
- residuals (r)
- absolute value of residuals
- residuals squared (r²)
- semistudentized residuals
- semistudentized residuals squared
- absolute value of semistudentized residuals
- Assess with scatter plots (semistudentized residuals vs Ŷ, or squared residuals vs predictors); a megaphone shape implies non-constant variance
- Breusch-Pagan test for heteroscedasticity (H0: constant variance, Ha: non-constant variance)
- Independence: Subjects (errors or responses) are independent
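A minimal Python sketch of these checks using statsmodels and scipy; the simulated age-like predictor, the response, and the model below are assumptions made only for illustration.

```python
import numpy as np
import statsmodels.api as sm
from scipy import stats

rng = np.random.default_rng(0)
x = rng.uniform(20, 60, 200)                         # hypothetical age-like predictor
y = 0.01 - 0.0001 * x + rng.normal(0, 0.0005, 200)   # hypothetical inverse-BP-like response

X = sm.add_constant(x)
fit = sm.OLS(y, X).fit()

resid = fit.resid
semistud = resid / np.sqrt(fit.mse_resid)            # semistudentized residuals

# Linearity / constant variance: plot resid (or semistud) vs fit.fittedvalues and vs x;
# look for an even band around 0 with no megaphone shape.
# Normality: QQ plot plus the Shapiro-Wilk test of the residuals.
W, p_value = stats.shapiro(resid)
print("Shapiro-Wilk p-value:", p_value)              # large p => no evidence against normality
```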
MSR
SSR/(p-1)
Variation that is explained by the fitted regression line
MSE
SSE/(n-p)
MSE is an estimate of σ²
√MSE is an estimate of σ
Variation NOT explained by the fitted regression line
F statistic
F=MSR/MSE
If all slope coefficients are 0, F ≈ 1; if at least one β ≠ 0, F tends to be > 1
R2
What does R² = 0.1885 mean?
What does adjusted R² = 0.1817 mean?
Is adjusted R² better than R²?
R² = SSR/SSTO
Coefficient of determination
Proportion of the variance of Y explained (linearly) by the variation in X
√R², with the appropriate sign, equals the correlation coefficient (r) between X and Y (only in simple linear regression)
R² = 0.1885 means that about 19% of the variation in inverse BP is explained by the predictors in the model
Adjusted R² = 0.1817 means that about 18% of the variation in inverse BP is explained by the predictors, after adjusting for the number of variables in the model
Adjusted R² is better because it accounts for the cost of adding more variables (see the computation sketch below)
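A small sketch showing how R² and adjusted R² are computed from the sums of squares; the SSE, SSTO, n, and p values are hypothetical numbers chosen so R² comes out to 0.1885 (the adjusted value is only close to the 0.1817 in the example).

```python
# Hypothetical ANOVA quantities (illustrative assumptions)
SSE, SSTO = 81.15, 100.0      # error and total sums of squares
SSR = SSTO - SSE
n, p = 340, 4                 # n observations, p parameters (intercept + 3 predictors)

r2 = SSR / SSTO                                      # R^2 = SSR/SSTO
r2_adj = 1 - (SSE / (n - p)) / (SSTO / (n - 1))      # penalizes extra predictors

print(round(r2, 4), round(r2_adj, 4))                # 0.1885 0.1813
```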
Diagnostics using Raw Residuals
- Linearity
- If linear, the residuals form an even band around 0, symmetric about zero, in plots of residuals vs predicted values and vs each X in the model
- A systematic pattern means non-linearity, or that an important predictor was omitted
- Normality of the error terms
- QQ plot of the residuals, and Shapiro-Wilk test
- Constant variance
- Look for megaphone in residuals
- increasing/decreasing trend in absolute value of residuals or squared residuals versus the predicted values, or versus any X.
- Can also be seen on scatter plot of Y vs each X.
Limitations of R²
- A high R² does not necessarily imply useful prediction (better to look at the width of the prediction interval)
- A high R² does not necessarily mean a good fit
- R² close to 0 does not mean there is no relationship; it only indicates a weak linear relationship
Box Cox Transformation
Method to find a useful transformation
Chooses the best λ from the data
Modeling Strategy
1. Look at the data
2. Fit a preliminary model
3. Perform diagnostics (normality, constant variance, linearity)
4. Fix problems, e.g., with a Box-Cox transformation
5. Fit the new model
6. Diagnose the new model
7. Repeat steps 4-6 until all problems are fixed
8. Final inference (p-values, CIs, etc.)
Boxplot
A boxplot is used to describe the symmetry of the data; you cannot directly conclude normality from a boxplot
Collinearity is tested by
Correlation Matrix
Outliers can be detected by
Plotting the semistudentized residuals against X or the predicted Y
Omission of important predictors can be tested by
Plotting residuals against omitted variables
Non-linearity can be fixed by
Transformation
- Transform X if the error terms are normally distributed and have constant variance
- Transform Y when there is unequal error variance and non-normality of the error terms
- Box Cox Transformation
Non-constant error variance can be fixed by
Variance stabilizing transformation
Weighted least squares
Non-independence of the error terms can be fixed by
Adding a time covariate to the model
Omission of important predictors can be fixed by
Adding them
Transformation
- Can linearize non-linear relationships
- Can stabilize non-constant variance
- Can reduce non-normality
Interpret: 1/SBP=0.00985-0.0000416*AGE
For every 1-year increase in age, we expect the mean inverse SBP to decrease by 0.0000416 mmHg^(-1) (holding any other covariates in the model constant)
What is the Null Hypothesis and Alternative Hypothesis for F test for multiple variable.
Decision Rule
F critical value
F test equation
H0: β1 = β2 = ... = βp-1 = 0
Ha: Not all βs are equal to 0
Decision rule:
If F* ≤ F(1-α; p-1, n-p), conclude H0
If F* > F(1-α; p-1, n-p), conclude Ha
F critical value = F(1-α; p-1, n-p)
F test equation: F* = MSR/MSE (see the sketch below)
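A short scipy sketch of the decision rule; the MSR, MSE, n, and p values are hypothetical.

```python
from scipy import stats

n, p, alpha = 340, 4, 0.05                      # illustrative sample size and parameter count
MSR, MSE = 5.1e-07, 1.96e-08                    # hypothetical mean squares

F_star = MSR / MSE                              # F* = MSR/MSE
F_crit = stats.f.ppf(1 - alpha, p - 1, n - p)   # F(0.95; 3, 336), roughly 2.63
p_value = stats.f.sf(F_star, p - 1, n - p)

print(F_star > F_crit, p_value)                 # True => conclude Ha: not all betas are 0
```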
What is null hypothesis for the Spearman correlation?
H0: The Spearman rank correlation between the two variables is 0 (no monotonic association)
Degree of freedom
df(SSTO) = df(SSR) + df(SSE)
n-1 = (p-1) + (n-p)
Interpret Breusch-Pagan test results
P=0.3
Null: Constant Variance
Alter: Non-constant Variance
E.g., for Age with a p-value of 0.3, we fail to reject the null hypothesis at the 0.05 level and conclude there is NOT enough evidence of non-constant variance
E.g., even though BMI has a borderline significant p-value, the overall test suggests constant variance, so no remedial action is necessary at this point
How to test for outliers?
You can check for outliers by plotting the semi-studentized residuals against X or the predicted values
Any observation with a semi-studentized residual beyond about ±4 suggests an outlier (see the sketch below)
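A tiny NumPy sketch of the ±4 rule; the residuals and MSE below are made up.

```python
import numpy as np

resid = np.array([0.4, -0.2, 2.9, -0.1, 0.3])   # hypothetical raw residuals
MSE = 0.49                                      # hypothetical mean squared error

semistud = resid / np.sqrt(MSE)                 # semi-studentized residuals: e_i / sqrt(MSE)
print(np.where(np.abs(semistud) > 4)[0])        # index 2 is flagged (2.9/0.7 is about 4.1)
```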
Write the equation. Dependent variable: SYSBP^(-1). Intercept = 0.01276. Independent variables: BMI = -0.00007775, AGE = -0.00003659, TOTCHOL = -0.000001463
SYSBP^(-1)=0.01276-0.00007775(BMI)-0.00003659(AGE)-0.000001463(TOTCHOL)
Interpret Shapiro-Wilk test results
p=0.624
Null: Residuals are normal
Alt: Residuals are NOT normal
At the 0.05 level, we fail to reject the null hypothesis and conclude there is NOT enough evidence that the residuals are not normal
What is the decision rule for F(0.95; 3, 336) = 2.631?
Decision Rule
If F* ≤ 2.631, conclude H0
If F* > 2.631, conclude Ha (one predictor: there is a linear relationship; multiple predictors: at least one of the coefficients is non-zero)
Does β0 have an interpretation in this model? Why or why not (no more than 2 sentence answer).
No. A BMI, age, or total cholesterol of 0 does not make clinical sense and is biologically impossible.
The range of the data does not include 0 for any of the independent variables, so we cannot draw inference there.
How to report the significance of EACH coefficient for each predictor in the model?
- State statistic
- State hypotheses
- Interpret the results (e.g., Age: test statistic -5.60, p<0.0001)
- Statistic: t test
- Hypotheses:
H0: βk = 0 (the parameter is zero)
Ha: βk ≠ 0 (the parameter is NOT zero)
- We reject the null hypothesis and conclude that the coefficient of Age is significantly different from 0
Interpret BMI coefficient
SYSBP^(-1)=0.01276-0.00007775(BMI)-0.00003659(AGE)-0.000001463(TOTCHOL)
For every 1-unit increase in BMI, the average inverse systolic blood pressure decreases by 0.00007775 mmHg^(-1), holding age and total serum cholesterol constant.
True or False
Boxplot is designed to assess Normality
False:
A boxplot is designed for examining quantiles and describing the symmetry of the data, so it does not reveal the full shape of the distribution.
Interpret Age coefficient
SYSBP^(-1)=0.01276-0.00007775(BMI)-0.00003659(AGE)-0.000001463(TOTCHOL)
For every 1-year increase in age, the average inverse systolic blood pressure decreases by 0.00003659 mmHg^(-1), holding BMI and total serum cholesterol constant.
Interpret TOTCHOL coefficient
SYSBP^(-1)=0.01276-0.00007775(BMI)-0.00003659(AGE)-0.000001463(TOTCHOL)
For every 1-unit increase in total serum cholesterol, the average inverse systolic blood pressure decreases by 0.000001463 mmHg^(-1), holding BMI and age constant.
Interpret the 90% confidence interval for BMI: (-0.00010325, -0.00005225)
We are 90% confident that the true coefficient of BMI falls between -0.00010325 and -0.00005225
Interpret the 95% confidence interval for AGE: (-0.00004738, -0.00002581)
We are 95% confident that the true coefficient of Age falls between -0.00004738 and -0.00002581
Predict the average expected inverse systolic blood pressure for a person with BMI 25, age 54, and total cholesterol 200, with a 95% confidence interval, and interpret the results.
SYSBP^(-1)=0.01276-0.00007775(BMI)-0.00003659(AGE)-0.000001463(TOTCHOL)
95% confidence interval
Ŷh ± t(1-α/2; n-p)·s{Ŷh}
s²{Ŷh} = MSE·Xh'(X'X)^(-1)Xh = Xh'·s²{b}·Xh
SYSBP^(-1)=0.01276-0.00007775(25)-0.00003659(54)-0.000001463(200)
=0.007914
The predicted average expected INVERSE systolic blood pressure is 0.007914, with a 95% confidence interval of (0.007362, 0.008092) mmHg^(-1)
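A NumPy/scipy sketch of the matrix formula for the confidence interval of the mean response at Xh; the simulated design matrix, coefficients, and error spread are assumptions, so the printed interval will not reproduce the flashcard's numbers.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)
n, p = 340, 4
X = np.column_stack([np.ones(n),
                     rng.normal(26, 4, n),       # BMI-like column (illustrative)
                     rng.normal(50, 9, n),       # age-like column
                     rng.normal(235, 40, n)])    # cholesterol-like column
beta = np.array([0.01276, -0.00007775, -0.00003659, -0.000001463])
Y = X @ beta + rng.normal(0, 0.001, n)           # simulated inverse SBP

XtX_inv = np.linalg.inv(X.T @ X)
b = XtX_inv @ X.T @ Y                            # b = (X'X)^(-1) X'Y
MSE = (Y - X @ b) @ (Y - X @ b) / (n - p)

Xh = np.array([1, 25, 54, 200])                  # BMI 25, age 54, cholesterol 200
Yhat_h = Xh @ b
se = np.sqrt(MSE * (Xh @ XtX_inv @ Xh))          # s{Yhat_h}
t_crit = stats.t.ppf(0.975, n - p)               # t(1 - alpha/2; n - p)
print(Yhat_h - t_crit * se, Yhat_h + t_crit * se)
```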
Steps in diagnostics
1. Visualize your data (scatter plots, histograms)
2. Examine the residuals (residual plots and diagnostic tests)
ei
ei = Yi - Ŷi
ei is the best estimate of the error term Ei
Least square method
Find the line through the data that has the smallest sum of squared vertical distances from each observed point to the line
Semi-studentized residual
residual (ei) / √MSE
MSE is taken from the ANOVA table
MSE is a good estimator of ?
σ²
What are the 6 problems to diagnose?
- Non-linear regression function
- Non-constant variance of the error terms (Ei)
- Non-independence of error terms (Ei)
- Identify outliers
- Non-normality of errors (Ei)
- Omission of other important predictors from the model
Histogram
Check normal distribution
BUT the sample size can affect its appearance
qqplot
Plots the sample quantiles against theoretical normal quantiles; if the points lie along the diagonal line, the residuals are approximately normally distributed
Linearity
Check via scatterplot
Plot residual vs predicted
or
Plot residual vs xi
Which one is better to plot?
Plot residual vs Xi
or
Plot residual vs Y^ (predicted value)
Plotting residuals vs Ŷ (predicted values) is preferred for multiple regression
Residuals can be used to check what?
Non-constant variance
- A megaphone shape suggests non-constant variance; no pattern suggests constant variance
Outliers: plot semi-studentized residuals vs predicted Ŷi (outliers are beyond +4 or -4)
Why is it better to check normality last?
It can be affected by many things such as non-constant variance
Non-constant variance of the error terms (Ei)
ei vs X
ei² vs X (it magnifies the relationship and is preferred)
|ei| vs X
Non-independence of error terms
ei vs time or order of data collection (sequence plot)
Identify outliers
semi-stud residuals vs predicted
Breusch-Pagan test (example below)
- Large-sample test
- Assumes the error terms Ei are independent
- Assumes the error terms Ei are normally distributed
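A minimal statsmodels sketch of running the Breusch-Pagan test; the simulated data are an assumption and are generated with constant variance, so a large p-value is expected.

```python
import numpy as np
import statsmodels.api as sm
from statsmodels.stats.diagnostic import het_breuschpagan

rng = np.random.default_rng(2)
x = rng.uniform(0, 10, 200)
y = 1.0 + 0.5 * x + rng.normal(0, 1, 200)        # homoscedastic errors (illustrative)

X = sm.add_constant(x)
resid = sm.OLS(y, X).fit().resid

lm_stat, lm_pvalue, f_stat, f_pvalue = het_breuschpagan(resid, X)
print(lm_pvalue)                                 # large p => no evidence of non-constant variance
```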
When you identify non-linearity, then?
- Add non-linear terms (like X²)
- Use a transformation of Y and/or X
When you identify non-constancy of the error variance, then?
- Variance stabilizing transformation
- Weighted least squares
When you find non-independence of the error terms, then?
- Add a time covariate to the model
When you find that important predictors are omitted, then?
- Add them
When you identify outliers, then?
- Check if they are errors in the data and correct them
- Robust linear regression methods
Transformation can:
- Can linearize non-linear relationships
- Can stabilize non-constant variance
- Can reduce non-normality
Transform X or Y?
- Start by transforming Y
- If you see a non-linear trend (quadratic, logarithmic) in the scatter plot, add terms like X² or log(X) to the model from the start
Box Cox Transformation
Used to find best transformation from the family of power transformations on Y
Chooses the best λ (lambda) from the data (see the sketch below)
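A short scipy sketch of letting Box-Cox pick λ by maximum likelihood; the simulated skewed, positive outcome is an assumption.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(3)
y = rng.lognormal(mean=4.8, sigma=0.2, size=300)   # hypothetical skewed, positive SBP-like data

y_transformed, lam = stats.boxcox(y)               # lambda estimated from the data
print(round(lam, 2))                               # near 0 suggests log(Y); near -1 suggests 1/Y
```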
LOWESS
LOWESS (locally weighted regression scatterplot smoother) is a non-parametric regression curve that can be used to check linearity.
- If the LOWESS curve stays inside the confidence bands of the fitted line, linearity is satisfied
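A brief statsmodels sketch of computing a LOWESS curve to overlay on the scatter plot; the simulated data and the frac smoothing fraction are assumptions.

```python
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(4)
x = rng.uniform(0, 10, 150)
y = 2 + 0.8 * x + rng.normal(0, 1, 150)            # roughly linear relationship

smoothed = sm.nonparametric.lowess(y, x, frac=0.5) # columns: sorted x, smoothed y
print(smoothed[:3])
# Overlay `smoothed` on the scatter plot of Y vs X (or on residuals vs fitted);
# a curve that stays close to the fitted straight line supports linearity.
```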
Smoothing technique, Non-parametric regression curves can help:
-Fit the data with a smooth curve
-Does not assume the shape of the curve
Write the Linear, First order regression model
Yi = β0 + β1Xi1 + ... + βp-1Xi,p-1 + Ei
What is the Matrix format?
Y = Xβ + E
E(Y) = Xβ
Var(Y) = σ²I
What is the most mathematically right way to write model?
E(Y) = Xβ
How to write the coefficient estimates in matrix form
b = (X’X)^(-1)X’Y
How to write the variance estimate in matrix form
var(b) = σ²(X’X)^(-1)
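A small NumPy sketch of these two matrix formulas, with MSE standing in for σ²; the simulated X, Y, and true coefficients are assumptions.

```python
import numpy as np

rng = np.random.default_rng(5)
n = 100
X = np.column_stack([np.ones(n), rng.normal(size=n), rng.normal(size=n)])
Y = X @ np.array([1.0, 2.0, -0.5]) + rng.normal(0, 0.3, n)

XtX_inv = np.linalg.inv(X.T @ X)
b = XtX_inv @ X.T @ Y                        # b = (X'X)^(-1) X'Y
MSE = (Y - X @ b) @ (Y - X @ b) / (n - X.shape[1])
var_b = MSE * XtX_inv                        # estimated var-cov matrix of b
print(b)                                     # coefficient estimates
print(np.sqrt(np.diag(var_b)))               # their standard errors
```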
Short Multivariate Normal Density Function
Y ~ MVN(μ, Σ)
Y = β0 + β1X1 + β2X2 + E
- How many parameters are in this equation?
- β0?
- β1?
- β2?
- 3 parameters
- β0 is only meaningful if the range of X1 and X2 includes 0; it is the mean response, E(Y), at X1=0 and X2=0
- β1 is the change in the mean response, E(Y), per unit increase in X1 while holding X2 constant
- β2 is the change in the mean response, E(Y), per unit increase in X2 while holding X1 constant
Root MSE
Estimate for σ