Statistic Variables Flashcards

Question 1

Q

What is the formula of RSS and what is its purpose?

Answer

A

RSS: Residual Sum of Squares
Formula: sum of (yi - yi_estimated)**2 with i=1..n
Purpose: amount of variability that is left unexplained after performing the fit
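
As a minimal sketch of the formula above, the helper below (the name `rss` and the toy numbers are illustrative assumptions, not part of the card) sums the squared residuals between observed and fitted values:

```python
def rss(y, y_estimated):
    """Residual sum of squares: sum over i of (yi - yi_estimated)**2."""
    return sum((yi - yi_hat) ** 2 for yi, yi_hat in zip(y, y_estimated))


# Made-up observed and fitted values, for illustration only.
y = [1.0, 2.0, 3.0]
y_estimated = [1.5, 2.0, 2.0]

print(rss(y, y_estimated))  # 0.25 + 0.0 + 1.0 = 1.25
```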

Question 2

Q

What is the standard error of an estimated variable?

Answer

A

Roughly, it is the average amount by which the estimate differs from the true value of the quantity being estimated
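
One common concrete case is the standard error of a sample mean, which can be estimated as s / sqrt(n). The sketch below assumes that case; the function name and the sample values are made up for illustration:

```python
import math
import statistics


def standard_error_of_mean(sample):
    """Estimate the standard error of the sample mean as s / sqrt(n)."""
    return statistics.stdev(sample) / math.sqrt(len(sample))


# Made-up observations, for illustration only.
sample = [2, 4, 4, 4, 5, 5, 7, 9]

print(standard_error_of_mean(sample))
```

With more observations the standard error shrinks, since the estimate averages over more data.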

Question 3

Q

What is a 95% confidence interval for estimated variable µ? (formula and meaning)

Answer

A

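Formula: approximately [µ_estimated - 2SE(µ_estimated) ; µ_estimated + 2SE(µ_estimated)]
Meaning: if the sampling were repeated many times, about 95% of the intervals built this way would contain the true value of µ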
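
The meaning of the interval can be checked with a small simulation. The sketch below assumes normally distributed data and a sample mean as the estimate; the function names, the true mean, and the other parameters are arbitrary illustrative choices. It builds many intervals of the form estimate ± 2·SE and counts how often they contain the true mean:

```python
import math
import random
import statistics


def interval_95(sample):
    """Approximate 95% interval for the mean: estimate +/- 2 standard errors."""
    estimate = statistics.mean(sample)
    se = statistics.stdev(sample) / math.sqrt(len(sample))
    return estimate - 2 * se, estimate + 2 * se


def coverage(true_mean=10.0, sigma=3.0, n=50, trials=2000, seed=0):
    """Fraction of simulated intervals that contain the true mean."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(trials):
        sample = [rng.gauss(true_mean, sigma) for _ in range(n)]
        low, high = interval_95(sample)
        if low <= true_mean <= high:
            hits += 1
    return hits / trials


print(coverage())  # expected to land near 0.95
```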

Question 4

Q

What is the t-statistic of an estimated variable µ? (formula and meaning)

Answer

A

Formula: t_statistic = (µ - 0)/SE(µ)
Meaning: number of standard errors that the estimate µ is away from 0
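
As an illustration, the sketch below computes the t-statistic for the slope of a simple linear regression. It assumes the usual least squares formulas, with SE(slope) = RSE / sqrt(sum of (xi - x_mean)**2); the function name and the data are made up:

```python
import math


def slope_t_statistic(x, y):
    """Return (slope, SE(slope), t) for a least squares line y = b0 + b1*x."""
    n = len(x)
    x_mean = sum(x) / n
    y_mean = sum(y) / n
    sxx = sum((xi - x_mean) ** 2 for xi in x)
    sxy = sum((xi - x_mean) * (yi - y_mean) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = y_mean - slope * x_mean
    # Residual sum of squares of the fitted line.
    rss = sum((yi - (intercept + slope * xi)) ** 2 for xi, yi in zip(x, y))
    # Standard error of the slope: RSE / sqrt(Sxx), with RSE = sqrt(RSS/(n-2)).
    se_slope = math.sqrt(rss / (n - 2)) / math.sqrt(sxx)
    return slope, se_slope, (slope - 0) / se_slope


# Made-up data, for illustration only.
x = [1, 2, 3, 4, 5]
y = [2.1, 3.9, 6.2, 7.8, 10.1]

slope, se_slope, t = slope_t_statistic(x, y)
print(slope, se_slope, t)
```

A large absolute value of t means the estimate is many standard errors away from 0, which corresponds to a small p-value.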

Question 5

Q

What does a small p-value indicate? What is a small enough p-value?

Answer

A

A small p-value indicates that, in the absence of any real association between a predictor and a response, it is unlikely to observe such a substantial association due to chance.
A p-value under 5% usually justifies the rejection of the null hypothesis.

Question 6

Q

What is the RSE? (formula and meaning)

Answer

A

RSE: Residual Standard Error
Formula (simple linear regression): sqrt(sum over i=1..n of (yi-yi_estimated)**2/(n-2)) = sqrt(RSS/(n-2)); with p predictors the denominator is n-p-1
Meaning: average amount that the response will deviate from the true regression line
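
A minimal sketch of the simple linear regression formula, sqrt(RSS/(n-2)); the function name and the toy values are illustrative assumptions:

```python
import math


def rse(y, y_estimated):
    """Residual standard error for simple linear regression: sqrt(RSS/(n-2))."""
    n = len(y)
    rss = sum((yi - yi_hat) ** 2 for yi, yi_hat in zip(y, y_estimated))
    return math.sqrt(rss / (n - 2))


# Made-up observed and fitted values, for illustration only.
y = [1.0, 2.0, 3.0, 4.0]
y_estimated = [1.5, 2.0, 3.0, 3.5]

print(rse(y, y_estimated))  # RSS = 0.5, n - 2 = 2, so sqrt(0.25) = 0.5
```

The RSE is expressed in the units of the response, so whether a given value is acceptable depends on the scale of y.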

Question 7

Q

What is the R**2-statistic? (formula and meaning)

In a simple linear regression setting, what is an equivalent?

Answer

A

Formula: 1 - RSS/TSS
Meaning: the proportion of variance explained
Equivalent: R**2 = r**2 = Cor(X,Y)**2
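
The equivalence can be checked numerically. The sketch below fits a least squares line to made-up data, computes 1 - RSS/TSS, and compares it with the squared correlation between X and Y; all names and values are illustrative assumptions:

```python
import math


def r_squared(x, y):
    """R**2 = 1 - RSS/TSS for a simple least squares fit of y on x."""
    n = len(x)
    x_mean = sum(x) / n
    y_mean = sum(y) / n
    sxx = sum((xi - x_mean) ** 2 for xi in x)
    sxy = sum((xi - x_mean) * (yi - y_mean) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = y_mean - slope * x_mean
    rss = sum((yi - (intercept + slope * xi)) ** 2 for xi, yi in zip(x, y))
    tss = sum((yi - y_mean) ** 2 for yi in y)
    return 1 - rss / tss


def correlation(x, y):
    """Sample correlation Cor(X, Y)."""
    n = len(x)
    x_mean = sum(x) / n
    y_mean = sum(y) / n
    sxy = sum((xi - x_mean) * (yi - y_mean) for xi, yi in zip(x, y))
    sxx = sum((xi - x_mean) ** 2 for xi in x)
    syy = sum((yi - y_mean) ** 2 for yi in y)
    return sxy / math.sqrt(sxx * syy)


# Made-up data, for illustration only.
x = [1, 2, 3, 4, 5]
y = [2.1, 3.9, 6.2, 7.8, 10.1]

print(r_squared(x, y), correlation(x, y) ** 2)  # the two values agree
```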

Question 8

Q

What is the TSS? (formula and meaning)

Answer

A

TSS: Total Sum of Squares
Formula: sum of (yi - y_mean)**2 with i=1..n
Meaning: the amount of variability inherent in the response before the regression is performed
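
A minimal sketch of the formula; the function name and values are made up for illustration:

```python
def tss(y):
    """Total sum of squares: sum over i of (yi - y_mean)**2."""
    y_mean = sum(y) / len(y)
    return sum((yi - y_mean) ** 2 for yi in y)


# Made-up responses, for illustration only.
y = [1.0, 2.0, 3.0, 4.0]

print(tss(y))  # mean is 2.5, so 2.25 + 0.25 + 0.25 + 2.25 = 5.0
```

TSS does not depend on any fitted model; it is what the RSS would be if every prediction were simply the mean of the response.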

Question 9

Q

What is the F-statistic? (formula and meaning)

What does the value indicate? What does the interpretation of the value depend upon?

Answer

A

Formula: ((TSS-RSS)/p) / (RSS/(n-p-1))
Meaning: indicates whether there is a relationship between the response and at least one of the predictors
Values:
- close to 1: no evidence of a relationship
- much greater than 1: evidence of a relationship
Interpretation: it depends on the values of n and p. If n is large, a value only slightly greater than 1 might still indicate a relationship; if n is small, a larger value is needed
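
A minimal sketch of the formula, taking TSS and RSS as already computed; the function name and the input numbers are illustrative assumptions:

```python
def f_statistic(tss, rss, n, p):
    """F = ((TSS - RSS) / p) / (RSS / (n - p - 1))."""
    return ((tss - rss) / p) / (rss / (n - p - 1))


# Made-up values: n = 4 observations, p = 1 predictor.
print(f_statistic(tss=5.0, rss=0.5, n=4, p=1))  # (4.5 / 1) / (0.5 / 2) = 18.0
```

When the fit explains almost nothing, TSS - RSS is small and F stays near or below 1; turning a given F value into a p-value requires the F distribution with p and n-p-1 degrees of freedom.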

Question 10

Q

Why do we look at F-statistic and not simply all p-values?

Answer

A

Because the number of predictors has an influence. With many predictors, about 5% of the individual p-values will fall below 5% by chance alone even when no predictor is related to the response, so we are likely to conclude incorrectly that a relationship exists. The F-statistic adjusts for the number of predictors.
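
To get a feel for the size of the effect, the sketch below computes the chance that at least one p-value falls under the threshold when no predictor is truly related to the response. It assumes, as a simplification not stated in the card, that the individual tests are independent; the function name is hypothetical:

```python
def prob_at_least_one_false_positive(p, alpha=0.05):
    """P(at least one of p independent null p-values is below alpha)."""
    return 1 - (1 - alpha) ** p


for p in (1, 10, 100):
    print(p, round(prob_at_least_one_false_positive(p), 3))
# 1 0.05
# 10 0.401
# 100 0.994
```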

(10 cards)