Lecture 6: p-values and confidence intervals Flashcards
What is a point estimate?
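A single value calculated from the sample (for example, a sample mean or a regression coefficient) that serves as the best guess for the corresponding population parameter.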
Left-skewed distribution
a distribution that has a concentration of data on the upper end and the tail on the left
Skewness depends on where the tail is (so tail on left is left skewness)
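A quick way to check the direction of skew is to compare the mean with the median and look at the sign of the skewness coefficient. The sketch below uses made-up scores (an assumption for illustration, not data from the lecture): in a left-skewed sample the long left tail pulls the mean below the median and the skewness is negative.

```python
import statistics

def skewness(values):
    """Population skewness: mean cubed deviation divided by the SD cubed."""
    n = len(values)
    mean = sum(values) / n
    sd = (sum((v - mean) ** 2 for v in values) / n) ** 0.5
    return sum((v - mean) ** 3 for v in values) / (n * sd ** 3)

# Hypothetical scores: most are high, one very low score forms the left tail.
scores = [1, 6, 7, 8, 8, 9, 9, 9, 10, 10]

mean_score = statistics.mean(scores)
median_score = statistics.median(scores)
skew = skewness(scores)

# For left-skewed data we expect mean < median and a negative skewness.
print(f"mean={mean_score}, median={median_score}, skewness={skew:.2f}")
```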
What does correlation measure?
Correlation measures the strength and direction of the (linear) relationship between two variables
It looks at association
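As a minimal sketch, the Pearson correlation coefficient can be computed by hand as the covariance scaled by both standard deviations. The variable names and the numbers are hypothetical and only serve to show the calculation.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation: co-variation of x and y, scaled to lie in [-1, 1]."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)

# Hypothetical data: hours studied and exam score for five students.
hours = [1, 2, 3, 4, 5]
score = [52, 55, 61, 64, 70]

r = pearson_r(hours, score)
print(f"r = {r:.3f}")  # close to +1: strong positive association
```

A value near +1 or -1 indicates a strong association, a value near 0 a weak one. Correlation on its own says nothing about causation.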
What is the goodness of fit of a model?
Goodness of fit describes how well a model fits a set of observations
What is standard error of regression?
The standard error of the regression (S), also known as the standard error of the estimate, represents the average distance that the observed values fall from the regression line. Conveniently, it tells you how wrong the regression model is on average using the units of the response variable
R^2 (Goodness of Fit)
R^2 is a goodness-of-fit statistic.
Values: between 0 and 1.
Interpretation: The larger the better.
Meaning: Proportion of the outcome’s variability that the model explains.
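Both goodness-of-fit measures above, S and R^2, come from the model's residuals. The sketch below fits a simple one-predictor regression by least squares on made-up data (an assumption for illustration) and computes both. Note that S divides by n - 2 here because two coefficients (intercept and slope) are estimated.

```python
import math

def fit_line(xs, ys):
    """Ordinary least squares for y = intercept + slope * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope

# Hypothetical data, chosen only to illustrate the calculation.
x = [1, 2, 3, 4, 5]
y = [2.1, 3.9, 6.2, 7.8, 10.1]

intercept, slope = fit_line(x, y)
predicted = [intercept + slope * xi for xi in x]
mean_y = sum(y) / len(y)

# Residual sum of squares: variability the model does NOT explain.
sse = sum((yi - pi) ** 2 for yi, pi in zip(y, predicted))
# Total sum of squares: all variability in the outcome.
sst = sum((yi - mean_y) ** 2 for yi in y)

r_squared = 1 - sse / sst                  # proportion of variability explained
s = math.sqrt(sse / (len(x) - 2))          # standard error of the regression

print(f"slope={slope:.2f}, R^2={r_squared:.4f}, S={s:.3f}")
```

S is expressed in the units of the outcome, while R^2 is unitless, which is why the two are usually read together.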
What are some questions to think about for generalising?
- Would I get the same coefficient if I built my model using different data?
- How likely am I to estimate the correct value?
confidence interval
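A range of values, calculated from the sample, that is likely to contain the population parameter and that takes random (sampling) error into account.
The confidence level (e.g. 95%) is the proportion of such intervals that would contain the true population parameter if the sampling were repeated many times. It is not the probability that the parameter lies inside one particular interval.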
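The repeated-sampling idea can be checked with a small simulation. The sketch below assumes a normal population with a known mean (all constants are hypothetical), draws many samples, builds a 95% interval for the mean from each using a normal approximation, and counts how often the interval contains the true mean.

```python
import random
import statistics

random.seed(0)  # fixed seed so the sketch is reproducible

TRUE_MEAN = 10.0   # assumed population mean (unknown in practice)
TRUE_SD = 2.0      # assumed population standard deviation
N = 50             # observations per sample
REPEATS = 2000     # number of repeated samples
Z_95 = statistics.NormalDist().inv_cdf(0.975)  # about 1.96

covered = 0
for _ in range(REPEATS):
    sample = [random.gauss(TRUE_MEAN, TRUE_SD) for _ in range(N)]
    estimate = statistics.mean(sample)               # point estimate
    std_error = statistics.stdev(sample) / N ** 0.5  # standard error of the mean
    lower = estimate - Z_95 * std_error
    upper = estimate + Z_95 * std_error
    if lower <= TRUE_MEAN <= upper:
        covered += 1

coverage = covered / REPEATS
print(f"Share of intervals containing the true mean: {coverage:.3f}")
```

The share should land close to 0.95. Each individual interval either contains the true mean or it does not; the 95% describes the procedure, not any single interval.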
What is meant by interval width?
Interval width = the distance between the lower and upper boundaries of the interval around your sample’s estimate
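When should you be concerned about confidence intervals?
Be concerned about a confidence interval if it includes contradictory estimates, for example a coefficient interval that contains both positive and negative values (and therefore zero)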
Match the confidence level with the interval width
99%
90%
10%
99% - very wide
90% - wide
10% - narrow
Interval width increases with the confidence level: the higher the confidence level, the wider the interval
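The matching above can be reproduced numerically. This sketch assumes a normal approximation and uses a hypothetical estimate and standard error; only the confidence level changes between the three intervals.

```python
from statistics import NormalDist

def interval(estimate, std_error, level):
    """Normal-approximation confidence interval at the given level (0 to 1)."""
    z = NormalDist().inv_cdf(0.5 + level / 2)  # critical value for this level
    return estimate - z * std_error, estimate + z * std_error

ESTIMATE = 0.32    # hypothetical coefficient estimate
STD_ERROR = 0.10   # hypothetical standard error

widths = {}
for level in (0.99, 0.90, 0.10):
    lower, upper = interval(ESTIMATE, STD_ERROR, level)
    widths[level] = upper - lower
    print(f"{level:.0%}: [{lower:.3f}, {upper:.3f}]  width={widths[level]:.3f}")
```

The relationship is increasing but not linear: going from 90% to 99% widens the interval far more than going from 10% to 19% would.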
ANALOGY
Analogy: A bigger net is more likely to catch the fish you are looking for.
If we assume there is no association, what will you expect?
Assuming there is no association, you will expect:
- an estimated coefficient at or very near zero is most likely
- tiny coefficients are somewhat likely
- and big coefficients are unlikely
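The expectations above can be illustrated by simulation. The sketch below generates x and y independently (so the true coefficient is zero by construction), estimates the slope many times, and counts how many estimates fall in equal-width bands of increasing size. Sample size, number of repeats, and band width are arbitrary choices for illustration.

```python
import random

random.seed(1)  # fixed seed so the sketch is reproducible

def slope(xs, ys):
    """Least-squares slope of y on x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    return sxy / sxx

N = 30          # observations per simulated data set
REPEATS = 2000  # number of simulated data sets
BAND = 0.1      # width of each band of |slope|

slopes = []
for _ in range(REPEATS):
    xs = [random.gauss(0, 1) for _ in range(N)]
    ys = [random.gauss(0, 1) for _ in range(N)]  # generated independently of xs
    slopes.append(slope(xs, ys))

# counts[k] = number of estimates with k*BAND <= |slope| < (k+1)*BAND
counts = [0] * 5
for b in slopes:
    k = int(abs(b) / BAND)
    if k < len(counts):
        counts[k] += 1

mean_slope = sum(slopes) / REPEATS
for k, c in enumerate(counts):
    print(f"|slope| in [{k * BAND:.1f}, {(k + 1) * BAND:.1f}): {c}")
```

The counts should fall as the band moves away from zero, which is the pattern a p-value relies on.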
What is p-value?
the probability of obtaining an estimated coefficient at least as big (in absolute value) as the one observed, assuming the true coefficient is actually zero
Small p-value
1. When we assume the true coefficient is zero, the probability of getting an estimate at least as large as 0.32 is small, i.e. our estimate is unlikely under that assumption.
2. But we know our sample estimate is 0.32.
3. Therefore, we concede that our assumption is probably wrong.
4. We conclude that an association is likely.
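The reasoning above can be turned into a number. This sketch uses the estimate of 0.32 from the steps above together with a hypothetical standard error of 0.10 (the lecture does not give one), and a normal approximation. Regression software typically uses a t distribution instead, so its p-value would differ slightly.

```python
from statistics import NormalDist

def two_sided_p_value(estimate, std_error):
    """P(|estimate| at least this big) if the true coefficient were zero.

    Uses a normal approximation for the sampling distribution.
    """
    z = estimate / std_error  # how many standard errors away from zero
    return 2 * (1 - NormalDist().cdf(abs(z)))

ESTIMATE = 0.32    # sample estimate from the steps above
STD_ERROR = 0.10   # hypothetical standard error, assumed for illustration

p_value = two_sided_p_value(ESTIMATE, STD_ERROR)
print(f"p-value = {p_value:.4f}")

if p_value < 0.05:  # 0.05 is a conventional threshold, not a law
    print("Estimate is unlikely under 'no association'; an association is likely.")
else:
    print("Estimate is compatible with 'no association'.")
```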