Stata Lecture 2 Flashcards
Within how many standard deviations does 95% of the data lie, in a normal distribution?
Within 1.96 standard deviations of the mean
How can you calculate the 95% range for a normally distributed variable?
Lower bound = mean - (1.96 × SD)
Upper bound = mean + (1.96 × SD)
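As a quick check of the formula on this card, here is a minimal Python sketch. The mean of 120 and SD of 15 are made-up illustration values, and `normal_95_range` is a hypothetical helper name, not something from the lecture.

```python
from statistics import NormalDist

def normal_95_range(mean, sd):
    """Return (lower, upper) bounds covering about 95% of a normal variable."""
    z = 1.96  # multiplier from the card above
    return mean - z * sd, mean + z * sd

# Hypothetical example values: mean 120, SD 15
lower, upper = normal_95_range(120, 15)
print(round(lower, 1), round(upper, 1))  # 90.6 149.4

# Check: share of a normal distribution that falls inside the range
dist = NormalDist(mu=120, sigma=15)
coverage = dist.cdf(upper) - dist.cdf(lower)
print(round(coverage, 3))  # 0.95
```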
What is the definition for population?
The full set of people to whom the results of the study will apply
What is the definition for sample?
The subset of the population who will take part in the study
What are descriptive statistics vs inferential statistics?
Descriptive = describes data in the sample
Inferential = makes inferences about the population (standard error, confidence intervals, p values)
Define the ‘sampling distribution’
The distribution of all the different estimates of a population parameter that we would get if we did the study many times with different samples OF THE SAME SIZE
Its spread shows how far an estimate from one sample can fall from the true population value
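The idea of "doing the study many times" can be simulated. The sketch below is only an illustration under assumed values: the population (10,000 roughly normal values with mean 50 and SD 10), the sample sizes, and the helper `sample_means` are all hypothetical.

```python
import random
from statistics import mean, stdev

random.seed(1)  # fixed seed so the sketch is repeatable

# Hypothetical population: 10,000 values, roughly normal, mean 50, SD 10
population = [random.gauss(50, 10) for _ in range(10_000)]

def sample_means(n, repeats=2000):
    """Repeat the 'study' many times, each time with a new sample of size n."""
    return [mean(random.sample(population, n)) for _ in range(repeats)]

means_n25 = sample_means(25)
means_n100 = sample_means(100)

# Each list approximates a sampling distribution of the mean;
# its standard deviation approximates the standard error for that sample size
print(round(stdev(means_n25), 2), round(stdev(means_n100), 2))
```

The sample means cluster around the population mean, and the cluster is tighter for the larger sample size.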
What is standard error of the mean?
The standard deviation of the sampling distribution of the sample mean (estimated as SD / √n)
A measure of how close the estimate from a sample is to the true parameters of the population (if you did the study many times with different samples of the same size)
Precision of the estimate
Do larger sample sizes have smaller or larger standard errors? Why?
Smaller. Because a larger sample estimates the true population parameter more precisely
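The usual formula for the standard error of the mean is SD divided by the square root of the sample size, which shows why it shrinks as n grows. A minimal sketch, with a made-up SD of 10 and a hypothetical helper name:

```python
import math

def standard_error(sd, n):
    """Standard error of the mean: sample SD divided by the square root of n."""
    return sd / math.sqrt(n)

# Hypothetical SD of 10 with increasing sample sizes
for n in (25, 100, 400):
    print(n, standard_error(10, n))  # 2.0, then 1.0, then 0.5
```

Quadrupling the sample size halves the standard error.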
Compare Standard error vs Standard deviation
Standard error = precision of the sample estimate (e.g. the mean)
Standard deviation = variability of the individual observations
What is a confidence interval?
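A range of values, calculated from the sample, that is likely to contain the true population parameter
For a 95% confidence interval, about 95% of the intervals calculated from repeated samples would contain the true value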
Does a larger sample size give a wider or narrower confidence interval? Why?
Narrower
Because as sample size increases, the parameter is estimated with greater precision
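To see the narrowing in numbers, here is a rough sketch of a 95% confidence interval for a mean using the normal approximation (mean ± 1.96 × standard error). The sample mean, SD, and sample sizes are made-up values and `ci_95` is a hypothetical helper; for small samples a t-based multiplier is typically used instead of 1.96.

```python
import math

def ci_95(sample_mean, sd, n):
    """Approximate 95% CI for a mean: mean ± 1.96 × SE (normal approximation)."""
    se = sd / math.sqrt(n)
    return sample_mean - 1.96 * se, sample_mean + 1.96 * se

# Hypothetical sample mean 50 and SD 10 at two sample sizes
small = ci_95(50, 10, 25)    # SE = 2
large = ci_95(50, 10, 400)   # SE = 0.5

# Width of each interval (upper bound minus lower bound)
print(round(small[1] - small[0], 2))  # 7.84
print(round(large[1] - large[0], 2))  # 1.96
```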
If the p value is close to zero, how would you phrase your conclusion?
There is strong evidence to suggest that in the population X, the mean score for group Y is not the same as the mean score for group Z.
What is a two-tailed hypothesis?
Tests for changes in either direction; specifies ‘difference’ rather than ‘increase’ or ‘decrease’
What does the p value indicate?
The probability of obtaining a result at least as extreme as the one observed, if the null hypothesis is true
Value between 0 and 1
A lower value indicates stronger evidence against the null hypothesis
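As an illustration, a two-tailed p value can be computed from a z statistic using the standard normal distribution. This is a simplified sketch assuming a z test; the z values are made up and `two_tailed_p` is a hypothetical helper name.

```python
from statistics import NormalDist

def two_tailed_p(z):
    """Two-tailed p value: probability of a z at least this extreme under the null."""
    return 2 * (1 - NormalDist().cdf(abs(z)))

print(round(two_tailed_p(1.96), 3))  # 0.05
print(round(two_tailed_p(0.5), 3))   # 0.617
```

A z statistic further from zero gives a smaller p value, in either direction.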
What is a type 1 error?
Rejecting the null hypothesis when it is actually true
False positive
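A small simulation can show how often false positives occur when the null hypothesis really is true. Everything here is an assumption for illustration: the conventional 0.05 cut-off (not stated on the cards), a z test with a known SD of 1, samples of 30, and the helper name `p_value_null_true`.

```python
import math
import random
from statistics import NormalDist, mean

random.seed(2)  # fixed seed so the sketch is repeatable

def p_value_null_true(n=30):
    """One simulated study in which the null (population mean = 0, SD = 1) is true."""
    sample = [random.gauss(0, 1) for _ in range(n)]
    z = mean(sample) / (1 / math.sqrt(n))  # known SD of 1, so SE = 1 / sqrt(n)
    return 2 * (1 - NormalDist().cdf(abs(z)))

p_values = [p_value_null_true() for _ in range(5000)]

# Share of studies that reject a true null at the assumed 0.05 cut-off
false_positive_rate = sum(p < 0.05 for p in p_values) / len(p_values)
print(round(false_positive_rate, 3))
```

The printed rate should land near the chosen cut-off, since that cut-off is the type 1 error rate the test accepts.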