Statistical Methods- Lecture 5-6-7 Flashcards
What is a contingency table?
A table that summarizes data for two categorical variables is called a contingency table.
What are two differences between bar plots and histograms?
1) Bar plots can be used for displaying distributions of categorical variables, while histograms are used for numerical variables
2) The x-axis in a histogram is a number line, hence the order of the bars cannot be changed, while in a bar plot the categories can be listed in any order
What are the null and alternative hypotheses?
The null hypothesis (H0) states that there is no difference or no effect
The alternative hypothesis (HA) states that there is a difference or an effect
What do we use for the unknown population parameters of interest?
We use sample statistics as point estimates for the unknown population parameters of interest.
How are sample proportions distributed?
Sample proportions will be nearly normally distributed, with mean equal to the population proportion p and standard deviation equal to sqrt(p(1-p)/n).
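The claim above can be checked with a small simulation. This is only a sketch: the values p = 0.3, n = 100 and the helper name simulate_sample_proportions are illustrative assumptions, not taken from the lecture.

```python
import math
import random
import statistics

def simulate_sample_proportions(p, n, reps, seed=0):
    """Draw `reps` samples of size n and return each sample's proportion of successes."""
    rng = random.Random(seed)  # fixed seed so the run is reproducible
    return [sum(rng.random() < p for _ in range(n)) / n for _ in range(reps)]

p, n = 0.3, 100  # illustrative values (assumption, not from the lecture)
p_hats = simulate_sample_proportions(p, n, reps=5000)

# The simulated mean should be close to p, and the simulated SD close to sqrt(p(1-p)/n).
print("mean of sample proportions:", statistics.mean(p_hats))
print("SD of sample proportions:  ", statistics.stdev(p_hats))
print("theoretical value:         ", math.sqrt(p * (1 - p) / n))
```

With a different seed the printed numbers change slightly, but they should stay close to p and to sqrt(p(1-p)/n).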
What is the standard deviation of the sampling distribution called?
The standard deviation of the sampling distribution of a point estimate is called the standard error of the point estimate.
What conditions need to be met for the Central Limit Theorem (CLT) to apply?
Sampled observations must be independent. This is more likely if random sampling is used and, when sampling without replacement, if n is less than 10% of the population.
There should be at least 10 expected “successes” and 10 expected “failures” in the observed sample.
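The two numeric checks above can be written as a short helper. This is a sketch under assumptions: the function name clt_conditions_met and the optional population_size argument are hypothetical, and independence itself (for example, whether sampling was random) cannot be verified by code.

```python
def clt_conditions_met(n, p, population_size=None):
    """Check the success-failure condition and, if a population size is given, the 10% condition."""
    expected_successes = n * p
    expected_failures = n * (1 - p)
    success_failure_ok = expected_successes >= 10 and expected_failures >= 10
    # The 10% condition only matters when sampling without replacement
    # from a finite population of known size.
    ten_percent_ok = population_size is None or n < 0.10 * population_size
    return success_failure_ok and ten_percent_ok

print(clt_conditions_met(100, 0.3))                       # 30 successes, 70 failures expected
print(clt_conditions_met(50, 0.1))                        # only 5 expected successes
print(clt_conditions_met(100, 0.3, population_size=500))  # sample is 20% of the population
```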
What does the CLT state?
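The CLT states that sample proportions are nearly normally distributed with mean p and SE = sqrt(p(1-p)/n),
provided that observations are independent and np and n(1-p) are both at least 10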
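One way to use this result is to approximate the probability that a sample proportion exceeds some threshold. The sketch below uses Python's statistics.NormalDist; the function name and the numbers (p = 0.3, n = 100, threshold 0.35) are illustrative assumptions.

```python
import math
from statistics import NormalDist

def prob_sample_proportion_above(threshold, p, n):
    """Normal approximation to P(sample proportion > threshold), assuming the CLT conditions hold."""
    se = math.sqrt(p * (1 - p) / n)  # standard error of the sample proportion
    return 1 - NormalDist(mu=p, sigma=se).cdf(threshold)

# Here np = 30 and n(1-p) = 70, so both are at least 10.
print(prob_sample_proportion_above(0.35, p=0.3, n=100))
```

The approximation is only as good as the conditions: when np or n(1-p) is below 10, an exact binomial calculation is the safer choice.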
What happens when the conditions are not met?
- If either np or n(1-p) is small, the distribution is more discrete
- When np or n(1-p) is less than 10, the distribution is more skewed
- The larger both np and n(1-p) are, the more normal the distribution
- When np and n(1-p) are both very large, the discreteness of the distribution is hardly evident, and the distribution looks more like a normal distribution
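The effect of np on skewness can be made concrete with the skewness of a binomial count, (1-2p)/sqrt(np(1-p)), which is also the skewness of the sample proportion. The sketch below is illustrative only; p = 0.1 and the sample sizes are assumed values.

```python
import math

def binomial_skewness(n, p):
    """Skewness of a Binomial(n, p) count (and of the corresponding sample proportion)."""
    return (1 - 2 * p) / math.sqrt(n * p * (1 - p))

p = 0.1  # illustrative value (assumption)
for n in (10, 50, 1000):
    # Skewness shrinks toward 0 (the value for a normal distribution) as np grows.
    print(f"n={n:5d}  np={n * p:6.1f}  skewness={binomial_skewness(n, p):.3f}")
```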