Stats- C5,6,7,8 Flashcards
An event E is a subset of the set of all outcomes of an experiment; the set of all outcomes of an experiment is called a ____ and is usually denoted by S.
P21
sample space
The Poisson probability distribution gives the probability of a given number of events occurring in a fixed interval of time or space, if these events happen at a known average rate and independently of the time since the last event. Give an example of a Poisson distribution.
P24
A book editor might be interested in the number of words spelled incorrectly in a particular book. It might be that, on the average, there are five words spelled incorrectly in 100 pages. The interval is the 100 pages.
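As a rough illustration of the editor example, the sketch below evaluates the Poisson probability mass function for a mean of five misspellings per 100 pages. The helper name `poisson_pmf` and the chosen values of k are assumptions made for illustration only.

```python
import math

def poisson_pmf(k: int, lam: float) -> float:
    """P(X = k) for a Poisson variable with mean lam (hypothetical helper)."""
    return math.exp(-lam) * lam**k / math.factorial(k)

lam = 5.0  # average misspellings per 100 pages, taken from the card above
for k in (0, 3, 5, 8):  # illustrative counts, not from the source
    print(f"P(X = {k}) = {poisson_pmf(k, lam):.4f}")
```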
Exp
The Poisson distribution may be used to approximate the binomial if the probability of success is ____ and the number of trials is ____.
P24
small (such as 0.01)
large (such as 1,000)
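To see how close the approximation can be, the sketch below compares binomial and Poisson probabilities for n = 1,000 and p = 0.01 (the values quoted on the card), using a Poisson mean of np. The helper names and the range of k inspected are assumptions for illustration.

```python
import math

def binom_pmf(k: int, n: int, p: float) -> float:
    """P(X = k) for a binomial variable with n trials and success probability p."""
    return math.comb(n, k) * p**k * (1 - p) ** (n - k)

def poisson_pmf(k: int, lam: float) -> float:
    """P(X = k) for a Poisson variable with mean lam."""
    return math.exp(-lam) * lam**k / math.factorial(k)

n, p = 1000, 0.01   # "large" n and "small" p from the card
lam = n * p         # matching Poisson mean

# Largest pointwise difference between the two distributions for k = 0..30
max_gap = max(abs(binom_pmf(k, n, p) - poisson_pmf(k, lam)) for k in range(31))
print(f"largest difference over k = 0..30: {max_gap:.5f}")
```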
In a normal distribution:
About ____% of its values lie within one standard deviation of the mean.
About ____% of its values lie within two standard deviations of the mean.
Almost all of its values (about ____%) lie within three standard deviations of the mean.
P27
68
95
99.7
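These three percentages can be checked numerically. For a normal variable, the probability of falling within k standard deviations of the mean is erf(k / sqrt(2)); the sketch below evaluates that expression with the standard library. The function name `within_k_sd` is an assumption for illustration.

```python
import math

def within_k_sd(k: float) -> float:
    """Probability that a normal value lies within k standard deviations of the mean."""
    return math.erf(k / math.sqrt(2))

for k in (1, 2, 3):
    print(f"within {k} SD: {100 * within_k_sd(k):.1f}%")
```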
What does the central limit theorem (CLT) say?
P29
The central limit theorem states that if you have a population with mean μ and standard deviation σ and take sufficiently large random samples from the population with replacement, then the sample means will be approximately normally distributed. This holds regardless of whether the source population is normal or skewed, provided the sample size is sufficiently large (a common rule of thumb is n > 30).
If the population itself is normal, the result holds even for samples smaller than 30.
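A small simulation can make the theorem concrete. The sketch below assumes a skewed exponential population with mean 1 and standard deviation 1, draws repeated independent samples of size 40, and summarizes the sample means; by the theorem they should center near μ = 1 with spread near σ / sqrt(n). The population, sample size, repetition count, and seed are all illustrative assumptions, not values from the card.

```python
import math
import random
import statistics

def sample_means(n: int, reps: int, seed: int = 0) -> list[float]:
    """Means of `reps` independent samples of size n from an exponential(1) population."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    return [
        statistics.fmean(rng.expovariate(1.0) for _ in range(n))
        for _ in range(reps)
    ]

n = 40                           # sample size above the n > 30 rule of thumb
means = sample_means(n, reps=5000)

print(f"mean of sample means: {statistics.fmean(means):.3f} (population mean is 1)")
print(f"SD of sample means:   {statistics.stdev(means):.3f} "
      f"(sigma / sqrt(n) is {1 / math.sqrt(n):.3f})")
```

Plotting a histogram of `means` would show the bell shape directly, even though the exponential population is strongly right-skewed.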
When does the CLT hold for a binomial population distribution?
P29
The CLT holds even if the population is binomial, provided that min(np, n(1-p)) > 5, where n is the sample size and p is the probability of success in the population.
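The condition is easy to check in code. The sketch below is a direct transcription of min(np, n(1-p)) > 5; the function name `clt_ok_for_binomial` and the example inputs are assumptions for illustration.

```python
def clt_ok_for_binomial(n: int, p: float) -> bool:
    """True when min(np, n(1-p)) > 5, the rule of thumb quoted on the card."""
    return min(n * p, n * (1 - p)) > 5

print(clt_ok_for_binomial(100, 0.10))  # np = 10, n(1-p) = 90
print(clt_ok_for_binomial(20, 0.10))   # np = 2, too few expected successes
```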
If the Y variable is continuous and the X variable is discrete or categorical, we perform ____, ____, ____, ____, ____, ____, among others (for normal data). Likewise, for non-normal data, we perform ____, ____, ____, ____, ____, ____, among others.
P35
1-sample t-test
2-sample t-test
paired t-test
one-way ANOVA
F-test
Homogeneity of Variance (HOV)
Mann-Whitney test
Kruskal-Wallis test
Mood's median test
Friedman test
1-sample sign test
1-sample Wilcoxon test
The major drawback of parametric tests is that they assume ____. If this assumption is violated, the resulting test statistics will not be valid, and the tests will not be as powerful as when the assumption is met. When your data do not meet this condition, consider a nonparametric test. Nonparametric tests are also called distribution-free tests because they don't assume that your data follow a specific distribution.
P36
Normality
The distribution-free nature of nonparametric tests makes them appropriate for data that don't meet the assumptions of parametric analyses. These include data that are skewed, non-normal, contain outliers, or are censored. (Censored data are data with an upper or lower limit on reported values; for example, ages under 5 reported simply as "under 5".)
The t-test is very similar to the Z-test, but it uses the t-table instead of the Z-table to find the critical value. It is used when ____ and it uses ____ instead. The t-table value t_{α, n-1} depends on the sample size. For a very large sample size, the t value approaches the Z value. (Central Limit Theorem)
P38
the population standard deviation is unknown
the sample standard deviation S
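As a minimal sketch of the calculation, the one-sample t statistic replaces σ with the sample standard deviation S: t = (x̄ - μ0) / (S / sqrt(n)), which is then compared against t_{α, n-1}. The function name, the data, and the hypothesized mean below are hypothetical; the critical value itself would still come from a t-table or a statistics library.

```python
import math
import statistics

def t_statistic(sample: list[float], mu0: float) -> float:
    """One-sample t statistic using the sample standard deviation S (n - 1 denominator)."""
    n = len(sample)
    x_bar = statistics.fmean(sample)
    s = statistics.stdev(sample)          # sample SD, since sigma is unknown
    return (x_bar - mu0) / (s / math.sqrt(n))

data = [1.0, 2.0, 3.0, 4.0, 5.0]          # hypothetical measurements
t = t_statistic(data, mu0=2.0)            # hypothetical null-hypothesis mean
print(f"t = {t:.4f} with {len(data) - 1} degrees of freedom")
```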