Module 3 Flashcards
How does the binomial distribution work?
In a population, a fixed proportion of individuals falls into one of two groups: successes or failures.
Define “Binomial distribution” (2 things that it provides)
Provides the probability distribution for the number of successes in a fixed number of independent trials (equivalently, the proportion of successes).
- Built from a fixed number of Bernoulli trials (not many)
- BD = probability of X successes in n trials.
*Look at the formula (no need to memorize)
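For reference, a minimal sketch of the standard binomial formula, P(X) = C(n, X) * p^X * (1-p)^(n-X). The function name `binomial_pmf` and the example numbers (10 trials, p = 0.5) are illustrative assumptions, not from the course notes.

```python
from math import comb

def binomial_pmf(x, n, p):
    """Probability of exactly x successes in n independent trials,
    each with success probability p."""
    return comb(n, x) * p**x * (1 - p)**(n - x)

# Hypothetical example: 10 trials, success probability 0.5
probs = [binomial_pmf(x, 10, 0.5) for x in range(11)]
print(probs[5])    # probability of exactly 5 successes
print(sum(probs))  # probabilities over all possible x add up to 1
```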
What are the assumptions of the binomial distribution?
- The number of trials is fixed
- Separate trials are independent
- The probability of success is the same in every trial.
How does n affect the binomial distribution?
Large n = narrow distribution (of the proportion of successes)
Small n = broad distribution (of the proportion of successes)
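A quick sketch of why this happens, assuming the card is about the proportion of successes: its standard deviation is sqrt(p(1-p)/n), so it shrinks as n grows. The helper name `sd_of_proportion` and the value p = 0.5 are made up for illustration.

```python
from math import sqrt

def sd_of_proportion(n, p):
    """Standard deviation of the proportion of successes in n trials."""
    return sqrt(p * (1 - p) / n)

# Same p, increasing n: the spread of the proportion gets smaller
for n in (10, 100, 1000):
    print(n, round(sd_of_proportion(n, 0.5), 4))
```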
What are the assumptions for a Goodness-of-fit test?
- Individuals in a data set are random samples from a population
- Individuals are chosen independently of one another.
- No category has an expected frequency of <1.
- No more than 20% of categories have an expected frequency of <5.
What is the purpose of the Poisson distribution?
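Describes the number of successes in blocks of time or space when successes occur randomly (independently and with equal probability). Used as the null model when testing whether successes occur randomly in time or space.
- Has many Bernoulli trials, each with a small probability of success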
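A minimal sketch of the standard Poisson formula, P(X) = e^(-mu) * mu^X / X!, where mu is the mean number of successes per block of time or space. The function name `poisson_pmf` and the mean of 2.0 are illustrative assumptions.

```python
from math import exp, factorial

def poisson_pmf(x, mu):
    """Probability of exactly x successes in a block when the mean is mu."""
    return exp(-mu) * mu**x / factorial(x)

# Hypothetical example: mean of 2 successes per block
for x in range(5):
    print(x, round(poisson_pmf(x, 2.0), 4))
```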
What is a goodness-of-fit test used for? (ex. Chi-squared)
To compare observed and expected frequencies to see whether they differ significantly from one another.
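As a sketch, the chi-squared statistic sums (observed - expected)^2 / expected over all categories. The function name and the counts below are made up for illustration.

```python
def chi_squared_statistic(observed, expected):
    """Sum of (O - E)^2 / E across categories."""
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))

# Made-up counts: 50 individuals, expected 1:1 split between two categories
observed = [30, 20]
expected = [25, 25]
print(chi_squared_statistic(observed, expected))  # 2.0
```

A larger statistic means the observed frequencies sit further from the expected ones.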
Define “extrinsic data”
When expected info is derived from info other than the data you are analysing
Define “Intrinsic data”
When expected info is derived from the data you are analysing
What variables does the GOF test use?
1 categorical variable
How do you calculate the degrees of freedom for a GOF test?
df= k-p-1
How do you calculate the degrees of freedom for an extrinsic data test?
df = k - p - 1
- k = # of categories
- p = # of parameters estimated from the data (0 when the expected values are extrinsic, so df = k - 1)
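A small sketch of the df rule. The helper name `gof_degrees_of_freedom` and the category counts are hypothetical; the second call assumes a Poisson fit where the mean is estimated from the data (one parameter).

```python
def gof_degrees_of_freedom(k, p=0):
    """df = k - p - 1: k categories, p parameters estimated from the data."""
    return k - p - 1

# 7 categories, expected values from an outside model (nothing estimated)
print(gof_degrees_of_freedom(7))        # 6
# 6 categories, mean estimated from the data (1 parameter)
print(gof_degrees_of_freedom(6, p=1))   # 4
```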
What are the alternative distributions to Poisson?
- Clumped - successes occur closer together than expected by chance. Variance > mean.
- Dispersed - successes occur more evenly spread out than expected by chance. Variance < mean.
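One way to check this, sketched with made-up counts per block: under a Poisson distribution the variance roughly equals the mean, so a variance-to-mean ratio well above 1 suggests clumping and a ratio well below 1 suggests dispersion. The function name and both data sets are illustrative assumptions.

```python
from statistics import mean, variance

def variance_to_mean_ratio(counts):
    """Sample variance divided by sample mean of counts per block."""
    return variance(counts) / mean(counts)

# Made-up counts of successes per block
clumped = [0, 0, 0, 8, 0, 0, 9, 0]     # most blocks empty, a few crowded
dispersed = [2, 2, 3, 2, 2, 3, 2, 2]   # nearly the same count everywhere

print(variance_to_mean_ratio(clumped))    # greater than 1
print(variance_to_mean_ratio(dispersed))  # less than 1
```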
What is a Bernoulli trial?
A trial with only 2 possible outcomes (success or failure).
What is a contingency test used for?
Used to test whether one categorical variable is associated with (depends on) another.
How do you calculate the degrees of freedom of an intrinsic data test?
df = rc - [(r-1) + (c-1)] - 1, which simplifies to (r-1)(c-1)
r = # of rows
c = # of columns
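A sketch of how the expected values come from the data itself (row total x column total / grand total), along with the df formula above. The function names and the 2x2 table are made up for illustration.

```python
def expected_counts(table):
    """Expected cell counts under independence: row total * column total / grand total."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)
    return [[r * c / grand_total for c in col_totals] for r in row_totals]

def contingency_df(r, c):
    """df = rc - [(r-1) + (c-1)] - 1, which equals (r-1)(c-1)."""
    return r * c - ((r - 1) + (c - 1)) - 1

# Made-up 2x2 table of observed counts
table = [[10, 20],
         [30, 40]]
print(expected_counts(table))  # [[12.0, 18.0], [28.0, 42.0]]
print(contingency_df(2, 2))    # 1
```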
What type of test uses intrinsic data to generate expected values?
Contingency
What are the conditions that need to be met for a Poisson distribution to occur?
- The probability of two or more occurrences in a single sample subdivision is negligibly small.
- The probability of one occurrence in a sample subdivision is proportional to the size of the subdivision (in time or space). Ex. 3 times area = 3 times probability.
- The outcome in one subdivision of the sample unit is independent of the outcome in all other subdivisions.
- The probability of an occurrence is identical for all sample subdivisions.
What is the purpose of a G-test?
Tests the null hypothesis of no association between two or more categorical variables.
What are the assumptions of a G-test?
- Random samples
- No more than 20% of cells have an expected frequency of <5
- No cell has an expected frequency of <1.
What are the degrees of freedom for a G-test?
df = (r-1)(c-1), where r = # of rows and c = # of columns (same as the chi-squared contingency test)
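For comparison with chi-squared, a sketch of the G statistic, G = 2 * sum of O * ln(O / E) over all cells. The function name and the counts are made up; the expected values assume independence in a 2x2 table with observed counts 10, 20, 30, 40, listed cell by cell.

```python
from math import log

def g_statistic(observed, expected):
    """G = 2 * sum(O * ln(O / E)); cells with O = 0 contribute nothing."""
    return 2 * sum(o * log(o / e) for o, e in zip(observed, expected) if o > 0)

# Made-up 2x2 table flattened cell by cell, with expected counts under independence
observed = [10, 20, 30, 40]
expected = [12, 18, 28, 42]
print(g_statistic(observed, expected))
```

G is compared against the chi-squared distribution with the df given above.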