Categorical Data Flashcards
What forms can the response data take for categorical data?
- Binary (0’s or 1’s), denoting the presence or absence of some feature/event
- Proportions (which are bounded by 0 and 1)
What is the range for a probability?
It must lie between 0 and 1, inclusive
If an event will never occur, what is its probability?
Probability of 0
If the probability of an event F occurring is Pr(F), what is the probability of its complement, Pr(F̄)?
Pr(F̄) = 1 - Pr(F)
What does the probability mass function tell us?
- Gives the probability that a discrete random variable is exactly equal to some value
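As a small illustration of a PMF, the sketch below stores one as a lookup table. The fair six-sided die and the names `die_pmf` and `pmf` are assumptions chosen for the example, not part of the cards.

```python
from fractions import Fraction

# Hypothetical example: PMF of a fair six-sided die, stored as value -> probability.
die_pmf = {face: Fraction(1, 6) for face in range(1, 7)}

def pmf(x):
    """Return Pr(X = x); values the die cannot show have probability 0."""
    return die_pmf.get(x, Fraction(0))

# The probabilities over all possible values add up to 1.
total = sum(die_pmf.values())

print(pmf(3))   # probability that the die shows exactly 3
print(pmf(7))   # an impossible value
print(total)
```

Using `Fraction` keeps the arithmetic exact, so the check that the probabilities sum to 1 is not affected by floating-point rounding.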
What does the expected value give us an idea about?
The centre or location of a probability distribution
What does the variance give us an idea about?
The spread of a probability distribution
What happens if the variance is large?
The values of X tend to lie far from the expected value
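To make centre and spread concrete, here is a minimal sketch that computes both from a PMF given as a dictionary. The two distributions `narrow` and `wide` are made-up examples with the same expected value but different variances.

```python
def expected_value(pmf):
    """Probability-weighted average of the values: the centre of the distribution."""
    return sum(x * p for x, p in pmf.items())

def variance(pmf):
    """Probability-weighted average squared distance from the expected value."""
    mu = expected_value(pmf)
    return sum((x - mu) ** 2 * p for x, p in pmf.items())

# Hypothetical distributions: same centre, different spread.
narrow = {4: 0.5, 6: 0.5}
wide = {0: 0.5, 10: 0.5}

for name, dist in (("narrow", narrow), ("wide", wide)):
    print(name, expected_value(dist), variance(dist))
```

Both distributions are centred at 5, but the values of `wide` sit much further from that centre, which is what its larger variance reports.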
What does the Binomial distribution characterize?
Binary outcomes of a repeated event (it counts the successes across the repeats)
What are the two parameters that a Binomial distribution has?
- fixed number of trials (n)
- probability of a success (p)
X is said to have a binomial distribution if…
- there are only two possible outcomes
- there are a fixed number of trials
- p is constant for all trials
- the binomial variable X is the total number of successes in the n trials
- each trial is independent of the other trials
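The conditions above can be sketched as a simulation: a fixed number of trials, two outcomes per trial, the same p each time, independent draws, and a count of the successes. The function name `simulate_binomial`, the seed, and the values n = 10 and p = 0.3 are assumptions made for illustration.

```python
import random

def simulate_binomial(n, p, rng):
    """Count successes in n independent trials, each succeeding with probability p."""
    return sum(1 for _ in range(n) if rng.random() < p)

rng = random.Random(0)  # fixed seed so the run is repeatable

# Five hypothetical realisations of X with n = 10 and p = 0.3.
draws = [simulate_binomial(10, 0.3, rng) for _ in range(5)]
print(draws)
```

Each entry of `draws` is one observed value of the binomial variable, so it is always a whole number between 0 and n.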
What does the PMF of a binomial distribution give us?
The probability of seeing exactly x successes in the n trials
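The cards do not write the formula out, so the sketch below uses the standard binomial PMF, Pr(X = x) = C(n, x) p^x (1 - p)^(n - x), with `math.comb` from the Python standard library. The function name `binomial_pmf` and the example values are assumptions.

```python
from math import comb

def binomial_pmf(x, n, p):
    """Pr(X = x) for X ~ Binomial(n, p): exactly x successes in n trials."""
    if not 0 <= x <= n:
        return 0.0  # counts outside 0..n are impossible
    return comb(n, x) * p**x * (1 - p) ** (n - x)

# Probability of exactly 2 successes in 4 trials with p = 0.5.
print(binomial_pmf(2, 4, 0.5))

# Summing over every possible count should give (approximately) 1.
print(sum(binomial_pmf(x, 10, 0.3) for x in range(11)))
```

The `comb(n, x)` factor counts the different orders in which x successes can be arranged among the n trials, and the remaining factors give the probability of any one such arrangement.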
Describe the shape of the binomial distribution if p is low
- small number of successes
- distribution is skewed to the right
Describe the shape of the binomial distribution if p is 0.5
- on average, half of the trials are successful
- distribution is symmetrical
Describe the shape of the binomial distribution if p is high
- large number of successes
- distribution is skewed to the left
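The three shapes can be checked numerically. This is a rough sketch assuming n = 10 and p values of 0.1, 0.5 and 0.9; it prints a text histogram of each PMF along with a skewness value computed from the PMF, where a positive value means skewed to the right and a negative value means skewed to the left.

```python
from math import comb

def binomial_pmf(x, n, p):
    """Pr(X = x) for X ~ Binomial(n, p)."""
    return comb(n, x) * p**x * (1 - p) ** (n - x)

def skewness(n, p):
    """Standardised third central moment, computed directly from the PMF."""
    probs = [binomial_pmf(x, n, p) for x in range(n + 1)]
    mean = sum(x * q for x, q in enumerate(probs))
    var = sum((x - mean) ** 2 * q for x, q in enumerate(probs))
    third = sum((x - mean) ** 3 * q for x, q in enumerate(probs))
    return third / var**1.5

n = 10  # assumed number of trials for the illustration
for p in (0.1, 0.5, 0.9):
    print(f"p = {p}: skewness = {skewness(n, p):+.3f}")
    for x in range(n + 1):
        # One '#' per 2.5 percentage points of probability, rounded.
        print(f"  {x:2d} " + "#" * round(40 * binomial_pmf(x, n, p)))
```

With low p the bars pile up near 0 and the tail stretches toward larger counts; with high p the picture is mirrored; at p = 0.5 the bars are symmetric about n/2.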