Probability Distributions Flashcards
What is a probability distribution?
It describes how probabilities are distributed over the values of a random variable.
What properties define a normal distribution?
Symmetrical, bell-shaped curve, mean=median=mode, and fully described by mean and standard deviation.
How do you calculate probabilities in a binomial distribution?
Using the binomial formula P(X = k) = C(n, k) p^k (1 - p)^(n - k), where n is the number of trials, p is the success probability per trial, and k is the desired number of successes.
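A minimal Python sketch of this calculation, using only the standard library. The function name `binomial_pmf` and the coin-flip numbers are illustrative assumptions, not part of the card.

```python
from math import comb

def binomial_pmf(k: int, n: int, p: float) -> float:
    """P(X = k) for X ~ Binomial(n, p): k successes in n independent trials."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)

# Probability of exactly 3 heads in 5 fair coin flips.
print(binomial_pmf(3, 5, 0.5))  # 0.3125
```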
What is a Poisson distribution and where is it commonly used?
It models the number of events in a fixed interval of time or space and is used in fields like telecommunications and insurance.
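As a rough illustration, the Poisson probability of k events is lam^k e^(-lam) / k!, where lam is the mean number of events per interval. The function name `poisson_pmf` and the call-rate figure below are hypothetical.

```python
from math import exp, factorial

def poisson_pmf(k: int, lam: float) -> float:
    """P(X = k) for X ~ Poisson(lam), where lam is the mean count per interval."""
    return lam**k * exp(-lam) / factorial(k)

# Hypothetical example: a switchboard averaging 2 calls per minute.
p_no_calls = poisson_pmf(0, 2.0)                           # equals exp(-2)
p_at_most_3 = sum(poisson_pmf(k, 2.0) for k in range(4))   # P(X <= 3)
print(p_no_calls, p_at_most_3)
```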
Describe the characteristics of a uniform distribution.
All outcomes are equally likely over the interval.
How does the exponential distribution differ from the normal distribution?
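The exponential distribution models the time between events, takes only non-negative values, and has a density that is highest at zero and decays steadily; the normal distribution is symmetric around its mean and covers all real values.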
What are the key features of a geometric distribution?
Models the number of trials until the first success and is memoryless like the exponential distribution.
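The memoryless property can be checked numerically: P(X > s + t | X > s) should equal P(X > t). This is a small sketch; the helper name `geometric_tail` and the values of p, s, and t are arbitrary choices for illustration.

```python
def geometric_tail(k: int, p: float) -> float:
    """P(X > k): the first k trials are all failures."""
    return (1 - p) ** k

p = 0.2
s, t = 3, 4

# P(X > s + t | X > s) computed from the definition of conditional probability.
conditional = geometric_tail(s + t, p) / geometric_tail(s, p)
unconditional = geometric_tail(t, p)

print(conditional, unconditional)  # the two values agree up to rounding
```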
How is the standard deviation calculated in a normal distribution?
It is the square root of the variance, which is a parameter of the distribution.
What does the shape parameter signify in a Weibull distribution?
It determines how the failure rate changes over time: decreasing when the shape is below 1, constant when it equals 1 (the exponential case), and increasing when it is above 1.
Why is the normal distribution important in statistics?
Because the central limit theorem makes sample means approximately normal for large samples, many statistical methods are built on normality assumptions.
How can you use the central limit theorem with non-normal distributions?
For large samples, the distribution of the sample mean is approximately normal even when the underlying distribution is not, provided that distribution has finite variance.
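A small simulation can make this concrete. The sketch below assumes an Exponential(1) population (mean 1, standard deviation 1), a sample size of 50, and a fixed seed; all of these are illustrative choices.

```python
import random
from math import sqrt
from statistics import fmean, stdev

random.seed(0)  # fixed seed so the run is repeatable

n = 50        # observations per sample
reps = 2000   # number of samples drawn

# Exponential(1) is strongly right-skewed, yet its sample means cluster
# symmetrically around 1 with spread close to 1 / sqrt(n).
sample_means = [
    fmean([random.expovariate(1.0) for _ in range(n)]) for _ in range(reps)
]

center = fmean(sample_means)
spread = stdev(sample_means)
print(center, spread, 1 / sqrt(n))
```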
What is the difference between discrete and continuous distributions?
Discrete distributions assign probabilities to countable values (such as counts), while continuous distributions spread probability over intervals of real values (such as measurements).
Explain the concept of the cumulative distribution function (CDF).
It describes the probability that a random variable is less than or equal to a certain value.
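For instance, Python's standard library exposes the normal CDF through `statistics.NormalDist`. The cut-off values below are chosen only for illustration.

```python
from statistics import NormalDist

Z = NormalDist(mu=0.0, sigma=1.0)  # standard normal

p_below = Z.cdf(1.96)                   # P(Z <= 1.96), about 0.975
p_between = Z.cdf(1.0) - Z.cdf(-1.0)    # P(-1 <= Z <= 1), about 0.68

print(p_below, p_between)
```

Differences of CDF values give interval probabilities, which is why the CDF is enough to answer any "probability between a and b" question.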
What is the significance of the mean in a probability distribution?
It’s the expected value or the balance point of the distribution.
How do you find the median in a probability distribution?
By finding the value at which the CDF equals 0.5, so that half of the probability lies below it and half above.
What role does variance play in probability distributions?
It measures how spread out the values are around the mean.
What is a hypergeometric distribution?
It models the number of successes in a sample without replacement from a finite population.
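A sketch of the hypergeometric probability, C(K, k) C(N - K, n - k) / C(N, n), for a population of N items containing K successes and a sample of n draws. The function name `hypergeom_pmf` and the card example are illustrative assumptions.

```python
from math import comb

def hypergeom_pmf(k: int, N: int, K: int, n: int) -> float:
    """P(X = k): k successes in n draws without replacement
    from N items, of which K are successes."""
    return comb(K, k) * comb(N - K, n - k) / comb(N, n)

# Probability of exactly 2 aces in a 5-card hand from a 52-card deck.
p_two_aces = hypergeom_pmf(2, N=52, K=4, n=5)
print(p_two_aces)  # roughly 0.04
```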
How do z-scores relate to normal distributions?
They measure how many standard deviations a value lies from the mean, z = (x - mean) / standard deviation; for normal data, z-scores follow the standard normal distribution.
What is a negative binomial distribution?
It counts the number of failures before a specified number of successes occurs.
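Under the "failures before the r-th success" parameterization used on this card (other texts count total trials instead), the probability of k failures is C(k + r - 1, k) p^r (1 - p)^k. The function name `neg_binomial_pmf` is a hypothetical choice.

```python
from math import comb

def neg_binomial_pmf(k: int, r: int, p: float) -> float:
    """P(X = k): exactly k failures occur before the r-th success."""
    return comb(k + r - 1, k) * p**r * (1 - p) ** k

# Two failures before the third success, with a 50% success chance per trial.
print(neg_binomial_pmf(2, r=3, p=0.5))  # 0.1875
```

With r = 1 the formula reduces to p (1 - p)^k, the geometric distribution counted in failures.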
Why might one use a log-normal distribution?
To model variables that are positively skewed, such as income or city sizes.
How do you determine if a distribution is skewed?
By looking at the skewness coefficient or comparing the mean and median.
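Both checks can be sketched in a few lines. The code below uses the moment-based skewness (third central moment divided by the variance raised to 1.5), which is one of several common definitions; the data sets are made up for illustration.

```python
from statistics import fmean, median

def skewness(data):
    """Moment-based sample skewness: m3 / m2 ** 1.5."""
    m = fmean(data)
    m2 = fmean([(x - m) ** 2 for x in data])
    m3 = fmean([(x - m) ** 3 for x in data])
    return m3 / m2**1.5

symmetric = [1, 2, 3, 4, 5]
right_skewed = [1, 1, 2, 2, 3, 20]

print(skewness(symmetric))       # 0.0 for perfectly symmetric data
print(skewness(right_skewed))    # positive: long right tail
print(fmean(right_skewed) > median(right_skewed))  # True: mean pulled right
```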
What is the difference between probability and density in distributions?
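In discrete distributions, the probability mass function gives the probability of each value directly; in continuous distributions, the density is not itself a probability, and probabilities come from integrating the density over an interval.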
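One way to see the distinction: a density can exceed 1, while a probability cannot. The narrow normal distribution below (standard deviation 0.1) is an arbitrary example.

```python
from statistics import NormalDist

narrow = NormalDist(mu=0.0, sigma=0.1)

density_at_zero = narrow.pdf(0.0)                     # about 3.99, above 1
prob_interval = narrow.cdf(0.1) - narrow.cdf(-0.1)    # about 0.68, a probability

print(density_at_zero, prob_interval)
```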
How are percentiles calculated in a distribution?
By finding the values below which a certain percentage of the data fall.
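For a known distribution, percentiles are values of the inverse CDF. A brief sketch with `statistics.NormalDist`, assuming a hypothetical score distribution with mean 100 and standard deviation 15:

```python
from statistics import NormalDist

scores = NormalDist(mu=100.0, sigma=15.0)  # hypothetical score distribution

p95 = scores.inv_cdf(0.95)   # 95th percentile, about 124.7
p50 = scores.inv_cdf(0.50)   # 50th percentile, which is the median (100)

print(p95, p50)
```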
What is a hazard function in survival analysis?
It describes the instantaneous rate of failure or death at time t, given survival up to time t.
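As an illustration, the Weibull hazard is h(t) = (shape / scale) (t / scale)^(shape - 1), which also shows how the Weibull shape parameter controls whether the failure rate falls, stays flat, or rises. The function name `weibull_hazard` and the time points are assumptions made for this sketch.

```python
def weibull_hazard(t: float, shape: float, scale: float = 1.0) -> float:
    """Weibull hazard rate at time t > 0."""
    return (shape / scale) * (t / scale) ** (shape - 1)

for shape in (0.5, 1.0, 2.0):
    early = weibull_hazard(0.5, shape)
    late = weibull_hazard(2.0, shape)
    # shape < 1: hazard falls; shape == 1: constant; shape > 1: hazard rises
    print(shape, early, late)
```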
Why are probability distributions essential in machine learning?
They model the underlying distributions of data features necessary for algorithms like classification and regression.
How do you estimate parameters of a distribution?
Through maximum likelihood estimation, method of moments, or Bayesian estimation.
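A minimal sketch of two closed-form cases, using made-up observations. For a normal model the maximum likelihood estimates are the sample mean and the divide-by-n standard deviation; for an exponential model the rate estimate is 1 / sample mean under both maximum likelihood and the method of moments.

```python
from statistics import fmean, pstdev

data = [2, 4, 4, 4, 5, 5, 7, 9]  # hypothetical observations

# Normal model: MLEs are the mean and the population-style standard
# deviation (dividing by n, not n - 1).
mu_hat = fmean(data)       # 5.0
sigma_hat = pstdev(data)   # 2.0

# Exponential model: estimated rate is the reciprocal of the sample mean.
rate_hat = 1 / fmean(data)  # 0.2

print(mu_hat, sigma_hat, rate_hat)
```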
What is a beta distribution, and when is it used?
A continuous distribution on the interval [0, 1] with two shape parameters; it is used to model percentages, proportions, and probabilities.
Describe the process of hypothesis testing using a normal distribution.
Set up a null hypothesis about the mean, calculate the z-score for the sample mean (using the known population standard deviation), and compare it to a critical value.
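These steps can be sketched as a two-sided z-test. All numbers below (null mean, known standard deviation, sample size, sample mean, significance level) are hypothetical.

```python
from math import sqrt
from statistics import NormalDist

mu0 = 100.0          # mean under the null hypothesis
sigma = 15.0         # population standard deviation, assumed known
n = 100              # sample size
sample_mean = 103.0  # observed sample mean
alpha = 0.05         # significance level

z = (sample_mean - mu0) / (sigma / sqrt(n))       # 2.0
critical = NormalDist().inv_cdf(1 - alpha / 2)    # about 1.96
p_value = 2 * (1 - NormalDist().cdf(abs(z)))      # about 0.0455
reject_null = abs(z) > critical

print(z, critical, p_value, reject_null)
```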
How do parameter estimates affect the shape of a distribution?
Changes in estimates can shift or stretch/compress the distribution curve.
What is the law of large numbers, and how does it relate to distributions?
It states that averages of samples converge to the population mean as sample sizes increase.
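A quick simulation of fair die rolls (expected value 3.5) shows the running average settling down as the sample grows. The seed and sample sizes are arbitrary.

```python
import random
from statistics import fmean

random.seed(1)  # fixed seed so the run is repeatable
rolls = [random.randint(1, 6) for _ in range(100_000)]

# The average of the first n rolls drifts toward 3.5 as n increases.
for n in (10, 1_000, 100_000):
    print(n, fmean(rolls[:n]))

final_average = fmean(rolls)
```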
What is the difference between parametric and non-parametric distributions?
A parametric approach assumes a specific distribution family described by a fixed set of parameters; a non-parametric approach does not assume a specific distribution.
How can you model dependencies between variables using distributions?
Through copulas or joint distributions that express how variables influence each other.
What is a mixture model in statistics?
It represents a combination of two or more probability distributions.
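In density terms, a mixture is a weighted sum of component densities with weights that sum to 1. The sketch below assumes two normal components with made-up weights and parameters; `mixture_pdf` is a hypothetical helper name.

```python
from statistics import NormalDist

def mixture_pdf(x, components):
    """Density of a mixture at x.

    components is a list of (weight, NormalDist) pairs whose weights sum to 1.
    """
    return sum(weight * dist.pdf(x) for weight, dist in components)

components = [(0.3, NormalDist(0.0, 1.0)), (0.7, NormalDist(5.0, 1.5))]

# Riemann-sum check over [-10, 20]: a valid density integrates to about 1.
step = 0.01
total = sum(mixture_pdf(-10 + i * step, components) * step for i in range(3000))
print(total)
```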
How do outliers impact the assumptions of normal distribution?
Outliers can signal that the data are not normally distributed, and they distort estimates of the mean and standard deviation.
How can transformations help in normalizing distributions?
By using log, square root, or other transformations to make data more symmetric.
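To illustrate, simulated log-normal data are heavily right-skewed, while their logarithms are roughly symmetric. The skewness helper uses the moment-based definition, and the seed and sample size are arbitrary choices.

```python
import random
from math import log
from statistics import fmean

def skewness(data):
    """Moment-based sample skewness: m3 / m2 ** 1.5."""
    m = fmean(data)
    m2 = fmean([(x - m) ** 2 for x in data])
    m3 = fmean([(x - m) ** 3 for x in data])
    return m3 / m2**1.5

random.seed(2)  # fixed seed so the run is repeatable
raw = [random.lognormvariate(0.0, 1.0) for _ in range(5000)]
logged = [log(x) for x in raw]

skew_raw = skewness(raw)        # strongly positive
skew_logged = skewness(logged)  # close to zero
print(skew_raw, skew_logged)
```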
What is the importance of tail behavior in risk assessment?
Tail behavior helps assess the risk of extreme outcomes in distributions.
How do you handle overdispersion in count data?
By using models like negative binomial when Poisson assumptions do not hold.
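A simple first check for overdispersion is the ratio of the sample variance to the sample mean, which should be near 1 under a Poisson model. The counts below are invented for illustration.

```python
from statistics import fmean, variance

counts = [0, 0, 1, 1, 2, 3, 5, 8, 13]  # hypothetical event counts

mean = fmean(counts)
var = variance(counts)          # sample variance (divides by n - 1)
dispersion_index = var / mean   # well above 1 suggests overdispersion

print(mean, var, dispersion_index)
```

A ratio well above 1, as here, is a hint that a negative binomial model may fit better than Poisson; it is a diagnostic, not a formal test.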
Why might someone use Monte Carlo simulations in studying distributions?
To model and understand the behavior of random variables and uncertainties in complex systems.
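A toy Monte Carlo sketch: estimating the probability that two dice sum to at least 10, which can be checked against the exact answer of 6/36. Real applications use the same idea on systems with no closed-form answer; the seed and trial count are arbitrary.

```python
import random

random.seed(3)  # fixed seed so the run is repeatable
trials = 100_000

# Count simulated rolls of two dice whose sum is at least 10.
hits = sum(
    1
    for _ in range(trials)
    if random.randint(1, 6) + random.randint(1, 6) >= 10
)

estimate = hits / trials
exact = 6 / 36  # six favourable outcomes out of 36
print(estimate, exact)
```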
What are the implications of heavy-tailed distributions in finance?
They can indicate larger risks of extreme outcomes, important for risk management.
How do you interpret a probability plot?
By comparing the data’s quantiles with those of a theoretical distribution; points that fall close to a straight line indicate a good fit, for example when assessing normality.