3. Mathematical Foundations/Probability Theory, II Flashcards
An introduction to continuous distributions
What is the probability density function (PDF)?
The PDF is a function that describes the relative likelihood of a continuous random variable, 𝑋, taking on different values. The PDF curve represents the “concentration” or “density” of probability across the variable’s possible values. The height of the PDF at a specific point indicates how densely probability is packed around that value.
* Relative likelihood: How “likely” the variable is to be near a specific value compared to other values.
* Unlike a PMF, the PDF can take values greater than 1, because it does not directly describe probabilities at specific points.
* Formally, a PDF must satisfy f(x) ≥ 0 for all x.
While the PDF itself does not give the probability of specific values, it allows us to calculate probabilities over intervals by integrating the PDF over those intervals. This integral gives the area under the curve between two points a and b, which corresponds to the probability of the variable falling within that interval.
* The total area under the PDF curve is 1 (to satisfy the rules of probability).
* P(X ∈ [a,b]) = ∫_a^b f(x) dx
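To make this concrete, here is a minimal sketch (assuming SciPy is available; the standard normal is just an example choice) that computes an interval probability by numerically integrating the PDF:

```python
# Minimal sketch (SciPy assumed): P(X in [a, b]) as the area under the PDF,
# obtained by numerically integrating the PDF over [a, b].
from scipy import stats
from scipy.integrate import quad

dist = stats.norm(loc=0, scale=1)    # standard normal as an example PDF
a, b = -1.0, 1.0

area, _ = quad(dist.pdf, a, b)       # integrate f(x) from a to b
print(area)                          # ~0.6827, the familiar "68%" within 1 sigma
```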
What is the cumulative distribution function (CDF) for continuous variables?
The CDF for a continuous random variable, 𝑋, describes the probability that 𝑋 will take on a value less than or equal to a given point 𝑥, i.e. a cumulative probability.
* It is defined as: F(x) = P(X ≤ x) = ∫_{−∞}^{x} f(t) dt
* The CDF is the integral of the PDF, and the PDF is the derivative of the CDF.
* The CDF is constrained to take values between 0 and 1.
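A small sketch of both directions of this relationship, again assuming SciPy and using the standard normal as an example:

```python
# Sketch (SciPy assumed): the CDF at x is the integral of the PDF up to x,
# and the PDF at x is the derivative of the CDF at x.
import numpy as np
from scipy import stats
from scipy.integrate import quad

dist = stats.norm()
x = 0.5

# CDF as the integral of the PDF from -infinity to x
cdf_from_pdf, _ = quad(dist.pdf, -np.inf, x)
print(cdf_from_pdf, dist.cdf(x))     # both ~0.6915

# PDF as the (numerical) derivative of the CDF
h = 1e-6
pdf_from_cdf = (dist.cdf(x + h) - dist.cdf(x - h)) / (2 * h)
print(pdf_from_cdf, dist.pdf(x))     # both ~0.3521
```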
What is the difference between the PDF, the area under the PDF curve, and the CDF?
- PDF: Tells where the variable is more or less likely (density).
- Area under the PDF: Converts the density into a meaningful probability over a range.
- CDF: Tells the probability that the variable is below or equal to a specific value.
What are the parameters of a PDF?
A parameter is a constant value that defines a characteristic of a mathematical function or statistical model. In the context of probability distributions, parameters determine the specific shape, location, or spread of the distribution.
- Location parameter: Specifies the center of the distribution, often represented empirically as the mean (𝜇).
- Scale parameter: Describes the spread or width of the distribution around its center. It is often empirically related to the standard deviation (𝜎).
- Dispersion parameter: The square of the scale parameter, empirically corresponding to the variance (σ^2) in statistics. It emphasizes how much the values deviate from the center, particularly highlighting extreme values.
- Shape parameter: Describes the overall shape of the distribution, such as its skewness, peakedness, or kurtosis. Not all distributions have shape parameters.
Not all PDFs require all these parameters, and the specific parameters depend on the type of distribution.
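As a hedged illustration of how these parameters appear in practice: SciPy parameterizes its continuous distributions with `loc` and `scale` arguments (and, for some families, extra shape parameters); the distributions used here are just example choices.

```python
# Hedged sketch (SciPy assumed): loc and scale map onto the location and
# scale parameters described above; the normal is an example family.
from scipy import stats

dist = stats.norm(loc=10, scale=2)   # location mu = 10, scale sigma = 2

print(dist.mean())   # 10.0 -> location parameter (mean)
print(dist.std())    # 2.0  -> scale parameter (standard deviation)
print(dist.var())    # 4.0  -> dispersion parameter (variance = scale squared)

# Some families also take shape parameters, e.g. the gamma distribution:
skewed = stats.gamma(a=2.0)          # 'a' is a shape parameter
print(skewed.stats(moments='s'))     # nonzero skewness produced by the shape
```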
What are joint distributions for continuous variables?
A joint distribution describes the probability behavior of two or more random variables occurring together. It captures how the values or probabilities of one variable relate to those of another.
1. An empirical joint distribution is based on observed data. When we have two variables, we want to see how they relate to one another; this is typically visualized with a scatter plot when both variables are continuous (or when one is continuous and one is discrete).
2. A theoretical joint distribution is a mathematical model that describes how two (or more) random variables are related in theory. For continuous random variables, the joint distribution is described by a joint probability density function (PDF), denoted f(x,y), which represents the density of probability for X near x and Y near y.
* If X and Y are independent, their joint distribution is just the product of their individual distributions: f(x,y) = f_X(x)·f_Y(y)
* If X is conditioned on Y, the joint distribution factors as: f(x,y) = f_{X|Y}(x|y)·f_Y(y)
To calculate the probability that variables X and Y fall within specific ranges, we integrate the joint PDF over those ranges.
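A small sketch of the independence factorization, assuming SciPy and two example normal marginals: the double integral of the product PDF over a rectangle should match the product of the marginal interval probabilities.

```python
# Sketch (SciPy assumed): for independent X and Y the joint PDF factors,
# so integrating f_X(x)*f_Y(y) over a rectangle equals the product of the
# marginal interval probabilities. The marginals here are example choices.
from scipy import stats
from scipy.integrate import dblquad

fx = stats.norm(0, 1)
fy = stats.norm(2, 0.5)

def joint_pdf(y, x):                  # dblquad passes (y, x)
    return fx.pdf(x) * fy.pdf(y)

a, b, c, d = -1.0, 1.0, 1.5, 2.5      # x-range [a, b], y-range [c, d]
prob, _ = dblquad(joint_pdf, a, b, c, d)

print(prob)                                               # ~0.466
print((fx.cdf(b) - fx.cdf(a)) * (fy.cdf(d) - fy.cdf(c)))  # same value
```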
What is the expectation of a (continuous) random variable?
The expectation of a random variable, denoted E(X), is the weighted average of the values that the variable can take, where the weights are given by the probability distribution of X.
For a continuous random variable, we calculate the expectation E(X) as E(X) = ∫_{−∞}^{∞} x·f(x) dx
* x: Represents all possible values of X along a continuum.
* f(x): The probability density function (PDF) of X, which describes how probabilities are distributed.
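As a quick sketch (SciPy assumed; the distribution is an arbitrary example), the expectation integral can be evaluated numerically and compared with the distribution’s known mean:

```python
# Sketch (SciPy assumed): E(X) = integral of x * f(x) dx, computed numerically.
import numpy as np
from scipy import stats
from scipy.integrate import quad

dist = stats.norm(loc=3, scale=1.5)

expectation, _ = quad(lambda x: x * dist.pdf(x), -np.inf, np.inf)

print(expectation)   # ~3.0
print(dist.mean())   # 3.0, the known mean of N(3, 1.5^2)
```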
What are moments of a distribution (continuous variables)?
Moments of a distribution are numerical measures that describe various characteristics of a probability distribution.
* 1st moment (mean) measures the central tendency or average value of the distribution. Moment about zero (raw).
* 2nd moment (variance) measures the spread or dispersion of the distribution around the mean. A larger variance indicates more spread; a smaller variance indicates tighter clustering around the mean. Moment about the mean (central).
* 3rd moment (skewness) describes the asymmetry of the distribution, where 1. positive skewness: right tail (higher values) is longer or fatter. 2. negative skewness: left tail (lower values) is longer or fatter. 3. zero skewness: symmetrical distribution. Standardized central moment (the 3rd central moment divided by σ^3).
* 4th moment (kurtosis) measures the “tailedness” or the extent of extreme values in the distribution, where 1. high kurtosis: heavy tails, more extreme values or outliers. 2. low kurtosis: light tails, fewer extreme values. 3. kurtosis = 3: the kurtosis of a normal distribution; values above or below this indicate deviation from normality. Standardized central moment (the 4th central moment divided by σ^4).
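A brief sketch of reading off the four moments with SciPy (note that SciPy reports excess kurtosis, i.e. a normal distribution shows 0 rather than 3):

```python
# Sketch (SciPy assumed): the first four moments of two example distributions.
from scipy import stats

# Symmetric reference case: the standard normal
print(stats.norm(0, 1).stats(moments='mvsk'))   # mean 0, var 1, skew 0, kurt 0

# Right-skewed, heavy-tailed example: the exponential distribution
print(stats.expon().stats(moments='mvsk'))      # mean 1, var 1, skew 2, kurt 6
```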
What is the difference between covariance and the correlation coefficient, and how do they measure the relationship between two random variables?
Covariance and the correlation coefficient both measure the relationship between two random variables X and Y; they are the relevant moments when working with joint distributions (both are related to the 2nd moment).
Covariance measures the extent to which two variables change together. If X and Y tend to increase or decrease together, the covariance is positive. If one variable increases while the other decreases, the covariance is negative. If there is no consistent linear relationship, the covariance is zero.
Correlation coefficient (Pearson’s r) standardizes the covariance to provide a dimensionless measure of the linear relationship between two variables: r = Cov(X,Y)/(σ_X·σ_Y). It ranges from −1 to 1, where −1 is a perfect negative linear relationship, 1 is a perfect positive linear relationship, and 0 is no linear relationship.
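A minimal sketch estimating both quantities from simulated data, assuming NumPy; the linear relation y = 2x + noise is an arbitrary example:

```python
# Sketch (NumPy assumed): covariance vs. Pearson's r on simulated data.
import numpy as np

rng = np.random.default_rng(0)
x = rng.normal(size=1_000)
y = 2 * x + rng.normal(size=1_000)

cov = np.cov(x, y)[0, 1]       # scale-dependent: units of x times units of y
r = np.corrcoef(x, y)[0, 1]    # dimensionless, always in [-1, 1]

print(cov)   # ~2.0
print(r)     # ~0.89 (= 2 / sqrt(5) in theory)
```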
What are common continuous distributions? - Uniform distribution
A continuous uniform distribution describes a random variable X that takes values in a specified interval [a,b], where all outcomes within the interval are equally likely. It is characterized by a constant probability density f(x) = 1/(b−a) over the range [a,b].
* Parameter(s): a: the lower bound of the interval, b: the upper bound of the interval.
* Conditions: a<b and X takes values in [a,b].
* The uniform distribution can also be applied to discrete variables.
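A short sketch, assuming SciPy, which parameterizes the uniform distribution as loc = a and scale = b − a; the interval [2, 5] is an arbitrary example:

```python
# Sketch (SciPy assumed): a uniform distribution on [a, b] = [2, 5].
from scipy import stats

a, b = 2.0, 5.0
dist = stats.uniform(loc=a, scale=b - a)

print(dist.pdf(3.0))   # 1 / (b - a) = 0.333..., constant inside [a, b]
print(dist.pdf(6.0))   # 0.0 outside [a, b]
print(dist.cdf(3.5))   # (3.5 - a) / (b - a) = 0.5
```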
What are common continuous distributions? - Normal distribution
A normal distribution describes a random variable X that can take any value on the real number line (−∞ < X < ∞). The probability density is highest near the mean μ and decreases symmetrically as values move further away from μ. It is characterized by a bell-shaped curve that is fully determined by its mean μ and standard deviation σ. It is often denoted N(μ, σ^2).
* Parameter(s): μ is the mean, which defines the center of the distribution and σ is the standard deviation, which defines the spread of the distribution.
* Conditions: X can take any real value (−∞<X<∞). The distribution is symmetric about the mean 𝜇.
The standard normal distribution is a special case of the normal distribution where: μ=0 and σ=1. It is often used in statistical calculations and denoted as N(0,1).
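A small sketch, assuming SciPy, showing that standardizing a normal variable reproduces the same probability under N(0,1); the values of μ, σ, and x are arbitrary examples:

```python
# Sketch (SciPy assumed): converting a normal probability to a standard
# normal probability via the z-score.
from scipy import stats

mu, sigma = 100, 15
dist = stats.norm(loc=mu, scale=sigma)

x = 130
z = (x - mu) / sigma                  # standardize to N(0, 1)

print(dist.cdf(x))                    # P(X <= 130) under N(100, 15^2), ~0.977
print(stats.norm(0, 1).cdf(z))        # the same probability under N(0, 1)
```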
What are common continuous distributions? - Logistic distribution
The logistic distribution describes a random variable X that can take any real value (−∞ < X < ∞). Its probability density is highest near the location parameter μ, and it decreases symmetrically as values move further away, with heavier tails than the normal distribution. It is often used in modeling binary outcomes.
* Parameter(s): μ, the location parameter, which defines the center of the distribution, and σ, the scale parameter, which defines the spread of the distribution.
* Conditions: X is a continuous random variable on (−∞, ∞). The distribution is symmetric about μ.
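A brief sketch comparing the tails, assuming SciPy and matching the two distributions’ standard deviations so the comparison is fair:

```python
# Sketch (SciPy assumed): the logistic distribution puts more probability in
# its tails than a normal distribution with the same standard deviation.
from scipy import stats

logistic = stats.logistic(loc=0, scale=1)
normal = stats.norm(loc=0, scale=logistic.std())   # match the spreads

print(logistic.sf(4))   # ~0.018: P(X > 4) under the logistic
print(normal.sf(4))     # ~0.014: lighter tail under the matched normal
```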