2: Statistical Review Flashcards
sample space
set of all possible outcomes
event
subset of the sample space
random variable
variable whose possible values are numerical outcomes of some random phenomenon
function defined on the set of possible outcomes that assigns a real number to every possible outcome
two major classes of random variables
discrete and continuous
probability
number between 0 and 1 that quantifies how likely an event is to occur
probability function
describes/characterises a discrete random variable
gives the probability of each possible discrete outcome
cumulative distribution function
describes/characterises the distribution of a random variable
lists the probability that a random variable is less than or equal to a specific value
also called the distribution function or cumulative risk profile
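A minimal sketch of how a cumulative distribution function follows from a probability function. The fair six-sided die and the names `outcomes`, `pmf` and `cdf` are assumptions made for illustration only.

```python
from fractions import Fraction
from itertools import accumulate

# Hypothetical example: probability function of a fair six-sided die.
outcomes = [1, 2, 3, 4, 5, 6]
pmf = {x: Fraction(1, 6) for x in outcomes}

# The CDF at x is the running total of the probabilities of all outcomes <= x.
cdf = dict(zip(outcomes, accumulate(pmf[x] for x in outcomes)))

for x in outcomes:
    print(f"P(X <= {x}) = {cdf[x]}")
```

The CDF never decreases and reaches 1 at the largest outcome.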
continuous random variable
random variable that can take on any real value within some range
probability density function
determines the probabilities associated with a continuous random variable
the probability of an interval is the area under the density over that interval
mode
value occurring with the greatest probability
median
value such that the probability of the random variable being less than or equal to that value is at least 50% and the probability of the random variable being greater than or equal to that value is at least 50%
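A small sketch that applies the mode and median definitions literally. The distribution (number of heads in two fair coin flips) and the helper `is_median` are hypothetical, chosen only to keep the arithmetic easy to check.

```python
from fractions import Fraction

# Hypothetical example: number of heads in two fair coin flips.
pmf = {0: Fraction(1, 4), 1: Fraction(1, 2), 2: Fraction(1, 4)}

# Mode: the value with the greatest probability.
mode = max(pmf, key=pmf.get)

def is_median(m):
    """True if P(X <= m) >= 1/2 and P(X >= m) >= 1/2."""
    p_le = sum(p for x, p in pmf.items() if x <= m)
    p_ge = sum(p for x, p in pmf.items() if x >= m)
    half = Fraction(1, 2)
    return p_le >= half and p_ge >= half

# Check every possible outcome against the median definition.
medians = [x for x in pmf if is_median(x)]
print("mode:", mode, "median(s):", medians)
```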
mean/expected value
weighted average of all possible outcomes, weighted by probabilities of outcomes
variance
measures the spread or dispersion of the variable around its mean
equal to the square of the standard deviation
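A minimal sketch of the expected value and variance computed straight from their definitions. The fair die and the variable names are assumptions for illustration.

```python
from fractions import Fraction
import math

# Hypothetical example: a fair six-sided die.
pmf = {x: Fraction(1, 6) for x in range(1, 7)}

# Expected value: each outcome weighted by its probability.
mean = sum(x * p for x, p in pmf.items())

# Variance: probability-weighted squared deviation from the mean.
variance = sum((x - mean) ** 2 * p for x, p in pmf.items())

# Standard deviation: square root of the variance.
std_dev = math.sqrt(variance)

print("mean:", mean, "variance:", variance, "standard deviation:", round(std_dev, 4))
```

Exact fractions are used so the mean and variance are not affected by floating-point rounding.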
characteristics of normal distribution
defined by mean and standard deviation
single-peaked
symmetric around the mean
can be standardised to have mean 0 and variance 1 (the standard normal distribution)
joint probability distribution
gives the probability that two random variables simultaneously take on particular values
conditional distribution
distribution of a random variable conditional on another random variable taking on a specific value
conditional expectation
expected value of a random variable, conditional on the realised value of another random variable
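A sketch of how a conditional distribution and a conditional expectation can be read off a joint distribution. The joint table and the function names are hypothetical.

```python
from fractions import Fraction

# Hypothetical joint distribution of (X, Y); the four probabilities sum to 1.
joint = {
    (0, 0): Fraction(1, 8),
    (0, 1): Fraction(3, 8),
    (1, 0): Fraction(1, 4),
    (1, 1): Fraction(1, 4),
}

def marginal_x(x):
    """P(X = x): sum the joint probabilities over all values of Y."""
    return sum(p for (xi, _), p in joint.items() if xi == x)

def conditional_y_given_x(x):
    """Distribution of Y given X = x: joint probability divided by P(X = x)."""
    px = marginal_x(x)
    return {y: p / px for (xi, y), p in joint.items() if xi == x}

def conditional_expectation_y_given_x(x):
    """E[Y | X = x]: mean of Y under the conditional distribution."""
    return sum(y * p for y, p in conditional_y_given_x(x).items())

for x in (0, 1):
    print(f"E[Y | X = {x}] = {conditional_expectation_y_given_x(x)}")
```

In this table the conditional expectation of Y changes with the value of X, so X carries information about Y.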
independence
X ⊥ Y
knowing X tells you nothing about Y
joint distribution is the product of the marginals
covariance
variance of X - measures how X alone varies
covariance of X and Y - measures how X and Y vary together
correlation
covariance divided by the product of the two standard deviations, so it lies between -1 and 1 and is unit-free
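A sketch of why correlation is unit-free while covariance is not. The data values and the helper names are invented for illustration, and the formulas are the population versions (divide by the number of observations).

```python
import math

def mean(values):
    return sum(values) / len(values)

def covariance(xs, ys):
    """Population covariance: average product of deviations from the means."""
    mx, my = mean(xs), mean(ys)
    return sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / len(xs)

def correlation(xs, ys):
    """Covariance divided by the product of the standard deviations."""
    return covariance(xs, ys) / math.sqrt(covariance(xs, xs) * covariance(ys, ys))

# Hypothetical data: heights in metres and weights in kilograms.
heights_m = [1.60, 1.70, 1.80, 1.90]
weights_kg = [55.0, 68.0, 74.0, 90.0]

# The same heights expressed in centimetres.
heights_cm = [100 * h for h in heights_m]

print("covariance (m, kg): ", covariance(heights_m, weights_kg))
print("covariance (cm, kg):", covariance(heights_cm, weights_kg))
print("correlation (m, kg): ", correlation(heights_m, weights_kg))
print("correlation (cm, kg):", correlation(heights_cm, weights_kg))
```

Changing the unit of height multiplies the covariance by 100 but leaves the correlation unchanged. Note that `covariance(xs, xs)` is simply the variance of `xs`.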
independence vs uncorrelated
independent random variables are also uncorrelated, but the converse is not true: uncorrelated random variables can still be dependent
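A standard counterexample, sketched under the assumption that X takes the values -1, 0 and 1 with equal probability and Y = X². The names `joint` and `expect` are hypothetical helpers.

```python
from fractions import Fraction

# Hypothetical example: X is -1, 0 or 1 with equal probability, and Y = X**2.
# Y is completely determined by X, so the two are clearly dependent.
joint = {(x, x * x): Fraction(1, 3) for x in (-1, 0, 1)}

def expect(f):
    """Expected value of f(X, Y) under the joint distribution."""
    return sum(f(x, y) * p for (x, y), p in joint.items())

e_x = expect(lambda x, y: x)
e_y = expect(lambda x, y: y)
cov_xy = expect(lambda x, y: x * y) - e_x * e_y

# Independence would require P(X=x, Y=y) == P(X=x) * P(Y=y) for every pair.
p_x0 = expect(lambda x, y: 1 if x == 0 else 0)
p_y0 = expect(lambda x, y: 1 if y == 0 else 0)
p_x0_y0 = joint.get((0, 0), Fraction(0))

print("Cov(X, Y) =", cov_xy)
print("P(X=0, Y=0) =", p_x0_y0, "but P(X=0) * P(Y=0) =", p_x0 * p_y0)
```

The covariance is zero, yet the joint probability is not the product of the marginals, so X and Y are uncorrelated but not independent.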
population
set of all information of interest to the decision-maker
sample
subset of a population
to be useful, has to be representative of the population
simple random sample
sample in which each individual in the population is equally likely to be included
e.g. random draws
leads to independent and identically distributed draws
point estimates
value of an estimator computed from a sample of data (a subset of the population) to learn about the population
single number used to estimate an unknown population parameter
the law of large numbers
as the sample size n goes to infinity, the sample mean converges to the population mean
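A small simulation sketch of the law of large numbers. The fair die, the fixed seed and the function name `sample_mean_of_die_rolls` are assumptions for illustration; a simulation suggests the result but does not prove it.

```python
import random

random.seed(0)  # fixed seed so the run is repeatable

def sample_mean_of_die_rolls(n):
    """Average of n simulated rolls of a fair six-sided die."""
    return sum(random.randint(1, 6) for _ in range(n)) / n

population_mean = 3.5  # expected value of a fair die

for n in (10, 100, 10_000, 100_000):
    m = sample_mean_of_die_rolls(n)
    print(f"n = {n:>7}: sample mean = {m:.4f}, error = {abs(m - population_mean):.4f}")
```

The error typically shrinks as n grows, although it need not fall at every single step.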
the central limit theorem
the average from a random sample from any population (with finite variance), when standardised, has an asymptotic standard normal distribution, meaning that it becomes well approximated by a standard normal as the sample size grows
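A simulation sketch of the central limit theorem. The skewed population (exponential with rate 1), the sample size of 200, the 5,000 repetitions and the name `standardised_mean` are all assumptions chosen for illustration.

```python
import math
import random
import statistics

random.seed(0)  # fixed seed so the run is repeatable

# Hypothetical population: exponential with rate 1 (mean 1, standard deviation 1).
# It is strongly skewed, so it looks nothing like a normal distribution.
POP_MEAN, POP_SD = 1.0, 1.0

def standardised_mean(n):
    """Draw n values; return sqrt(n) * (sample mean - population mean) / population sd."""
    sample_mean = sum(random.expovariate(1.0) for _ in range(n)) / n
    return math.sqrt(n) * (sample_mean - POP_MEAN) / POP_SD

z_values = [standardised_mean(200) for _ in range(5_000)]

# If the normal approximation is good, about 95% of values fall within +/-1.96.
share_within = sum(abs(z) <= 1.96 for z in z_values) / len(z_values)

print(f"mean of standardised averages: {statistics.mean(z_values):.3f}")
print(f"standard deviation:            {statistics.pstdev(z_values):.3f}")
print(f"share within +/-1.96:          {share_within:.3f}")
```

The standardised averages should have mean near 0, standard deviation near 1 and roughly 95% of their mass within ±1.96, even though the underlying population is skewed.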
properties of estimators
unbiasedness, consistency, efficiency
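A sketch of what unbiasedness means, checked by listing every possible sample rather than by simulation. The population (0 or 1 with equal probability), the sample size of 2 and the helper names are hypothetical.

```python
from fractions import Fraction
from itertools import product

# Hypothetical population: 0 or 1 with equal probability (mean 1/2, variance 1/4).
values = (0, 1)
n = 2  # sample size

def sample_mean(sample):
    return Fraction(sum(sample), len(sample))

def variance_estimate(sample, divisor):
    """Sum of squared deviations from the sample mean, divided by `divisor`."""
    m = sample_mean(sample)
    return sum((x - m) ** 2 for x in sample) / divisor

# All equally likely samples of size n drawn independently from the population.
samples = list(product(values, repeat=n))

def expected(estimator):
    """Average of the estimator over every possible sample."""
    return sum(estimator(s) for s in samples) / len(samples)

e_mean = expected(sample_mean)
e_var_n = expected(lambda s: variance_estimate(s, n))
e_var_n_minus_1 = expected(lambda s: variance_estimate(s, n - 1))

print("E[sample mean] =", e_mean)
print("E[variance estimate, divisor n] =", e_var_n)
print("E[variance estimate, divisor n - 1] =", e_var_n_minus_1)
```

An estimator is unbiased when its expected value equals the population parameter. Here the sample mean and the variance estimate with divisor n - 1 hit their targets (1/2 and 1/4), while the divisor-n version is too small on average.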