Statistics Definitions Flashcards
The proportion of people in a defined group who experience an event of interest within a specified period of time.
Absolute risk
A generic term for a single representative value for a set of numbers, for example the mean, median, or mode.
Average
The parameter is a random variable (there is no single "right answer") – "For me, it’s 50%, for you it’s whatever it is for you." Interested in probabilities of hypotheses, which Frequentists refuse to assign. Works with priors, posteriors, and credible intervals, and uses data to update prior beliefs into posterior beliefs.
If you use this method, you win: the definitions are intuitive, e.g. credible intervals are what you wish confidence intervals were.
If you use this method, you lose the ability to talk about any notion of "right answers" and "method quality" – there’s no such thing as statistical significance or rejecting the null. There’s only "more likely" and "less likely" from your perspective.
Bayesian statistics
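A minimal sketch of a Bayesian coin-flip analysis, assuming a Beta(1, 1) prior and made-up data of 7 heads in 10 flips; the credible interval is read straight off the posterior:

```python
import numpy as np

# A minimal sketch of a Bayesian update for a coin's P(Heads).
# Prior belief: Beta(1, 1), i.e. uniform over [0, 1] (hypothetical choice).
# Data: suppose we observe 7 heads in 10 flips (made-up numbers).
heads, tails = 7, 3
prior_a, prior_b = 1, 1

# Beta prior + binomial data -> Beta posterior (conjugacy).
post_a, post_b = prior_a + heads, prior_b + tails

# Draw posterior samples and report a 95% credible interval:
# an interval we believe contains P(Heads) with 95% probability.
samples = np.random.beta(post_a, post_b, size=100_000)
lo, hi = np.percentile(samples, [2.5, 97.5])
print(f"Posterior mean: {samples.mean():.3f}")
print(f"95% credible interval: ({lo:.3f}, {hi:.3f})")
```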
The parameter is a fixed quantity (no probability about it) – “The coin has landed. P(Heads) is 0% or 100%, I just don’t know which one.” Looks at confidence intervals, p-value, power, significance.
You win this way: it makes sense to talk about your method’s quality and “getting the answer right.”
If you use this method, you lose because the core objects make no sense to beginners (e.g. p-values and confidence intervals are hard to think about) and lazy thinkers make a hash out them frequently.
Freqentist statistics
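A minimal sketch of the frequentist stance: the true mean is a fixed number, and the 95% quality claim belongs to the interval-building procedure, not to any single interval. All numbers below are made up:

```python
import numpy as np

# The parameter (true mean = 5) is fixed; the *procedure* (a 95% confidence
# interval) is what carries a quality guarantee over repeated sampling.
rng = np.random.default_rng(0)
true_mean, sigma, n, trials = 5.0, 2.0, 30, 10_000

covered = 0
for _ in range(trials):
    sample = rng.normal(true_mean, sigma, size=n)
    half_width = 1.96 * sigma / np.sqrt(n)      # known-sigma z-interval
    lo, hi = sample.mean() - half_width, sample.mean() + half_width
    covered += (lo <= true_mean <= hi)

# Roughly 95% of intervals built this way capture the fixed true mean.
print(f"Coverage over repeated samples: {covered / trials:.3f}")
```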
A rule of probability that shows how evidence A updates prior beliefs about a proposition B into posterior beliefs: P(B | A) = P(A | B) P(B) / P(A). Bayes’ theorem gives you a mathematical formulation of how much a piece of evidence should change what you believe, providing a way to revise existing beliefs given new or additional evidence.
Bayes’ theorem
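A small numerical illustration with made-up numbers (a rare condition and an imperfect diagnostic test), showing how the evidence "test positive" revises the prior belief:

```python
# Bayes' theorem: P(B | A) = P(A | B) * P(B) / P(A),
# with B = "has condition" and A = "test positive". Numbers are hypothetical.
p_b = 0.01              # prior: 1% of people have the condition
p_a_given_b = 0.95      # test sensitivity
p_a_given_not_b = 0.05  # false-positive rate

# Total probability of a positive test.
p_a = p_a_given_b * p_b + p_a_given_not_b * (1 - p_b)

# Posterior: belief in B after seeing the evidence A.
p_b_given_a = p_a_given_b * p_b / p_a
print(f"P(condition | positive test) = {p_b_given_a:.3f}")  # about 0.16
```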
If X is a random variable which takes on the value 1 with probability p, and 0 with probability 1 - p, the experiment is known as a Bernoulli trial and X has a Bernoulli distribution. X has mean p and variance p(1 - p).
Bernoulli distribution
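A quick simulation check of the mean and variance, with p = 0.3 chosen arbitrarily:

```python
import numpy as np

# Verify the Bernoulli mean p and variance p*(1 - p) by simulation.
rng = np.random.default_rng(0)
p = 0.3
x = rng.binomial(1, p, size=1_000_000)  # Bernoulli = binomial with n = 1

print(f"sample mean     {x.mean():.4f}   vs p           = {p}")
print(f"sample variance {x.var():.4f}   vs p * (1 - p) = {p * (1 - p):.4f}")
```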
Variables that can only take on two values, often yes/no responses to a question. Can be mathematically represented by a Bernoulli distribution.
Binary data
The sampling distribution of a statistic (such as the sample mean) becomes closer to the normal distribution as the sample size increases (this concerns the distribution of the statistic across MULTIPLE samples).
Central Limit Theorem
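A sketch of the theorem in action: even though individual draws come from a skewed exponential population, the means of many samples look roughly normal. Sample size and sample count are arbitrary:

```python
import numpy as np

# Means of samples drawn from a skewed exponential(1) population.
rng = np.random.default_rng(0)
n, num_samples = 50, 20_000

sample_means = rng.exponential(scale=1.0, size=(num_samples, n)).mean(axis=1)

# The sampling distribution of the mean should be close to
# Normal(mean = 1, sd = 1 / sqrt(n)) for the exponential(1) population.
print(f"mean of sample means: {sample_means.mean():.3f}  (theory: 1.000)")
print(f"sd of sample means:   {sample_means.std():.3f}  (theory: {1 / np.sqrt(n):.3f})")
```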
As we increase the size of a sample, the sample mean gets closer to the true mean of the distribution we are sampling from (this concerns only ONE sample).
Law of large numbers
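A sketch with a single growing sample of fair-die rolls, whose running mean drifts toward the true mean of 3.5:

```python
import numpy as np

# Running mean of ONE growing sample of fair-die rolls.
rng = np.random.default_rng(0)
rolls = rng.integers(1, 7, size=100_000)
running_mean = rolls.cumsum() / np.arange(1, rolls.size + 1)

for n in (10, 100, 10_000, 100_000):
    print(f"after {n:>7} rolls: running mean = {running_mean[n - 1]:.4f}")
```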
Events happen at a constant average rate, but completely at random – each event occurs independently of the time since the last one.
Poisson process
Describes the probability of some number of events happening over a fixed period of time.
Poisson distribution
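A sketch comparing the Poisson PMF formula with a simulation, using an arbitrary rate of lambda = 4 events per interval:

```python
import math
import numpy as np

# Compare the Poisson PMF formula with simulated event counts.
lam = 4
rng = np.random.default_rng(0)
counts = rng.poisson(lam, size=200_000)

for k in range(8):
    pmf = math.exp(-lam) * lam**k / math.factorial(k)   # P(X = k)
    sim = np.mean(counts == k)
    print(f"P(X = {k}) formula: {pmf:.4f}   simulated: {sim:.4f}")
```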
The parameter of the Poisson distribution. Represents the average number of events per time interval, and is also the expected value of the distribution.
Lambda
Represents the probability of a given amount of time passing between Poisson events. Unlike the Poisson distribution, it is continuous, since it represents time.
Exponential distribution
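A sketch linking the two distributions: waiting times between Poisson events at an arbitrary rate lambda = 2 follow an Exponential distribution with mean 1/lambda:

```python
import numpy as np

# Waiting times between events of a rate-2 Poisson process are
# Exponential(lambda = 2), with mean 1 / lambda.
rng = np.random.default_rng(0)
lam = 2.0
waits = rng.exponential(scale=1 / lam, size=500_000)  # inter-arrival times

print(f"mean waiting time: {waits.mean():.3f}   (theory: {1 / lam:.3f})")
print(f"P(wait > 1):       {np.mean(waits > 1):.3f}   (theory: {np.exp(-lam):.3f})")
```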
Its shape is similar to the normal distribution, but not quite the same: its tails are thicker, so observations are more likely to fall far from the mean.
We use it when our sample size is at most about 30 and the population SD is unknown.
Student’s t-distribution
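A sketch of the "thicker tails" claim, comparing tail probabilities of a t-distribution with 5 degrees of freedom (an arbitrary small value) against the standard normal:

```python
import numpy as np

# The probability of landing more than 3 standard units from the center is
# larger under a t-distribution with few degrees of freedom than under the
# standard normal.
rng = np.random.default_rng(0)
n = 1_000_000

t_draws = rng.standard_t(df=5, size=n)
z_draws = rng.standard_normal(size=n)

print(f"P(|T| > 3), t with df = 5: {np.mean(np.abs(t_draws) > 3):.4f}")
print(f"P(|Z| > 3), normal:        {np.mean(np.abs(z_draws) > 3):.4f}")
```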
A parameter which controls the thickness of the tails in a Student’s t-distribution; as the degrees of freedom increase, the t-distribution approaches the normal distribution.
Degrees of freedom (df)