2. Mathematical Foundations/Probability Theory, I Flashcards

An introduction to probability and to discrete distributions

1
Q

What is probability, and what is it used for?

A

Probability can be thought of as a formal framework to model and quantify uncertainty. It provides a structured way to evaluate how likely events are to occur.

2
Q

What are key concepts in probability theory?

A

1. Outcome: The smallest subunit that can happen - a single possible result (e.g., rolling a 4 on a die).
2. Event: A set of one or more outcomes from the sample space - a portion of outcomes grouped together.
- Random events: Occur with a certain probability (e.g., rolling an even number: outcomes 2, 4, 6).
- Deterministic events: Always or never occur under specific conditions (e.g., the sun rising tomorrow).
3. Sample space: The set of all possible outcomes - a complete list of everything that could happen. The probabilities of all outcomes in the sample space sum to 1.
A probability of 0 means the event cannot happen, 1 means it will certainly happen, and probabilities between 0 and 1 indicate uncertainty.
A probability of 0 means the event cannot happen, 1 means it will certainly happen, and probabilities between 0 and 1 indicate uncertainty.

3
Q

What are the two main different accounts of probability?

A

1. The objective account: Objective probability is based on theory or on observations from data (two approaches).

1.1. The classical/theoretical approach focuses on the ratio of the outcomes in an event to all possible outcomes, where each outcome has the same probability.
* P(e) = (Number of outcomes in event e)/(Number of outcomes in the sample space)

Ranges from 0 to 1.

1.2. The empirical approach focuses on the relative frequency with which a given outcome occurs - probability based on data observed in the real world.
* P(A) = (Number of times A occurs in the sample/population)/(Total number of observations in the sample/population)

Ranges from 0 to 1.

2. The subjective account: Subjective probability is based on people's expertise (or lack thereof) and judgments - an educated guess as to the chances of an event occurring. It is based neither on data nor on theory.
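The two objective approaches can be sketched side by side in Python, using a fair six-sided die as a running example. The event definitions and the 10,000-roll simulation are assumptions made for this demo:

```python
import random

# Classical/theoretical probability: outcomes in the event / all outcomes.
sample_space = [1, 2, 3, 4, 5, 6]
even = {2, 4, 6}
p_classical = len(even) / len(sample_space)  # 3 / 6 = 0.5

# Empirical probability: relative frequency of the event in observed data.
random.seed(1)  # fixed seed so the simulation is reproducible
rolls = [random.choice(sample_space) for _ in range(10_000)]
p_empirical = sum(r in even for r in rolls) / len(rolls)
```

With enough observations, the empirical estimate should land close to the classical value of 0.5, which is exactly the link between the two approaches that the law of large numbers formalizes.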

4
Q

What are simple and compound events?

A
  1. A simple event consists of a single outcome from the sample space (e.g., rolling a 3 on a six-sided die).
  2. A compound event consists of two or more outcomes combined from the sample space (e.g., rolling an even number on a six-sided die).
5
Q

What is independence, mutual exclusivity and collective exhaustivity in relation to compound events?

A
  1. Independence: Two events are independent if the occurrence of one does not affect the probability of the other.
    - P(A∣B) = P(A): the probability of event A occurring given that event B has occurred equals the probability of event A occurring.
  2. Mutual exclusivity: Two events are mutually exclusive when one cannot occur if the other has occurred.
    - P(A∣B) = 0: the probability of event A occurring given that event B has occurred equals 0.
    - P(A∩B) = 0: the probability of both events A and B occurring together (intersection: "and") equals 0.
  3. Collective exhaustivity: A set of events is collectively exhaustive if every possible outcome belongs to at least one event in the set. For A and B to be collectively exhaustive, one of them must occur, so the entire probability space is covered by A and B.
    - For mutually exclusive, collectively exhaustive events A and B: P(A) + P(B) = 1.
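All three properties can be checked directly with classical probabilities on a six-sided die. This is a minimal sketch; the events A, B, and C and the helper `p` are names chosen for the demo:

```python
from fractions import Fraction

space = {1, 2, 3, 4, 5, 6}

def p(event):
    """Classical probability of an event (a set of outcomes)."""
    return Fraction(len(event & space), len(space))

A = {2, 4, 6}   # even
B = {1, 2}      # at most 2
C = {1, 3, 5}   # odd

# Independence: P(A ∩ B) == P(A) * P(B)  ->  1/6 == 1/2 * 1/3
independent = p(A & B) == p(A) * p(B)

# Mutual exclusivity: P(A ∩ C) == 0 (no shared outcomes)
mutually_exclusive = p(A & C) == 0

# Collective exhaustivity (A and C are also mutually exclusive): P(A) + P(C) == 1
exhaustive = p(A) + p(C) == 1
```

Using `Fraction` keeps the probabilities exact, so the equalities hold without floating-point tolerance.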
6
Q

What are joint and conditional probabilities?

A

A joint probability is the probability of a compound event. The calculation depends on the relationship between the events:
* If the events are independent, their joint probability is the product of their individual probabilities: P(A∩B) = P(A)⋅P(B).
* If the events are mutually exclusive, their joint probability is 0 (they cannot occur together); the probability that either one occurs is the sum of their individual probabilities: P(A∪B) = P(A) + P(B).

A conditional probability is the probability of one event occurring, given that another event has already occurred.
P(A∣B) = P(A∩B)/P(B), if P(B) > 0
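The conditional-probability formula can be verified on a die, again with illustrative event names chosen for this sketch:

```python
from fractions import Fraction

space = {1, 2, 3, 4, 5, 6}

def p(event):
    """Classical probability of an event (a set of outcomes)."""
    return Fraction(len(event), len(space))

A = {2, 4, 6}   # even
B = {4, 5, 6}   # greater than 3

# Joint probability of A and B (their intersection).
p_joint = p(A & B)          # P({4, 6}) = 2/6 = 1/3

# Conditional probability P(A | B) = P(A ∩ B) / P(B).
p_cond = p(A & B) / p(B)    # (1/3) / (1/2) = 2/3
```

Intuitively: once we know the roll was greater than 3, two of the three remaining outcomes (4, 6) are even, hence 2/3.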

7
Q

What is the difference between combinations and permutations, and how are they calculated?

A

A combination is a way of choosing k objects from n objects when one does not care about the order in which the k objects are chosen.
* C(n, k) = n!/(k!(n−k)!), read "n choose k".

A permutation is a way of choosing k objects from n objects while considering the order in which the k objects are chosen.
* P(n, k) = n!/(n−k)!
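Python's standard library implements both formulas directly (`math.comb` and `math.perm`, available from Python 3.8):

```python
import math

n, k = 5, 2

# Combinations: order does not matter ("n choose k").
n_combinations = math.comb(n, k)   # 5! / (2! * 3!) = 10

# Permutations: order matters.
n_permutations = math.perm(n, k)   # 5! / 3! = 20
```

Note that the permutation count is always the combination count times k!, since each unordered choice of k objects can be arranged in k! orders.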

8
Q

What is a random variable and a distribution?

A

A random variable is a variable that can take on different values, with each value having a specific probability based on a random process.

A distribution describes the set of values the variable might take and assigns a probability to each possible value (or range of values, for continuous variables) - each possible value has a weight. The probability distribution can be represented as:
* A probability mass function (PMF) for discrete variables, assigning probabilities to each specific value.
* A probability density function (PDF) for continuous variables, where probabilities are described over intervals rather than individual points.

9
Q

What is the difference between a probability distribution and a sample distribution?

A

A probability distribution is a theoretical model describing the possible values a variable can take and their associated probabilities, representing the entire population or theoretical outcomes.

In contrast, a sample distribution is an empirical model that shows the observed distribution of data collected from a subset of the population, reflecting the actual measurements or values in that sample.

The probability distribution is defined by parameters (e.g., mean, variance), while the sample distribution is specific to the data collected and is used to infer properties about the population.

10
Q

What is the probability mass function (PMF)?

A

The PMF is a function that specifies the probabilities of the values of a discrete variable - it assigns a probability to each value that can be drawn randomly from a population.

Formally, the PMF of a discrete variable Y is written as:
* P(yi) = P(Y=yi), where 0 ≤ P(yi) ≤ 1 and ∑P(yi) = 1. Y is the variable and yi is a specific value of Y.
* You enter a certain value of the variable into the PMF and get its probability.

It is related to the relative frequency distribution: the PMF describes the expected relative frequency distribution.
* Relative frequency = (Number of cases with value i)/(Total number of cases)
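A minimal PMF can be written as a dictionary mapping each value to its probability; a fair six-sided die serves as the assumed example:

```python
from fractions import Fraction

# PMF of a fair six-sided die: each value y has probability P(y) = 1/6.
pmf = {y: Fraction(1, 6) for y in range(1, 7)}

# A valid PMF satisfies 0 <= P(y_i) <= 1 for every value, and sums to 1.
total = sum(pmf.values())   # 1
p_three = pmf[3]            # 1/6: enter a value, get its probability
```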

11
Q

What are the parameters of a PMF?

A

A parameter is a constant value that defines a characteristic of a mathematical function or statistical model - parameters define the model. In the context of probability distributions, parameters determine the specific shape, location, or spread of the distribution.
* Location parameter: Specifies the center of the distribution, often represented empirically as the mean (𝜇). It determines where the PMF is positioned along the x-axis.
* Scale parameter: Describes the spread or width of the distribution around its center. It is often empirically related to the standard deviation (𝜎) and controls how wide or narrow the PMF appears.
* Dispersion parameter: The square of the scale parameter, empirically corresponding to the variance (σ^2) in statistics. It emphasizes how much the values deviate from the center, particularly highlighting extreme values.

Not all PMFs require all these parameters, and the specific parameters depend on the type of distribution.
The standard form of a distribution (or standard form PMF) is one in which the location parameter is set to zero and the scale parameter is set to one.

12
Q

What is a data-generating process (DGP)?

A

The DGP is a formal statement of our beliefs about the probability process that produces a phenomenon. It refers to the underlying mechanism or model that generates the data observed in a study. It includes the relationships between variables, the distributional properties of the data, and any random disturbances.

Probability distributions are formal statements of our conjecture about the data-generating process (DGP).

13
Q

What are common discrete distributions?

A

Bernoulli distribution: A discrete probability distribution that models a single trial (experiment) with two possible outcomes: 1 (success) or 0 (failure). A Bernoulli distribution describes the probability of success in a single Bernoulli trial. Common examples include flipping a coin once or answering one true/false question.
* Parameter(s): p, the probability of success.

Binomial distribution: A discrete probability distribution that models the number of successes in n independent Bernoulli trials, each with the same probability of success p. The binomial distribution describes the probability of achieving a specific number of successes in a fixed number of trials. A common example is counting the number of heads in n coin flips.
* Parameter(s): p, the probability of success in each trial; n, the number of trials.

Poisson distribution: A discrete probability distribution that models the probability of observing a certain number of events (y) in a fixed period, given that events occur at a constant average rate (μ > 0) and independently of each other.
* Parameter(s): μ, the location parameter, representing the expected number of events in the given interval. It is both the mean and the variance of the distribution.
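The PMFs of all three distributions follow directly from their definitions; this is a stdlib-only sketch, with the function names chosen for the demo:

```python
import math

def bernoulli_pmf(y, p):
    """P(Y = y) for a single trial with success probability p; y is 0 or 1."""
    return p if y == 1 else 1 - p

def binomial_pmf(y, n, p):
    """P(Y = y) successes in n independent trials, each with success probability p."""
    return math.comb(n, y) * p**y * (1 - p)**(n - y)

def poisson_pmf(y, mu):
    """P(Y = y) events in an interval, given average rate mu > 0."""
    return mu**y * math.exp(-mu) / math.factorial(y)

# Example: probability of exactly 2 heads in 3 fair coin flips.
p_two_heads = binomial_pmf(2, 3, 0.5)   # 3 * 0.25 * 0.5 = 0.375
```

Note how the Bernoulli is the n = 1 special case of the binomial, matching the definitions above.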

14
Q

What is the expectation of a random (discrete) variable?

A

The expectation of a random variable, denoted E(X), is a measure of the central tendency of its probability distribution. It is the weighted average of the values that a variable can take, where the weights are given by the probability distribution of X.

For a discrete random variable with possible values xi and associated probabilities P(X=xi), the expected value is calculated as E(X) = ∑ xi⋅P(X=xi), where
* xi: the possible values of the random variable (outcomes).
* P(X=xi): the probability of each value xi (its weight), given by the PMF. In the discrete case, the expectation is the sum over all possible values of X, each weighted (multiplied) by its probability.
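The weighted-average formula is a one-liner once the PMF is in hand; the fair die is again the assumed example:

```python
from fractions import Fraction

# PMF of a fair six-sided die.
pmf = {x: Fraction(1, 6) for x in range(1, 7)}

# E(X) = sum of x_i * P(X = x_i): each value weighted by its probability.
expectation = sum(x * p for x, p in pmf.items())   # (1+2+3+4+5+6)/6 = 7/2
```

Note that 3.5 is not itself a possible outcome: the expectation is a long-run average, not a value the die can show.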

15
Q

What are moments of a distribution (discrete variables)?

A

Moments of a distribution are numerical measures that describe various characteristics of a probability distribution. They provide insights into the shape, spread, and behavior of the distribution. Moments about zero (raw moments) describe the distribution in relation to the origin; moments about the mean (central moments) measure the distribution relative to its mean; standardized moments are central moments divided by the corresponding power of the standard deviation.
* 1st moment (mean): measures the central tendency or average value of the distribution. A moment about zero (raw).
* 2nd moment (variance): measures the spread or dispersion of the distribution around the mean. A larger variance indicates more spread; a smaller variance indicates tighter clustering around the mean. A moment about the mean (central).
* 3rd moment (skewness): describes the asymmetry of the distribution. 1. Positive skewness: the right tail (higher values) is longer or fatter. 2. Negative skewness: the left tail (lower values) is longer or fatter. 3. Zero skewness: a symmetrical distribution. The third standardized moment.
* 4th moment (kurtosis): measures the "tailedness" or the extent of extreme values in the distribution. 1. High kurtosis: heavy tails, more extreme values or outliers. 2. Low kurtosis: light tails, fewer extreme values. 3. Kurtosis = 3: the kurtosis of a normal distribution; values above or below this indicate deviation from normality. The fourth standardized moment.
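The four moments can be computed directly from a PMF. A sketch for the fair die (an assumed example; floats are used so the standard deviation can be taken):

```python
# PMF of a fair six-sided die.
pmf = {x: 1 / 6 for x in range(1, 7)}

# 1st raw moment: the mean.
mean = sum(x * p for x, p in pmf.items())                      # 3.5

# 2nd central moment: the variance (35/12 ~ 2.917 for a fair die).
variance = sum((x - mean) ** 2 * p for x, p in pmf.items())
sd = variance ** 0.5

# 3rd and 4th standardized moments: skewness and kurtosis.
skewness = sum(((x - mean) / sd) ** 3 * p for x, p in pmf.items())  # 0: symmetric
kurtosis = sum(((x - mean) / sd) ** 4 * p for x, p in pmf.items())  # < 3: lighter tails than normal
```

The die's kurtosis of about 1.73 (below 3) reflects the flat, bounded shape of a uniform distribution: no tails at all compared with a normal.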

16
Q

What is the cumulative distribution function (CDF) for discrete variables?

A

The cumulative distribution function (CDF) of a discrete random variable X describes the probability that X takes a value less than or equal to a given value x. It provides a way to understand how probabilities accumulate over the possible values of X. It is calculated by summing the probabilities of all possible values of X that are less than or equal to x:
* F(x) = P(X ≤ x) = ∑ P(X = xi), summed over all xi ≤ x.
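The accumulation can be sketched as a sum over the PMF, once more using the fair die as the assumed example:

```python
from fractions import Fraction

# PMF of a fair six-sided die.
pmf = {x: Fraction(1, 6) for x in range(1, 7)}

def cdf(x):
    """F(x) = P(X <= x): sum the PMF over all values less than or equal to x."""
    return sum((p for xi, p in pmf.items() if xi <= x), Fraction(0))

p_at_most_4 = cdf(4)   # 1/6 + 1/6 + 1/6 + 1/6 = 2/3
```

As x grows past the largest possible value, the CDF reaches 1, mirroring the fact that the whole sample space has probability 1.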