Fundamentals of Probability Flashcards

1
Q

Three steps to generate a statistical model

A

(1) What is the data-generating process (DGP)?
(2) Build an appropriate probability model that reflects the assumed DGP, including assumptions about how Y is distributed (i.e., the stochastic component).
(3) Come up with a parameterization of the quantities to be estimated (i.e., the systematic component) and a theory of inference to derive the statistical model.

2
Q

Data-generating process

A

This is the joint probability distribution that is supposed to characterize the entire population from which the data set has been drawn.

3
Q

Stochastic Component

A

The assumption about how Y is distributed. In linear regression, Y is assumed to be normally distributed:
$y_i \sim N(\mu_i, \sigma^2)$

4
Q

Systematic Component

A

Parameterization of the quantities to be estimated, e.g., a linear function for the mean:
$\mu_i = \beta_0 + \beta_1 x_{1i} + \beta_2 x_{2i} + \dots$

5
Q

Population regression function

A

$y_i = \alpha + \beta x_i + u_i$, where $u_i$ is the error term and $i = 1, \dots, n$

6
Q

Sample regression function

A

$\hat{y}_i = \hat{\alpha} + \hat{\beta} x_i$, $i = 1, \dots, n$
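To connect the population and sample regression functions, here is a minimal sketch that simulates data from a known population line and then estimates the sample line. The cards do not name an estimator, so ordinary least squares is an assumption here, as are the helper names `simulate_data` and `ols_fit` and the chosen parameter values.

```python
import random

def simulate_data(n, alpha, beta, sigma, seed=0):
    """Draw (x, y) pairs from y_i = alpha + beta * x_i + u_i with u_i ~ N(0, sigma^2)."""
    rng = random.Random(seed)
    xs = [rng.uniform(0.0, 10.0) for _ in range(n)]
    ys = [alpha + beta * x + rng.gauss(0.0, sigma) for x in xs]
    return xs, ys

def ols_fit(xs, ys):
    """Return (alpha_hat, beta_hat) from the closed-form simple OLS estimator."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    s_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    s_xx = sum((x - x_bar) ** 2 for x in xs)
    beta_hat = s_xy / s_xx
    alpha_hat = y_bar - beta_hat * x_bar
    return alpha_hat, beta_hat

# Population values (alpha = 1, beta = 2) are illustrative assumptions.
xs, ys = simulate_data(n=5000, alpha=1.0, beta=2.0, sigma=1.0)
alpha_hat, beta_hat = ols_fit(xs, ys)
print(alpha_hat, beta_hat)
```

The estimates should land close to, but not exactly on, the population values, which is the distinction the two cards draw.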

7
Q

Experiment

A

Repeatable procedure for making an observation

8
Q

Outcome

A

A possible result of an experiment, i.e., of a repeatable procedure for making an observation

9
Q

The sample space (Ω) of an experiment

A

Set of all possible outcomes

10
Q

An event

A

Subset of the sample space, i.e., any set of outcomes

11
Q

The probability of an event

A

Its long-run relative frequency.
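A small simulation can make the long-run relative frequency idea concrete. This is only a sketch: the experiment (rolling a fair die and checking for an even number) and the function name `relative_frequency` are illustrative assumptions.

```python
import random

def relative_frequency(n_trials, seed=0):
    """Share of fair-die rolls (out of n_trials) that show an even number."""
    rng = random.Random(seed)
    hits = sum(1 for _ in range(n_trials) if rng.randint(1, 6) % 2 == 0)
    return hits / n_trials

# The relative frequency should settle near 0.5 as the number of trials grows.
for n in (10, 1000, 100000):
    print(n, relative_frequency(n))
```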

12
Q

A ∪ B
Give the operation name, definition and interpretation

A

Union
elements in A or in B or in both
either A or B or both occur

13
Q

A ∩ B
Give the operation name, definition and interpretation

A

Intersection
elements both in A and B
both A and B occur

14
Q

$\bar{A}$ (also written $A^c$)
Give the operation name, definition and interpretation

A

Complement
elements not in A
A does not occur

15
Q

A ⊆ B

A
Subset
every element of A is also in B (B contains A)
when A occurs, so does B (but not necessarily vice versa)
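The event operations on these cards (union, intersection, complement, subset) map directly onto Python's built-in `set` type. The sketch below uses one roll of a six-sided die as the sample space; the events `A`, `B`, and `C` are illustrative assumptions.

```python
# Sample space of one roll of a six-sided die (an illustrative choice).
omega = {1, 2, 3, 4, 5, 6}
A = {2, 4, 6}   # event "the roll is even"
B = {4, 5, 6}   # event "the roll is at least 4"
C = {6}         # event "the roll is a six"

union = A | B             # A or B or both occur
intersection = A & B      # both A and B occur
complement_A = omega - A  # A does not occur
c_subset_of_a = C <= A    # whenever C occurs, A occurs too

print(union, intersection, complement_A, c_subset_of_a)
```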
16
Q

The intuitive definition of probability

A

Assigning a non-negative real number to every element of the sample space such that all of these numbers sum to 1.

17
Q
A random variable
A

A function that assigns a number to each outcome in the sample space of an experiment.
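Because a random variable is literally a function on the sample space, it can be written as one. A minimal sketch, assuming an experiment of two coin tosses and a hypothetical random variable `number_of_heads`:

```python
from itertools import product

# Sample space of two coin tosses: each outcome is a tuple such as ("H", "T").
sample_space = list(product("HT", repeat=2))

def number_of_heads(outcome):
    """Random variable X: maps an outcome to the number of heads it contains."""
    return outcome.count("H")

for outcome in sample_space:
    print(outcome, "->", number_of_heads(outcome))
```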

18
Q

Probability distributions

A

A probability distribution gives, for every possible outcome, the probability that this outcome occurs.

19
Q

Which types of distributions exist?

A

Discrete, Continuous

20
Q

What are examples of discrete distributions?

A

e.g., Bernoulli, Binomial, Poisson

21
Q

What are examples of continuous distributions?

A

e.g., Uniform, Normal, Logistic, t-distribution

22
Q

Probability density function: definition and formal way to write it

A

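Distribution of probability over all values of a random variable X.
For a discrete X, the probabilities p(x) = P(X = x) for all values x form the probability mass function (PMF). For a continuous X, P(X = x) = 0 for every single x, so the probability density function (PDF) f(x) gives probabilities over intervals instead: $P(a \le X \le b) = \int_a^b f(x)\,dx$.

The question it answers: what is the probability that we get
* exactly $x_i$ (for discrete distributions)?
* a value with $a \le x_i \le b$ (for continuous distributions)?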

23
Q

Cumulative distribution function (CDF):

A

The probability of observing a value less than or equal to x: $F(x) = P(X \le x)$

The question it answers: what is the probability that we get some value equal to or smaller than $x_i$?

24
Q

Expected Value: definition and formula

A

Specifies the center of the probability distribution
X discrete:
$E(X) = \sum_{\text{all } x} x\, p(x)$
X continuous:
$E(X) = \int_{-\infty}^{+\infty} x\, p(x)\, dx$
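As a worked instance of the discrete formula, the sketch below computes the expected value of a fair six-sided die. The die example and the names `expected_value` and `die_pmf` are illustrative assumptions; exact fractions are used to avoid rounding.

```python
from fractions import Fraction

def expected_value(pmf):
    """E(X) = sum over all x of x * p(x), for a PMF given as {value: probability}."""
    return sum(x * p for x, p in pmf.items())

# Fair die: each face 1..6 has probability 1/6.
die_pmf = {face: Fraction(1, 6) for face in range(1, 7)}
print(expected_value(die_pmf))  # 7/2
```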

25
Q

The variance of the probability distribution: definition and formula

A

Specifies the spread of the probability distribution
X discrete: $\mathrm{Var}(X) = \sum_{\text{all } x} (x - E(X))^2\, p(x)$

X continuous: $\mathrm{Var}(X) = \int_{-\infty}^{+\infty} (x - E(X))^2\, p(x)\, dx$
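The discrete variance formula can be checked on a single Bernoulli trial, where the known result is p(1 − p). A minimal sketch, assuming a success probability of 3/10 and a hypothetical helper `variance`:

```python
from fractions import Fraction

def variance(pmf):
    """Var(X) = sum over all x of (x - E(X))^2 * p(x), for a PMF {value: probability}."""
    mean = sum(x * p for x, p in pmf.items())
    return sum((x - mean) ** 2 * p for x, p in pmf.items())

# Bernoulli trial with success probability 3/10 (an illustrative value).
p = Fraction(3, 10)
bernoulli_pmf = {0: 1 - p, 1: p}
print(variance(bernoulli_pmf))  # 21/100, which equals p * (1 - p)
```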

26
Q

Binomial Distribution

A

Distribution of a binomial random variable K that represents the number of
‘successes’ in n outcomes of a binomial process

27
Q

A binomial process is given by:

A
  1. n independent Bernoulli trials
  2. Only two possible outcomes, which are arbitrarily called ‘success’ and ‘failure’
  3. Failure and success probabilities assumed to remain constant over trials
28
Q

Mean and variance of the binomial distribution: definition and formula

A

Let n be the number of trials, p the probability of success
The Binomial distribution has mean (expected value):
E(K) = np
and variance
Var(K) = np(1 − p)
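These two formulas can be checked by simulation: build each binomial draw from n Bernoulli trials and compare the sample mean and variance with np and np(1 − p). The values n = 20 and p = 0.3, the seed, and the helper name `binomial_draw` are assumptions for illustration.

```python
import random
from statistics import fmean, pvariance

def binomial_draw(n, p, rng):
    """One binomial draw: the number of successes in n independent Bernoulli(p) trials."""
    return sum(1 for _ in range(n) if rng.random() < p)

n, p = 20, 0.3
rng = random.Random(0)
draws = [binomial_draw(n, p, rng) for _ in range(20000)]

print("simulated mean:", fmean(draws), "theory:", n * p)
print("simulated variance:", pvariance(draws), "theory:", n * p * (1 - p))
```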

29
Q

The binomial probability mass function formula

A

$f(k; n, p) = P(K = k) = \binom{n}{k} p^k (1 - p)^{n-k}$
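The probability mass function translates almost symbol for symbol into code. A minimal sketch using `math.comb` for the binomial coefficient; the function name `binomial_pmf` is a hypothetical choice.

```python
from math import comb

def binomial_pmf(k, n, p):
    """P(K = k) for K ~ Binomial(n, p)."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)

# Probability of exactly 2 successes in 4 fair coin flips.
print(binomial_pmf(2, 4, 0.5))  # 0.375
```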

30
Q

The binomial cumulative distribution function formula

A

$F(k; n, p) = P(K \le k) = \sum_{i=0}^{k} \binom{n}{i} p^i (1 - p)^{n-i}$
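The cumulative distribution function is the running sum of the probability mass function from 0 up to k, with the exponent on p following the summation index i. A minimal sketch; `binomial_cdf` is a hypothetical name.

```python
from math import comb

def binomial_cdf(k, n, p):
    """P(K <= k) for K ~ Binomial(n, p), summing the PMF from i = 0 to k."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k + 1))

# Probability of at most 1 success in 4 fair coin flips.
print(binomial_cdf(1, 4, 0.5))  # 0.3125
```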

31
Q

Normal distribution: formal expression

A

Normal distribution $N(\mu, \sigma^2)$

32
Q

Normal distribution: description

A

Continuous distribution that describes data clustered around the mean.
* Uniquely determined by its mean $\mu$ (which is also its median and mode) and its variance $\sigma^2$
* Important because of the Central Limit Theorem

33
Q

The formula of the normal distribution: Probability Density Function (with two parameters)

A

$f(x; \mu, \sigma^2) = \frac{1}{\sqrt{2\pi\sigma^2}} \exp\left[-\frac{(x - \mu)^2}{2\sigma^2}\right]$
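For reference, the density can be evaluated directly with the standard library. A minimal sketch; the function name `normal_pdf` and its default arguments (the standard normal) are assumptions, and the second parameter is the variance, matching the card's parameterization.

```python
from math import exp, pi, sqrt

def normal_pdf(x, mu=0.0, sigma2=1.0):
    """Density of N(mu, sigma2) at x; sigma2 is the variance, not the standard deviation."""
    return exp(-((x - mu) ** 2) / (2.0 * sigma2)) / sqrt(2.0 * pi * sigma2)

# Peak of the standard normal density, 1 / sqrt(2 * pi), roughly 0.3989.
print(normal_pdf(0.0))
```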

34
Q

The formula of the normal distribution: Cumulative distribution function

A

$F(x; \mu, \sigma^2) = \int_{-\infty}^{x} f(t; \mu, \sigma^2)\, dt = \Phi\left(\frac{x - \mu}{\sigma}\right)$
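The integral has no closed form, but the standard normal CDF can be written with the error function as Φ(z) = (1 + erf(z/√2))/2. That identity is not on the card; the sketch below relies on it, and `normal_cdf` is a hypothetical name that takes the standard deviation rather than the variance.

```python
from math import erf, sqrt

def normal_cdf(x, mu=0.0, sigma=1.0):
    """P(X <= x) for X ~ N(mu, sigma^2), via the standard normal CDF Phi."""
    z = (x - mu) / sigma  # standardize first, as in Phi((x - mu) / sigma)
    return 0.5 * (1.0 + erf(z / sqrt(2.0)))

print(normal_cdf(0.0))   # 0.5: half the mass lies below the mean
print(normal_cdf(1.96))  # close to 0.975
```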

35
Q

What is a z-score used for? Give the formula.

A
To compare variables from different distributions, we can standardize them by building so-called z-scores:
$z_i = \frac{x_i - \bar{x}}{\sigma}$, where $\bar{x}$ is the mean and $\sigma$ the standard deviation
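A short sketch of standardization follows. The data and the name `z_scores` are illustrative, and the population standard deviation is used for σ; using the sample standard deviation instead is also common, and the card does not say which is intended.

```python
from statistics import fmean, pstdev

def z_scores(values):
    """Standardize values: subtract the mean, divide by the standard deviation."""
    mean = fmean(values)
    sd = pstdev(values)  # population SD (an assumption; see the note above)
    return [(v - mean) / sd for v in values]

heights_cm = [160.0, 170.0, 180.0]  # made-up example data
print(z_scores(heights_cm))
```

After standardization the values have mean 0 and standard deviation 1, which is what makes variables on different scales comparable.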
36
Q

What are the variance and the mean of standard normal distribution?

A

Variance = 1, mean = 0

37
Q

Central Limit Theorem

A

1. We have a population distribution (any, not necessarily normal) with mean $\mu$ and variance $\sigma^2$, and we are interested in its mean.
2. Repeatedly taking samples of size n from that population and calculating the mean of each sample yields the sampling distribution of the mean.
3. This sampling distribution approaches a normal distribution with mean $\mu$ and variance $\sigma^2/n$ as n increases.
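The three steps above can be sketched as a simulation. The population here is an Exponential(1) distribution, chosen only because it is clearly non-normal and has mean 1 and variance 1; the sample size, number of samples, seed, and the name `sample_means` are likewise assumptions.

```python
import random
from statistics import fmean, pvariance

def sample_means(n, n_samples, rng):
    """Means of n_samples samples of size n from an Exponential(1) population."""
    return [fmean(rng.expovariate(1.0) for _ in range(n)) for _ in range(n_samples)]

rng = random.Random(0)
n = 50
means = sample_means(n, 5000, rng)

# Exponential(1) has mean 1 and variance 1, so the CLT predicts mean 1, variance 1/n.
print("mean of sample means:", fmean(means))
print("variance of sample means:", pvariance(means), "predicted:", 1.0 / n)
```

A histogram of `means` would look approximately bell-shaped even though the population itself is strongly skewed.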