Econometrics Final Flashcards

1
Q

Measures of Central Tendency + Advantages+Limitations

A

Info on the center/average of the data values

Mean (most commonly used unless outliers exist): the arithmetic average, the sum divided by the number of values; affected by extreme values (outliers)
* For a population of N values: μ = (x_1 + x_2 + … + x_N)/N = Sum of Population Values/Population Size
* For a sample of size n: x̄ = (x_1 + x_2 + … + x_n)/n = Sum of Observed Values/Sample Size
* The population mean & sample mean aren't generally equal, and the sample mean varies with the sample drawn

Median: midpoint of the ranked values, 50% above & 50% below; not affected by extreme values (outliers)
Comparing the mean with the median is useful for visualizing the distribution (ex. skew)

Mode: the most frequently observed value; not affected by extreme values (outliers); used for discrete, numerical or categorical data (there may be none, one or many)
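A quick sanity check (illustrative Python, not course material): the statistics module computes all three measures; the data below is made up so the outlier drags the mean but not the median or mode.

```python
# Hypothetical sample; 100 is the outlier.
import statistics

data = [2, 3, 3, 4, 5, 100]

print(statistics.mean(data))    # 19.5 -- pulled up by the outlier
print(statistics.median(data))  # 3.5  -- unaffected by the outlier
print(statistics.mode(data))    # 3    -- most frequent value
```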

2
Q

Sample Size vs Population Size

A

A sample is a subset of the population used to generalize to the entire population. Statistics computed from a sample carry uncertainty; how certain they are depends on the size of the sample and on whether selection/assignment was random.

Population size accounts for every single person in population.

3
Q

Skew of graph if Mean < Median & Mean > Median

A

Mean < Median → left skewed (long left tail); Mean > Median → right skewed (long right tail)

4
Q

Geometric Mean Vs Geometric Mean Rate of Return

A

GM = (X_1 × X_2 × … × X_n)^(1/n)
GMRR = (X_1 × X_2 × … × X_n)^(1/n) - 1, where each X_i is a growth factor (1 + rate of return)

5
Q

Suppose you invested $100 in stocks and, after 5 years, the value of stocks becomes $125 worth. What is the average annual compound rate of returns?

A

$100(1+r)^5 = $125
(1+r)^5 = 1.25, so r = 1.25^(1/5) - 1 ≈ 4.6%
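The card's arithmetic can be checked in one line; a minimal sketch (Python is just an illustration here):

```python
# 100(1+r)^5 = 125  =>  r = (125/100)^(1/5) - 1
r = (125 / 100) ** (1 / 5) - 1
print(f"{r:.3%}")  # 4.564%, i.e. ~4.6% per year
```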

6
Q

Summation Operator + 5 Properties

A

Σ_{i=1}^n x_i = x_1 + x_2 + … + x_n, the sum of a sequence of numbers {x_1, x_2, …, x_n}

  1. A common factor/coefficient can be factored out: Σ_{i=1}^n c·x_i = cx_1 + cx_2 + … + cx_n = c(x_1 + x_2 + … + x_n) = c·Σ_{i=1}^n x_i
  2. If x_i = 1, then Σ_{i=1}^n c = c·Σ_{i=1}^n 1 = c(1 + 1 + … + 1) = cn
  3. Addition/subtraction inside the sum can be split into individual summations: Σ_{i=1}^n (x_i + y_i) = Σ_{i=1}^n x_i + Σ_{i=1}^n y_i = (x_1 + y_1) + (x_2 + y_2) + … + (x_n + y_n) = (x_1 + x_2 + … + x_n) + (y_1 + y_2 + … + y_n)
  4. Double Summations of a product factor apart: Σ_{i=1}^n Σ_{j=1}^m x_i·y_j = (Σ_{i=1}^n x_i)(Σ_{j=1}^m y_j)
    ex. Σ_{i=1}^2 Σ_{j=1}^2 x_i·y_j = Σ_{i=1}^2 (x_i·y_1 + x_i·y_2) = (x_1y_1 + x_1y_2) + (x_2y_1 + x_2y_2)
  5. The deviations from the mean sum to zero: Σ_{i=1}^n (x_i - x̄) = 0
    (x_1 - x̄) + (x_2 - x̄) + … + (x_n - x̄) = (x_1 + x_2 + … + x_n) - n·x̄ = Σ_{i=1}^n x_i - n·(Σ_{i=1}^n x_i)/n = 0
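Property 5 is easy to confirm numerically; a minimal sketch with an arbitrary made-up sample:

```python
# Deviations from the mean always sum to zero (property 5).
x = [3, 7, 8, 12, 20]
mean = sum(x) / len(x)
print(sum(xi - mean for xi in x))  # 0.0 (up to floating-point rounding)
```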
7
Q

What increases the Certainty/Confidence/Accuracy of a Statistical Test?

A

Size: larger sample size, more representative of population distribution
Random Assignment: no systematic confounding variable, more representative of population distribution

8
Q

2-Sample T-Test

A

A statistical test comparing two sample means; rejecting the null hypothesis (no difference) and supporting the alternative happen simultaneously in one test

9
Q

Variability + Ways to Measure

A

Info on the spread/variability/distribution of the data values
1. Range: difference between the largest & smallest observations = largest x - smallest x
* D: ignores the distribution & sensitive to outliers
2. Interquartile Range: midspread, the middle 50%, difference between the 75th and 25th percentiles = x_75% - x_25%
3. Variance: dispersion of data points from the mean on average; the weighted average of the squared distances b/w each data point & the mean: σ² = Σ(x_i - μ)²/N for a population vs s² = Σ(x_i - x̄)²/(n-1) for a sample
* A: each value in the data set is accounted for with its weight; squaring avoids negative deviations canceling out
* D: squared units are uninterpretable
4. Standard Deviation: variation about the mean with the same units as the original data; the most common measure = square root of the variance
* D: hard to compare 2+ datasets measured in different units
5. Coefficient of Variation: measures variation relative to the mean, to compare 2+ sets of data in different units; the units cancel out and it becomes a unit-free measure: CV = (standard deviation/mean) × 100%
6. Empirical Rule: without plotting, gives lots of info on where the majority of the data distribution is; if the data distribution is approximated by a normal distribution, then the interval
* E[X] +/- 1 standard deviation contains ~68% of the values in the data set
* E[X] +/- 2 standard deviations contains ~95% of the values in the data set
* E[X] +/- 3 standard deviations contains ~99.7% of the values in the data set
7. Weighted Mean: x̄ = Σ_{i=1}^n w_i·x_i = w_1x_1 + w_2x_2 + … + w_nx_n, where w_i = weight of the ith observation; for data paired into n classes, all weights sum to 100% = 1
8. Covariance: how two variables move together; the direction of the linear relationship b/w 2 variables. Only the sign matters: 0 means linearly unrelated, +ve means they move in the same direction, -ve means they move in opposite directions. It is the average product of the deviations of x & y from their respective means, with range (-∞, +∞): Cov(x,y) = σ_xy = Σ_{i=1}^N (x_i - μ_x)(y_i - μ_y)/N for a population (divide by n-1 for a sample)
* D: units are meaningless, uninterpretable
9. Coefficient of Correlation: relative strength and direction of the linear relationship b/w 2 variables with different units; unit free; the stronger the correlation, the closer the data points lie to a straight line
The sign depends on the covariance, since standard deviations are always positive; range [-1, 1]
r = Cov(x,y)/(s_x·s_y)
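Most of these measures have direct counterparts in Python's statistics module (covariance/correlation need Python 3.10+); a sketch on made-up data, assuming the sample (n-1) convention:

```python
import statistics

x = [4, 8, 6, 5, 7]       # hypothetical sample
y = [10, 14, 11, 12, 13]  # paired hypothetical sample

rng = max(x) - min(x)              # range = 4
s2 = statistics.variance(x)        # sample variance = 2.5
s = statistics.stdev(x)            # sample standard deviation ~ 1.58
cv = s / statistics.mean(x) * 100  # coefficient of variation ~ 26.4%
cov = statistics.covariance(x, y)  # sample covariance = 2.25
r = statistics.correlation(x, y)   # correlation coefficient = 0.9
print(rng, s2, s, cv, cov, r)
```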

10
Q

Compare coefficient of Variation:
Stock A: Avg Price=$50, SD=$5
Stock B: Avg Price=$100, SD=$5

A

A: CV = (5/50) × 100% = 10%

B: CV = (5/100) × 100% = 5%

Both have the same standard deviation; however, Stock B is less variable relative to its price.

11
Q

Avg stock price $800, standard deviation $100, what interval will 95% of stock price be in?

A

mean +/- 2 standard deviations contains ~95% of the values in the data set:
(800 - 2(100), 800 + 2(100)) = (600, 1000)

12
Q

Calculate final grade given Exam(45%)=70%, Participation(30%)=90%, Iclicker(5%)=0, Quiz(20%)=100%

A

Final Grade =
Σ_{i=1}^4 w_i·x_i = 0.45(70) + 0.30(90) + 0.05(0) + 0.20(100) = 31.5 + 27 + 0 + 20 = 78.5

13
Q

Illustrate a correlation of r= -1, -0.6, 0, 1, 0.3

A
14
Q

A1: Given (x_1,y_1) = (11,52), (x_2,y_2) = (13,72), (x_3,y_3) = (15,62), calculate a) Sample Variance b) Sample Covariance c) Sample Correlation Coefficient

A

a.4
b.10
c.1/2
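The three answers check out numerically (illustrative sketch, assuming the n-1 sample formulas; Python 3.10+):

```python
import statistics

x = [11, 13, 15]
y = [52, 72, 62]

print(statistics.variance(x))        # 4.0  (a)
print(statistics.covariance(x, y))   # 10.0 (b)
print(statistics.correlation(x, y))  # 0.5  (c)
```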

15
Q

A1: What price range will contain 95% of the stock prices, given avg price = $650 & standard deviation = $100?

A

(450, 850)

16
Q

A1: n = 5, x_i ∈ {1,2,3,4,5}: a) sum b) mean c) sample variance d) Σ(x_i - x̄)

A

a.15
b.3
c.2.5
d.0

17
Q

A1: Prove using the summation operator: a) Σ a·x_i = a·Σ x_i b) Σ(x_i + y_i) = Σ x_i + Σ y_i c) Σ(a·x_i + b·y_i) = a·Σ x_i + b·Σ y_i d) Σ_i Σ_j a·b·x_i·y_j = ab·(Σ x_i)(Σ y_j)

A

See doc

18
Q

Probability

A

A set of outcomes whose likelihood is defined by a function; the relative frequency of an outcome occurring when a random experiment is repeated infinitely many times (formal definition: a function from the space of sets/events to the real values between 0 and 1)

19
Q

Random Experiment + Basic Outcome + Sample Space + Event

A

Random Experiment: a process leading to an uncertain outcome (ex. dice roll, coin flip)
Basic Outcome: a possible outcome of a random experiment (ex. 1, 2, 3, 4, 5, 6)
Sample Space: the collection of all possible outcomes of a random experiment (ex. S = {1,2,3,4,5,6})
Event: any subset of basic outcomes from the sample space (ex. let A be the event "number rolled is even", then A = {2,4,6}); if the outcome of the experiment is in A, then event A has occurred

20
Q

Outcomes of rolling 2 dice. a)Identical Dice b)Diff dice

A

a) 21 (identical dice: unordered pairs, C(6,2) + 6 doubles)
b) 36 (different dice: ordered pairs, 6 × 6)

21
Q

6 Types of Probability Set Relationships + Draw

A
  1. **Empty Set** (∅): the set containing no elements; defines **Mutually Exclusive** events: A ∩ B = ∅
  2. **Subset** (A ⊂ B): any element of A is also in B, so A ∪ B = B and A ∩ B = A
  3. **Intersection of Events** (A ∩ B): the set of all outcomes that belong to both A & B in S, where A & B are events in a sample space S
  4. **Union of Events** (A ∪ B): the set of all outcomes that belong to A or B (or both) in S
  5. **Complement** (Ā): the set of all basic outcomes that don't belong to A in S, so that S = A ∪ Ā
  6. **Collectively Exhaustive**: a collection of events that completely covers S, ex. A ∪ B = S
22
Q

4 Properties of Set Operations + Draw

A
  • Commutative (order): A ∪ B = B ∪ A
  • Associative (grouping): (A ∪ B) ∪ C = A ∪ (B ∪ C)
  • Distributive Law: A ∩ (B ∪ C) = (A ∩ B) ∪ (A ∩ C)
  • De Morgan's Law: the complement of (A ∪ B) is Ā ∩ B̄, and the complement of (A ∩ B) is Ā ∪ B̄
23
Q

Given S={1,2,3,4,5,6} A={2,4,6},B={4,5,6},C={4,5} find complements, intersections, unions, subset

A

See doc

24
Q

Probability as Relative Frequency

A

P(A) = lim_{n→∞} n_A/n = # of outcomes that satisfy A/total # of outcomes

Repeating the experiment as n approaches infinity and counting the number of times n_A that event A occurs gives the ratio/relative frequency of event A occurring

25
Factorial & Combination & Permutation Formula
Factorial Formula: n! = the number of ways to order n objects
- How many ways to order n = 8 runners in a sequence?
Combination Formula: C(n,k) = n!/(k!(n-k)!) = the number of unordered ways in which k objects can be selected from n objects
- How many ways to pick k = 3 out of n = 8 runners (who gets a medal)?
- A true "combination" lock would accept 1-2-3, 2-1-3 and 3-2-1
- Has fewer outcomes: # of groupings < # of orderings of those groupings
Permutation Formula: P(n,k) = n!/(n-k)!, n = total, k = limited spots; P(n,k) = C(n,k) × k! = (# of groupings) × (# of orders per grouping)
- How many ways to pick 1st, 2nd, 3rd place (k = 3) out of n = 8 runners (who gets which medal)?
- A true permutation lock only accepts 1-2-3
- Has more outcomes: # of ordered selections > # of groupings
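For what it's worth, Python's math module implements all three formulas directly (math.comb/math.perm exist from Python 3.8), so the runner example can be checked like this:

```python
import math

print(math.factorial(8))  # 40320 ways to order 8 runners
print(math.comb(8, 3))    # 56 unordered medal groups of 3 out of 8
print(math.perm(8, 3))    # 336 ordered 1st/2nd/3rd finishes
```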
26
Q. 5 candidates (3 men, 2 women), 2 positions, every candidate equally likely to be chosen. Probability that no women will be hired:
1. Total # of combinations: C(5,2) = 5!/(2!·3!) = 10 2. Combinations where only men are hired: C(3,2) = 3!/(2!·1!) = 3 3. Probability = # of outcomes that satisfy A/total # of outcomes = 3/10 = 30%
27
Probability as a Set Function + 3 Properties
**Probability as a Set Function**: a real-valued set function P that assigns to each event A in the sample space S a number P(A) satisfying the following 3 properties: 1. always positive, 0 ≤ P(A) ≤ 1 2. P(S) = 1 = 100% = probability of all outcomes 3. for mutually exclusive events, the probability of the union is the sum of the parts (addition rule): P(A_1 ∪ A_2 ∪ … ∪ A_k) = P(A_1) + P(A_2) + … + P(A_k), ex. P(A ∪ B) = P(A) + P(B)
28
5 Probability Rules + Draw
1. Complement Rule: P(Ā) = 1 - P(A), i.e. 1 = P(A) + P(Ā) 2. Addition Rule: P(A ∪ B) = P(A) + P(B) - P(A ∩ B), draw diagram see notes; Mutually Exclusive Addition Rule: P(A ∪ B) = P(A) + P(B) 3. P(∅) = 0 4. If A ⊂ B, then P(A) ≤ P(B) 5. P(A ∪ Ā) = 1
29
Draw Probability Table + Table of Cards AcevsnonAce + Table of P(A)=P(AnB)+P(AnhatB)
See doc
30
Conditional Probability & Multiplication Rule
**Conditional Probability**: the probability of one event A given that another event B is true/has occurred; B becomes the new total sample space within which A must be contained: P(A|B) = P(A ∩ B)/P(B) = # of outcomes satisfying A within the new space/total # of outcomes in the new space, and likewise P(B|A) = P(A ∩ B)/P(A) **Multiplication Rule**: rearranging conditional probability: P(A ∩ B) = P(A|B)·P(B) or P(A ∩ B) = P(B|A)·P(A)
31
Outcome is an even number. What is the probability of having rolled a 6
S = {1,2,3,4,5,6}, B = {2,4,6} (even), A = {6}. P(A|B) = P(A ∩ B)/P(B) = (1/6)/(3/6) = 1/3
32
Probability that at least one die is equal to 2 when the sum of two numbers is less than or equal to 3
36 basic outcomes. A = {at least one die equals 2} = {(2, x_i), …, (y_i, 2)}, B = {(1,1),(1,2),(2,1)}, so A ∩ B = {(1,2),(2,1)}. P(A|B) = P(A ∩ B)/P(B) = (2/36)/(3/36) = 2/3
33
Probability of getting a red ace using the multiplication rule. + Does P(A) = P(A ∩ B) + P(A ∩ B̄) = P(A|B)P(B) + P(A|B̄)P(B̄)?
P(Red ∩ Ace) = P(Red|Ace)·P(Ace) = (2/4)(4/52) = 2/52 = 1/26. Yes: by the total law of probability, P(A) = P(A ∩ B) + P(A ∩ B̄) = P(A|B)P(B) + P(A|B̄)P(B̄)
34
Statistical Independence:
A & B are independent (not correlated) if either condition holds; each probability is unaffected by the other (ex. shape of a coin vs flipping heads): P(A ∩ B) = P(A)P(B); P(A|B) = P(A), because conditioning on B has no effect on the probability of A; P(B|A) = P(B), because conditioning on A has no effect on the probability of B
35
A{2,4,6}, B{1,2,3,4} Statistically independent?
Yes, because P(A ∩ B) = P(A)P(B): P(A ∩ B) = P({2,4}) = 2/6 = 1/3, and P(A)P(B) = (3/6)(4/6) = 1/3
36
Bivariate Probabilities & Joint Distribution of X & Y & Marginal Probabilities Draw Table + Diagram
**Bivariate Probabilities**: the probabilities that two events (A & B) occur together when there are two random variables in your scenario **Joint Distribution of X {x_i} & Y {y_i}**: described by the bivariate probabilities **Marginal Probabilities**: the probability of a single event occurring, irrespective of the other events; with (A, B_i) where the B_i are mutually exclusive & collectively exhaustive: P(A) = P(A ∩ B_1) + P(A ∩ B_2) + … + P(A ∩ B_k) See doc for table & diagram
37
Difference b/w Joint Probability, Marginal Probability & Conditional Probability
Joint probability is the probability of two events occurring simultaneously. Marginal probability is the probability of an event irrespective of the outcome of another variable. Conditional probability is the probability of one event occurring given that a second event has occurred.
38
Total Law of Probability + Draw
**Total Law of Probability**: mutually exclusive & collectively exhaustive events B_i (B_i ∩ B_j = ∅, S = B_1 ∪ B_2 ∪ … ∪ B_k) partition A into k mutually exclusive pieces, such that A = (A ∩ B_1) ∪ (A ∩ B_2) ∪ … ∪ (A ∩ B_k); therefore, using the addition rule: P(A) = P(A ∩ B_1) + P(A ∩ B_2) + … + P(A ∩ B_k) = Σ_{i=1}^k P(A ∩ B_i). Subbing in the multiplication rule → P(A) = Σ_{i=1}^k P(A|B_i)P(B_i) for any A
39
Bayes' Theorem + Proof
**Bayes' Theorem**: combines all the previous concepts into one expression; how new information (A) updates the probability of old information (B):
P(B|A) = P(A ∩ B)/P(A) = P(A|B)P(B)/P(A) = P(A|B)P(B)/[P(A|B)P(B) + P(A|B̄)P(B̄)]
Proof:
1. **Conditional Probability**: P(B|A) = P(A ∩ B)/P(A)
2. **Sub in the Multiplication Rule** P(A ∩ B) = P(A|B)P(B): P(B|A) = P(A|B)P(B)/P(A)
3. **Mutually exclusive & collectively exhaustive** B & B̄: since A = (A ∩ B) ∪ (A ∩ B̄), P(A) = P(A ∩ B) + P(A ∩ B̄), so P(B|A) = P(A|B)P(B)/[P(A ∩ B) + P(A ∩ B̄)]
4. **Sub in the Multiplication Rule again**: P(A ∩ B) = P(A|B)P(B) & P(A ∩ B̄) = P(A|B̄)P(B̄)
General Theorem (see doc): P(B_i|A) = P(A|B_i)P(B_i)/[P(A|B_1)P(B_1) + … + P(A|B_k)P(B_k)] = P(A|B_i)P(B_i)/Σ_{j=1}^k P(A|B_j)P(B_j)
1. Conditional Probability: P(B_i|A) = P(A ∩ B_i)/P(A)
2. Total Law of Probability: P(A) = Σ_{j=1}^k P(A|B_j)P(B_j)
3. Multiplication Rule: sub in P(A ∩ B_i) = P(A|B_i)P(B_i), giving P(B_i|A) = P(A|B_i)P(B_i)/Σ_{j=1}^k P(A|B_j)P(B_j)
40
Your probability of having the covid antibody (B) if 10% of the population has the antibody (P(B) = 10%) & your test is positive (A is true). True Positive: P(A|B) = 97.5%, i.e. if you have the antibody (B), the probability of a positive test (A) is 97.5%. False Positive: P(A|B̄) = 12.5%, i.e. if you don't have it (B̄), the probability of a positive test (A) is 12.5%.
P(B|A) = ? Given P(A|B) = 97.5%, P(A|B̄) = 12.5%, P(B) = 10%, P(B̄) = 90%: P(B|A) = P(A|B)P(B)/[P(A|B)P(B) + P(A|B̄)P(B̄)] = (0.975)(0.10)/[(0.975)(0.10) + (0.125)(0.90)] = 46.4%
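The same plug-in as a short script (numbers exactly as on the card; the code itself is illustrative):

```python
p_B = 0.10              # prior: has the antibody
p_A_given_B = 0.975     # true positive rate
p_A_given_notB = 0.125  # false positive rate

p_A = p_A_given_B * p_B + p_A_given_notB * (1 - p_B)  # total law of probability
p_B_given_A = p_A_given_B * p_B / p_A                 # Bayes' theorem
print(f"{p_B_given_A:.1%}")  # 46.4%
```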
41
A2: a) Prove that if A ⊂ B, then P(A) ≤ P(B) b) For any A & B, P(A ∩ B) ≥ P(A) + P(B) - 1
a) Write B as the union of the mutually exclusive pieces A ∩ B = A and Ā ∩ B, apply the addition rule, and use the first property of probability, P(Ā ∩ B) ≥ 0 b) Rearrange the addition rule using P(A ∪ B) ≤ 1; it becomes a true statement/property. See doc
42
A2: Given P(A) = 0.3, P(B|A) = 0.6, P(B|Ā), find P(Ā|B̄)
Find the elements of P(Ā|B̄) = P(Ā ∩ B̄)/P(B̄) = P(B̄|Ā)P(Ā)/P(B̄) = (1 - P(B|Ā))(1 - P(A))/(1 - P(B)) = (1 - 0.6)(1 - 0.3)/(1 - 0.6) = 0.7. See doc
43
A2: 8 candidates, 2 jobs, 4 women, 4 men, 1 set of brothers. a) Probability that only men are hired b) Probability that the brothers are hired c) Probability that only men and the brothers are hired
a) C(4,2)/C(8,2) = only-men combos/total combos = 6/28 = 3/14 b) 1/C(8,2) = 1/28 c) 1/28, since B ⊂ A so A ∩ B = B
44
A2: See 5 6 7 in doc
See doc
45
Random Variable
**Random Variable (X)**: a function that maps each outcome s of an experiment to a number X(s); represents a possible numerical value from a random experiment **Discrete**: limited, countable outcomes (dice, coins): P(X ∈ A) = Σ_{x∈A} P(X = x) = P(X = 0) + P(X = 1) + … + P(X = n), where X = the RV and x = a constant **Continuous**: infinitely many outcomes (height) **Space of X**: S_X = {x : X(s) = x, s ∈ S}
46
Probability Mass Function vs Cumulative Distribution Function a) Flip 2 coins, X = # of heads b) Roll a die, X = number rolled
**Probability Mass Function** (discrete): f_X(x) = P(X = x) **Properties**: 1. always positive, b/w 0 & 1 2. Σ_{x∈S} f_X(x) = 1 = 100% 3. probabilities of disjoint outcomes can be summed. Two coins: f_X(x) = 1/4 if x = 0, 1/2 if x = 1, 1/4 if x = 2 **Cumulative Distribution Function**: F(x_0) = P(X ≤ x_0) = Σ_{x ≤ x_0} f_X(x). Die: F(x_0) = 1/6 if x_0 = 1, 2/6 if x_0 = 2, …, 6/6 if x_0 = 6
47
Expected Value Q. 2 Coins + Rolling dice
E(X) = Σ_x x·f_X(x). Two Coins Expected Value: E(X) = 0(1/4) + 1(1/2) + 2(1/4) = 1. Rolling Die Expected Value: f_X(i) = P(X = i) = 1/6, so E(X) = Σ_{i=1}^6 i·(1/6) = 1(1/6) + 2(1/6) + 3(1/6) + 4(1/6) + 5(1/6) + 6(1/6) = 3.5
48
Variance + Standard Deviation Q. 2 Coins + Rolling dice
**Variance**: σ² = E[(X - E[X])²] = Σ_x (x - E[X])²·f_X(x), a measure of spread/squared distance from the mean; uninterpretable units, but squaring prevents cancelling **Standard Deviation**: σ = √σ² = √(Σ_x (x - μ)²·f_X(x)), a measure of spread/distance from the mean in the original, interpretable units. 2 Coins: σ² = (0-1)²(1/4) + (1-1)²(1/2) + (2-1)²(1/4) = 0.5, so σ = 0.707
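A small sketch computing the mean, variance and SD of a fair die straight from its PMF (the die values and 1/6 weights are the card's; the code itself is illustrative):

```python
import math

pmf = {x: 1/6 for x in range(1, 7)}  # fair die PMF

mean = sum(x * p for x, p in pmf.items())               # 3.5
var = sum((x - mean) ** 2 * p for x, p in pmf.items())  # ~2.917
print(mean, var, math.sqrt(var))                        # SD ~ 1.708
```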
49
Functions of Discrete Random Variables Q. 1 coin X=1 heads, X=0 tail, g(1)=100, g(0)=0 Find expected value
E[g(X)] = Σ_x g(x)·f_X(x) = g(0)(0.5) + g(1)(0.5) = 0 + 50 = 50
50
Bernoulli Probability Distribution vs Binomial Probability Distribution
**Bernoulli Probability Distribution**: a random variable with only 2 possible outcomes **Binomial Distribution**: a sequence of n independent Bernoulli random variables summed, Y = Σ_{i=1}^n X_i; P(Y = y) = probability of y = # of successes in n = sample size trials, with p = probability of success on each trial
51
Bernoulli PMF, Cases, Mean, Variance, SD
**Probability Mass Function**: f(x) = p^x·(1-p)^(1-x) for X with a Bernoulli distribution, x = 0, 1; P(X = 1) = p, P(X = 0) = 1 - p, and P(X = 1) + P(X = 0) = 1 **Cases**: options^sets, ex. 2 options (0,1) over 3 sets/votes = 2³; the powers of p & (1-p) tell how many successes & failures occur, and they sum to the total # of trials **Mean**: μ = E(X) = Σ_{x=0,1} x·p^x(1-p)^(1-x) = 0·(1-p) + 1·p = p (if X = 0, weight = P(0) = 1-p; if X = 1, weight = P(1) = p) **Variance**: σ² = E[(X - μ)²] = Σ_{x=0,1} (x - μ)²·p^x(1-p)^(1-x) = (0 - p)²(1-p) + (1 - p)²·p = p(1-p) (if X = 0, distance from mean squared = (0 - p)², weight = 1-p; if X = 1, distance from mean squared = (1 - p)², weight = p) **Standard Deviation**: σ = √(p(1-p)) = √Var[X]
52
Y = X_1 + X_2, independent Bernoulli RVs a) PMF of Y? b) Expectation & variance of Y c) Expectation of Y conditional on X_1 d) Expectation of X_1 conditional on Y
a. P(Y=0) = (1-p)², P(Y=1) = 2p(1-p), P(Y=2) = p² b. E[Y] = 2p, Var(Y) = 2p(1-p) c. E[Y|X_1] = X_1 + p (= 1 + p when X_1 = 1) d. E[X_1|Y] = Y/2 (= 1/2 when Y = 1)
53
Binomial Distribution: PMF + derivation, Mean + proof, Variance + proof, SD; Average of n Bernoulli RVs: Mean, Variance
* **Probability Mass Function of the Binomial Distribution**: P(Y = y) = [n!/(y!(n-y)!)]·p^y·(1-p)^(n-y)
* **Mean**: μ = E(Y) = E(Σ_{i=1}^n X_i) = Σ_{i=1}^n E(X_i) = E(X_1) + E(X_2) + … + E(X_n) = p + p + … + p = np
* **Variance** (by independence): σ² = Var(Σ X_i) = Var(X_1) + Var(X_2) + … + Var(X_n) = p(1-p) + … + p(1-p) = np(1-p)
* **Standard Deviation**: σ = √(np(1-p))
* **Average of n Independent Bernoulli Random Variables**: X̄ = (1/n)·Σ_{i=1}^n X_i = Y/n = (1/n)(X_1 + X_2 + … + X_n)
* **Expected Value of the Average** (a good estimator of the population fraction): E(X̄) = E(Y/n) = np/n = p
* **Variance of the Average**: Var(X̄) = Var(Y)/n² = np(1-p)/n² = p(1-p)/n
54
Prove |ρ(X, a + bX)| = 1
See doc
55
Prove ρ(X, (X - E[X])/√Var(X)) = 1
See doc
56
Prove E[(X - E[X])/√Var(X)] = 0 and Var[(X - E[X])/√Var(X)] = 1
See doc
57
Prove X & Y are uncorrelated & mean independent if they are stochastically independent
* Stochastic independence: f(x,y) = f_X(x)·f_Y(y) (or p_ij^XY = p_i^X·p_j^Y)
* Mean independence: E_{Y|X}[Y|X] = Σ_y y·f(x,y)/f_X(x) = Σ_y y·f_X(x)f_Y(y)/f_X(x) = Σ_y y·f_Y(y) = E_Y[Y]
* Uncorrelated: Cov(X,Y) = 0 → ρ = 0/(σ_X·σ_Y) = 0
58
Prove using summation operator Cov(X,Y)=E[XY]-E[X]E[Y]
See doc
59
Prove Law of Iterated Expectation
See doc
60
Var[(X_1 + X_2)/2] if X_1 & X_2 are stochastically independent (Bernoulli)
= (1/4)[Var(X_1) + Var(X_2)] = p(1-p)/2
61
Prove Var(a·h(X) + b·g(Y)) = a²Var(h(X)) + b²Var(g(Y)) + 2ab·Cov(h(X), g(Y))
See doc
62
Stochastic vs Mean Independence vs Uncorrelatedness
1. Stochastically Independent: captures conditional dependency in the whole distribution (mean, variance & beyond); does one variable's outcome have any impact on the other's distribution? f(x,y) = f_X(x)·f_Y(y) 2. Mean Independent: captures conditional dependency through the mean only; does one variable's outcome have an impact on the other's mean? E_{X|Y}[X|Y] = E_X[X] or E_{Y|X}[Y|X] = E_Y[Y] 3. Uncorrelated: captures linear (direction + spread) relations only: Cov(X,Y) = σ_XY = 0
63
Prove that when X & Y are independent, for any function g(x) and h(y) Cov(g(X),h(Y))=0 always holds
See doc
64
Prove Var[a+bX]=b^2Var[X]
see doc
65
Prove Cov(a_1+b_1X,a_2+b_2Y)=b_1b_2Cov(X,Y)
see doc
66
Prove E[a+bg(X)]=a+bE[g(X)]
see doc
67
Flip coin 4 times Y=# of heads. P(Y=2), n=4, p=0.5. Probability of getting 2 heads.
P(Y=2) = C(n,y)·p^y(1-p)^(n-y) = [4!/(2!(4-2)!)]·(0.5)²(1 - 0.5)^(4-2) = 6·(0.5)⁴ = 3/8
68
Winning one game p = 0.5, Y = # of games won out of 5 * a) Probability of winning all 5 games * b) Probability of winning the majority of the games * c) If they won the first game, the probability they win the majority of the five games = win at least 2 of the 4 games left
* a. P(Y=5) = C(5,5)·(0.5)⁵ = 1/32 * b. P(Y≥3) = P(Y=3) + P(Y=4) + P(Y=5) = [C(5,3) + C(5,4) + C(5,5)]·(0.5)⁵ = (10 + 5 + 1)/32 = 1/2 * c. P(W≥2) = P(W=2) + P(W=3) + P(W=4) = [C(4,2) + C(4,3) + C(4,4)]·(0.5)⁴ = (6 + 4 + 1)/16 = 11/16
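All three answers follow from the binomial PMF; a minimal check (binom_pmf is a hypothetical helper name, not course code):

```python
from math import comb

def binom_pmf(y, n, p=0.5):
    # P(Y = y) = C(n, y) p^y (1-p)^(n-y)
    return comb(n, y) * p**y * (1 - p) ** (n - y)

print(binom_pmf(5, 5))                            # a) 1/32
print(sum(binom_pmf(y, 5) for y in range(3, 6)))  # b) 1/2
print(sum(binom_pmf(w, 4) for w in range(2, 5)))  # c) 11/16
```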
69
Joint Probability Mass Function + Marginal Probabilities + Stochastic Independence + Law of Iterated Expectations; Draw Table
* **Joint Probability Mass Function**: f(x,y) = P(X = x, Y = y), a function expressing the probability that X = x & simultaneously Y = y * **Marginal Probabilities**: a function expressing the probability of an event irrespective of the outcome of the other variable, summed over all possible values of the other variable(s): f_X(x) = P(X = x) = Σ_y f(x,y), f_Y(y) = P(Y = y) = Σ_x f(x,y) * **Stochastic Independence**: all pairs of x & y must satisfy f(x,y) = f_X(x)·f_Y(y), or for all random variables f(x_1, x_2, …, x_k) = f_{X1}(x_1)·f_{X2}(x_2)···f_{Xk}(x_k); derived from statistical independence P(A ∩ B) = P(A)P(B) * **Law of Iterated Expectations**: E_X[E_{Y|X}[Y|X]] = E_Y[Y]; averaging over all cases of X gives all cases of Y
70
Page 24 Table Questions
See doc
71
Conditional PMF, Conditional Mean, Conditional Variance
* **Conditional Probability Mass Function**: the probability mass of one variable given that the other has taken a value X = x or Y = y: f_{Y|X}(y) = f(x,y)/f_X(x), or f_{X|Y}(x) = f(x,y)/f_Y(y); derived from conditional probability P(A|B) = P(A ∩ B)/P(B) or P(B|A) = P(A ∩ B)/P(A) * **Conditional Mean**: μ_{Y|X=x} = E_{Y|X}[Y|X=x] = Σ_y y·f_{Y|X}(y), or μ_{X|Y=y} = E_{X|Y}[X|Y=y] = Σ_x x·f_{X|Y}(x); it is a random variable because it depends on the realization of the conditioning variable (there is an input & an uncertain output); essentially a weighted average with the conditional PMF f_{Y|X}(y) = f(x,y)/f_X(x) as the weights (the conditional expectation function) * **Conditional Variance**: σ²_{Y|X=x} = E_{Y|X}[(Y - μ_{Y|X=x})²|X=x] = Σ_y (y - μ_{Y|X=x})²·f_{Y|X}(y), or σ²_{X|Y=y} = E_{X|Y}[(X - μ_{X|Y=y})²|Y=y] = Σ_x (x - μ_{X|Y=y})²·f_{X|Y}(x)
72
Covariance vs Correlation
**Covariance**: the average product = direction (+/-) of the relation b/w 2 random variables; the expected value of the product of the deviations of X & Y from their means (the magnitude doesn't tell you anything, the unit is uninterpretable): Cov(X,Y) = E[(X - μ_X)(Y - μ_Y)] = Σ_x Σ_y (x - μ_X)(y - μ_Y)·f(x,y) * Cov = 0: no linear relation * Cov > 0: positive linear relation * Cov < 0: negative linear relation **Correlation**: a unitless measurement of the strength (spread + direction) of the linear relation b/w X & Y, from -1 to 1; the covariance divided by the product of the standard deviations of X & Y (weaker relation/larger spread in the denominator → closer to 0; stronger relation/smaller spread → closer to -1/1): ρ = Corr(X,Y) = Cov(X,Y)/(σ_X·σ_Y) * ρ = 0: no linear relation * ρ > 0: positive linear relation (1 = perfect positive linear dependency) * ρ < 0: negative linear relation (-1 = perfect negative linear dependency)
73
Applying Covariance to Investment Table
See Doc
74
A4: Prove Cov(g(X),Y)=0
See doc
75
A4: Prove E[(X - b)²] = E[X²] - 2bE[X] + b²
See doc
76
A4: Prove Corr(X,Z) = 1 & Corr(X, a + bX) = +/- 1
See doc
77
Prove E[X+Y]=E[X]+E[Y]
See doc
78
Prove Cov(X,c)=0
See doc
79
Prove Cov(X,X)=Var(X)
See doc
80
Prove Cov(X,Y)=E[XY]-E[X]E[Y]
See doc
81
Prove Σ(x_i - x̄) = 0 and Σ(x_i - x̄)(y_i - ȳ) = Σ x_i(y_i - ȳ)
See doc
82
Is E[g(X)]=g(E[X]) always true?
No; in general E[g(X)] ≠ g(E[X]). They are only equal if g(x) is linear
83
A3: 3 Tosses of Coin a)PMF, b)Cumulative Function c)E[Y] d)Var(Y)
See doc
84
A3: Derive Bernoulli a) E(X) & Var(X) b) PMF c) Stochastic Independence d) E(Y|X_1 = 1)
See doc
85
A3: 100 Tosses of fair coin binomial PMF function
See doc
86
A3: 3,5,6 see doc
See doc
87
Discrete v. Continuous Random Variables 1. PDF 2. CDF 3. Expectation 4. Variance
**Continuous Random Variable**: a variable that can assume any value in an interval, with infinitely many outcomes, depending on the ability to measure accurately (ex. thickness, time, height)
- **Probability Density Function**: gives the probability that X lies between a & b by integration: P(a ≤ X ≤ b) = ∫_a^b f_X(t)dt; found by differentiating the CDF
- **Cumulative Distribution Function**: F(x_0) = P(X ≤ x_0) = ∫_{-∞}^{x_0} f_X(t)dt
- **Expectation**: E[X] = ∫ x·f_X(x)dx (the discrete sum becomes an integral)
- **Variance**: Var(X) = E[(X - μ)²] = ∫ (x - μ)²·f_X(x)dx
88
Distributions 1. When to use 2. Expectation 3. Variance 4. Sample → Unstandardized 5. Confidence Interval
**Uniform Distribution** X ~ U[a,b]: probability distribution with equal probabilities for all possible outcomes of the random variable, uniformly distributed on [a,b] **Normal Distribution** X ~ N(μ, σ²): approximates the probability distributions of a wide range of RVs in empirical applications * **Bernoulli** * use the normal via the CLT when n is big and the population variance is known **T-Distribution**: n too small, population variance unknown, can't be Bernoulli **Chi-Square**: estimating the population variance
89
Transitioning between Z and X
X = μ + σZ and Z = (X - μ)/σ, where Z ~ N(0, 1) and X ~ N(μ, σ²)
90
Random Variables + Linear Combinations
* **Covariance**: Cov(X,Y) = E[(X - μ_X)(Y - μ_Y)] = E[XY] - μ_X·μ_Y = σ_XY = 0 if X & Y are independent * **Correlation**: Corr(X,Y) = Cov(X,Y)/(σ_X·σ_Y) * **Expectation**: E[X_1 + X_2 + X_3 + … + X_n] = μ_1 + μ_2 + μ_3 + … + μ_n * **Variance**: Var[X_1 + … + X_n] = σ_1² + … + σ_n² + 2Cov(X_1,X_2) + … + 2Cov(X_{n-1},X_n) = σ_1² + … + σ_n² if independent * **Jointly Normally Distributed** (independent with identical mean & variance): X̄ ~ N(μ, σ²/n) * For W = aX + bY: * **Expectation**: E[aX + bY] = a·μ_X + b·μ_Y * **Variance**: Var[aX + bY] = a²σ_X² + b²σ_Y² + 2ab·Cov(X,Y), with the covariance term = 0 if independent * **Jointly Normally Distributed**: aX + bY ~ N(μ_W, σ_W²) = N(a·μ_X + b·μ_Y, a²σ_X² + b²σ_Y² + 2ab·Cov(X,Y))
91
Population + Sample + Inferential Stats (Types) + Sampling Distribution
**Population**: the set of all items/individuals/variables of interest **Sample**: an observed subset of the population; less time-consuming & costly than a census of the entire population **Random Sampling**: every item in the population has an equal chance of being selected, and items are selected independently (thrown back into the population and could be drawn again) **Inferential Statistics**: making statements about population parameters (unknown) by examining sample results (known) * **Estimation**: make a claim about the population mean using the sample mean/evidence * **Hypothesis Testing**: test a claim about the population using the sample mean/evidence **Sampling Distribution**: plots the frequency of all possible sample means, each sample having n items/people; the larger the n, the smaller the variance = more accurate to the population mean
92
Sample Mean, Expectation, Variance, Standard Deviation
See ipad
93
Central Limit Theorem
If the population is not normal, apply the Central Limit Theorem: as n increases, the distribution of the sample mean converges to a normal distribution; sample means from the population will be approximately normal as long as the sample size is large enough: √n(X̄_n - μ) →d N(0, σ²)
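A minimal simulation sketch of the CLT, assuming U[0,1] draws and an arbitrary sample size n = 100; a histogram of the 10,000 sample means would look approximately normal:

```python
import random
import statistics

n = 100  # sample size (assumed; any reasonably large n works)
means = [statistics.mean(random.random() for _ in range(n))
         for _ in range(10_000)]

print(statistics.mean(means))   # ~0.5, the U[0,1] population mean
print(statistics.stdev(means))  # ~sqrt(1/12)/sqrt(n) ~ 0.029
```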
94
Q. n = 14, Var = 16: find the upper limit so that the probability of exceeding the limit is less than 0.05
27.52 See doc
95
Point Estimator v. Estimate
A **point estimator** θ̂ of a population parameter θ is a random variable, a function of the random sample: θ̂ = f(X_1, …, X_n); the realized value of the point estimator (random variable) is the point **estimate**
96
Unbiased Estimator + Efficient + Consistent with Graph + Words + Example
**Unbiased Estimator**: E(θ̂) = θ, the mean of the estimator = the true parameter (E(θ̂) - θ = 0) **Efficiency**: the spread/variance of the estimator; smaller is preferred = more efficient: θ̂_1 is more efficient than θ̂_2 if Var(θ̂_1) < Var(θ̂_2) **Consistency**: the estimator converges to the true parameter as the sample size n grows (its variance shrinks toward 0); ex. the sample mean X̄ is unbiased and consistent for μ
97
Point & Interval Estimates + Draw
A point estimate is a single number; an interval estimate is a range of values providing info about variability, based on the observations from 1 sample, with limits that are functions of the sample: P(L[X_1, …, X_n] ≤ parameter ≤ U[X_1, …, X_n]) = 1 - α
98
Confidence Interval, Level, Significance Level 1. Formulas 2. Meaning
Confidence Level 1 - α (with 0 < α < 1): the percentage of the time the procedure captures the true population parameter: 95% of the time the true value will be in the interval, i.e. for 95% of sample datasets the interval built around the point estimate contains the true value; however, 5% of the time the true value isn't in the interval. Confidence Interval: P(Point Estimate - (Reliability Factor)(Standard Error) ≤ parameter ≤ Point Estimate + (Reliability Factor)(Standard Error)) = 1 - α, ex. X̄ +/- z_{α/2}·σ/√n. Significance Level α: the remaining 5%
99
Margin of Error + Formula + How to Reduce
The uncertainty/amount of random sampling error in the results: ME = z_{α/2}·σ/√n **Reduce**: reduce the population standard deviation σ, lower the confidence level (1 - α), or increase the sample size n
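Putting the CI and ME formulas together in code; the x̄, σ, n, α values are illustrative assumptions, not from the cards (NormalDist needs Python 3.8+):

```python
from statistics import NormalDist
from math import sqrt

xbar, sigma, n, alpha = 50.0, 6.0, 64, 0.05  # assumed numbers
z = NormalDist().inv_cdf(1 - alpha / 2)      # reliability factor, ~1.96
me = z * sigma / sqrt(n)                     # margin of error, ~1.47
print((xbar - me, xbar + me))                # 95% CI ~ (48.53, 51.47)
```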
100
Population difference Confidence Interval + Derivation <>= 0 & #
See doc
101
Difference in Differences Confidence Interval
See doc
102
Hypothesis: types & Alt + Null
**Hypothesis**: a claim about a population parameter * Population Mean μ * Population Proportion p * Population Variance σ² **Null Hypothesis**/Counterfactual (H_0): the assumption about the population parameter to be tested, the status quo **Alternative Hypothesis** (H_1): the hypothesis the researcher is trying to support; challenges the status quo
103
Hypothesis Testing + Methods/Process
Testing: assume the null is true with =, ≤, or ≥ (innocent until proven guilty); where does the realized sample fall within the null's probability distribution? 1) Find the distribution 2) Choose the technique depending on the info given and the parameter of interest * Z-Test: normal/CLT + known population variance * T-Test: n small or unknown population variance * Chi-Square: estimating the population variance 3) Choose an upper/lower/two-tail rejection region and compare the realized sample using: * Significance level or critical value * P-value * Confidence interval
104
Rejection Region, Critical Value, Significance Level
* Significance Level (α, a %) * Critical Value (C = X̄_c), determined by the significance level * Rejection Region: the unstandardized range of values beyond the critical value, ex. [X̄_c, ∞) with X̄_c = μ_0 + z_α·σ/√n for an upper-tail test
105
P-Value + Info Required + Process
P-Value/Observed Level of Significance: the probability of getting a test statistic more extreme than the realized sample under the probability distribution implied by a true null hypothesis H_0; the smallest value of α at which H_0 can be rejected **Required** information to calculate: a realized sample and a sampling distribution: p-value = P(Z ≥ z_X̄), where z_X̄ = (X̄ - μ_0)/(σ/√n); ex. P(|Z| ≥ 1.96) = 5% **Process**: 1. Convert the sample into a test statistic 2. Use the Z-table to find P(Z ≥ z_X̄) = p-value 3. Compare the p-value & significance level α: p-value ≥ α → do not reject (sample outside the rejection region); p-value < α → reject (sample within the rejection region)
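The p-value process for an upper-tail z-test in code, with assumed illustrative numbers:

```python
from statistics import NormalDist
from math import sqrt

xbar, mu0, sigma, n = 53.2, 52.0, 6.0, 64  # assumed numbers
z = (xbar - mu0) / (sigma / sqrt(n))       # test statistic = 1.6
p_value = 1 - NormalDist().cdf(z)          # P(Z >= z) ~ 0.0548
print(z, p_value)                          # do not reject at alpha = 0.05
```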
106
2 Types of Errors + Graph + Process
**Type 1 Error (α)** - False Positive: rejecting a true H_0; given just a realized sample we can never know this happened, so there is always a probability of it equal to the level of significance α (ex. 1%, 5%, 10%). Guilty before innocent = serious, convicting an innocent person. **Calculating** Type 1 Error: it is the chosen significance level, P(reject H_0 | H_0 true) = α **Type 2 Error (β)** - False Negative: failing to reject a false H_0, with probability β: P(fail to reject H_0 | H_0 false) = β. Innocent before guilty = less serious, letting a guilty person go. **Calculating** Type 2 Error, ex. n = 64, σ = 6, α = 0.05, H_0: μ ≥ 52, H_1: μ < 52, true μ* = 50: 1. Calculate the critical value: X̄_c = μ_0 - z_α·σ/√n 2. Standardize the critical value in terms of the true distribution: z = (X̄_c - μ*)/(σ/√n) 3. Use the Z-table to find the probability/integral/area under the curve on the fail-to-reject side: β = P(X̄ > X̄_c | μ = μ*)
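The β calculation for exactly the card's example (n = 64, σ = 6, α = 0.05, H_0: μ ≥ 52, true μ* = 50), with NormalDist standing in for the Z-table:

```python
from statistics import NormalDist
from math import sqrt

n, sigma, alpha, mu0, mu_true = 64, 6.0, 0.05, 52.0, 50.0
se = sigma / sqrt(n)                              # 0.75
xc = mu0 + NormalDist().inv_cdf(alpha) * se       # critical value ~ 50.77
beta = 1 - NormalDist().cdf((xc - mu_true) / se)  # P(Xbar > xc | mu*) ~ 0.153
print(xc, beta)
```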
107
Error/Power Tradeoff + How to Reduce
Type 1 & 2 Tradeoff: moving the rejection region alters the size of each error, but you cannot decrease both errors at the same time; decreasing one increases the other, since a Type 1 error occurs when H_0 is true while a Type 2 error occurs when H_0 is false. Smaller rejection region → smaller Type 1 error → larger Type 2 error → smaller power. Larger rejection region → larger Type 1 error → smaller Type 2 error → larger power. Increasing the sample size n (or decreasing the variance): Type 1 error stays the same, Type 2 error shrinks → larger power (it becomes more evident what the true rejection should be)
108
Power + Calculate
**Power** (1 - β) - True Positive: the probability of successfully rejecting a false H_0: P(reject H_0 | H_0 false) = 1 - P(fail to reject H_0 | H_0 false) = 1 - β, where β = Type 2 error. The power of the test increases as the sample size increases **Calculate**: 1. Find the critical value & standardize it into a Z-score under the true distribution 2. Use the Z-table to find the probability of the rejection area created by the critical value in the true distribution
109
A4-A7 + Worksheets
lol goodluck bro