Econometrics Final Flashcards
Measures of Central Tendency + Advantages+Limitations
Info on the center/average of the data values
Mean (Most Commonly Used Unless Outliers Exist): arithmetic average, sum of values divided by number of values, affected by extreme values (outliers)
* For a population of N values: mean=(x_1+x_2+…+x_N)/N=Sum of Population Values/Population Size
* For a sample of size n: mean=(x_1+x_2+…+x_n)/n=Sum of Observed Values/Sample Size
* Population mean & sample mean generally aren’t equal, since the sample mean changes with which (and how many) values are sampled
Median: midpoint of ranked values, 50% above & below, not affected by extreme values (outliers)
Median and Mode put together are useful to visualize the distribution (ex. Skewed)
Mode: most frequently observed value, not affected by extreme values (outliers), used for discrete, numerical or categorical data (there can be none, one or many modes)
Sample Size vs Population Size
A sample is a subset of the population used to generalize to the entire population. Data drawn from a sample alone is not enough: info on random assignment and the size of the sample is needed to understand the certainty of the stats.
The population accounts for every single person in the population; population size (N) is how many there are, sample size (n) is how many were sampled.
Skew of graph if Mean < Median & Mean > Median
Left skewed & Right skewed
Geometric Mean Vs Geometric Mean Rate of Return
GM=(X_1 x X_2 x … x X_n)^(1/n)
GMRR=(X_1 x X_2 x … x X_n)^(1/n)-1
Suppose you invested $100 in stocks and, after 5 years, the value of the stocks becomes $125. What is the average annual compound rate of return?
$100(1+r)^5=$125
r=(125/100)^(1/5)-1=4.6%
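A quick numeric check of this answer, as a minimal sketch. The function name `annual_compound_rate` is made up for illustration and is not from the notes.

```python
def annual_compound_rate(start_value, end_value, years):
    """Solve start_value * (1 + r) ** years == end_value for r."""
    return (end_value / start_value) ** (1 / years) - 1


r = annual_compound_rate(100, 125, 5)
print(f"{r:.1%}")  # prints 4.6%
```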
Summation Operator + 5 Properties
Σ_{i=1}^{n} x_i=x_1+x_2+…+x_n, the sum of a sequence of numbers {x_1,x_2,…,x_n}
- Common factor/coefficient can be factored out: Σ_{i=1}^{n} c x_i=cx_1+cx_2+…+cx_n=c(x_1+x_2+…+x_n)=c Σ_{i=1}^{n} x_i
- If x_i=1, then Σ_{i=1}^{n} c=c Σ_{i=1}^{n} 1=c(1+1+…+1)=cn
- Addition/Subtraction in the pattern rule can be split into individual summations: Σ_{i=1}^{n} (x_i+y_i)=(x_1+y_1)+(x_2+y_2)+…+(x_n+y_n)=(x_1+x_2+…+x_n)+(y_1+y_2+…+y_n)=Σ_{i=1}^{n} x_i+Σ_{i=1}^{n} y_i
- Double Summations: Σ_{i=1}^{n} Σ_{j=1}^{m} x_i y_j=(Σ_{i=1}^{n} x_i)(Σ_{j=1}^{m} y_j)
Ex. Σ_{i=1}^{2} Σ_{j=1}^{2} x_i y_j=Σ_{i=1}^{2} (x_i y_1+x_i y_2)=(x_1y_1+x_1y_2)+(x_2y_1+x_2y_2)=(x_1+x_2)(y_1+y_2)
- Sum of sequence subtract mean: Σ_{i=1}^{n} (x_i-x̄)=0
(x_1-x̄)+(x_2-x̄)+(x_3-x̄)+…+(x_n-x̄)
=(x_1+x_2+x_3+…+x_n)-n x̄
=Σ_{i=1}^{n} x_i-n(Σ_{i=1}^{n} x_i)/n=Σ_{i=1}^{n} x_i-Σ_{i=1}^{n} x_i=0
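The summation properties can be spot-checked numerically. This is only a sketch on arbitrary example values (the lists `x`, `y` and the constant `c` are assumptions, not from the notes); a numeric check is not a proof.

```python
x = [2, 4, 9]  # arbitrary example values
y = [1, 5, 3]
c = 3
n = len(x)
mean_x = sum(x) / n

# Constant factor: sum(c * x_i) == c * sum(x_i)
factor_ok = sum(c * xi for xi in x) == c * sum(x)
# Sum of a constant over n terms: sum(c) == c * n
constant_ok = sum(c for _ in range(n)) == c * n
# Addition splits: sum(x_i + y_i) == sum(x_i) + sum(y_i)
split_ok = sum(xi + yi for xi, yi in zip(x, y)) == sum(x) + sum(y)
# Double summation factors: sum_i sum_j x_i * y_j == sum(x) * sum(y)
double_ok = sum(xi * yj for xi in x for yj in y) == sum(x) * sum(y)
# Deviations from the mean add up to zero
deviation_sum = sum(xi - mean_x for xi in x)

print(factor_ok, constant_ok, split_ok, double_ok, deviation_sum)
```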
What increases the Certainty/Confidence/Accuracy of Statistical Test
Size: larger sample size, more representative of population distribution
Random Assignment: no systematic confounding variable, more representative of population distribution
2-Sample T-Test
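Statistical test of whether the means of 2 samples differ: rejecting the null hypothesis (equal means) simultaneously supports the alternative hypothesis (different means)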
Variability + Ways to Measure
Info on the spread/variability/distribution of the data values
1. Range: difference between largest & smallest observations=largest x-smallest x
* D: Ignores distribution & sensitive to outliers
2. Interquartile Range: midspread, middle 50%, difference between the 75th and 25th percentiles x_75%-x_25%
3. Variance: dispersion of data points from the mean on average, average squared distance b/w data point & mean. Population: Σ(x_i-E[X])^2/N vs Sample: Σ(x_i-x̄)^2/(n-1)
* A: each value in the data set is accounted for along with its weight, squaring avoids -ve & +ve deviations canceling out
* D: units uninterpretable
4. Standard Deviation: variation about the mean with same units as original data, most common = square root of variance
* D: hard to compare 2+ different datasets with different units, no sense of spread relative to the mean
5. Coefficient of Variation: measures variation relative to the mean, used to compare 2+ sets of data in different units since the units cancel out and it becomes a unit-free measure = (standard deviation/mean) x 100%
6. Empirical Rule: without plotting, gives lots of info on where the majority of the data is; if the data distribution is approximated by a normal distribution, then the interval
* E[X] +/- 1 standard deviation contains about 68% of the values in the data set
* E[X] +/- 2 standard deviations contains about 95% of the values in the data set
* E[X] +/- 3 standard deviations contains about 99.7% of the values in the data set
7. Weighted Mean: x̄_w=Σ_{i=1}^{n} w_i x_i=w_1x_1+w_2x_2+…+w_nx_n, w_i=weight of ith observation, for data paired into n classes, all weights sum to 100%=1
8. Covariance: how dependent 2 variables are on each other, direction of linear relationship b/w 2 variables, sign matters: 0 means no linear relationship, +ve means they move in the same direction, -ve means they move in opposite directions = average of the product of the deviations of x & y from their respective means, range (-inf,+inf). Cov(x,y)=σ_xy=Σ_{i=1}^{N} (x_i-x̄)(y_i-ȳ)/N (population) or /(n-1) (sample)
* D: units are meaningless, uninterpretable
9. Coefficient of Correlation: relative strength and direction of linear relationship b/w 2 variables with different units, unit free, measures deviation from a straight line (not necessarily y=x), stronger correlation means data points are closer to the line
Sign depends on covariance since standard deviations are always positive, range [-1,1]
r=Cov(x,y)/(s_x s_y)
Compare coefficient of Variation:
Stock A: Avg Price=$50, SD=$5
Stock B: Avg Price=$100, SD=$5
A: CV=(5/50) x 100%=10%
B: CV=(5/100) x 100%=5%
Both have the same standard deviation; however, Stock B is less variable relative to its price.
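The same comparison as a minimal sketch in code. The helper name `coefficient_of_variation` is hypothetical; the numbers are the Stock A and Stock B values from the card.

```python
def coefficient_of_variation(std_dev, mean):
    """Return the coefficient of variation as a percentage."""
    return 100 * std_dev / mean


cv_a = coefficient_of_variation(5, 50)   # Stock A
cv_b = coefficient_of_variation(5, 100)  # Stock B
print(cv_a, cv_b)  # prints 10.0 5.0
```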
Avg stock price $800, standard deviation $100, what interval will 95% of stock price be in?
mean +/- 2standard deviation contains 95% of the values in the data set
(800-2(100),800+2(100))
(600,1000)
Calculate final grade given Exam(45%)=70%, Participation(30%)=90%, Iclicker(5%)=0, Quiz(20%)=100%
Final Grade=Σ_{i=1}^{4} w_i x_i=(0.45)(70)+(0.30)(90)+(0.05)(0)+(0.20)(100)=31.5+27+0+20=78.5%
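A short sketch of the weighted mean for this grade example. The dictionary names `weights` and `scores` are assumptions for illustration; the values come from the card.

```python
weights = {"exam": 0.45, "participation": 0.30, "iclicker": 0.05, "quiz": 0.20}
scores = {"exam": 70, "participation": 90, "iclicker": 0, "quiz": 100}

# Weights must add up to 100% = 1 for a weighted mean
total_weight = sum(weights.values())

# Weighted mean: sum of w_i * x_i over all components
final_grade = sum(weights[name] * scores[name] for name in weights)
print(round(final_grade, 1))  # prints 78.5
```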
Illustrate a correlation of r= -1, -0.6, 0, 1, 0.3
A1: Given (x_1,y_1)=(11,52) (x_2,y_2)=(13,72) (x_3,y_3)=(15,62) calculate a)Sample Variance of x b)Sample Covariance c)Sample correlation coefficient
a.4
b.10
c.1/2
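These three answers can be reproduced with the sample (n-1) formulas. A minimal sketch, with variable names chosen for illustration; variance is taken over the x values, which is what gives 4.

```python
from math import sqrt

xs = [11, 13, 15]
ys = [52, 72, 62]
n = len(xs)
mean_x = sum(xs) / n
mean_y = sum(ys) / n

# Sample variance and covariance divide by n - 1
var_x = sum((x - mean_x) ** 2 for x in xs) / (n - 1)
var_y = sum((y - mean_y) ** 2 for y in ys) / (n - 1)
cov_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / (n - 1)

# Correlation = covariance / (s_x * s_y)
corr_xy = cov_xy / (sqrt(var_x) * sqrt(var_y))
print(var_x, cov_xy, corr_xy)  # prints 4.0 10.0 0.5
```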
A1: What will be the price range of 95% of stock given avg price=$650 & standard deviation=$100
450-850
A1: n=5, x_i={1,2,3,4,5} a)sum b)mean c)sample variance d)sum of (x_i - mean of x)
a.15
b.3
c.2.5
d.0
A1: prove using summation operator a)Σ a x_i=a Σ x_i b)Σ (x_i+y_i)=Σ x_i+Σ y_i c)Σ (a x_i+b y_i)=a Σ x_i+b Σ y_i d)Σ_i Σ_j a b x_i y_j=ab (Σ_i x_i)(Σ_j y_j)
See doc
Probability
A number between 0 and 1 measuring how likely a set of outcomes (an event) is, defined by a function; the relative frequency of an outcome occurring when a random experiment is repeated infinitely many times (Formal Definition: a function from the space of sets to the space of real values between 0 and 1)
RV + Basic Outcome + Sample Space + Event
Random Experiment: a process leading to an uncertain outcome (ex.Dice, coin flip)
Basic Outcome: a possible outcome of a random experiment (ex. 1,2,3,4,5,6)
Sample Space: collection of all possible outcomes of a random experiment (ex. S={1,2,3,4,5,6})
Event: any subset of basic outcomes from the sample space (ex. Let A be the event “Number rolled even”, then A={2,4,6}), if outcome of experiment is in A, then event A has occurred
Outcomes of rolling 2 dice. a)Identical Dice b)Diff dice
a)21 (order doesn’t matter, (1,2) and (2,1) count once)
b)36 (6 x 6 ordered pairs)
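A small enumeration sketch of the two counts: ordered pairs for distinguishable dice give 36, unordered pairs for identical dice give 21. The variable names are illustrative only.

```python
from itertools import combinations_with_replacement, product

faces = range(1, 7)

# Distinguishable dice: (1, 2) and (2, 1) are different outcomes
ordered = list(product(faces, repeat=2))

# Identical dice: (1, 2) and (2, 1) are the same outcome
unordered = list(combinations_with_replacement(faces, 2))

print(len(ordered), len(unordered))  # prints 36 21
```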
6 Types of Probability Set Relationships + Draw
- **Empty Set** (empty): no element is in it, defines **Mutually Exclusive** events: AnB=empty
- Subset (AcB): any element of A is also in B, so AuB=B and AnB=A
- Intersection of Events (n): set of all outcomes that belong to both A & B in S, if A & B are events in a sample space S
- **Union of Events** (u): set of all outcomes that belong to A or B (at least one of them) in S, if A & B are events in a sample space S
- Complement of A (hatA): set of all basic outcomes in S that don’t belong to A, so that S=AuhatA
- Collectively Exhaustive: collection of events completely covers S, AuB=S
4 Properties of Set Operations + Draw
- Commutative (Order): AuB=BuA
- Associative (Grouping): (AuB)uC=Au(BuC)
- Distributive Law: An(BuC)=(AnB)u(AnC)
- De Morgan’s Law: hat(AuB)=hatAnhatB and hat(AnB)=hatAuhatB
Given S={1,2,3,4,5,6} A={2,4,6},B={4,5,6},C={4,5} find complements, intersections, unions, subset
See doc
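One way to check answers for this card is with Python's built-in sets. This is a sketch covering a few of the requested operations, not the worked solution from the doc; the result variable names are made up.

```python
S = {1, 2, 3, 4, 5, 6}
A = {2, 4, 6}
B = {4, 5, 6}
C = {4, 5}

complement_A = S - A      # hatA
intersection_AB = A & B   # AnB
union_AB = A | B          # AuB
C_subset_of_B = C <= B    # CcB

# De Morgan's Law: hat(AuB) == hatA n hatB
de_morgan_holds = (S - (A | B)) == ((S - A) & (S - B))

print(complement_A, intersection_AB, union_AB, C_subset_of_B, de_morgan_holds)
```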
Probability as Relative Frequency
P(A)=lim_{n→inf} n_A/n=# of events in population that satisfy A/total # of events in population
Repeating the experiment n times with n approaching infinity, and counting the number of times event A occurred as n_A, gives the ratio/relative frequency of event A occurring
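A simulation sketch of the relative-frequency idea, using the event "number rolled is even" on a fair die (true probability 1/2). The function name `relative_frequency` and the fixed seed are assumptions made so the run is repeatable.

```python
import random


def relative_frequency(n_trials, seed=0):
    """Estimate P(even roll) as n_A / n over n_trials simulated die rolls."""
    rng = random.Random(seed)
    n_a = sum(1 for _ in range(n_trials) if rng.randint(1, 6) % 2 == 0)
    return n_a / n_trials


# The ratio n_A / n should settle near 0.5 as n grows
for n in (100, 10_000, 100_000):
    print(n, relative_frequency(n))
```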
Factorial & Combination & Permutation Formula
Factorial Formula: n!, number ways to order n objects
- How many ways to order n=8 runners in a sequence
Combination Formula: C^n_k=[n!/(n-k)!]/k!=n!/(k!(n-k)!), number of unordered ways in which k objects can be selected from n objects
- How many ways to pick 3 (k=3) out of 8 (n=8) runners, who gets a medal
- True Combination lock accepts 1-2-3, 2-1-3, 3-2-1
- Has fewer outcomes, since all orders of a grouping count as one
Permutation Formula: P^n_k=n!/(n-k)!, n=total, k=limited spots, = # of groupings x # of orders per grouping=C^n_k x k!
- How many ways to pick k=3 runners for 1st, 2nd, 3rd places, who gets what medal
- True Permutation lock only accepts 1-2-3
- Has more outcomes, since each order of a grouping counts separately
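The three counts for the 8-runner example, sketched with the standard library (`math.comb` and `math.perm` need Python 3.8 or newer). Variable names are illustrative.

```python
from math import comb, factorial, perm

n, k = 8, 3

orderings = factorial(n)  # ways to order all 8 runners
groupings = comb(n, k)    # who gets a medal (order ignored)
podiums = perm(n, k)      # who gets which medal (order matters)

print(orderings, groupings, podiums)  # prints 40320 56 336
```

Each grouping of 3 runners can be ordered in 3! = 6 ways, which is why 56 x 6 = 336.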
Q. 5 Candidates, 2 positions, 3 men, 2 women, every candidate likely to be chosen, probability that no women will be hired:
- Total # of combinations: C^5_2=5!/(2!(5-2)!)=10
- Total combinations in which only men are hired: C^3_2=3!/(2!(3-2)!)=3
- Probability=# of events in population that satisfy A/total # of events in population=3/10=30%
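The same answer can be reached by listing every possible pair of hires. A minimal sketch; the candidate labels `M1`...`W2` are made up for illustration.

```python
from fractions import Fraction
from itertools import combinations

candidates = ["M1", "M2", "M3", "W1", "W2"]  # 3 men, 2 women

# Every unordered pair of hires is equally likely
hires = list(combinations(candidates, 2))
no_women = [pair for pair in hires if all(name.startswith("M") for name in pair)]

p_no_women = Fraction(len(no_women), len(hires))
print(len(hires), len(no_women), p_no_women)  # prints 10 3 3/10
```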
Probability as a Set Function + 3 Properties
Probability as a Set Function: real-valued set function P that assigns to each event A in the sample space S, a number P(A) that satisfies the following 3 properties:
1. Never negative: P(A)>=0
2. P(S)=100%=1=probability of all outcomes
3. Mutually exclusive events probability is the sum of each (addition rule): P(A1uA2u…uAk)=P(A_1)+P(A_2)+…+P(A_k) or P(AuB)=P(A)+P(B)
5 Probability Rules + Draw
- Complement Rule: P(hatA)=1-P(A), 1=P(A)+P(hatA)
- Addition Rule: P(AuB)=P(A)+P(B)-P(AnB) draw diagram see notes
P((AnB)u(AnhatB))=P(AnB)+P(AnhatB)
Mutually Exclusive Addition Rule: P(AuB)=P(A)+P(B)
For any A & B Addition Rule: P(AuB)=P(A)+P(B)-P(AnB)
- P(empty)=0
- If AcB, P(A)<=P(B)
- P(AnB)>=P(A)+P(B)-1
- P(AuhatA)=1
- P(hatA|hatA)=1
Draw Probability Table + Table of Cards Ace vs non-Ace + Table of P(A)=P(AnB)+P(AnhatB)
See doc
Conditional Probability & Multiplication Rule
Conditional Probability: probability of one event A, given another event B is true/has occurred; B becomes the new total sample space, and A is measured within B
P(A|B)=P(AnB)/P(B)=# of outcomes in B that satisfy A/total # of outcomes in B
P(B|A)=P(AnB)/P(A)=# of outcomes in A that satisfy B/total # of outcomes in A
**Multiplication Rule**: rearranging conditional probability P(AnB)=P(A|B)P(B) or P(AnB)=P(B|A)P(A)
Outcome is an even number. What is the probability of having rolled a 6
S={1,2,3,4,5,6}, A={2,4,6} (even), B={6}, P(B|A)=P(AnB)/P(A)=P({6})/P({2,4,6})=(1/6)/(1/2)=1/3
Probability that at least one die is equal to 2 when the sum of two numbers is less than or equal to 3
S={1…6}x{1…6}, A={at least one die is 2}, B={(1,1),(1,2),(2,1)}, P(A|B)=P(AnB)/P(B)=(2/36)/(3/36)=(1/18)/(1/12)=2/3
Basic Outcomes=36, A={(2,x_i)…(y_i,2)}, B={(1,1),(1,2),(2,1)}, P(AnB)=2/36, P(B)=3/36
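The two-dice answer can be checked by enumerating all 36 outcomes and applying P(A|B)=P(AnB)/P(B). A sketch with illustrative variable names.

```python
from fractions import Fraction
from itertools import product

outcomes = list(product(range(1, 7), repeat=2))  # 36 ordered rolls

# B: the sum of the two dice is at most 3
B = [roll for roll in outcomes if sum(roll) <= 3]
# AnB: within B, at least one die shows a 2
A_and_B = [roll for roll in B if 2 in roll]

p_b = Fraction(len(B), len(outcomes))
p_a_and_b = Fraction(len(A_and_B), len(outcomes))
p_a_given_b = p_a_and_b / p_b
print(p_a_given_b)  # prints 2/3
```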
Probability of getting a red ace using multiplication rule + Does P(A)=P(AnB)+P(AnhatB)=P(A|B)P(B)+P(A|hatB)P(hatB)?
P(RednAce)=P(Red|Ace)P(Ace)=(2/4)(4/52)=2/52
Statistical Independence:
Events are unrelated: whether one occurs doesn’t affect the probability of the other (ex. Shape of coin vs Flipping heads)
P(AnB)=P(A)P(B)
P(A|B)=P(A) because the condition of B has no effect on probability of A
P(B|A)=P(B) because the condition of A has no effect on probability of B
A{2,4,6}, B{1,2,3,4} Statistically independent?
Yes, because AnB={2,4} and
P(AnB)=P(A)P(B)
2/6=(3/6)(4/6)=12/36=2/6
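A sketch of the independence check for one roll of a fair die, using exact fractions. The helper `prob` is hypothetical and assumes all six outcomes are equally likely.

```python
from fractions import Fraction

S = {1, 2, 3, 4, 5, 6}
A = {2, 4, 6}
B = {1, 2, 3, 4}


def prob(event):
    """Probability of an event when every outcome in S is equally likely."""
    return Fraction(len(event & S), len(S))


# Independent if and only if P(AnB) == P(A) * P(B)
independent = prob(A & B) == prob(A) * prob(B)
print(prob(A & B), prob(A) * prob(B), independent)  # prints 1/3 1/3 True
```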
Bivariate Probabilities & Joint Distribution of X & Y & Marginal Probabilities
Draw Table + Diagram
Bivariate Probabilities: probabilities P(AinBj) that two events occur together when there are two random variables in your scenario
Joint Distribution of X{xi} & Y{yi}: described by bivariate probabilities
Marginal Probabilities: the probability of a single event occurring, regardless of the outcome of the other variable
Bi mutually exclusive & collectively exhaustive: P(A)=P(AnB1)+P(AnB2)+…+P(AnBk)
See doc for table & diagram
Difference b/w Joint Probability, Marginal Probability & Conditional Probability
Joint probability is the probability of two events occurring simultaneously.
Marginal probability is the probability of an event irrespective of the outcome of another variable.
Conditional probability is the probability of one event occurring given that a second event has occurred.
Law of Total Probability + Draw
Law of Total Probability: mutually exclusive & collectively exhaustive events Bi partition A into k mutually exclusive pieces AnBi such that
A=(AnB1)u(AnB2)u…u(AnBk), therefore using the addition rule: P(A)=P(AnB1)+P(AnB2)+…+P(AnBk)=Σ_{i=1}^{k} P(AnBi)
If Bi are mutually exclusive & collectively exhaustive (BinBj=empty, S=B1uB2u…uBk), subbing in the multiplication rule → P(A)=Σ_{i=1}^{k} P(A|Bi)P(Bi) for any A
Bayes’ Theorem + Proof
Bayes’ Theorem: combines all previous concepts into one expression, how new info updates a probability based on old info
P(B|A)=P(AnB)/P(A)=P(AnB)/[P(AnB)+P(AnhatB)]=P(A|B)P(B)/P(A)=P(A|B)P(B)/[P(A|B)P(B)+P(A|hatB)P(hatB)]
Proof:
1. Conditional Probability: P(B|A)=P(AnB)/P(A)
2. **Sub in Multiplication Rule** P(AnB)=P(A|B)P(B): P(B|A)=P(A|B)P(B)/P(A)
3. Mutually exclusive & collectively exhaustive B & hatB: P(A)=P(AnB)+P(AnhatB) since A=(AnB)u(AnhatB)
P(B|A)=P(A|B)P(B)/[P(AnB)+P(AnhatB)]
4. Sub in Multiplication Rule P(AnB)=P(A|B)P(B) & P(AnhatB)=P(A|hatB)P(hatB): P(B|A)=P(A|B)P(B)/[P(A|B)P(B)+P(A|hatB)P(hatB)]
General Theorem: see doc P(Bi|A)=P(A|Bi)P(Bi)/[P(A|B1)P(B1)+…+P(A|Bk)P(Bk)]=P(A|Bi)P(Bi)/Σ_{j=1}^{k} P(A|Bj)P(Bj)
1. Conditional Probability: P(Bi|A)=P(AnBi)/P(A)
2. Law of Total Probability: P(A)=Σ_{j=1}^{k} P(A|Bj)P(Bj), so P(Bi|A)=P(AnBi)/Σ_{j=1}^{k} P(A|Bj)P(Bj)
3. Multiplication Rule: sub in P(AnBi)=P(A|Bi)P(Bi)
P(Bi|A)=P(A|Bi)P(Bi)/Σ_{j=1}^{k} P(A|Bj)P(Bj)
Your probability of having covid antibody (B) if 10% of population has covid antibody (P(B)=10%) & your test is positive (A is true)
True Positive: P(A|B)=97.5%, if you have (B) covid antibody → probability of positive test (A) is 97.5%
False Positive: P(A|hatB)=12.5%, if you don’t have covid antibody (no B=hatB) → probability of positive test (A) is 12.5%
P(B|A)=?
P(A|B)=97.5%
P(A|hatB)=12.5%
P(B)=10%
P(hatB)=90%
P(B|A)=P(A|B)P(B)/[P(A|B)P(B)+P(A|hatB)P(hatB)]=(97.5%)(10%)/[(97.5%)(10%)+(12.5%)(90%)]=9.75%/21%=46.4%
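The antibody calculation as a minimal sketch of Bayes' Theorem with two cases (B and not B). The function name `posterior` and its parameter names are assumptions for illustration.

```python
def posterior(prior, true_positive_rate, false_positive_rate):
    """P(B|A) from P(B), P(A|B) and P(A|not B) via Bayes' Theorem."""
    # Law of total probability: P(A) = P(A|B)P(B) + P(A|not B)P(not B)
    p_positive = true_positive_rate * prior + false_positive_rate * (1 - prior)
    return true_positive_rate * prior / p_positive


p_antibody_given_positive = posterior(0.10, 0.975, 0.125)
print(f"{p_antibody_given_positive:.1%}")  # prints 46.4%
```

Even with a 97.5% true positive rate, the result stays below 50% because only 10% of the population has the antibody, so false positives from the other 90% make up more than half of all positive tests.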
A2:
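a) Prove that if AcB, then P(A)<=P(B)
b) Prove that for any A & B, P(AnB)>=P(A)+P(B)-1
a) B=Au(hatAnB) with A & hatAnB mutually exclusive (AnB=A), so by the addition rule P(B)=P(A)+P(hatAnB)>=P(A), since by the first property of probability P(hatAnB)>=0
b) Addition rule: P(AuB)=P(A)+P(B)-P(AnB); since P(AuB)<=1, rearranging gives P(AnB)>=P(A)+P(B)-1, a true statement/property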
See doc
A2: Given P(A)=0.3 P(B|A)=0.6 P(B|hatA) find P(hatA|hatB)
Find elements of P(hatA|hatB)=P(hatAnhatB)/P(hatB)
or
P(hatA|hatB)=P(hatBnhatA)/P(hatB)=P(hatB|hatA)P(hatA)/P(hatB)=(1-P(B|hatA))(1-P(A))/(1-P(B))=(1-0.6)(1-0.3)/(1-0.6)=0.7 see doc
0.7
A2: 8 candidates, 2 jobs, 4 women, 4 men, 1 set of brothers
a) Probability that only men are hired
b) Probability that the brothers are hired
c) Probability that only men and the brothers are hired
a) C^4_2/C^8_2=only men combos/total combos=6/28=3/14
b) 1/28
c) 1/28 since BcA, AnB=B
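A sketch that recomputes these three answers as probabilities, assuming every pair of hires is equally likely and that the two brothers are among the 4 men. Variable names are illustrative.

```python
from fractions import Fraction
from math import comb

total_pairs = comb(8, 2)     # all ways to fill 2 jobs from 8 candidates
only_men_pairs = comb(4, 2)  # both hires are men
brother_pairs = 1            # exactly one pair is the set of brothers

p_only_men = Fraction(only_men_pairs, total_pairs)
p_brothers = Fraction(brother_pairs, total_pairs)
# The brothers pair is itself an all-male pair, so the intersection is the same event
p_men_and_brothers = p_brothers

print(p_only_men, p_brothers, p_men_and_brothers)  # prints 3/14 1/28 1/28
```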