Bi-variate Analysis Flashcards

1
Q

Bi-variate relationship

A

Evaluate relationship between 2 variables

2
Q

Line of best fit

A

The line for which the total distance between the line and the scatterplot points is as small as possible (OLS minimizes the sum of squared distances)

3
Q

^y = ^B(0)+^B(1)x+U

A

predicted value of y equals the estimated y-intercept plus the estimated slope times x, plus the error term (u); strictly, u belongs to the population model y = B(0) + B(1)x + u, while the fitted value ^y omits it

4
Q

Residuals

A
  • difference between actual and estimated values
  • minimizing the sum of squared residuals keeps the actual and estimated values close
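The fit and its residuals can be sketched in a few lines of Python; the data points here are invented for illustration:

```python
# Sketch: fit a line by ordinary least squares and inspect the residuals.
def ols_fit(x, y):
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    # slope = cov(x, y) / var(x); the intercept makes the line pass through the means
    b1 = (sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
          / sum((a - mean_x) ** 2 for a in x))
    b0 = mean_y - b1 * mean_x
    return b0, b1

x = [1, 2, 3, 4, 5]
y = [2.1, 3.9, 6.2, 7.8, 10.1]
b0, b1 = ols_fit(x, y)
# residuals: actual minus estimated values; for an OLS fit they sum to ~0
residuals = [yi - (b0 + b1 * xi) for xi, yi in zip(x, y)]
```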

5
Q

U

A

the error term: all omitted variables and factors that impact y

6
Q

Regression

A

At the end of the day this is an estimate / prediction

7
Q

B(0)

A

estimated y-intercept, predicted value of y when x = 0

8
Q

B(1)

A

estimated slope: the predicted change in y when x changes by 1 unit

9
Q

Correlation Coefficient

A

measures how two variables move together / the tightness of their linear relationship
covariance(x,y) / (sd(x) * sd(y)) — standard deviations, not variances
always between -1 and 1
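A minimal Python sketch of the formula, using made-up data (note the denominator uses standard deviations, not variances):

```python
import math

# Sketch: correlation = cov(x, y) / (sd(x) * sd(y)); always in [-1, 1]
def corr(x, y):
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y)) / n
    sd_x = math.sqrt(sum((a - mx) ** 2 for a in x) / n)
    sd_y = math.sqrt(sum((b - my) ** 2 for b in y) / n)
    return cov / (sd_x * sd_y)
```

A perfectly linear increasing relationship gives r = 1, a perfectly linear decreasing one gives r = -1; dividing by the variances instead would not stay inside [-1, 1].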

10
Q

Derivative

A

taking the derivative of the sum of squared residuals and setting it equal to 0 minimizes it; solving those first-order conditions yields the OLS formulas
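This minimization can be checked numerically: a Python sketch (with assumed example data) confirms that perturbing the OLS coefficients in any direction can only increase the sum of squared residuals:

```python
# Sketch: the first-order conditions (derivatives of the SSR set to zero) give
# the OLS coefficients; any nearby (b0, b1) pair yields a larger SSR.
def ssr(x, y, b0, b1):
    return sum((yi - b0 - b1 * xi) ** 2 for xi, yi in zip(x, y))

def ols(x, y):
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    b1 = (sum((a - mx) * (b - my) for a, b in zip(x, y))
          / sum((a - mx) ** 2 for a in x))
    return my - b1 * mx, b1

x = [0, 1, 2, 3]
y = [1.0, 2.9, 5.2, 6.9]
b0, b1 = ols(x, y)
best = ssr(x, y, b0, b1)  # the minimized sum of squared residuals
```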

11
Q

Percentage Point

A
  • an absolute change in a share or rate (e.g., going from 10% to 12% is a rise of 2 percentage points)
  • contrast with a percent change, which is relative to what you had before

12
Q

covariance(x,u)

A
  • measures whether x is correlated with any omitted variables that influence the regression
  • cov(x,u) must = 0 or the regression is biased
  • biased because we can’t establish a causal effect of x if an omitted variable that affects y also moves with x
13
Q

R-squared

A

AKA coefficient of determination; measures whether the line through a scatter is a good fit
The percent of the variation in y that is explained by the model
The higher the better, but only valuable if we want to precisely predict y
A larger sample size has no systematic effect, but adding new regressors (k) never decreases r-squared
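A short Python sketch of the R-squared computation (the coefficients and data are assumed for illustration):

```python
# Sketch: R-squared = 1 - SSR/SST, the share of variation in y the line explains
def r_squared(x, y, b0, b1):
    my = sum(y) / len(y)
    ssr = sum((yi - b0 - b1 * xi) ** 2 for xi, yi in zip(x, y))  # unexplained
    sst = sum((yi - my) ** 2 for yi in y)                        # total
    return 1 - ssr / sst

# A perfect linear fit (here y = 1 + 2x exactly) explains all variation:
print(r_squared([1, 2, 3], [3, 5, 7], 1, 2))  # → 1.0
```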

14
Q

Ceteris Paribus

A

Rule out the possibility of other factors changing the causal relationship by holding all other factors that affect the dependent variable constant ("all else equal")

15
Q

E[^B(1)] formula

A
  • E[^B(1)] = B(1) + covariance(x,u) / variance(x)
  • the bias depends on whether cov(x,u) is positive or negative
  • if cov(x,u) is positive and ignored, ^B(1) is biased upward (it overstates B(1)); if negative, it is biased downward
  • for the expected value of ^B(1) to equal B(1), cov(x,u) must be 0
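The bias formula can be illustrated with a small simulation in Python (all numbers here are assumptions chosen for the example): building u so that cov(x,u) > 0 pushes the slope estimate above the true B(1):

```python
import random

# Sketch (simulated data): u is built so cov(x, u) = 0.5 * var(x) > 0,
# then y is regressed on x alone, omitting u.
def slope(x, y):
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    return (sum((a - mx) * (b - my) for a, b in zip(x, y))
            / sum((a - mx) ** 2 for a in x))

random.seed(0)
beta1 = 2.0
x = [random.gauss(0, 1) for _ in range(20_000)]
u = [0.5 * xi + random.gauss(0, 1) for xi in x]   # cov(x, u) > 0 by construction
y = [1.0 + beta1 * xi + ui for xi, ui in zip(x, u)]
b1_hat = slope(x, y)
# E[b1_hat] = beta1 + cov(x,u)/var(x) ≈ 2.0 + 0.5, i.e. biased upward
```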
16
Q

4 Conditions for OLS to Yield Unbiased Estimator

A
  1. random sample
  2. y and x relationship must be linear in parameters
  3. variation in x
  4. cov(x,u) = 0

When these four conditions are met, E(^B1) = B1

17
Q

P-value Approach

A

Alternative to finding t-critical

Probability, under the null hypothesis, of obtaining a slope estimate as extreme as ^B(1) or more extreme

18
Q

P-value final answer

A

p-value > sig-level means |t-actual| < t-critical (fail to reject the null)

p-value < sig-level means |t-actual| > t-critical (reject the null)
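A Python sketch of the p-value comparison, using the large-sample normal approximation to the t distribution (the 1.96 cutoff is the familiar 5% two-sided critical value):

```python
import math

# Sketch: two-sided p-value for a t statistic via the normal approximation
# (exact t distributions differ slightly in small samples).
def two_sided_p(t_stat):
    # P(|Z| >= |t|) for standard normal Z, using the error function
    return 2 * (1 - 0.5 * (1 + math.erf(abs(t_stat) / math.sqrt(2))))

alpha = 0.05
# |t| = 1.96 sits right at the 5% two-sided boundary, so p is about 0.05;
# a larger |t| gives a smaller p, and p < alpha means reject the null
```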

19
Q

Statistical Significance

A

Rejecting null hypothesis that slope = 0 means x has some level of non-zero effect on y

20
Q

cov(x,u) = 0

A

NO correlation between x and all the other variables that affect y

21
Q

Interpreting slope / y-int

A

Don’t forget these are predicted / estimated values and impacts

22
Q

Omitted Variables Bias (^B1)

A

if cov(x,u) > 0 - biased upward, because the ignored positive correlation makes the expected value of ^B1 larger than the true B1

if cov(x,u) < 0 - biased downward, because the ignored negative correlation makes the expected value of ^B1 smaller than the true B1

23
Q

Linear in Parameters

A

In order for us to run a regression, the model must be linear in the parameters (the coefficients); the relationship between the two variables is modeled as a straight line

24
Q

Variation in X

A

To tell the relationship between x and y, you need different x-values (the scatter cannot be a vertical line)

No variation in x would make var(x) = 0, and the slope formula divides by var(x)

25
Q

Homoscedastic Assumption

A
  • var(u|x) = sigma-squared; if this condition of constant variance holds, we can estimate the variance of ^B1
  • needs to hold for OLS to be BLUE (best linear unbiased estimator)
  • the spread of u stays the same as x changes
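A rough Python check of constant spread, on simulated homoscedastic data (the split-in-half comparison is an informal diagnostic, not a formal test):

```python
import random

# Sketch (simulated data): compare residual spread in the lower and upper
# halves of x; a variance ratio near 1 is consistent with constant variance,
# while a large ratio would suggest heteroscedasticity.
def var(v):
    m = sum(v) / len(v)
    return sum((a - m) ** 2 for a in v) / len(v)

random.seed(1)
x = sorted(random.uniform(0, 10) for _ in range(4000))
u = [random.gauss(0, 1) for _ in x]   # spread does not depend on x here
half = len(x) // 2
ratio = var(u[half:]) / var(u[:half])
```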
26
Q

var(u|x)

A

variance of u conditional on x

27
Q

Multivariate Bias Conditions

A
  1. random sample
  2. linear in-parameters
  3. no perfect correlation between variables
  4. cov(x,u) = 0

perfect correlation only occurs if you include the same variable twice or if you include variables that sum exactly to another included variable.