Week 2: Ordinary least squares (OLS) and bivariate regression Flashcards

1
Q

The 6 steps of hypothesis testing

A
  1. Ensure that assumptions are met
  2. Formulate hypotheses
  3. Determine the critical area from the appropriate sampling distribution
  4. Calculate the test statistic
  5. Make decision
  6. State conclusions
2
Q

Step 1: Ensure that assumptions are met

A
  • Random samples
  • Independent samples
  • Interval-ratio level of measurement
  • Sampling distribution is normally distributed

What is the probability that we would observe a particular sample statistic given this population? How unusual is this? Does this suggest the null hypothesis is false?

3
Q

Step 2: Formulate hypotheses

A
  • Null is always no difference, no (positive/negative) effect etc.
  • Alternative is a difference OR a particular difference (greater than, less than, etc.)
4
Q

Step 3: Determine the critical area from the appropriate sampling distribution

A

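Decision rule: Reject H0 if the test statistic (Z*) falls in the critical area. For a two-tailed test, reject if |Z*| is greater than the critical z-value. For a lower-tailed test, reject if Z* is less than the negative critical z-value; for an upper-tailed test, reject if Z* is greater than the positive critical z-value.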
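
Because the direction of the comparison depends on the type of test, it can help to see the decision rule written out for each case. The following is a minimal sketch, assuming a standard normal sampling distribution; the function name reject_h0 and its parameters are hypothetical and chosen only for illustration.

```python
from statistics import NormalDist


def reject_h0(z_star, alpha=0.05, tail="two"):
    """Return True if the test statistic z_star falls in the critical area.

    tail is "two", "lower" or "upper" (hypothetical labels for this sketch).
    """
    std_normal = NormalDist()  # standard normal: mean 0, sd 1
    if tail == "two":
        # alpha is split over both tails, so compare |z*| with z at 1 - alpha/2
        return abs(z_star) > std_normal.inv_cdf(1 - alpha / 2)
    if tail == "lower":
        # the whole critical area sits in the left tail
        return z_star < std_normal.inv_cdf(alpha)
    if tail == "upper":
        # the whole critical area sits in the right tail
        return z_star > std_normal.inv_cdf(1 - alpha)
    raise ValueError("tail must be 'two', 'lower' or 'upper'")


print(reject_h0(2.5))                 # two-tailed, alpha = 0.05
print(reject_h0(-1.8, tail="lower"))  # lower-tailed, alpha = 0.05
print(reject_h0(-1.8, tail="two"))    # same z*, but two-tailed
```

Note that the same Z* of -1.8 leads to rejection in the lower-tailed test but not in the two-tailed test, because the two-tailed critical value (about 1.96) is further from zero than the one-tailed one (about 1.645).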

5
Q

Critical value of z

A

A critical value of z (Z-score) is used when the sampling distribution is normal, or close to normal. Z-scores are used when the population standard deviation is known or when you have larger sample sizes.

6
Q

How to find critical value

A

Step 1: Subtract the confidence level from 100% to find the α level: 100% – 90% = 10%.

Step 2: Convert Step 1 to a decimal: 10% = 0.10.

Step 3: Divide Step 2 by 2 (this is called “α/2”):
0.10 / 2 = 0.05. This is the area in each tail.

Step 4: Subtract Step 3 from 1 (because the z-table lists the area to the left of the critical value, not the area in the tail):
1 – 0.05 = 0.95.

Step 5: Look up the area from Step 4 in the z-table. An area of 0.95 corresponds to z = 1.645. This is your critical value for a confidence level of 90%.

This example is for a 90% confidence level (two-tailed test). For a one-tailed test, Step 3 can be skipped.
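
The table lookup can also be done in software. The sketch below follows the same steps for a 90% confidence level, assuming a standard normal distribution and using Python's standard-library statistics.NormalDist in place of the printed z-table; the variable names are illustrative only.

```python
from statistics import NormalDist

confidence = 0.90
alpha = 1 - confidence            # Steps 1-2: alpha level as a decimal (0.10)
tail_area = alpha / 2             # Step 3: area in each tail (0.05)
cumulative_area = 1 - tail_area   # Step 4: area to the left of the critical value (0.95)

# Step 5: the inverse CDF replaces the z-table lookup
z_two_tailed = NormalDist().inv_cdf(cumulative_area)

# One-tailed test: skip Step 3 and put all of alpha in one tail
z_one_tailed = NormalDist().inv_cdf(1 - alpha)

print(round(z_two_tailed, 3))  # 1.645
print(round(z_one_tailed, 3))  # 1.282
```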

7
Q

Step 4(1): Calculate the test statistic

A
  • For every sample statistic, there is a formula for its test statistic (not so important)
  • The test statistic allows us to make probability statements in terms of the “standard” distribution
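
As one concrete instance, the test statistic for a sample mean with a known population standard deviation is z = (sample mean - population mean) / (population sd / sqrt(n)). The sketch below assumes that one-sample z-test setting; the function name z_statistic and the numbers are hypothetical and serve only to show the arithmetic.

```python
from math import sqrt


def z_statistic(sample_mean, pop_mean, pop_sd, n):
    """One-sample z statistic: distance of the sample mean from the
    hypothesized population mean, measured in standard errors."""
    standard_error = pop_sd / sqrt(n)
    return (sample_mean - pop_mean) / standard_error


# Hypothetical numbers: H0 says the population mean is 100 (sd 15);
# a sample of n = 100 has a mean of 103.
z_star = z_statistic(sample_mean=103.0, pop_mean=100.0, pop_sd=15.0, n=100)
print(z_star)  # 2.0
```

Here the standard error is 15 / 10 = 1.5, so a sample mean 3 points above the hypothesized mean lies 2 standard errors away from it.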
8
Q

Step 4(2): Using P-values

A

Reject H0 if p < α, conventionally 0.05 (no critical value is needed)

9
Q

P-value

Definition

A

The probability, under the assumption of no effect or no difference (the null hypothesis), of obtaining a result equal to or more extreme than what was actually observed.

A p-value below 0.05 is generally considered statistically significant.

The P stands for probability and measures how likely it is that a difference at least as large as the one observed would arise by chance alone, that is, if the null hypothesis were true.
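
When the test statistic is a z-score, the definition translates directly into a tail area under the standard normal curve. The following is a rough sketch under that assumption; the function name p_value_from_z and the tail labels are hypothetical.

```python
from statistics import NormalDist


def p_value_from_z(z_star, tail="two"):
    """Probability, under H0, of a z at least as extreme as z_star."""
    cdf = NormalDist().cdf  # standard normal cumulative distribution
    if tail == "two":
        # extreme in either direction: double the area beyond |z*|
        return 2 * (1 - cdf(abs(z_star)))
    if tail == "upper":
        return 1 - cdf(z_star)
    if tail == "lower":
        return cdf(z_star)
    raise ValueError("tail must be 'two', 'upper' or 'lower'")


p = p_value_from_z(2.0)
print(round(p, 4))  # 0.0455
print(p < 0.05)     # True, so H0 would be rejected at alpha = 0.05
```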

10
Q

Step 5 and Step 6: Decision and conclusion

A
  • Decision: H0 can/cannot be rejected
  • Conclusion: With 95% certainty… (this wording only applies if an alpha level of 0.05 was chosen)
11
Q

Regression equation

A

Y = a + bX + e
- a = intercept (the predicted value of Y when X = 0)
- b = slope (the change in Y for a one-unit increase in X)
- X = the value of the independent variable
- Y = the value of the dependent variable
- e = error term; the error in predicting the value of Y, given the value of X

The fitted line minimizes the sum of the squared vertical distances between the data points and the line.
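
For the bivariate case, the intercept and slope that achieve this have a closed form: b is the sum of cross-deviations of X and Y divided by the sum of squared deviations of X, and a is the mean of Y minus b times the mean of X. The sketch below applies those textbook formulas to a small made-up data set; the function name fit_line and the data are hypothetical.

```python
from statistics import mean


def fit_line(xs, ys):
    """Return (a, b) for the least-squares line Y = a + bX."""
    x_bar, y_bar = mean(xs), mean(ys)
    # sum of cross-deviations of X and Y around their means
    s_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    # sum of squared deviations of X around its mean
    s_xx = sum((x - x_bar) ** 2 for x in xs)
    b = s_xy / s_xx        # slope
    a = y_bar - b * x_bar  # intercept: the line passes through (x_bar, y_bar)
    return a, b


# Made-up illustration data
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]
a, b = fit_line(xs, ys)
print(round(a, 2), round(b, 2))  # 2.2 0.6
```

For this data the fitted equation is Y = 2.2 + 0.6X: each one-unit increase in X goes with a predicted increase of 0.6 in Y.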

12
Q

Ordinary Least Squares Estimation

A
  • Logic: Minimize the sum of the squared residuals
  • Residual is the difference between the actual value of Y and the predicted value of Y
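
To make the "minimize" part concrete, the sketch below computes the sum of squared residuals for three candidate lines on a small made-up data set. The data and the function name sum_squared_residuals are hypothetical; the first candidate (a = 2.2, b = 0.6) is the least-squares solution for this particular data, and the other two are arbitrary nearby lines.

```python
def sum_squared_residuals(a, b, xs, ys):
    """Sum over all points of (actual Y - predicted Y) squared."""
    return sum((y - (a + b * x)) ** 2 for x, y in zip(xs, ys))


# Made-up illustration data
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]

ssr_ols = sum_squared_residuals(2.2, 0.6, xs, ys)      # least-squares line
ssr_steeper = sum_squared_residuals(2.0, 0.7, xs, ys)  # arbitrary alternative
ssr_flatter = sum_squared_residuals(2.5, 0.5, xs, ys)  # arbitrary alternative

print(round(ssr_ols, 2))      # 2.4
print(round(ssr_steeper, 2))  # 2.55
print(round(ssr_flatter, 2))  # 2.5
```

Both alternative lines produce a larger sum of squared residuals than the least-squares line, which is what the OLS criterion guarantees for any other choice of a and b.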
13
Q

Limits of the Ordinary Least Squares regression

A
  • Collinearity between X-variables leads to misinterpretation of the coefficients
  • More observations than X-variables are required
  • Only one Y-variable can be modeled