Econometrics Flashcards
Typical Problems Estimating Economic Models - High multicollinearity
Definition: Two or more independent variables in a regression model exhibit a close linear relationship.
Consequences:
- Large standard errors and insignificant t-statistics
- Coefficient estimates sensitive to minor changes in model specification
- Nonsensical coefficient signs and magnitudes
Detection:
- Pairwise correlation coefficients
- Variance inflation factor (VIF)
Solution:
- Collect additional data.
- Re-specify the model.
- Drop redundant variables.
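- As an illustration of the VIF check, a minimal Python sketch using statsmodels (the data and variable names are made up):

```python
import pandas as pd
import statsmodels.api as sm
from statsmodels.stats.outliers_influence import variance_inflation_factor

# Hypothetical predictors; "wealth" is nearly proportional to "income".
X = pd.DataFrame({
    "income": [40, 55, 62, 48, 75, 90, 52, 61],
    "wealth": [122, 160, 188, 149, 231, 270, 158, 185],
    "age":    [25, 34, 41, 29, 50, 58, 31, 39],
})
X = sm.add_constant(X)

# Rule of thumb: VIF above ~10 (some use 5) flags problematic multicollinearity.
for i in range(1, X.shape[1]):  # skip the constant
    print(X.columns[i], variance_inflation_factor(X.values, i))
```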
Typical Problems Estimating Economic Models - Heteroskedasticity
Definition: The variance of the error term changes in response to a change in the value of the independent variables.
Consequences:
- Inefficient coefficient estimates
- Biased standard errors
- Unreliable hypothesis tests
Detection:
- Park test
- Goldfeld-Quandt test
- Breusch-Pagan test
- White test
Solution:
- Weighted least squares (WLS)
- Robust standard errors
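- A minimal Python sketch of both remedies with statsmodels, on simulated heteroskedastic data (the weighting scheme assumes the error variance is proportional to x²):

```python
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(0)
x = rng.uniform(1, 10, 200)
y = 2 + 0.5 * x + rng.normal(0, x)  # error spread grows with x: heteroskedastic
X = sm.add_constant(x)

# Remedy 1: keep the OLS estimates but use heteroskedasticity-robust (HC1) SEs.
robust = sm.OLS(y, X).fit(cov_type="HC1")

# Remedy 2: weighted least squares, weighting each observation by the inverse
# of its (assumed) error variance; here variance ~ x^2, so weights = 1/x^2.
wls = sm.WLS(y, X, weights=1.0 / x**2).fit()

print(robust.bse, wls.bse)
```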
Typical Problems Estimating Economic Models - Autocorrelation
Definition: An identifiable relationship (positive or negative) exists between the values of the error in one period and the values of the error in another period.
Consequences:
- Inefficient coefficient estimates
- Biased standard errors
- Unreliable hypothesis tests
Detection:
- Geary or runs test
- Durbin-Watson test
- Breusch-Godfrey test
Solution:
- Cochrane-Orcutt transformation
- Prais-Winsten transformation
- Newey-West robust standard errors
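- A minimal sketch of Newey-West (HAC) standard errors in Python with statsmodels, on simulated AR(1) errors (the lag choice is a tuning assumption):

```python
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(1)
T = 200
x = rng.normal(size=T)
u = np.zeros(T)
for t in range(1, T):            # AR(1) errors: autocorrelated by construction
    u[t] = 0.7 * u[t - 1] + rng.normal()
y = 1 + 2 * x + u

X = sm.add_constant(x)
# Newey-West (HAC) SEs; maxlags is a tuning choice, a common rule is ~0.75*T^(1/3).
fit = sm.OLS(y, X).fit(cov_type="HAC", cov_kwds={"maxlags": 4})
print(fit.bse)
```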
Rules for the mean
Rules for the variance
Rules for the covariance
- Let X, Y, and V be random variables; let µX and σ²X be the mean and variance of X; let σXY be the covariance between X and Y; and let a, b, and c be constants. The following rules hold:
- E(a + bX + cY) = a + bµX + cµY
- Var(a + bY) = b²σ²Y
- Var(aX + bY) = a²σ²X + 2abσXY + b²σ²Y
- E(Y²) = σ²Y + µ²Y
- Cov(a + bX + cV, Y) = bσXY + cσVY
- E(XY) = σXY + µXµY
- |corr(X, Y)| ≤ 1 and |σXY| ≤ √(σ²X σ²Y) (correlation inequality)
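- These rules can be checked numerically; a small simulation sketch (the distributions are chosen arbitrarily):

```python
import numpy as np

rng = np.random.default_rng(2)
n = 1_000_000
X = rng.normal(2, 3, n)
Y = 0.5 * X + rng.normal(0, 1, n)
a, b = 1.5, -2.0

# Var(aX + bY) = a^2*Var(X) + 2ab*Cov(X,Y) + b^2*Var(Y)
lhs = np.var(a * X + b * Y)
rhs = a**2 * np.var(X) + 2 * a * b * np.cov(X, Y, ddof=0)[0, 1] + b**2 * np.var(Y)
print(lhs, rhs)  # agree up to simulation noise

# E(XY) = Cov(X,Y) + E(X)E(Y)
print(np.mean(X * Y), np.cov(X, Y, ddof=0)[0, 1] + X.mean() * Y.mean())
```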
Formulas variance, covariance, sd
- Var(X) = σ²X = E[(X − µX)²]; Cov(X, Y) = σXY = E[(X − µX)(Y − µY)]; sd(X) = σX = √σ²X
Rules correlation
- corr(X, Y) = σXY / (σX σY); it is unit-free and always lies between -1 and +1; |corr(X, Y)| = 1 only if Y is an exact linear function of X
Limit for unusual data
Below: µ-2σ
Above: µ+2σ
Empirical rule for normal distribution
About 68% of the data falls within: µ-σ to µ+σ
About 95%: µ-2σ to µ+2σ
About 99.7%: µ-3σ to µ+3σ
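- The percentages follow from the normal CDF; a quick check with scipy:

```python
from scipy.stats import norm

# Probability mass within k standard deviations of the mean, normal distribution.
for k in (1, 2, 3):
    print(k, norm.cdf(k) - norm.cdf(-k))
# 1 -> 0.6827, 2 -> 0.9545, 3 -> 0.9973
```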
Least-squares line coefficients
- slope: b1 = Σ(xi - x̄)(yi - ȳ) / Σ(xi - x̄)² = sXY / s²X; intercept: b0 = ȳ - b1·x̄
Modified boxplot outliers
- points below Q1 - 1.5·IQR or above Q3 + 1.5·IQR, where IQR = Q3 - Q1
Why use panel data in regression?
- using panel data is one way of controlling for some types of omitted variables without actually observing them
panel data definition
- panel data: data in which each observational unit, or entity, is observed at two or more time periods; by studying changes in the dependent variable over time, it is possible to eliminate the effect of omitted variables that differ across entities but are constant over time; more formally: data for n different entities observed at T different time periods
- example: effect of alcohol taxes and drunk driving laws on traffic fatalities in the US: use data across states over multiple years - this lets us control for unobserved variables that differ from one state to the next but do not change over time, e.g. cultural attitudes toward drinking and driving. It also allows us to control for variables that vary through time but do not vary across states, e.g. improvements in the safety of new cars.
cross-sectional data
- Cross-sectional data, or a cross section of a study population, is data collected by observing many subjects at one point or period of time. Analysis of cross-sectional data usually consists of comparing the differences among selected subjects.
- Cross-sectional data differs from time series data, in which the entity is observed at various points in time. Another type of data, panel data (or longitudinal data), combines both cross-sectional and time series data ideas and looks at how the subjects (firms, individuals, etc.) change over a time series.
balanced / unbalanced panel
balanced: has all its observations, i.e. the variables are observed for each entity and each time period; unbalanced: at least one observation is missing for some entity in some time period
panel data: before / after comparisons
by focusing on changes in the dependent variable over time, this differences comparison holds constant the unobserved factors that differ from one state to the next but do not change over time within the state
how panel data eliminates effect of unobserved variables that do not change over time
because Zi (e.g. attitude toward drinking and driving) does not change over time, it will not produce any change in the fatality rate between two time periods. Thus, in the regression model, the influence of Zi can be eliminated by analyzing the change in the dependent variable between the two periods. If there is a difference between the two y-values, the change must have come from other sources, e.g. your independent variables or your error terms
why include an intercept?
allows for the possibility that the mean change in e.g. the fatality rate, in the absence of a change in the real beer tax, is nonzero. For example, a negative intercept could reflect improvements in auto safety between two time periods that reduced the average fatality rate
does “before and after” method work for T>2?
not directly; to analyze all the observations in a panel data set, use the method of fixed effects regression
fixed effects regression
is a method for controlling for omitted variables in panel data when the omitted variables vary across entities, but do not change over time; T can be greater than 2
fixed effects regression model
- Yit = β1Xit + αi + uit, i = 1, ..., n and t = 1, ..., T, where the αi are entity-specific intercepts capturing the combined effect of all omitted variables that vary across entities but are constant over time
entity-specific intercepts as binary variables
- equivalently, Yit = β0 + β1Xit + γ2D2i + ... + γnDni + uit, where D2i, ..., Dni are binary (dummy) variables, one for each entity except the first (to avoid the dummy variable trap)
entity-demeaned OLS algorithm
- compute the time average of Y and of X for each entity, subtract those averages from the corresponding observations, and run OLS on the demeaned data; the entity fixed effects αi drop out (see the sketch below)
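- A minimal pandas/statsmodels sketch of this within (entity-demeaning) transformation, using a hypothetical state-year panel (column names made up):

```python
import pandas as pd
import statsmodels.api as sm

# Hypothetical panel: one row per (state, year).
df = pd.DataFrame({
    "state": ["AL", "AL", "AL", "TX", "TX", "TX"],
    "year":  [1982, 1983, 1984, 1982, 1983, 1984],
    "beer_tax": [1.5, 1.6, 1.7, 0.4, 0.5, 0.6],
    "fatality_rate": [2.1, 2.0, 1.9, 2.5, 2.6, 2.4],
})

# Step 1: subtract each entity's time average from Y and X ...
cols = ["beer_tax", "fatality_rate"]
demeaned = df[cols] - df.groupby("state")[cols].transform("mean")

# Step 2: ... then run OLS on the demeaned data (no constant needed:
# demeaning removes it along with the entity fixed effects alpha_i).
fit = sm.OLS(demeaned["fatality_rate"], demeaned["beer_tax"]).fit()
print(fit.params)
```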
regression with time fixed effects only
- Yit = β1Xit + λt + uit, where the λt are time-specific intercepts that control for omitted variables that are constant across entities but vary over time
entity and time fixed effects
- Yit = β1Xit + αi + λt + uit, which controls for both kinds of omitted variables at once
regression error in panel data
can be correlated over time within an entity. Like heteroskedasticity, this correlation does not introduce bias in the fixed effects estimator, but it affects the variance of the fixed effects estimator, and therefore how one computes the standard errors
difference in regression assumptions between panel data and cross-sectional data
cross-sectional: each observation is independent, which arises under simple random sampling; in contrast, with panel data the variables are assumed independent across entities, but no such restriction is made within an entity; Xit can be correlated over time within an entity; a variable that is correlated with itself over time is said to be autocorrelated or serially correlated; this is a pervasive feature of time series data: what happens in one year tends to be correlated with what happens in the next year; the same applies to uit
standard errors for fixed effects regression
if regression errors are autocorrelated, then the usual heteroskedasticity-robust SE formula for cross-section regression is not valid; SEs that are valid if uit is potentially heteroskedastic and potentially correlated over time within an entity are referred to as heteroskedasticity-and-autocorrelation-robust SEs; we use one type of those, clustered SEs
clustered SEs
- Solution to issue that errors might be correlated over time: compute HAR- or Clustered-se’s
- Heteroskedasticity-and Autocorrelation-robust (also consistent, HAC)
- Allows for arbitrary correlation within clusters (entities i), but assumes no correlation across entities
- HAR se’s also consistent if no heteroskedasticity and/or no autocorrelation present
- HAR is biased, however, when number of entities is small (i.e. below 42), even with large T
- In Stata, e.g.:
regress Y X, vce(cluster entity)
- in the context of panel data, each cluster consists of an entity; thus clustered SEs allow for heteroskedasticity and for arbitrary autocorrelation within an entity but treat the errors as uncorrelated across entities
- if the number of entities n is large, inference using clustered SEs can proceed using the usual large-sample normal critical values for t-statistics and F critical values for F-statistics testing q restrictions
- Not correcting for autocorrelation, i.e. not clustering in panel data regression, leads to standard errors which are (usually) too low (you can see this in regression outputs by comparing SEs for a regression with and without clustering)
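- A sketch of the same mechanics in Python with statsmodels, on a made-up toy panel (far too few entities for clustering to be reliable in practice, but it shows the syntax):

```python
import pandas as pd
import statsmodels.formula.api as smf

# Hypothetical toy panel data.
df = pd.DataFrame({
    "state": ["AL"] * 3 + ["TX"] * 3 + ["CA"] * 3,
    "year":  [1982, 1983, 1984] * 3,
    "beer_tax": [1.5, 1.6, 1.7, 0.4, 0.5, 0.6, 0.9, 1.0, 1.1],
    "fatality_rate": [2.1, 2.0, 1.9, 2.5, 2.6, 2.4, 1.8, 1.7, 1.9],
})

# Entity and time fixed effects via dummies; SEs clustered by entity.
fit = smf.ols("fatality_rate ~ beer_tax + C(state) + C(year)", data=df).fit(
    cov_type="cluster", cov_kwds={"groups": df["state"]}
)
print(fit.bse["beer_tax"])  # compare with a non-clustered fit to see the gap
```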
when 0 is included in CI
hypothesis that the independent variable has no effect on y cannot be rejected at the x% significance level
Quasi experiments: when real experiments aren’t feasible
- having a control group is unethical (e.g. giving ill people a placebo medication)
- examining effects that rely on person-factors
- cannot randomly assign people to be introverted etc.
- any experiment examining person-factors is not a true experiment (because such factors cannot be randomly assigned)
confounding variable
- “extra” variable that you didn’t account for. They can ruin an experiment and give you useless results. Confounding variables are any other variables besides your independent variable that have an effect on your dependent variable
- example: when estimating the effect of activity level on weight gain, confounding variables would be age, how much you eat, etc.
- two major problems
- increase variance
- introduce bias
Confounding bias
- result of having confounding variables in your model. It has a direction, depending on whether it over- or underestimates the effects of your model:
- Positive confounding: observed association is biased away from the null, i.e. it overestimates the effect.
- Negative confounding: observed association is biased toward the null, i.e. it underestimates the effect.
how to reduce confounding variables
- Bias can be eliminated with random samples.
- Introduce control variables to control for confounding variables, e.g. control for age by only measuring 30 year olds
- Counterbalancing can be used if you have paired designs. In counterbalancing, half of the group is measured under condition 1 and half is measured under condition 2.
internal validity
way to measure if research is sound. It is related to how many confounding variables you have in your experiment
external vs. internal validity
Internal validity is a way to gauge how strong your research methods were. External validity helps to answer the question: can the research be applied to the “real world”?
things that can affect validity
- Regression to the mean. This means that subjects in the experiment with extreme scores will tend to move towards the average.
- Pre-testing subjects. This may have unexpected consequences as it may be impossible to tell how the pre-test and during-tests interact. If “logical reasoning” is your dependent variable, participants may get clues from the pre-test.
- Changing the instruments during the study.
- Participants dropping out of the study. This is usually a bigger threat for experimental designs with more than one group.
- Failure to complete protocols.
- Something unexpected changes during the experiment, affecting the dependent variable.
measurement error
- difference between a measured quantity and its true value. It includes random error (naturally occurring errors that are to be expected with any experiment) and systematic error (caused by a mis-calibrated instrument that affects all measurements).
- For example, let’s say you were measuring the weights of 100 marathon athletes. The scale you use is one pound off: this is a systematic error that will result in all athletes’ body weight calculations being off by a pound. On the other hand, let’s say your scale was accurate. Some might have wetter clothing or a 2 oz. candy bar in a pocket. These are random errors and are to be expected. In fact, all collected samples will have random errors; they are, for the most part, unavoidable.
different measures of error
- Absolute Error: the amount of error in your measurement. For example, if you step on a scale and it says 150 pounds but you know your true weight is 145 pounds, then the scale has an absolute error of 150 lbs – 145 lbs = 5 lbs.
- Greatest Possible Error: defined as one half of the measuring unit, e.g. if an instrument measures in whole yards, then the greatest possible error is one half yard.
- Instrument Error: error caused by an inaccurate instrument (like a scale that is off or a poorly worded questionnaire).
- Margin of Error: an amount above and below your measurement. For example, you might say that the average baby weighs 8 pounds with a margin of error of 2 pounds (± 2 lbs).
- Measurement Location Error: caused by an instrument being placed somewhere it shouldn’t, like a thermometer left out in the sun
- Operator Error: human factors that cause error, like reading a scale incorrectly.
- Percent Error: another way of expressing measurement error. Defined as: percent-error = (measured value – actual value)/actual value
- Relative Error: the ratio of the absolute error to the accepted measurement. As a formula, that’s: E(relative) = E(absolute)/E(measured)
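- A small sketch collecting the three formula-based measures above as Python helpers (the function names are mine, and sign/absolute-value conventions for percent error vary):

```python
def absolute_error(measured, actual):
    return abs(measured - actual)

def percent_error(measured, actual):
    # (measured - actual) / actual, as defined above.
    return (measured - actual) / actual

def relative_error(measured, actual):
    # absolute error divided by the measured value, as defined above.
    return absolute_error(measured, actual) / measured

# The scale example from above: reads 150 lbs, true weight 145 lbs.
print(absolute_error(150, 145))   # 5
print(percent_error(150, 145))    # ~0.0345, i.e. about 3.4%
print(relative_error(150, 145))   # 5/150, ~0.033
```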
ways to reduce measurement error
- Double check all measurements & formulas
- Make sure observers are well trained.
- Make the measurement with the instrument that has the highest precision.
- Take measurements under controlled conditions.
- Pilot test your measuring instruments, e.g. put together a focus group and ask how easy or difficult the questions were to understand.
- Use multiple measures for the same construct. For example, if you are testing for depression, use two different questionnaires.
statistical procedures to assess measurement error
- Standard error of measurement (SEM): estimates how repeated measurements taken on the same instrument are distributed around the true score.
- Coefficient of variation (CV): a measure of the variability of a distribution of repeated scores or measurements. Smaller values indicate a smaller variation and therefore values closer to the true score.
- Limits of agreement (LOA): gives an estimate of the interval within which a given proportion of the differences between measurements lies.
simultaneity bias
- where the explanatory variable is jointly determined with the dependent variable, i.e. X causes Y but Y also causes X. It is one cause of endogeneity (the other two are omitted variables and measurement error).
- A similar bias is reverse causation, where Y causes X (but X does not cause Y).
- Simultaneity bias is a term for the unexpected results that happen when the explanatory variable is correlated with the regression error term, ε (sometimes called the residual disturbance term), because of simultaneity. It’s so similar to omitted variables bias that the distinction between the two is often very unclear and in fact, both types of bias can be present in the same equation.
- The standard way to deal with this type of bias is with IV regression (e.g. two stage least squares).
simultaneity bias causes
- Changes in a RHS variable are causing changes in a LHS variable.
- Variables on LHS and RHS are jointly determined.
reverse causality
Instead of X causing a change in Y, it is really the other way around: Y is causing changes in X
estimator properties
multicollinearity
- occurs when there are high correlations between two or more predictor variables. In other words, one predictor variable can be used to predict the other. This creates redundant information, skewing the results in a regression model.
- Examples: a person’s height and weight, age and sales price of a car
how to detect multicollinearity
- calculate correlation coefficients for all pairs of predictor variables. If the correlation coefficient, r, is exactly +1 or -1, this is called perfect multicollinearity. If r is close to or exactly -1 or +1, one of the variables should be removed from the model if at all possible.
- Variance inflation factor (VIF)
consequences of multicollinearity
- The partial regression coefficient may be an imprecise estimate; SEs may be very large.
- Partial regression coefficients may have sign and/or magnitude changes as they pass from sample to sample.
- makes it difficult to gauge the effect of independent variables on dependent variables
- The t-statistic will generally be very small, i.e. insignificant, and coefficient CIs will be very wide. This means that it is harder to reject the null hypothesis.
- Coefficient estimates sensitive to minor changes in model specification
reasons for multicollinearity
- Data-based multicollinearity: caused by poorly designed experiments, data that is 100% observational, or data collection methods that cannot be manipulated. In some cases, variables may be highly correlated (usually due to collecting data from purely observational studies) and there is no error on the researcher’s part. For this reason, you should conduct experiments whenever possible, setting the level of the predictor variables in advance.
- Structural multicollinearity: caused by you, the researcher, creating new predictor variables.
- Dummy variables may be incorrectly used. For example, the researcher may fail to exclude one category, or add a dummy variable for every category (e.g. spring, summer, autumn, winter).
- Including a variable in the regression that is actually a combination of two other variables, e.g. including “total investment income” when total investment income = income from stocks and bonds + income from savings interest.
- Including two (almost) identical variables, e.g. weight in pounds and weight in kilos
- Insufficient data. In some cases, collecting more data can resolve the issue.
heteroskedasticity
- The variance of the error term changes in response to a change in the value of the independent variables, i.e. the variance of the conditional distribution of u given X is not constant.
- example: if x is the socioeconomic class of the father and y is the earnings of the son, homoskedasticity implies that the variance of the error term is the same for people whose father is from a higher socioeconomic class as for those whose father’s socioeconomic classification was lower
- Heteroscedastic data tends to follow a cone shape on a scatter graph.
- if you’re running any kind of regression analysis, having data that shows heteroscedasticity can ruin your results (at the very least, it will give you biased standard errors).
- In regression, an error is how far a point deviates from the regression line. Ideally, your data should be homoscedastic (i.e. the variance of the errors should be constant). This rarely happens. Most data is heteroscedastic by nature, e.g. predicting women’s weight from their height. In a Stepford Wives world, where everyone is a perfect dress size 6, this would be easy: short women weigh less than tall women. But it’s practically impossible to predict weight from height. Younger women (in their teens) tend to weigh less, while post-menopausal women often gain weight. But women of all shapes and sizes exist over all ages. This creates a cone shaped graph for variability. Plotting variation of women’s height/weight would result in a funnel that starts off small and spreads out as you move to the right of the graph. However, the cone can be in either direction:
- Cone spreads out to the right: small values of X give a small scatter while larger values of X give a larger scatter with respect to Y.
- Cone spreads out to the left: small values of X give a large scatter while larger values of X give a smaller scatter with respect to Y.
how to detect heteroskedasticity
- A residual plot can suggest (but not prove) heteroscedasticity. Residual plots are created by:
- Calculating the squared residuals.
- Plotting the squared residuals against an explanatory variable (one that you think is related to the errors).
- Make a separate plot for each explanatory variable you think is contributing to the errors.
- Several tests can also be run:
- Park Test
- White Test
- Goldfeld-Quandt test
- Breusch-Pagan test
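- statsmodels ships Breusch-Pagan and White tests; a minimal sketch on simulated heteroskedastic data:

```python
import numpy as np
import statsmodels.api as sm
from statsmodels.stats.diagnostic import het_breuschpagan, het_white

rng = np.random.default_rng(3)
x = rng.uniform(1, 10, 300)
y = 1 + 2 * x + rng.normal(0, x)   # heteroskedastic by construction
X = sm.add_constant(x)
res = sm.OLS(y, X).fit()

# Both tests have a null hypothesis of homoskedasticity.
lm_stat, lm_pval, f_stat, f_pval = het_breuschpagan(res.resid, X)
w_stat, w_pval, wf_stat, wf_pval = het_white(res.resid, X)
print(lm_pval, w_pval)  # small p-values -> reject homoskedasticity
```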
consequences of heteroskedasticity
- OLS will not give you the estimator with the smallest variance (i.e. your estimators will no longer be efficient).
- Significance tests will run either too high or too low.
- Standard errors will be biased, along with their corresponding test statistics and confidence intervals.
how to deal with heteroskedastic data
- Give data that produces a large scatter less weight, i.e. weighted least squares
- Transform the Y variable to achieve homoscedasticity. For example, use the Box-Cox normality plot to transform the data.
- robust standard errors
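- Weighted least squares and robust SEs are sketched earlier (after the heteroskedasticity solutions card). For the transformation approach, scipy can estimate a Box-Cox transform; a sketch on simulated skewed data:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(4)
y = np.exp(rng.normal(2, 0.5, 500))   # right-skewed; spread grows with level

# Box-Cox picks the power transform that makes y most nearly normal;
# an estimated lambda near 0 corresponds to a log transform.
y_transformed, lmbda = stats.boxcox(y)
print(lmbda)
```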
how to deal with multicollinearity
- Collect additional data.
- Re-specify the model.
- Drop redundant variables
autocorrelation
An identifiable relationship (positive or negative) exists between the values of the error in one period and the values of the error in another period.
autocorrelation consequences
- Inefficient coefficient estimates
- Biased standard errors
- Unreliable hypothesis tests
how to detect autocorrelation
- Geary or runs test
- Durbin-Watson test
- Breusch-Godfrey test
how to deal with autocorrelation
- Cochrane-Orcutt transformation
- Prais-Winsten transformation
- Newey-West robust standard errors
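- A minimal sketch of the Durbin-Watson and Breusch-Godfrey checks with statsmodels, on simulated AR(1) errors (Newey-West itself is sketched after the earlier autocorrelation card):

```python
import numpy as np
import statsmodels.api as sm
from statsmodels.stats.stattools import durbin_watson
from statsmodels.stats.diagnostic import acorr_breusch_godfrey

rng = np.random.default_rng(5)
T = 300
x = rng.normal(size=T)
u = np.zeros(T)
for t in range(1, T):
    u[t] = 0.6 * u[t - 1] + rng.normal()   # AR(1) errors
y = 1 + 2 * x + u

res = sm.OLS(y, sm.add_constant(x)).fit()

# Durbin-Watson: ~2 means no autocorrelation; well below 2 suggests positive,
# well above 2 negative autocorrelation.
print(durbin_watson(res.resid))

# Breusch-Godfrey: null of no serial correlation up to the chosen lag order.
lm_stat, lm_pval, f_stat, f_pval = acorr_breusch_godfrey(res, nlags=2)
print(lm_pval)
```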
intuition behind variance & bias
covariance vs. correlation
- covariance: measure used to indicate the extent to which two random variables change in tandem.
- correlation: measure used to represent how strongly two random variables are related
- Covariance and correlation measure the same underlying relationship; correlation is the scaled (standardized) form of covariance.
- The value of correlation lies between -1 and +1. Conversely, the value of covariance lies between -∞ and +∞.
- Correlation is not affected by a change in scale, but covariance is, i.e. if all the values of one variable are multiplied by a constant, and all the values of the other variable are multiplied by the same or a different constant, then the covariance changes.
- Correlation is dimensionless, i.e. it is a unit-free measure of the relationship between variables, unlike covariance, whose value carries the product of the units of the two variables.
- Covariances are hard to compare: when you calculate the covariance of a set of heights and weights, as expressed in meters and kilograms, you will get a different covariance from when you do it in other units, but also, it will be hard to tell if (e.g.) height and weight ‘covary more’ than, say the length of your toes and fingers, simply because the ‘scale’ the covariance is calculated on is different.
- The solution to this is to ‘normalize’ the covariance: you divide the covariance by something that represents the diversity and scale in both the covariates, and end up with a value that is assured to be between -1 and 1: the correlation. Whatever unit your original variables were in, you will always get the same result, and this will also ensure that you can, to a certain degree, compare whether two variables ‘correlate’ more than two others, simply by comparing their correlation.
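- A quick numerical illustration of the unit-dependence of covariance versus the invariance of correlation (simulated heights and weights):

```python
import numpy as np

rng = np.random.default_rng(6)
height_m = rng.normal(1.7, 0.1, 1000)
weight_kg = 60 + 40 * (height_m - 1.7) + rng.normal(0, 5, 1000)

# Covariance depends on units: switching meters to centimeters scales it by 100.
print(np.cov(height_m, weight_kg)[0, 1])
print(np.cov(height_m * 100, weight_kg)[0, 1])

# Correlation is the normalized covariance and is unchanged by the unit switch.
print(np.corrcoef(height_m, weight_kg)[0, 1])
print(np.corrcoef(height_m * 100, weight_kg)[0, 1])
```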
what is a hypothesis
an educated guess about something in the world around you. It should be testable, either by experiment or observation.
a good hypothesis should contain:
- Include an “if” and “then” statement
- Include both the independent and dependent variables.
- Be testable by experiment, survey or other scientifically sound technique.
- Be based on information in prior research (either yours or someone else’s).
- Have design criteria (for engineering or programming projects).
hypothesis testing
- a way for you to test the results of a survey or experiment to see if you have meaningful results. You’re basically testing whether your results are valid by figuring out the odds that your results have happened by chance. If your results may have happened by chance, the experiment won’t be repeatable and so has little use.
- approach
- Figure out your null hypothesis,
- State your null hypothesis,
- Choose what kind of test you need to perform,
- Either support or reject the null hypothesis.
what is a null hypothesis?
- set the null hypothesis to the outcome you do not want to be true, i.e. the outcome whose direct opposite you want to show.
- Basic example: Suppose you have developed a new medical treatment and you want to show that it is indeed better than placebo. So you set the null hypothesis H0: the new treatment is equal to or worse than placebo, and the alternative hypothesis H1: the new treatment is better than placebo.
- This is because in the course of a statistical test you either reject the null hypothesis (and favor the alternative hypothesis) or you cannot reject it. Since your “goal” is to reject the null hypothesis, you set it to the outcome you do not want to be true.
- The null hypothesis, H0 is the commonly accepted fact; it is the opposite of the alternate hypothesis. Researchers work to reject, nullify or disprove the null hypothesis. Researchers come up with an alternate hypothesis, one that they think explains a phenomenon, and then work to reject the null hypothesis.
- null comes from nullifiable, i.e. something you can invalidate
p-value
- it’s the smallest significance level at which the null hypothesis could be rejected
- used in hypothesis testing to help you support or reject the null hypothesis. The p value is the evidence against a null hypothesis. The smaller the p-value, the stronger the evidence that you should reject the null hypothesis, i.e. a p-value of 0.02 (2%) means that, if the null hypothesis were true, you would see results at least this extreme only 2% of the time
- p-value is the probability of drawing a statistic at least as adverse to the null hypothesis as the one you actually computed. Equivalently, the p-value is the smallest significance level at which you can reject the null hypothesis.
- When you run a hypothesis test, you compare the p value from your test to the alpha level you selected when you ran the test. Alpha levels can also be written as percentages.
- Graphically, the p value is the area in the tail of a probability distribution: the area beyond the test statistic (for a right-tailed test, the area to its right; for a two-tailed test, the area in both tails).
p-value vs alpha level
- Alpha levels are controlled by the researcher and are related to confidence levels. You get an alpha level by subtracting your confidence level from 100%, e.g. if you want to be 98% confident in your research, the alpha level would be 2%. When you run the hypothesis test, the test will give you a value for p. Compare that value to your chosen alpha level, e.g. say you chose alpha=5%. If the results from the test give you:
- A small p (≤ 0.05), reject the null hypothesis. This is strong evidence that the null hypothesis is invalid.
- A large p (> 0.05) means the evidence against the null is weak, so you do not reject the null.
p-values and critical values
- The p value is just one piece of information you can use when deciding if your null hypothesis is true or not. You can use other values given by your test to help you decide, e.g. if you run an f test two sample for variances, you’ll get a p value, an f-critical value and a f-value.
- Large p-value -> do not reject the null. However, there’s also another way you can decide: compare your f-value with your f-critical value. If the f-critical value is smaller than the f-value, you should reject the null hypothesis
critical value
A critical value is a line on a graph that splits the graph into sections. One or two of the sections is the “rejection region”; if your test value falls into that region, then you reject the null hypothesis.
It’s the value of the statistic for which the test just rejects the null hypothesis at the given significance level
critical value of z
- is a term linked to the area under the standard normal model. Critical values can tell you what probability any particular variable will have.
- the graph has two regions
- Central region: The z-score is equal to the number of sds from the mean. A score of 1.28 indicates that the variable is 1.28 sds from the mean. If you look in the z-table for a z of 1.28, you’ll find the area is .3997. This is the area between the mean and z, so you’ll double it to get the area of the entire central region: .3997*2 = .7994, or about 80%.
- Tail region: The area of the tails is 1 minus the central region. In this example, 1 - .7994 ≈ .20, or about 20 percent. The tail regions are sometimes calculated when you want to know how many variables would be less than or more than a certain figure.
when are critical values of z used?
A critical value of z (Z-score) is used when the sampling distribution is normal, or close to normal. Z-scores are used when the population standard deviation is known or when you have larger sample sizes. While the z-score can also be used to calculate probability for unknown standard deviations and small samples, many statisticians prefer to use the t distribution to calculate these probabilities.
other uses of z-score
- Every statistic has a probability, and every probability calculated for a sample has a margin of error. The critical value of z can also be used to calculate the margin of error.
- Margin of error = Critical value * Standard deviation of the statistic
- Margin of error = Critical value * Standard error of the sample
finding z-score for a CI example
- Find a critical value for a 90% confidence level (Two-Tailed Test).
- Step 1: Subtract the confidence level from 100% to find the α level: 100% – 90% = 10%.
- Step 2: Convert Step 1 to a decimal: 10% = 0.10.
- Step 3: Divide Step 2 by 2 (this is called “α/2”): 0.10 / 2 = 0.05. This is the area in each tail.
- Step 4: Subtract Step 3 from 1 (because we want the area in the middle, not the area in the tail):
- 1 – 0.05 = .95.
- Step 5: Look up the area from Step 4 in the z-table. The area is at z = 1.645. This is your critical value for a confidence level of 90%
find a critical value: two-sided test
- Find the critical value for alpha of .05.
- Step 1: Subtract alpha from 1: 1 – .05 = .95
- Step 2: Divide Step 1 by 2 (because we are looking for a two-tailed test): .95 / 2 = .475
- Step 3: Look at your z-table and locate the answer from Step 2 in the middle section of the z-table.
- Step 4: In this example, you should have found the number .4750. Look to the far left of the row, you’ll see the number 1.9; look to the top of the column, you’ll see .06. Add them together to get 1.96. That’s the critical value!
- Tip: The critical value appears twice in the z table because you’re looking for both a left hand and a right hand tail, so don’t forget to add the plus or minus sign! In our example you’d get ±1.96.
find a critical value: right-tailed test
- Find a critical value in the z-table for an alpha level of 0.0079.
- Step 1: Draw a diagram and shade in the area in the right tail. This area represents alpha, α. A diagram helps you to visualize what area you are looking for (i.e. whether you want an area to the right of the mean or to the left of the mean).
- Step 2: Subtract alpha (α) from 0.5: 0.5-0.0079 = 0.4921.
- Step 3: Find the result from step 2 in the center part of the z-table: The closest area to 0.4921 is 0.4922 at z=2.42.
find a critical value: left-sided test
- find the critical value in the z-table for α=.012 (left-tailed test).
- Step 1: Draw a diagram and shade in the area in the left tail (because you’re looking for a critical value for a left-tailed test). This area represents alpha, α.
- Step 2: Subtract alpha (α) from 0.5: 0.5 – 0.012 = 0.488.
- Step 3: Find the result from step 2 in the center part of the z-table. The closest area to 0.488 is at z=2.26. If you can’t find the exact area, just find the closest number and read the z value for that number.
- Step 4: Add a negative sign to Step 3 (left-tail critical values are always negative): -2.26.
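- All of the z lookups above can be reproduced with the inverse normal CDF instead of a printed z-table; a scipy sketch:

```python
from scipy.stats import norm

# Two-tailed test, alpha = 0.05: area 0.025 in each tail.
print(norm.ppf(1 - 0.05 / 2))   # 1.96

# 90% confidence level (two-tailed): alpha/2 = 0.05 in each tail.
print(norm.ppf(1 - 0.10 / 2))   # 1.645

# Right-tailed test, alpha = 0.0079.
print(norm.ppf(1 - 0.0079))     # ~2.41

# Left-tailed test, alpha = 0.012 (negative by construction).
print(norm.ppf(0.012))          # ~-2.26
```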
types of critical values
- Various types of critical values are used to calculate significance, including: t scores from student’s t-tests, chi-square, and z-tests. In each of these tests, you’ll have an area where you are able to reject the null hypothesis, and an area where you cannot. The line that separates these two regions is where your critical values are.
- For example, the critical values might be at 1.28 and -1.28: the central region between them is where you cannot reject the null hypothesis, and the tail regions are where you can reject the null hypothesis. How large these areas actually are (and what test you use) depends on many factors, including your chosen confidence level and your sample size.
- Significance testing is used to figure out if your results differ from the null hypothesis. The null hypothesis is just an accepted fact about the population.
what is a t test?
- tells you how significant the differences between groups are, i.e. lets you know if those differences (measured in means/averages) could have happened by chance.
- example: Let’s say you have a cold and you try a naturopathic remedy. Your cold lasts a couple of days. The next time you have a cold, you buy an over-the-counter pharmaceutical and the cold lasts a week. You survey your friends and they all tell you that their colds were of a shorter duration (an average of 3 days) when they took the naturopathic remedy. What you really want to know is, are these results repeatable? A t test can tell you by comparing the means of the two groups and letting you know the probability of those results happening by chance.
t score
- ratio between the difference between two groups and the difference within the groups. The larger the t score, the more difference there is between groups. The smaller the t score, the more similarity there is between groups. A t score of 3 means that the groups are three times as different from each other as they are within each other. When you run a t test, the bigger the t-value, the more likely it is that the results are repeatable.
types of t test
- An Independent Samples t-test compares the means for two groups.
- A Paired sample t-test compares means from the same group at different times (say, one year apart).
- A One sample t-test tests the mean of a single group against a known mean.
- You probably don’t want to calculate the test by hand (the math can get very messy)
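- In practice all three tests are one-liners in scipy (simulated data; the group names are hypothetical):

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(7)
group_a = rng.normal(5.0, 1.0, 30)
group_b = rng.normal(5.5, 1.0, 30)
before = rng.normal(120, 10, 25)
after = before - rng.normal(3, 2, 25)   # paired measurements, same subjects

print(stats.ttest_ind(group_a, group_b))   # independent samples t-test
print(stats.ttest_rel(before, after))      # paired samples t-test
print(stats.ttest_1samp(group_a, 5.0))     # one sample t-test vs a known mean
```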
paired t test
- A paired t test (also called a correlated pairs t-test, a paired samples t test or dependent samples t test) is where you run a t test on dependent samples. Dependent samples are essentially connected — they are tests on the same person or thing. For example:
- Knee MRI costs at two different hospitals,
- Two tests on the same person before and after training,
- Two blood pressure measurements on the same person using different equipment.
When to Choose a Paired T Test / Paired Samples T Test / Dependent Samples T Test
- Choose the paired t-test if you have two measurements on the same item, person or thing. You should also choose this test if you have two items that are being measured with a unique condition. For example, you might be measuring car safety performance in Vehicle Research and Testing and subject the cars to a series of crash tests. Although the manufacturers are different, you might be subjecting them to the same conditions.
- With a “regular” two sample t test, you’re comparing the means for two different samples, e.g. you might test two different groups of customer service associates on a business-related test, or test students from two universities on their English skills. If you take a random sample from each group separately and they have different conditions, your samples are independent and you should run an independent samples t test (also called between-samples and unpaired-samples).
- The null hypothesis for the independent samples t-test is μ1 = μ2. In other words, it assumes the means are equal. With the paired t test, the null hypothesis is that the mean of the pairwise differences between the two tests is zero (H0: µd = 0). The difference between the two tests is very subtle; which one you choose is based on your data collection method.
One tailed test or two in Hypothesis Testing
- In hypothesis testing, you are asked to decide if a claim is true or not. For example, if someone says “all Floridians have a 50% increased chance of melanoma”, it’s up to you to decide if this claim holds merit. One of the first steps is to look up a z-score, and in order to do that, you need to know if it’s a one tailed test or two. You can figure this out in just a couple of steps.
- Example question #1: A government official claims that the dropout rate for local schools is 25%. Last year, 190 out of 603 students dropped out. Is there enough evidence to reject the government official’s claim?
- Example question #2: A government official claims that the dropout rate for local schools is less than 25%. Last year, 190 out of 603 students dropped out. Is there enough evidence to reject the government official’s claim?
- Step 1: Read the question.
- Step 2: Rephrase the claim in the question with an equation. In example question #1, Drop out rate = 25%. In example question #2, Drop out rate < 25%
- Step 3: If step 2 has an equals sign in it, this is a two-tailed test. If it has > or < it is a one-tailed test.
t critical value
- A T critical value is a “cut off point” on the t distribution. It’s almost identical to the Z critical value (which cuts off an area on the normal distribution); the only real difference is that the t distribution has a different shape than the normal distribution, which results in slightly different values for cut off points.
- You’ll use your t value in a hypothesis test to compare against a calculated t score. This helps you to decide if you should support or reject a null hypothesis.
how to find a t critical value
- Subtract one from your sample size. This is your df, or degrees of freedom. For example, if the sample size is 8, then your df is 8 – 1 = 7.
- Choose an alpha level. The alpha level is usually given to you in the question — the most common one is 5% (0.05).
- Choose either the one tailed T Distribution table or two tailed T Distribution table. This depends on if you’re running a one tailed test or two.
- Look up the df in the left hand side of the t-distribution table and the alpha level along the top row. Find the intersection of the row and column. For this example (7 df, α = .05,) the t crit value is 1.895.
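- The same lookup with scipy’s inverse t CDF:

```python
from scipy.stats import t

# One-tailed critical value for df = 7, alpha = 0.05 (the example above).
print(t.ppf(1 - 0.05, df=7))        # 1.895

# Two-tailed equivalent: put alpha/2 in each tail.
print(t.ppf(1 - 0.05 / 2, df=7))    # 2.365
```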
f test
- An “F Test” is a catch-all term for any test that uses the F-distribution. In most cases, when people talk about the F-Test, what they are actually talking about is The F-Test to Compare Two Variances. However, the f-statistic is used in a variety of tests including regression analysis, the Chow test and the Scheffe Test (a post-hoc ANOVA test).
- General steps for an f test: If you’re running an F Test using technology (for example, an F Test two sample for variances in Excel), the only steps you really need to do are Step 1 and 4 (dealing with the null hypothesis). Technology will calculate Steps 2 and 3 for you.
- State the null hypothesis and the alternate hypothesis.
- Calculate the F value. The F Value is calculated using the formula F = ((SSE1 − SSE2) / m) / (SSE2 / (n − k)), where SSE = residual sum of squares, m = number of restrictions and k = number of independent variables.
- Find the F critical value for the test in the F-Table. (For ANOVA, the F statistic itself is the variance of the group means divided by the mean of the within-group variances.)
- Support or Reject the Null Hypothesis.
F Test to Compare Two Variances
- A Statistical F Test uses an F Statistic to compare two variances, s²1 and s²2, by dividing them. The result is always a positive number (because variances are always positive). The equation for comparing two variances with the f-test is:
- F = s²1 / s²2
- If the variances are equal, the ratio of the variances will equal 1. For example, if you had two data sets with a sample 1 (variance of 10) and a sample 2 (variance of 10), the ratio would be 10/10 = 1.
- You always test that the population variances are equal when running an F Test. In other words, you always assume that the ratio of the variances is equal to 1. Therefore, your null hypothesis will always be that the variances are equal.
- Assumptions: Several assumptions are made for the test. Your population must be approximately normally distributed (i.e. fit the shape of a bell curve) in order to use the test. Plus, the samples must be independent events. In addition, you’ll want to bear in mind a few important points:
- The larger variance should always go in the numerator (the top number) to force the test into a right-tailed test. Right-tailed tests are easier to calculate.
- For two-tailed tests, divide alpha by 2 before finding the right critical value.
- If you are given standard deviations, they must be squared to get the variances.
- If your degrees of freedom aren’t listed in the F Table, use the larger critical value. This helps to avoid the possibility of Type I errors.
how to do f test
- If you are given standard deviations, go to Step 2. If you are given variances to compare, go to Step 3.
- Square both standard deviations to get the variances. For example, if σ1 = 9.6 and σ2 = 10.9, then the variances (s²1 and s²2) would be 9.6² = 92.16 and 10.9² = 118.81.
- Take the largest variance, and divide it by the smallest variance to get the f-value. For example, if your two variances were s1 = 2.5 and s2 = 9.4, divide 9.4 / 2.5 = 3.76. Why? Placing the largest variance on top will force the F-test into a right tailed test, which is much easier to calculate than a left-tailed test.
- Find your degrees of freedom. Degrees of freedom is your sample size minus 1. As you have two samples (variance 1 and variance 2), you’ll have two degrees of freedom: one for the numerator and one for the denominator.
- Look at the f-value you calculated in Step 3 in the f-table. Note that there are several tables, so you’ll need to locate the right table for your alpha level.
- Compare your calculated value (Step 3) with the table f-value in Step 5. If the f-table value is smaller than the calculated value, you can reject the null hypothesis.
two-tailed f test
- The difference between running a one or two tailed F test is that the alpha level needs to be halved for two tailed F tests.
- With a two tailed F test, you just want to know if the variances are not equal to each other. In notation:
- Ha: σ²1 ≠ σ²2
- Sample problem: Conduct a two tailed F Test on the following samples:
- Sample 1: Variance = 109.63, sample size = 41.
- Sample 2: Variance = 65.99, sample size = 21.
- Step 1: Write your hypothesis statements:
- Ho: No difference in variances.
- Ha: Difference in variances.
- Step 2: Calculate your F value. Put the highest variance as the numerator and the lowest variance as the denominator: F = variance 1 / variance 2 = 109.63 / 65.99 = 1.66
- Step 3: Calculate the degrees of freedom: The degrees of freedom in the table will be the sample size -1, so:
- Sample 1 has 40 df (the numerator).
- Sample 2 has 20 df (the denominator).
- Step 4: Choose an alpha level. No alpha was stated in the question, so use 0.05 (the standard “go to” in statistics). This needs to be halved for the two-tailed test, so use 0.025.
- Step 5: Find the critical F Value using the F Table. There are several tables, so make sure you look in the alpha = .025 table. Critical F (40,20) at alpha (0.025) = 2.287.
- Step 6: Compare your calculated value (Step 2) to your table value (Step 5). If your calculated value is higher than the table value, you can reject the null hypothesis:
- F calculated value: 1.66
- F value from table: 2.287.
- 1.66 < 2.287.
- So we cannot reject the null hypothesis.
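- The same problem checked with scipy instead of an F table:

```python
from scipy.stats import f

F = 109.63 / 65.99            # larger variance in the numerator
df1, df2 = 40, 20             # numerator and denominator degrees of freedom

crit = f.ppf(1 - 0.025, df1, df2)   # two-tailed test at alpha = 0.05
print(F, crit)                      # 1.66 < 2.287 -> cannot reject the null

# Two-tailed p-value for the variance ratio.
print(2 * f.sf(F, df1, df2))
```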