Econometrics Flashcards

1
Q

Acceptance region

A

The set of values of a test statistic for which the null hypothesis is accepted (is not rejected).

2
Q

Adjusted R2 (R̄2)

A

A modified version of R2 that does not necessarily increase when a new regressor is added to the regression.
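A minimal sketch (not from the source) of how the penalty works, using the standard formula R̄² = 1 − [(n−1)/(n−k−1)]·SSR/TSS for n observations and k regressors; the sums of squares here are made-up illustrative values.

```python
# Illustrative sketch: R^2 and adjusted R^2 from a regression's sums of squares.
# The inputs (ssr, tss, n, k) are hypothetical values, not from any real data set.
def r_squared(ssr, tss):
    # R^2 = 1 - SSR/TSS: fraction of sample variance explained
    return 1 - ssr / tss

def adjusted_r_squared(ssr, tss, n, k):
    # Adjusted R^2 = 1 - [(n-1)/(n-k-1)] * SSR/TSS; the factor (n-1)/(n-k-1) > 1
    # penalizes extra regressors, so R-bar^2 need not rise when one is added
    return 1 - (n - 1) / (n - k - 1) * ssr / tss

r2 = r_squared(ssr=30.0, tss=100.0)
r2_bar = adjusted_r_squared(30.0, 100.0, n=50, k=3)
print(r2, r2_bar)  # adjusted value is always below plain R^2 when k > 0
```
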

3
Q

ADL(p,q)

A

See autoregressive distributed lag model.

4
Q

AIC

A

See information criterion.

5
Q

Akaike information criterion

A

See information criterion.

6
Q

Alternative hypothesis

A

The hypothesis that is assumed to be true if the null hypothesis is false. The alternative hypothesis is often denoted H1.

7
Q

AR(p)

A

See autoregression.

8
Q

ARCH

A

See autoregressive conditional heteroskedasticity.

9
Q

Asymptotic distribution

A

The approximate sampling distribution of a random variable computed using a large sample. For example, the asymptotic distribution of the sample average is normal.

10
Q

Asymptotic normal distribution

A

A normal distribution that approximates the sampling distribution of a statistic computed using a large sample.

11
Q

Attrition

A

The loss of subjects from a study after assignment to the treatment or control group.

12
Q

Augmented Dickey-Fuller (ADF) test

A

A regression-based test for a unit root in an AR(p) model.

13
Q

Autocorrelation

A

The correlation between a time series variable and its lagged value. The jth autocorrelation of Y is the correlation between Yt and Yt−j.
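A plain-Python sketch (an assumption, not from the source) of one simple version of the jth sample autocorrelation: the sample correlation between the pairs (Yt, Yt−j).

```python
# Illustrative sketch: j-th sample autocorrelation as corr(Y_t, Y_{t-j}).
# (Textbook estimators often use a common mean/variance; this simple pairwise
# version is an assumption made for clarity.)
from statistics import mean, pstdev

def autocorrelation(y, j):
    current, lagged = y[j:], y[:-j]  # aligned pairs (Y_t, Y_{t-j})
    mc, ml = mean(current), mean(lagged)
    cov = mean((a - mc) * (b - ml) for a, b in zip(current, lagged))
    return cov / (pstdev(current) * pstdev(lagged))

trend = list(range(10))          # steadily rising series
print(autocorrelation(trend, 1))  # close to 1: high positive persistence
```
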

14
Q

Autocovariance

A

The covariance between a time series variable and its lagged value. The jth autocovariance of Y is the covariance between Yt and Yt−j.

15
Q

Autoregression

A

A linear regression model that relates a time series variable to its past (that is, lagged) values. An autoregression with p lagged values as regressors is denoted AR(p).

16
Q

Autoregressive conditional heteroskedasticity (ARCH)

A

A time series model of conditional heteroskedasticity.

17
Q

Autoregressive distributed lag model

A

A linear regression model in which the time series variable Yt is expressed as a function of lags of Yt and of another variable, Xt. The model is denoted ADL(p,q), where p denotes the number of lags of Yt and q denotes the number of lags of Xt.

18
Q

Average causal effect

A

The population average of the individual causal effects in a heterogeneous population. Also called the average treatment effect.

19
Q

Balanced panel

A

A panel data set with no missing observations, that is, in which the variables are observed for each entity and each time period.

20
Q

Base specification

A

A baseline or benchmark regression specification that includes a set of regressors chosen using a combination of expert judgment, economic theory, and knowledge of how the data were collected.

21
Q

Bayes information criterion

A

See information criterion.

22
Q

Bernoulli distribution

A

The probability distribution of a Bernoulli random variable.

23
Q

Bernoulli random variable

A

A random variable that takes on two values, 0 and 1.

24
Q

Best linear unbiased estimator

A

An estimator that has the smallest variance of any estimator that is a linear function of the sample values Y and is unbiased. Under the Gauss-Markov conditions, the OLS estimator is the best linear unbiased estimator of the regression coefficients conditional on the values of the regressors.

25
Bias
The expected value of the difference between an estimator and the parameter that it is estimating. If μ̂Y is an estimator of μY, then the bias of μ̂Y is E(μ̂Y) − μY.
26
BIC
See information criterion.
27
Binary variable
A variable that is either 0 or 1. A binary variable is used to indicate a binary outcome. For example, X is a binary (or indicator, or dummy) variable for a person's gender if X = 1 if the person is female and X = 0 if the person is male.
28
Bivariate normal distribution
A generalization of the normal distribution to describe the joint distribution of two random variables.
29
BLUE
See best linear unbiased estimator.
30
Break date
The date of a discrete change in population time series regression coefficient(s).
31
Causal effect
The expected effect of a given intervention or treatment as measured in an ideal randomized controlled experiment.
32
Central limit theorem
A result in mathematical statistics that says that, under general conditions, the sampling distribution of the standardized sample average is well approximated by a standard normal distribution when the sample size is large.
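A small simulation sketch of the idea (the uniform draws and sample sizes are assumptions for illustration): standardized sample averages of i.i.d. Uniform(0,1) draws behave approximately like N(0,1) draws when n is large.

```python
# Simulation sketch of the central limit theorem: standardize the sample
# average of n i.i.d. Uniform(0,1) draws and check it looks like N(0,1).
import random
from math import sqrt
from statistics import mean, pstdev

random.seed(0)
n, reps = 200, 1000
mu, sigma = 0.5, sqrt(1 / 12)  # population mean and sd of Uniform(0,1)

z = []
for _ in range(reps):
    ybar = mean(random.random() for _ in range(n))
    z.append((ybar - mu) / (sigma / sqrt(n)))  # standardized sample average

print(mean(z), pstdev(z))  # both should be near 0 and 1, respectively
```
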
33
Chi-squared distribution
The distribution of the sum of m squared independent standard normal random variables. The parameter m is called the degrees of freedom of the chi-squared distribution.
34
Chow test
A test for a break in a time series regression at a known break date.
35
Coefficient of determination
See R2.
36
Cointegration
When two or more time series variables share a common stochastic trend.
37
Common trend
A trend shared by two or more time series.
38
Conditional distribution
The probability distribution of one random variable given that another random variable takes on a particular value.
39
Conditional expectation
The expected value of one random variable given that another random variable takes on a particular value.
40
Conditional heteroskedasticity
The situation in which the variance, usually of an error term, depends on other variables.
41
Conditional mean
The mean of a conditional distribution; see conditional expectation.
42
Conditional mean independence
The conditional expectation of the regression error ui, given the regressors, depends on some but not all of the regressors.
43
Conditional variance
The variance of a conditional distribution.
44
Confidence interval (or confidence set)
An interval (or set) that contains the true value of a population parameter with a prespecified probability when computed over repeated samples.
45
Confidence level
The prespecified probability that a confidence interval (or set) contains the true value of the parameter.
46
Consistency
Means that an estimator is consistent. See consistent estimator.
47
Consistent estimator
An estimator that converges in probability to the parameter that it is estimating.
48
Constant regressor
The regressor associated with the regression intercept; this regressor is always equal to 1.
49
Constant term
The regression intercept.
50
Continuous random variable
A random variable that can take on a continuum of values.
51
Control group
The group that does not receive the treatment or intervention in an experiment.
52
Control variable
Another term for a regressor; more specifically, a regressor that controls for one of the factors that determine the dependent variable.
53
Convergence in distribution
When a sequence of distributions converges to a limit; a precise definition is given in Section 17.2.
54
Convergence in probability
When a sequence of random variables converges to a specific value; for example, when the sample average becomes close to the population mean as the sample size increases; see Key Concept 2.6 and Section 17.2.
55
Correlation
A unit-free measure of the extent to which two random variables move, or vary, together. The correlation (or correlation coefficient) between X and Y is σXY/(σXσY) and is denoted corr(X, Y).
56
Correlation coefficient
See correlation.
57
Covariance
A measure of the extent to which two random variables move together. The covariance between X and Y is the expected value E[(X − μX)(Y − μY)] and is denoted cov(X, Y) or σXY.
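A minimal sketch (not from the source) computing the sample covariance and the unit-free correlation corr(X, Y) = sXY/(sX·sY); the data below are hypothetical.

```python
# Illustrative sketch: sample covariance and correlation coefficient.
from statistics import mean, pstdev

def covariance(x, y):
    # population-style sample covariance (divide by n, matching pstdev)
    mx, my = mean(x), mean(y)
    return mean((a - mx) * (b - my) for a, b in zip(x, y))

def correlation(x, y):
    # unit-free: covariance scaled by the two standard deviations
    return covariance(x, y) / (pstdev(x) * pstdev(y))

x = [1, 2, 3, 4, 5]
y = [2, 4, 6, 8, 10]       # y is an exact positive linear function of x
print(correlation(x, y))    # 1.0
```

Because correlation is covariance rescaled to be unit-free, it always lies between −1 and 1.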
58
Covariance matrix
A matrix composed of the variances and covariances of a vector of random variables.
59
Critical value
The value of a test statistic for which the test just rejects the null hypothesis at the given significance level.
60
Cross-sectional data
Data collected for different entities in a single time period.
61
Cubic regression model
A nonlinear regression function that includes X, X2, and X3 as regressors.
62
Cumulative distribution function (c.d.f.)
See cumulative probability distribution.
63
Cumulative dynamic multiplier
The cumulative effect of a unit change in the time series variable X on Y. The h-period cumulative dynamic multiplier is the effect of a unit change in Xt on Yt + Yt+1 + … + Yt+h.
64
Cumulative probability distribution
A function showing the probability that a random variable is less than or equal to a given number.
65
Dependent variable
The variable to be explained in a regression or other statistical model; the variable appearing on the left-hand side in a regression.
66
Deterministic trend
A persistent long-term movement of a variable over time that can be represented as a nonrandom function of time.
67
Dickey-Fuller test
A method for testing for a unit root in a first-order autoregression [AR(1)].
68
Differences estimator
An estimator of the causal effect constructed as the difference in the sample average outcomes between the treatment and control groups.
69
Differences-in-differences estimator
The average change in Y for those in the treatment group, minus the average change in Y for those in the control group.
70
Discrete random variable
A random variable that takes on discrete values.
71
Distributed lag model
A regression model in which the regressors are current and lagged values of X.
72
Dummy variable
See binary variable.
73
Dummy variable trap
A problem caused by including a full set of binary variables in a regression together with a constant regressor (intercept), leading to perfect multicollinearity.
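A tiny sketch (hypothetical data) of why the trap produces perfect multicollinearity: a full set of binary indicators sums, observation by observation, to the constant regressor.

```python
# Illustrative sketch of the dummy variable trap: with binary indicators for
# every category plus an intercept, the dummy columns are an exact linear
# function of the constant regressor. (Hypothetical 4-observation sample.)
female = [1, 0, 1, 0]
male = [1 - f for f in female]   # the complementary dummy: full set of categories
constant = [1, 1, 1, 1]          # the constant (intercept) regressor

col_sum = [f + m for f, m in zip(female, male)]
print(col_sum == constant)       # exact linear relation -> perfect multicollinearity
```

The standard fix is to drop either one dummy or the intercept, breaking the exact linear relation.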
74
Dynamic causal effect
The causal effect of one variable on current and future values of another variable.
75
Dynamic multiplier
The h-period dynamic multiplier is the effect of a unit change in the time series variable Xt on Yt+h.
76
Endogenous variable
A variable that is correlated with the error term.
77
Error term
The difference between Y and the population regression function, denoted by u in this textbook.
78
Errors-in-variables bias
The bias in an estimator of a regression coefficient that arises from measurement errors in the regressors.
79
Estimate
The numerical value of an estimator computed from data in a specific sample.
80
Estimator
A function of a sample of data to be drawn randomly from a population. An estimator is a procedure for using sample data to compute an educated guess of the value of a population parameter, such as the population mean.
81
Exact distribution
The exact probability distribution of a random variable.
82
Exact identification
When the number of instrumental variables equals the number of endogenous regressors.
83
Exogenous variable
A variable that is uncorrelated with the regression error term.
84
Expected value
The long-run average value of a random variable over many repeated trials or occurrences. It is the probability-weighted average of all possible values that the random variable can take on.The expected value of Y is denoted E(Y) and is also called the expectation of Y.
85
Experimental data
Data obtained from an experiment designed to evaluate a treatment or policy or to investigate a causal effect.
86
Experimental effect
When experimental subjects change their behavior because they are part of an experiment.
87
Explained sum of squares (ESS)
The sum of squared deviations of the predicted values of Yi, Ŷi, from their average; see Equation (4.14).
88
Explanatory variable
See regressor.
89
External validity
Inferences and conclusions from a statistical study are externally valid if they can be generalized from the population and the setting studied to other populations and settings.
90
F-statistic
A statistic used to test a joint hypothesis concerning more than one of the regression coefficients.
91
Fm,n distribution
The distribution of a ratio of independent random variables, where the numerator is a chi-squared random variable with m degrees of freedom, divided by m, and the denominator is a chi-squared random variable with n degrees of freedom divided by n.
92
Fm,∞ distribution
The distribution of a random variable with a chi-squared distribution with m degrees of freedom, divided by m.
93
Feasible GLS
A version of the generalized least squares (GLS) estimator that uses an estimator of the conditional variance of the regression errors and covariance between the regression errors at different observations.
94
Feasible WLS
A version of the weighted least squares (WLS) estimator that uses an estimator of the conditional variance of the regression errors.
95
First difference
The first difference of a time series variable Yt is Yt − Yt−1, denoted ΔYt.
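A one-liner sketch (hypothetical series) of the operation ΔYt = Yt − Yt−1:

```python
# Illustrative sketch: first difference of a time series.
def first_difference(y):
    # Delta Y_t = Y_t - Y_{t-1}; the result has one fewer observation
    return [y[t] - y[t - 1] for t in range(1, len(y))]

print(first_difference([100, 103, 101, 106]))  # [3, -2, 5]
```
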
96
First-stage regression
The regression of an included endogenous variable on the included exogenous variables, if any, and the instrumental variable(s) in two-stage least squares.
97
Fitted values
See predicted values.
98
Fixed effects
Binary variables indicating the entity or time period in a panel data regression.
99
Fixed effects regression model
A panel data regression that includes entity fixed effects.
100
Forecast error
The difference between the value of the variable that actually occurs and its forecasted value.
101
Forecast interval
An interval that contains the future value of a time series variable with a prespecified probability.
102
Functional form misspecification
When the form of the estimated regression function does not match the form of the population regression function; for example, when a linear specification is used but the true population regression function is quadratic.
103
GARCH
See generalized autoregressive conditional heteroskedasticity.
104
Gauss-Markov theorem
Mathematical result stating that, under certain conditions, the OLS estimator is the best linear unbiased estimator of the regression coefficients conditional on the values of the regressors.
105
Generalized autoregressive conditional heteroskedasticity
A time series model for conditional heteroskedasticity.
106
Generalized least squares (GLS)
A generalization of OLS that is appropriate when the regression errors have a known form of heteroskedasticity (in which case GLS is also referred to as weighted least squares, WLS) or a known form of serial correlation.
107
Generalized method of moments
A method for estimating parameters by fitting sample moments to population moments that are functions of the unknown parameters. Instrumental variables estimators are an important special case.
108
GMM
See generalized method of moments.
109
Granger causality test
A procedure for testing whether current and lagged values of one time series help predict future values of another time series.
110
HAC standard errors
See heteroskedasticity- and autocorrelation-consistent (HAC) standard errors.
111
Hawthorne effect
See experimental effect.
112
Heteroskedasticity
The situation in which the variance of the regression error term ui, conditional on the regressors, is not constant.
113
Heteroskedasticity- and autocorrelation-consistent (HAC) standard errors
Standard errors for OLS estimators that are consistent whether or not the regression errors are heteroskedastic and autocorrelated.
114
Heteroskedasticity-robust standard error
Standard errors for the OLS estimator that are appropriate whether the error term is homoskedastic or heteroskedastic.
115
Heteroskedasticity-robust t-statistic
A t-statistic constructed using a heteroskedasticity-robust standard error.
116
Homoskedasticity
The variance of the error term ui, conditional on the regressors, is constant.
117
Homoskedasticity-only F statistic
A form of the F-statistic that is valid only when the regression errors are homoskedastic.
118
Homoskedasticity-only standard errors
Standard errors for the OLS estimator that are appropriate only when the error term is homoskedastic.
119
Hypothesis test
A procedure for using sample evidence to help determine if a specific hypothesis about a population is true or false.
120
i.i.d.
Independently and identically distributed.
121
Identically distributed
When two or more random variables have the same distribution.
122
Impact effect
The contemporaneous, or immediate, effect of a unit change in the time series variable Xt on Yt.
123
Imperfect multicollinearity
The condition in which two or more regressors are highly correlated.
124
Included endogenous variables
Regressors that are correlated with the error term (usually in the context of instrumental variable regression).
125
Included exogenous variables
Regressors that are uncorrelated with the error term (usually in the context of instrumental variable regression).
126
Independence
When knowing the value of one random variable provides no information about the value of another random variable. Two random variables are independent if their joint distribution is the product of their marginal distributions.
127
Indicator variable
See binary variable.
128
Information criterion
A statistic used to estimate the number of lagged variables to include in an autoregression or a distributed lag model. Leading examples are the Akaike information criterion (AIC) and the Bayes information criterion (BIC).
129
Instrument
See instrumental variable.
130
Instrumental variable
A variable that is correlated with an endogenous regressor (instrument relevance) and is uncorrelated with the regression error (instrument exogeneity).
131
Instrumental variables (IV) regression
A way to obtain a consistent estimator of the unknown coefficients of the population regression function when the regressor, X, is correlated with the error term, u.
132
Interaction term
A regressor that is formed as the product of two other regressors, such as X1i × X2i.
133
Intercept
The value of β0 in the linear regression model.
134
Internal validity
When inferences about causal effects in a statistical study are valid for the population being studied.
135
J-statistic
A statistic for testing overidentifying restrictions in instrumental variables regression.
136
Joint hypothesis
A hypothesis consisting of two or more individual hypotheses, that is, involving more than one restriction on the parameters of a model.
137
Joint probability distribution
The probability distribution determining the probabilities of outcomes involving two or more random variables.
138
Kurtosis
A measure of how much mass is contained in the tails of a probability distribution.
139
Lags
The value of a time series variable in a previous time period. The jth lag of Yt is Yt−j.
140
Law of iterated expectations
A result in probability theory that says that the expected value of Y is the expected value of its conditional expectation given X; that is, E(Y) = E[E(Y|X)].
141
Law of large numbers
According to this result from probability theory, under general conditions the sample average will be close to the population mean with very high probability when the sample size is large.
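A simulation sketch of the result (the Bernoulli coin flips and sample sizes are assumptions for illustration): as n grows, the sample average settles down near the population mean of 0.5.

```python
# Simulation sketch of the law of large numbers: sample averages of i.i.d.
# Bernoulli(0.5) draws (fair coin flips) approach the population mean 0.5.
import random
from statistics import mean

random.seed(1)
for n in (10, 1000, 100000):
    ybar = mean(random.randint(0, 1) for _ in range(n))
    print(n, ybar)  # the gap from 0.5 shrinks as n grows (on the order of 1/sqrt(n))
```
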
142
Least squares assumptions
The assumptions for the linear regression model listed in Key Concept 4.3 (single variable regression) and Key Concept 6.4 (multiple regression model).
143
Least squares estimator
An estimator formed by minimizing the sum of squared residuals.
144
Limited dependent variable
A dependent variable that can take on only a limited set of values. For example, the variable might be a 0–1 binary variable or arise from one of the models described in Appendix 11.3.
145
Linear-log model
A nonlinear regression function in which the dependent variable is Y and the independent variable is ln(X).
146
Linear probability model
A regression model in which Y is a binary variable.
147
Linear regression function
A regression function with a constant slope.
148
Local average treatment effect
A weighted average treatment effect estimated, for example, by TSLS.
149
Log-linear model
A nonlinear regression function in which the dependent variable is ln(Y) and the independent variable is X.
150
Log-log model
A nonlinear regression function in which the dependent variable is ln(Y) and the independent variable is ln(X).
151
Logarithm
A mathematical function defined for a positive argument; its slope is always positive but tends to zero. The natural logarithm is the inverse of the exponential function; that is, X = ln(e^X).
152
Logit regression
A nonlinear regression model for a binary dependent variable in which the population regression function is modeled using the cumulative logistic distribution function.
153
Long-run cumulative dynamic multiplier
The cumulative long-run effect on the time series variable Y of a change in X.
154
Longitudinal data
See panel data.
155
Marginal probability distribution
Another name for the probability distribution of a random variable Y, which distinguishes the distribution of Y alone (the marginal distribution) from the joint distribution of Y and another random variable.
156
Maximum likelihood estimator (MLE)
An estimator of unknown parameters that is obtained by maximizing the likelihood function; see Appendix 11.2.
157
Mean
The expected value of a random variable. The mean of Y is denoted μY.
158
Moments of a distribution
The expected value of a random variable raised to different powers. The rth moment of the random variable Y is E(Y^r).
159
Multicollinearity
See perfect multicollinearity and imperfect multicollinearity.
160
Multiple regression model
An extension of the single variable regression model that allows Y to depend on k regressors.
161
Natural experiment
See quasi-experiment.
162
Natural logarithm
See logarithm.
163
95% confidence set
A confidence set with a 95% confidence level; see confidence interval.
164
Nonlinear least squares
The analog of OLS that applies when the regression function is a nonlinear function of the unknown parameters.
165
Nonlinear least squares estimator
The estimator obtained by minimizing the sum of squared residuals when the regression function is nonlinear in the parameters.
166
Nonlinear regression function
A regression function with a slope that is not constant.
167
Nonstationary
When the joint distribution of a time series variable and its lags changes over time.
168
Normal distribution
A commonly used bell-shaped distribution of a continuous random variable.
169
Null hypothesis
The hypothesis being tested in a hypothesis test, often denoted by H0.
170
Observation number
The unique identifier assigned to each entity in a data set.
171
Observational data
Data based on observing, or measuring, actual behavior outside an experimental setting.

OLS estimator
See ordinary least squares estimator.
172
OLS regression line
The regression line with population coefficients replaced by the OLS estimators.
173
OLS residual
The difference between Yi and the OLS regression line, denoted by ûi in this textbook.
174
Omitted variables bias
The bias in an estimator that arises because a variable that is a determinant of Y and is correlated with a regressor has been omitted from the regression.
175
One-sided alternative hypothesis
The parameter of interest is on one side of the value given by the null hypothesis.
176
Order of integration
The number of times that a time series variable must be differenced to make it stationary. A time series variable that is integrated of order p must be differenced p times and is denoted I(p).
177
Ordinary least squares estimator
The estimator of the regression intercept and slope(s) that minimizes the sum of squared residuals.
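A minimal sketch (hypothetical data) of single-variable OLS using the standard closed-form solution to the minimization: slope = Σ(Xi − X̄)(Yi − Ȳ) / Σ(Xi − X̄)², intercept = Ȳ − slope·X̄.

```python
# Illustrative sketch: single-variable OLS via its closed-form solution,
# which minimizes the sum of squared residuals.
from statistics import mean

def ols(x, y):
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    slope = sxy / sxx
    intercept = my - slope * mx
    return intercept, slope

# Data lying exactly on the line y = 1 + 2x, so OLS recovers it exactly
b0, b1 = ols([0, 1, 2, 3], [1, 3, 5, 7])
print(b0, b1)  # 1.0 2.0
```
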
178
Outlier
An exceptionally large or small value of a random variable.
179
Overidentification
When the number of instrumental variables exceeds the number of included endogenous regressors.
180
p-value
The probability of drawing a statistic at least as adverse to the null hypothesis as the one actually computed, assuming the null hypothesis is correct. Also called the marginal significance probability, the p-value is the smallest significance level at which the null hypothesis can be rejected.
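A sketch (an assumption: the test statistic is taken to have a standard normal null distribution) of computing a two-sided p-value as p = 2·(1 − Φ(|t|)):

```python
# Illustrative sketch: two-sided p-value for a test statistic whose null
# distribution is standard normal (an assumption for this example).
from math import erf, sqrt

def normal_cdf(x):
    # standard normal CDF via the error function
    return 0.5 * (1 + erf(x / sqrt(2)))

def two_sided_p_value(t):
    # probability of a statistic at least as extreme as |t| under the null
    return 2 * (1 - normal_cdf(abs(t)))

p = two_sided_p_value(1.96)
print(p)  # approximately 0.05: borderline rejection at the 5% level
```
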
181
Panel data
Data for multiple entities where each entity is observed in two or more time periods.
182
Parameter
A constant that determines a characteristic of a probability distribution or population regression function.
183
Partial compliance
Occurs when some participants fail to follow the treatment protocol in a randomized experiment.
184
Partial effect
The effect on Y of changing one of the regressors, holding the other regressors constant.
185
Perfect multicollinearity
Occurs when one of the regressors is an exact linear function of the other regressors.
186
Polynomial regression model
A nonlinear regression function that includes X, X², …, and X^r as regressors, where r is an integer.
187
Population
The group of entities—such as people, companies, or school districts—being studied.
188
Population coefficients
See population intercept and slope.
189
Population intercept and slope
The true, or population, values of β0 (the intercept) and β1 (the slope) in a single variable regression. In a multiple regression, there are multiple slope coefficients (β1, β2, …, βk), one for each regressor.
190
Population multiple regression model
The multiple regression model in Key Concept 6.2.
191
Population regression line
In a single variable regression, the population regression line is β0 + β1Xi; in a multiple regression it is β0 + β1X1i + β2X2i + … + βkXki.
192
Power
The probability that a test correctly rejects the null hypothesis when the alternative is true.
193
Predicted value
The value of Yi that is predicted by the OLS regression line, denoted by Ŷi in this textbook.
194
Price elasticity
The percentage change in the quantity demanded resulting from a 1% increase in price.
195
Probability
The proportion of the time that an outcome (or event) will occur in the long run.
196
Probability density function (p.d.f.)
For a continuous random variable, the area under the probability density function between any two points is the probability that the random variable falls between those two points.
197
Probability distribution
For a discrete random variable, a list of all values that a random variable can take on and the probability associated with each of these values.
198
Probit regression
A nonlinear regression model for a binary dependent variable in which the population regression function is modeled using the cumulative standard normal distribution function.
199
Program evaluation
The field of study concerned with estimating the effect of a program, policy, or some other intervention or “treatment.”
200
Pseudo out-of-sample forecast
A forecast computed over part of the sample using a procedure that is as if these sample data have not yet been realized.
201
Quadratic regression model
A nonlinear regression function that includes X and X2 as regressors.
202
Quasi-experiment
A circumstance in which randomness is introduced by variations in individual circumstances that make it appear as if the treatment is randomly assigned.
203
R2
In a regression, the fraction of the sample variance of the dependent variable that is explained by the regressors.
204
Random walk
A time series process in which the value of the variable equals its value in the previous period, plus an unpredictable error term.
205
Random walk with drift
A generalization of the random walk in which the change in the variable has a nonzero mean but is otherwise unpredictable.
206
Randomized controlled experiment
An experiment in which participants are randomly assigned to a control group, which receives no treatment, or to a treatment group, which receives a treatment.
207
Regressand
See dependent variable.
208
Regression specification
A description of a regression that includes the set of regressors and any nonlinear transformation that has been applied.
209
Regressor
A variable appearing on the right-hand side of a regression; an independent variable in a regression.
210
Rejection region
The set of values of a test statistic for which the test rejects the null hypothesis.
211
Repeated cross-sectional data
A collection of cross-sectional data sets, where each cross-sectional data set corresponds to a different time period.
212
Restricted regression
A regression in which the coefficients are restricted to satisfy some condition. For example, when computing the homoskedasticity-only F-statistic, this is the regression with coefficients restricted to satisfy the null hypothesis.
213
Root mean squared forecast error
The square root of the mean of the squared forecast error.
214
Sample correlation
An estimator of the correlation between two random variables.
215
Sample covariance
An estimator of the covariance between two random variables.
216
Sample selection bias
The bias in an estimator of a regression coefficient that arises when a selection process influences the availability of data and that process is related to the dependent variable.This induces correlation between one or more regressors and the regression error.
217
Sample standard deviation
An estimator of the standard deviation of a random variable.
218
Sample variance
An estimator of the variance of a random variable.
219
Sampling distribution
The distribution of a statistic over all possible samples; the distribution arising from repeatedly evaluating the statistic using a series of randomly drawn samples from the same population.
220
Scatterplot
A plot of n observations on Xi and Yi, in which each observation is represented by the point (Xi,Yi).
221
Serial correlation
See autocorrelation.
222
Serially uncorrelated
A time series variable with all autocorrelations equal to zero.
223
Significance level
The prespecified rejection probability of a statistical hypothesis test when the null hypothesis is true.
224
Simple random sampling
When entities are chosen randomly from a population using a method that ensures that each entity is equally likely to be chosen.
225
Simultaneous causality bias
When, in addition to the causal link of interest from X to Y, there is a causal link from Y to X. Simultaneous causality makes X correlated with the error term in the population regression of interest.
226
Simultaneous equations bias
See simultaneous causality bias.
227
Size of a test
The probability that a test incorrectly rejects the null hypothesis when the null hypothesis is true.
228
Skewness
A measure of the asymmetry of a probability distribution.
229
Standard deviation
The square root of the variance. The standard deviation of the random variable Y, denoted σY, has the units of Y and is a measure of the spread of the distribution of Y around its mean.
230
Standard error of an estimator
An estimator of the standard deviation of the estimator.
231
Standard error of the regression (SER)
An estimator of the standard deviation of the regression error u.
232
Standard normal distribution
The normal distribution with mean equal to 0 and variance equal to 1, denoted N(0, 1).
233
Standardizing a random variable
An operation accomplished by subtracting the mean and dividing by the standard deviation, which produces a random variable with a mean of 0 and a standard deviation of 1. The standardized value of Y is (Y − μY)/σY.
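The operation in code (a generic sketch; mu and sigma stand for the mean μY and standard deviation σY):

```python
def standardize(y, mu, sigma):
    # (Y - mu) / sigma has mean 0 and standard deviation 1.
    return [(yi - mu) / sigma for yi in y]
```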
234
Stationarity
When the joint distribution of a time series variable and its lagged values does not change over time.
235
Statistically insignificant
The null hypothesis (typically, that a regression coefficient is zero) cannot be rejected at a given significance level.
236
Statistically significant
The null hypothesis (typically, that a regression coefficient is zero) is rejected at a given significance level.
237
Stochastic trend
A persistent but random long-term movement of a variable over time.
238
Strict exogeneity
The requirement that the regression error has a mean of zero conditional on current, future, and past values of the regressor in a distributed lag model.
239
Student t distribution
The Student t distribution with m degrees of freedom is the distribution of the ratio of a standard normal random variable to the square root of an independently distributed chi-squared random variable with m degrees of freedom divided by m. As m gets large, the Student t distribution converges to the standard normal distribution.
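The construction in the definition can be simulated directly (an illustrative sketch, not from the text): draw Z standard normal and W chi-squared with m degrees of freedom, and form Z/√(W/m):

```python
import math
import random

def student_t_draw(m, rng):
    # Z standard normal; W chi-squared(m), built as the sum of
    # m squared independent standard normals; return Z / sqrt(W/m).
    z = rng.gauss(0.0, 1.0)
    w = sum(rng.gauss(0.0, 1.0) ** 2 for _ in range(m))
    return z / math.sqrt(w / m)

rng = random.Random(0)
draws = [student_t_draw(30, rng) for _ in range(1000)]
```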
240
Sum of squared residuals (SSR)
The sum of the squared OLS residuals.
241
t-distribution
See Student t distribution.
242
t-ratio
See t-statistic.
243
t-statistic
A statistic used for hypothesis testing. See Key Concept 5.1.
244
Test for a difference in means
A procedure for testing whether two populations have the same mean.
245
Time effects
Binary variables indicating the time period in a panel data regression.
246
Time and entity fixed effects regression model
A panel data regression that includes both entity fixed effects and time fixed effects.
247
Time fixed effects
See time effects.
248
Time series data
Data for the same entity for multiple time periods.
249
Total sum of squares (TSS)
The sum of squared deviations of Yi from its average, Ȳ.
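A quick sketch (illustrative, not from the text); together with the SSR from entry 240, TSS links to R² via R² = 1 − SSR/TSS in a regression with an intercept:

```python
def tss(y):
    # Sum of squared deviations of y_i from the sample average.
    ybar = sum(y) / len(y)
    return sum((yi - ybar) ** 2 for yi in y)
```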
250
Treatment effect
The causal effect in an experiment or a quasi-experiment; see causal effect.
251
Treatment group
The group that receives the treatment or intervention in an experiment.
252
TSLS
See two stage least squares.
253
Two-sided alternative hypothesis
When, under the alternative hypothesis, the parameter of interest is not equal to the value given by the null hypothesis.
254
Two stage least squares
An instrumental variable estimator, described in Key Concept 12.2.
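A hedged sketch of the two stages for one endogenous regressor x and instrument(s) z (the variable names and setup here are illustrative, not taken from Key Concept 12.2):

```python
import numpy as np

def tsls(y, x, z):
    # Two stage least squares, intercepts included in both stages.
    n = len(y)
    Z = np.column_stack([np.ones(n), z])
    # First stage: regress x on the instrument(s), keep fitted values.
    x_hat = Z @ np.linalg.lstsq(Z, x, rcond=None)[0]
    # Second stage: regress y on the first-stage fitted values.
    X_hat = np.column_stack([np.ones(n), x_hat])
    return np.linalg.lstsq(X_hat, y, rcond=None)[0]  # [intercept, slope]
```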
255
Type I error
In hypothesis testing, the error made when the null hypothesis is true but is rejected.
256
Type II error
In hypothesis testing, the error made when the null hypothesis is false but is not rejected.
257
Unbalanced panel
A panel data set in which some data are missing.
258
Unbiased estimator
An estimator with a bias that is equal to zero.
259
Uncorrelated
Two random variables are uncorrelated if their correlation is zero.
260
Underidentification
When the number of instrumental variables is less than the number of endogenous regressors.
261
Unit root
Refers to an autoregression with a largest root equal to 1.
262
Unrestricted regression
When computing the homoskedasticity-only F-statistic, this is the regression that applies under the alternative hypothesis, so the coefficients are not restricted to satisfy the null hypothesis.
263
VAR
See vector autoregression.
264
Variance
The expected value of the squared difference between a random variable and its mean; the variance of Y is denoted σ²Y.
265
Vector autoregression
A model of k time series variables consisting of k equations, one for each variable, in which the regressors in all equations are lagged values of all the variables.
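An equation-by-equation OLS sketch for a VAR(1) with k variables (illustrative; names assumed): each equation regresses one variable on a constant and one lag of all k variables.

```python
import numpy as np

def var1_ols(Y):
    # Y: (T, k) array. Fit Y_t = c + A @ Y_{t-1} + u_t by OLS,
    # one equation per column of Y; returns (c, A).
    X = np.column_stack([np.ones(len(Y) - 1), Y[:-1]])
    B = np.linalg.lstsq(X, Y[1:], rcond=None)[0]
    return B[0], B[1:].T
```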
266
Volatility clustering
When a time series variable exhibits some clustered periods of high variance and other clustered periods of low variance.
267
Weak instruments
Instrumental variables that have a low correlation with the endogenous regressor(s).
268
Weighted least squares (WLS)
An alternative to OLS that can be used when the regression error is heteroskedastic and the form of the heteroskedasticity is known or can be estimated.
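A hedged sketch: WLS with known weights w_i (proportional to the inverse error variances) is equivalent to OLS on √w-scaled data. Illustrative only, assuming a design matrix X with an intercept column already included:

```python
import numpy as np

def wls(y, X, w):
    # Minimize sum_i w_i * (y_i - x_i'b)^2 by rescaling each row
    # of X and y by sqrt(w_i), then running ordinary least squares.
    sw = np.sqrt(w)
    return np.linalg.lstsq(X * sw[:, None], y * sw, rcond=None)[0]
```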