Econometrics Flashcards

1
Q

Acceptance region

A

The set of values of a test statistic for which the null hypothesis is accepted (is not rejected).

2
Q

Adjusted R2 (R̄2)

A

A modified version of R2 that does not necessarily increase when a new regressor is added to the regression.
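
As a worked sketch (numpy only; the data are simulated for illustration): adjusted R2 equals 1 − [(n − 1)/(n − k − 1)](SSR/TSS), so adding a useless regressor raises R2 mechanically but can lower adjusted R2.

    import numpy as np

    rng = np.random.default_rng(0)
    n = 100
    x = rng.normal(size=n)
    y = 1.0 + 2.0 * x + rng.normal(size=n)   # true model involves only x
    junk = rng.normal(size=n)                # an irrelevant regressor

    def r2_and_adjusted(regressors, y):
        X = np.column_stack([np.ones(len(y))] + regressors)  # add intercept
        beta, *_ = np.linalg.lstsq(X, y, rcond=None)
        ssr = np.sum((y - X @ beta) ** 2)    # sum of squared residuals
        tss = np.sum((y - y.mean()) ** 2)    # total sum of squares
        k = X.shape[1] - 1                   # regressors, excluding intercept
        r2 = 1 - ssr / tss
        adj = 1 - (len(y) - 1) / (len(y) - k - 1) * ssr / tss
        return r2, adj

    print(r2_and_adjusted([x], y))           # baseline fit
    print(r2_and_adjusted([x, junk], y))     # R2 rises; adjusted R2 need not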

3
Q

ADL(p,q)

A

See autoregressive distributed lag model.

4
Q

AIC

A

See information criterion.

5
Q

Akaike information criterion

A

See information criterion.

6
Q

Alternative hypothesis

A

The hypothesis that is assumed to be true if the null hypothesis is false. The alternative hypothesis is often denoted H1.

7
Q

AR(p)

A

See autoregression.

8
Q

ARCH

A

See autoregressive conditional heteroskedasticity.

9
Q

Asymptotic distribution

A

The approximate sampling distribution of a random variable computed using a large sample. For example, the asymptotic distribution of the sample average is normal.

10
Q

Asymptotic normal distribution

A

A normal distribution that approximates the sampling distribution of a statistic computed using a large sample.

11
Q

Attrition

A

The loss of subjects from a study after assignment to the treatment or control group.

12
Q

Augmented Dickey-Fuller (ADF) test

A

A regression-based test for a unit root in an AR(p) model.

13
Q

Autocorrelation

A

The correlation between a time series variable and its lagged value. The jth autocorrelation of Y is the correlation between Yt and Yt−j.

14
Q

Autocovariance

A

The covariance between a time series variable and its lagged value. The jth autocovariance of Y is the covariance between Yt and Yt−j.

15
Q

Autoregression

A

A linear regression model that relates a time series variable to its past (that is, lagged) values. An autoregression with p lagged values as regressors is denoted AR(p).
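
A minimal sketch (numpy only; simulated data) of estimating an AR(2), Yt = β0 + β1Yt−1 + β2Yt−2 + ut, by regressing Yt on its first two lags:

    import numpy as np

    rng = np.random.default_rng(1)
    T = 500
    y = np.zeros(T)
    for t in range(2, T):                    # simulate a stationary AR(2)
        y[t] = 0.5 * y[t - 1] + 0.2 * y[t - 2] + rng.normal()

    Y = y[2:]                                # Yt
    X = np.column_stack([np.ones(T - 2), y[1:-1], y[:-2]])  # 1, Yt-1, Yt-2
    beta, *_ = np.linalg.lstsq(X, Y, rcond=None)
    print(beta)                              # roughly [0, 0.5, 0.2]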

16
Q

Autoregressive conditional heteroskedasticity (ARCH)

A

A time series model of conditional heteroskedasticity.

17
Q

Autoregressive distributed lag model

A

A linear regression model in which the time series variable Yt is expressed as a function of lags of Yt and of another variable, Xt. The model is denoted ADL(p,q), where p denotes the number of lags of Yt and q denotes the number of lags of Xt.

18
Q

Average causal effect

A

The population average of the individual causal effects in a heterogeneous population. Also called the average treatment effect.

19
Q

Balanced panel

A

A panel data set with no missing observations, that is, in which the variables are observed for each entity and each time period.

20
Q

Base specification

A

A baseline or benchmark regression specification that includes a set of regressors chosen using a combination of expert judgment, economic theory, and knowledge of how the data were collected.

21
Q

Bayes information criterion

A

See information criterion.

22
Q

Bernoulli distribution

A

The probability distribution of a Bernoulli random variable.

23
Q

Bernoulli random variable

A

A random variable that takes on two values, 0 and 1.

24
Q

Best linear unbiased estimator

A

An estimator that has the smallest variance of any estimator that is a linear function of the sample values Y and is unbiased. Under the Gauss-Markov conditions, the OLS estimator is the best linear unbiased estimator of the regression coefficients conditional on the values of the regressors.

25
Q

Bias

A

The expected value of the difference between an estimator and the parameter that it is estimating. If μ̂Y is an estimator of μY, then the bias of μ̂Y is E(μ̂Y) − μY.
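
A minimal simulation sketch (numpy only) of bias: the sample variance computed with an n divisor has expectation σ²(n − 1)/n, so its bias is −σ²/n.

    import numpy as np

    rng = np.random.default_rng(2)
    n, reps = 10, 100_000
    draws = rng.normal(size=(reps, n))       # many samples of size n, true var = 1
    var_hat = draws.var(axis=1, ddof=0)      # divides by n, a biased estimator
    print(var_hat.mean() - 1.0)              # roughly -1/n = -0.1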

26
Q

BIC

A

See information criterion.

27
Q

Binary variable

A

A variable that is either 0 or 1. A binary variable is used to indicate a binary outcome. For example, X is a binary (or indicator, or dummy) variable for a person's gender if X = 1 if the person is female and X = 0 if the person is male.

28
Q

Bivariate normal distribution

A

A generalization of the normal distribution to describe the joint distribution of two random variables.

29
Q

BLUE

A

See best linear unbiased estimator.

30
Q

Break date

A

The date of a discrete change in population time series regression coefficient(s).

31
Q

Causal effect

A

The expected effect of a given intervention or treatment as measured in an ideal randomized controlled experiment.

32
Q

Central limit theorem

A

A result in mathematical statistics that says that, under general conditions, the sampling distribution of the standardized sample average is well approximated by a standard normal distribution when the sample size is large.
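
A minimal simulation sketch (numpy only): even when the population is skewed (here exponential), the standardized sample average is approximately N(0, 1) in large samples.

    import numpy as np

    rng = np.random.default_rng(3)
    n, reps = 200, 50_000
    mu = sigma = 1.0                         # exponential(1) has mean 1, sd 1
    ybar = rng.exponential(1.0, size=(reps, n)).mean(axis=1)
    z = (ybar - mu) / (sigma / np.sqrt(n))   # standardized sample averages
    print(z.mean(), z.std())                 # roughly 0 and 1
    print((np.abs(z) < 1.96).mean())         # roughly 0.95, as under N(0, 1)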

33
Q

Chi-squared distribution

A

The distribution of the sum of m squared independent standard normal random variables. The parameter m is called the degrees of freedom of the chi-squared distribution.

34
Q

Chow test

A

A test for a break in a time series regression at a known break date.

35
Q

Coefficient of determination

A

See R2.

36
Q

Cointegration

A

When two or more time series variables share a common stochastic trend.

37
Q

Common trend

A

A trend shared by two or more time series.

38
Q

Conditional distribution

A

The probability distribution of one random variable given that another random variable takes on a particular value.

39
Q

Conditional expectation

A

The expected value of one random variable given that another random variable takes on a particular value.

40
Q

Conditional heteroskedasticity

A

When the variance, usually of an error term, depends on other variables.

41
Q

Conditional mean

A

The mean of a conditional distribution; see conditional expectation.

42
Q

Conditional mean independence

A

The conditional expectation of the regression error ui, given the regressors, depends on some but not all of the regressors.

43
Q

Conditional variance

A

The variance of a conditional distribution.

44
Q

Confidence interval (or confidence set)

A

An interval (or set) that contains the true value of a population parameter with a prespecified probability when computed over repeated samples.

45
Q

Confidence level

A

The prespecified probability that a confidence interval (or set) contains the true value of the parameter.

46
Q

Consistency

A

The property that an estimator converges in probability to the parameter it is estimating; see consistent estimator.

47
Q

Consistent estimator

A

An estimator that converges in probability to the parameter that it is estimating.

48
Q

Constant regressor

A

The regressor associated with the regression intercept; this regressor is always equal to 1.

49
Q

Constant term

A

The regression intercept.

50
Q

Continuous random variable

A

A random variable that can take on a continuum of values.

51
Q

Control group

A

The group that does not receive the treatment or intervention in an experiment.

52
Q

Control variable

A

Another term for a regressor; more specifically, a regressor that controls for one of the factors that determine the dependent variable.

53
Q

Convergence in distribution

A

When a sequence of distributions converges to a limit; a precise definition is given in Section 17.2.

54
Q

Convergence in probability

A

When a sequence of random variables converges to a specific value; for example, when the sample average becomes close to the population mean as the sample size increases; see Key Concept 2.6 and Section 17.2.

55
Q

Correlation

A

A unit-free measure of the extent to which two random variables move, or vary, together. The correlation (or correlation coefficient) between X and Y is σXY/(σXσY) and is denoted corr(X, Y).
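
A minimal sketch (numpy only; simulated data) computing corr(X, Y) = σXY/(σXσY) from sample counterparts and checking it against numpy's built-in:

    import numpy as np

    rng = np.random.default_rng(4)
    x = rng.normal(size=1_000)
    y = 0.8 * x + rng.normal(size=1_000)     # y co-moves with x

    cov_xy = np.cov(x, y)[0, 1]              # sample covariance s_XY
    corr = cov_xy / (x.std(ddof=1) * y.std(ddof=1))
    print(corr, np.corrcoef(x, y)[0, 1])     # same unit-free number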

56
Q

Correlation coefficient

A

See correlation.

57
Q

Covariance

A

A measure of the extent to which two random variables move together. The covariance between X and Y is the expected value E[(X − μX)(Y − μY)] and is denoted cov(X, Y) or σXY.

58
Q

Covariance matrix

A

A matrix composed of the variances and covariances of a vector of random variables.

59
Q

Critical value

A

The value of a test statistic for which the test just rejects the null hypothesis at the given significance level.

60
Q

Cross-sectional data

A

Data collected for different entities in a single time period.

61
Q

Cubic regression model

A

A nonlinear regression function that includes X, X2, and X3 as regressors.

62
Q

Cumulative distribution function (c.d.f.)

A

See cumulative probability distribution.

63
Q

Cumulative dynamic multiplier

A

The cumulative effect of a unit change in the time series variable X on Y. The h-period cumulative dynamic multiplier is the effect of a unit change in Xt on Yt + Yt+1 + . . . + Yt+h.

64
Q

Cumulative probability distribution

A

A function showing the probability that a random variable is less than or equal to a given number.

65
Q

Dependent variable

A

The variable to be explained in a regression or other statistical model; the variable appearing on the left-hand side in a regression.

66
Q

Deterministic trend

A

A persistent long-term movement of a variable over time that can be represented as a nonrandom function of time.

67
Q

Dickey-Fuller test

A

A method for testing for a unit root in a first-order autoregression [AR(1)].

68
Q

Differences estimator

A

An estimator of the causal effect constructed as the difference in the sample average outcomes between the treatment and control groups.

69
Q

Differences-in-differences estimator

A

The average change in Y for those in the treatment group, minus the average change in Y for those in the control group.

70
Q

Discrete random variable

A

A random variable that takes on discrete values.

71
Q

Distributed lag model

A

A regression model in which the regressors are current and lagged values of X.

72
Q

Dummy variable

A

See binary variable.

73
Q

Dummy variable trap

A

A problem caused by including a full set of binary variables in a regression together with a constant regressor (intercept), leading to perfect multicollinearity.
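
A minimal numpy sketch (hypothetical three-category variable) showing the trap as rank deficiency: the full set of dummies sums to the constant regressor, so the design matrix loses full column rank unless one dummy is dropped.

    import numpy as np

    cat = np.array([0, 1, 2, 0, 1, 2, 0, 1])                # three categories
    dummies = (cat[:, None] == np.arange(3)).astype(float)  # full set of dummies
    ones = np.ones((len(cat), 1))                           # constant regressor

    X_trap = np.hstack([ones, dummies])       # intercept plus ALL dummies
    X_ok = np.hstack([ones, dummies[:, 1:]])  # drop one dummy category

    print(np.linalg.matrix_rank(X_trap), X_trap.shape[1])   # 3 < 4: perfect multicollinearity
    print(np.linalg.matrix_rank(X_ok), X_ok.shape[1])       # 3 = 3: full rank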

74
Q

Dynamic causal effect

A

The causal effect of one variable on current and future values of another variable.

75
Q

Dynamic multiplier

A

The h-period dynamic multiplier is the effect of a unit change in the time series variable Xt on Yt+h.

76
Q

Endogenous variable

A

A variable that is correlated with the error term.

77
Q

Error term

A

The difference between Y and the population regression function, denoted by u in this textbook.

78
Q

Errors-in-variables bias

A

The bias in an estimator of a regression coefficient that arises from measurement errors in the regressors.

79
Q

Estimate

A

The numerical value of an estimator computed from data in a specific sample.

80
Q

Estimator

A

A function of a sample of data to be drawn randomly from a population. An estimator is a procedure for using sample data to compute an educated guess of the value of a population parameter, such as the population mean.

81
Q

Exact distribution

A

The exact probability distribution of a random variable.

82
Q

Exact identification

A

When the number of instrumental variables equals the number of endogenous regressors.

83
Q

Exogenous variable

A

A variable that is uncorrelated with the regression error term.

84
Q

Expected value

A

The long-run average value of a random variable over many repeated trials or occurrences. It is the probability-weighted average of all possible values that the random variable can take on.The expected value of Y is denoted E(Y) and is also called the expectation of Y.

85
Q

Experimental data

A

Data obtained from an experiment designed to evaluate a treatment or policy or to investigate a causal effect.

86
Q

Experimental effect

A

When experimental subjects change their behavior because they are part of an experiment.

87
Q

Explained sum of squares (ESS)

A

The sum of squared deviations of the predicted values of Yi, Ŷi, from their average; see Equation (4.14).

88
Q

Explanatory variable

A

See regressor.

89
Q

External validity

A

Inferences and conclusions from a statistical study are externally valid if they can be generalized from the population and the setting studied to other populations and settings.

90
Q

F-statistic

A

A statistic used to test a joint hypothesis concerning more than one of the regression coefficients.

91
Q

Fm,n distribution

A

The distribution of a ratio of independent random variables, where the numerator is a chi-squared random variable with m degrees of freedom, divided by m, and the denominator is a chi-squared random variable with n degrees of freedom divided by n.

92
Q

Fm,∞ distribution

A

The distribution of a random variable with a chi-squared distribution with m degrees of freedom, divided by m.

93
Q

Feasible GLS

A

A version of the generalized least squares (GLS) estimator that uses an estimator of the conditional variance of the regression errors and covariance between the regression errors at different observations.

94
Q

Feasible WLS

A

A version of the weighted least squares (WLS) estimator that uses an estimator of the conditional variance of the regression errors.

95
Q

First difference

A

The first difference of a time series variable Yt is Yt − Yt−1, denoted ΔYt.

96
Q

First-stage regression

A

The regression of an included endogenous variable on the included exogenous variables, if any, and the instrumental variable(s) in two stage least squares.

97
Q

Fitted values

A

See predicted values.

98
Q

Fixed effects

A

Binary variables indicating the entity or time period in a panel data regression.

99
Q

Fixed effects regression model

A

A panel data regression that includes entity fixed effects.

100
Q

Forecast error

A

The difference between the value of the variable that actually occurs and its forecasted value.

101
Q

Forecast interval

A

An interval that contains the future value of a time series variable with a prespecified probability.

102
Q

Functional form misspecification

A

When the form of the estimated regression function does not match the form of the population regression function; for example, when a linear specification is used but the true population regression function is quadratic.

103
Q

GARCH

A

See generalized autoregressive conditional heteroskedasticity.

104
Q

Gauss-Markov theorem

A

Mathematical result stating that, under certain conditions, the OLS estimator is the best linear unbiased estimator of the regression coefficients conditional on the values of the regressors.

105
Q

Generalized autoregressive conditional heteroskedasticity

A

A time series model for conditional heteroskedasticity.

106
Q

Generalized least squares (GLS)

A

A generalization of OLS that is appropriate when the regression errors have a known form of heteroskedasticity (in which case GLS is also referred to as weighted least squares, WLS) or a known form of serial correlation.

107
Q

Generalized method of moments

A

A method for estimating parameters by fitting sample moments to population moments that are functions of the unknown parameters. Instrumental variables estimators are an important special case.

108
Q

GMM

A

See generalized method of moments.

109
Q

Granger causality test

A

A procedure for testing whether current and lagged values of one time series help predict future values of another time series.

110
Q

HAC standard errors

A

See heteroskedasticity- and autocorrelation-consistent (HAC) standard errors.

111
Q

Hawthorne effect

A

See experimental effect.

112
Q

Heteroskedasticity

A

The situation in which the variance of the regression error term ui, conditional on the regressors, is not constant.

113
Q

Heteroskedasticity- and autocorrelation-consistent (HAC) standard errors

A

Standard errors for OLS estimators that are consistent whether or not the regression errors are heteroskedastic and autocorrelated.

114
Q

Heteroskedasticity-robust standard error

A

Standard errors for the OLS estimator that are appropriate whether the error term is homoskedastic or heteroskedastic.

115
Q

Heteroskedasticity-robust t-statistic

A

A t-statistic constructed using a heteroskedasticity-robust standard error.

116
Q

Homoskedasticity

A

The variance of the error term ui, conditional on the regressors, is constant.

117
Q

Homoskedasticity-only F statistic

A

A form of the F-statistic that is valid only when the regression errors are homoskedastic.

118
Q

Homoskedasticity-only standard errors

A

Standard errors for the OLS estimator that are appropriate only when the error term is homoskedastic.

119
Q

Hypothesis test

A

A procedure for using sample evidence to help determine if a specific hypothesis about a population is true or false.

120
Q

i.i.d.

A

Independently and identically distributed.

121
Q

Identically distributed

A

When two or more random variables have the same distribution.

122
Q

Impact effect

A

The contemporaneous, or immediate, effect of a unit change in the time series variable Xt on Yt.

123
Q

Imperfect multicollinearity

A

The condition in which two or more regressors are highly correlated.

124
Q

Included endogenous variables

A

Regressors that are correlated with the error term (usually in the context of instrumental variable regression).

125
Q

Included exogenous variables

A

Regressors that are uncorrelated with the error term (usually in the context of instrumental variable regression).

126
Q

Independence

A

When knowing the value of one random variable provides no information about the value of another random variable. Two random variables are independent if their joint distribution is the product of their marginal distributions.

127
Q

Indicator variable

A

See binary variable.

128
Q

Information criterion

A

A statistic used to estimate the number of lagged variables to include in an autoregression or a distributed lag model. Leading examples are the Akaike information criterion (AIC) and the Bayes information criterion (BIC).

129
Q

Instrument

A

See instrumental variable.

130
Q

Instrumental variable

A

A variable that is correlated with an endogenous regressor (instrument relevance) and is uncorrelated with the regression error (instrument exogeneity).

131
Q

Instrumental variables (IV) regression

A

A way to obtain a consistent estimator of the unknown coefficients of the population regression function when the regressor, X, is correlated with the error term, u.

132
Q

Interaction term

A

A regressor that is formed as the product of two other regressors, such as X1i × X2i.

133
Q

Intercept

A

The value of β0 in the linear regression model.

134
Q

Internal validity

A

When inferences about causal effects in a statistical study are valid for the population being studied.

135
Q

J-statistic

A

A statistic for testing overidentifying restrictions in instrumental variables regression.

136
Q

Joint hypothesis

A

A hypothesis consisting of two or more individual hypotheses, that is, involving more than one restriction on the parameters of a model.

137
Q

Joint probability distribution

A

The probability distribution determining the probabilities of outcomes involving two or more random variables.

138
Q

Kurtosis

A

A measure of how much mass is contained in the tails of a probability distribution.

139
Q

Lags

A

The value of a time series variable in a previous time period. The jth lag of Yt is Yt−j.

140
Q

Law of iterated expectations

A

A result in probability theory that says that the expected value of Y is the expected value of its conditional expectation given X; that is, E(Y) = E[E(Y|X)].
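
A minimal simulation sketch (numpy only) checking E(Y) = E[E(Y|X)] for a binary X:

    import numpy as np

    rng = np.random.default_rng(5)
    x = rng.integers(0, 2, size=1_000_000)   # binary X with P(X = 1) = 0.5
    y = 2.0 * x + rng.normal(size=x.size)    # E(Y|X=0) = 0, E(Y|X=1) = 2

    e_y_given_x = np.where(x == 1, y[x == 1].mean(), y[x == 0].mean())
    print(y.mean(), e_y_given_x.mean())      # both roughly 1 = E(Y)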

141
Q

Law of large numbers

A

According to this result from probability theory, under general conditions the sample average will be close to the population mean with very high probability when the sample size is large.

142
Q

Least squares assumptions

A

The assumptions for the linear regression model listed in Key Concept 4.3 (single variable regression) and Key Concept 6.4 (multiple regression model).

143
Q

Least squares estimator

A

An estimator formed by minimizing the sum of squared residuals.

144
Q

Limited dependent variable

A

A dependent variable that can take on only a limited set of values. For example, the variable might be a 0–1 binary variable or arise from one of the models described in Appendix 11.3.

145
Q

Linear-log model

A

A nonlinear regression function in which the dependent variable is Y and the independent variable is ln(X).

146
Q

Linear probability model

A

A regression model in which Y is a binary variable.

147
Q

Linear regression function

A

A regression function with a constant slope.

148
Q

Local average treatment effect

A

A weighted average treatment effect estimated, for example, by TSLS.

149
Q

Log-linear model

A

A nonlinear regression function in which the dependent variable is ln(Y) and the independent variable is X.

150
Q

Log-log model

A

A nonlinear regression function in which the dependent variable is ln(Y) and the independent variable is ln(X).

151
Q

Logarithm

A

A mathematical function defined for a positive argument; its slope is always positive but tends to zero. The natural logarithm is the inverse of the exponential function; that is, X = ln(e^X).

152
Q

Logit regression

A

A nonlinear regression model for a binary dependent variable in which the population regression function is modeled using the cumulative logistic distribution function.

153
Q

Long-run cumulative dynamic multiplier

A

The cumulative long-run effect on the time series variable Y of a change in X.

154
Q

Longitudinal data

A

See panel data.

155
Q

Marginal probability distribution

A

Another name for the probability distribution of a random variable Y, which distinguishes the distribution of Y alone (the marginal distribution) from the joint distribution of Y and another random variable.

156
Q

Maximum likelihood estimator (MLE)

A

An estimator of unknown parameters that is obtained by maximizing the likelihood function; see Appendix 11.2.

157
Q

Mean

A

The expected value of a random variable. The mean of Y is denoted μY.

158
Q

Moments of a distribution

A

The expected value of a random variable raised to different powers. The rth moment of the random variable Y is E(Y^r).

159
Q

Multicollinearity

A

See perfect multicollinearity and imperfect multicollinearity.

160
Q

Multiple regression model

A

An extension of the single variable regression model that allows Y to depend on k regressors.

161
Q

Natural experiment

A

See quasi-experiment.

162
Q

Natural logarithm

A

See logarithm.

163
Q

95% confidence set

A

A confidence set with a 95% confidence level; see confidence interval.

164
Q

Nonlinear least squares

A

The analog of OLS that applies when the regression function is a nonlinear function of the unknown parameters.

165
Q

Nonlinear least squares estimator

A

The estimator obtained by minimizing the sum of squared residuals when the regression function is nonlinear in the parameters.

166
Q

Nonlinear regression function

A

A regression function with a slope that is not constant.

167
Q

Nonstationary

A

When the joint distribution of a time series variable and its lags changes over time.

168
Q

Normal distribution

A

A commonly used bell-shaped distribution of a continuous random variable.

169
Q

Null hypothesis

A

The hypothesis being tested in a hypothesis test, often denoted by H0.

170
Q

Observation number

A

The unique identifier assigned to each entity in a data set.

171
Q

Observational data

A

Data based on observing, or measuring, actual behavior outside an experimental setting.

OLS estimator: see ordinary least squares estimator.

172
Q

OLS regression line

A

The regression line with population coefficients replaced by the OLS estimators.

173
Q

OLS residual

A

The difference between Yi and the OLS regression line, denoted by ûi in this textbook.

174
Q

Omitted variables bias

A

The bias in an estimator that arises because a variable that is a determinant of Y and is correlated with a regressor has been omitted from the regression.

175
Q

One-sided alternative hypothesis

A

The parameter of interest is on one side of the value given by the null hypothesis.

176
Q

Order of integration

A

The number of times that a time series variable must be differenced to make it stationary.A time series variable that is integrated of order p must be differenced p times and is denoted I(p).

177
Q

Ordinary least squares estimator

A

The estimator of the regression intercept and slope(s) that minimizes the sum of squared residuals.
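
A minimal sketch (numpy only; simulated data): with a constant regressor and one slope, the coefficients minimizing the sum of squared residuals are given by the matrix-form OLS solution β̂ = (X'X)⁻¹X'Y.

    import numpy as np

    rng = np.random.default_rng(6)
    n = 200
    x = rng.normal(size=n)
    y = 1.0 + 2.0 * x + rng.normal(size=n)

    X = np.column_stack([np.ones(n), x])          # constant regressor and x
    beta_hat = np.linalg.solve(X.T @ X, X.T @ y)  # (X'X)^(-1) X'Y
    print(beta_hat)                               # roughly [1, 2]
    print(np.sum((y - X @ beta_hat) ** 2))        # the minimized SSR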

178
Q

Outlier

A

An exceptionally large or small value of a random variable.

179
Q

Overidentification

A

When the number of instrumental variables exceeds the number of included endogenous regressors.

180
Q

p-value

A

The probability of drawing a statistic at least as adverse to the null hypothesis as the one actually computed, assuming the null hypothesis is correct. Also called the marginal significance probability, the p-value is the smallest significance level at which the null hypothesis can be rejected.

181
Q

Panel data

A

Data for multiple entities where each entity is observed in two or more time periods.

182
Q

Parameter

A

A constant that determines a characteristic of a probability distribution or population regression function.

183
Q

Partial compliance

A

Occurs when some participants fail to follow the treatment protocol in a randomized experiment.

184
Q

Partial effect

A

The effect on Y of changing one of the regressors, holding the other regressors constant.

185
Q

Perfect multicollinearity

A

Occurs when one of the regressors is an exact linear function of the other regressors.

186
Q

Polynomial regression model

A

A nonlinear regression function that includes X, X2, . . . , Xr as regressors, where r is an integer.

187
Q

Population

A

The group of entities—such as people, companies, or school districts—being studied.

188
Q

Population coefficients

A

See population intercept and slope.

189
Q

Population intercept and slope

A

The true, or population, values of β0 (the intercept) and β1 (the slope) in a single variable regression. In a multiple regression, there are multiple slope coefficients (β1, β2, . . . , βk), one for each regressor.

190
Q

Population multiple regression model

A

The multiple regression model in Key Concept 6.2.

191
Q

Population regression line

A

In a single variable regression, the population regression line is β0 + β1Xi, and in a multiple regression it is β0 + β1X1i + β2X2i + . . . + βkXki.

192
Q

Power

A

The probability that a test correctly rejects the null hypothesis when the alternative is true.

193
Q

Predicted value

A

The value of Yi that is predicted by the OLS regression line, denoted by Ŷi in this textbook.

194
Q

Price elasticity

A

The percentage change in the quantity demanded resulting from a 1% increase in price.

195
Q

Probability

A

The proportion of the time that an outcome (or event) will occur in the long run.

196
Q

Probability density function (p.d.f.)

A

For a continuous random variable, the area under the probability density function between any two points is the probability that the random variable falls between those two points.

197
Q

Probability distribution

A

For a discrete random variable, a list of all values that a random variable can take on and the probability associated with each of these values.

198
Q

Probit regression

A

A nonlinear regression model for a binary dependent variable in which the population regression function is modeled using the cumulative standard normal distribution function.

199
Q

Program evaluation

A

The field of study concerned with estimating the effect of a program, policy, or some other intervention or “treatment.”

200
Q

Pseudo out-of-sample forecast

A

A forecast computed over part of the sample using a procedure that is as if these sample data have not yet been realized.

201
Q

Quadratic regression model

A

A nonlinear regression function that includes X and X2 as regressors.

202
Q

Quasi-experiment

A

A circumstance in which randomness is introduced by variations in individual circumstances that make it appear as if the treatment is randomly assigned.

203
Q

R2

A

In a regression, the fraction of the sample variance of the dependent variable that is explained by the regressors.

204
Q

Random walk

A

A time series process in which the value of the variable equals its value in the previous period, plus an unpredictable error term.
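
A minimal simulation sketch (numpy only): cumulative sums of unpredictable errors give Yt = Yt−1 + ut, and the variance of Yt grows with t, one symptom of nonstationarity.

    import numpy as np

    rng = np.random.default_rng(7)
    reps, T = 10_000, 100
    steps = rng.normal(size=(reps, T))       # unpredictable error terms u_t
    walks = steps.cumsum(axis=1)             # Y_t = Y_{t-1} + u_t, with Y_0 = 0
    print(walks[:, 9].var(), walks[:, 99].var())   # roughly 10 and 100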

205
Q

Random walk with drift

A

A generalization of the random walk in which the change in the variable has a nonzero mean but is otherwise unpredictable.

206
Q

Randomized controlled experiment

A

An experiment in which participants are randomly assigned to a control group, which receives no treatment, or to a treatment group, which receives a treatment.

207
Q

Regressand

A

See dependent variable.

208
Q

Regression specification

A

A description of a regression that includes the set of regressors and any nonlinear transformation that has been applied.

209
Q

Regressor

A

A variable appearing on the right-hand side of a regression; an independent variable in a regression.

210
Q

Rejection region

A

The set of values of a test statistic for which the test rejects the null hypothesis.

211
Q

Repeated cross-sectional data

A

A collection of cross-sectional data sets, where each cross-sectional data set corresponds to a different time period.

212
Q

Restricted regression

A

A regression in which the coefficients are restricted to satisfy some condition. For example, when computing the homoskedasticity-only F-statistic, this is the regression with coefficients restricted to satisfy the null hypothesis.

213
Q

Root mean squared forecast error

A

The square root of the mean of the squared forecast error.

214
Q

Sample correlation

A

An estimator of the correlation between two random variables.

215
Q

Sample covariance

A

An estimator of the covariance between two random variables.

216
Q

Sample selection bias

A

The bias in an estimator of a regression coefficient that arises when a selection process influences the availability of data and that process is related to the dependent variable.This induces correlation between one or more regressors and the regression error.

217
Q

Sample standard deviation

A

An estimator of the standard deviation of a random variable.

218
Q

Sample variance

A

An estimator of the variance of a random variable.

219
Q

Sampling distribution

A

The distribution of a statistic over all possible samples; the distribution arising from repeatedly evaluating the statistic using a series of randomly drawn samples from the same population.

220
Q

Scatterplot

A

A plot of n observations on Xi and Yi, in which each observation is represented by the point (Xi,Yi).

221
Q

Serial correlation

A

See autocorrelation.

222
Q

Serially uncorrelated

A

A time series variable with all autocorrelations equal to zero.

223
Q

Significance level

A

The prespecified rejection probability of a statistical hypothesis test when the null hypothesis is true.

224
Q

Simple random sampling

A

When entities are chosen randomly from a population using a method that ensures that each entity is equally likely to be chosen.

225
Q

Simultaneous causality bias

A

When, in addition to the causal link of interest from X to Y, there is a causal link from Y to X. Simultaneous causality makes X correlated with the error term in the population regression of interest.

226
Q

Simultaneous equations bias

A

See simultaneous causality bias.

227
Q

Size of a test

A

The probability that a test incorrectly rejects the null hypothesis when the null hypothesis is true.

228
Q

Skewness

A

A measure of the asymmetry of a probability distribution.

229
Q

Standard deviation

A

The square root of the variance. The standard deviation of the random variable Y, denoted σY, has the units of Y and is a measure of the spread of the distribution of Y around its mean.

230
Q

Standard error of an estimator

A

An estimator of the standard deviation of the estimator.

231
Q

Standard error of the regression (SER)

A

An estimator of the standard deviation of the regression error u.

232
Q

Standard normal distribution

A

The normal distribution with mean equal to 0 and variance equal to 1, denoted N(0, 1).

233
Q

Standardizing a random variable

A

An operation accomplished by subtracting the mean and dividing by the standard deviation, which produces a random variable with a mean of 0 and a standard deviation of 1. The standardized value of Y is (Y − μY)/σY.

234
Q

Stationarity

A

When the joint distribution of a time series variable and its lagged values does not change over time.

235
Q

Statistically insignificant

A

The null hypothesis (typically, that a regression coefficient is zero) cannot be rejected at a given significance level.

236
Q

Statistically significant

A

The null hypothesis (typically, that a regression coefficient is zero) is rejected at a given significance level.

237
Q

Stochastic trend

A

A persistent but random long-term movement of a variable over time.

238
Q

Strict exogeneity

A

The requirement that the regression error has a mean of zero conditional on current, future, and past values of the regressor in a distributed lag model.

239
Q

Student t distribution

A

The Student t distribution with m degrees of freedom is the distribution of the ratio of a standard normal random variable to the square root of an independently distributed chi-squared random variable with m degrees of freedom divided by m. As m gets large, the Student t distribution converges to the standard normal distribution.

240
Q

Sum of squared residuals (SSR)

A

The sum of the squared OLS residuals.

241
Q

t-distribution

A

See Student t distribution.

242
Q

t-ratio

A

See t-statistic.

243
Q

t-statistic

A

A statistic used for hypothesis testing. See Key Concept 5.1.
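
A minimal sketch (numpy only; simulated data) of the usual form t = (estimator − hypothesized value)/standard error, here testing H0: μY = 0 with the sample mean:

    import numpy as np

    rng = np.random.default_rng(8)
    y = rng.normal(loc=0.2, scale=1.0, size=400)  # sample whose true mean is 0.2

    se = y.std(ddof=1) / np.sqrt(y.size)          # standard error of the sample mean
    t = (y.mean() - 0.0) / se                     # H0: mu_Y = 0
    print(t, abs(t) > 1.96)                       # reject at the 5% level if |t| > 1.96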

244
Q

Test for a difference in means

A

A procedure for testing whether two populations have the same mean.

245
Q

Time effects

A

Binary variables indicating the time period in a panel data regression.

246
Q

Time and entity fixed effects regression model

A

A panel data regression that includes both entity fixed effects and time fixed effects.

247
Q

Time fixed effects

A

See time effects.

248
Q

Time series data

A

Data for the same entity for multiple time periods.

249
Q

Total sum of squares (TSS)

A

The sum of squared deviations of Yi from its average, Ȳ.

250
Q

Treatment effect

A

The causal effect in an experiment or a quasi-experiment; see causal effect.

251
Q

Treatment group

A

The group that receives the treatment or intervention in an experiment.

252
Q

TSLS

A

See two stage least squares.

253
Q

Two-sided alternative hypothesis

A

When, under the alternative hypothesis, the parameter of interest is not equal to the value given by the null hypothesis.

254
Q

Two stage least squares

A

An instrumental variable estimator, described in Key Concept 12.2.

255
Q

Type I error

A

In hypothesis testing, the error made when the null hypothesis is true but is rejected.

256
Q

Type II error

A

In hypothesis testing, the error made when the null hypothesis is false but is not rejected.

257
Q

Unbalanced panel

A

A panel data set in which some data are missing.

258
Q

Unbiased estimator

A

An estimator with a bias that is equal to zero.

259
Q

Uncorrelated

A

Two random variables are uncorrelated if their correlation is zero.

260
Q

Underidentification

A

When the number of instrumental variables is less than the number of endogenous regressors.

261
Q

Unit root

A

Refers to an autoregression with a largest root equal to 1.

262
Q

Unrestricted regression

A

When computing the homoskedasticity-only F-statistic, this is the regression that applies under the alternative hypothesis, so the coefficients are not restricted to satisfy the null hypothesis.

263
Q

VAR

A

See vector autoregression.

264
Q

Variance

A

The expected value of the squared difference between a random variable and its mean; the variance of Y is denoted σ²Y.

265
Q

Vector autoregression

A

A model of k time series variables consisting of k equations, one for each variable, in which the regressors in all equations are lagged values of all the variables.

266
Q

Volatility clustering

A

When a time series variable exhibits some clustered periods of high variance and other clustered periods of low variance.

267
Q

Weak instruments

A

Instrumental variables that have a low correlation with the endogenous regressor(s).

268
Q

Weighted least squares (WLS)

A

An alternative to OLS that can be used when the regression error is heteroskedastic and the form of the heteroskedasticity is known or can be estimated.