Statistics/Probability Flashcards
sample space
the set of all possible sample points for an experiment, e.g. S = {HH, TT, HT, TH} for two coin flips (H = heads, T = tails)
dependent events (probability)
events where the outcome of one changes the probability of the other, e.g. drawing marbles from a bag without replacement
Covariance
- When calculated between two variables, X and Y, it indicates how much the two variables change together.
- Cov(X,Y) = E[(X − E[X])(Y − E[Y])] = E[XY] − E[X]E[Y]
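A quick numerical check of the identity in R (illustrative data; note that R's built-in cov() uses the n − 1 sample denominator, so it is rescaled below to match the population form):

  x <- c(1, 2, 3, 4, 5)
  y <- c(2, 4, 5, 4, 5)
  mean(x * y) - mean(x) * mean(y)            # E[XY] - E[X]E[Y]
  mean((x - mean(x)) * (y - mean(y)))        # E[(X - EX)(Y - EY)]: same value
  cov(x, y) * (length(x) - 1) / length(x)    # cov() rescaled from n-1 to n: same value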
P–P plot
probability–probability plot or percent–percent plot or P value plot: probability plot for assessing how closely two data sets agree, or for assessing how closely a dataset fits a particular model.
It works by plotting the two cumulative distribution functions against each other; if they are similar, the plotted points will lie close to a straight line.
For input z the output is the pair of numbers giving what percentage of f and what percentage of g fall at or below z.
Q–Q plot
quantile–quantile plot: for comparing two probability distributions by plotting their quantiles against each other. A point (x, y) on the plot corresponds to one of the quantiles of the second distribution (y-coordinate) plotted against the same quantile of the first distribution (x-coordinate).
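A minimal base-R sketch with simulated data: qqnorm() compares a sample against Normal quantiles, qqplot() compares two samples directly:

  set.seed(1)
  x <- rt(200, df = 3)   # heavy-tailed sample
  qqnorm(x)              # sample quantiles vs. Normal quantiles
  qqline(x)              # reference line; heavy tails bend away from it
  y <- rnorm(200)
  qqplot(x, y)           # two-sample Q-Q plot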
PMF (Probability Mass Function)
A probability mass function (PMF) is a mathematical function that gives the probability that a discrete random variable takes a specific value. It assigns a particular probability to every possible value of the variable.
Can be shown as a table with one row per outcome and its probability, as in the sketch below.
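For example, the PMF of the number of heads in two fair coin flips, tabulated in R:

  outcomes <- 0:2
  probs <- dbinom(outcomes, size = 2, prob = 0.5)   # 0.25 0.50 0.25
  data.frame(outcome = outcomes, probability = probs)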
the conditional probability of a cancellation given snow:
P(Cancel∣Snow); the ‘∣’ is short for ‘given’
An event happens independently of a condition if
P(event∣condition)=P(event)
Kolmogorov-Smirnov (K-S) Test
non-parametric test that compares the empirical distribution of the data with a theoretical distribution.
It helps determine how well the theoretical distribution fits the data.
K-S Statistic
The K-S statistic measures the maximum distance between the empirical cumulative distribution function (ECDF) of your data and the cumulative distribution function (CDF) of the theoretical distribution.
In simpler terms, it quantifies the biggest difference between what you observed (your data) and what you would expect if the data followed the theoretical distribution.
The K-S statistic ranges from 0 to 1:
A smaller K-S statistic indicates that the empirical distribution is very close to the theoretical distribution.
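In R, ks.test() reports the K-S statistic D together with a p-value; a short sketch with simulated data:

  set.seed(1)
  x <- rnorm(100)
  ks.test(x, "pnorm", mean = 0, sd = 1)   # one-sample: data vs. theoretical N(0,1)
  y <- runif(100)
  ks.test(x, y)                           # two-sample: compares two ECDFs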
outcome = model + error –> what are the parts called?
model = systematic part, error = unsystematic part
Descriptive Statistics
collect, organize, display, analyze, etc.
Inferential Statistics
- Predict and forecast values of population parameters
- Test hypotheses and draw conclusions about values of population parameters
- Make decisions
Central Tendency
1st moment - mean, median, mode
Spread
2nd moment - MAD, Variance, SD, coefficient of variation (CV = SD/mean), range, IQR
Skewness
3rd moment - measure of asymmetry: positive skew (tail pointing to high values, body of the distribution to the left), negative skew (tail pointing to low values, body to the right)
Kurtosis
4th moment - Measure of heaviness of the tails, leptokurtic (heavy tails), platykurtic (light tails)
What kurtosis does a normal distribution have?
3 (mesokurtic)
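The four moments computed by hand in R (population versions, dividing by n; data simulated for illustration):

  set.seed(1)
  x <- rexp(1000)                  # right-skewed sample
  m <- mean(x)                     # 1st moment: central tendency
  v <- mean((x - m)^2)             # 2nd moment: spread (population variance)
  skew <- mean((x - m)^3) / v^1.5  # 3rd moment: positive here (right tail)
  kurt <- mean((x - m)^4) / v^2    # 4th moment: 3 for a Normal, > 3 = leptokurtic
  c(mean = m, var = v, skew = skew, kurt = kurt)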
statistical properties of prices vs. returns:
prices are not predictable (they are non-stationary); returns are “stationary”, and hence predictable in the statistical sense
Standard Error calculation & meaning
- SE = SD / √n
- The standard deviation measures the dispersion of the data around the mean; the standard error can be thought of as the dispersion of the sample-mean estimates around the true population mean (see the sketch below)
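A simulation sketch of that interpretation (parameters are illustrative): the formula SD/√n matches the observed dispersion of many simulated sample means:

  set.seed(1)
  n <- 25
  means <- replicate(10000, mean(rnorm(n, mean = 0, sd = 2)))
  sd(means)      # dispersion of the sample means...
  2 / sqrt(n)    # ...matches SE = SD / sqrt(n) = 0.4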
Sample standard deviation
𝑠
Population standard deviation
𝜎 (sigma)
Central Limit Theorem
states that the distribution of the sample mean, X̄, will approach a Normal distribution as the sample size 𝑛 increases (𝑛 ≥ 30)
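A quick illustration in R: even for a skewed parent distribution (exponential), the means of samples of size 30 look approximately Normal:

  set.seed(1)
  means <- replicate(10000, mean(rexp(30, rate = 1)))
  hist(means, breaks = 50)       # roughly bell-shaped, centred on 1
  qqnorm(means); qqline(means)   # points hug the line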
Sample variance - do you use n or n-1?
n-1
Random variable:
𝑋
Cumulative Distribution Function of the Standard Normal:
Φ(z)
Pivotal distribution
N(0,1)
Population mean - greek letter:
μ (mu)
Confidence interval
x̄ ± z-critical value × σ/√n (use s and the t-critical value when σ is unknown); σ/√n is the standard error of the mean
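A worked sketch in R with made-up data (σ unknown, so s and the t-critical value are used):

  x <- c(12.1, 11.8, 12.5, 12.0, 11.6, 12.3, 12.2, 11.9)
  n <- length(x)
  t_crit <- qt(0.975, df = n - 1)                # 95% two-sided
  mean(x) + c(-1, 1) * t_crit * sd(x) / sqrt(n)  # lower and upper bounds
  t.test(x)$conf.int                             # same interval from base R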
In the sample, you approximate mu and sigma with…
x̄ (sample mean) and s (sample standard deviation)
Particular observation of a Standard Normal (also known as ‘z-critical value’):
z
Parameter of the 𝒕-distribution (also known as ‘degrees of freedom’):
𝜈 (nu)
t-critical value
t
Important: are you given sigma or s?
Even if n < 30, if you are given σ (the population SD), you can use σ with the z-statistic
t-distribution
- Has thicker tails than the Normal (i.e. larger chance of extreme events).
- Its shape depends on a single parameter “nu”, 𝜈 = 𝑛 − 1, where n is the number of observations.
- Assumption: the t-distribution assumes that the data originates from a Normal Distribution.
3 main types of distribution
Gaussian, Poisson, Chi-square
Statistical stationarity:
A stationary time series is one whose statistical properties such as mean, variance, autocorrelation, etc. are all constant over time.
get the probability at or below a z-value - function
xpnorm(q, mean = …, sd = …), which returns P(X ≤ q); q is the value, not a probability (xpnorm is from the mosaic package)
given a particular probability of 𝑍 < 𝑧, what is the corresponding value 𝑧?
qnorm(p, mean = …, sd = …), where p is the given probability
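In base R the same pair is pnorm()/qnorm() (xpnorm is the mosaic package's annotated wrapper around pnorm); a quick round-trip sketch:

  pnorm(1.96, mean = 0, sd = 1)    # P(Z <= 1.96) = 0.975
  qnorm(0.975, mean = 0, sd = 1)   # inverse: returns 1.96
  qnorm(pnorm(1.3))                # round trip recovers 1.3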
Standard deviation - what it measures; population vs sample calculation
Measures the amount of variability within a single dataset. The population SD divides by n; the sample SD divides by n − 1 (see the sketch below).
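A small comparison in R (illustrative numbers); R's sd() uses the n − 1 denominator:

  x <- c(4, 8, 6, 5, 3)
  n <- length(x)
  sd(x)                          # sample SD (divides by n - 1)
  sqrt(mean((x - mean(x))^2))    # population SD (divides by n)
  sd(x) * sqrt((n - 1) / n)      # rescaling one into the other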
What is variance
the expected value of the squared deviation from the mean of a random variable
Null Hypothesis vs Alternative Hypothesis
- Null Hypothesis 𝐻0: belief about the true population parameter value –> rejected if the difference between sample means (or between the sample mean and the hypothesised value) is bigger than would be expected by chance
- Alternative Hypothesis 𝐻1: the competing claim, supported when 𝐻0 is rejected
Significance Level - letter
alpha –> probability of rejecting the null hypothesis when it is true
Critical value / Cutoff point
z-critical value or t-critical value –> ±z/t values which act as cutoff points beyond which the null hypothesis should be rejected
p-value: 𝑝
probability of obtaining a value of the test statistic as extreme as, or more extreme than, the actual value obtained, when the null hypothesis is true
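For a two-sided z-test this is computed as (test-statistic value is illustrative):

  z <- 2.1
  2 * pnorm(-abs(z))   # = 0.036 < 0.05, so reject H0 at alpha = 0.05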
How to report results for statistical hypothesis testing:
- do not say: “we accept the null hypothesis as truth”
- say instead: “we cannot reject the null hypothesis”
Hypothesis:
is a statement of assertion about the true value of an unknown population parameter (e.g. 𝜇 = 100)
Null Hypothesis Test - performed using a test statistic, which is a standardized value derived from sample data, e.g. the standardized value of the sample mean
When should you use the z-statistic vs. the t-statistic in hypothesis testing?
Use the t-statistic when 𝑛 < 30 and 𝜎 is unknown; otherwise the z-statistic (see ‘Calculating CI Cutoff Points’ below).
Types of Statistical Errors
- Type I error: rejecting 𝐻0 when it is true (probability α)
- Type II error: failing to reject 𝐻0 when it is false (probability β)
Conventions in Your Industry regarding alpha
Calculating CI Cutoff Points
Use the t-distribution when 𝑛 < 30 and 𝜎 is unknown; otherwise use 𝑧. In practice, we can use 𝑡 for all cases.
R-functions for the Standard Normal Distribution and t-Distribution
qnorm()/pnorm() for the Normal; qt()/pt() for the t-distribution (see the sketch below)
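A short sketch; the t-critical value is larger than z for small n and approaches it as n grows:

  qnorm(0.975)                    # z-critical for 95% two-sided: 1.96
  qt(0.975, df = 9)               # t-critical for n = 10: 2.26 (thicker tails)
  qt(0.975, df = 200)             # large n: close to the z value
  pnorm(1.96); pt(1.96, df = 9)   # the corresponding probabilities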
3 equivalent ways of testing a hypothesis:
- compare the test statistic with the critical value
- compare the p-value with α
- check whether the hypothesised value lies inside the confidence interval
Does correlation reflect nonlinear relationships?
No - (Pearson) correlation captures only linear relationships
True Dependent Variable and Estimated Dependent Variable
y (true) and ŷ (‘y-hat’, fitted/estimated)
True Coefficient and Estimated Coefficient
𝛽 (true) and 𝛽̂ (‘beta-hat’, estimated)
Residual Error and Residual Standard Error
e_i = y_i − ŷ_i (residual error) and RSE (the typical size of the residuals)
Number of observations and number of independent variables
𝑛 and 𝑝
Coefficient of Determination / R Squared and Adjusted R Squared
𝑅2 and 𝑅2adj
Coefficient 𝒊’s Standard Error
SE(𝛽̂i), the standard error of the estimated coefficient 𝛽̂i
What do regression diagnostics look for?
Violations of the error assumptions (see ‘Possible Reasons For Systematic Errors’ below) and testing for “significant” relationships.
assumed true model vs fitted model - how do you write the coefficients in the regression equations?
True model: 𝑦 = 𝛽0 + 𝛽1𝑥 + 𝜀 (Greek letters, with an error term). Fitted model: ŷ = 𝛽̂0 + 𝛽̂1𝑥 (hats on the estimates, no error term).
Different names for Y and x
Y: dependent / response / explained variable; x: independent / explanatory / predictor variable (regressor)
Fitted Model: Time-Series With Lagged Variables and Fitted Model: Autoregression - examples of how they could look:
- lagged variables (distributed lag): ŷ_t = 𝛽̂0 + 𝛽̂1𝑥_t + 𝛽̂2𝑥_(t−1)
- autoregression: ŷ_t = 𝛽̂0 + 𝛽̂1𝑦_(t−1)
OLS minimizes…
the Sum of Squared Errors (SSE) with respect to regression coefficients 𝛽0, 𝛽1
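A sketch with simulated data: minimizing the SSE numerically reproduces lm()'s coefficients:

  set.seed(1)
  x <- runif(50)
  y <- 2 + 3 * x + rnorm(50, sd = 0.5)
  sse <- function(b) sum((y - b[1] - b[2] * x)^2)   # SSE as a function of (b0, b1)
  optim(c(0, 0), sse)$par   # numeric minimizer: approx. (2, 3)
  coef(lm(y ~ x))           # OLS closed form: same values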
Residual Standard Error (RSE) - calculation/formula
- RSE = √(SSE / (𝑛 − 𝑝 − 1)), i.e. roughly the square root of the average squared residual
- where 𝑛 is the number of observations and 𝑝 the number of independent variables
What does 𝑅2 show?
The proportion of total variation of Y that is explained by the model (i.e. by the independent variable(s))
𝑅2 - calculation
𝑅2 = 1 − SSE/SST = (SST − SSE)/SST, the fraction of the total variation of Y explained by the model
Adjusted R2 - calculation/meaning
- 𝑅2adj = 𝑅2 − (1 − 𝑅2) · 𝑝/(𝑛 − 𝑝 − 1)
- If 𝑛 is very large and 𝑝 is very small, the ratio 𝑝/(𝑛 − 𝑝 − 1) is close to zero, and 𝑅2 ≈ 𝑅2adj
- As the number of inputs (𝑥’s) increases, 𝑅2 typically increases regardless of whether the variables are useful for prediction
- Adjusted 𝑅2 will only increase if the new 𝑥 variable improves the model more than would be expected by chance.
linear regression model
lm(y ~ x, data = name)
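A fuller sketch with simulated data, showing where the quantities from the cards above appear in the output:

  set.seed(1)
  d <- data.frame(x = runif(100))
  d$y <- 1 + 2 * d$x + rnorm(100, sd = 0.3)
  fit <- lm(y ~ x, data = d)
  summary(fit)   # coefficients, their SEs, t-ratios, RSE, R^2, adjusted R^2
  sqrt(sum(resid(fit)^2) / (100 - 1 - 1))   # RSE by hand; equals summary(fit)$sigma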
Assumptions for errors in linear regression models:
independent, Normally distributed, mean zero, constant (homoscedastic) variance
What is constant, time-independent variance of the errors in a linear regression model called?
homoscedasticity
What can non-normal residuals indicate?
nonlinearity present, interactions between independent variables, outliers
Possible Reasons For Systematic Errors in linear regression models:
- Nonlinearity: systematic pattern in the residuals
- Heteroscedasticity: variance of errors changes across levels of independent variable
- Autocorrelation: errors in one period are correlated with errors in another period
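Base R draws the standard diagnostic plots for any fitted lm object; a sketch using the built-in cars dataset:

  fit <- lm(dist ~ speed, data = cars)
  plot(fit, which = 1)   # residuals vs fitted: look for patterns / funnel shapes
  plot(fit, which = 2)   # Normal Q-Q of residuals: check normality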
Common Nonlinear Transformations
- 1/𝑥 Relationship
- square root(𝑥) Relationship
- x^2 Relationship
- Power (𝑥^𝑏) Relationship
We could say that the regression line has reduced our uncertainty, as measured by variances, from … to …
- from the unconditional variance s²y
- to the conditional variance s²e
- That is, a reduction of s²y − s²e
- The reduction expressed as a fraction of s²y is called R-squared or R²
A regression output usually reports the ratios between â and its standard deviation (standard error), and between b̂ and its standard deviation, which are referred to as “t-ratios”, i.e. t = â / SE(â) and t = b̂ / SE(b̂).
If the residual distribution has very “fat tails”, i.e. many more extreme values than you would like to see, it may be appropriate to think of using an alternative estimation technique, such as:
- Least Absolute Value rather than Least Squares
- This approach will weight extreme values less, although a different set of diagnostics will then have to be used
Common Nonlinear Transformations
- 𝑥2 Relationship - Example: return from an investment (Y) increases quadratically with an increasing investment (x). This could happen due to aggressive reinvestment and compounding returns.
- square root(𝑥) Relationship - Example: stock volatility (𝑌) increases with volume (𝑥) at a decreasing rate, i.e. y = square root(x)
What is an Interaction Term?
- an independent variable in a regression model that is a product of two independent variables
- Sometimes the partial effect of the dependent variable with respect to one independent variable can depend on the magnitude of another independent variable (see the sketch below)
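In R's formula syntax, x1:x2 is the product term and x1 * x2 expands to x1 + x2 + x1:x2 (keeping the main effects, per the hierarchy principle below). A sketch with simulated data:

  set.seed(1)
  d <- data.frame(x1 = runif(80), x2 = runif(80))
  d$y <- 1 + 2 * d$x1 + 0.5 * d$x2 + 3 * d$x1 * d$x2 + rnorm(80, sd = 0.2)
  coef(lm(y ~ x1 * x2, data = d))   # recovers all four coefficients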
Hierarchy Principle - Interaction Term
If an interaction term 𝑥1𝑥2 is included in the model, the main effects 𝑥1 and 𝑥2 should also be kept, even if their own coefficients are not significant.
What is Multicollinearity? Effects?
- appears when independent variables used inside the regression equation are highly correlated
- Effects: fit is not improved much; additional variables add little information; two or more variables may appear insignificant together, yet highly significant if one of them is dropped; parameter estimates are unreliable
Dummy Variable:
- A variable that takes on a value of 0 or 1
- Example: 1 = war year, 0 = no war
- When a categorical variable has k categories (we call them ‘levels’), we include only k − 1 dummy variables in the regression model.
- The category that is left out is usually the one with the most frequent observations and it acts as a reference.
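R handles this automatically for factors: a k-level factor enters the model as k − 1 dummies, with the first level as the reference (data below are made up):

  d <- data.frame(
    year_type = factor(c("peace", "war", "peace", "peace", "war")),
    gdp = c(2.1, 1.4, 2.3, 2.0, 1.2)
  )
  model.matrix(~ year_type, data = d)   # intercept + a single 0/1 dummy for "war"
  # relevel(d$year_type, ref = "war") would change the reference group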
Distributed lag model
A model for time-series data in which a regression equation predicts current values of a dependent variable from the current values of an independent variable and/or its lagged (past-period) values.
Nonparametric statistics
statistical method in which the data are not assumed to come from prescribed models that are determined by a small number of parameters, such as the normal distribution model and the linear regression model
RISK analysis toolbox in Excel - overview
RISK built-in correlation
What is linear programming?
Linear programming is a special type of optimization model that maximizes or minimizes a linear objective function subject to constraints expressed as linear equations or inequalities, solved simultaneously (see the sketch below).
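A minimal sketch assuming the lpSolve package (the objective and constraints are made-up numbers):

  library(lpSolve)
  # maximize 3x + 5y subject to x + 2y <= 14, 3x - y >= 0, x - y <= 2
  res <- lp("max", c(3, 5),
            matrix(c(1, 2, 3, -1, 1, -1), nrow = 3, byrow = TRUE),
            c("<=", ">=", "<="), c(14, 0, 2),
            compute.sens = TRUE)
  res$solution   # optimal (x, y) = (6, 4)
  res$duals      # shadow prices of the constraints (see the cards below)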
How do you calculate the portfolio variance for 3 stocks given weights, correlations, and SDs? What does the matrix multiplication look like?
σ²p = wᵀΣw, where w is the weight vector and Σ the covariance matrix with entries Σij = ρij σi σj (see the sketch below).
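The computation in R with illustrative numbers:

  w <- c(0.5, 0.3, 0.2)                    # weights
  s <- c(0.20, 0.15, 0.10)                 # standard deviations
  R <- matrix(c(1.0, 0.3, 0.1,
                0.3, 1.0, 0.4,
                0.1, 0.4, 1.0), nrow = 3)  # correlation matrix
  Sigma <- diag(s) %*% R %*% diag(s)       # covariance: Sigma_ij = rho_ij * s_i * s_j
  t(w) %*% Sigma %*% w                     # portfolio variance w' Sigma w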
What is the shadow price?
The amount of profit that an additional unit of available resources would yield
What shadow price will a non-binding constraint have?
A shadow price of 0 because there is already an excess of the resource
An inequality constraint is binding if…
…the solution makes it an equality. Otherwise, it is nonbinding.
Optimization model: The positive difference between the two sides of the constraint is called…
the slack
Slack in an optimization model
Surplus amount of a limited resource. Constraints with zero slack are binding. These are the important constraints that influence the optimal solution
Shadow (Dual) Price in an optimization model:
The amount by which the objective value will improve if we relax the constraint by one unit. It is the most we are willing to pay to obtain an additional unit of that resource.
Reduced Cost in an optimization model:
- The deterioration in the objective function if we force one unit of a sub-optimal activity (zero level) into the optimal solution. Alternatively, the amount by which we need to improve the contribution of sub-optimal activities before they could enter the optimal solution on their own merit.
- A reduced cost for a product not in the optimal mix indicates how much greater its margin would have to be before it would enter the optimal mix.