Statistical Models Flashcards

1
Q

Define parameters in regression equations (slope, Y-intercept)

A

The intercept and weight (slope) values are called the parameters of the model.

2
Q

What are the weight values?

A

Regression weights, or regression coefficients. A weight is the slope of the line relating the dependent Y [reliant] to the independent X [controlled]: the expected change in Y for a one-unit change in X. Y is assumed to have a constant standard deviation over multiple observations, and error is minimized by calculating least-squares estimates [equivalently, the correlation multiplied by the SD of Y over the SD of X].
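A quick Python sketch (data values are hypothetical) showing that the least-squares slope equals the correlation multiplied by the SD of Y over the SD of X:

```python
# Hypothetical data for a simple linear model Y = a + bX.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.0, 4.1, 5.9, 8.2, 9.8]

n = len(x)
mx = sum(x) / n
my = sum(y) / n

# Least-squares slope: covariance of X and Y over the variance of X.
b = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y)) / sum((xi - mx) ** 2 for xi in x)

# Equivalent form: correlation times (SD of Y / SD of X).
sx = (sum((xi - mx) ** 2 for xi in x) / (n - 1)) ** 0.5
sy = (sum((yi - my) ** 2 for yi in y) / (n - 1)) ** 0.5
r = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y)) / ((n - 1) * sx * sy)

print(round(b, 4), round(r * sy / sx, 4))  # the two slopes agree
```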

3
Q

What is the intercept?

A

The predicted value of Y when X = 0. It tells you nothing about the relationship b/w X and Y; it serves as a constant that indicates where the regression line crosses the y-axis.

4
Q

What do these two parameters estimate?

A

The slope and the intercept define the linear relationship between two variables, and can be used to estimate an average rate of change.

5
Q

What are predicted values?

A

The values of Y-hat, or the values of the dependent variable assumed by the parameters of the model.

6
Q

What are observed values?

A

The values of Y, or the actual values of the dependent variable.

7
Q

How should we understand model fit?

A

Model fit is a measurement/representation of how well the actual values of Y correspond to the predicted values of Y.

8
Q

What specific equation is used to assess/estimate model fit?

A

Error variance, or the average squared deviation of the predicted values from the observed values of Y. It is used because the degree of error in the model tells us how far the predicted values stray from reality.
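A minimal Python sketch (the numbers are made up) of the error-variance calculation:

```python
# Error variance: the average squared difference between observed
# and predicted values of Y (example numbers are made up).
y_obs  = [2.0, 4.0, 6.0, 8.0]
y_pred = [2.5, 3.5, 6.5, 7.5]

error_variance = sum((o - p) ** 2 for o, p in zip(y_obs, y_pred)) / len(y_obs)
print(error_variance)  # 0.25
```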

9
Q

What is good error variance?

A

Small, minor differences in variability that do not detract from or misrepresent the relationship b/w Y and Y-hat. Good error variance is not necessarily explainable, but it either does not affect how accurate the model is assumed to be or can be corrected for.

10
Q

What is bad error variance?

A

Differences in variability that are not explainable and that confound the relationship b/w Y and Y-hat. In this case, the error we are attempting to examine may be related to entirely different variables, or may not actually be relevant to the question the model is attempting to represent.

11
Q

What does least squares estimation do?

A

Finds the line that makes the sum of squared errors [the squared differences b/w the observed and the predicted values of Y] as small as possible, and thus minimizes the total compounded error.
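A short Python sketch (hypothetical data) of the closed-form least-squares estimates, checking that the fitted line yields a sum of squared errors no larger than nearby alternative lines:

```python
# Closed-form least-squares estimates for Y = a + bX (data made up).
x = [0.0, 1.0, 2.0, 3.0, 4.0]
y = [1.2, 2.9, 5.1, 7.0, 9.1]

n = len(x)
mx, my = sum(x) / n, sum(y) / n

b = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y)) / sum((xi - mx) ** 2 for xi in x)
a = my - b * mx  # intercept

def sse(a_, b_):
    """Sum of squared errors for the line Y-hat = a_ + b_ * X."""
    return sum((yi - (a_ + b_ * xi)) ** 2 for xi, yi in zip(x, y))

# Any other line gives a sum of squared errors at least as large.
print(sse(a, b) <= sse(a + 0.5, b))  # True
print(sse(a, b) <= sse(a, b - 0.5))  # True
```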

12
Q

What is a residual?

A

The error estimate for a single observation: the difference b/w the observed value of Y and the predicted value Y-hat. Residuals arise because the model cannot perfectly reconstruct the actual data, since it cannot realistically represent the full population error.

13
Q

What is error variance?

A

The average squared difference b/w the predicted and the observed values.

14
Q

What is R^2?

A

A squared correlation that represents the proportion of the variance in Y that is accounted for by the model. It estimates the strength of the relationship b/w the model and the response variable.
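A brief Python sketch (the fitted values are made up) computing R^2 as one minus the unexplained share of the variance in Y:

```python
# R^2: proportion of the variance in Y accounted for by the model.
y     = [1.0, 2.0, 3.0, 4.0, 5.0]
y_hat = [1.2, 1.8, 3.1, 3.9, 5.0]  # hypothetical predicted values

my = sum(y) / len(y)
ss_total = sum((yi - my) ** 2 for yi in y)                  # total variation in Y
ss_error = sum((yi - yh) ** 2 for yi, yh in zip(y, y_hat))  # unexplained variation

r_squared = 1 - ss_error / ss_total
print(round(r_squared, 2))  # 0.99
```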

15
Q

Explain the purpose of statistical modeling.

A

To provide a mathematical representation of theories [about the relationships b/w different factors].

16
Q

Explain how changing different parts of a linear equation will alter the model.

A

The model changes depending on which parameter of the equation is altered, and the effect differs for linear and quadratic models.

Linear:

a-value: the elevation of the line changes; every predicted Y value shifts up or down by the same amount.

b-value: the line becomes steeper or less steep.

Quadratic:

a-value: the elevation of the curve changes.

b-value: the curve becomes steeper or shallower.
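The linear case can be sketched in a few lines of Python (illustrative values only):

```python
# Changing a shifts the line's elevation; changing b alters its steepness.
def line(x, a, b):
    return a + b * x  # Y = a + bX

print(line(2, a=1, b=1))  # 3
print(line(2, a=3, b=1))  # 5  (same slope, higher elevation)
print(line(2, a=1, b=2))  # 5  (same intercept, steeper line)
```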

17
Q

What if the slope is equal to zero?

A

The predicted values of Y are all equal to a, regardless of X.

18
Q

Explain the goals of parameter estimation.

A

To find weight and intercept values that allow us to calculate the smallest discrepancy between the predicted Y values and the actual values of Y.

19
Q

Explain the least-squares estimation.

A

Finds the line that makes the sum of squared errors [the squared differences b/w the observed and the predicted values of Y] as small as possible, and thus minimizes the total compounded error.

20
Q

Explain 3 sources of error variance.

A

Sampling error - error caused by observing a sample instead of the whole population.

Incomplete Model/Missing Variables - variables that mattered, but were not identified or applied to the model.

Imprecision in measurement - nonsystematic problems in how data was collected, how subjects interpreted the questions of interest, or w/ the process of experimentation. Random error has a normal distribution; high and low scores are roughly symmetrical.

21
Q

Explain what the best guess of the y-intercept would be if we have no information about x-values.

A

The mean of Y, because it is the point at which the sum of the deviations above that point are balanced by the sum of the deviations below that point. (e.g. a least squares statistic) The mean is the only value that makes the variance as small as possible. This is equivalent to ignoring any X serving as a predictor variable in the model.
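A small Python check (made-up values) that the mean minimizes the sum of squared deviations:

```python
# The mean of Y is the least-squares "best guess" when X is ignored.
y = [3.0, 5.0, 7.0, 9.0]
mean_y = sum(y) / len(y)  # 6.0

def ssd(guess):
    """Sum of squared deviations of Y from a single guessed value."""
    return sum((yi - guess) ** 2 for yi in y)

# Guessing the mean beats guessing any other single value.
print(all(ssd(mean_y) <= ssd(g) for g in [4.0, 5.0, 6.5, 8.0]))  # True
```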

22
Q

Explain the goal of model comparison.

A

To approximate which of multiple theories gives a better account of the data and better represents reality.

23
Q

Explain why R-squared is useful.

A

R^2 is useful b/c it is a standard metric for interpreting model fit. It evaluates the proportion of the variance of Y accounted for by the model relative to the actual variance in Y.

24
Q

Identify parameters in a model, given an equation.

A

Linear: Y = a + bX | Quadratic: Y = a + bX^2

a = Y-intercept (the value of Y when X = 0)
b = slope (rise over run, the steepness of the line); a weight
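Both model forms are easy to evaluate in Python (parameter values are made up):

```python
a, b = 2.0, 0.5  # a = Y-intercept, b = slope/weight (hypothetical values)

def linear(x):
    return a + b * x       # Y = a + bX

def quadratic(x):
    return a + b * x ** 2  # Y = a + bX^2

print(linear(0), quadratic(0))  # 2.0 2.0  (both equal a when X = 0)
print(linear(4), quadratic(4))  # 4.0 10.0
```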
25
Q

Identify independent and dependent terms in a model, and identify the assumptions we are making about causal modeling.

A

The dependent Y [reliant] has a quantitative relationship with the independent X [controlled]. We are assuming that the predictor X-value has a relationship to the unknown Y, and that there is a causal interaction [i.e. X impacts the possibility of Y occurring].

26
Q

Compare predicted vs. observed values.

A

1. Predicted - Y-hat. A value of Y implied by the parameters used to estimate the model; in general, an estimate of what Y should be if the model holds.

2. Observed - actual Y. How the Y value plays out in reality, as opposed to in the model.

27
Q

Compare “good” vs. “bad” model fit (perfect vs. worst)?

A

Good/Perfect - an R^2 of 1, w/ the model accounting for all of the variance in Y (error variance is 0).

Bad/Worst - an R^2 of 0, w/ the model accounting for none of the variance in Y, as if no predictor variables were involved in the relationship of X to Y. Effectively, the error variance is the same as the variance in Y.
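The two extremes can be checked in Python (hypothetical data):

```python
# R^2 = 1 for a perfect model; R^2 = 0 for a mean-only model.
y = [2.0, 4.0, 6.0, 8.0]
my = sum(y) / len(y)

def r_squared(y_hat):
    ss_err = sum((yi - yh) ** 2 for yi, yh in zip(y, y_hat))
    ss_tot = sum((yi - my) ** 2 for yi in y)
    return 1 - ss_err / ss_tot

print(r_squared(y))              # 1.0  (perfect: predictions match exactly)
print(r_squared([my] * len(y)))  # 0.0  (worst: predicting only the mean)
```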

28
Q

Compare error variance vs. R-squared.

A

Error variance: the average squared error; the average squared difference b/w the predicted values and the actual values of Y.

R^2: the squared correlation b/w the predicted and the observed values; the proportion of the variance in Y accounted for by the model relative to the actual variance in Y.

29
Q

What are two questions to ask when attempting to model an accurate estimation of a theory?

A
  1. Is a linear or a quadratic model more appropriate to capture the data we want to describe?
  2. Are the parameters correct, and do they accurately represent the data we want to describe?
30
Q

What is the value of error variance when the model is perfect?

A

0.

31
Q

What is the value of error variance when a model is performing as badly as possible?

A

The variance of Y. At worst, the model predicts only the mean of the sample or population for every observation, so its error variance equals the actual variance of Y.

32
Q

Why is the mean the worst a model can do?

A

Because it is a least squares statistic, it represents the point where the sum of the deviations above are balanced by the sum of the deviations below (i.e. made to equal 0). This makes the variance as small as possible, which is the purpose of least squares statistics.

In other words: our model is predicting no better than our “best” guess might be able to.

33
Q

What does it mean when our model can only predict the mean value?

A

Our model is producing an error variance that is as large as the actual variance of Y.
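A quick Python confirmation (made-up values) that a mean-only model's error variance equals the variance of Y:

```python
# When every prediction is the mean of Y, error variance = variance of Y.
y = [1.0, 3.0, 5.0, 7.0]
my = sum(y) / len(y)

var_y = sum((yi - my) ** 2 for yi in y) / len(y)  # actual variance of Y
y_hat = [my] * len(y)                             # mean-only predictions
err_var = sum((yi - yh) ** 2 for yi, yh in zip(y, y_hat)) / len(y)

print(var_y, err_var)  # 5.0 5.0
```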

34
Q

What are 3 advantages of using R squared?

A
  1. It doesn’t matter how large the variance of Y is because everything is evaluated relative to the variance of Y
  2. Set end-points: 1 is perfect and 0 is as bad as a model can be
  3. It is standardized
35
Q

What do theories typically specify?

A

The form of the relationship b/w variables.
