PSCH 443 - Midterm Flashcards

1
Q

What is covariance?

A

the average product of deviation scores; an index of how much scores covary together, expressed in the variables' raw (unstandardized) units. This can be understood as the extent of the shared variability b/w the variables in the model.
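A minimal sketch of this computation, assuming NumPy and made-up scores:

    import numpy as np

    x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])    # hypothetical scores on X
    y = np.array([2.0, 1.0, 4.0, 3.0, 5.0])    # hypothetical scores on Y

    dev_x = x - x.mean()                        # deviation scores for X
    dev_y = y - y.mean()                        # deviation scores for Y
    cov = (dev_x * dev_y).sum() / (len(x) - 1)  # average product of deviations

    print(cov, np.cov(x, y, ddof=1)[0, 1])      # matches NumPy's built-in (2.0)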

2
Q

What is correlation?

A

the average product of z-scores; a standardized measure of covariance, expressed as an association value that falls b/w -1 and +1.
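A minimal sketch, again assuming NumPy and the same made-up scores as above, showing correlation as the average product of z-scores:

    import numpy as np

    x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
    y = np.array([2.0, 1.0, 4.0, 3.0, 5.0])

    zx = (x - x.mean()) / x.std(ddof=1)   # z-scores for X
    zy = (y - y.mean()) / y.std(ddof=1)   # z-scores for Y
    r = (zx * zy).sum() / (len(x) - 1)    # average product of z-scores

    print(r, np.corrcoef(x, y)[0, 1])     # matches NumPy's built-in r (0.8)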

3
Q

What is slope?

A

the b (beta) weight; the predicted change in Y-hat per one-unit change in the predictor; "steepness of the line."

4
Q

What is the intercept?

A

the constant; the value of Y when X = 0; “elevation of the line.”

5
Q

What is error variance?

A

the average of the squared differences b/w the actual values of Y and the predicted values (Y-hat) from the model.

6
Q

What is SSM?

A

the model sum of squares; the variation in Y that our model captures.

7
Q

What is SSRes?

A

the residual sum of squares; the sum of the squared errors (error/residual variation).

8
Q

What is SST?

A

the total sum of squares; the total variance in Y, i.e., all the variability in Y that could possibly be accounted for, outside of (and including) the predictor variable(s).

9
Q

What is R^2?

A

the proportion of the variance in Y that is accounted for by the model.
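A sketch (NumPy, made-up data) tying SSM, SSRes, and SST together and computing R^2 from them:

    import numpy as np

    x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
    y = np.array([2.0, 1.0, 4.0, 3.0, 5.0])

    b, a = np.polyfit(x, y, 1)              # least-squares slope and intercept
    y_hat = a + b * x                       # predicted values

    ss_t = ((y - y.mean()) ** 2).sum()      # SST: total variation in Y
    ss_m = ((y_hat - y.mean()) ** 2).sum()  # SSM: variation the model captures
    ss_res = ((y - y_hat) ** 2).sum()       # SSRes: residual variation

    print(ss_t, ss_m + ss_res)              # SST = SSM + SSRes
    print(ss_m / ss_t)                      # R^2 = SSM / SST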

10
Q

What is F?

A

a test of the significance of a group of variables (the model as a whole); an index of how probable it is that a result in the model is or is not due to sampling error.

11
Q

What is MSm?

A

the model mean square; the variance in Y our model accounts for (SSM divided by its degrees of freedom).

12
Q

What is MSRes?

A

the residual mean square; the error (residual) variance in the model (SSRes divided by its degrees of freedom). See the sketch below.
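A sketch (NumPy, continuing the made-up data above, with k = 1 predictor) of the mean squares and the F-ratio built from them:

    import numpy as np

    x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
    y = np.array([2.0, 1.0, 4.0, 3.0, 5.0])
    b, a = np.polyfit(x, y, 1)
    y_hat = a + b * x

    k, n = 1, len(y)
    ms_m = ((y_hat - y.mean()) ** 2).sum() / k       # MSm: SSM / df(model)
    ms_res = ((y - y_hat) ** 2).sum() / (n - k - 1)  # MSRes: SSRes / df(residual)
    print(ms_m / ms_res)                             # F = MSm / MSRes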

13
Q

What are brute-force methods of parameter estimation?

A

Gradient descent/brute force:

  1. Start with a viable parameter value
  2. Calculate the error using a slightly different value
  3. Continue moving the best-guess parameter value in the direction of the smallest error
  4. Repeat this process until the error is as small as it can be.

Effectively, this is just plugging in values until the smallest error is located (see the sketch below).
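A minimal brute-force sketch (NumPy, made-up data), nudging a slope estimate toward the smallest sum of squared errors:

    import numpy as np

    x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
    y = np.array([2.0, 1.0, 4.0, 3.0, 5.0])

    def sse(b):
        a = y.mean() - b * x.mean()        # intercept implied by slope b
        return ((y - (a + b * x)) ** 2).sum()

    b, step = 0.0, 0.1                     # start with a viable value
    for _ in range(1000):
        if sse(b + step) < sse(b):         # error at a slightly different value
            b += step                      # move toward the smaller error
        elif sse(b - step) < sse(b):
            b -= step
        else:
            step /= 2                      # refine until error is minimal
    print(b)                               # lands near the least-squares slope (0.8)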
14
Q

What is least-squares estimation?

A

Uses beta values to create a line of fit from the sample data; seeks to capture the smallest difference b/w the predicted and actual values of Y. In linear least-squares estimation, this line takes the form:

Ŷ = a + bX, where Ŷ is the predicted value, a is the constant, and b is the slope applied to the predictor's value.
  • the goal is to minimize error variance.
  • uses the correlation b/w X and Y, the standard deviations of X and Y, and the means of X and Y to calculate the least-squares estimates (see the sketch below).
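A sketch of those closed-form pieces (NumPy, made-up data):

    import numpy as np

    x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
    y = np.array([2.0, 1.0, 4.0, 3.0, 5.0])

    r = np.corrcoef(x, y)[0, 1]              # correlation b/w X and Y
    b = r * (y.std(ddof=1) / x.std(ddof=1))  # slope from r and the two SDs
    a = y.mean() - b * x.mean()              # intercept from the two means
    print(a, b)                              # the line of fit: Y-hat = a + bX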
15
Q

What is partial correlation?

A

Looks at the relationship between X and Y while holding a third variable (Z) constant; partials out the covariance of Z from both X and Y; sets aside the rest of the variability in the model to examine the specific relationship b/w those variables.

  • serves as a means to control for variance shared w/ a subset of the variables
  • does not provide as clear a picture of how the model does as a whole
16
Q

What is semipartial correlation?

A

Indicates the relationship between X and Y while holding the covariation of X and Z constant; Z is partialled out of X only, leaving Y whole. This is the foundation of multiple regression.

  • allows us to examine the unique effect of X on the whole of Y, while holding a third variable (Z) constant
  • we can assess the unique contribution of X relative to the whole of Y (i.e., the unique percentage of total variance in Y accounted for by X, holding the effects of any other variable(s) constant); see the sketch below
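A sketch (NumPy, made-up data) of the semipartial idea: residualize X on Z, then correlate that Z-free part of X with the whole of Y:

    import numpy as np

    rng = np.random.default_rng(0)
    z = rng.normal(size=100)
    x = z + rng.normal(size=100)           # X overlaps with Z
    y = x + z + rng.normal(size=100)       # Y relates to both

    bz, az = np.polyfit(z, x, 1)
    x_resid = x - (az + bz * z)            # part of X independent of Z
    print(np.corrcoef(x_resid, y)[0, 1])   # semipartial correlation of X with Y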
17
Q

What is multiple regression?

A

The ultimate goal is to model the influence of several predictors on an outcome variable. This model should account for:

  1. unique overlap of each predictor with the outcome
  2. degree of overlap between predictors
  3. extent to which the overlap between predictors overlaps with the outcome
  4. the overall degree to which the predictors explain the variability in the outcome.
18
Q

What is multiple correlation?

A

A measure of how well a given variable can be predicted using a linear function of a set of other variables; the correlation between the variable’s values and the best predictions that can be computed linearly from the predictive variables.

  • takes on a value b/w 0 and 1 (unlike r, it cannot be negative)
  • it is the correlation of Y and Ŷ (see the sketch below)
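A sketch (NumPy, made-up data) of multiple correlation as the correlation of Y with Ŷ from a two-predictor model:

    import numpy as np

    rng = np.random.default_rng(1)
    x1 = rng.normal(size=100)
    x2 = rng.normal(size=100)
    y = 2 * x1 + x2 + rng.normal(size=100)

    X = np.column_stack([np.ones(100), x1, x2])    # constant + two predictors
    coefs, *_ = np.linalg.lstsq(X, y, rcond=None)  # least-squares fit
    y_hat = X @ coefs
    print(np.corrcoef(y, y_hat)[0, 1])             # R, b/w 0 and 1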
19
Q

What are the basics of ANOVA?

A

Statistical models used to analyze the differences among/between group means and their associated procedures (such as "variation" among and between groups).

An F-statistic whose p-value is < 0.05 is considered significant; if not, we fail to reject the null.

ANOVA partitions the variance in the outcome according to the different group means used within the model.
20
Q

Explain the interpretation of regression coefficients (both b and beta).

A

b = Unstandardized Coefficient; the weight expressed in the units from the study itself.

    Ex. Interpretation: "For every point gained on the GRE, one shaves .002 years (less than a day) off of their completion time."

Beta = Standardized Coefficient; the weight expressed in SD units from the mean.

    Ex. Interpretation: "For every standard deviation increase in GRE score, we predict a .201 standard deviation reduction in years to complete the MS."
21
Q

SE (standard error) in regression

A

The measure of sampling error associated with each coefficient or predictor variable.

  • average amount we would expect each parameter to vary if we were to take repeated samples
  • affected by sample size; larger sample size means less error
  • ideally standard error would be small
22
Q

Significance testing (t-tests) for regression coefficients

A

The logic is based on whether or not the model has any slope it can evaluate. If there is no slope, then no effect exists. It uses two steps:

  1. Test the regression parameter against an expected value of zero for b in the population, using t = b / SE(b)
  2. If p < .05, reject the null hypothesis that b = 0 and conclude the predictor contributes significantly (see the sketch below)
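A sketch of this test, assuming the statsmodels library and made-up data:

    import numpy as np
    import statsmodels.api as sm

    rng = np.random.default_rng(5)
    x = rng.normal(size=50)
    y = 0.5 * x + rng.normal(size=50)

    results = sm.OLS(y, sm.add_constant(x)).fit()
    print(results.params[1] / results.bse[1])      # t = b / SE(b) for the slope
    print(results.tvalues[1], results.pvalues[1])  # same t, with its p-value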
23
Q

What is multicollinearity?

A

Exists when there is a strong correlation b/w two or more predictors. Problems include:

  1. increased error in our parameter estimates
  2. limits the size of R, and by extension the R^2, we can observe
  3. increased difficulty in assessing the importance of predictors

If multicollinearity occurs, we can drop the variable, combine it w/ the variable it correlates w/ to test if they measure the same thing, or leave it in the model provided it poses no huge issues w/ the data.
24
Q

Explain outliers.

A

Outliers are extreme cases on one or more of our variables; they can have too much influence on parameter estimates and the regression solution.

  • univariate outliers are extreme on one variable
  • multivariate outliers are extreme on combinations of variables

Outliers have the following negative impacts on the regression model:
  1. less normality
  2. skewed distributions
  3. results less likely to generalize to the population
25
Q

What is normality?

A

The assumption of a normal bell-curve distribution; predictor and outcome variables should be normally distributed.

  1. prediction errors should also be normally distributed, centered at zero
  2. if errors are truly random, most errors should be relatively small and evenly distributed around the mean

If data are non-normal, our parameter estimates are affected, though the non-normality has to be large to cause a similarly important effect. Non-normality generally works against statistical significance, i.e., it implies a less statistically significant relationship b/w predictor and outcome.
26
Q

What is linearity?

A

The assumption that relationships between our predictors and outcome are linear or can be modelled linearly (i.e., by least squares estimation or linear regression line of fit).

27
Q

What is homoscedasticity?

A

The assumption that residual variability should be consistent across the predictors; predictors should have the same general amount of variance distributed throughout the model, and errors should be random.

Violations can cause the following problems:
  1. cases with larger disturbances have more "pull" than other observations
  2. standard errors are biased
  3. incorrect conclusions about the significance of the regression coefficients

The Durbin-Watson statistic provides a measure of correlation among errors. A value significantly different from 2 is likely to indicate some type of problem w/ the error distribution across variables in the model (see the sketch below).
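A sketch of the Durbin-Watson check, assuming statsmodels and made-up data:

    import numpy as np
    from statsmodels.stats.stattools import durbin_watson

    rng = np.random.default_rng(4)
    x = rng.normal(size=100)
    y = x + rng.normal(size=100)

    b, a = np.polyfit(x, y, 1)
    residuals = y - (a + b * x)
    print(durbin_watson(residuals))  # values far from 2 suggest correlated errors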

28
Q

Detail both methods of regression.

A

Simultaneous - all variables are entered into the regression equation at one time.

  • goal is to enter the predictors that account for the most variance
  • the solution does not have to make theoretical sense
  • requires a sufficient sample size to enable reliable estimation of predictors
  • requires very clean data that meet the assumptions of regression

Hierarchical - predictors are entered in an order specified by the researcher.

  • theory should determine the order of entry
  • known predictors are entered first
  • subsequent variables are entered into the model to see how much variability they account for after previous variables are accounted for
  • evaluate whether later predictors account for a significant portion of the variance in Y
29
Q

Explain the basics of dummy coding.

A

Assigns numbers to represent different categorical values, but those numbers do not have the usual mathematical meaning; enables comparisons among the various categories.

  1. codes are 0 and 1 – a 1 usually represents the presence of some key attribute
  2. sets of dummy variables are always entered in the same block of a regression analysis
  3. choose one category to be the baseline category when using multiple dummy codes (see the sketch below)
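A sketch (NumPy, hypothetical party-affiliation data) of dummy codes with Independent as the baseline:

    import numpy as np

    parties = ["Dem", "Rep", "Ind", "Dem", "Ind"]
    dem = np.array([1 if p == "Dem" else 0 for p in parties])  # dummy 1
    rep = np.array([1 if p == "Rep" else 0 for p in parties])  # dummy 2
    # "Ind" is the baseline category: dem == 0 and rep == 0
    print(np.column_stack([dem, rep]))  # both codes enter the model together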
30
Q

What are suppression effects?

A

Occur when predictors in a model correlate with each other in a way that enhances the fit of one (or more) of the predictors. Usually found when:

  • the absolute value of a beta weight associated with a predictor is larger than the correlation b/w the predictor and outcome
  • the direction of the b and beta weights is the opposite of the simple correlation between the predictor and outcome

X2 is soaking up extraneous variance from X1; X1 is then able to soak up more variance in Y. This is problematic b/c X1 may not be able to explain a significant portion of the variance in Y without X2 (i.e., it is too reliant on X2 rather than actually accounting for variance).
31
Q

Describe the ordinary least squares method of parameter estimation. Conceptually, what does this method accomplish and how?

A

Ordinary least squares allows us to assess the best fit of the predictor variables to the outcome values; it creates a linear equation (i.e., Ŷ = a + bX) based on the slope of the predictor and the constant of the sample, which is usually the mean.

  • the primary goal is to minimize error variance until it is as small as possible
  • evaluates the differences b/w the predicted and the actual values of Y
  • assumes the mean of Y for the constant when the slope is treated as equal to 0
  • models the relationship of the predictor to Y-hat

We use the error accounted for by our model over the total error variance to calculate R^2, which standardizes the error variance our model explains. If R^2 is equal to 0, then the model is doing as badly as possible; an R^2 equal to 1 is a perfect relational model of Y-hat to the actual values of Y.
32
Q

When would you use hierarchical regression instead of simultaneous? What is one important feature of hierarchical linear regression?

A

A.) Hierarchical regression is used when we want to see how much variation our predictor variable accounts for after all the other possible variables are already controlled for; the order in which variables are entered into the model is chosen by the researcher. Generally used to test theory.

B.) Any known predictors are typically entered first, and then the variable of interest is entered after these other variables are controlled for.
33
Q

What is the relation between the beta and the correlation (r) when there is only one predictor in a regression?

A

If there is only one predictor (i.e., the bivariate case), then the value of beta and r is the same. This is b/c:

  • the r metric is already standardized
  • beta is conceptually this same standardized metric
  • there are no other variables to assess correlation against or to cause multicollinearity

Therefore, b/c no other predictor variables exist to contrast the standardized metric against and all the variance is already accounted for in the beta alone, they are understood as functionally the same value (see the sketch below).
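A sketch (NumPy, hypothetical GRE data) confirming that beta and r coincide with one predictor:

    import numpy as np

    x = np.array([150.0, 155.0, 160.0, 165.0, 170.0])  # hypothetical GRE scores
    y = np.array([3.0, 2.5, 2.8, 2.0, 1.8])            # hypothetical years to degree

    b, a = np.polyfit(x, y, 1)
    beta = b * (x.std(ddof=1) / y.std(ddof=1))         # standardize b into beta
    print(beta, np.corrcoef(x, y)[0, 1])               # the same value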
34
Q

Under what circumstances might it be more appropriate to interpret unstandardized regression coefficients over standardized and vice versa?

A

We use standardized regression coefficients when:

  1. we need to compare relative strength of different predictors that are measuring different qualities
  2. when the scales have little intrinsic or mathematical meaning (e.g., categorical variables, Likert scales w/ no specific interval, etc.)

We use unstandardized regression coefficients when:

  1. The units used to measure are well known and have clear mathematical meaning (e.g., dollars, inches, miles, pounds) and can be easily evaluated against each other
35
Q

Why is the mean a reasonable guess about Y if we have no information on X?

A

The mean is a reasonable guess b/c it is a least-squares statistic: it already minimizes the average squared differences b/w Y and Y-hat to the smallest amount, in part b/c deviations above and below the mean always "balance" out to sum to zero.

36
Q

What should we do to deal with outliers in our data?

A

In order to deal w/ outliers or extreme cases, we should use SPSS to examine the following statistical criteria:

  • Cook's distance > 1
  • Mahalanobis distance (with 2 predictors, n ~ 30) > 11
  • Leverage > .16 (i.e., 2(k+1)/n)
  • DFBetas > 1

We should also rerun the analysis w/o the outliers included and compare both analyses to better understand the differences caused by the outliers. If we find an outlier that seems to bias the data, we should remove it and include a reason why it was dropped from the data set (see the sketch below).
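A sketch of pulling these diagnostics from a fitted model, assuming statsmodels and made-up data (the thresholds above are applied in the comments):

    import numpy as np
    import statsmodels.api as sm

    rng = np.random.default_rng(2)
    x1, x2 = rng.normal(size=30), rng.normal(size=30)
    y = x1 + x2 + rng.normal(size=30)

    X = sm.add_constant(np.column_stack([x1, x2]))
    influence = sm.OLS(y, X).fit().get_influence()

    cooks_d = influence.cooks_distance[0]   # flag cases with Cook's distance > 1
    leverage = influence.hat_matrix_diag    # flag leverage > 2*(k+1)/n
    dfbetas = influence.dfbetas             # flag |DFBeta| > 1
    print(np.where(cooks_d > 1)[0], np.where(np.abs(dfbetas) > 1)[0])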
37
Q

What are the dangers of having a small sample when we evaluate our regression models?

A
  1. according to the central limit theorem, a smaller sample more poorly reflects both the actual population and the assumption of a normal distribution
  2. the bs and beta weights assume normality and will account for error better when the sample is sufficiently large
  3. affects confidence intervals and predictor metrics
  4. outliers are more problematic in smaller sample sizes
  5. the sample may not generalize to the greater population values
38
Q

We often rely on statistical significance when we evaluate our model. What are some of the problems with this?

A
  1. The size of the coefficient can be misrepresented in NHST b/c the test only allows either a rejection or a failure to reject; these mutually exclusive verdicts create problems
  2. Depending on the size of the sample, the result of a NHST can fail to represent the actual population had more or less data been available
  3. Variance across all predictors may not actually be equal (violating the homoscedasticity assumption), and this changes NHST results
  4. As a probability value, it only gives an estimate b/c it assumes an arbitrary cut-off for the probability of a result being due to sampling error; this can be misleading if there was an effect throughout all or many cases but it just doesn't meet the cut-off
39
Q

What might some alternatives to focusing exclusively on p-values when evaluating a regression analysis be?

A

Some alternatives to using p-values to evaluate our models include measuring effect size, which more closely reflects the strength of the relationship b/w the predictor and the outcome. NHST does not look at strength in any meaningful way, but rather at whether the test statistic meets the p < 0.05 cut-off. This cut-off is arbitrary and gives no indication of what the actual model wants to capture in some respects.

40
Q

Many of the assumptions of our regression models can be evaluated by studying the residuals. Why might residuals be useful for evaluating these assumptions?

A

Residuals are useful for evaluating these assumptions b/c of the following factors:

  1. errors should be random if all assumptions are met
  2. residual variability should be consistent across all predictor variables and other parameters
  3. if residuals are not consistent, this implies either a problem w/ the model or that variability is not properly being accounted for by some variable (or that different variables might measure the same concept)

If the model respects normality and homoscedasticity, and there are no problematic correlations among errors, then the residual errors should be evenly distributed across the predictor variables. This alone makes residuals a good way to assess the overall consistency of the predictors the model is using to model its linear relationships.

41
Q

Why would it be important to enter all components of a dummy coded variable into the same step of a regression equation?

A

The dummy codes jointly represent a single categorical variable, so their b weights are only interpretable as a set (e.g., for two complementary codings the b weights must be of equal magnitude but opposite sign). If all the dummy-coded variables are not entered together, we do not actually know how the categories are related to each other, and the end result would be thrown off b/c it is not accounting for all the "meaningless variance" each variable represents (i.e., not all the 0s and 1s in the sample are actually represented, and thus the mean values being assessed are incorrect).

42
Q

Give an example of a categorical variable and the dummy codes that you would use.

A

EX. Dummy coding:

    If we are looking at political data, we can code the variables to be Democrat, Republican, or Independent. We can then use Independent to represent 0 at baseline, and code in Democrat or Republican as a 1 if the respondent identified as either. 

EX. Dummy coding #2:

    If we are looking at gender data, we can code the variables to be Female or Male. Depending on what we want to assess, we can choose either to be the baseline and have a value of 0. The variable being contrasted would get the value of 1.
43
Q

When using dummy coded variables in multiple regression, what does the y-intercept represent when there are no other variables in the equation? What does the slope of each dummy code represent?

A

Y-intercept: the mean of the comparison group (i.e., the group that gets all zeros across the dummy-coding categories).

Slope: each dummy code's b weight represents the difference b/w that category's mean and the comparison group's mean (see the sketch below).
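A numeric sketch (NumPy, hypothetical scores for two groups) of both facts:

    import numpy as np

    group = np.array([0, 0, 0, 1, 1, 1])   # 0 = comparison group, 1 = dummy code
    score = np.array([2.0, 3.0, 4.0, 6.0, 7.0, 8.0])

    b, a = np.polyfit(group, score, 1)
    print(a, score[group == 0].mean())     # intercept = comparison-group mean (3.0)
    print(b, score[group == 1].mean() - score[group == 0].mean())  # b = 7 - 3 = 4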

44
Q

What is multicollinearity? B) How do you test it?

A

A.) A strong correlation between two or more predictors that impacts the outcome variable; it arises when associated (overlapping) variables are incorporated into the model.

B.) In order to evaluate multicollinearity, we can look at the following statistics:

  1. VIF – Variance Inflation Factor :: VIF > 10, or an average VIF substantially greater than 1, indicates a problem
  2. Tolerance – the reciprocal of VIF (1/VIF) :: tolerances below .1 indicate a serious problem, and below .2 a potential problem (see the sketch below)
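A sketch of both checks, assuming statsmodels and deliberately collinear made-up predictors:

    import numpy as np
    import statsmodels.api as sm
    from statsmodels.stats.outliers_influence import variance_inflation_factor

    rng = np.random.default_rng(3)
    x1 = rng.normal(size=100)
    x2 = x1 + 0.1 * rng.normal(size=100)    # nearly a copy of x1

    X = sm.add_constant(np.column_stack([x1, x2]))
    for i in (1, 2):                        # skip the constant column
        vif = variance_inflation_factor(X, i)
        print(f"predictor {i}: VIF = {vif:.1f}, tolerance = {1 / vif:.3f}")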
45
Q

What is the difference between correlation and regression?

A

Correlation - quantifies the degree to which two variables are related.

  • describes how two variables vary together
  • both X and Y are measured
  • allows us to interpret the confidence interval of r

Regression - finds the best line that predicts Y from X.

  • X values can be measured or controlled in the experiment
  • quantifies goodness of fit with r^2
  • the roles of X and Y define how the linear relationship is represented (i.e., X is the predictor and Y is the predicted or actual value)
46
Q

What is the correlation between Y and Ŷ?

A

This is the multiple correlation (R): it reflects the amount of error b/w the predicted values of Y from the modeled equation and the actual values of Y. The purpose of least squares estimation is to find a line of best fit that most closely captures a high value for this correlation (1 is perfect), which would mean there is no error at all b/w the predictions and the actual values.

  • less error / higher correlation = a stronger relationship b/w the predictors and the outcome