midterm 2 Flashcards

1
Q

Mediation

A

when one independent variable (x1) has an indirect effect on a dependent variable (y) through a third variable (x2): x1 affects x2, and x2 in turn affects y

x1 → x2 → y

1
Q

spuriousness

A

when a third variable causes both the independent and dependent variables, making their relationship illusory or non-causal.

2
Q

multicollinearity

A

when two or more independent variables are highly correlated with each other (they carry basically the same information), making it difficult to determine their individual effects on the dependent variable.

3
Q

how to check for multicollinearity

A

VIF (Variance Inflation Factor)

VIF > 5: alarming
VIF > 10: highly alarming
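
As a rough illustration of where a VIF comes from, here is a minimal Python sketch (not the R workflow the cards assume). It uses the definition VIF = 1 / (1 - R^2), where R^2 comes from regressing one predictor on the others; with only two predictors, that R^2 is just their squared correlation. The data and the function names are made up for the example.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation between two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)

def vif_two_predictors(a, b):
    """VIF = 1 / (1 - R^2); with two predictors, R^2 is r squared."""
    r = pearson_r(a, b)
    return 1.0 / (1.0 - r ** 2)

# Made-up data: x2 tracks x1 closely, x3 does not.
x1 = [1, 2, 3, 4, 5, 6]
x2 = [1.1, 2.0, 2.9, 4.2, 5.1, 5.8]
x3 = [3, 1, 4, 1, 5, 9]

print(vif_two_predictors(x1, x2))  # far above 10: nearly redundant predictors
print(vif_two_predictors(x1, x3))  # below 5: not alarming
```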

4
Q

to calculate interaction, merely adding the variable (for ex. gender) won’t address the interaction

A

therefore, you must add an interaction term:

y = B0 + B1*X1 + B2*X2 + B3*X1*X2

ex. coordination = B0 + B1 * drinks + B2 * female + B3 * drinks * female
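
To see what the interaction term does, here is a small Python sketch of the coordination example. The coefficient values are hypothetical (picked only for illustration), with separate coefficients for drinks, female, and their product. The point is that the effect of one more drink is b1 when female = 0 and b1 + b3 when female = 1.

```python
def predict_coordination(drinks, female, b0, b1, b2, b3):
    """Regression equation with an interaction term (drinks * female)."""
    return b0 + b1 * drinks + b2 * female + b3 * drinks * female

# Hypothetical coefficients, not estimates from any real data.
B0, B1, B2, B3 = 10.0, -1.0, 0.5, -0.5

def drinks_slope(female):
    """Change in predicted coordination for one additional drink."""
    one = predict_coordination(1, female, B0, B1, B2, B3)
    zero = predict_coordination(0, female, B0, B1, B2, B3)
    return one - zero

print(drinks_slope(0))  # slope when female = 0: B1
print(drinks_slope(1))  # slope when female = 1: B1 + B3
```

Without the B3 term, both slopes would be identical, which is why adding the gender variable alone cannot capture an interaction.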

5
Q

conditions for regression (4)

A
  1. linearity
  2. nearly normal residuals
  3. constant variability
  4. independent observations
6
Q

Quadratic Term

A

adding an x^2 term to the regression equation so the model can fit a curve instead of a straight line

ex. y = B0 + B1*x + B2*x^2

7
Q

Logarithms

A

How many times must one number (the base) be multiplied by itself to make another number?
ex. log2(8) = 3, because 2 * 2 * 2 = 8

8
Q

natural logarithm

A

logarithm with a base of e. It is written as ln(x) and represents the power to which e must be raised to obtain x. taking the natural log of a variable can turn some non-linear (ex. exponential) relationships into linear ones.
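
A quick way to check the linearizing effect is to generate an exponential relationship and look at how y changes per step of x, before and after taking ln. This is a Python sketch with made-up constants (3.0 and 0.5), not data from the course.

```python
import math

# Made-up exponential relationship: y = 3 * e^(0.5 * x)
xs = [0, 1, 2, 3, 4]
ys = [3.0 * math.exp(0.5 * x) for x in xs]

# Step-to-step change in y: keeps growing, so y vs. x is curved.
raw_steps = [ys[i + 1] - ys[i] for i in range(len(ys) - 1)]

# Step-to-step change in ln(y): constant, so ln(y) vs. x is a straight line.
log_steps = [math.log(ys[i + 1]) - math.log(ys[i]) for i in range(len(ys) - 1)]

print(raw_steps)
print(log_steps)  # every step is (approximately) 0.5, the slope of the line
```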

9
Q

logistic regression has what kind of independent and dependent variable?

A

the DV is categorical (usually binary, ex. yes/no); the IVs can be numeric or categorical

10
Q

generalized linear model

A

a generalization of linear regression that predicts a transformation of the dependent variable instead of the dependent variable itself.

transforms y to f(y), ex. f(y) = log(y)
f(y) is the “link function”, which means “some function of y”

glm() in R

11
Q

probability vs odds

A

probability is the # of successes divided by the total # of possible outcomes; odds is the # of successes divided by the # of failures

ex. 1 success and 3 failures: probability = 1/4 = 0.25, odds = 1/3 ≈ 0.33

12
Q

probability equals

A

odds / (1 + odds)

13
Q

odds equals

A

probability / (1 - probability)
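
The two conversions (probability to odds, and odds back to probability) undo each other, which is easy to confirm with a short Python sketch. The function names are just labels chosen for this example.

```python
def odds_from_probability(p):
    """odds = p / (1 - p); undefined when p == 1."""
    return p / (1 - p)

def probability_from_odds(odds):
    """probability = odds / (1 + odds)."""
    return odds / (1 + odds)

p = 0.75
odds = odds_from_probability(p)      # 0.75 / 0.25 = 3.0, i.e. "3 to 1"
back = probability_from_odds(odds)   # 3 / 4 = 0.75

print(odds, back)
```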

14
Q

Doing generalized linear model in R

A

glm(dv ~ iv, family = binomial(link = "logit"), data = dataset)

15
Q

how to transform the output of glm to percentage of probability

A

example number is 2

logit(y) = 2
odds(y) = e^2 ≈ 7.39
probability(y) = e^2 / (1 + e^2) ≈ 0.88

to report a percentage, multiply the probability by 100 (ex. 0.023 = 2.3%, and 0.88 = 88%)
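
The same three steps (logit, then odds, then probability) can be checked with a few lines of Python. This is a sketch of the arithmetic only; the function name is made up, and the input 2 is the example number from this card.

```python
import math

def logit_to_probability(logit):
    """Convert a log-odds value to a probability."""
    odds = math.exp(logit)      # odds = e^logit
    return odds / (1 + odds)    # probability = odds / (1 + odds)

p = logit_to_probability(2)
print(f"{p:.3f} -> {p * 100:.1f}%")  # 0.881 -> 88.1%
```

A logit of 0 corresponds to odds of 1 and a probability of exactly 0.5, which is a handy sanity check.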

16
Q

when you’re wanting to find interactions for the glm equation in R, you’re answering what question?

A

is the effect of one iv (ex. years of experience) on the dv (ex. callback) different depending on a second iv (ex. black = TRUE)?

17
Q

MANOVA

A

multivariate analysis of variance

investigates whether a combination of two or more numeric dependent variables differs across groups. looks at 2+ dvs simultaneously

18
Q

What is the Null and Alternative Hypothesis for MANOVA

A

null: there is no difference between the groups on the combined dependent variables

alternative: at least one group is different on the combined dependent variables

19
Q

Assumptions of MANOVA

A
  1. each observation is independent
  2. the dv are multivariate normal
  3. no multivariate outliers
  4. the groups have equal variance and covariance
20
Q

for MANOVA, how do you test for equality of variance

A

Box’s M-test

21
Q

Discriminant Function Analysis

A

a statistical technique used to classify observations into predefined groups based on predictor variables. It draws a line to maximize distinction between two groups

22
Q

What can MANOVA vs DFA tell you about (ex. sadness and lethargy’s impact on depression)

A

MANOVA: depressed & non-depressed people are different on some linear combination of sadness & lethargy

DFA: what weights on sadness & lethargy best distinguish depressed and non-depressed people?

ex. LDA = 0.428 * sadness + 0.208 * lethargy (it’s figuring out those numbers^)

23
Q

What does DFA allow you to do?

A

lets you predict group membership (ex. if you’re in the depressed or non-depressed group)

Also! DFA with more than two groups produces more than one linear function (up to the number of groups minus 1, and no more than the number of predictors), and the proportion of trace (in R) tells you how important each function is

24
Q

confusion matrix

A

a table used to evaluate the performance of a classification model by comparing predicted vs. actual values
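
A minimal Python sketch of building that table for a two-class model is below. The actual and predicted labels are made up (1 = in the group, 0 = not), and the variable names are only for illustration.

```python
from collections import Counter

def confusion_matrix(actual, predicted):
    """Count each (actual, predicted) pair."""
    return Counter(zip(actual, predicted))

# Made-up labels for eight observations.
actual    = [1, 1, 1, 0, 0, 0, 1, 0]
predicted = [1, 0, 1, 0, 1, 0, 1, 0]

counts = confusion_matrix(actual, predicted)
tp = counts[(1, 1)]  # actual 1, predicted 1 (true positives)
fn = counts[(1, 0)]  # actual 1, predicted 0 (false negatives)
fp = counts[(0, 1)]  # actual 0, predicted 1 (false positives)
tn = counts[(0, 0)]  # actual 0, predicted 0 (true negatives)

accuracy = (tp + tn) / len(actual)

print("             pred 1  pred 0")
print(f"actual 1     {tp:>6}  {fn:>6}")
print(f"actual 0     {fp:>6}  {tn:>6}")
print("accuracy:", accuracy)
```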

25
Q

logistic regression

A

statistical method for binary classification, meaning it predicts whether an observation belongs to one of two categories (e.g., Yes/No, 0/1, Pass/Fail). Predicted output is probability ranging from 0 to 1.