Midterm two Flashcards

1
Q

Mediation

A

When one independent variable (x1) affects a dependent variable (y) indirectly, through its effect on a third variable (x2, the mediator), which in turn affects y

x1 -> x2 -> y
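
A minimal sketch of this pattern in R, using simulated data (the variable names, sample size, and coefficients are made up for illustration). If x2 mediates the effect, the coefficient on x1 should shrink once x2 is added to the model:

```r
set.seed(1)
n  <- 2000
x1 <- rnorm(n)
x2 <- 0.8 * x1 + rnorm(n)   # x1 -> x2
y  <- 0.5 * x2 + rnorm(n)   # x2 -> y (this simulation has no direct x1 -> y path)

total  <- lm(y ~ x1)        # total effect of x1 on y
direct <- lm(y ~ x1 + x2)   # effect of x1 after controlling for the mediator x2

coef(total)["x1"]           # clearly above zero
coef(direct)["x1"]          # close to zero
```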

2
Q

Spuriousness

A

When a third variable causes both the independent and dependent variables, making their relationship illusory or non-causal

3
Q

Multicollinearity

A

When two or more independent variables are highly correlated with each other (they carry basically the same information), making it difficult to determine their individual effects on the dependent variable

4
Q

how to check for multicollinearity

A

VIF (Variance Inflation Factor)
VIF > 5: alarming
VIF > 10: highly alarming
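
A rough sketch of how a VIF can be computed by hand in base R, on simulated data (all variable names are made up). The VIF for one IV is 1 / (1 - R^2), where R^2 comes from regressing that IV on the other IVs. In practice a helper such as vif() from the car package is often used instead; the manual version here only avoids the extra dependency:

```r
set.seed(1)
n  <- 200
x1 <- rnorm(n)
x2 <- x1 + rnorm(n, sd = 0.2)  # nearly a copy of x1
x3 <- rnorm(n)                 # unrelated to the other two

# Regress each IV on the remaining IVs and convert R^2 to a VIF
r2_x1  <- summary(lm(x1 ~ x2 + x3))$r.squared
vif_x1 <- 1 / (1 - r2_x1)      # large: x1 is almost fully explained by x2

r2_x3  <- summary(lm(x3 ~ x1 + x2))$r.squared
vif_x3 <- 1 / (1 - r2_x3)      # near 1: x3 is not explained by the others

c(x1 = vif_x1, x3 = vif_x3)
```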

5
Q

To model an interaction, merely adding the second variable (ex. gender) as a predictor won’t capture the interaction

A

Therefore, you must add an interaction term:

y = B0 + B1*x1 + B2*x2 + B3*x1*x2

ex. coordination = B0 + B1*drinks + B2*female + B3*drinks*female
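
A minimal sketch of fitting such a model in R with simulated data (the numbers are invented; only the variable names follow the example above). In an R formula, `drinks * female` expands to both main effects plus the `drinks:female` interaction term:

```r
set.seed(1)
n      <- 1000
drinks <- sample(0:6, n, replace = TRUE)
female <- rbinom(n, 1, 0.5)

# Simulated truth: each drink costs 1 point for men and 2 points for women
coordination <- 10 - 1 * drinks - 1 * drinks * female + rnorm(n)

fit <- lm(coordination ~ drinks * female)
coef(fit)   # the drinks:female coefficient estimates B3
```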

6
Q

Conditions for regression (4)

A
  1. linearity
  2. nearly normal residuals
  3. constant variability
  4. independent observations
7
Q

Quadratic Term

A

Add a squared term (x^2) to the regression equation so the model can fit a curve rather than a straight line: y = B0 + B1*x + B2*x^2
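
A small sketch in R with simulated data (values made up), comparing a straight-line fit with one that includes a squared term. Inside a formula, `I(x^2)` is needed so that `^` is read as arithmetic squaring:

```r
set.seed(1)
x <- seq(-3, 3, length.out = 200)
y <- 1 + 0.5 * x + 2 * x^2 + rnorm(200)   # curved relationship

linear    <- lm(y ~ x)
quadratic <- lm(y ~ x + I(x^2))

summary(linear)$r.squared      # poor fit: a line cannot follow the curve
summary(quadratic)$r.squared   # much better fit
coef(quadratic)
```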

8
Q

Logarithms

A

How many times must one number (the base) be multiplied by itself to make another number?

ex. log2(8) = 3
2 * 2 * 2 = 8
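
The same example can be checked in R (the variable names are just for illustration):

```r
by_log2 <- log2(8)            # 3
by_base <- log(8, base = 2)   # same calculation with an explicit base
undo    <- 2^3                # 8, so raising the base to the log recovers the number
```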

9
Q

natural logarithm

A

Logarithm with a base of e. It is written as ln(x) and represents the power to which e must be raised to obtain x. Taking the natural log of a variable can turn some non-linear relationships (ex. exponential growth) into linear ones
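
A short sketch in R of the linearizing idea, using a made-up exponential curve. In R, `log()` with no base argument is the natural log:

```r
ln_e <- log(exp(1))        # ln(e) = 1

x <- 1:10
y <- 3 * exp(0.5 * x)      # exponential in x, so y vs. x is not a straight line

# log(y) = log(3) + 0.5 * x, which is a straight line in x
fit <- lm(log(y) ~ x)
coef(fit)                  # intercept is log(3), slope is 0.5
```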

10
Q

logistic regression has what kind of independent and dependent variable?

A

It has a categorical (typically binary) DV. The IV is typically numeric, though categorical IVs can also be included

11
Q

generalized linear model

A

A linear regression model that predicts a transformation of the dependent variable

transforms y to f(y), ex. f(y) = log(y)
f(y) is the “link function”, which means “some function of y”

glm() in R

12
Q

Probability vs. odds

A

Probability is the # of successes divided by the total # of possible outcomes. Odds are the # of successes divided by the # of failures

13
Q

Probability equals

A

odds / (1 + odds)

14
Q

odds equal

A

probability / (1 - probability)
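
The two conversions can be sketched as a pair of small R helpers (the function names are hypothetical, chosen for this example). Note the parentheses around the denominators:

```r
prob_to_odds <- function(p) p / (1 - p)
odds_to_prob <- function(odds) odds / (1 + odds)

prob_to_odds(0.75)   # 3: three successes for every one failure
odds_to_prob(3)      # 0.75: converting back recovers the probability
```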

15
Q

doing generalized linear model in R

A

glm(dv ~ iv, family = binomial(link = "logit"), data = dataset)
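
A runnable sketch of that call on simulated data (the data and coefficients are made up; `dv`, `iv`, and `dataset` are the placeholder names from the card):

```r
set.seed(1)
n  <- 1000
iv <- rnorm(n)
# Simulate a 0/1 DV whose log-odds are -0.5 + 1.2 * iv
dv <- rbinom(n, 1, plogis(-0.5 + 1.2 * iv))
dataset <- data.frame(dv = dv, iv = iv)

fit <- glm(dv ~ iv, family = binomial(link = "logit"), data = dataset)
coef(fit)                                 # coefficients are on the logit (log-odds) scale
probs <- predict(fit, type = "response")  # predicted probabilities
head(probs)
```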

17
Q

how to transform the output of glm to percentage of probability

A

example number is 2

logit(y) = 2
odds(y) = e^2 ≈ 7.39
probability(y) = e^2 / (1 + e^2) ≈ 0.88

To report the probability as a percentage, multiply by 100, i.e. move the decimal point two places to the right (ex. 0.023 = 2.3%)
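
The same steps written out in R for the example value of 2. `plogis()` is base R's inverse-logit function and does the last two steps in one call:

```r
logit <- 2
odds  <- exp(logit)              # about 7.39
prob  <- odds / (1 + odds)       # about 0.88
shortcut <- plogis(logit)        # same value as prob
percent  <- round(100 * prob, 1) # 88.1
```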

18
Q

When you’re wanting to find interactions for the glm equation in R, you’re answering what question?

A

Is the effect of one IV (ex. years of experience) on the DV (ex. callback) different depending on a second IV (ex. black = TRUE)?
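
A sketch of what such a model could look like in R. The data frame `resumes`, its column names, and all numbers are simulated assumptions for illustration, not the course dataset:

```r
set.seed(1)
n <- 5000
years_experience <- sample(0:20, n, replace = TRUE)
black <- rbinom(n, 1, 0.5) == 1

# Simulated log-odds: experience helps less when black is TRUE (made-up numbers)
log_odds <- -2 + 0.10 * years_experience - 0.3 * black -
  0.05 * years_experience * black
callback <- rbinom(n, 1, plogis(log_odds))
resumes  <- data.frame(callback, years_experience, black)

fit <- glm(callback ~ years_experience * black,
           family = binomial(link = "logit"), data = resumes)
coef(fit)   # years_experience:blackTRUE is the interaction coefficient
```

The interaction coefficient is on the log-odds scale: it is the difference in the slope for years of experience between the two groups.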

19
Q

MANOVA

A

multivariate analysis of variance

Investigates whether a combination of two or more numeric dependent variables differs across groups. Looks at two or more DVs simultaneously
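
A minimal sketch using R's built-in `iris` dataset (chosen here only because it ships with R; it is not from the course). The two DVs are bound together with `cbind()` on the left side of the formula:

```r
# Do the three species differ on the combination of sepal length and petal length?
fit <- manova(cbind(Sepal.Length, Petal.Length) ~ Species, data = iris)
res <- summary(fit)   # uses Pillai's trace by default
res
```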

21
Q

Assumptions of MANOVA

A
  1. each observation is independent
  2. the DVs are multivariate normal
  3. no multivariate outliers
  4. the groups have equal variance and covariance
22
Q

For MANOVA, how do you test for equality of variance and covariance?

A

Box’s M-test

23
Q

Discriminant Function Analysis

A

A statistical technique used to classify observations into predefined groups based on predictor variables. It finds the linear combination of predictors (a line) that maximizes the separation between groups

24
Q

What can MANOVA vs. DFA tell you (ex. about sadness, lethargy, and depression)?

A

MANOVA: do depressed and non-depressed people differ on some linear combination of sadness and lethargy?

DFA: what weights on sadness and lethargy best distinguish depressed and non-depressed people?

ex. LDA = 0.428 * sadness + 0.208 * lethargy (DFA is figuring out those weights)

25
Q

What does DFA allow you to do?

A

lets you predict group membership (ex. if you’re in the depressed or non-depressed group)

ALSO! DFA with more than two groups produces more than one linear function (ex. two functions for three groups), and the proportion of trace tells you how much of the group separation each function accounts for
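
A sketch of both points in R, assuming the MASS package (which normally ships with R) and the built-in `iris` dataset with its three species. Neither is taken from the course material:

```r
fit <- MASS::lda(Species ~ Sepal.Length + Petal.Length, data = iris)

fit$scaling                                  # weights on each predictor for LD1 and LD2
prop_trace <- fit$svd^2 / sum(fit$svd^2)     # proportion of trace for each function
prop_trace

predicted <- predict(fit)$class              # predicted group membership
accuracy  <- mean(predicted == iris$Species) # share of flowers classified correctly
accuracy
```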

26
Q

confusion matrix

A

a table used to evaluate the performance of a classification model by comparing predicted vs. actual values
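
A tiny sketch in R with made-up predictions. `table()` cross-tabulates predicted against actual labels, and the diagonal holds the correct classifications:

```r
actual    <- c("yes", "yes", "no", "no", "yes", "no",  "no", "yes")
predicted <- c("yes", "no",  "no", "no", "yes", "yes", "no", "yes")

cm <- table(Predicted = predicted, Actual = actual)
cm

accuracy <- sum(diag(cm)) / sum(cm)   # 6 of 8 correct = 0.75
```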

27
Q

logistic regression

A

Statistical method for binary classification, meaning it predicts which of two categories an observation belongs to (e.g., Yes/No, 0/1, Pass/Fail). The predicted output is a probability ranging from 0 to 1