Midterm Flashcards

1
Q

The General Linear Model Goal

A

try to account for as much variability in a dependent variable as possible
the Y variable must be continuous (interval or ratio scale)
can use one or multiple IVs to account for the variance in the DV

2
Q

Categorical Variables

A

Binary Variable- 2 distinct categories
Nominal Variables- more than 2 distinct categories
Ordinal Variables- more than 2 distinct categories which go in a logical order

3
Q

Continuous Variables

A

entities get a distinct score on a scale
Interval Variable: equal intervals on the variable represent equal differences in the property being measured
Ratio Variable: same as interval, but the ratios of scores are important and 0 is meaningful

4
Q

Measuring Effect Size

A

R² (Coefficient of Determination)
how much variance in the outcome variable is accounted for by the IV
We test whether the effect size is significant by the F and t statistics

5
Q

The Statistical Model

A

We want to try and theoretically reflect real world phenomena
by convention we call a finding significant when a result at least this extreme would occur less than 5% of the time if there were no real effect (p < .05)

6
Q

Sum of Squares Total

A

The total variability between the scores and the mean

The sum of the squared differences between each score and the mean

7
Q

Sums of Squares Error

A

Deviance between each person's observed score and the score the model predicts for them

The sum of the squared differences between each score and its predicted score

8
Q

Sums of Squares Model

A

deviance between the mean and the model

The sum of the squared differences between each person's predicted score and the mean
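A minimal Python sketch of the three sums of squares for a one-predictor model. The data (x, y) are hypothetical and made up for illustration; the point to check is that SS total = SS model + SS error.

```python
# Hypothetical data: predictor x and outcome y (made up for illustration).
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]

n = len(y)
mean_x = sum(x) / n
mean_y = sum(y) / n

# Least-squares slope and intercept for a single predictor.
slope = (sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
         / sum((xi - mean_x) ** 2 for xi in x))
intercept = mean_y - slope * mean_x
predicted = [intercept + slope * xi for xi in x]

# Total: observed score vs. mean.
ss_total = sum((yi - mean_y) ** 2 for yi in y)
# Error: observed score vs. predicted score.
ss_error = sum((yi - pi) ** 2 for yi, pi in zip(y, predicted))
# Model: predicted score vs. mean.
ss_model = sum((pi - mean_y) ** 2 for pi in predicted)

print(ss_total, ss_model, ss_error)
```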

9
Q

Mean Square

A

The average of the sum of squares

the SS divided by the associated degrees of freedom
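A short sketch of how mean squares lead to F and R² in regression. The sums of squares, sample size, and number of predictors below are hypothetical values chosen for illustration; the degrees of freedom used are the usual ones for regression (k for the model, n - k - 1 for the error).

```python
# Hypothetical values (assumptions for illustration only).
ss_model = 3.6
ss_error = 2.4
n = 5          # number of participants
k = 1          # number of predictors

df_model = k
df_error = n - k - 1

ms_model = ss_model / df_model   # mean square for the model
ms_error = ss_error / df_error   # mean square for the error

f_ratio = ms_model / ms_error                 # F = MS model / MS error
r_squared = ss_model / (ss_model + ss_error)  # R² = SS model / SS total

print(f_ratio, r_squared)
```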

10
Q

Degrees of Freedom

A

the wiggle room in the data set: the number of scores that are free to vary
because our mean must stay constant, all of our scores can be anything except our last score, which must bring us to our mean (so df = N - 1)

11
Q

Central Limit Theorem

A

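as sample size increases, the sampling distribution of the mean approaches a normal distribution, whatever the shape of the population; about 30 or more participants is the usual rule of thumb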
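A small simulation can make this concrete. The sketch below assumes a skewed population (exponential, chosen only for illustration) and compares the distribution of sample means for samples of 2 and of 30; the `skewness` helper is a hypothetical convenience function, not a library call.

```python
import random
import statistics

random.seed(42)  # fixed seed so the run is repeatable

def sample_means(sample_size, n_samples=2000):
    """Means of many samples drawn from a skewed (exponential) population."""
    return [
        statistics.fmean([random.expovariate(1.0) for _ in range(sample_size)])
        for _ in range(n_samples)
    ]

def skewness(values):
    """Average cubed z-score; 0 for a symmetric distribution."""
    m = statistics.fmean(values)
    s = statistics.pstdev(values)
    return statistics.fmean([((v - m) / s) ** 3 for v in values])

means_small = sample_means(2)
means_large = sample_means(30)

# With larger samples, the means are less skewed and less spread out.
print(skewness(means_small), skewness(means_large))
print(statistics.stdev(means_small), statistics.stdev(means_large))
```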

12
Q

Confidence Intervals

A

Describes the upper and lower bounds of a range of plausible values for a population value (e.g., the mean)

With a 95% CI, 95% of intervals built this way would contain the true population value
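A rough sketch of a 95% confidence interval for a mean. The scores are hypothetical, and the interval uses the normal approximation (z of about 1.96); with small samples a t critical value would normally be used instead.

```python
import math
import statistics

# Hypothetical exam scores (made up for illustration).
scores = [72, 85, 90, 66, 78, 88, 95, 70, 81, 75]

mean = statistics.fmean(scores)
standard_error = statistics.stdev(scores) / math.sqrt(len(scores))

# z value that leaves .025 in each tail (normal approximation, an assumption here).
z = statistics.NormalDist().inv_cdf(0.975)

lower = mean - z * standard_error
upper = mean + z * standard_error
print(lower, mean, upper)
```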

13
Q

Types of Hypotheses

A

Null Hypothesis- we assume there is no effect of the IV on the DV
Alternative Hypothesis- that there is an effect of the IV on the DV
We assume the null hypothesis until shown otherwise

14
Q

One and Two Tailed Tests

A

One Tailed- the rejection region is on one side only (.05 in either the positive or the negative tail)
Two Tailed- the rejection region is split across both sides, .025 in each tail
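A quick sketch of why the choice matters. The z value is hypothetical, and the one-tailed p assumes the effect was predicted in the positive direction.

```python
from statistics import NormalDist

z = 1.80  # hypothetical test statistic (assumption for illustration)

# One-tailed: probability in the upper tail only.
p_one_tailed = 1 - NormalDist().cdf(z)

# Two-tailed: probability in both tails combined.
p_two_tailed = 2 * (1 - NormalDist().cdf(abs(z)))

print(p_one_tailed, p_two_tailed)
```

With these numbers the same result is significant at .05 one-tailed but not two-tailed.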

15
Q

Type I Error

A

when we believe there is a genuine effect and there is not

probability of this happening is measured at the alpha level (usually .05)

16
Q

Type II Error

A

when we believe there is no effect and there is an effect

the probability of this happening is measured at the beta level (usually .2)

17
Q

Confidence Intervals and Statistical Significance

A

if the CIs of two means overlap substantially, the difference is generally not significant

As sample size increases, CIs get narrower and we are more likely to find a significant result

18
Q

Misconceptions about the P Value

A
  1. A significant result DOES NOT mean the effect is important
  2. A non-significant result DOES NOT mean there is no effect, only that the effect was not big enough to be detected
  3. A significant result DOES NOT mean the Ho is false
19
Q

Problems with NHST

A

All or nothing thinking (that significance is everything; instead, we can also look at effect size)
Reliant on sample size

20
Q

Wider Science Problems with NHST

A

Incentive structure- you are more likely to succeed in the research field if your findings are significant
Researcher Degrees of Freedom- a researcher’s decisions can change the P value and make it significant
P-Hacking (changing certain numbers or methods after the fact to make your P significant)
HARKing (hypothesizing after the results are known)- finding a significant result in your data you weren’t studying and then changing your hypothesis to match

21
Q

Avoiding the Wider Science Problems

A

Open science- movement to make the process, data, and outcomes of research freely available
Pre-registering research- submitting your hypotheses and methods to a journal before collecting data; you receive feedback and a promise of publication, which reduces competition

22
Q

Effect Sizes

A

standardized measures of the size of an effect which can be compared across studies
not as reliant on sample size as p
Cohen’s d, Pearson’s r, and odds ratio are all examples
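As one illustration, Cohen's d is the difference between two group means divided by a pooled standard deviation. The group scores below are hypothetical.

```python
import math
import statistics

# Hypothetical scores for two groups (made up for illustration).
group_a = [5, 6, 7, 8, 9]
group_b = [3, 4, 5, 6, 7]

n_a, n_b = len(group_a), len(group_b)
var_a = statistics.variance(group_a)
var_b = statistics.variance(group_b)

# Pooled SD: variances weighted by their degrees of freedom.
pooled_sd = math.sqrt(((n_a - 1) * var_a + (n_b - 1) * var_b) / (n_a + n_b - 2))

cohens_d = (statistics.fmean(group_a) - statistics.fmean(group_b)) / pooled_sd
print(cohens_d)
```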

23
Q

r, Correlation Coefficient

A

a good measure when group size is the same
A positive correlation suggests that the values increase or decrease together
A negative correlation suggests that as one increases, the other decreases

24
Q

Effect Sizes of Pearson’s r

A

r=.1 (S)
r=.3 (M)
r=.5 (L)

25
Q

Advantages of Effect Size

A

encourages interpreting effects on a continuum and not the categorical sig. or non sig.
Effect size is affected by sample size but not confounded by it
Results in less incentive to mess with the data

26
Q

Meta Analysis

A

computing effect sizes for a series of studies that investigated the same research question and looking at a weighted average of those effect sizes
Helps us get closer to discovering the true effect size of the population
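A minimal sketch of the weighted average. The effect sizes and sample sizes are hypothetical, and weighting by sample size is a simplifying assumption; published meta-analyses more often weight by inverse variance.

```python
# Hypothetical studies: (effect size r, sample size N). Made up for illustration.
studies = [(0.30, 50), (0.50, 20), (0.10, 200)]

# Weight each effect size by its sample size (simplifying assumption).
total_n = sum(n for _, n in studies)
weighted_effect = sum(r * n for r, n in studies) / total_n

# For comparison: the unweighted mean treats every study equally.
unweighted_effect = sum(r for r, _ in studies) / len(studies)

print(weighted_effect, unweighted_effect)
```

The large study pulls the weighted average toward its smaller effect.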

27
Q

Bayesian Estimation

A

we use previous research and knowledge (the prior) to predict what will happen in our study
we then combine that prior knowledge with our findings to get updated beliefs (the posterior), which improve future predictions
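One simple way to see prior knowledge being combined with new data is a Beta prior on a proportion. This is only a sketch under that assumption; the prior and the data are hypothetical.

```python
# Prior belief about a success rate, expressed as Beta(alpha, beta).
# Beta(2, 2) is a hypothetical, weakly informative prior centred on .5.
prior_alpha, prior_beta = 2, 2

# Hypothetical new study: 7 successes out of 10 trials.
successes, trials = 7, 10

# Updating a Beta prior with binomial data: add successes and failures.
post_alpha = prior_alpha + successes
post_beta = prior_beta + (trials - successes)

prior_mean = prior_alpha / (prior_alpha + prior_beta)
posterior_mean = post_alpha / (post_alpha + post_beta)
data_only = successes / trials

print(prior_mean, posterior_mean, data_only)
```

The posterior estimate sits between the prior and the raw data, and it can serve as the prior for the next study.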

28
Q

Benefits of Bayesian Estimation

A

Evaluates the evidence for the Ho.
Not confounded by sample size and stopping rules
No way to p-hack because it is based on estimation and interpretation

29
Q

Assumptions of Parametric Tests

A

Additivity and Linearity
Normality
Homogeneity of Variance (the variance is the same at each level of the variable)
Independence of scores

30
Q

Outliers

A

data points which land outside our normal range of data and which increase error because they increase variance

31
Q

Additivity

A

the effects of the predictors add together, so the combined effect of all our variables should be greater than any on its own (thereby reducing error)

32
Q

Linearity

A

the outcome is linearly related to the predictors

33
Q

Normal Distribution

A

If the data are normally distributed, the mean and other parameters are accurate reflections of the data
need to be more concerned with this in smaller samples

34
Q

Homoscedasticity/ Homogeneity of Variance

A

the assumption that your groups are fair to compare because they have similar levels of variance
We assess using Levene’s Test (tests if variances are the same)
Levene’s Test can miss large differences in variance in small samples and can flag small differences in large samples

35
Q

Independence

A

the assumption that each participant’s error is unrelated to the error of the other participants

36
Q

Listwise Deletion

A

completely excludes a participant from all calculations if one piece of data is missing

37
Q

Pairwise Deletion

A

excludes a participant from a calculation only if they are missing data from one of the variables in question
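The difference between listwise and pairwise deletion is easy to see on a tiny data set. The participants and variable names below are hypothetical, with None marking a missing value.

```python
# Hypothetical participants; None marks a missing value.
rows = [
    {"age": 20, "score": 55, "hours": 3},
    {"age": 22, "score": None, "hours": 5},
    {"age": None, "score": 60, "hours": 4},
    {"age": 25, "score": 70, "hours": 6},
]

# Listwise: drop anyone with ANY missing value, for every analysis.
listwise = [r for r in rows if all(v is not None for v in r.values())]

def pairwise(data, var_a, var_b):
    """Keep anyone who has both variables needed for this one analysis."""
    return [r for r in data if r[var_a] is not None and r[var_b] is not None]

print(len(listwise))                         # participants left for all analyses
print(len(pairwise(rows, "age", "hours")))   # participants for age vs. hours
print(len(pairwise(rows, "score", "hours"))) # participants for score vs. hours
```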

38
Q

Bootstrapping

A

repeatedly resamples your own data (with replacement) and recalculates the statistic each time, to get a more robust estimate of the mean and its confidence intervals
accepted practice is 10 000 bootstraps
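A minimal sketch of a percentile bootstrap for the mean. The scores are hypothetical, the `bootstrap_ci` helper is made up for this example, and 2,000 resamples are used only to keep it quick.

```python
import random
import statistics

random.seed(1)  # fixed seed so the run is repeatable

# Hypothetical scores (made up for illustration).
scores = [12, 15, 9, 20, 14, 11, 30, 13, 16, 10]

def bootstrap_ci(data, n_boot=2000, alpha=0.05):
    """Percentile confidence interval for the mean from resampled data."""
    means = sorted(
        # Resample with replacement, same size as the original sample.
        statistics.fmean(random.choices(data, k=len(data)))
        for _ in range(n_boot)
    )
    lower = means[int((alpha / 2) * n_boot)]
    upper = means[int((1 - alpha / 2) * n_boot) - 1]
    return lower, upper

lower, upper = bootstrap_ci(scores)
print(lower, statistics.fmean(scores), upper)
```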

39
Q

Transforming Data

A
Log Transformation (log of the values; reduces positive skew)
Square Root Transformation (square root of the values; reduces positive skew and stabilizes variance)
Reciprocal Transformation (1 divided by the values; reduces the impact of large scores)
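A small sketch of the three transformations on hypothetical, positively skewed scores. The `skewness` helper is made up for this example; note that log and reciprocal need values above 0, and the reciprocal reverses the order of the scores.

```python
import math
import statistics

# Hypothetical positively skewed scores (one very large value).
values = [1, 2, 4, 8, 100]

def skewness(data):
    """Average cubed z-score; positive means a long right tail."""
    m = statistics.fmean(data)
    s = statistics.pstdev(data)
    return statistics.fmean([((v - m) / s) ** 3 for v in data])

log_values = [math.log10(v) for v in values]   # needs values > 0
sqrt_values = [math.sqrt(v) for v in values]   # needs values >= 0
reciprocal_values = [1 / v for v in values]    # needs values != 0; reverses order

print(skewness(values), skewness(sqrt_values), skewness(log_values))
```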
40
Q

Cautions with Transformations

A

They change your data

  • change your scale
  • they can change what we are measuring
  • use only as a last resort
41
Q

Correlation

A

a standardized measure of the relationship between two continuous variables
Can be positive or negative

42
Q

Covariance

A

a measure of how much 2 variables vary together

whether scores that deviate from the mean on one variable deviate from the mean in a similar way on the other

43
Q

Problems with Covariance

A

depends on the units of measure
we must standardize it
Correlation is standardized covariance
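A sketch of the unit problem with hypothetical data: rescaling one variable (e.g., metres to centimetres) changes the covariance, but not the correlation.

```python
import statistics

# Hypothetical paired scores (made up for illustration).
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]

def covariance(a, b):
    """Average cross-product of deviations (N - 1 in the denominator)."""
    mean_a, mean_b = statistics.fmean(a), statistics.fmean(b)
    return sum((ai - mean_a) * (bi - mean_b) for ai, bi in zip(a, b)) / (len(a) - 1)

def correlation(a, b):
    """Covariance divided by the product of the two standard deviations."""
    return covariance(a, b) / (statistics.stdev(a) * statistics.stdev(b))

x_rescaled = [xi * 100 for xi in x]  # same variable in different units

print(covariance(x, y), covariance(x_rescaled, y))    # covariance changes
print(correlation(x, y), correlation(x_rescaled, y))  # correlation does not
```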

44
Q

Correlation Does Not Imply Causation

A

direction of causality cannot be inferred

there may be other confounding variables (3rd variable problem)

45
Q

Linear Regression

A

a method of predicting the value of one variable from another
it is the hypothetical relationship between two variables related linearly
In order to estimate the line of best fit, we use method of least squares
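A sketch of the method of least squares with one predictor, using hypothetical data. The fitted line is compared with an arbitrary alternative line to show that it has the smaller sum of squared errors.

```python
# Hypothetical data (made up for illustration).
x = [0, 1, 2, 3, 4]
y = [1, 3, 4, 6, 6]

mean_x = sum(x) / len(x)
mean_y = sum(y) / len(y)

# Least-squares estimates of the slope (b1) and intercept (b0).
b1 = (sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
      / sum((xi - mean_x) ** 2 for xi in x))
b0 = mean_y - b1 * mean_x

def squared_error(intercept, slope):
    """Sum of squared differences between observed and predicted y."""
    return sum((yi - (intercept + slope * xi)) ** 2 for xi, yi in zip(x, y))

best_fit_error = squared_error(b0, b1)
other_line_error = squared_error(2.0, 1.0)  # an arbitrary alternative line

prediction = b0 + b1 * 5  # predicted outcome for a new x of 5
print(b0, b1, best_fit_error, other_line_error, prediction)
```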

46
Q

F Statistic

A

Testing the fit
if the model results in better prediction than using the mean, we expect the model to be significant
influenced by sample size

47
Q

Hierarchical Model

A

Process of entering multiple predictors in steps; known predictors are entered first so that their effect is held constant before the new predictors are entered

  • you can see the unique predictive influence of a variable
  • Drawback: requires the researcher to know what they are doing
48
Q

Forced Entry

A

Process of entering multiple predictors; all variables are entered simultaneously

49
Q

Stepwise Entry

A

Process of entering multiple predictors; variables are entered based on the amount of variance they can explain, highest first
SPSS does this automatically
Problem: reliance on a mathematical criterion means that tiny numerical differences can lead to huge errors in interpretation

50
Q

Standardized Residuals

A

A way of identifying outliers
based on SD: if a case’s standardized residual falls outside +/- 3 SD (more extreme than about 99% of cases) it can be considered an outlier
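A sketch of the +/- 3 rule on hypothetical residuals. Here the residuals are simply converted to z-scores using their own mean and SD, which is a simplification of what statistical packages report.

```python
import statistics

# Hypothetical residuals from a fitted model; the last case is extreme.
residuals = [0.5, -0.5, 0.3, -0.3, 0.1, -0.1, 0.4, -0.4, 0.2, -0.2,
             0.5, -0.5, 0.3, -0.3, 0.1, -0.1, 0.4, -0.4, 0.2, 6.0]

mean_res = statistics.fmean(residuals)
sd_res = statistics.stdev(residuals)

# Express each residual in SD units (a simplified standardization).
z_scores = [(r - mean_res) / sd_res for r in residuals]

# Flag cases more than 3 SDs from the centre.
outliers = [i for i, z in enumerate(z_scores) if abs(z) > 3]
print(outliers)
```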

51
Q

Influential Cases

A

Certain outliers can pull the mean (or model) so far toward themselves that they no longer have the greatest deviance (i.e., they don’t look like outliers), so deviance alone is not the best way to identify outliers

52
Q

Cook’s Distance

A

measures how much influence any individual case has on the model as a whole
allows for the identification of influential cases

53
Q

Common Sense Real-World Assumptions of Regression

A

  1. Outcome variable is continuous
  2. Predictor variable is continuous or dichotomous
  3. Predictors must not have 0 variance
  4. Linearity
  5. Independence
54
Q

Assumptions that Matter in Regression, in Order

A
  1. Additivity and Linearity
  2. Homoscedasticity
  3. Independence
  4. Normal distribution
  5. No Multicollinearity between predictors
55
Q

Moderation

A

the combined effect of two variables on another (interaction effect)
A moderator variable changes the strength or direction of the relationship between x and y
We follow up moderation with a simple slopes test

56
Q

Centering Variables

A

the process of transforming a variable into deviations around a fixed point, usually the mean, so that a score of 0 represents the mean
Basically, take every score and subtract the mean
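In code, with hypothetical scores:

```python
import statistics

# Hypothetical scores (made up for illustration).
scores = [2, 4, 6, 8]

mean_score = statistics.fmean(scores)

# Centering: subtract the mean from every score.
centered = [s - mean_score for s in scores]

print(centered)                    # deviations from the mean
print(statistics.fmean(centered))  # the centered variable has a mean of 0
```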

57
Q

Mediation

A

the situation when the relationship between a predictor and outcome variable can be explained by their relationship to a third variable

58
Q

Baron and Kenny, 1986

A

Mediation is tested through 3 regression models

  1. Predicting the outcome from the predictor (c path)
  2. Predicting the mediator from the predictor (a path)
  3. Predicting the outcome from both the predictor and mediator (b path and c’ path)
59
Q

Baron and Kenny; 4 Conditions That Suggest Mediation

A
  1. Predictor must significantly predict the outcome (Sig. c path)
  2. Predictor must significantly predict the mediator (sig. a path)
  3. Mediator must significantly predict the outcome (sig. b path)
  4. Predictor must predict outcome less strongly in model 3 than model 1 (c’ must be lower than c)
60
Q

Sobel Test

A

an alternative way to estimate the indirect effect and its significance
If the Sobel Test comes up as significant, then there is significant mediation
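A sketch of the usual first-order form of the Sobel test, z = ab / sqrt(b²·SEa² + a²·SEb²). The path estimates and standard errors below are hypothetical; in practice they come from the a-path and b-path regressions.

```python
import math
from statistics import NormalDist

# Hypothetical path estimates and standard errors (assumptions for illustration).
a, se_a = 0.5, 0.1   # predictor -> mediator
b, se_b = 0.4, 0.1   # mediator -> outcome (controlling for the predictor)

indirect_effect = a * b

# Standard error of the indirect effect (first-order Sobel formula).
se_indirect = math.sqrt(b ** 2 * se_a ** 2 + a ** 2 * se_b ** 2)

sobel_z = indirect_effect / se_indirect
p_value = 2 * (1 - NormalDist().cdf(abs(sobel_z)))  # two-tailed

print(sobel_z, p_value)
```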

61
Q

Effect Sizes of Mediation

A

indirect effect = ab
indirect effect, partially standardized = ab / s(outcome)
indirect effect, standardized = (ab / s(outcome)) × s(predictor)
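Worked through with hypothetical numbers (the paths and standard deviations below are made up for illustration):

```python
# Hypothetical values (assumptions for illustration only).
a = 0.5              # predictor -> mediator
b = 0.4              # mediator -> outcome
sd_outcome = 2.0     # standard deviation of the outcome
sd_predictor = 1.5   # standard deviation of the predictor

indirect = a * b
partially_standardized = indirect / sd_outcome
standardized = (indirect / sd_outcome) * sd_predictor

print(indirect, partially_standardized, standardized)
```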

62
Q

Dummy Variables

A

When we have categorical predictors with more than two categories, we create dummy variables rather than coding them as 1’s, 2’s, 3’s, and so on
You choose one category (the baseline) that is always assigned 0; each dummy variable then codes one of the remaining categories as 1 and all the others as 0
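A sketch of dummy coding for a three-category predictor. The group names and the `is_` prefix are hypothetical; "control" is chosen as the baseline.

```python
# Hypothetical group membership for five participants.
groups = ["control", "drug_a", "drug_b", "control", "drug_b"]

baseline = "control"  # the category that is 0 on every dummy variable

# One dummy variable for each non-baseline category.
levels = [g for g in sorted(set(groups)) if g != baseline]

# Each participant gets a 1 on the dummy for their own category, 0 elsewhere.
dummies = [
    {f"is_{level}": int(g == level) for level in levels}
    for g in groups
]

for g, d in zip(groups, dummies):
    print(g, d)
```

Three categories need only two dummy variables, because the baseline is identified by 0 on both.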