Midterm Flashcards

1
Q

The General Linear Model Goal

A

try to account for as much variability in a dependent variable as possible
the Y variable must be continuous (interval or ratio scale)
can use one or multiple IVs to account for the variance in the DV

2
Q

Categorical Variables

A

Binary Variable- 2 distinct categories
Nominal Variables- more than 2 distinct categories
Ordinal Variables- more than 2 distinct categories which go in a logical order

3
Q

Continuous Variables

A

entities get a distinct score on a scale
Interval Variable: equal intervals on the variable represent equal differences in the property being measured
Ratio Variable: same as interval, but the ratios of scores are important and 0 is meaningful

4
Q

Measuring Effect Size

A

R² (Coefficient of Determination)
how much variance in the outcome variable is accounted for by the IV
We test whether the effect size is significant by the F and t statistics

5
Q

The Statistical Model

A

We want to try and theoretically reflect real world phenomena
we want to be 95% confident in our findings: we accept only a 5% chance of seeing an effect this large when no real effect exists

6
Q

Sum of Squares Total

A

The total variability between the scores and the mean

The sum of the squared differences between each score and the mean

7
Q

Sums of Squares Error

A

Deviance between the model's predicted scores and each person’s observed score

The sum of the squared differences between each observed score and its predicted score

8
Q

Sums of Squares Model

A

Deviance between the mean and the model

The sum of the squared differences between each person’s predicted score and the mean
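
The three sums of squares can be sketched in a few lines. This is a minimal illustration on made-up data (the `x` and `y` values are hypothetical), fitting a single predictor by least squares so that each person has a predicted score.

```python
# Hypothetical data: x is the predictor (IV), y is the outcome (DV).
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]

n = len(y)
mean_x = sum(x) / n
mean_y = sum(y) / n

# Least-squares slope and intercept for one predictor.
s_xy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
s_xx = sum((xi - mean_x) ** 2 for xi in x)
slope = s_xy / s_xx
intercept = mean_y - slope * mean_x
predicted = [intercept + slope * xi for xi in x]

# Total: each score vs. the mean.
ss_total = sum((yi - mean_y) ** 2 for yi in y)
# Error: each observed score vs. its predicted score.
ss_error = sum((yi - pi) ** 2 for yi, pi in zip(y, predicted))
# Model: each predicted score vs. the mean.
ss_model = sum((pi - mean_y) ** 2 for pi in predicted)

r_squared = ss_model / ss_total
print(ss_total, ss_error, ss_model, r_squared)
```

For a least-squares fit the model and error sums of squares add up to the total, which is why R² can be read as the share of variance the model accounts for.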

9
Q

Mean Square

A

The average of the sum of squares

the SS divided by the associated degrees of freedom
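
Written out, the mean squares for a regression model look roughly like this. The symbols are assumptions for illustration: k is the number of predictors, N the number of participants, and the subscripts M and E stand for model and error.

```latex
MS_M = \frac{SS_M}{df_M}, \qquad df_M = k
\qquad\qquad
MS_E = \frac{SS_E}{df_E}, \qquad df_E = N - k - 1
\qquad\qquad
F = \frac{MS_M}{MS_E}
```

The F ratio compares the average variability the model explains with the average variability it leaves unexplained.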

10
Q

Degrees of Freedom

A

the wiggle room in the data set
because our mean must stay constant, all of our scores can be anything except our last score, which must bring us to our mean

11
Q

Central Limit Theorem

A

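as sample size increases (roughly 30 or more participants as a rule of thumb), the sampling distribution of the mean approaches a normal distribution, even if the scores themselves are not normally distributed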
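
A small simulation can make this concrete. The sketch below is only an illustration under assumed settings: an exponential population (strongly skewed), 2000 simulated studies of 30 participants each, and a hand-written `skewness` helper.

```python
import random
import statistics

random.seed(42)  # fixed seed so the sketch is repeatable


def skewness(values):
    """Mean cubed z-score; close to 0 for a symmetric distribution."""
    m = statistics.fmean(values)
    sd = statistics.pstdev(values)
    return sum(((v - m) / sd) ** 3 for v in values) / len(values)


# A strongly skewed "population": exponential with mean 1.
raw_scores = [random.expovariate(1.0) for _ in range(2000)]

# 2000 simulated studies of 30 participants; keep each study's mean.
sample_means = [
    statistics.fmean([random.expovariate(1.0) for _ in range(30)])
    for _ in range(2000)
]

print("skew of raw scores:  ", round(skewness(raw_scores), 2))
print("skew of sample means:", round(skewness(sample_means), 2))
```

The raw scores stay skewed, while the distribution of the 30-person means is much closer to symmetric.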

12
Q

Confidence Intervals

A

Describes upper and lower bounds within which the true value of a parameter (e.g. the mean) is likely to fall

For a 95% confidence interval, 95% of intervals constructed this way would contain the true value
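
As a rough sketch, a 95% confidence interval around a sample mean can be computed as the mean plus or minus a critical value times the standard error. The scores below are hypothetical, and the normal critical value (about 1.96) is an assumption; with a sample this small a t critical value would usually be preferred.

```python
import math
import statistics

scores = [4, 5, 6, 5, 7, 6, 5, 4, 6, 5]  # hypothetical sample

mean = statistics.fmean(scores)
standard_error = statistics.stdev(scores) / math.sqrt(len(scores))

# Critical value for 95% coverage under a normal approximation.
z_crit = statistics.NormalDist().inv_cdf(0.975)

lower = mean - z_crit * standard_error
upper = mean + z_crit * standard_error
print(round(lower, 2), round(upper, 2))
```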

13
Q

Types of Hypotheses

A

Null Hypothesis- we assume there is no effect of the IV on the DV
Alternative Hypothesis- that there is an effect of the IV on the DV
We assume the null hypothesis until shown otherwise

14
Q

One and Two Tailed Tests

A

One Tailed- probability only goes one way (.05 on either the positive or negative side)
Two Tailed- probability is taken on both sides, .025 on each side

15
Q

Type I Error

A

when we believe there is a genuine effect and there is not

probability of this happening is measured at the alpha level (usually .05)

16
Q

Type II Error

A

when we believe there is no effect and there is an effect

the probability of this happening is measured at the beta level (usually .2)

17
Q

Confidence Intervals and Statistical Significance

A

if CIs overlap, generally the findings are not significant

As sample size increases, CIs get narrower and we are more likely to find a significant result

18
Q

Misconceptions about the P Value

A
  1. A significant result DOES NOT mean the effect is important
  2. A non-significant result DOES NOT mean there is no effect, only that it was not big enough to be detected
  3. A significant result DOES NOT mean the Ho is false
19
Q

Problems with NHST

A

All or nothing thinking (that significance is everything; instead, we can also look at effect size)
Reliant on sample size

20
Q

Wider Science Problems with NHST

A

Incentive structure- you are more likely to succeed in the research field if your findings are significant
Researcher Degrees of Freedom- a researcher’s decisions can change the P value and make it significant
P-Hacking- changing certain numbers or methods after the fact to make your P significant
HARKing- finding a significant result in your data you weren’t studying and then changing your hypothesis to match

21
Q

Avoiding the Wider Science Problems

A

Open science- movement to make the process, data, and outcomes of research freely available
Pre-registering research- receiving feedback and promises of publishing by preregistering with a journal; ensures less competition

22
Q

Effect Sizes

A

standardized measures of the size of an effect which can be compared across studies
not as reliant on sample size as p
Cohen’s d, Pearson’s r, and odds ratio are all examples
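
As one illustration, Cohen’s d can be sketched as the difference between two group means divided by a pooled standard deviation. The group scores and the `cohens_d` helper are hypothetical, and the pooled-SD version shown is one common variant rather than the only definition.

```python
import math
import statistics

group_a = [5, 6, 7, 8, 9]  # hypothetical scores
group_b = [3, 4, 5, 6, 7]


def cohens_d(a, b):
    """Mean difference divided by the pooled standard deviation."""
    var_a, var_b = statistics.variance(a), statistics.variance(b)
    pooled_var = ((len(a) - 1) * var_a + (len(b) - 1) * var_b) / (len(a) + len(b) - 2)
    return (statistics.fmean(a) - statistics.fmean(b)) / math.sqrt(pooled_var)


d = cohens_d(group_a, group_b)
print(round(d, 2))
```

Because d is expressed in standard deviation units, it can be compared across studies that used different measurement scales.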

23
Q

r, Correlation Coefficient

A

a good measure when group size is the same
A positive correlation suggests that the values increase or decrease together
A negative correlation suggests that as one increases, the other decreases

24
Q

Effect Sizes of Pearson’s r

A

r = .1 (small)
r = .3 (medium)
r = .5 (large)

25
Advantages of Effect Size
encourages interpreting effects on a continuum rather than as categorically significant or non-significant
Effect size is affected by sample size but not confounded by it
Results in less incentive to manipulate the data
26
Meta Analysis
computing effect sizes for a series of studies that investigated the same research question and looking at a weighted average of those effect sizes
Helps us get closer to discovering the true effect size in the population
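A weighted average of effect sizes can be sketched as follows. The study values are made up, and weighting each study by the inverse of its variance is an assumption here (a common choice, so that more precise studies count for more).

```python
# Hypothetical studies: (effect size, variance of that effect size).
studies = [(0.30, 0.04), (0.50, 0.01), (0.10, 0.02)]

# Inverse-variance weights: smaller variance means a larger weight.
weights = [1 / variance for _, variance in studies]

weighted_mean = sum(w * es for w, (es, _) in zip(weights, studies)) / sum(weights)
print(round(weighted_mean, 3))
```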
27
Bayesian Estimation
we use previous research and knowledge to predict what will happen in our study
we then combine that previous knowledge with our findings to improve future predictions
28
Benefits of Bayesian Estimation
Evaluates the evidence for the Ho
Not confounded by sample size and stopping rules
No way to p-hack because it is based on estimation and interpretation
29
Assumptions of Parametric Tests
Additivity and Linearity
Normality
Homogeneity of Variance (the variance is the same at each level of the variable)
Independence of scores
30
Outliers
data points which land outside our normal range of data and which increase error because they increase variance
31
Additivity
the combined effect of several predictors is best described by adding their individual effects together
32
Linearity
the outcome is linearly related to the predictors
33
Normal Distribution
If the data are normally distributed, the mean and other parameters are accurate reflections of the data
We need to be more concerned with this in smaller samples
34
Homoscedasticity/ Homogeneity of Variance
the assumption that your groups are fair to compare based on similar levels of variance
We assess using Levene's Test (tests if variances are the same)
Levene's Test can miss large differences in variance in small samples and can flag very small differences in large samples
35
Independence
the assumption that each participant's error is unrelated to the error of the other participants
36
Listwise Deletion
completely excludes a participant from all calculations if one piece of data is missing
37
Pairwise Deletion
excludes a participant from a calculation only if they are missing data from one of the variables in question
38
Bootstrapping
repeatedly resamples your data (with replacement) and reruns the calculation on each resample to estimate the mean and its confidence intervals
accepted practice is 10 000 bootstrap samples
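A minimal sketch of a bootstrap confidence interval for the mean, assuming a small hypothetical sample and the simple percentile method (other bootstrap interval methods exist):

```python
import random
import statistics

random.seed(1)  # fixed seed so the sketch is repeatable

scores = [12, 15, 9, 20, 14, 11, 18, 13, 16, 10]  # hypothetical sample
n_boot = 10_000

boot_means = []
for _ in range(n_boot):
    # Draw a resample of the same size, with replacement.
    resample = random.choices(scores, k=len(scores))
    boot_means.append(statistics.fmean(resample))

# Percentile interval: the middle 95% of the bootstrap means.
boot_means.sort()
lower = boot_means[int(0.025 * n_boot)]
upper = boot_means[int(0.975 * n_boot) - 1]
print(lower, upper)
```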
39
Transforming Data
Log Transformation (log of the values; reduces positive skew)
Square Root Transformation (square root of the values; reduces positive skew and stabilizes variance)
Reciprocal Transformation (1 divided by the values; reduces the impact of large scores)
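Applied to a few made-up scores, the three transformations look like this. The data are hypothetical and deliberately all positive, since the log and reciprocal are undefined at 0; base-10 log is an arbitrary choice.

```python
import math

scores = [1, 4, 9, 100]  # hypothetical positively skewed scores, all > 0

log_scores = [math.log10(s) for s in scores]
sqrt_scores = [math.sqrt(s) for s in scores]
reciprocal_scores = [1 / s for s in scores]  # note: reverses the order of scores

print(log_scores)
print(sqrt_scores)
print(reciprocal_scores)
```

In each case the extreme score of 100 ends up much closer to the rest of the data than it was on the original scale.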
40
Cautions with Transformations
They change your data
They change your scale
They can change what we are measuring
Use only as a last resort
41
Correlation
a standardized measure of the relationship between two continuous variables
Can be positive or negative
42
Covariance
a measure of how much two variables vary together
how much scores deviate from their means on both variables at the same time
43
Problems with Covariance
depends on the units of measure
we must standardize it
Correlation is standardized covariance
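The link between the two can be sketched directly: divide the covariance by the product of the two standard deviations and the units cancel out. The `x` and `y` values below are hypothetical.

```python
import math

x = [1, 2, 3, 4, 5]  # hypothetical scores on two variables
y = [2, 4, 5, 4, 5]

n = len(x)
mean_x, mean_y = sum(x) / n, sum(y) / n

# Covariance: average cross-product of deviations from each mean.
covariance = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y)) / (n - 1)

sd_x = math.sqrt(sum((xi - mean_x) ** 2 for xi in x) / (n - 1))
sd_y = math.sqrt(sum((yi - mean_y) ** 2 for yi in y) / (n - 1))

# Correlation: covariance standardized by both standard deviations.
r = covariance / (sd_x * sd_y)
print(covariance, round(r, 3))
```

Multiplying every `x` by 100 would multiply the covariance by 100 but leave `r` unchanged.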
44
Correlation Does Not Imply Causation
direction of causality cannot be inferred
there may be other confounding variables (3rd variable problem)
45
Linear Regression
a method of predicting the value of one variable from another
it is the hypothetical relationship between two variables related linearly
In order to estimate the line of best fit, we use the method of least squares
46
F Statistic
Tests the fit of the model
if the model results in better prediction than using the mean, we expect the model to be significant
influenced by sample size
47
Hierarchical Model
Process of entering multiple predictors in steps; known predictors are entered first so they are held constant before unknown predictors are entered
you can see the unique predictive influence of a variable
Drawback: requires the researcher to know what they are doing
48
Forced Entry
Process of entering multiple predictors; all variables are entered simultaneously
49
Stepwise Entry
Process of entering multiple predictors; variables are entered based on the amount of variance they can explain, highest first
SPSS does this automatically
Problem: reliance on a mathematical criterion can mean that tiny numerical differences lead to huge interpretation errors
50
Standardized Residuals
A way of identifying outliers based on SD; if a case's standardized residual falls beyond +/- 3 (more extreme than about 99% of cases) it can be considered an outlier
51
Influential Cases
Certain outliers can pull the model so far toward themselves that the outlier does not have the greatest deviance (i.e. it doesn't look like an outlier), so deviance is not the best way to identify outliers
52
Cook's Distance
measures how much influence any individual case has on the model as a whole
allows for the identification of influential cases
53
Common Sense Real-World Assumptions of Regression
1. Outcome variable is continuous
2. Predictor variable is continuous or dichotomous
3. Predictors must not have 0 variance
4. Linearity
5. Independence
54
Assumptions that Matter in Regression, in Order
1. Additivity and Linearity
2. Homoscedasticity
3. Independence
4. Normal distribution
5. No Multicollinearity between predictors
55
Moderation
the combined effect of two variables on another (interaction effect)
A moderator variable changes the strength or direction of the relationship between x and y
We follow up moderation with a simple slopes test
56
Centering Variables
the process of transforming a variable so that scores are expressed as deviations around 0, which represents the mean
Basically, take every score and subtract the mean
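In code, centering is a one-line operation. A tiny sketch on hypothetical scores:

```python
scores = [2, 4, 6, 8]  # hypothetical raw scores

mean = sum(scores) / len(scores)
centered = [s - mean for s in scores]  # each score minus the mean

print(centered)  # the centered scores now average to 0
```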
57
Mediation
the situation when the relationship between a predictor and outcome variable can be explained by their relationship to a third variable
58
Baron and Kenny, 1986
Mediation is tested through 3 regression models
1. Predicting the outcome from the predictor (c path)
2. Predicting the mediator from the predictor (a path)
3. Predicting the outcome from both the predictor and mediator (b path and c' path)
59
Baron and Kenny; 4 Conditions That Suggest Mediation
1. Predictor must significantly predict the outcome (sig. c path)
2. Predictor must significantly predict the mediator (sig. a path)
3. Mediator must significantly predict the outcome (sig. b path)
4. Predictor must predict outcome less strongly in model 3 than model 1 (c' must be lower than c)
60
Sobel Test
an alternative way to estimate the indirect effect and its significance
If the Sobel Test comes up as significant, then there is significant mediation
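For reference, the Sobel statistic is commonly written as below. The notation is an assumption for illustration: a and b are the unstandardized coefficients for the a and b paths, and SE_a and SE_b are their standard errors.

```latex
z = \frac{ab}{\sqrt{b^{2}\,SE_{a}^{2} + a^{2}\,SE_{b}^{2}}}
```

The resulting z is compared against the standard normal distribution, so a value beyond about 1.96 in either direction would be significant at the .05 level (two-tailed).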
61
Effect Sizes of Mediation
indirect effect = ab
indirect effect, partially standardized = ab / s(outcome)
indirect effect, standardized = (ab / s(outcome)) x s(predictor)
62
Dummy Variables
When we have categorical predictors with more than two categories, we create dummy variables rather than coding the categories as 1's and 2's
You choose one category (the baseline) that is always assigned 0; each dummy variable then codes one of the remaining categories as 1 and all other categories as 0
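A minimal sketch of dummy coding, assuming a hypothetical three-category predictor with "control" chosen as the baseline:

```python
groups = ["control", "drug_a", "drug_b", "control", "drug_b"]  # hypothetical data
baseline = "control"  # this category is 0 on every dummy variable

# One dummy variable per non-baseline category.
categories = [c for c in sorted(set(groups)) if c != baseline]
dummies = {c: [1 if g == c else 0 for g in groups] for c in categories}

for name, codes in dummies.items():
    print(name, codes)
```

Three categories need only two dummy variables, because a case that scores 0 on both must belong to the baseline.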