PSYC 301 - Final Exam Flashcards

1
Q

data fishiness- definition

A

properties of data or statistical tests that suggest potential problems (Abelson's term)

2
Q

two approaches to evaluating the assumption of normality

A

NHST and descriptive approaches

3
Q

repeated measures (within subjects) one way ANOVA tests

A

mean differences in repeated-measures studies with 3+ levels of a single factor

4
Q

what do

T
K
G
n
N
P

mean in within-subjects ANOVA

A

T- sum of scores within a condition
K- # of levels of the IV
G- sum of all scores
n- sample size (# of participants)
N- total # of scores for the sample (k × n = N)
P- sum of scores for a given person in the sample
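A minimal Python sketch of these quantities, using a made-up 4-subject × 3-condition score matrix (all values hypothetical):

```python
# Hypothetical data: rows = subjects (n = 4), columns = conditions (k = 3).
data = [
    [2, 4, 6],
    [3, 5, 7],
    [1, 3, 5],
    [2, 4, 6],
]

n = len(data)                          # sample size (subjects)
k = len(data[0])                       # levels of the IV
T = [sum(col) for col in zip(*data)]   # T: sum of scores within each condition
G = sum(sum(row) for row in data)      # G: sum of all scores
N = n * k                              # N: total number of scores (k x n)
P = [sum(row) for row in data]         # P: sum of scores for each person

print(T, G, N, P)
```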

5
Q

variability in the means of scores across conditions exists for two reasons in within subjects ANOVA

(variance between treatments)

A

treatment effect- the manipulation distinguishing between conditions

experimental error- random chance errors that occur when measuring the construct of interest

*note: no individual differences component here bc each person serves as their own baseline, so individual differences are constant across conditions

6
Q

variability of scores within conditions could be a result of 2 sources

(variance within treatments)

A

individual differences- differences in backgrounds, abilities, circumstances, etc. of individual ppl (in repeated measures this can be calculated out, though)

experimental error- chance errors that occur when measuring the construct of interest

7
Q

SSerror =
(4.22)

A

SSerror = SSwithin treatments − SSbetween subjects (individual diffs)

8
Q

conceptually the F test for repeated measures becomes (4.16)

A

F = (treatment effect + experimental error) / experimental error

9
Q

4.17 ** repeated measures in a nutshell

A

F = MSbetween treatments/MSerror

10
Q

computing within treatment variability (4.19)

A

SSwithin treatments = ∑SSwithin each treatment

11
Q

SStotal =

A

SStotal = SSwithin + SSbetween

12
Q

total df for repeated measures ANOVA (4.23)

A

dftotal = N-1

13
Q

df between treatments
(4.24)

A

df between treatments = k -1

14
Q

df within treatments
(4.25)

A

df within treatments = N-k

15
Q

formulas for specific MS values in ANOVA
4.28
4.29

A

MSbetween treatments = SS between treatments/df between treatments

MSerror = SSerror/dferror
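The whole SS → df → MS → F chain for the repeated-measures case can be sketched in Python with a made-up 4 × 3 data set (all values hypothetical; the SS formulas use condition totals T, person totals P, and the grand sum G):

```python
# Hypothetical data: 4 subjects (rows) x 3 conditions (columns).
data = [
    [2, 4, 7],
    [3, 6, 6],
    [1, 3, 5],
    [2, 4, 6],
]
n, k = len(data), len(data[0])
N = n * k
G = sum(sum(row) for row in data)
T = [sum(col) for col in zip(*data)]            # condition totals
P = [sum(row) for row in data]                  # person totals
sum_sq = sum(x * x for row in data for x in row)

SS_total    = sum_sq - G ** 2 / N
SS_between  = sum(t ** 2 for t in T) / n - G ** 2 / N   # between treatments
SS_within   = SS_total - SS_between
SS_subjects = sum(p ** 2 for p in P) / k - G ** 2 / N   # individual differences
SS_error    = SS_within - SS_subjects           # SSerror = SSwithin - SSbetween subjects

df_between = k - 1
df_error   = (k - 1) * (n - 1)
MS_between = SS_between / df_between
MS_error   = SS_error / df_error
F = MS_between / MS_error                       # F = MSbetween treatments / MSerror
print(round(F, 1))
```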

16
Q

assumptions of the repeated measures ANOVA

A
  • independence of sets of observations
  • the outcome variable should be normally distributed at each level of the IV
  • sphericity (a type of homogeneity of variance; equality of variances of difference scores across all pairs of levels of the IV)
  • homogeneity of covariance
17
Q

what is sphericity

A

Are the differences in performance between Program A and Program B, Program B and Program C, and Program A and Program C equally variable?

equality of variances of difference scores across all pairs of levels of the IV

18
Q

data fishiness assumptions

A

assumption of normality
assumption of homogeneity of variance
independence of observations

19
Q

assumption of normality

A

scores on the DV within each group are assumed to be sampled from a normal distribution

20
Q

evaluating the assumption of normality

A

NHST approach
- tests if sample dist is sig. different from normal dist
- skew- captures (a)symmetry
- kurtosis- captures extreme scores in tails (0 = normal)

Descriptive approach
- look at descriptive stats/graphical displays to quantify the magnitude and nature of non-normality
- skew and kurtosis threshold values (skew greater than 2, kurtosis greater than 7); positive kurtosis tends to be worse
- graphical displays (normal qq plots) plot your dist against a normal dist with the same sample size; if data are normal it looks like a straight line; deviations at the tails show thin or fat tails
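A rough sketch of the descriptive approach, computing moment-based skew and excess kurtosis by hand (stdlib only; the sample is a toy example):

```python
import statistics

def skew_and_kurtosis(xs):
    """Moment-based skewness and excess kurtosis (both 0 for a normal dist)."""
    n = len(xs)
    mean = sum(xs) / n
    sd = statistics.pstdev(xs)                 # population SD
    z = [(x - mean) / sd for x in xs]
    skew = sum(v ** 3 for v in z) / n          # captures (a)symmetry
    kurt = sum(v ** 4 for v in z) / n - 3      # excess kurtosis; tails
    return skew, kurt

# Symmetric toy sample: skew ~ 0; a flat shape gives negative (thin-tail) kurtosis.
s, k = skew_and_kurtosis([1, 2, 3, 4, 5, 6, 7])
print(round(s, 3), round(k, 3))
```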

21
Q

pros and cons of NHST and descriptive approaches in evaluating normality

A

NHST bad bc of the role of sample size
- insensitive to non-normality in small samples and too sensitive to non-normality in large samples
- doesn’t take into account the type or amount of non-normality; the question itself doesn’t make conceptual sense bc we want to know whether the size (magnitude) of the non-normality will alter our results

Descriptive approach better than NHST bc it allows us to see the magnitude and type of non-normality, but there is still an element of subjectivity: it’s easy to judge results when they’re clearly good or bad, but difficult to judge whether deviations are consequential in ambiguous cases

22
Q

assumption of homogeneity of variance

A

assumption that variances around the means are generally the same

variances in scores on the DV within each group are the same across groups

23
Q

evaluating the assumption of homogeneity of variance

A

NHST approach
- tests if variances in groups are sig diff from each other; Levene's test, Hartley's F-max (variance ratio) test

descriptive approach
- looks at descriptive stats/graphical displays to quantify the magnitude of differential variances
- threshold ratio of largest to smallest variance (recommended threshold 3:1)
- graphical displays (qq plots) take data from 2 conditions and plot them against each other (lowest with lowest, etc.); if the condition is satisfied it'll be a straight line with a slope of 1 and an intercept equal to the difference between the means
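The threshold-ratio check can be sketched like this (group scores are made up):

```python
import statistics

# Hypothetical scores in three conditions.
groups = {
    "A": [4, 5, 6, 5, 4],
    "B": [2, 6, 9, 1, 7],
    "C": [5, 5, 6, 4, 5],
}
variances = {g: statistics.variance(xs) for g, xs in groups.items()}
ratio = max(variances.values()) / min(variances.values())
flagged = ratio > 3.0          # recommended 3:1 threshold
print(variances, round(ratio, 1), flagged)
```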

24
Q

assumption of independence of observations

A

each observation (between subjects) or each set of observations (within subjects) comprising the data set is independent of all other observations or sets of observations in the data set
basically, no inherent structure in the nature of our data; no clustering

e.g., excluding couples' or roommates' data

positive associations inflate alpha
negative associations inflate beta

25
Q

evaluating the assumption of independence of observations

A

examine structural properties of data to see if a basis exists for questioning the validity of the assumption

if no basis is evident, generally fine to conclude the assumption holds

if a basis exists, independence can be assessed by computing the intraclass correlation for the structural property in the data presumed to produce the violation of independence

if intraclass correlation is very small (less than .10), prob fine to use t tests or ANOVA

clear thresholds for intraclass correlations remain debated so the conceptual basis for expecting violations is important in evaluating this index

if violation occurs, best to use alt analysis that accounts for lack of independence

26
Q

addressing violations of assumptions

A

normality
- use alt procedures
- transform data to normalize the dist
- identify and remove outliers (80% of the time this is the problem)
- eval level of measurement assumptions

homogeneity of variance
- use alt procedures
- identify and remove outliers
- eval level of measurement assumptions

indep of obs
- alt stat procedures like MLM and HLM

27
Q

outliers

A

extreme values in a data set that differ substantially from other observations in the data set suggesting they might be drawn from a different population

often responsible for violations of normality and homogeneity of variance

have a disproportionate influence on stat results

28
Q

examples of common outliers

A

data entry/coding errors
responses in latency data
open ended estimate data (no upper boundary)

29
Q

identifying outliers

A

impossible values in freq tables or histograms
seen in normal qq plots as steep tails

standardized residuals: general thresholds of 4 or 5 count as sufficiently extreme; includes the target observation in the mean, which can drag the mean toward it

studentized deleted residuals: index of deviation from the mean NOT including the target observation in the calc of the mean
- threshold ≈ 3.6 for a sample of 100
- threshold ≈ 4.07 for a sample of 1000
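The mean-dragging point can be illustrated by comparing an ordinary standardized score with a "deleted" version that leaves the target observation out (a simplified sketch of the idea, not the full studentized-deleted-residual formula, and the data are made up):

```python
import statistics

def deleted_z(xs, i):
    """Deviation of xs[i] from a mean/SD computed WITHOUT xs[i] (the idea
    behind studentized deleted residuals)."""
    rest = xs[:i] + xs[i + 1:]
    return (xs[i] - statistics.mean(rest)) / statistics.stdev(rest)

data = [3, 4, 5, 4, 3, 5, 4, 30]      # 30 is a hypothetical outlier

# Ordinary standardized score: the outlier drags the mean/SD toward itself.
plain_z = (data[-1] - statistics.mean(data)) / statistics.stdev(data)
dz = deleted_z(data, len(data) - 1)    # far more extreme once 30 is excluded
print(round(plain_z, 2), round(dz, 2))
```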

30
Q

thin tails

A

fewer extreme observations than the normal dist

31
Q

fat tails

A

more extreme observations than the normal dist

32
Q

responses to outliers

A

impossible values should be corrected if possible or treated as missing data if not possible to correct

trimming or capping to the most extreme acceptable value in the data set/a specified value
- conceptual basis not ideal bc no reason to assume that value

33
Q

philosophical issues in outliers

A

minimalist- data should be minimally altered
- dists should have some extreme values

maximalist- routine altering or deletion of values
- outliers create violations

intermediate- won’t throw out unless really problematic

34
Q

levels of measurement

A

Nominal
- categorical distinctions, no mag
Ordinal
- rank ordering, no mag
Interval
- rank ordering and mag
Ratio
- rank ordering, mag, and meaningful ratios (true zero point)

for rating scales 7 points is sufficient, 5 is ambiguous, and less than 5 is problematic

35
Q

argument for levels

A

has been argued that t tests and ANOVA are only meaningful to conduct if DV has at least interval properties

very problematic distributional properties of data can sometimes indicate level of measurement is not appropriate

36
Q

factorial ANOVA and its advantages

A

general term for an ANOVA with more than 1 IV

modest gain in efficiency
ability to test joint effects
- additive- no interaction
- nonadditive- interaction

37
Q

in a 2 way anova we test how many effects

A

3 effects
main effect of IV1
main effect of IV2
interaction effect of IV1 with IV2

number of levels doesn’t change number of effects!

38
Q

interactions sometimes referred to as

A

moderator effects
- a moderator regulates the effect of another IV

if 1st IVs effects change based on the 2nd IV, 2nd IV is the moderator

39
Q

F test associated with null hypotheses for 2 way ANOVA and the hypothesis (4.1)

A

F = variance between treatments/variance within treatments

difference is that between treatment variance will now be further divided into 3 components:
Factor A between treatment variance (MSa)
Factor B between treatment variance (MSb)
Factor AxB between treatment variance (MSaxb)

F val of 1 indicates no treatment effect (0)
F value greater than 1 indicates given treatment effect exists

40
Q

2 way between subjects ANOVA

G
N
p
q
n

A

G- grand total of all scores in entire experiment
N- total number of scores in entire experiment
p- # of levels in factor A
q- # of levels in factor B
n- # of scores in each treatment condition (each cell of the A×B matrix)

41
Q

computing A x B between treatment variability (6.6)

A

SSaxb = SS between - SSa - SSb

42
Q

general formula for mean square (4.10)

A

MS = SS/df

43
Q

articulation (abelson)

A

extent to which results are presented in a clear and useful manner; as results get more complex, there will be more ways they can be articulated

44
Q

as in 1 way anova, 2 general approaches to follow up tests exist for two way anova

A

a posteriori tests (post hoc)
a priori tests (planned)

45
Q

analysis of simple effects

A

effect of 1 IV at a specific level of the other IV

once a factor has 3+ levels, the simple effect test becomes an omnibus test itself, so contrasts are needed

46
Q

setting alpha in 2 way between subjects anova

A

alpha almost never adjusted for these multiple tests in ANOVA, thus emphasis tends to be on confirmatory analyses
replication seen as more essential

47
Q

principle for setting beta in the context of multiple test

A

minimum acceptable power on the basis of the weakest anticipated effect

minimum acceptable power on the basis of the most important effect/ sets of effects

48
Q

calculating standardized effect sizes in 2 way between subjects anova

A

np2 = SSeffect / (SSeffect + SSwithin)
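A one-line arithmetic sketch with hypothetical SS values:

```python
# Hypothetical sums of squares for one effect in a 2-way ANOVA.
SS_effect = 24.0
SS_within = 72.0
partial_eta_sq = SS_effect / (SS_effect + SS_within)   # np2 = SSeffect / (SSeffect + SSwithin)
print(partial_eta_sq)
```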

49
Q

pearson correlation coefficient

A

index of association that assesses the magnitude and direction of linear relation between 2 variables

r = covariability / variability of X and Y separately (7.2)

50
Q

sum of the products of deviations (7.3)

A

SP = ∑(X - X̄)(Y- Ȳ)

taking deviation products and summing them

index of covariability
- lots of above/above and below/below pairs will produce big positive SP values
- lots of below/above and above/below pairs will produce big negative SP values
- equal mix of both will produce near-0 SP values

51
Q

r coefficient is an index of covariability of X and Y relative to variability of those separately

formula (7.4)

A

r = SP / √(SSx × SSy)
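A small Python sketch computing SP, SSx, SSy, and r from made-up pairs:

```python
# Hypothetical paired scores.
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 7]
n = len(xs)
mx, my = sum(xs) / n, sum(ys) / n

SP  = sum((x - mx) * (y - my) for x, y in zip(xs, ys))  # sum of deviation products
SSx = sum((x - mx) ** 2 for x in xs)
SSy = sum((y - my) ** 2 for y in ys)
r = SP / (SSx * SSy) ** 0.5        # r = SP / sqrt(SSx * SSy)
print(round(SP, 2), round(r, 2))
```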

52
Q

relationship of r to z scores

A

a z score reflects an individual score's standing within the distribution for that variable

tells us where they fall in the distribution of everyone

so r can be expressed in terms of z scores

53
Q

r expressed in terms of z scores (7.5)

A

r = ∑ZxZy/n

seen by some as best formula for r
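The same idea with made-up pairs: converting both variables to z scores and averaging the products gives r (note the population SDs, since the formula divides by n):

```python
import statistics

# Hypothetical paired scores.
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 7]
n = len(xs)
mx, sx = statistics.mean(xs), statistics.pstdev(xs)
my, sy = statistics.mean(ys), statistics.pstdev(ys)

zx = [(x - mx) / sx for x in xs]
zy = [(y - my) / sy for y in ys]
r = sum(a * b for a, b in zip(zx, zy)) / n   # r = sum(Zx * Zy) / n
print(round(r, 4))
```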

54
Q

coefficient of determination

A

if pearson correlation coefficient (r) is squared, it reflects the proportion of variance in one variable linearly accounted for by the other variable

ex. r = .50 indicates that the first variable accounts for .25 (25%) of the variability in the second score
- 25% overlap

55
Q

formula for t test of r (7.6)

A

t = r√(n − 2) / √(1 − r²)

bigger rs make bigger ts
the bigger the correlation gets, the smaller the denominator gets
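Plugging a hypothetical r and n into the formula:

```python
import math

r, n = 0.8704, 5   # hypothetical correlation from a sample of 5 pairs
t = (r * math.sqrt(n - 2)) / math.sqrt(1 - r ** 2)   # t = r*sqrt(n-2)/sqrt(1-r^2)
print(round(t, 2))
```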

56
Q

factors influencing the size of r

A

distributions of variables
- perfect correlations only possible if shape of dists is exactly same

reliability of measures
- perfect correlations only possible with perfect reliability in both measures

restrictions of range
- restricting the range on either variable can attenuate correlations

57
Q

regression

A

formal procedure by which scores on one variable can be used to predict scores on another variable; it’s the statistical procedure by which we use a data set to arrive at a formula to produce the best fitting line for that data set

ex. using GRE scores to predict Yale undergrad grades

the better our predictor, the more tightly data points will cluster around the line

58
Q

when two variables are linearly associated, can be described with basic equation (7.7)

A

Y = bX + a

X- scores on first variable (predictor)
Y- scores on second variable (outcome)
b- fixed constant reps the slope of the best fitting line
a- fixed constant reps the Y intercept (expected value of Y when X is 0)

59
Q

the extent to which the line generated by a given regression equation fits a specific data set is defined by the following (7.8)

foundational to regression

A

total squared error (SSerror)

Total squared error = ∑(Y − Ŷ)²

Y reps an actual data point and Ŷ reps the predicted value for that data point given its X value

small values reflect less error

60
Q

formula for b (7.9)

A

b = SP/SSx

SP is a measure of covariability
SSx is a measure of the total variability of X

higher Sp increases b
higher SSx decreases b

61
Q

formula for a (7.10)

A

a = Ȳ − bX̄
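Fitting b and a to made-up pairs with formulas 7.9 and 7.10:

```python
# Hypothetical paired scores; fit Y = bX + a by least squares.
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 7]
n = len(xs)
mx, my = sum(xs) / n, sum(ys) / n
SP  = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
SSx = sum((x - mx) ** 2 for x in xs)

b = SP / SSx        # slope (7.9)
a = my - b * mx     # intercept (7.10)
print(b, a)
```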

62
Q

when x and y are z scores, the simple regression equation becomes (7.11)

A

Zyhat = rZx

r becomes our slope and a becomes 0 so it can be dropped

63
Q

explain how b becomes r and a becomes 0

A

when X and Y are z scores, SSx = SSy, so b = SP/SSx is the same as r = SP/√(SSx × SSy)

and then

both X and Y have means of 0 when they’re z scores so

a = Ȳ − bX̄
a = 0 − b(0)
a = 0

64
Q

standard error of estimate

A

a measure of the standard distance between a regression line and actual data points

total squared error is in it

standard error of estimate = √(SSerror / (n − 2))

SSerror is related to r: as r approaches 1, SSerror becomes smaller, and as r approaches 0, SSerror becomes larger

65
Q

SSerror equation (7.13)

A

SSerror = (1-r2)SSy

this leads to an alternative formula for standard error of estimate
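The shortcut formula can be checked numerically against the direct definition ∑(Y − Ŷ)², using made-up pairs:

```python
# Hypothetical pairs: check SSerror = (1 - r^2) * SSy against sum((Y - Yhat)^2).
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 7]
n = len(xs)
mx, my = sum(xs) / n, sum(ys) / n
SP  = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
SSx = sum((x - mx) ** 2 for x in xs)
SSy = sum((y - my) ** 2 for y in ys)

b = SP / SSx
a = my - b * mx
r_sq = SP ** 2 / (SSx * SSy)

ss_direct   = sum((y - (b * x + a)) ** 2 for x, y in zip(xs, ys))  # sum((Y - Yhat)^2)
ss_shortcut = (1 - r_sq) * SSy                                     # (1 - r^2) * SSy
print(round(ss_direct, 4), round(ss_shortcut, 4))
```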

66
Q

standard error of estime alt formula (with r)

A

standard error of estimate = √((1 − r²) × SSy / (n − 2))

67
Q

F test for the regression coefficient

A

F = variance predicted by the regression/ error variance

68
Q

MS values of regression

A

MSregression = SSregression / dfregression (dfregression = 1 for simple regression)
MSerror = SSerror / dferror (dferror = n − 2)

69
Q

F test (7.20)

A

F = MS regression/ MSerror

70
Q

regression assumptions

A

independence of observations

linear relationship between X and Y

residuals (errors in prediction) normally distributed with mean of 0

homoscedasticity of residuals- equal variance around regression line

71
Q

MAGIC

A

M- magnitude
the mag of an effect can play a role in the persuasive strength of a research claim
- big effects not always practical
- small effects sometimes impressive
- conceptual implications sometimes matter more than size of effect

A- articulation
persuasive strength of a claim will be influenced by how efficiently, accurately and clearly an analytical strategy is used to capture key conclusions from the data

G- generality
generality across studies and researchers (replication)
generality across pops and contexts

I- interestingness
interesting as function of method
interesting as function of theory
interesting as function of surprise (novelty/mag)
interesting as function of importance (practical implications)

C- credibility
conceptual basis for credibility
- fits with existing theory
- fits with common sense
methodological basis for credibility
- data fishiness
- improper stat procedures
- alt explanations beyond IV
- IV and DV reflect their constructs?

72
Q

main effect, what is it

A

the overall effect of 1 IV on the DV

73
Q

interaction

A

differences of differences; compares the differences in one factor across levels of another to determine whether they are consistent or not