module 5 Flashcards

1
Q

data fishiness assumptions

A
  • assumption of normality
  • assumption of homogeneity of variance
  • independence of observations
2
Q

assumption of normality general definition

A

scores on the dependent variable within each group are assumed to be sampled from a normal distribution

3
Q

NHST for evaluating normality general definition

A
  • tests whether the sample distribution is significantly different from a normal distribution with the same mean and SD
4
Q

which tests are used in the NHST approach to the assumption of normality

A
  • Shapiro-Wilk test
  • Kolmogorov-Smirnov test
5
Q

skew and kurtosis definition and cut offs

A
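  • skew: asymmetry of the distribution (0 = symmetric, as in a normal distribution); descriptive approach cutoff: absolute value > 2
  • kurtosis: measure of how heavy/light the distribution's tails are (heavy = high kurtosis/many outliers, light = low kurtosis/no outliers); descriptive approach cutoff: > 7
  • for both, under the NHST approach, a z score (statistic divided by its standard error) of 1.96 or above in absolute value is significantly non-normal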
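
The descriptive cutoffs on this card can be checked with a few lines of code. This is a minimal sketch, not a definitive implementation: it assumes the cutoffs apply to moment-based skewness and to excess kurtosis (kurtosis minus 3), and the function names are hypothetical.

```python
def central_moments(values):
    """Return the 2nd, 3rd, and 4th central moments of a sample."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((x - mean) ** 2 for x in values) / n
    m3 = sum((x - mean) ** 3 for x in values) / n
    m4 = sum((x - mean) ** 4 for x in values) / n
    return m2, m3, m4


def skewness(values):
    """Moment-based skewness: 0 for a symmetric sample."""
    m2, m3, _ = central_moments(values)
    return m3 / m2 ** 1.5


def excess_kurtosis(values):
    """Kurtosis minus 3, so a normal distribution scores 0 (an assumption)."""
    m2, _, m4 = central_moments(values)
    return m4 / m2 ** 2 - 3.0


def flag_non_normal(values, skew_cutoff=2.0, kurtosis_cutoff=7.0):
    """Apply the descriptive cutoffs from the card (skew > 2, kurtosis > 7)."""
    s = skewness(values)
    k = excess_kurtosis(values)
    return {
        "skew": s,
        "kurtosis": k,
        "skew_flag": abs(s) > skew_cutoff,
        "kurtosis_flag": abs(k) > kurtosis_cutoff,
    }


# Nine identical scores plus one extreme score: strongly right-skewed.
result = flag_non_normal([1, 1, 1, 1, 1, 1, 1, 1, 1, 50])
print(result)
```

In this example the skew cutoff is exceeded but the kurtosis cutoff is not, which shows that the two statistics describe different kinds of non-normality.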
6
Q

limitations of stat tests of normality

A
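  • sensitivity depends on sample size: a big departure from normality is needed to reach significance in small samples, but a small departure is enough in large samples
  • non-normality is less of a concern in large samples, so the tests are most sensitive where the violation matters least
  • doesn't take the type of non-normality into account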
7
Q

descriptive approach for evaluating normality definition

A
  • looks at descriptives and/or graphic displays to quantify the magnitude and nature of non-normality
8
Q

____ kurtosis is more problematic than ____ kurtosis in t tests, ANOVAs, correlations, and regressions

A

positive, negative

9
Q

which approach makes more sense for normality testing; NHST or descriptives

A

descriptives, because it combines threshold values with Q-Q plots

10
Q

thin vs fat tails relative to a normal distribution

A
  • thin: fewer extreme observations than normal distributions
  • fat: more extreme observations than normal distributions
11
Q

if data is normal, the scatterplot (normal Q-Q plot) should resemble a _____

A

straight line (as opposed to cloud shape)
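
The points behind such a plot can be computed without a plotting library. This is a rough sketch under stated assumptions: the plotting position (i - 0.5) / n is one common convention rather than something the card specifies, and `normal_qq_points` and `pearson_r` are hypothetical helper names.

```python
from statistics import NormalDist, mean, stdev


def normal_qq_points(values):
    """Pair each ordered, standardized score with its expected normal quantile."""
    n = len(values)
    ordered = sorted(values)
    m, s = mean(ordered), stdev(ordered)
    std_normal = NormalDist()
    points = []
    for i, x in enumerate(ordered, start=1):
        p = (i - 0.5) / n  # plotting position (an assumed convention)
        points.append((std_normal.inv_cdf(p), (x - m) / s))
    return points


def pearson_r(points):
    """Correlation of the plotted points: close to 1 means close to a straight line."""
    xs = [p[0] for p in points]
    ys = [p[1] for p in points]
    mx, my = mean(xs), mean(ys)
    sxy = sum((x - mx) * (y - my) for x, y in points)
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / (sxx * syy) ** 0.5


# Scores placed exactly at normal quantiles (mean 100, SD 15) for illustration.
sample = [NormalDist(100, 15).inv_cdf((i - 0.5) / 20) for i in range(1, 21)]
points = normal_qq_points(sample)
print(round(pearson_r(points), 4))
```

Plotting the first element of each pair on the x axis and the second on the y axis gives the scatterplot the card describes.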

12
Q

if the middle of the scatterplot line is straight and the ends flatten, it _____

A

indicates thin tails and is not problematic

13
Q

if the middle of the scatterplot line is straight and the ends have a steep slope, it _____

A

indicates fat tails and is problematic

14
Q

assumption of homogeneity of variance definition

A

the variances of scores on the dependent variable within each group (condition) are the same across all groups (conditions)

15
Q

evaluating homogeneity of variance: NHST approach definition

A
  • tests whether the variances in the groups are significantly different from one another
16
Q

evaluating homogeneity of variance: descriptive approach

A
  • looks only at imperfection
  • looks at descriptive stats and/or graphic displays to quantify the magnitude of differential variances (largest vs smallest SD)
  • looks at a threshold ratio of largest to smallest variance
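
The ratio of largest to smallest variance is simple to compute. A minimal sketch, assuming each group is a list of scores; `variance_ratio` is a hypothetical name, and no threshold is hard-coded because the card does not state one.

```python
from statistics import variance


def variance_ratio(groups):
    """Ratio of the largest to the smallest sample variance across groups."""
    variances = [variance(g) for g in groups]
    return max(variances) / min(variances)


control = [1, 2, 3, 4, 5]       # sample variance 2.5
treatment = [2, 4, 6, 8, 10]    # sample variance 10
ratio = variance_ratio([control, treatment])
print(ratio)
```

The resulting ratio would then be compared against whatever threshold the course or textbook recommends.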
17
Q

tests for homogeneity of variance

A
  • Levene's test
  • Hartley's variance ratio test (F-max test)
18
Q

limitations of NHST approach for homogeneity of variance

A
  • role of sample size (difference in variance is less of a concern for small and more of a concern for larger sample sizes)
  • tests are insensitive to differences in variance in small samples and sensitive in large samples
  • difference in variance is a question of magnitude, not just significance
19
Q

if variances are equal, the scatterplot should resemble a straight line with a slope of ___ and an intercept of ____, whereas when the variances are not equal, the points will not cluster around that line and the slope will be different from __

A

1, the difference between means, 1
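
The slope and intercept on this card can be illustrated by sorting two groups and fitting a line through the paired order statistics. This is only a sketch: it assumes equal group sizes and an ordinary least-squares fit, and `qq_line` is a hypothetical helper.

```python
def qq_line(group_a, group_b):
    """Fit a least-squares line through the paired sorted scores of two groups."""
    if len(group_a) != len(group_b):
        raise ValueError("this sketch assumes equal group sizes")
    xs, ys = sorted(group_a), sorted(group_b)
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = my - slope * mx
    return slope, intercept


a = [1, 2, 4, 7, 9]
shifted = [x + 3 for x in reversed(a)]   # same spread, mean shifted by 3
stretched = [2 * x for x in a]           # spread doubled

equal_slope, equal_intercept = qq_line(a, shifted)
unequal_slope, _ = qq_line(a, stretched)
print(equal_slope, equal_intercept, unequal_slope)
```

With equal variances the slope is 1 and the intercept equals the difference between means (3 here). Doubling the spread of one group moves the slope to 2.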

20
Q

independence of observations definition

A
  • each observation (between subjects) or set of observations (repeated measures) in the dataset is independent of all other observations/sets
  • ex of non-independence: roommates/partners
21
Q

positive associations inflate ___ and negative associations inflate ___

A

alpha, beta

22
Q

evaluating independence of observation

A
  • examine structural properties of the data to see if a basis exists for questioning the validity of the assumption
  • if no evident basis, it's okay to carry on
  • thresholds are up for debate
  • if a basis exists, independence can be assessed by computing the intraclass correlation for the part of the data assumed to lack independence
  • if the correlation is very small (<0.10), it's fine to use t test/ANOVA
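
One way to compute an intraclass correlation for clustered scores (for example, roommate pairs) is the one-way ANOVA estimator. This sketch assumes equal cluster sizes, and the one-way form ICC(1) is an assumption, since the card does not say which variant is meant; `icc_one_way` is a hypothetical name.

```python
def icc_one_way(clusters):
    """One-way intraclass correlation, ICC(1), for equal-sized clusters."""
    k = len(clusters[0])
    if any(len(c) != k for c in clusters):
        raise ValueError("this sketch assumes equal cluster sizes")
    n = len(clusters)
    grand_mean = sum(sum(c) for c in clusters) / (n * k)
    cluster_means = [sum(c) / k for c in clusters]
    # Mean square between clusters and mean square within clusters.
    ms_between = k * sum((m - grand_mean) ** 2 for m in cluster_means) / (n - 1)
    ms_within = sum(
        (x - m) ** 2 for c, m in zip(clusters, cluster_means) for x in c
    ) / (n * (k - 1))
    return (ms_between - ms_within) / (ms_between + (k - 1) * ms_within)


# Hypothetical roommate pairs whose scores are very similar within each pair.
roommates = [(1, 2), (3, 4), (5, 6), (7, 8)]
icc = icc_one_way(roommates)
print(round(icc, 3), "independence questionable" if icc >= 0.10 else "fine to proceed")
```

Here the correlation is far above 0.10, so treating the eight scores as independent observations would not be justified.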
23
Q

addressing violations of normality

A
  • use alternative statistical procedures that don't need normality
  • evaluate level of measurement assumptions
  • identify and remove outliers
  • transform data to normalize the distribution
24
Q

addressing violations of homogeneity of variance

A
  • use alternative procedures that don't need homogeneity of variance
  • evaluate level of measurement assumptions
  • identify and remove outliers
25
Q

addressing violations of independence of observations

A
  • alternative statistical procedures
  • ex multilevel modeling (MLM) or hierarchical linear modeling (HLM)
26
Q

outliers definition

A
  • extreme values that differ greatly from the other observations in the dataset and suggest they're drawn from another population
27
Q

examples of common outliers

A
  • data entry/encoding error (less common now that data entry is no longer manual)
  • response latency data (longer response times due to distraction, error, etc)
  • open ended estimate data
28
Q

problems with outliers

A
  • often responsible for violations of homogeneity of variance/normality
  • conceptual validity
  • disproportionate influence on stat results
29
Q

identifying outliers

A
  • impossible values in frequency tables/histogram
  • steep tails in normal qq plots
  • standardized residuals for observations
  • studentized deleted residuals
30
Q

standardized residuals for observations

A
  • index of deviation from the mean
  • follows z distribution
  • normally distributed, N=100: 1 value should be >2.6
  • normally distributed, N=1000: 4 values should be >3.0
  • general threshold of 4 or 5 is suggested
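
Flagging scores by standardized deviation from the mean can be sketched as follows. This is an illustration under stated assumptions: it standardizes raw scores against the sample mean and SD (the simplest case of a residual) and uses the threshold of 4 from the card; the data are made up.

```python
from statistics import mean, stdev


def standardized_scores(values):
    """z score of each observation relative to the sample mean and SD."""
    m, s = mean(values), stdev(values)
    return [(x - m) / s for x in values]


# Hypothetical data: 29 ordinary scores plus one extreme score.
scores = [9] * 10 + [10] * 9 + [11] * 10 + [40]
z = standardized_scores(scores)
extreme = [x for x, zi in zip(scores, z) if abs(zi) > 4]
print(extreme)
```

Because the extreme score is included in the mean and SD, it pulls both toward itself, which is the motivation for the deleted version of the residual.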
31
Q

studentized deleted residuals

A
  • index of deviation from mean (not including target observation in mean and SD calculation)
  • follows t distribution of df=n-2
  • sample of 100, value of >3.6 = outlier
  • sample of 1000, value of >4.07 = outlier
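
A leave-one-out version of the same idea can be sketched as below. Assumptions are labeled: this is the intercept-only case (deviation from the mean of the remaining scores), the scaling term sqrt(1 + 1/(n-1)) is included so the statistic follows a t distribution with n-2 df, the data are made up, and the cutoff of 3.6 for a sample of 100 is taken from the card.

```python
from math import sqrt
from statistics import mean, stdev


def deleted_residuals(values):
    """Studentized deleted residual of each score (intercept-only model)."""
    n = len(values)
    residuals = []
    for i, x in enumerate(values):
        rest = values[:i] + values[i + 1:]   # drop the target observation
        m, s = mean(rest), stdev(rest)       # mean and SD without it
        residuals.append((x - m) / (s * sqrt(1 + 1 / (n - 1))))
    return residuals


# Hypothetical sample of 100: 99 scores between 45 and 55, plus one score of 80.
scores = [50 + (i % 11) - 5 for i in range(99)] + [80]
residuals = deleted_residuals(scores)
flagged = [i for i, r in enumerate(residuals) if abs(r) > 3.6]
print(flagged)
```

Only the last observation (index 99) exceeds the cutoff, because its own extremeness no longer inflates the mean and SD it is compared against.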
32
Q

response to outlier

A
  • correct or treat impossible values as missing data
  • possible but highly discrepant values can be trimmed or capped to most extreme value/specified values
  • highly discrepant values are treated as missing
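
The first two responses can be combined in a small cleaning step. A minimal sketch, assuming response latencies in milliseconds; the caps (200 and 2000) and the rule that negative values are impossible are hypothetical choices for illustration, not values from the card.

```python
def treat_outliers(values, lower, upper, possible_min=0.0):
    """Treat impossible values as missing and cap discrepant ones at set limits."""
    cleaned = []
    for x in values:
        if x < possible_min:
            cleaned.append(None)                      # impossible: treat as missing
        else:
            cleaned.append(min(max(x, lower), upper)) # cap to the specified limits
    return cleaned


latencies = [420, 515, -30, 610, 9800, 120]
cleaned = treat_outliers(latencies, lower=200, upper=2000)
print(cleaned)
```

Whatever limits are used, they should be fixed before looking at the results, in line with the clear rules and procedures the deck calls for.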
33
Q

philosophical issues w outliers

A
  • minimalist perspective: never touch the data, strong rationale needed for deletion/alteration of data (due to potential abuse)
  • maximalist perspective: routine altering/deleting of values, outliers violate assumptions, hard to interpret, must set clear rules/procedures to avoid abuse
  • intermediate perspective: justifiable w/ clear rules/procedures and high thresholds for outliers
34
Q

levels of measurement

A
  • nominal: number assignment is about group membership/categories (ex nationality)
  • ordinal: number assignment is about rank order on a scale but does not reflect magnitude of difference (ex favorites: the difference between ranks 1-2 and ranks 4-5 may be different)
  • interval: number assignment is about rank order and magnitude of difference but not ratios (ex Celsius scale: 0 for freezing, 100 for boiling)
  • ratio: number assignment is about rank order, magnitude of difference, and ratios (ex mass, length)
35
Q

what level of measurement has an absolute, meaningful zero (0) point

36
Q

before conducting analysis (t test/ANOVA) and descriptive stats, it's only meaningful if the dependent variable has at least _______ properties