Module 5 Flashcards

Question 1

Q

data fishiness assumptions

Answer

A

assumption of normality
assumption of homogeneity of variance
independence of observations

Question 2

Q

assumption of normality general definition

Answer

A

scores on the dependent variable within each group are assumed to be sampled from a normal distribution

Question 3

Q

NHST for evaluating normality general definition

Answer

A

tests whether the sample distribution is significantly different from a normal distribution with the same mean and SD

Question 4

Q

which tests are used in the NHST approach to the assumption of normality

Answer

A

Shapiro-Wilk test
Kolmogorov-Smirnov test
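
A minimal sketch of the idea behind the Kolmogorov-Smirnov approach, assuming the comparison is against a normal distribution with the sample's own mean and SD. It only computes the D statistic (the largest gap between the empirical and normal cumulative distributions); turning D into a p-value needs tables or software that are not shown here. The function name and the two example samples are made up for illustration.

```python
from statistics import NormalDist, mean, stdev

def ks_statistic_vs_normal(scores):
    """Largest gap between the empirical CDF and a normal CDF
    that has the same mean and SD as the sample."""
    xs = sorted(scores)
    n = len(xs)
    reference = NormalDist(mean(xs), stdev(xs))
    d = 0.0
    for i, x in enumerate(xs):
        cdf = reference.cdf(x)
        # The empirical CDF steps from i/n up to (i+1)/n at x; check both sides.
        d = max(d, abs((i + 1) / n - cdf), abs(cdf - i / n))
    return d

# Hypothetical example data
roughly_symmetric = [2, 3, 4, 4, 5, 5, 5, 6, 6, 7, 8]
strongly_skewed = [1, 1, 1, 1, 2, 2, 2, 3, 3, 4, 30]

d_sym = ks_statistic_vs_normal(roughly_symmetric)
d_skew = ks_statistic_vs_normal(strongly_skewed)
print(round(d_sym, 3), round(d_skew, 3))
```

A larger D means the sample sits further from the reference normal curve; the skewed sample should give the larger value here.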

Question 5

Q

skew and kurtosis definition and cut offs

Answer

A

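skew: asymmetry of the distribution (0 = symmetric, as in a normal distribution); descriptive-approach cutoff: absolute value > 2
kurtosis: how heavy or light the distribution's tails are (heavy = high kurtosis/many outliers, light = low kurtosis/few or no outliers); descriptive-approach cutoff: > 7
for both, under the NHST approach, a z value (statistic divided by its standard error) of 1.96 or above in absolute value is treated as non-normal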
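
A rough sketch of the descriptive cutoffs, assuming moment-based formulas and assuming kurtosis is reported as excess kurtosis (normal = 0); software packages differ slightly in the exact formula. The function names and sample data are hypothetical.

```python
from statistics import mean

def skew_and_excess_kurtosis(scores):
    """Moment-based skew and excess kurtosis (a normal distribution scores 0 on both)."""
    n = len(scores)
    m = mean(scores)
    m2 = sum((x - m) ** 2 for x in scores) / n
    m3 = sum((x - m) ** 3 for x in scores) / n
    m4 = sum((x - m) ** 4 for x in scores) / n
    skew = m3 / m2 ** 1.5
    excess_kurtosis = m4 / m2 ** 2 - 3  # subtract 3 so that normal = 0
    return skew, excess_kurtosis

def flags_non_normal(scores, skew_cutoff=2.0, kurtosis_cutoff=7.0):
    """Apply the card's descriptive cutoffs: |skew| > 2 or kurtosis > 7."""
    skew, kurt = skew_and_excess_kurtosis(scores)
    return abs(skew) > skew_cutoff or kurt > kurtosis_cutoff

# Hypothetical example data
roughly_symmetric = [2, 3, 4, 4, 5, 5, 5, 6, 6, 7, 8]
strongly_skewed = [1, 1, 1, 1, 2, 2, 2, 3, 3, 4, 30]

print(flags_non_normal(roughly_symmetric))
print(flags_non_normal(strongly_skewed))
```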

Question 6

Q

limitations of stat tests of normality

Answer

A

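a large departure from normality is needed for significance in small samples; a small departure is enough in large samples
non-normality is less of a concern in large samples, which is exactly where the tests are most sensitive
does not take the type of non-normality into account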

Question 7

Q

descriptive approach for evaluating normality definition

Answer

A

looks at descriptives and/or graphic displays to quantify the magnitude and nature of non-normality

Question 8

Q

____ kurtosis is more problematic than ____ kurtosis in t tests, ANOVAs, correlations, and regressions

Answer

A

positive, negative

Question 9

Q

which approach makes more sense for normality testing: NHST or descriptives

Answer

A

descriptives, because it combines threshold values with Q-Q plots

Question 10

Q

thin vs fat tails relative to a normal distribution

Answer

A

thin: fewer extreme observations than normal distributions
fat: more extreme observations than normal distributions

Question 11

Q

if data are normal, the Q-Q scatterplot should resemble a _____

Answer

A

straight line (as opposed to cloud shape)
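
A sketch of where the points on a normal Q-Q plot come from, assuming the plotting position (i + 0.5) / n (one common convention; software may use another). Each observed score is paired with the score a normal distribution would be expected to produce at the same rank. The correlation between the two is used here only as a quick numeric stand-in for "how straight is the line"; the helper names and data are hypothetical.

```python
from math import sqrt
from statistics import NormalDist, mean, stdev

def normal_qq_points(scores):
    """Pairs of (expected normal score, observed score), sorted by rank."""
    xs = sorted(scores)
    n = len(xs)
    reference = NormalDist(mean(xs), stdev(xs))
    # Plotting position (i + 0.5) / n is an assumed convention.
    expected = [reference.inv_cdf((i + 0.5) / n) for i in range(n)]
    return list(zip(expected, xs))

def straightness(points):
    """Pearson correlation of the Q-Q points; values near 1 mean a nearly straight line."""
    ex = [p[0] for p in points]
    ob = [p[1] for p in points]
    mx, my = mean(ex), mean(ob)
    sxy = sum((a - mx) * (b - my) for a, b in points)
    sxx = sum((a - mx) ** 2 for a in ex)
    syy = sum((b - my) ** 2 for b in ob)
    return sxy / sqrt(sxx * syy)

# Hypothetical example data
roughly_symmetric = [2, 3, 4, 4, 5, 5, 5, 6, 6, 7, 8]
strongly_skewed = [1, 1, 1, 1, 2, 2, 2, 3, 3, 4, 30]

r_sym = straightness(normal_qq_points(roughly_symmetric))
r_skew = straightness(normal_qq_points(strongly_skewed))
print(round(r_sym, 3), round(r_skew, 3))
```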

Question 12

Q

if the middle of the scatterplot line is straight and the ends flatten, it _____

Answer

A

indicates thin tails and is not problematic

Question 13

Q

if the middle of the scatterplot line is straight and the ends have a steep slope, it _____

Answer

A

indicates fat tails and is problematic

Question 14

Q

assumption of homogeneity of variance definition

Answer

A

variances of scores on the dependent variable within each group (condition) are the same across all groups (conditions)

Question 15

Q

evaluating homogeneity of variance: NHST approach definition

Answer

A

tests whether the variances in the groups are significantly different from one another

Question 16

Q

evaluating homogeneity of variance: descriptive approach

Answer

A

looks only at imperfection
looks at descriptive stats and/or graphic displays to quantify the magnitude of differential variances (largest vs smallest SD)
looks at threshold ratio of largest to smallest variances
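
A small sketch of the ratio of largest to smallest variance. The card does not give the threshold value, so none is assumed here; the group names and scores are hypothetical.

```python
from statistics import variance

def variance_ratio(groups):
    """Largest group variance divided by smallest group variance."""
    variances = [variance(scores) for scores in groups.values()]
    return max(variances) / min(variances)

# Hypothetical example data: same mean, very different spread
groups = {
    "control": [4, 5, 6, 5, 4, 6],
    "treatment": [1, 9, 5, 2, 8, 5],
}

ratio = variance_ratio(groups)
print(ratio)
```

The ratio is then compared against whatever threshold the course or the literature recommends.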

Question 17

Q

tests for homogeneity of variance

Answer

A

Levene's test
Hartley's variance ratio test (F-max test)
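
A sketch of the statistic behind Levene's test, assuming the mean-centered version: each score is replaced by its absolute deviation from its own group mean, and a one-way ANOVA F ratio is computed on those deviations. Only the statistic W is computed; the p-value comes from an F distribution with (k - 1, N - k) degrees of freedom, which is left to statistical software. The data are hypothetical.

```python
from statistics import mean

def levene_w(groups):
    """Levene's W: one-way ANOVA F ratio on absolute deviations from group means."""
    deviations = []
    for scores in groups:
        m = mean(scores)
        deviations.append([abs(x - m) for x in scores])

    k = len(deviations)                      # number of groups
    n_total = sum(len(z) for z in deviations)
    grand = mean([v for z in deviations for v in z])

    between = sum(len(z) * (mean(z) - grand) ** 2 for z in deviations)
    within = sum((v - mean(z)) ** 2 for z in deviations for v in z)
    return ((n_total - k) / (k - 1)) * between / within

# Hypothetical example data
control = [4, 5, 6, 5, 4, 6]
treatment = [1, 9, 5, 2, 8, 5]

w = levene_w([control, treatment])
print(round(w, 3))
```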

Question 18

Q

limitations of NHST approach for homogeneity of variance

Answer

A

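role of sample size: the test raises less concern about a difference in variances in small samples and more concern in large samples
insensitive to differences in variance in small samples and sensitive to them in large samples
a difference in variances is a question of magnitude, which a significance test does not quantify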

Question 19

Q

if variances are equal, the scatterplot should resemble a straight line with a slope of ___ and an intercept of ____, whereas when the variances are not equal, the scatterplot will not cluster around the line and the slope will be different from __

Answer

A

1, the difference between means, 1

Question 20

Q

independence of observations definition

Answer

A

each observation (between subjects) or set of observations (repeated measures) from the dataset is independent of all other observations/sets
ex of non-independence: roommates/partners

Question 21

Q

positive associations inflate ___ and negative associations inflate ___

Answer

A

alpha, beta

Question 22

Q

evaluating independence of observations

Answer

A

examine structural properties of data to see if basis exists for questioning validity of assumption
if no evident basis, it's okay to carry on
thresholds are up for debate
if a basis exists, independence can be assessed by computing the intraclass correlation for the part of the data suspected of lacking independence
if the correlation is very small (<0.10), it's fine to use t test/ANOVA
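
A sketch of one way to compute an intraclass correlation, assuming the one-way ANOVA version with equal-sized clusters: ICC = (MSB - MSW) / (MSB + (k - 1) * MSW), where k is the cluster size. Other ICC variants exist and the course may use a different one. The roommate scores below are hypothetical.

```python
from statistics import mean

def intraclass_correlation(clusters):
    """One-way ICC for equal-sized clusters (e.g. roommate pairs)."""
    k = len(clusters[0])                      # scores per cluster
    if any(len(c) != k for c in clusters):
        raise ValueError("this sketch assumes equal-sized clusters")
    n = len(clusters)                         # number of clusters
    grand = mean([x for c in clusters for x in c])

    # Mean square between clusters and mean square within clusters
    msb = k * sum((mean(c) - grand) ** 2 for c in clusters) / (n - 1)
    msw = sum((x - mean(c)) ** 2 for c in clusters for x in c) / (n * (k - 1))
    return (msb - msw) / (msb + (k - 1) * msw)

# Hypothetical scores for five roommate pairs
roommate_pairs = [(3, 4), (7, 8), (5, 5), (9, 8), (2, 3)]

icc = intraclass_correlation(roommate_pairs)
print(round(icc, 3))
print("independence questionable" if icc >= 0.10 else "fine to use t test/ANOVA")
```

Roommates who score alike produce a high value, well above the 0.10 guideline on the card.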

Question 23

Q

addressing violations of normality

Answer

A

use alternative stat procedures that don't need normality
evaluate level of measurement assumptions
identify and remove outliers
transform data to normalize the distribution

Question 24

Q

addressing violations of homogeneity of variance

Answer

A

use alternative procedures that don't need homogeneity of variance
evaluate level of measurement assumptions
identify and remove outliers

Question 25

Q

addressing violations of independence of observations

Answer

A

alt stat procedures
ex multilevel modeling (MLM) or hierarchical linear modeling (HLM)

Question 26

Q

outliers definition

Answer

A

extreme values that differ greatly from other observations in the dataset and suggest they're drawn from another population

Question 27

Q

examples of common outliers

Answer

A

data entry/encoding error (less common now that data entry is rarely manual)
response latency data (unusually long response times due to error, distraction, etc)
open ended estimate data

Question 28

Q

problems with outliers

Answer

A

often responsible for violations of homogeneity of variance/normality
conceptual validity
disproportionate influence on stat results

Question 29

Q

identifying outliers

Answer

A

impossible values in frequency tables/histogram
steep tails in normal Q-Q plots
standardized residuals for observations
studentized deleted residuals

Question 30

Q

standardized residuals for observations

Answer

A

index of deviation from the mean
follows the z distribution
normally distributed, N=100: about 1 value should be >2.6
normally distributed, N=1000: about 4 values should be >3.0
a general threshold of 4 or 5 is suggested
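
A sketch of both ideas on this card, assuming two-tailed counts (values beyond the cutoff in either direction). The first function gives the expected number of extreme z values in a normal sample, which is why a few large values are not alarming in big samples; the exact figure depends on how tails are counted, so the card's counts are best read as approximate. The second flags scores beyond the suggested threshold. Function names and data are hypothetical.

```python
from statistics import NormalDist, mean, stdev

def expected_count_beyond(n, cutoff):
    """Expected number of scores with |z| above cutoff in a normal sample of size n."""
    return n * 2 * (1 - NormalDist().cdf(cutoff))

def flag_outliers(scores, threshold=4.0):
    """Scores whose standardized value exceeds the threshold in absolute size."""
    m, s = mean(scores), stdev(scores)
    return [x for x in scores if abs((x - m) / s) > threshold]

print(round(expected_count_beyond(100, 2.6), 2))
print(round(expected_count_beyond(1000, 3.0), 2))

# Hypothetical data: 40 ordinary scores plus one extreme value
scores = [50 + (i % 5) for i in range(40)] + [400]
print(flag_outliers(scores))
```

Note that in very small samples a standardized score cannot reach 4 at all, so this kind of threshold only makes sense with a reasonable number of observations.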

Question 31

Q

studentized deleted residuals

Answer

A

index of deviation from the mean (not including the target observation in the mean and SD calculation)
follows a t distribution with df=n-2
sample of 100, value of >3.6 = outlier
sample of 1000, value of >4.07 = outlier
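
A sketch of the leave-one-out logic, assuming the simplest case of a single group (a model with only a mean). Each score is compared with the mean and SD of all the other scores, so an extreme value cannot mask itself by inflating the SD. The scaling term sqrt(1 + 1/(n - 1)) accounts for the uncertainty in the leave-one-out mean. The function name and data are hypothetical.

```python
from math import sqrt
from statistics import mean, stdev

def studentized_deleted_residuals(scores):
    """For each score: deviation from the mean of the OTHER scores,
    scaled by the SD of the other scores (single-group case)."""
    scores = list(scores)
    n = len(scores)
    results = []
    for i, x in enumerate(scores):
        rest = scores[:i] + scores[i + 1:]   # drop the target observation
        m, s = mean(rest), stdev(rest)
        results.append((x - m) / (s * sqrt(1 + 1 / (n - 1))))
    return results

# Hypothetical data with one extreme value at the end
scores = [4, 5, 6, 5, 4, 6, 5, 20]
residuals = studentized_deleted_residuals(scores)
print([round(r, 2) for r in residuals])
```

Each value would then be compared with the cutoff for the sample size, such as those listed on the card.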

Question 32

Q

response to outlier

Answer

A

correct or treat impossible values as missing data
possible but highly discrepant values can be trimmed or capped to most extreme value/specified values
highly discrepant values are treated as missing
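
A sketch of the three responses for possible but highly discrepant values. The bounds are hypothetical and would come from rules set in advance; the response-time data are made up.

```python
def cap_to_bounds(scores, lower, upper):
    """Cap values to specified limits."""
    return [min(max(x, lower), upper) for x in scores]

def cap_to_most_extreme_kept(scores, lower, upper):
    """Replace out-of-bounds values with the most extreme value that is still in bounds."""
    kept = [x for x in scores if lower <= x <= upper]
    return cap_to_bounds(scores, min(kept), max(kept))

def treat_as_missing(scores, lower, upper):
    """Replace out-of-bounds values with None (missing)."""
    return [x if lower <= x <= upper else None for x in scores]

# Hypothetical response times in ms, with assumed bounds of 200 to 2000
times = [420, 510, 480, 530, 4900]

print(cap_to_bounds(times, 200, 2000))
print(cap_to_most_extreme_kept(times, 200, 2000))
print(treat_as_missing(times, 200, 2000))
```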

Question 33

Q

philosophical issues with outliers

Answer

A

minimalist perspective: never touch the data, strong rationale needed for deletion/alteration of data (due to potential abuse)
maximalist perspective: routine altering/deleting of values, outliers violate assumptions, hard to interpret, must set clear rules/procedures to avoid abuse
intermediate perspective: justifiable w/ clear rules/procedures and high thresholds for outliers

Question 34

Q

levels of measurement

Answer

A

nominal: number assignment is about group membership/categories (ex nationality)
ordinal: number assignment is about rank order on a scale but does not reflect magnitude of difference (ex favorites; the gap between ranks 1-2 and ranks 4-5 may differ)
interval: number assignment is about rank order and magnitude of difference but not ratios (ex Celsius scale, 0 for freezing, 100 for boiling)
ratio: number assignment is about rank order, magnitude of difference, and ratios (ex mass, length)
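
A small worked example of the interval vs ratio distinction, using the Celsius example from the card and the Kelvin scale (which has a true zero) as the ratio-level comparison. Differences survive the change of scale; ratios do not.

```python
def celsius_to_kelvin(celsius):
    return celsius + 273.15

cold, warm = 10.0, 20.0   # degrees Celsius

# Differences are meaningful on an interval scale: same gap on both scales
diff_celsius = warm - cold
diff_kelvin = celsius_to_kelvin(warm) - celsius_to_kelvin(cold)

# Ratios are not: "twice as hot" in Celsius disappears on a scale with a true zero
ratio_celsius = warm / cold
ratio_kelvin = celsius_to_kelvin(warm) / celsius_to_kelvin(cold)

print(diff_celsius, round(diff_kelvin, 2))
print(ratio_celsius, round(ratio_kelvin, 3))
```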

Question 35

Q

what level of measurement has an absolute, meaningful zero (0) point

Question 36

Q

before conducting an analysis (t test/ANOVA) and computing descriptive stats, it's only meaningful if the independent variable has at least _______ properties