Chapter 4 Flashcards
reliability
A measure is said to be reliable if the results are repeatable when behaviours are remeasured. Reliability is a direct function of measurement error: the less error, the more reliable the measure.
validity
a test is said to be valid if it measures what it is designed to measure (validity assumes reliability, but not vice versa: a reliable test is not necessarily valid)
validity types
- content validity
- criterion validity
- construct validity
face validity
how valid a test appears to the taker, not really a kind of validity
content validity
simplest level of validity - concerns whether or not the content of the items on a test makes sense in terms of the construct being measured
criterion validity
concerns whether a measure can accurately forecast future behaviour or is meaningfully related to some other measure of behaviour
construct validity
concerns whether a test adequately measures some construct; connects to the operational definition and involves establishing convergent and discriminant validity
convergent validity
part of establishing construct validity - scores on a test that measures some construct should relate to scores on another test measuring the same construct
discriminant validity
part of establishing construct validity - scores on a test that measures some construct should not relate to scores on another test measuring a different construct
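Convergent and discriminant validity are usually checked with correlations. The sketch below is only an illustration: the three score lists are made up, and `pearson_r` is a hypothetical helper written here, not something from the chapter.

```python
from math import sqrt

def pearson_r(xs, ys):
    """Pearson correlation between two equal-length lists of scores."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)

# Made-up scores for six test takers (illustration only).
anxiety_test_a = [10, 12, 15, 18, 20, 25]
anxiety_test_b = [11, 13, 14, 19, 21, 24]    # meant to measure the same construct
vocabulary_test = [30, 22, 28, 21, 29, 26]   # meant to measure a different construct

convergent_r = pearson_r(anxiety_test_a, anxiety_test_b)     # should be strong
discriminant_r = pearson_r(anxiety_test_a, vocabulary_test)  # should be near zero

print(f"convergent r = {convergent_r:.2f}, discriminant r = {discriminant_r:.2f}")
```

A high correlation between the two anxiety tests is evidence of convergent validity; a correlation near zero with the vocabulary test is evidence of discriminant validity.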
nominal scale
functional unity
ordinal scale
functional unity, order
interval scale
functional unity, order, equal intervals
ratio scale
functional unity, order, equal intervals, absolute 0
descriptive statistics
summarize the data collected from a sample of the population
inferential statistics
allow you to draw conclusions about data that can be applied to the wider population
mean
measure of central tendency - average (with standard deviation)
median
measure of central tendency - middle number in data arranged sequentially (with IQR)
mode
measure of central tendency - number that appears most frequently (with range)
outliers
scores that fall far from the mean and from the bulk of the other scores
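A quick way to see how the three measures of central tendency differ, and how an outlier affects them, is to compute them on a small set of scores. The scores below are made up for illustration; the functions come from Python's standard `statistics` module.

```python
import statistics

# Made-up scores; 40 is an outlier.
scores = [2, 3, 3, 5, 7, 10, 40]

mean_score = statistics.mean(scores)      # sum of scores / number of scores
median_score = statistics.median(scores)  # middle score once sorted
mode_score = statistics.mode(scores)      # most frequent score

print(mean_score, median_score, mode_score)
```

Here the single outlier pulls the mean up to 10, while the median (5) and the mode (3) stay with the bulk of the scores.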
range
measure of variability: (highest score - lowest score) + 1
IQR
measure of variability - the range of the scores that fall in the middle 50% of the set of scores (arranged sequentially), i.e. the distance from the 25th to the 75th percentile
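The range and IQR can be sketched in a few lines. The scores are made up, and the quartiles come from `statistics.quantiles` with its default method; textbooks differ slightly in how they locate quartiles, so hand calculations may not match exactly.

```python
import statistics

# Made-up scores, already arranged sequentially.
scores = [4, 7, 8, 10, 12, 13, 15, 18, 20, 22, 30]

# Range as defined on the card: highest - lowest + 1.
score_range = max(scores) - min(scores) + 1

# Quartiles: the cut points that split the sorted scores into four equal parts.
q1, q2, q3 = statistics.quantiles(scores, n=4)
iqr = q3 - q1  # spread of the middle 50% of scores

print(score_range, q1, q3, iqr)
```

Because the IQR ignores the top and bottom quarters of the data, the extreme score of 30 affects the range but not the IQR.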
standard deviation
estimate of the average amount by which scores in the sample deviate from the mean: SD = sqrt( Σ(X - X̄)² / (N - 1) ), where (X - X̄) is the difference between each score X and the mean X̄. In a normal distribution, about 68% of scores fall between -1 and +1 standard deviations of the mean, and about 95% fall between -2 and +2
variance
SD^2
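As a worked example of the standard deviation and variance cards, the sketch below applies the N - 1 formula step by step to made-up scores and compares the result with `statistics.stdev`, which uses the same sample formula.

```python
import statistics
from math import sqrt

# Made-up scores with a mean of 5.
scores = [2, 4, 4, 4, 5, 5, 7, 9]

n = len(scores)
mean_score = sum(scores) / n

# Sum of squared deviations from the mean.
sum_squared_deviations = sum((x - mean_score) ** 2 for x in scores)

variance = sum_squared_deviations / (n - 1)  # SD squared
sd = sqrt(variance)

print(f"variance = {variance:.3f}, SD = {sd:.3f}")
print(f"statistics.stdev gives {statistics.stdev(scores):.3f}")
```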
histogram
Y=frequency, X=score
frequency distribution
tally of # of times each score appears
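A frequency distribution is just a tally, and a histogram is that tally drawn with score on the X axis and frequency on the Y axis. The sketch below uses made-up scores and prints a rough sideways text version, with one row per score and one star per occurrence.

```python
from collections import Counter

# Made-up scores.
scores = [3, 5, 5, 7, 7, 7, 8]

# Tally of the number of times each score appears.
frequency = Counter(scores)

# One row per score, one star per occurrence.
for score in sorted(frequency):
    print(f"{score:>2} | {'*' * frequency[score]}")
```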
alpha level
the cutoff probability (commonly .05) for deciding that results would be too unlikely if Ho were true, so Ho is rejected; equal to the probability of a Type 1 error when Ho is true
Type 1 error
Reject Ho when it is true
Type 2 error
Retain Ho when it is false
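The link between alpha and Type 1 errors can be illustrated with a small simulation. This is a sketch under assumptions not in the chapter: a two-tailed one-sample z-test with a known population SD, and made-up population values (mean 100, SD 15). Ho is true in every simulated study, so every rejection is a Type 1 error.

```python
import random
from math import sqrt
from statistics import NormalDist, mean

def z_test_rejects(sample, mu0, sigma, alpha=0.05):
    """Two-tailed one-sample z-test; returns True if Ho (mean == mu0) is rejected."""
    z = (mean(sample) - mu0) / (sigma / sqrt(len(sample)))
    critical = NormalDist().inv_cdf(1 - alpha / 2)
    return abs(z) > critical

rng = random.Random(42)  # fixed seed so the run is repeatable
trials = 2000

# Every sample is drawn from a population where Ho is true (mean really is 100).
false_alarms = sum(
    z_test_rejects([rng.gauss(100, 15) for _ in range(25)], mu0=100, sigma=15)
    for _ in range(trials)
)
type1_rate = false_alarms / trials

print(f"Proportion of studies rejecting a true Ho: {type1_rate:.3f}")
```

The proportion of false rejections should land close to the alpha level of .05.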
systematic variance
result of an identifiable factor, either the variable of interest or some factor you have failed to control adequately
error variance
nonsystematic variability due to individual differences and random, unpredictable effects during the study
effect size
estimate of the magnitude of the difference among sets of scores, taking variability into account (eta scores from individual studies can be combined in a meta-analysis of related studies)
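One common effect size for two groups is Cohen's d, the difference between the means divided by the pooled standard deviation. The card does not name a specific statistic, so Cohen's d is an assumption here, and the scores are made up.

```python
from math import sqrt
from statistics import mean, variance

def cohens_d(group_a, group_b):
    """Difference between group means in pooled-standard-deviation units."""
    n_a, n_b = len(group_a), len(group_b)
    pooled_variance = (
        (n_a - 1) * variance(group_a) + (n_b - 1) * variance(group_b)
    ) / (n_a + n_b - 2)
    return (mean(group_a) - mean(group_b)) / sqrt(pooled_variance)

# Made-up scores for two groups.
treatment = [5, 6, 7, 8, 9]
control = [3, 4, 5, 6, 7]

d = cohens_d(treatment, control)
print(f"Cohen's d = {d:.2f}")
```

Because the mean difference is divided by the variability, the same 2-point difference would give a smaller d if the scores were more spread out.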
confidence interval
range of values expected to contain a population value with a certain degree of confidence: CI = X̄ ± t × SEM, where SEM (standard error of the mean) = SD / sqrt(N)
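A worked example of the confidence interval formula, using made-up values (mean 100, SD 10, N 25). The critical t value of 2.064 is the two-tailed .05 value for 24 degrees of freedom from a standard t table; `confidence_interval` is a hypothetical helper name.

```python
from math import sqrt

def confidence_interval(sample_mean, sd, n, t_critical):
    """Return (lower, upper) limits: mean +/- t * standard error of the mean."""
    sem = sd / sqrt(n)           # standard error of the mean
    margin = t_critical * sem
    return sample_mean - margin, sample_mean + margin

# Made-up sample values; t_critical looked up for df = N - 1 = 24.
lower, upper = confidence_interval(sample_mean=100, sd=10, n=25, t_critical=2.064)

print(f"95% CI: {lower:.3f} to {upper:.3f}")
```

With SEM = 10 / 5 = 2, the margin is 2.064 × 2 = 4.128, so the interval runs from 95.872 to 104.128.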
power
ability of a test to reject Ho when Ho is false; affected by alpha, effect size, variance, and size of N (N too small: a real effect can go undetected; N too large: even trivial effects become statistically significant)
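How N affects power can be sketched with a formula rather than a simulation. The example assumes a two-tailed one-sample z-test with a known population SD, which is a simplification not taken from the chapter; the effect (5 points) and SD (15) are made up.

```python
from math import sqrt
from statistics import NormalDist

def z_test_power(effect, sigma, n, alpha=0.05):
    """Probability of rejecting Ho when the true mean differs from it by `effect`."""
    std_normal = NormalDist()
    critical = std_normal.inv_cdf(1 - alpha / 2)
    shift = effect * sqrt(n) / sigma  # true difference in standard-error units
    return (1 - std_normal.cdf(critical - shift)) + std_normal.cdf(-critical - shift)

# Same effect and variability, three different sample sizes.
power_by_n = {n: z_test_power(effect=5, sigma=15, n=n) for n in (10, 50, 200)}

# With no true effect, the rejection rate is just alpha.
power_no_effect = z_test_power(effect=0, sigma=15, n=50)

for n, power in power_by_n.items():
    print(f"N = {n:>3}: power = {power:.2f}")
```

Power rises with N for the same effect. The same function can be used to see the other influences on the card by changing `alpha`, `effect`, or `sigma`.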
historically psychology’s examples have been:
WEIRD: Western, Educated, Industrialized, Rich, Democratic
Varieties of psychological measurement
- self report
- behavioural
- physiological
file drawer problem
bias against publishing statistically non-significant findings