Stats/Research Design Flashcards
What is the difference between true, quasi, and non-experimental studies?
True experimental: the IV is manipulated and subjects are randomly assigned to conditions
Quasi-experimental: the IV is manipulated without random assignment
Non-experimental: no intervention or manipulation
Nomothetic vs idiographic
Nomothetic: group-based research. Idiographic: single-subject research.
Autocorrelation
A problem for single-subject designs: repeated measurements of the same person are correlated with one another rather than independent.
Multiple baseline design
Treatment is provided sequentially across subjects/situations/behaviors to reduce problems of history (other reasons for change)
Simultaneous/Alternating Treatment design
Providing 2+ treatments at different and varied times of day to compare relative effectiveness (e.g., two different reinforcers)
Changing criterion design
The goal is to change behavior in increments, adjusting the target criterion as performance improves (e.g., cutting back on smoking)
Time sampling – momentary and whole-interval
Recording behaviors that have no discrete beginning or end by sampling intervals of time. Momentary: record yes/no for whether the behavior is occurring at the moment of observation. Whole-interval: record the behavior only if it lasts the full interval (e.g., paying attention for a full minute).
Event recording
Frequency counting target behaviors
Cluster sampling
Randomly selecting pre-existing groups of subjects
Systematic sampling
Selecting subjects based on a set ratio from a random start on a list (e.g., every 3rd person)
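A minimal sketch of systematic sampling, assuming a hypothetical roster of 30 subject IDs and an interval of 3. The function name and the roster are illustrative only.

```python
import random

def systematic_sample(population, k, seed=None):
    """Take every k-th element after a random start within the first k positions."""
    rng = random.Random(seed)
    start = rng.randrange(k)  # random start index in [0, k)
    return population[start::k]

roster = list(range(1, 31))  # hypothetical list of 30 subject IDs
sample = systematic_sample(roster, k=3, seed=1)
print(sample)
```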
History (threat to internal validity)
Specific incidents that occur outside of the experiment that affect performance
Maturation (threat to internal validity)
Time-based effects on performance (e.g., fatigue, aging)
Solomon Four Group Design
Addresses the effects of pretesting (e.g., practice effects).
Two groups are measured pre-post, one gets the intervention
Two groups are measured post only, one gets the intervention
Instrumentation (threat to internal validity)
Changes in observers or equipment over time (e.g., a machine wearing out)
Attrition/experimental mortality (threat to internal validity)
Differential loss of subjects across groups
Diffusion (threat to internal validity)
“Contamination” of the groups, when the control group inadvertently gets some of the treatment.
Threats to internal validity
Factors other than IV that may cause change in the DV
Threats to construct validity
Aspects of the intervention other than the SPECIFIC FEATURE of interest that may have caused the change (e.g., rapport rather than cognitive restructuring)
Rosenthal Effect vs Demand characteristics.
Rosenthal is Experimenter expectancy bias – cues transmitted by experimenter to subjects (fixed by experimenter blind)
Demand is things in the procedures that affect subject behavior (fixed by subject blind)
John Henry effect/compensatory rivalry
Control group participants try harder than they otherwise would, out of a sense of competition with the experimental group
What is the relationship between internal and external validity?
Inverse relationship.
The more controls exist, the less generalizable it’s likely to be.
What’s the difference between interval and ratio data?
Interval has no absolute zero score, ratio data does.
What percentiles correspond to each whole-number Z score from -3 to +3?
-3: 0.1st
-2: 2.5th
-1: 16th
0: 50th
+1: 84th
+2: 97.5th
+3: 99.9th
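A quick check of these percentiles, sketched with Python's standard-library `NormalDist` and assuming a perfectly normal distribution. The card's values are rounded rules of thumb; the exact figures for Z = -2 and +2 come out closer to 2.3 and 97.7. The helper name `z_to_percentile` is illustrative.

```python
from statistics import NormalDist

std_normal = NormalDist()  # standard normal: mean 0, SD 1

def z_to_percentile(z):
    """Percentage of the distribution falling at or below z."""
    return std_normal.cdf(z) * 100

for z in range(-3, 4):
    print(f"Z = {z:+d}: {z_to_percentile(z):.1f}th percentile")
```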
What percentile is an IQ score of 70?
About the 2.5th (Z = -2, given a mean of 100 and an SD of 15)
What is the Z score formula?
Z = (X - Mean) / SD
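The formula as a small sketch, assuming the conventional IQ scale (mean 100, SD 15) for the example values. The function name is illustrative.

```python
def z_score(x, mean, sd):
    """Z = (X - Mean) / SD: distance from the mean in SD units."""
    if sd <= 0:
        raise ValueError("sd must be positive")
    return (x - mean) / sd

print(z_score(70, mean=100, sd=15))   # -2.0
print(z_score(115, mean=100, sd=15))  # 1.0
```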
Explain standard error of the mean
The average deviation between sample mean and population mean. It’s calculated as
(SDpop) / √N
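A worked sketch of the formula, assuming a population SD of 15 as example input. It shows that quadrupling N halves the standard error.

```python
import math

def standard_error_of_mean(sd_pop, n):
    """Standard error of the mean: population SD divided by the square root of N."""
    if n <= 0:
        raise ValueError("n must be positive")
    return sd_pop / math.sqrt(n)

print(standard_error_of_mean(15, 25))   # 3.0
print(standard_error_of_mean(15, 100))  # 1.5
```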
Beta
Probability of making Type II error
(Power = 1 – beta)
Homoscedasticity
Parametric test assumption that there should be similar variability among groups
What is McNemar’s test used for?
Group differences with nominal data
when groups are correlated (e.g., diagnosis y/n across siblings)
Coefficient of determination
Square of correlation coefficient. The amount of variability in Y that’s shared with X.
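A short sketch with example correlations (the values are illustrative). Note that the sign of r drops out once it is squared.

```python
def coefficient_of_determination(r):
    """Proportion of variability in Y shared with X: r squared."""
    if not -1 <= r <= 1:
        raise ValueError("r must be between -1 and +1")
    return r * r

for r in (0.3, 0.5, -0.7):
    print(f"r = {r:+.1f} -> r^2 = {coefficient_of_determination(r):.2f}")
```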
What happens to the correlation coefficient if you have a restricted range?
It drops dramatically (underestimates relationship)
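A small simulation sketch of range restriction, assuming made-up data in which Y equals X plus random noise. The seed, the sample size, and the cutoff of 0.5 are arbitrary choices for illustration.

```python
import math
import random

def pearson_r(xs, ys):
    """Pearson correlation computed from sums of squares and cross-products."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

rng = random.Random(0)  # fixed seed so the run is repeatable
xs = [rng.gauss(0, 1) for _ in range(5000)]
ys = [x + rng.gauss(0, 1) for x in xs]  # Y = X + noise

r_full = pearson_r(xs, ys)

# Restrict the range: keep only cases with X above an arbitrary cutoff.
kept = [(x, y) for x, y in zip(xs, ys) if x > 0.5]
r_restricted = pearson_r([x for x, _ in kept], [y for _, y in kept])

print(f"full range r = {r_full:.2f}, restricted range r = {r_restricted:.2f}")
```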
Canonical correlation
Tests relationship between two sets of multiple variables
What is the relationship between effect size and standard deviation?
Effect size is expressed in units of SD.
E.g., an effect size of .5 means the groups differ, on average, by half a standard deviation
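A minimal sketch of one common effect size, the standardized mean difference (Cohen's d), which is assumed here as the intended statistic. The group means and SD are made-up example values.

```python
def effect_size(mean_treatment, mean_control, sd):
    """Standardized mean difference: group difference expressed in SD units."""
    if sd <= 0:
        raise ValueError("sd must be positive")
    return (mean_treatment - mean_control) / sd

print(effect_size(55, 50, 10))  # 0.5
```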
What is the reliability coefficient (range and cut-off)
(RXX)
The proportion of score variability that is true score variability. Ranges from 0.0 to 1.0; the cut-off is .8.
Content sampling (error)
Error resulting from whether or not a test’s items correspond to the test-taker’s knowledge base.
Time Sampling (error)
When a test is given twice and the scores differ due to time-related factors (e.g., forgetting)
Test heterogeneity (error)
When items on a test tap into more than one domain.
How does the number of items affect reliability?
Reliability increases with more items
How does the homogeneity of items affect reliability?
More homogeneous items = increased reliability
How does the range of scores affect reliability?
A restricted range reduces reliability
How does the ability to guess affect reliability?
The easier items are to guess, the lower reliability is.
Standard error of measurement
The standard deviation of the theoretical, normal distribution of scores of one individual on equivalent tests. Aka the average amount of measurement error.
What does this formula measure:
SDx √(1-rxx)
Standard error of measurement (average amount of error in measuring a latent variable)
Explain the relationship between standard error of measurement and confidence interval?
CI = obtained score +/- one standard error of measurement for a 68% CI; the true score is expected to fall in that range. Double or triple the standard error for the 95% and 99% CIs.
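A worked sketch that combines the standard error of measurement formula with the interval rule, assuming example values of SD = 15, reliability = .84, and an obtained score of 100. The multipliers 1, 2, and 3 are the rounded rule from the card, not exact critical values.

```python
import math

def standard_error_of_measurement(sd_x, r_xx):
    """SEM = SDx * sqrt(1 - rxx)."""
    if not 0 <= r_xx <= 1:
        raise ValueError("r_xx must be between 0 and 1")
    return sd_x * math.sqrt(1 - r_xx)

def confidence_interval(obtained, sem, multiplier=1):
    """Band around an obtained score: 1 SEM ~ 68%, 2 SEM ~ 95%, 3 SEM ~ 99%."""
    return (obtained - multiplier * sem, obtained + multiplier * sem)

sem = standard_error_of_measurement(15, 0.84)  # 15 * sqrt(0.16) = 15 * 0.4
print(f"SEM = {sem:.1f}")
for multiplier, label in ((1, "68%"), (2, "95%"), (3, "99%")):
    low, high = confidence_interval(100, sem, multiplier)
    print(f"{label} CI: {low:.1f} to {high:.1f}")
```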
Content validity
How adequately a test samples the target (representative sample of the knowledge/skills)
Criterion-Related Validity Coefficient (range and cut-off)
Measures how well a score can predict an outcome, ranging from -1 to +1. Validities beyond .2 are acceptable.
Standard error of the estimate
The average error in estimating a person’s criterion score based on a predictor
What’s the difference between standard error of the measurement and standard error of the estimate?
Measurement concerns reliability (how accurate is it at testing y)
Estimate concerns predictive validity (how good is it at predicting a later outcome)
What is the criterion-related validity coefficient (rxy) and cut off?
The correlation between predictor and criterion ranging from -1 to +1, acceptable cutoff is .2.
What are Taylor-Russell tables used for?
They indicate the amount of improvement in selection when a predictor test is used (incremental validity)
Selection ratio
The proportion of open positions to applicants. A low ratio means there are many more applicants than positions.
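A tiny sketch of the ratio, using made-up numbers of positions and applicants.

```python
def selection_ratio(open_positions, applicants):
    """Open positions divided by applicants; lower means more selective."""
    if applicants <= 0:
        raise ValueError("applicants must be positive")
    return open_positions / applicants

print(selection_ratio(5, 100))   # 0.05 (many more applicants than positions)
print(selection_ratio(50, 100))  # 0.5
```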
Adaptive tests (item response theory)
Tests in which the response to one item determines which further items, if any, are asked, so that the fewest possible items are needed.
What happens to criterion-related validity when a test is cross-validated (i.e., given to another group of people)?
Criterion related validity decreases because of sample differences.
How does restricted range / homogeneous sample affect criterion-related validity?
Validity decreases.
What is the relationship between validity and reliability
Validity is less than or equal to the square root of reliability. Reliability sets ceiling for validity.
Validity ≤ √Reliability
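A short sketch of the ceiling, using example reliability values. The function name is illustrative.

```python
import math

def max_validity(reliability):
    """Upper bound on a validity coefficient: the square root of reliability."""
    if not 0 <= reliability <= 1:
        raise ValueError("reliability must be between 0 and 1")
    return math.sqrt(reliability)

for r_xx in (0.81, 0.64, 0.49):
    print(f"reliability {r_xx:.2f} -> validity can be at most {max_validity(r_xx):.2f}")
```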
Criterion contamination
When a rater knows of the subject’s predictor score before assigning criterion rating.
Construct validity
How well a test measures the target trait. Includes convergent and divergent validity
How do you square decimals?
e.g.: square .1, .4, and .7
Square the digit after the decimal. If the result is a single digit, put a zero in front of it, so that there are 2 digits after the decimal.
.1² = .01
.4² = .16
.7² = .49
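The shortcut can be sketched in code for one-digit decimals (.1 through .9). The function name is illustrative, and the result is returned as text to show the zero-padding step.

```python
def square_tenths(digit):
    """Square the decimal .d: square the digit, then pad to two decimal places."""
    if not 0 <= digit <= 9:
        raise ValueError("digit must be a single digit, 0 through 9")
    return f".{digit * digit:02d}"

for d in (1, 4, 7):
    print(f".{d} squared = {square_tenths(d)}")
```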
How do you estimate the square root of a decimal?
e.g., square root of .4, .9
Make it 2 digits if needed (add a zero), and find the whole number whose square is closest. Then make sure the answer is in tenths (1 digit after the decimal)
√.4 ≈ .6 (closest square to 40 is 6² = 36)
√.9 ≈ .9 (closest square to 90 is 9² = 81)
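A sketch of the estimation shortcut for one-digit decimals, printed next to the exact root so the size of the approximation error is visible. The function name is illustrative.

```python
import math

def estimate_sqrt_tenths(digit):
    """Estimate sqrt(.d): pad to d0, find the whole number with the closest square."""
    if not 0 <= digit <= 9:
        raise ValueError("digit must be a single digit, 0 through 9")
    target = digit * 10  # e.g. .4 -> 40
    closest = min(range(0, 11), key=lambda n: abs(n * n - target))
    return closest / 10  # express the answer in tenths

for d in (4, 9):
    exact = math.sqrt(d / 10)
    print(f"sqrt(.{d}): estimate {estimate_sqrt_tenths(d)}, exact {exact:.3f}")
```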