Stats & Test Construction Flashcards

1
Q

Type I error

A

Mistakenly rejecting the null hypothesis when it’s true

Alpha

2
Q

Type II error

A

Mistakenly retaining the null hypothesis when it is false

Beta

3
Q

Discriminant analysis

A

Technique in multivariate statistics that describes differences between 2+ groups on a set of measures or that classifies subjects into groups based on a set of measures

4
Q

Threats to internal validity

A

Maturation, history, instrumentation, statistical regression, selection, attrition/mortality, interaction w/ selection

5
Q

Ways to control threats to internal validity

A

Random assignment, within-subjects designs, blocking, matching subjects, ANCOVA

6
Q

Threats to external validity

A

Interaction b/t testing & treatment, interaction b/t selection & tx, reactivity, multiple tx interference (order/carryover effects)

7
Q

Ways to control threats to external validity

A

Random sampling, naturalistic/field research, single or double-blind designs, counterbalance

8
Q

What are some ways to increase power?

A

Increase alpha, increase N, increase effect size, decrease error, use powerful statistics, use a one-tailed test if possible
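A rough sketch of how several of these factors move power, assuming a one-sample z-test with a known SD. The function name `z_test_power`, its parameters, and the example values are illustrative assumptions, not part of the original card.

```python
from math import sqrt
from statistics import NormalDist

def z_test_power(effect_size, n, alpha=0.05, one_tailed=True):
    """Approximate power of a one-sample z-test.

    effect_size is a standardized mean difference (Cohen's d).
    All names and defaults here are illustrative assumptions.
    """
    std_normal = NormalDist()
    tail_alpha = alpha if one_tailed else alpha / 2
    z_crit = std_normal.inv_cdf(1 - tail_alpha)
    # Probability that the test statistic exceeds the critical value
    # when the true effect is effect_size. For the two-tailed case the
    # tiny chance of rejecting in the wrong tail is ignored.
    return 1 - std_normal.cdf(z_crit - effect_size * sqrt(n))

baseline = z_test_power(effect_size=0.3, n=30)
print("baseline:", round(baseline, 3))
print("larger N:", round(z_test_power(0.3, 60), 3))
print("larger effect size:", round(z_test_power(0.5, 30), 3))
print("larger alpha:", round(z_test_power(0.3, 30, alpha=0.10), 3))
print("two-tailed:", round(z_test_power(0.3, 30, one_tailed=False), 3))
```

Under these assumptions, each change listed on the card raises power relative to the baseline, and switching to a two-tailed test lowers it.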

9
Q

What percentage of scores on the normal curve fall between +/- 1 SD, +/- 2 SD, +/- 3 SD?

A

68%
95%
99.7%
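These areas can be checked with Python's standard library. This is a minimal sketch; the helper name `area_within` is an assumption made for illustration. It gives roughly 68.3%, 95.4%, and 99.7%.

```python
from statistics import NormalDist

std_normal = NormalDist()  # standard normal: mean 0, SD 1

def area_within(k):
    """Proportion of scores within +/- k SD of the mean."""
    return std_normal.cdf(k) - std_normal.cdf(-k)

for k in (1, 2, 3):
    print(f"+/- {k} SD: {area_within(k) * 100:.1f}%")
```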

10
Q
What percentiles are equivalent to the following z-scores?
-3
-2
-1
1
2
3
A
0.1 = -3
2 = -2
16 = -1
84 = 1
98 = 2
99.9 = 3
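A short sketch of the conversion, assuming scores are normally distributed: the percentile for a z-score is the cumulative area below it. The helper name `z_to_percentile` is hypothetical.

```python
from statistics import NormalDist

def z_to_percentile(z):
    """Percentage of the standard normal distribution falling below z."""
    return NormalDist().cdf(z) * 100

for z in (-3, -2, -1, 1, 2, 3):
    print(f"z = {z:+d}: percentile {z_to_percentile(z):.1f}")
```

The card's values are these results rounded to convenient figures.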
11
Q

Factors affecting test reliability

A

Test characteristics (length, item type, item homogeneity, influence of guessing), sample characteristics (sample size, range, variability), extent of test clarity

12
Q

Sources of error in internal reliability

A

Content sampling, heterogeneity of content domain

13
Q

Sources of error in test-retest reliability

A

Time-sampling factors

14
Q

Which type of reliability is best for speed tests?

A

Alternate forms

15
Q

Sources of error in inter-rater reliability

A

Factors related to raters (motivation, biases), characteristics of measuring device, consensual observer drift

16
Q

Dimensions of relevance in item analysis

A

1) Content appropriateness (item assesses bx domain the test is intended to evaluate)
2) Taxonomic level (does the item reflect the appropriate cognitive or ability level of the intended population)
3) Extraneous abilities (to what extent does the item require knowledge or skills that are outside the domain being evaluated)

17
Q

Item difficulty

A

The %age of people who get an item correct

18
Q

Item discrimination

A

Extent an item differentiates between those who get a high vs. low score

.35 or more is acceptable
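One common way to compute both indices is sketched below. It assumes dichotomously scored items and the upper-lower group method for discrimination; the 27% group size, the function names, and the data are illustrative assumptions, since the card does not say which discrimination index it has in mind.

```python
def item_difficulty(item_scores):
    """Proportion of examinees answering the item correctly (1 = correct)."""
    return sum(item_scores) / len(item_scores)

def item_discrimination(item_scores, total_scores, fraction=0.27):
    """Upper-lower index D: p(correct) in the top group minus the bottom group."""
    n_group = max(1, round(len(total_scores) * fraction))
    # Indices of examinees ordered from lowest to highest total score
    ranked = sorted(range(len(total_scores)), key=lambda i: total_scores[i])
    lower = [item_scores[i] for i in ranked[:n_group]]
    upper = [item_scores[i] for i in ranked[-n_group:]]
    return item_difficulty(upper) - item_difficulty(lower)

# Hypothetical data for ten examinees
item = [0, 0, 0, 1, 0, 1, 1, 1, 1, 1]
totals = [12, 15, 18, 20, 22, 25, 27, 30, 33, 35]

print("difficulty p =", item_difficulty(item))
print("discrimination D =", item_discrimination(item, totals))
```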

19
Q

Item response theory

A

Approach to test construction in which item performance is related to the examinee’s level on the trait being measured rather than to total test score

20
Q

Reliability coefficient

A

Proportion of variability in obtained test scores that reflects true score variability

Never squared to interpret

21
Q

Standard error of measurement (SEM)

A

An index of the amount of error that can be expected in a person’s obtained scores due to the unreliability of the test

22
Q

What quantitative evidence do you look for in a test that has good content validity?

A

Coefficient of internal consistency will be large

Test will correlate highly with other tests of the same domain

Pre- and post-test evals of a program designed to increase familiarity with the domain will indicate appropriate changes

23
Q

Orthogonal rotation

A

Resulting factors are uncorrelated; attribute measured by one factor is independent from the attributes measured by the other factor

24
Q

Oblique rotation

A

Resulting factors are correlated & attributes measured by the factors are not independent

25
What is the Rosenthal/Pygmalion effect?
Tendency for a participant's performance to be affected by the expectations of the tester
26
What is the Hawthorne effect?
Tendency of subjects to behave differently when they are in a research study
27
What is the most common measure for internal test reliability?
Cronbach's alpha (for dichotomous items it reduces to the Kuder-Richardson formula, KR-20)
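A minimal sketch of the usual alpha formula, alpha = k/(k-1) * (1 - sum of item variances / variance of total scores). The function name and the rating data are made up for illustration.

```python
from statistics import pvariance

def cronbach_alpha(responses):
    """responses: one row per examinee, one column per item."""
    n_items = len(responses[0])
    # Variance of each item across examinees
    item_vars = [pvariance([row[j] for row in responses]) for j in range(n_items)]
    # Variance of examinees' total scores
    total_var = pvariance([sum(row) for row in responses])
    return (n_items / (n_items - 1)) * (1 - sum(item_vars) / total_var)

# Hypothetical ratings: 4 examinees x 3 items
ratings = [
    [2, 3, 3],
    [4, 4, 5],
    [1, 2, 2],
    [5, 5, 4],
]
print(round(cronbach_alpha(ratings), 3))
```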
28
What measure is used to evaluate the effect of lengthening or shortening a test?
Spearman-Brown correction formula
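The formula itself is r_new = k*r / (1 + (k - 1)*r), where r is the current reliability and k is the factor by which test length changes. A small sketch, with hypothetical names and values:

```python
def spearman_brown(reliability, length_factor):
    """Predicted reliability after changing test length by length_factor.

    length_factor = 2 doubles the test; 0.5 halves it.
    """
    return (length_factor * reliability) / (1 + (length_factor - 1) * reliability)

print(round(spearman_brown(0.70, 2), 3))    # doubling a test with r = .70
print(round(spearman_brown(0.70, 0.5), 3))  # halving the same test
```

The prediction assumes that added items are comparable in quality and content to the existing ones.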
29
What formula is used to assess the reliability of a test with dichotomous responses?
Kuder-Richardson formula
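A sketch of KR-20, assuming items scored 0/1: KR-20 = k/(k-1) * (1 - sum(p*q) / variance of total scores), where p is the proportion passing each item and q = 1 - p. The function name and response matrix are illustrative assumptions.

```python
from statistics import pvariance

def kr20(responses):
    """responses: one row per examinee, each a list of 0/1 item scores."""
    n_people = len(responses)
    n_items = len(responses[0])
    total_var = pvariance([sum(row) for row in responses])
    sum_pq = 0.0
    for j in range(n_items):
        p = sum(row[j] for row in responses) / n_people  # proportion correct
        sum_pq += p * (1 - p)
    return (n_items / (n_items - 1)) * (1 - sum_pq / total_var)

# Hypothetical responses: 5 examinees x 4 items
answers = [
    [1, 1, 1, 1],
    [1, 1, 1, 0],
    [1, 1, 0, 0],
    [1, 0, 0, 0],
    [0, 0, 0, 0],
]
print(round(kr20(answers), 3))
```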
30
What are acceptable scores of reliability?
.80 & above = good; .70-.79 = acceptable; .60-.69 = marginally reliable; .59 and below = not reliable
31
Name the 4 scales of measurement
1) Nominal = names of categories; 2) Ordinal = rank data; 3) Interval = no absolute 0, numbers scaled at equal distances; 4) Ratio = has absolute 0
32
What are the assumptions of parametric statistics?
Normal distribution; homogeneity of variance (variance equal among all groups); independence of observations
33
F-ratio in a one-way ANOVA
Ratio of between group to within group variance
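As a worked sketch, F is the between-group mean square divided by the within-group mean square. The function name and the three small groups are hypothetical.

```python
from statistics import mean

def one_way_f(groups):
    """F = MS_between / MS_within for a list of groups of scores."""
    all_scores = [x for g in groups for x in g]
    grand_mean = mean(all_scores)
    k = len(groups)
    n_total = len(all_scores)
    # Variability of group means around the grand mean
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    # Variability of scores around their own group mean
    ss_within = sum((x - mean(g)) ** 2 for g in groups for x in g)
    ms_between = ss_between / (k - 1)
    ms_within = ss_within / (n_total - k)
    return ms_between / ms_within

groups = [[1, 2, 3], [2, 3, 4], [5, 6, 7]]
print(round(one_way_f(groups), 2))
```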
34
Moderator variable
Relationship of A and C depends on the value of B (the moderator)
35
Mediating variable
Accounts for (or partially accounts for) a relationship b/t an IV and DV. Relationship between A and C decreases or is eliminated when B is included in the model.
36
What is the null hypothesis in Chi-square?
Observed frequencies do not differ from the frequencies expected by chance (i.e., they are randomly distributed). Alternative hypothesis is that the observed frequencies are related to the treatment effect.
37
Central limit theorem
As sample size increases, shape of sampling distribution of sample means approximates a normal distribution. Mean of sampling distribution of sample means = mean of population.
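A small simulation sketch of both claims, assuming a strongly skewed (exponential) population with mean 1. The helper names, seed, and sample counts are arbitrary choices for illustration.

```python
import random

def average(values):
    return sum(values) / len(values)

def skewness(values):
    """Third standardized moment; 0 for a symmetric distribution."""
    m = average(values)
    m2 = average([(v - m) ** 2 for v in values])
    m3 = average([(v - m) ** 3 for v in values])
    return m3 / m2 ** 1.5

def sample_means(sample_size, n_samples, rng):
    """Means of n_samples random samples drawn from an exponential population."""
    return [
        average([rng.expovariate(1.0) for _ in range(sample_size)])
        for _ in range(n_samples)
    ]

rng = random.Random(42)
means_small = sample_means(2, 2000, rng)   # n = 2 per sample
means_large = sample_means(50, 2000, rng)  # n = 50 per sample

print("mean of sample means (n=50):", round(average(means_large), 3))
print("skew of sample means, n=2: ", round(skewness(means_small), 2))
print("skew of sample means, n=50:", round(skewness(means_large), 2))
```

With the larger sample size the distribution of sample means is much less skewed, and its mean stays close to the population mean of 1.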
38
What factors affect Pearson's product moment correlation?
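Linearity (assumes linear relationship b/t 2 variables); homoscedasticity (variability of scores on one variable is about equal across the range of the other); range of scores (a restricted range tends to lower the correlation, so a wider range provides a more accurate estimate)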
39
Point-biserial coefficient
Correlation between one continuous variable & one dichotomous variable
40
Phi coefficient
Correlation b/t 2 dichotomous variables
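For a 2x2 table with cells a, b (first row) and c, d (second row), phi = (ad - bc) / sqrt((a+b)(c+d)(a+c)(b+d)). A brief sketch with made-up counts:

```python
from math import sqrt

def phi_coefficient(a, b, c, d):
    """Phi for a 2x2 table: rows (a, b) and (c, d)."""
    return (a * d - b * c) / sqrt((a + b) * (c + d) * (a + c) * (b + d))

# Hypothetical counts: rows = passed item 1 (yes/no), columns = passed item 2 (yes/no)
print(round(phi_coefficient(20, 10, 5, 25), 3))
```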
41
Assumptions of regression
Linear relationship b/t X and Y; homoscedasticity (variability of error scores on the criterion is the same across the range of X); homogeneity of variance
42
Multicollinearity
Degree to which predictors correlate with each other; decreases the accuracy of the regression equation
43
Sensitivity
TP / (TP + FN)
44
Specificity
TN / (TN + FP)
45
Positive likelihood ratio
Indicates how much more likely a positive test result is in a pt who has the condition than in one who does not (a PLR of 3 means a + result is 3x as likely in a pt w/ the condition); sensitivity / (1 - specificity)
46
Positive predictive power
Probability that a pt with a + test has the true condition; TP / (TP + FP)
47
Negative predictive power
Probability that a pt with a negative test result does not have the condition; TN / (TN + FN)
48
Relationship between base rate & PPP/NPP
As the base rate increases, PPP will increase, whereas NPP will decrease. Converse is true as the base rate declines.
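The classification statistics on the preceding cards, and their dependence on base rate, can be sketched as follows. Function names, counts, and base rates are hypothetical; the base-rate version applies Bayes' theorem to a test with fixed sensitivity and specificity.

```python
def diagnostic_stats(tp, fp, fn, tn):
    """Classification statistics from a 2x2 table of test result vs. true status."""
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "plr": sensitivity / (1 - specificity),
        "ppp": tp / (tp + fp),
        "npp": tn / (tn + fn),
    }

def predictive_power(sensitivity, specificity, base_rate):
    """PPP and NPP at a given base rate, via Bayes' theorem."""
    true_pos = sensitivity * base_rate
    false_pos = (1 - specificity) * (1 - base_rate)
    true_neg = specificity * (1 - base_rate)
    false_neg = (1 - sensitivity) * base_rate
    return true_pos / (true_pos + false_pos), true_neg / (true_neg + false_neg)

stats = diagnostic_stats(tp=80, fp=10, fn=20, tn=90)
print({name: round(value, 3) for name, value in stats.items()})

for base_rate in (0.10, 0.50):
    ppp, npp = predictive_power(0.80, 0.90, base_rate)
    print(f"base rate {base_rate:.2f}: PPP = {ppp:.3f}, NPP = {npp:.3f}")
```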
49
Bayes theorem
Often employed in decision analysis; allows calculation of the posterior probability of an event (the conditional probability assigned to it once the relevant evidence is taken into account)
50
Item characteristic curve
Plot of the proportion of ppl who answered an item correctly against the total test score, performance on an external criterion, or a mathematically-derived estimate of ability; provides info on the relationship between an examinee's level on the trait measured by the test & the probability that they will respond correctly to that item
51
Which ANOVA post-hoc correction is most conservative?
Scheffé
52
Which ANOVA post-hoc correction is appropriate for pairwise comparisons?
Tukey
53
Mann-Whitney U
Compare two independent groups on a DV measured with rank-ordered data
54
Negative skew
Most scores are high, with a few extreme low scores; mean < median < mode; easy test, ceiling effects
55
Positive skew
Most scores are low, with a few extreme high scores; mean > median > mode; difficult test, floor effects
56
Variance
Average of the squared differences of each observation from the mean
57
Null hypothesis in ANOVA
Group means were drawn from the same population (i.e., means are equal in the population)
58
What factors may lead to non-normal test distributions?
1) Existence of discrete subpopulations w/i the general population w/ differing abilities; 2) ceiling or floor effects; 3) tx effects that change the location of means, medians & modes and affect variability & distribution shape
59
How is SEM related to test reliability?
The greater the reliability, the smaller the SEM
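The standard formula makes the relationship explicit: SEM = SD * sqrt(1 - r), where r is the reliability coefficient. A quick sketch, assuming a test with SD = 15 (an arbitrary example value):

```python
from math import sqrt

def standard_error_of_measurement(sd, reliability):
    """SEM = SD * sqrt(1 - reliability)."""
    return sd * sqrt(1 - reliability)

for r in (0.75, 0.91, 0.96):
    print(f"reliability {r:.2f}: SEM = {standard_error_of_measurement(15, r):.1f}")
```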
60
Reliable change index (RCI)
Indicator of the probability that an observed difference b/t 2 scores from the same examinee on the same test can be attributed to measurement error
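One widely used formulation (Jacobson & Truax) is sketched below; the card does not specify a formula, so treating this as the intended one is an assumption. RCI = (score 2 - score 1) / SE_diff, with SE_diff = sqrt(2) * SEM and SEM = SD * sqrt(1 - r). The example scores are made up.

```python
from math import sqrt

def reliable_change_index(score_1, score_2, sd, reliability):
    """RCI using the Jacobson-Truax standard error of the difference."""
    sem = sd * sqrt(1 - reliability)  # standard error of measurement
    se_diff = sqrt(2) * sem           # standard error of a difference score
    return (score_2 - score_1) / se_diff

rci = reliable_change_index(score_1=50, score_2=60, sd=10, reliability=0.90)
print(round(rci, 2))
# By convention, |RCI| > 1.96 is read as change unlikely to be due to
# measurement error alone (p < .05, two-tailed).
print(abs(rci) > 1.96)
```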