Statistics & Test Construction Flashcards

0
Q

What is the formula for standard error of the mean?

A

Population standard deviation divided by square root of the sample size.
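As a minimal sketch of this formula (the function name and the example values are illustrative assumptions, not part of the card), the standard error of the mean can be computed like this:

```python
import math

def standard_error_of_mean(sigma: float, n: int) -> float:
    """Population standard deviation divided by the square root of n."""
    if n <= 0:
        raise ValueError("sample size must be positive")
    return sigma / math.sqrt(n)

# Hypothetical test with a population SD of 15 and a sample of 25 people.
print(standard_error_of_mean(15.0, 25))   # 3.0
# Quadrupling the sample size halves the standard error.
print(standard_error_of_mean(15.0, 100))  # 1.5
```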

1
Q

In a positively skewed distribution, give the measures of central tendency from lowest to highest.

A

Mode, median, mean.

2
Q

At what sample size does the standard error of the mean essentially become zero?

A

120 to 150.

3
Q

What is the best way to increase power in a statistical test?

A

Increase sample size.

4
Q

Name three properties of distributions of sample means that follow from the Central Limit Theorem.

A
  1. As the sample size increases, the distribution of means will become a normal distribution.
  2. The mean of the sample distribution is always the same as the population mean (Average of averages).
  3. The standard deviation of this distribution of sample means is the standard error of the mean.
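
The three properties can be checked with a small simulation. This is only a sketch under stated assumptions: the "population" is a fair six-sided die, the sample size of 30 and the 5000 draws are arbitrary choices, and `simulate_sample_means` is a hypothetical helper name.

```python
import random
import statistics

def simulate_sample_means(population, n, draws, seed=0):
    """Draw `draws` samples of size n (with replacement) and return their means."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    return [statistics.fmean(rng.choices(population, k=n)) for _ in range(draws)]

population = [1, 2, 3, 4, 5, 6]  # assumed population: one fair die
means = simulate_sample_means(population, n=30, draws=5000)

# Property 2: the mean of the sample means is close to the population mean (3.5).
print(statistics.fmean(means), statistics.fmean(population))

# Property 3: the SD of the sample means is close to sigma / sqrt(n).
print(statistics.stdev(means), statistics.pstdev(population) / 30 ** 0.5)
```

A histogram of `means` would look roughly bell-shaped even though the die itself has a flat distribution, which is the first property.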
5
Q

What is it that the standard error of the mean tells you about your sample?

A

How well your sample represents the population. (A large enough sample has no error because it is the population).

6
Q

In an experimental design, what is a “natural treatment group”?

A

A control group (it only receives the treatment[s] that occur without intervention).

7
Q

If an experiment is designed to test the effects of caffeine on performance, what question would be answered by a two-tailed test and what question would be answered by a one-tailed test?

A

Two-tailed: Does the caffeine group perform differently than the non-caffeine group?
One-tailed: Does the caffeine group perform better (or worse) than the non-caffeine group?

8
Q

What saying describes Type I error, and what does it describe?

A

“Reject the true is not Type II” (it is Type I or alpha error). It is seeing something when there is really nothing to see.

9
Q

What type of error is failing to reject the null hypothesis when it is false?

A

Type II or beta error.

10
Q

What type of test is used to decide whether several groups’ means are significantly different?

A

Analysis of Variance, or the F ratio (Mean square between groups over mean square within groups)
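
The ratio in parentheses can be worked through by hand. The sketch below assumes a one-way design with independent groups; `one_way_f` and the three small groups are made up for illustration.

```python
from statistics import fmean

def one_way_f(groups):
    """F = mean square between groups / mean square within groups."""
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    grand_mean = fmean([x for g in groups for x in g])
    # Between-groups sum of squares: group means around the grand mean.
    ss_between = sum(len(g) * (fmean(g) - grand_mean) ** 2 for g in groups)
    # Within-groups sum of squares: scores around their own group mean.
    ss_within = sum((x - fmean(g)) ** 2 for g in groups for x in g)
    ms_between = ss_between / (k - 1)
    ms_within = ss_within / (n_total - k)
    return ms_between / ms_within

# Hypothetical scores for three groups of three people each.
print(one_way_f([[1, 2, 3], [2, 3, 4], [5, 6, 7]]))  # about 13.0
```

Here the between-groups mean square is 13 and the within-groups mean square is 1, so the group means differ far more than the scores inside each group do.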

11
Q

What type of statistical test is used to look for main effects (of two or more independent variables on the dependent variables) and interaction effects (between the independent variables as they affect the dependent variables)?

A

Two-way or factorial Analysis of Variance.

12
Q

In a factorial analysis of variance, what would two or more types of treatment be called?

A

Levels of the independent variable “treatment type.”

13
Q

In a Chi-square test, what does a given cell always contain?

A

Counts or tallies. (How many members fall in each cell).

14
Q

What are two assumptions necessary to use the chi-square test?

A

Independence of observations, and mutually exclusive and exhaustive categories.

(i.e., a member cannot be in two groups at once).

15
Q

What is the most important thing to know about Type II (beta) error?

A

Its magnitude can never be known. You failed to find anything when there was something - the difference is there, but since you did not identify it, you cannot quantify it.

16
Q

At what quantitative level does an F-ratio become significant, and why?

A

Only above one, and then only when it exceeds the critical value for its degrees of freedom and alpha level. At a ratio of one, the differences between groups are no larger than the differences within groups.

17
Q

What two parameters does a single-sample t-test compare?

A

A sample parameter (mean) to a population parameter (mean).

18
Q

What nonparametric test is used to decide whether observed frequency is different than expected frequency?

A

Chi-Square.
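
A minimal sketch of the goodness-of-fit statistic, assuming each cell holds a count and that the expected frequencies are supplied by the user (the function name and the counts below are illustrative):

```python
def chi_square(observed, expected):
    """Sum of (observed - expected)^2 / expected over all cells."""
    if len(observed) != len(expected):
        raise ValueError("observed and expected must have the same length")
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))

# Hypothetical tallies: 100 people choosing among four options,
# with 25 expected per option if there is no preference.
observed = [18, 22, 20, 40]
expected = [25, 25, 25, 25]
print(chi_square(observed, expected))  # about 12.32
```

The statistic is then compared with a critical value for the appropriate degrees of freedom (here, number of categories minus one).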

19
Q

What test is the same thing as a two-group analysis of variance?

A

A two-sample t-test.
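
The equivalence can be seen numerically: with two groups, the F ratio equals the square of t. This sketch assumes an independent-samples t-test with pooled variance; both helper names and the data are hypothetical.

```python
import math
from statistics import fmean

def pooled_t(a, b):
    """Independent-samples t using the pooled variance estimate."""
    ss_a = sum((x - fmean(a)) ** 2 for x in a)
    ss_b = sum((x - fmean(b)) ** 2 for x in b)
    pooled_var = (ss_a + ss_b) / (len(a) + len(b) - 2)
    se = math.sqrt(pooled_var * (1 / len(a) + 1 / len(b)))
    return (fmean(a) - fmean(b)) / se

def two_group_f(a, b):
    """One-way ANOVA F ratio for exactly two groups."""
    grand = fmean(a + b)
    # With two groups the between-groups df is 1, so MS between = SS between.
    ms_between = len(a) * (fmean(a) - grand) ** 2 + len(b) * (fmean(b) - grand) ** 2
    ss_within = sum((x - fmean(a)) ** 2 for x in a) + sum((x - fmean(b)) ** 2 for x in b)
    ms_within = ss_within / (len(a) + len(b) - 2)
    return ms_between / ms_within

a, b = [1, 2, 3], [2, 3, 4]
t = pooled_t(a, b)
f = two_group_f(a, b)
print(t ** 2, f)  # both about 1.5
```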

20
Q

What is the pre-requisite for doing a post-hoc test?

A

Having a significant F-ratio (post-hoc tests include Tukey, Scheffé, etc.).

21
Q

What is the best advice for interpreting main and interaction effects in a factorial analysis of variance?

A

Interpret main effects cautiously in light of interaction effects (main effects may depend on interactions).

22
Q

What are the two most important requirements for a true experimental design (as opposed to a quasi-experimental design)?

A
  1. The experimenter must be able to manipulate the independent variable (i.e., have control over it).
  2. The members of each group must be randomly assigned from a larger sample that is otherwise the same.
23
Q

Confounding variables are a threat to what kind of validity?

A

Internal validity.
(i.e., the members of each group may be there because of some extraneous factor so effects may represent those extraneous factors rather than the treatment).

24
Q

What statistical test can help control for confounding variables?

A

Analysis of covariance (ANCOVA).

25
Q

What kind of statistical test can help control for Type I error with several groups?

A

Multivariate ANOVA (MANOVA).

It uses one p value in order to keep the by-group p values from adding up.
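
How quickly separate p values "add up" can be sketched with the familywise error rate. The formula below assumes the separate tests are independent, which is a simplification, and the function name is illustrative.

```python
def familywise_error_rate(alpha: float, n_tests: int) -> float:
    """Chance of at least one Type I error across n independent tests."""
    return 1 - (1 - alpha) ** n_tests

# Three separate tests at alpha = .05 carry roughly a 14% chance
# of at least one false rejection, not 5%.
print(round(familywise_error_rate(0.05, 3), 4))  # 0.1426
```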

26
Q

What kind of correlation is used when a continuous variable is compared with an artificially dichotomized variable (such as low income vs. high income)?

A

Biserial correlation.

27
Q

What kind of correlation coefficient is used with rank-order data?

A

Spearman rho (r-rank o-order)

28
Q

What correlation coefficient is used with a non-linear (i.e., curvilinear) relationship?

A

Eta, which gives “effect size” over the entire range regardless of the direction of the relationship.

29
Q

With what type of experimental design, and what type of data, would a split-plot ANOVA be used?

A

A mixed design (pre-post and comparing groups) using interval or ratio data.

30
Q

What type of statistical test would be used when a single variable (e.g., reading score) is both a means of classifying members and a measure of a treatment effect?

A

Randomized block analysis of variance.

31
Q

What type of correlation coefficient is used to compare a continuous variable to a naturally occurring dichotomous variable such as gender?

A

Point-biserial correlation.

32
Q

What type of correlation coefficient is used to compare two artificially dichotomized groups (such as low/high ability and low/high income)?

A

Tetrachoric correlation.

33
Q

What type of correlation coefficient is used to compare two sets of nominal data (such as eye color and gender)?

A

Contingency correlation.

34
Q

What is the coefficient of determination?

A

The square of the correlation coefficient (r²), which gives the proportion of variance shared between the two variables.
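
As a small worked example (the data and the `pearson_r` helper are illustrative assumptions), squaring a Pearson correlation gives the shared variance:

```python
import math

def pearson_r(x, y):
    """Pearson correlation computed from deviation scores."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]
r = pearson_r(x, y)
print(r, r ** 2)  # r is about 0.77, so about 60% of the variance is shared
```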

35
Q

Define homoscedasticity and heteroscedasticity.

A

Homoscedasticity: The spread of scores around the regression line (and so the strength of the relationship) is the same over the entire range of the variables.
Heteroscedasticity: The spread of scores around the regression line changes over the range of the variables.

36
Q

State the basic questions posed by internal and external validity.

A

Internal: How pure or well-controlled is the study?
External: How well does the study generalize to the general population?

37
Q

What type of statistic gives the effects of many variables on one variable?

A

Multiple regression.

38
Q

What type of statistic gives the relationships of multiple predictor variables to multiple outcome variables?

A

Canonical correlation.

39
Q

What type of study design has no external validity?

A

A single-subject design.

40
Q

What type of study is vulnerable to cohort effects?

A

A cross-sectional design.

41
Q

What type of study combines the best features of longitudinal and cross-sectional designs?

A

Cross-sequential designs.

42
Q

What is the term for an internal validity threat that occurs when participants change for reasons that have nothing to do with the factors being studied?

A

Maturation.

43
Q

What is the term for a validity threat that occurs when an external factor impacts some or all of the participants?

A

History (a threat to internal validity).

44
Q

In classical test theory, what components make up a test score?

A

True score (or true variance) and error (or error variance).

45
Q

What does a reliability of .90 for a test item say about the error of that item?

A

That 10% of that item’s variance is attributable to error variance.

46
Q

What is the term for the statistic that measures the proportion of true variance to error variance of a test item?

A

The coefficient of stability (coefficient of equivalence).

47
Q

What coefficient is NOT squared to yield the proportion of variance?

A

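The reliability coefficient, which is interpreted directly as the proportion of true-score variance. (A correlation coefficient, by contrast, is squared to give the coefficient of determination.)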

48
Q

What is the best type of reliability measurement?

A

Alternate-forms (which is also the least-used form).

49
Q

What formula is used to correct for the smaller number of items in a split-half reliability measurement?

A

The Spearman-Brown Prophecy formula.
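
A minimal sketch of the correction, assuming the general form of the formula in which a test is lengthened by a factor k (k = 2 for a split-half estimate); the function name and the .60 value are illustrative.

```python
def spearman_brown(r: float, k: float = 2.0) -> float:
    """Predicted reliability when test length is multiplied by k."""
    return k * r / (1 + (k - 1) * r)

# A correlation of .60 between two half-tests implies a
# full-length reliability of about .75.
print(spearman_brown(0.60))
```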

50
Q

What formula is used to measure inter-item consistency on a test by pairing each (non-dichotomous) item with every other item?

A

Cronbach’s Alpha coefficient.
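
Rather than literally pairing items, alpha is usually computed from item and total-score variances. The sketch below assumes that variance form and a small made-up data set (rows are respondents, columns are items).

```python
from statistics import variance

def cronbach_alpha(rows):
    """alpha = k/(k-1) * (1 - sum of item variances / variance of total scores)."""
    k = len(rows[0])
    item_vars = [variance([row[i] for row in rows]) for i in range(k)]
    total_var = variance([sum(row) for row in rows])
    return k / (k - 1) * (1 - sum(item_vars) / total_var)

# Hypothetical ratings from five respondents on three 5-point items.
ratings = [
    [2, 3, 3],
    [4, 4, 5],
    [1, 2, 2],
    [3, 3, 4],
    [5, 5, 5],
]
print(cronbach_alpha(ratings))  # about 0.97
```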

51
Q

What statistic is used to measure inter-rater reliability?

A

Kappa.
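
The card does not say which kappa is meant; the sketch below assumes Cohen's kappa for two raters assigning categories to the same cases. The ratings are invented for illustration.

```python
from collections import Counter

def cohens_kappa(rater_a, rater_b):
    """(observed agreement - chance agreement) / (1 - chance agreement)."""
    n = len(rater_a)
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    counts_a, counts_b = Counter(rater_a), Counter(rater_b)
    # Chance agreement: product of the two raters' marginal proportions per category.
    expected = sum(counts_a[c] * counts_b[c] for c in counts_a) / n ** 2
    return (observed - expected) / (1 - expected)

rater_a = ["y", "y", "n", "n", "y", "n", "y", "y", "n", "n"]
rater_b = ["y", "n", "n", "n", "y", "n", "y", "y", "y", "n"]
print(cohens_kappa(rater_a, rater_b))  # about 0.6
```

The raters agree on 80% of cases, but half of that agreement would be expected by chance, so kappa is lower than the raw agreement rate.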

52
Q

What range of scores gives a 95% confidence interval on a test?

A

±2 standard errors of measurement from the score.

53
Q

What measure can be used to provide likely ranges of where the “true score” on a test lies, given an obtained score; and how is it calculated?

A

The Standard Error of Measurement: the standard deviation of the test times the square root of (1 minus the reliability).
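
A minimal sketch of the calculation, using the ±2 rule of thumb for a 95% band. The function names and the example values (SD of 15, reliability of .91, obtained score of 100) are illustrative assumptions.

```python
import math

def standard_error_of_measurement(sd: float, reliability: float) -> float:
    """SD of the test times the square root of (1 - reliability)."""
    return sd * math.sqrt(1 - reliability)

def confidence_band(score: float, sd: float, reliability: float, n_sem: float = 2.0):
    """Range of n_sem standard errors of measurement around an obtained score."""
    sem = standard_error_of_measurement(sd, reliability)
    return score - n_sem * sem, score + n_sem * sem

print(standard_error_of_measurement(15, 0.91))  # about 4.5
print(confidence_band(100, 15, 0.91))           # about (91, 109)
```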

54
Q

What statistic gives a reliability coefficient for inter-item consistency with items that are dichotomous, such as true-false responses?

A

The Kuder-Richardson KR-20.
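
A sketch of KR-20, assuming items are scored 0 or 1 and that the population variance of total scores is used (some texts use the sample variance instead). The data are made up.

```python
from statistics import pvariance

def kr20(rows):
    """KR-20 = k/(k-1) * (1 - sum(p*q) / variance of total scores)."""
    k = len(rows[0])
    n = len(rows)
    # p is the proportion passing each item, q = 1 - p.
    pq_sum = 0.0
    for i in range(k):
        p = sum(row[i] for row in rows) / n
        pq_sum += p * (1 - p)
    total_var = pvariance([sum(row) for row in rows])
    return k / (k - 1) * (1 - pq_sum / total_var)

# Hypothetical true-false results: five respondents, four items (1 = correct).
answers = [
    [1, 1, 1, 1],
    [1, 1, 1, 0],
    [1, 1, 0, 0],
    [1, 0, 0, 0],
    [0, 0, 0, 0],
]
print(kr20(answers))  # about 0.8
```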

55
Q

What can never happen with statistics that measure reliability and validity?

A

Validity cannot exceed the square root of reliability.
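
As a quick illustration of this ceiling (the function name and the .81 value are assumptions for the example):

```python
import math

def max_validity(reliability: float) -> float:
    """Upper bound on a validity coefficient for a given reliability."""
    return math.sqrt(reliability)

# A test with reliability .81 cannot have a validity coefficient above about .90.
print(max_validity(0.81))
```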

56
Q

What are the two factors about tests that most affect their reliability?

A
  1. Test length (longer is more reliable).
  2. Item difficulty (difficult is more reliable).

57
Q

What type of validity looks at several tests that measure the same construct?

A

Convergent validity (Monotrait/heteromethod)

58
Q

What type of validity involves several measures that should not show overlap because they measure different constructs?

A

Discriminant validity or divergent validity (Heterotrait/heteromethod)

59
Q

In a multitrait-multimethod matrix, what is actually indicated by the coefficient in the monotrait-monomethod cells?

A

Reliability.

60
Q

In a multitrait-multimethod matrix, which cells should have the lowest correlation, and the next lowest correlation?

A

Different traits measured by different methods, followed by different traits measured by the same methods.

61
Q

In a multitrait-multimethod matrix, which cells should have the highest correlation, and the next highest correlation?

A

Same traits measured by the same method, followed by same traits measured by different methods.

62
Q

Raising a cutoff score on a test increases which categories of decisions?

A

True negatives and false negatives.
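
The effect of moving a cutoff can be sketched by counting decisions directly. This example assumes a "positive" means the person scored at or above the test cutoff, and "true" means the decision matched later success on the criterion; the cases are invented.

```python
def decision_counts(cases, cutoff):
    """Count true/false positives and negatives for a given test cutoff."""
    counts = {"TP": 0, "FP": 0, "TN": 0, "FN": 0}
    for score, succeeded in cases:
        selected = score >= cutoff
        if selected and succeeded:
            counts["TP"] += 1
        elif selected and not succeeded:
            counts["FP"] += 1
        elif not selected and not succeeded:
            counts["TN"] += 1
        else:
            counts["FN"] += 1
    return counts

# Hypothetical (test score, succeeded on the criterion) pairs.
cases = [(55, False), (60, False), (65, True), (70, False),
         (75, True), (80, True), (85, False), (90, True)]

print(decision_counts(cases, 60))  # {'TP': 4, 'FP': 3, 'TN': 1, 'FN': 0}
print(decision_counts(cases, 80))  # {'TP': 2, 'FP': 1, 'TN': 3, 'FN': 2}
```

Raising the cutoff from 60 to 80 moves people out of the selected group, so both kinds of negatives grow while both kinds of positives shrink.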

63
Q

Raising a performance criterion score (without lowering a test cutoff score) increases which categories of decision?

A

True negatives and false positives (i.e., more members who did well on the test will be included in the “unsatisfactory performance” group).

64
Q

Attempting to eliminate all false positives on a test will also have what effect?

A

It will eliminate many true positives (i.e., create more false negatives) by raising the cutoff score.