Untitled Deck Flashcards
What is validity?
Agreement between a test score or measure and the quality it is believed to measure.
What is face validity?
Extent to which a measure appears to have validity; does not offer evidence to support conclusions drawn from a test; is not a statistical measure.
What is content validity?
Determines if items on a test are directly related to what they are assessing; established through logical analysis rather than statistics.
What is the process of establishing content validity?
Define domain of test, select panel of qualified experts (NOT item writers), panel participates in process of matching items to domain, collect/summarize data from matching process.
What is criterion validity?
Using a current test to infer some performance criterion that is not being directly measured; supported by high correlations between test score and well-defined measure.
What is the process of establishing criterion validity?
Identify criterion and measurement method, identify representative sample, give test to sample and obtain criterion data, determine strength of relationship (correlation) between test score and criterion performance.
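The last step above, quantifying the relationship as a correlation (the validity coefficient, r), can be sketched as follows; the function and sample numbers are illustrative assumptions, not data from the deck.

```python
# Sketch: a validity coefficient is the Pearson correlation (r) between
# test scores and a well-defined criterion measure.
from statistics import mean

def pearson_r(xs, ys):
    """Pearson correlation between two equal-length lists of scores."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sum((x - mx) ** 2 for x in xs) ** 0.5
    sy = sum((y - my) ** 2 for y in ys) ** 0.5
    return cov / (sx * sy)

test_scores = [52, 61, 70, 75, 80, 88]        # predictor test (made-up)
criterion   = [2.1, 2.8, 2.9, 3.3, 3.4, 3.9]  # e.g., later GPA (made-up)
print(round(pearson_r(test_scores, criterion), 3))
```

A high r here would support predictive validity (criterion measured later) or concurrent validity (criterion measured at the same time).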
What is predictive validity?
How well a test predicts criterion performance in the future.
What is concurrent validity?
Assess the simultaneous relationship between a test and criterion.
What is validity coefficient?
The relationship between a test and the related criterion (r); extent to which the test is valid for making statements about the criterion.
What is the result of cross validation?
The prediction will be worse; there will be more error in the prediction; fewer points will fall on the regression/prediction line, but could still have good predictive validity.
What should be checked when evaluating validity coefficients?
Check for restricted range on both predictor and criterion, review evidence for validity generalization, consider differential prediction.
What is construct validity?
Often measures things that aren’t directly observable; requires operational definition and description of relationships with other variables.
What is the process of establishing construct validity?
Assemble evidence about what a test means; each relationship that is identified helps to provide a piece of the puzzle of what the test means.
What is convergent evidence?
Expect high correlation between 2+ tests that assess the same construct; the correlation should be high but not perfect, since an r of 1.0 (or very close to it) would mean the tests are measuring exactly the same thing and one is redundant.
What is discriminant/divergent evidence?
2 tests of unrelated constructs should have low correlations; the test should discriminate between two qualities unrelated to each other. Items can be dropped or subscales created to improve discriminant evidence.
What is item writing guideline #1?
Define clearly what you wish to measure.
What is item writing guideline #2?
Generate a pool of items; write 3-4 items for every one you will keep, and avoid redundant items in the final test.
What is item writing guideline #3?
Avoid exceptionally long items, which can be confusing or misleading.
What is item writing guideline #4?
Be aware of reading level (scale and test taker); usually want reading level at 6th grade.
What is item writing guideline #5?
Avoid double-barreled items.
What is item writing guideline #6?
Consider using questions that mix positive and negative wording to avoid response set.
What is dichotomous format?
Two choices per question; requires absolute judgment; can promote memorization without understanding.
What is polychotomous format?
Has more than 2 options; the probability of guessing correctly is lower; a correction-for-guessing formula can be applied to scores.
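The correction-for-guessing formula mentioned above subtracts the expected gain from blind guessing; a minimal sketch (function name and example numbers are illustrative assumptions):

```python
# Classic correction for guessing: corrected = right - wrong / (k - 1),
# where k is the number of options per item. Omitted items are not penalized.

def corrected_score(num_right, num_wrong, num_options):
    """Penalize wrong answers by the expected payoff of random guessing."""
    return num_right - num_wrong / (num_options - 1)

# 40 correct and 12 wrong on 4-option items: 40 - 12/3 = 36
print(corrected_score(40, 12, 4))
```

The intuition: on a k-option item, a blind guesser gets 1 right for every k-1 wrong, so subtracting wrong/(k-1) cancels the expected guessing gain.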
What are distractors?
The incorrect options on a multiple-choice item; poor distractors hurt reliability and validity; rarely do more than 3 or 4 distractors work well.
What is an issue with multiple choice questions?
Common flaws: an unfocused stem, a negative stem, irrelevant information in the stem, unequal option lengths, negative options, and clues to the correct answer. Keep the correct option and the distractors in the same general category.
What is Likert format?
Rate degree of agreement with a statement; often used for attitude and personality scales; can be analyzed with factor analysis. An odd number of options has a neutral midpoint, an even number does not; beyond about 6 options, respondents cannot reliably discriminate between choices.
What is test development step #1?
Review literature; what measures exist already, how can they be improved.
What is test development step #2?
Define the construct; what domain you’ll be sampling from.
What is test development step #3?
Test planning and layout; find representative sample of items that represent that domain well.
What is test development step #4?
Designing the test; write brief, clear instructions and a manual/directions for administrators.
What is item difficulty in test design?
If items are too difficult/easy, they will not discriminate between individuals; the test is not informative.
What is item attractiveness in test design?
For personality tests; the proportion of test takers likely to answer yes/true/agree; items that most people would endorse should be rephrased.
What is test development step #5?
Item tryout; choose a sample of individuals that matches the target population; the initial pool should contain 1.5-2x as many items as the final test.
What is test development step #6?
Item analysis; people with high levels of the characteristic should get high scores, and scores should span a range. Item difficulty/attractiveness = number answering correctly (or marking true) divided by total test takers; difficulty should vary across the test. The item discrimination index measures how well an item discriminates between high and low scorers on the overall test.
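The two item-analysis statistics above can be sketched as follows; the data, function names, and the 27% extreme-group cutoff are illustrative assumptions.

```python
# Sketch: item difficulty (proportion correct) and a discrimination index
# comparing correct rates in the top- vs. bottom-scoring groups.

def item_difficulty(responses):
    """Proportion of test takers answering the item correctly (1 = correct)."""
    return sum(responses) / len(responses)

def discrimination_index(item_responses, total_scores, frac=0.27):
    """p(correct) among top scorers minus p(correct) among bottom scorers."""
    n = max(1, round(frac * len(total_scores)))
    order = sorted(range(len(total_scores)), key=lambda i: total_scores[i])
    low, high = order[:n], order[-n:]
    p = lambda grp: sum(item_responses[i] for i in grp) / len(grp)
    return p(high) - p(low)

item   = [1, 0, 1, 1, 0, 1, 1, 0, 1, 1]   # one item, 10 people (made-up)
totals = [9, 3, 8, 7, 2, 9, 6, 4, 8, 7]   # total test scores (made-up)
print(item_difficulty(item))               # 0.7
print(discrimination_index(item, totals))  # 1.0: perfect discrimination here
```

A positive index means high scorers pass the item more often than low scorers; an index near zero (or negative) flags an item to revise or drop.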
What is test development step #7?
Building the scale; choose items with moderate difficulty, high discriminability.
What is test development step #8?
Standardizing the test; the test is administered to a large representative sample under the same conditions, and with the same demographics, as its intended use. If reliability/validity are sufficient, compute percentiles and other norms; if not, return to item writing/analysis.
What is item difficulty?
The percentage of people who answer an item correctly; most tests keep difficulty between 0.30 and 0.70; optimal difficulty is approximately 0.625.
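The 0.625 figure above follows a common textbook rule of thumb, placing optimal difficulty halfway between the chance success rate and 1.0; a sketch under that assumption (the function name is hypothetical):

```python
# Rule-of-thumb sketch: optimal difficulty = (1.0 + chance) / 2, where
# chance = 1 / (number of options). For 4-option items this gives 0.625.

def optimal_difficulty(num_options):
    chance = 1 / num_options
    return (1.0 + chance) / 2

print(optimal_difficulty(4))  # 0.625 (four-option multiple choice)
print(optimal_difficulty(2))  # 0.75 (true/false items)
```

This is why true/false items "should" be easier on average than four-option items: a guesser already gets half of them right.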