Psychometrics & Statistics Flashcards
In Classical Test Theory, what 2 components comprise any obtained test score?
A true score (T) and random error (E)
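In symbols, writing X for the obtained score (a label assumed here, not used on the card), the decomposition can be sketched as below. The variance identity holds only under the usual assumption that error is random and uncorrelated with the true score.

```latex
X = T + E, \qquad \sigma_X^2 = \sigma_T^2 + \sigma_E^2
```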
Descriptive versus Inferential statistics
Descriptive statistics quantitatively describe the main features of data, i.e., central tendency (mean, median, mode), variability (SD, variance), and shape of summarized data points
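As a rough illustration, the central tendency measures named above can be computed with Python's standard `statistics` module. The scores are made up for the example.

```python
import statistics

# Hypothetical raw scores, not taken from any real test
scores = [85, 90, 90, 100, 110, 115]

mean = statistics.mean(scores)           # arithmetic average
median = statistics.median(scores)       # middle value (average of the two middle scores)
mode = statistics.mode(scores)           # most frequent score
score_range = max(scores) - min(scores)  # crude measure of variability

print(mean, median, mode, score_range)
```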
Inferential statistics help reach conclusions that extend beyond the data alone, using methods such as the general linear model (t-test, ANOVA, ANCOVA), regression analysis, and multivariate methods (factor analysis, cluster analysis, linear discriminant function, multidimensional scaling).
Describe Kurtosis versus Skew
Kurtosis captures the degree to which a distribution of scores is clustered around the mean, i.e., whether the distribution is peaked (leptokurtic) or flat (platykurtic).
Skew is a measure of the asymmetry of a probability distribution; it shows the tendency of scores to cluster at the higher end (negative skew) or lower end (positive skew) of the distribution. Skewed distributions alter the rank order of the central tendency scores: with positive skew the mean typically exceeds the median, which exceeds the mode; with negative skew the order is typically reversed.
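One way to make both ideas concrete is to compute moment-based skewness and excess kurtosis by hand. This is a minimal sketch; the function names and the data are hypothetical, and other estimators (for example, bias-corrected sample versions) give slightly different values.

```python
import statistics

def central_moment(data, k):
    """k-th central moment: average of (x - mean)**k."""
    m = sum(data) / len(data)
    return sum((x - m) ** k for x in data) / len(data)

def skewness(data):
    """Third standardized moment; > 0 means a longer right tail."""
    return central_moment(data, 3) / central_moment(data, 2) ** 1.5

def excess_kurtosis(data):
    """Fourth standardized moment minus 3 (0 for a normal curve)."""
    return central_moment(data, 4) / central_moment(data, 2) ** 2 - 3

symmetric = [1, 2, 3, 4, 5]        # flat and symmetric
right_tailed = [1, 1, 2, 2, 3, 10]  # one large value pulls the tail right

print(skewness(symmetric), excess_kurtosis(symmetric))
print(skewness(right_tailed))
# In the right-tailed data the mean is pulled above the median
print(statistics.mean(right_tailed), statistics.median(right_tailed))
```

A negative excess kurtosis corresponds to a flatter (platykurtic) shape, a positive one to a more peaked (leptokurtic) shape.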
What is IRT?
Item Response Theory focuses on item-level characteristics rather than on test-level characteristics.
Item-level responses are analyzed to compare the probability of a correct answer against underlying person parameters (i.e., trait or ability) and item parameters (i.e., difficulty or discrimination), using an item characteristic curve (ICC).
The ICC is also known as the ‘item response function’. Depending on the model, it can provide information on difficulty alone; on difficulty and discrimination; or on difficulty, discrimination, and guessing.
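A minimal sketch of an ICC, assuming the common logistic form (the card does not specify one). The parameter names `theta` (person ability), `b` (difficulty), `a` (discrimination), and `c` (guessing) follow widespread convention but are assumptions here.

```python
import math

def icc(theta, b, a=1.0, c=0.0):
    """Probability of a correct response under a logistic item response function.

    theta: person ability; b: item difficulty;
    a: item discrimination (slope); c: guessing (lower asymptote).
    Defaults give a difficulty-only model; setting a adds discrimination;
    setting c adds guessing.
    """
    return c + (1.0 - c) / (1.0 + math.exp(-a * (theta - b)))

# When ability equals difficulty and there is no guessing, P(correct) = 0.5
print(icc(theta=0.0, b=0.0))
# With a guessing floor of 0.2 the same person has a higher probability
print(icc(theta=0.0, b=0.0, a=2.0, c=0.2))
```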
What is probability theory?
The probability of an outcome is defined as the proportion of times it occurs over an infinite number of replays of the game, as in games of chance.
What are the key elements of Bayesian theory?
Bayes’s theorem is used in decision making analysis to allow the posterior probability of an event to be calculated. Key elements are posterior probability, prior probability, and likelihood.
The probability of outcome B given event A equals the probability of A given B, times the prior probability of B, divided by the prior probability of A: P(B|A) = P(A|B) P(B) / P(A).
Conditional probability is the probability of an event given that a different event has occurred.
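A worked example may help. The sketch below applies Bayes's theorem to a hypothetical screening test; the base rate, hit rate, and false positive rate are invented numbers, and the function name is illustrative.

```python
def posterior(prior, likelihood, false_positive_rate):
    """P(condition | positive result) by Bayes's theorem.

    prior: P(condition); likelihood: P(positive | condition);
    false_positive_rate: P(positive | no condition).
    """
    # Total probability of a positive result across both groups
    evidence = likelihood * prior + false_positive_rate * (1 - prior)
    return likelihood * prior / evidence

# Hypothetical values: 10% base rate, 90% hit rate, 20% false positives
p = posterior(prior=0.10, likelihood=0.90, false_positive_rate=0.20)
print(round(p, 3))
```

Even with a 90% hit rate, the posterior probability stays well below 90% because the prior (base rate) is low.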
What type of distribution do measures of motor ability and reaction times typically have?
positively skewed
What is regression towards the mean?
Tendency for scores at extremes of a distribution to migrate toward the mean on repeated assessment.
In a pair of independent measurement scores from the same distribution, samples far from the mean on the first set of scores tend to be closer to the mean on the second set; they appear to regress because of increased probability closer to the center of the distribution.
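The effect can be shown with a small simulation. This is a sketch under assumed values: each observed score is a stable true score plus fresh random error at each administration, and the means, SDs, and sample size are arbitrary.

```python
import random

rng = random.Random(0)  # fixed seed so the sketch is repeatable

n = 10_000
true_scores = [rng.gauss(100, 10) for _ in range(n)]
# Two administrations: same true score, independent random error each time
test1 = [t + rng.gauss(0, 10) for t in true_scores]
test2 = [t + rng.gauss(0, 10) for t in true_scores]

# Select the people who scored in roughly the top 10% on the first test
cutoff = sorted(test1)[int(0.9 * n)]
top = [i for i in range(n) if test1[i] >= cutoff]

mean1 = sum(test1[i] for i in top) / len(top)
mean2 = sum(test2[i] for i in top) / len(top)
grand_mean = sum(test1) / n

# The extreme group's retest mean falls back toward the overall mean
print(round(grand_mean, 1), round(mean1, 1), round(mean2, 1))
```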
Variance versus Standard Deviation
Variance is the degree to which scores deviate from the mean; the average of the squared differences from the mean of each observation in a distribution.
SD is a statistical measure of variability of scores around the mean; equals the square root of the variance.
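Following the definitions above step by step, with made-up scores. Note that this divides by n, matching the "average of the squared differences" wording; sample estimates usually divide by n - 1 instead.

```python
import math

scores = [2, 4, 4, 4, 5, 5, 7, 9]  # hypothetical data

mean = sum(scores) / len(scores)
# Variance: average of the squared differences from the mean
variance = sum((x - mean) ** 2 for x in scores) / len(scores)
# SD: square root of the variance, back in the original score units
sd = math.sqrt(variance)

print(mean, variance, sd)
```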
What is the purpose of transformations?
A transformation reshapes a distribution of data points that lacks true normality so that it better fits a normal curve. Use of a transformation must be related to an essential measurement concern that can be identified and expressed.
What are Standard scores?
A transformation of normally distributed data onto a scale with a set mean and SD so that measures can be compared. Examples: z-score, T score, index score.
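A minimal sketch of the conversion, using hypothetical normative values. The T score scale (mean 50, SD 10) is standard; the index scale of mean 100 and SD 15 is a common convention assumed here.

```python
def z_score(raw, mean, sd):
    """Distance of a raw score from the normative mean, in SD units."""
    return (raw - mean) / sd

def rescale(z, new_mean, new_sd):
    """Place a z-score on a scale with the chosen mean and SD."""
    return new_mean + new_sd * z

# Hypothetical raw score of 34 against norms with mean 25 and SD 6
z = z_score(34, mean=25, sd=6)
t = rescale(z, 50, 10)        # T score
index = rescale(z, 100, 15)   # index score (assumed mean 100, SD 15)

print(z, t, index)
```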
Using percentiles, what distribution shape is created?
Data take on a “rectangular” shape that forces artificially even intervals, regardless of the underlying values.
Percentiles are often used because of their familiarity and ease of understanding for laypersons and test consumers.
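The "rectangular" effect can be seen in a toy example. The sketch below uses one common definition of percentile rank (percent of scores below, plus half of those tied); the scores and function name are hypothetical.

```python
def percentile_rank(score, sample):
    """Percent of the sample scoring below `score`, counting ties as half."""
    below = sum(1 for s in sample if s < score)
    equal = sum(1 for s in sample if s == score)
    return 100.0 * (below + 0.5 * equal) / len(sample)

# Raw scores bunch near the middle and spread out at the extremes
scores = [70, 85, 95, 100, 105, 115, 130]
ranks = [percentile_rank(s, scores) for s in scores]

raw_gaps = [b - a for a, b in zip(scores, scores[1:])]
rank_gaps = [b - a for a, b in zip(ranks, ranks[1:])]

print(raw_gaps)                          # uneven distances between raw scores
print([round(g, 1) for g in rank_gaps])  # identical distances between percentile ranks
```

The percentile ranks are evenly spaced even though the raw scores are not, which is why differences between percentiles do not reflect equal differences in the underlying values.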
What is Reliability?
Consistency of test results under varying test administration conditions; the degree to which scores are systematic and the measure is free from measurement error. The reliability coefficient is the ratio of true score variance to total variance; values of r range from 0 to 1, with .80 or higher considered acceptable.
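As a quick arithmetic check of the ratio, with invented variance components and assuming total variance is simply true score variance plus error variance:

```python
# Hypothetical variance components (true scores cannot be observed directly)
true_variance = 64.0
error_variance = 16.0
total_variance = true_variance + error_variance

# Reliability: proportion of total variance that is true score variance
reliability = true_variance / total_variance
meets_convention = reliability >= 0.80  # the .80 rule of thumb from the card

print(reliability, meets_convention)
```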
What is Validity?
Describe external versus internal validity.
Degree to which a measure can be used to support a specific inference. Validity is a property of the inferences drawn from test scores rather than of the test itself, so establishing the credibility of a test depends on a set of considerations external to the test.
External validity is the degree to which test results can be generalized to other groups and situations. Examples include ecological validity.
Internal validity is the degree to which observed effects are real, i.e., not caused by confounding variables or extraneous factors.
What is Test-retest reliability?
Stability of scores on repeated administration of an instrument to the same person.
Error variance: random fluctuation in performance from one administration to another.
Describe Reliability versus sensitivity to change.
What is RCI?
A measure with perfect reliability will not by itself detect change adequately; the trade-off is to optimize reliability against sensitivity to change.
The reliable change index (RCI) is used to determine whether a change in scores exceeds what can be attributed to methodological aspects of repeat assessment. RCI is based on test-retest reliability, the standard error of measurement of the test, and practice effects.
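A minimal sketch of one common formulation, often attributed to Jacobson and Truax, with an optional practice-effect adjustment. The card does not give a formula, so the exact form, the function name, and all numbers below are assumptions.

```python
import math

def reliable_change_index(score1, score2, sd, r_xx, practice_effect=0.0):
    """Change score divided by the standard error of the difference.

    sd: SD of the measure in a reference sample;
    r_xx: test-retest reliability;
    practice_effect: expected mean gain from retesting alone.
    """
    sem = sd * math.sqrt(1.0 - r_xx)   # standard error of measurement
    se_diff = math.sqrt(2.0) * sem     # standard error of a difference score
    return (score2 - score1 - practice_effect) / se_diff

# Hypothetical retest: 40 -> 55 on a scale with SD 10 and reliability .84
rci = reliable_change_index(40, 55, sd=10, r_xx=0.84, practice_effect=3)
is_reliable = abs(rci) > 1.96  # conventional 95% criterion

print(round(rci, 2), is_reliable)
```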
What is Alternate form reliability?
Reliability coefficient that captures both the stability of a test over time and the consistency of responses to different samples of items tapping the same knowledge or performance.
Alternate forms are parallel tests constructed to be similar in content, high in reliability, and equivalent. They reduce effects of error variance due to practice effects.
What is internal consistency reliability?
Evaluation of the internal consistency of a test by splitting it in different ways using only a single administration. Reliability of a half-test is the correlation between half scores of the test. Also known as split-half reliability.
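One common way to split a test is odd versus even items. The sketch below uses a small invented response matrix (rows are people, columns are items scored 0 or 1) and a hand-written Pearson correlation.

```python
import math

def pearson_r(x, y):
    """Pearson correlation between two equal-length lists."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

# Hypothetical responses: 5 people x 6 items, from a single administration
responses = [
    [1, 1, 1, 1, 1, 1],
    [1, 1, 1, 1, 0, 0],
    [1, 1, 0, 1, 0, 0],
    [1, 0, 0, 0, 0, 0],
    [0, 0, 0, 0, 0, 0],
]

# Half scores: items 1, 3, 5 versus items 2, 4, 6
odd_half = [sum(row[0::2]) for row in responses]
even_half = [sum(row[1::2]) for row in responses]

r_half = pearson_r(odd_half, even_half)  # reliability of a half-length test
print(round(r_half, 2))
```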
What is the Spearman-Brown formula?
Calculates the likely effect of lengthening a test. Lengthening a test will increase consistency with respect to item sampling (i.e., internal consistency) but not necessarily stability over time (i.e., test-retest).
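The formula itself is short. In the sketch below `r` is the reliability of the current test and `k` is the factor by which its length changes; both names and the example values are illustrative, and the prediction assumes the added items are comparable to the existing ones.

```python
def spearman_brown(r, k):
    """Predicted reliability when test length is multiplied by k."""
    return k * r / (1.0 + (k - 1.0) * r)

# Doubling a test whose reliability is .60
print(round(spearman_brown(0.60, 2), 2))
# Halving it (k = 0.5) lowers the predicted reliability
print(round(spearman_brown(0.60, 0.5), 2))
```

With k = 2 the same formula steps a half-test correlation up to an estimate for the full-length test.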
What is Inter-item reliability?
Consistency across items; an estimate of content sampling error and of the heterogeneity of the domain of knowledge or behavior being sampled.
Higher homogeneity is associated with better inter-item reliability, but the homogeneity of the test must match the relative homogeneity of the construct or criterion that the test is trying to measure.