Exam 1 Flashcards
What is the year of the first known testing
2200 BC
Who did China first test
public officials
Who created the normal distribution
Gauss
What happened during the 1700s-1800s
normal distribution was created
Civil service examination was given in the US
intellectual disability and psychosis were classified
Free association tests were developed
What were the 6 tests developed in the 1900s
Binet-Simon scale; Army Alpha and Beta tests; Woodworth personality test and MMPI; Rorschach inkblot test; college admission tests; Wechsler intelligence scale
Device or procedure in which a sample of an individual’s behavior is obtained, evaluated, and scored using standard procedures.
Test
A set of rules for assigning numbers to represent objects, traits, attributes, or behaviors
Measurement
Systematic procedure for collecting information that can be used to make inferences about the characteristics of people or objects
Assessment
Test designed to assess the upper limits of an examinee’s ability and knowledge
Maximum performance tests
Examples of maximum performance tests
SAT, job performance, Exams
Test that attempts to measure the typical behavior and characteristics of examinees
Typical response test
Types of typical response tests
personality tests, tests about attitudes toward something
What are the two types of scoring
Norm-referenced scoring and Criterion-referenced scoring
Type of scoring where an examinee’s performance is compared to the performance of other people
Norm referenced scoring
A type of scoring where an examinee’s performance is compared to a specific level of performance
Criterion-referenced scoring
What influences norm-referenced scoring
scores of other people on the test
What influences criterion-referenced tests
the test itself
What are the 10 assumptions
Psychological constructs exist
Constructs can be measured
Measurement isn’t perfect
A construct can be measured in different ways
All assessments have strengths and weaknesses
Always try to triangulate
Performance on a test can be generalized to other behaviors
Assessments can help people make decisions
Can be conducted in a “fair” manner
Can benefit individuals and society
What are the applications of psychological assessment
Diagnosis; treatment planning and effectiveness; selection, placement, and classification; self-understanding; evaluation and program evaluation; licensing; scientific method
Categories with no numeric scales
Nominal
Rank ordering, intervals between items not known
Ordinal
Numeric values with equal intervals between them, no true zero
Interval
Numeric values with equal intervals and a true zero
Ratio
Skew in which the tail of the distribution points to the right
positive skew
Skew in which the tail of the distribution points to the left
Negative skew
Part of distribution that most scores tend to concentrate around
Central tendency
The average of the distribution
Mean
The middle score in the distribution
median
The score that appears with the most frequency
mode
Average deviation of scores from the mean
Standard deviation
The mean of the squared deviations of scores from the group mean
Variance
The difference between the highest and lowest scores
Range
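The measures of central tendency and variability above can be checked by hand on a small data set. The following is a minimal sketch, not a reference implementation: the function name `describe` and the scores in `example_scores` are made up for illustration, and the variance divides by N (the "mean of the squared deviations" definition used on the card) rather than N - 1.

```python
import math
from collections import Counter

def describe(scores):
    """Return basic descriptive statistics for a list of numeric scores."""
    n = len(scores)
    mean = sum(scores) / n

    # Median: middle score, or the average of the two middle scores.
    ordered = sorted(scores)
    mid = n // 2
    median = ordered[mid] if n % 2 else (ordered[mid - 1] + ordered[mid]) / 2

    # Mode(s): every score that ties for the highest frequency.
    counts = Counter(scores)
    top = max(counts.values())
    modes = sorted(s for s, c in counts.items() if c == top)

    # Variance: mean of squared deviations from the mean (divides by N).
    variance = sum((s - mean) ** 2 for s in scores) / n

    return {
        "mean": mean,
        "median": median,
        "modes": modes,
        "variance": variance,
        "sd": math.sqrt(variance),
        "range": max(scores) - min(scores),
    }

# Hypothetical scores, chosen so the results are round numbers.
example_scores = [2, 4, 4, 4, 5, 5, 7, 9]
stats = describe(example_scores)
print(stats)
```

For these scores the mean is 5.0, the median 4.5, the mode 4, the variance 4.0, the standard deviation 2.0, and the range 7.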
The relationship between two variables
correlations
Variables that are outside of our measurement that influence the relationship between the variables of interest
third variable
Quantitative measure of the linear relationship between two variables
Correlation Coefficient
The amount of variance shared between two variables
Coefficient of determination
What is the range of the coefficient of determination
0 to 1
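To see how the correlation coefficient and the coefficient of determination relate, it can help to compute both on a tiny data set. This is a sketch under stated assumptions: `pearson_r` is an illustrative helper written from the standard Pearson formula, and the `hours` and `scores` values are invented.

```python
import math

def pearson_r(x, y):
    """Pearson correlation: covariation of x and y over the product of their spreads."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sum_xy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sum_xx = sum((a - mean_x) ** 2 for a in x)
    sum_yy = sum((b - mean_y) ** 2 for b in y)
    return sum_xy / math.sqrt(sum_xx * sum_yy)

# Hypothetical data: hours studied and quiz scores for five examinees.
hours = [1, 2, 3, 4, 5]
scores = [2, 4, 5, 4, 5]

r = pearson_r(hours, scores)
r_squared = r ** 2  # coefficient of determination: proportion of shared variance

print(round(r, 3), round(r_squared, 3))
```

Here r is about 0.775 and r squared is about 0.6, so roughly 60% of the variance is shared. Because r runs from -1 to 1, squaring it always lands between 0 and 1.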
What correlation coefficient is used with intervals and ratios
Pearson product moment
What correlation coefficient is used with ordinal scales
Spearman’s rank correlation
What correlation coefficient is used with one dichotomous variable and one interval/ratio variable
Point-biserial correlation coefficient
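Both Spearman's rank correlation and the point-biserial coefficient (the one used with one dichotomous and one interval/ratio variable) can be computed as a Pearson correlation on recoded data: ranks for Spearman, 0/1 codes for point-biserial. The sketch below assumes that equivalence; all function names and data are illustrative.

```python
import math

def pearson_r(x, y):
    """Pearson correlation computed from deviations around each mean."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sum_xy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sum_xx = sum((a - mean_x) ** 2 for a in x)
    sum_yy = sum((b - mean_y) ** 2 for b in y)
    return sum_xy / math.sqrt(sum_xx * sum_yy)

def average_ranks(values):
    """Rank values from 1 to n; tied values share the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j across the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = shared_rank
        i = j + 1
    return ranks

def spearman_rho(x, y):
    """Spearman's rho: Pearson correlation of the ranked data."""
    return pearson_r(average_ranks(x), average_ranks(y))

def point_biserial(group_codes, scores):
    """Point-biserial: Pearson correlation with one variable coded 0/1."""
    return pearson_r(group_codes, scores)

# Hypothetical ordinal data and hypothetical pass/fail (0/1) data.
rho = spearman_rho([10, 20, 30, 40, 50], [1, 3, 2, 5, 4])
r_pb = point_biserial([0, 0, 0, 1, 1, 1], [1, 2, 3, 4, 5, 6])

print(round(rho, 3), round(r_pb, 3))
```

With these values rho is 0.8 and the point-biserial coefficient is about 0.878.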
Predicting one variable given the information on another variable
Linear regression
What is the equation for linear regression
Y=a+bX
The typical amount of error in predictions made from a regression equation
Standard error of estimate
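The equation Y = a + bX and the standard error of estimate can be worked through on a few data points. This is a rough sketch: it fits a and b by ordinary least squares and divides the squared prediction errors by N - 2, which is one common convention (some texts divide by N). Function names and data are hypothetical.

```python
import math

def fit_line(x, y):
    """Least-squares intercept a and slope b for Y = a + bX."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    b = (sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
         / sum((xi - mean_x) ** 2 for xi in x))
    a = mean_y - b * mean_x
    return a, b

def standard_error_of_estimate(x, y, a, b):
    """Standard deviation of the prediction errors (assumes the N - 2 convention)."""
    squared_errors = [(yi - (a + b * xi)) ** 2 for xi, yi in zip(x, y)]
    return math.sqrt(sum(squared_errors) / (len(x) - 2))

# Hypothetical data: hours studied (X) and quiz scores (Y).
hours = [1, 2, 3, 4, 5]
scores = [2, 4, 5, 4, 5]

a, b = fit_line(hours, scores)
see = standard_error_of_estimate(hours, scores, a, b)
predicted = a + b * 6  # predicted score for someone who studied 6 hours

print(round(a, 2), round(b, 2), round(see, 3), round(predicted, 2))
```

For this data a is about 2.2, b is about 0.6, and the standard error of estimate is about 0.894, so the prediction for X = 6 is about 5.8, give or take roughly 0.9 points.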
A range of scores within which a participant's score is likely to fall, given a certain degree of confidence
Confidence interval
Which country was the first to use testing
China
The 8 factors in developing tests
Develop specific assessment objectives
Develop procedures that are appropriate for the construct
Develop explicit scoring criteria
Specify a sampling plan for collecting data
Develop test administration guidelines
Plan accommodations for those with special needs
Review the assessment prior to administration
Evaluate the psychometric properties of assessments
7 things selected assessments should do
Tap into the construct
Produce reliable data that are representative of the target population
be fair
Match info found in the literature
be appropriate for your qualifications and experience
not be easily misinterpreted or misused
be secure
What are the 6 components of assessment
Obtain informed consent/assent
Administer assessment in a standardized manner
modify assessments to meet the needs of examinees
Maintain test security
Make sure everything is scored properly and fairly
Keep everything confidential and anonymous
6 rules for interpreting/reporting results
Don’t use the assessment for other purposes
use multiple sources and types of assessment info
Stay close to the data; minimize subjectivity
Be aware of limitations of data
Consider if the normative sample is different from the chosen sample
Discuss results with examinees
Taking raw scores and transforming them in a systematic manner that places them on a scale that has a specific mean and standard deviation
Standard scores
Transformation in which the standard scores have the same distribution as the raw scores and maintain a direct relationship
Linear transformations
What are the four types of linear transformations
z-scores
T-scores
IQ scores
CEEB scores
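Each of the four linear transformations above starts from a z-score and rescales it to a chosen mean and standard deviation. The sketch below uses the conventional scale values (T: mean 50, SD 10; IQ: mean 100, SD 15; CEEB: mean 500, SD 100); the function names, the `SCALES` table, and the raw-score example are illustrative.

```python
def z_score(raw, mean, sd):
    """Distance of a raw score from the mean, in standard deviation units."""
    return (raw - mean) / sd

def rescale(z, new_mean, new_sd):
    """Linear transformation of a z-score onto a scale with a chosen mean and SD."""
    return new_mean + new_sd * z

# Conventional (mean, SD) pairs for each standard score format.
SCALES = {"T": (50, 10), "IQ": (100, 15), "CEEB": (500, 100)}

# Hypothetical test: raw score 34 on a test with mean 25 and SD 6.
z = z_score(raw=34, mean=25, sd=6)
converted = {name: rescale(z, m, s) for name, (m, s) in SCALES.items()}

print(z, converted)
```

A raw score of 34 is 1.5 standard deviations above the mean, which corresponds to T = 65, IQ = 122.5, and CEEB = 650. The shape of the distribution is unchanged because every score is shifted and stretched by the same amounts.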
Scores from a non-normal distribution that have been transformed to fit a normal distribution
Normalized standard scores
3 types of nonlinear transformations
Stanine scores
Wechsler scaled scores
Normal curve equivalents
A measure indicating an examinee’s performance relative to the group performance
Percentile rank
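A percentile rank can be computed directly from the scores of the comparison group. The sketch below assumes one common convention: count the scores below the examinee's score plus half of the scores tied with it. Other definitions exist, and the function name and group data are hypothetical.

```python
def percentile_rank(score, group_scores):
    """Percent of the group scoring below `score`, counting ties as half."""
    below = sum(1 for s in group_scores if s < score)
    equal = sum(1 for s in group_scores if s == score)
    return 100 * (below + 0.5 * equal) / len(group_scores)

# Hypothetical norm group of ten examinees.
group = [55, 60, 65, 70, 70, 75, 80, 85, 90, 95]
pr = percentile_rank(70, group)

print(pr)
```

Under this convention a score of 70 has a percentile rank of 40.0: three scores fall below it and the two tied scores count as one.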
Norm-referenced scores that identify the academic grade level achieved by the examinee
Grade equivalents
Theory/model of mental measurement that states that the responses to items on a test are accounted for by latent traits
Item response theory
An ability or characteristic that is thought to exist, but can’t be assessed directly
Latent Traits
The consistency, accuracy, or stability of results
Reliability
Theory that states that every score on a mental test is composed of two components: the true score and the error score
Classical Test Theory
The score that would be obtained on a perfect measure with perfect comprehension of the examinee
True score
What is true score formula
X = T + E
What is the X in the true score formula
the observed score
What is T in the true score formula
The true score
What is E in the true score formula
Error
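A small numeric illustration of X = T + E may help. True scores are never observed in practice, so the values below are purely hypothetical; the errors were chosen to be uncorrelated with the true scores so the variances add cleanly. The ratio of true score variance to observed score variance is the classical definition of reliability.

```python
def variance(values):
    """Mean of squared deviations from the mean (divides by N)."""
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values) / len(values)

# Hypothetical true scores (T) and errors (E) for four examinees.
true_scores = [90, 100, 110, 120]
errors = [3, -3, -3, 3]

# Observed score X = T + E for each examinee.
observed = [t + e for t, e in zip(true_scores, errors)]

# When T and E are uncorrelated, observed variance = true variance + error variance.
reliability = variance(true_scores) / variance(observed)

print(observed, variance(true_scores), variance(errors), variance(observed))
print(round(reliability, 3))
```

The observed scores are 93, 97, 107, and 123. True variance (125) plus error variance (9) equals observed variance (134), giving a reliability of about 0.933.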
What are the two types of error
Systematic and Random
The differences between the sample of items on the test and all possible items that the test could be constructed from.
Content sampling error
Random fluctuations in performance from one time to another
Time sampling error
Administer same test to same group at 2 different times.
Test-retest
How is test retest administered
1 form in 2 sessions
Administer 2 forms of the test to same group in the same session
Alternate forms
How is alternate forms administered
2 forms and 1 session
Administer two forms of test to same group at two different sessions
Delayed administration
How is delayed administration administered
Two forms in two sessions
Administer test to group one time. Split test into 2 halves
Split half
How is split half administered
One form one session
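One way to carry out the split-half procedure is to total the odd-numbered and even-numbered items separately and correlate the two halves. Because each half is only half as long as the full test, the correlation is usually stepped up with the Spearman-Brown formula, which this deck does not mention and is added here as an assumption. The item data and function names are hypothetical.

```python
import math

def pearson_r(x, y):
    """Pearson correlation computed from deviations around each mean."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sum_xy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sum_xx = sum((a - mean_x) ** 2 for a in x)
    sum_yy = sum((b - mean_y) ** 2 for b in y)
    return sum_xy / math.sqrt(sum_xx * sum_yy)

def split_half_reliability(item_scores):
    """Return (half-test correlation, Spearman-Brown corrected estimate)."""
    odd_totals = [sum(row[0::2]) for row in item_scores]   # items 1, 3, ...
    even_totals = [sum(row[1::2]) for row in item_scores]  # items 2, 4, ...
    r_half = pearson_r(odd_totals, even_totals)
    corrected = 2 * r_half / (1 + r_half)  # Spearman-Brown for a doubled test length
    return r_half, corrected

# Hypothetical data: 5 examinees (rows) by 4 items (columns).
item_scores = [
    [1, 1, 0, 1],
    [1, 2, 1, 2],
    [2, 3, 1, 2],
    [2, 2, 2, 2],
    [3, 2, 2, 3],
]

r_half, corrected = split_half_reliability(item_scores)
print(round(r_half, 3), round(corrected, 3))
```

For this data the two halves correlate about 0.775, and the corrected full-length estimate is about 0.873. An odd/even split is only one choice; a different split of the same items can give a different estimate.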
Administer a test to group one time
Coefficient alpha
How is coefficient alpha administered
One form one session
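Coefficient alpha can be computed from a single administration using the item variances and the variance of the total scores. The sketch below uses the standard formula, alpha = k/(k - 1) * (1 - sum of item variances / total score variance); the data and function names are hypothetical.

```python
def variance(values):
    """Mean of squared deviations from the mean (divides by N)."""
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values) / len(values)

def coefficient_alpha(item_scores):
    """Coefficient alpha for rows of examinees by columns of items."""
    k = len(item_scores[0])  # number of items
    item_variances = [variance([row[i] for row in item_scores]) for i in range(k)]
    total_variance = variance([sum(row) for row in item_scores])
    return (k / (k - 1)) * (1 - sum(item_variances) / total_variance)

# Hypothetical data: 5 examinees (rows) by 4 items (columns).
item_scores = [
    [1, 1, 0, 1],
    [1, 2, 1, 2],
    [2, 3, 1, 2],
    [2, 2, 2, 2],
    [3, 2, 2, 3],
]

alpha = coefficient_alpha(item_scores)
print(round(alpha, 3))
```

For this data alpha is about 0.876. Dividing by N or by N - 1 in the variance gives the same alpha, because the choice cancels in the ratio.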
Administer a test to a group one time. Two or more raters score test independently
Inter-rater
How is inter-rater administered
One form one session
Combining scores on several different test/subtests
Composite scores
Which is higher: the reliability of the composite score or the reliability of the individual scores
The composite score
What are the four factors in determining that a coefficient is acceptable
The construct
The time that is available to administer the test
The uses of the test
Method of estimating reliability
Standard deviation of the distribution of scores that would be obtained by one person if they were tested on an infinite number of parallel forms of a test comprised of items randomly sampled from the same content domain
Standard error of measurement
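In practice the standard error of measurement is estimated from the test's standard deviation and its reliability coefficient: SEM = SD * sqrt(1 - reliability). The sketch below applies that formula and builds an approximate 95% band around an observed score. Centering the band on the observed score is a simplifying assumption (some texts center it on an estimated true score), and the numbers are hypothetical.

```python
import math

def standard_error_of_measurement(sd, reliability):
    """SEM estimated from the score SD and the reliability coefficient."""
    return sd * math.sqrt(1 - reliability)

def confidence_band(observed, sem, z=1.96):
    """Approximate band around an observed score; z = 1.96 gives about 95%."""
    return observed - z * sem, observed + z * sem

# Hypothetical test: SD of 15, reliability of .91, observed score of 100.
sem = standard_error_of_measurement(sd=15, reliability=0.91)
low, high = confidence_band(observed=100, sem=sem)

print(round(sem, 2), round(low, 2), round(high, 2))
```

Here the SEM is about 4.5 points, so the 95% band runs from about 91.2 to about 108.8. A higher reliability shrinks the SEM and narrows the band.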
The appropriateness and accuracy of the interpretation of performance on a test
Validity
What are the two threats to validity
Construct underrepresentation
Construct-irrelevant variance
What are the five types of validity evidence
Evidence based on test content
Evidence based on relations to other variables
Evidence based on internal structure
Evidence based on response processes
Evidence based on consequences of testing
A measure of validity that shows how well a specific item falls within the content domain
Item relevance
Measure of validity that shows how well the test itself covers the domain
Content coverage
A measure of validity in which something appears to be valid; not a true measure of validity
Face validity