3rd exam Flashcards
What is the formula for classical test theory?
X = T + E, where X is the observed score, T is the true score, and E is error (systematic and random)
What creates a problem for classical test theory?
Guessing on an achievement test can make the observed score a poor estimate of the true score
Do we know when people guess?
We never know when someone is guessing
Abbott's formula
allows you to estimate the true score by correcting for blind guessing
If an examinee guesses wrong, what happens within classical test theory?
the observed score does not reflect their true score
Abbott's actual math formula
Corrected score = R - W / (K - 1), where R is the number of correct responses, W is the number of wrong responses, and K is the number of alternatives per item
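The correction-for-guessing formula can be sketched in a few lines of Python. This is a minimal illustration, not a scoring standard: the function name and the example counts are hypothetical, and it assumes omitted items count as neither right nor wrong.

```python
def corrected_score(num_right: int, num_wrong: int, num_alternatives: int) -> float:
    """Correction for blind guessing: R - W / (K - 1)."""
    if num_alternatives < 2:
        raise ValueError("each item needs at least two alternatives")
    # Each wrong answer is assumed to signal blind guessing, so the score is
    # reduced by the number of lucky guesses expected alongside those misses.
    return num_right - num_wrong / (num_alternatives - 1)


# Hypothetical examinee: 40 right, 12 wrong, on a test with 4 alternatives per item.
print(corrected_score(num_right=40, num_wrong=12, num_alternatives=4))  # 40 - 12/3 = 36.0
```

With more alternatives per item, K - 1 grows and the penalty per wrong answer shrinks, because a blind guess is less likely to land on the right option.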
To overcome the influence of blind guessing
one should advise examinees to attempt every question, since not all guessing is blind. Examinees can often narrow down the options and get the item correct, and truly blind guessing tends to be less frequent
What is the source of error in multiple-choice questions?
not the question itself but the response options you choose from
What is the source of error within short-answer questions?
the issue is working out what the question is asking and how to answer it; this affects reliability
Ebel's idea of reliability and response options
reliability studies have been done on the number of response options; a better way to increase test reliability is to add more items (response options should number around 5)
Speed tests
the best way to estimate reliability for speeded tests is a split-half reliability on the test
With speed tests, how should you estimate reliability?
administer each half of the test separately with half the time allowed, with the two administrations 2 weeks apart; this is a better indicator of reliability
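A split-half estimate comes down to correlating examinees' scores on the two separately timed halves. The sketch below uses made-up scores for six examinees. The Spearman-Brown step-up (2r / (1 + r)) is not mentioned in these notes; it is included as an assumption because it is the usual way to project a half-test correlation to full test length.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson correlation between two equally long lists of scores."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two score lists of equal length (at least 2)")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)


def spearman_brown(r_half):
    """Project a half-test correlation to full-length reliability (assumed step)."""
    return 2 * r_half / (1 + r_half)


# Hypothetical scores: each position is one examinee's score on that half.
first_half = [10, 12, 15, 9, 14, 11]
second_half = [11, 13, 14, 8, 15, 10]

r_half = pearson_r(first_half, second_half)
r_full = spearman_brown(r_half)
print(f"half-test r = {r_half:.2f}, stepped-up reliability = {r_full:.2f}")
```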
Halo Effect
a rater's tendency to perceive an individual who is high (or low) in one area as also high (or low) in other areas
2 kinds of halo effects
general impression model and salient dimension model
General impression model
tendency of a rater to allow an overall impression of an individual to influence judgment of that person's performance (ex: a rater may judge a reporter as “impressive” and thus also rate his/her speech as strong)
Salient dimension model
one quality of the person affects the rating of another quality of that person (ex: people rated as attractive are also rated as more honest); the rater makes inferences about an individual based on one salient trait or quality
Simpson's paradox
aggregating data can change the meaning of the data and obscure the conclusions because of a third variable
Percentages are at the heart of Simpson's paradox; why are they bad?
because they obscure the relationship between the numerator and denominator (ex: 8/10 and 80/100 are both 80%, but the number of people who reviewed the restaurant is very different)
What is important to know about a percentage?
you need to know what the numerator and denominator are, or you risk misinterpreting the percentage
What happens when you disaggregate the data?
you can see whether the phenomenon is actually occurring within each subgroup, which is how Simpson's paradox is detected
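A small numerical illustration of Simpson's paradox, following the restaurant-review idea above. All counts are invented for illustration, and the weekday/weekend split is a hypothetical third variable. Restaurant A has the higher positive-review percentage within each subgroup, yet Restaurant B looks better once the subgroups are aggregated, because the denominators are very unequal.

```python
# Hypothetical counts: (positive reviews, total reviews) by restaurant and day type.
reviews = {
    "Restaurant A": {"weekday": (81, 87), "weekend": (192, 263)},
    "Restaurant B": {"weekday": (234, 270), "weekend": (55, 80)},
}


def rate(positive, total):
    """Proportion of positive reviews."""
    return positive / total


def aggregate_rate(groups):
    """Pool numerators and denominators across subgroups, then take the proportion."""
    positive = sum(p for p, _ in groups.values())
    total = sum(t for _, t in groups.values())
    return rate(positive, total)


for name, groups in reviews.items():
    parts = ", ".join(
        f"{group}: {p}/{t} = {rate(p, t):.1%}" for group, (p, t) in groups.items()
    )
    print(f"{name}: {parts}; overall = {aggregate_rate(groups):.1%}")
```

Printing the numerator and denominator next to each percentage shows where the reversal comes from: most of A's reviews fall in the harder weekend subgroup, while most of B's fall in the easier weekday subgroup.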
Clinical decision-making
making decisions based on one's own clinical experience
Mechanical decision-making
make decisions based on data or statistics
Clinical psychologists often feel that their decision making is
absolute, but it is flawed because we bring biases that affect our decisions
Robin Dawes
asserts that mechanical prediction is better than clinical prediction
Dawes's example
faculty were asked to rate students in a graduate program from 1964-1967 on a 5-point scale. There was a very low correlation between current faculty ratings and ratings by the admissions committee, but the faculty ratings were correlated with GRE scores and undergraduate GPA
quantitative data (mechanical decisions) were
more predictive than clinical judgment
When can mechanical or quantitative prediction work?
when people identify which variables to examine; people are still necessary to choose the variables that go into the prediction
Dawes's crude mechanical decision making
ex: marital relationship satisfaction was predicted from the rate of sex relative to the rate of arguments; people tend to rate relationships higher if they have more sex and fewer fights
According to Dawes, what are people not good at doing with data?
integrating the data in unbiased ways
There is resistance to what kind of prediction?
mechanical prediction; our belief in our own predictions is reinforced by isolated incidents we can readily recall (even though we rely on testing, which is quantitative data)
Why do you always need to know the base rate?
to avoid making clinical judgment errors
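One way to see why the base rate matters is to work out how often a positive test result is actually correct. The sketch below is an illustration under stated assumptions: the sensitivity and specificity values and the two base rates are hypothetical, and the calculation is the standard positive predictive value from Bayes' rule, which these notes do not spell out.

```python
def positive_predictive_value(base_rate, sensitivity, specificity):
    """Probability that a person who tests positive truly has the condition."""
    true_positives = sensitivity * base_rate
    false_positives = (1 - specificity) * (1 - base_rate)
    return true_positives / (true_positives + false_positives)


# Hypothetical test: 90% sensitivity and 90% specificity in both settings.
ppv_rare = positive_predictive_value(base_rate=0.02, sensitivity=0.9, specificity=0.9)
ppv_common = positive_predictive_value(base_rate=0.50, sensitivity=0.9, specificity=0.9)

print(f"base rate 2%:  {ppv_rare:.3f}")    # about 0.155
print(f"base rate 50%: {ppv_common:.3f}")  # about 0.900
```

The same test gives very different answers depending on how common the condition is, which is why ignoring the base rate leads to overconfident clinical judgments.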
Clinical decision making always has to be balanced by
Mechanical decision making
When people seek out treatment, they seek it out when they are most
severe, or when something is really impacting them
When symptoms are at their most severe, they generally don't get more severe, which relates to
regression to the mean: extreme scores tend to move back toward the middle (the average)
Why is mechanical better than clinical prediction?
Dawes says that humans make errors in judgment because they ignore base rates, third variables, and regression to the mean
Third variable example
ice cream sales and crime both go up in the summer; the third variable is heat
Representative thinking
we tend to make decisions based on the information we readily have access to. We use these shortcuts to get through daily life, but with diagnosis we need to do more.
Using representative thinking
can sometimes cause errors in thinking.
Heuristic
simple rule to make decisions
Factor analysis goes under
Nondichotomous scoring systems
Item response theory goes under both
Item analysis for both dichotomous and nondichotomous
Generalizability theory goes under the
Overall test
Factor analysis
determines which items are associated with latent constructs (constructs that cannot be measured directly); this is done mathematically and allows us to look at item quality
Anxiety as a latent construct
3 buckets (overarching constructs): physical, emotional/psychological and cognitive (every disorder has buckets)
Within anxiety as a latent construct, what would the 3 overarching constructs contain?
Physical (heart rate, sweating, shaking, GI distress), Emotional/psychological (irritability, worry, nervousness), Cognitive (poor concentration, rumination)
3 necessary conditions for a factor analysis
- factor structure represents what we know about the construct
- factor structure can be replicated
- factor structure is clearly interpretable with precise scaling
What type of sample does a factor analysis require?
an over-inclusive, large sample of 200-500 subjects
Facets
homogeneous item clusters that map directly onto the higher-order factors
What happens when there are more items in a factor analysis?
it creates the ability to tap into constructs that you may not have anticipated; it can also produce facets or sub-constructs
Which item format can you not use in a factor analysis?
dichotomous item response formats, because they can cause a serious disturbance in the correlation matrix
Why do authors suggest rating scales or Likert scales with 5 to 7 points?
with more response options, a greater amount of variance can be captured