Psych Testing Exam 1 Flashcards
Assessment
the overall picture
More comprehensive: you get a number, but you also get a good integration of all the data. If done right, the person feels understood and has a comprehensive idea of what is going on with themselves.
Testing
short term
1 number represents you
Assumptions we need to make
1.) Psychological constructs exist. We should acknowledge that there is debate about this. ex. What are true constructs? (They may be arbitrary terms we throw out.)
2.) Psychological constructs can be measured; however, there is always error in our test scores.
3.) There are different ways to measure a construct. ex. One depression scale says a person is depressed; another says they are not.
4.) All procedures have strengths and weaknesses.
5.) Multiple sources of info should be part of the process. (With a single source you won't get the whole picture.)
6.) Performance on tests can be generalized. If someone is intelligent, they should do well in other parts of their lives.
7.) Assessment can provide useful information.
8.) Assessments can be fair (they can also be unfair).
9.) Assessment can benefit individuals and society.
Jean Esquirol
used language ability to identify intelligence
differentiated between emotional disorders and intellectual deficits present at birth
proposed a continuum of retardation from profound to mild
Edouard Seguin
worked with individuals with mental retardation
developed form board to increase motor control and sensory discrimination
Binet
asked by the French government to assist in integrating subnormal children into school
WW1
brief cognitive screen to place recruits in the military
Woodworth Personal Data Sheet
SAT
assess academic promise and equalize educational opportunities
Originally the Scholastic Aptitude Test; later renamed the Scholastic Assessment Test
nominal
numbers are arbitrarily assigned to categories
numbers are meaningless
ordinal
magnitude of some quantity or some order
ex. “Strongly Agree” ratings, class rank
interval
what we like to work with
establishes equal distances between measurements, so scores can be averaged
ratio
meaningful zero point and equal intervals
ex. If I weigh 200, I weigh twice as much as someone who weighs 100
Raw Scores
generally meaningless on their own
How can we make these more meaningful? We need to create a context to understand them
Interquartile range
Middle 50% of scores around the median
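A minimal sketch of the interquartile range, assuming a made-up list of scores and Python's `statistics.quantiles` with its default (exclusive) method; other quartile conventions can give slightly different cut points.

```python
from statistics import quantiles

# Hypothetical scores, invented for illustration only.
scores = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11]

# quantiles(..., n=4) returns the three cut points: Q1, median, Q3.
q1, median, q3 = quantiles(scores, n=4)

# The IQR is the width of the band that holds the middle 50% of scores.
iqr = q3 - q1
print(q1, median, q3, iqr)  # 3.0 6.0 9.0 6.0
```

Here the middle 50% of scores falls between 3 and 9, centered on the median of 6.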
How curves differ
if there is a negative skew the majority of scores are high
if there is a positive skew the majority of scores are low
Z score
z score = (indiv score - mean score)/Std Dev
z scores always have a mean of zero and a std dev of 1; they typically range from about -4 to 4
t score
T score = (z score × new std dev) + new mean
T scores have a mean of 50 and a std dev of 10
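The z score and T score formulas can be sketched as below. The raw scores are hypothetical, and the population standard deviation (`pstdev`) is an assumption; a course may use the sample formula instead.

```python
from statistics import mean, pstdev

def z_score(score, group_mean, group_sd):
    """z = (individual score - mean score) / std dev."""
    return (score - group_mean) / group_sd

def t_score(z, new_mean=50.0, new_sd=10.0):
    """T = z * new std dev + new mean (T scores use mean 50, std dev 10)."""
    return z * new_sd + new_mean

# Hypothetical raw scores for a small group.
raw_scores = [70, 80, 90, 100, 110]
group_mean = mean(raw_scores)
group_sd = pstdev(raw_scores)

z = z_score(110, group_mean, group_sd)
t = t_score(z)
print(round(z, 2), round(t, 1))

# Converting every raw score shows the fixed mean and std dev of z scores.
all_z = [z_score(s, group_mean, group_sd) for s in raw_scores]
```

A raw score equal to the group mean always converts to z = 0 and T = 50, which is what gives the raw score its context.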
reliability
consistency of test scores
stability of test scores
Classical test theory
1.) True Score: reflects true ability
2.) Error Score: reflects everything else that is randomly going on
X = T + E
X = obtained score
T = true score
E = measurement error (random)
Measurement error
reduces the usefulness of measurement
reduces our ability to generalize test results and the confidence we have in them
Content sampling error
a difference exists between test items and all possible items related to a construct
ex. A test's items will pick up some aspects of depression but not all (e.g., people who sleep all day but are happy)
If the test items are a good sample of the construct, content sampling error will be small
inter-rater error
errors in administration
clerical errors
increased likelihood of inter-rater differences when scoring is subjective in nature
Cronbach's Alpha and Kuder-Richardson
internal consistency
Examine the consistency of responses across all the individual items on the test. Two popular types:
Cronbach's alpha: appropriate to use with dimensional or Likert items
Kuder-Richardson: appropriate to use when items are dichotomous
These differ from split-half reliability: they look at the whole correlation matrix and see what the average relationship among the items is
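A minimal sketch of coefficient alpha, assuming the common variance form alpha = k/(k-1) × (1 - sum of item variances / variance of total scores) and population variances throughout. The response data are invented. With 0/1 items each item variance equals p × q, so the same function reproduces KR-20 under these assumptions.

```python
from statistics import pvariance

def cronbach_alpha(rows):
    """rows: one list of item scores per respondent (same items, same order)."""
    k = len(rows[0])                      # number of items
    items = list(zip(*rows))              # regroup scores item by item
    item_var_sum = sum(pvariance(col) for col in items)
    total_var = pvariance([sum(r) for r in rows])
    return (k / (k - 1)) * (1 - item_var_sum / total_var)

# Hypothetical Likert responses: 5 respondents x 3 items.
likert = [[4, 5, 4], [2, 3, 2], [5, 5, 4], [1, 2, 1], [3, 3, 3]]
alpha = cronbach_alpha(likert)

# Hypothetical dichotomous (0/1) responses: 4 respondents x 2 items.
dichotomous = [[1, 1], [0, 0], [1, 1], [0, 0]]
kr20 = cronbach_alpha(dichotomous)

print(round(alpha, 2), kr20)
```

The dichotomous example is perfectly consistent (both items always agree), so the coefficient reaches its maximum of 1.0.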
Standard Error of Measurement
SEM = SD × sqrt(1 - r), where r is the reliability of the test
ex. If you receive a T score of 60 and the reliability of the test is .84, what is your SEM?
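The practice question above can be worked through as a short sketch. It assumes the standard deviation is 10 because the score is a T score; the function name is only illustrative.

```python
import math

def standard_error_of_measurement(sd, reliability):
    """SEM = SD * sqrt(1 - reliability)."""
    return sd * math.sqrt(1 - reliability)

# T scores have a std dev of 10; the reliability is given as .84.
sem = standard_error_of_measurement(10, 0.84)   # 10 * sqrt(0.16), about 4

# A band of one SEM on either side of the obtained T score of 60.
obtained = 60
band = (obtained - sem, obtained + sem)
print(round(sem, 2), [round(x, 1) for x in band])
```

So the SEM is about 4 T-score points, and a band of one SEM around the obtained score runs from roughly 56 to 64. The higher the reliability, the smaller the SEM.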
Evidence based on relations to Other variables
explore the relationships between test performance and external variables
Evidence based on internal structure
how are our test items related to one another.
Evidence based on Response Processes
how is the task completed by an individual
Evidence based on the consequences of testing
what are the intended and unintended consequences
Construct underrepresentation
present when the test does not measure important aspects of the specified construct
Construct irrelevant variance
present when the test measures features that are unrelated to the specified construct
Face validity
does the test look the way you think it should look
tests with face validity are usually better received by the public
not desirable when malingering is a concern
Specificity
True negatives divided by the total number of people who do not have the condition (true negatives + false positives)
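A small sketch of the specificity calculation, assuming the usual definition in which the "total" is everyone who truly does not have the condition. The counts are hypothetical.

```python
def specificity(true_negatives, false_positives):
    """Proportion of people without the condition whom the test correctly clears."""
    return true_negatives / (true_negatives + false_positives)

# Hypothetical screening result: 100 people do not have the condition;
# the test correctly clears 80 and wrongly flags 20.
spec = specificity(true_negatives=80, false_positives=20)
print(spec)  # 0.8
```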
Factor analysis
is a prominent technique to explore the internal structure of a measure
allows one to detect the presence and structure of latent constructs among a set of variables
Exploratory Factor Analysis
EFA is the statistical method used to evaluate the interrelationships of variables and derive factors
Principal Components Analysis
examines all of the variance among the variables: shared, unique, and error
Kaiser-Guttman criterion
retain factors with eigenvalues greater than one
not recommended but most commonly done
Eigenvalue
reflects how much variance each factor accounts for
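The eigenvalue and Kaiser-Guttman cards can be illustrated with a short sketch. The eigenvalues are made up, and the sketch assumes they come from a correlation matrix of six variables, in which case they sum to the number of variables and each one divided by that number is the proportion of total variance for its factor.

```python
def kaiser_guttman_retain(eigenvalues):
    """Keep only the factors whose eigenvalue is greater than one."""
    return [ev for ev in eigenvalues if ev > 1.0]

def proportion_of_variance(eigenvalue, n_variables):
    """Share of total variance a factor accounts for (correlation-matrix case)."""
    return eigenvalue / n_variables

# Hypothetical eigenvalues for a 6-variable correlation matrix.
eigenvalues = [2.8, 1.4, 0.7, 0.5, 0.4, 0.2]

retained = kaiser_guttman_retain(eigenvalues)
shares = [proportion_of_variance(ev, len(eigenvalues)) for ev in retained]
print(retained, [round(s, 2) for s in shares])
```

Under this rule two factors are retained. A factor with an eigenvalue below one accounts for less variance than a single variable does, which is the rationale usually given for the cutoff.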
Factor Rotations
to enhance interpretability, factors are typically rotated
rotations either permit factors to correlate with one another (oblique) or do not (orthogonal)
CFA
model fit statistics test the fit or match between the data and the hypothesized factor structure
A positive finding in CFA does not mean the hypothesized structure is optimal, only that the data do not clearly contradict it
Selected Response
Strengths: easier to score; cover more content; enhance content sampling; reduce construct-irrelevant factors
Weaknesses: difficult and time consuming to write; unable to assess all abilities
Constructed Response
Strengths: test takers can demonstrate all their knowledge; eliminates random guessing; well suited for assessing higher order cognitive abilities
Weaknesses: only a small number of essays can be included; difficult to score reliably; vulnerable to feigning
Item Response Theory
A measurement theory that hypothesizes that responses to items are accounted for by latent traits
If a person has a low level of ability they are less likely to respond to test items correctly
How should a good item operate in item response theory?
People with high ability answer the item correctly; people with low ability are less likely to
Item Characteristic Curves
a graph with ability reflected on the horizontal axis and the probability of a correct response reflected on the vertical axis
Inflection Point
the point halfway between the lower and upper asymptotes; it reflects the difficulty of the item
Discrimination
is reflected by the slope of the ICC at the inflection point
the steeper the slope of the ICC, the better the item discriminates between people with high and low ability
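A sketch of an item characteristic curve, assuming a logistic model with an upper asymptote of 1. The model choice and the parameter names `a` (discrimination), `b` (difficulty) and `c` (lower asymptote) are assumptions for illustration; the cards above do not name a specific model.

```python
import math

def icc_probability(theta, a=1.0, b=0.0, c=0.0):
    """Probability of a correct response at ability level theta."""
    return c + (1.0 - c) / (1.0 + math.exp(-a * (theta - b)))

def slope_at_inflection(a, c=0.0):
    """Slope of this logistic ICC at theta = b, its inflection point."""
    return a * (1.0 - c) / 4.0

# At theta = b the curve sits halfway between the lower asymptote and 1.
p_mid = icc_probability(0.0, a=1.0, b=0.0, c=0.2)

# Compare a steep item and a flat item for low (-1) and high (+1) ability.
steep_gap = icc_probability(1.0, a=2.0) - icc_probability(-1.0, a=2.0)
flat_gap = icc_probability(1.0, a=0.5) - icc_probability(-1.0, a=0.5)
print(round(p_mid, 2), round(steep_gap, 2), round(flat_gap, 2))
```

The steep item separates the low-ability and high-ability examinees by a much larger difference in the probability of a correct answer, which is what better discrimination means here.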
Why do we have these standards?
we need to ensure that the tests we use are developed, administered, scored, and interpreted in a sound manner
ACA/APA code