Part 2 (Final) Flashcards
What is a latent variable
latent variables are variables that are not directly observed but are rather inferred (through a mathematical model) from other variables
What are two things we are expected to measure in an observation
true score: the real/expected influences on our measurements
error: undefined/unexpected influences on our measurements
ideally, a measurement contains more true score than error; however, that is often not the case
Why do we want enough questions in our questionnaire when it comes to error
if we have enough questions measuring the construct well, we can overcome error by cancelling out the randomness of overestimation vs underestimation
How can you reduce error
Combining measures
like layering many pictures that each contain only 10% of the information, combining measures gives a clearer idea of the true score
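As a sketch of why combining measures cancels out error (an illustrative simulation, not from the course): each simulated item below is the same true score plus its own random error, and the mean across items tracks the true score far better than any single item does.

```python
import random

random.seed(0)
N, K = 1000, 20  # respondents, items

# each item = true score + its own random error (hypothetical data)
true_scores = [random.gauss(0, 1) for _ in range(N)]
items = [[t + random.gauss(0, 2) for t in true_scores] for _ in range(K)]

def corr(x, y):
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sum((a - mx) ** 2 for a in x) ** 0.5
    sy = sum((b - my) ** 2 for b in y) ** 0.5
    return sxy / (sx * sy)

single_item = items[0]
combined = [sum(vals) / K for vals in zip(*items)]  # mean across all items

print(corr(single_item, true_scores))  # weak: one error-prone snapshot
print(corr(combined, true_scores))     # much stronger: random errors cancel
```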
What is reliability in the context of a questionnaire
In a world where our measurements are all error-prone, identifying patterns across multiple responses becomes critical. This is consistency
i.e. we need the latent variable to be influencing all responses to at least some extent producing a common pattern in the responses
the influence of the latent variable is the true score
any other influence is error
What is the classical measurement model
The classical measurement model has 3 key assumptions
remember: an assumption is something we expect to be true
- The individual items of a questionnaire each have error and true score. The amount of error in any given item varies randomly. The mean error across items is zero (given a sufficient N)
- The error in one item is not correlated with the error in any other
- The error in the items is not correlated with the true score
What is the parallel test model
Extends the classical measurement model with 2 more assumptions to be more practical
- The latent variable influences all items equally
all item-construct correlations are the same
- Each item has the same quantity of random error
the combined influences of all other factors are the same
Each item is true score + error, so if you reduce the amount of true score (latent variable influence) then you would increase the amount of error
Name the 5 assumptions of the parallel test model
- Only random errors
- Errors are not correlated with each other
- Errors are not correlated with the true score
- The latent variable affects all items equally
- The amount of random error for each item is equal
What is the essentially tau-equivalent model
to avoid violating our assumptions it’s helpful to loosen them a bit
- Only random errors
- Errors are not correlated with each other
- Errors are not correlated with true score
- The latent variable affects all items equally only when standardized
differences are due to constants
e.g. all questions are turned into z-scores; we might have different response formats across our questions, and if we do not convert them into z-scores, it might not look like they are all affected equally
- The amount of random error for each item is not necessarily equal
partly a consequence of not standardizing
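A minimal sketch of the standardization step (made-up responses): converting items with different response formats into z-scores puts them all on a common scale (mean 0, SD 1), so equal latent influence is no longer masked by the scales.

```python
import statistics

likert = [2, 4, 5, 3, 4, 1, 5, 3]           # a 1-5 Likert-type item
analog = [30, 75, 90, 55, 70, 10, 95, 50]   # a 0-100 analog-scale item

def z_scores(xs):
    m, sd = statistics.mean(xs), statistics.stdev(xs)
    return [(x - m) / sd for x in xs]

# after standardizing, both items have mean 0 and SD 1, so their
# different response formats no longer distort comparisons between them
z_likert, z_analog = z_scores(likert), z_scores(analog)
```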
If you are using a mix of Likert-type, analog, and dichotomous questions, which model is better for you?
Essentially Tau-Equivalent Model because less likely to violate our assumptions
What is the congeneric model
The congeneric model is much less strict
- Random error is only preferred, but not necessary
- Errors are preferably not correlated with each other, but can be
- Errors are not correlated with the true score
- The latent variable affects all items in some way
- The amount of random error for each item is not necessarily equal
Compare the models in strictness
- Starting points: classical measurement model + parallel test
- Common & somewhat strict: essentially tau-equivalent + congeneric
- Much less strict: general factor (allows for multiple latent variables in each measure)
What are additional assumptions that correlation analysis will add to our model
- you have interval-level data
probably not true; usually ordinal, even if close to interval
- your data follow a normal distribution
probably not met too well without interval data
- a straight line is the best way to represent the relation
this is probably true
Explain the assumption of linearity
If the line of best fit should be u-shaped, you have a big problem
A straight line fit to these data would be flat, indicating no relation is present
luckily, for reliability, we’re talking about measuring the same construct across two different questions, so it’s quite unlikely for their relation not to follow a straight line
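An illustrative simulation (made-up data, not from the course) of why a u-shaped relation is a problem: Pearson correlation only captures straight-line relations, so it reports almost no relation even though a strong one exists.

```python
import random

random.seed(1)
x = [random.uniform(-1, 1) for _ in range(500)]
y = [xi ** 2 + random.gauss(0, 0.05) for xi in x]  # strong u-shaped relation

def corr(a, b):
    n = len(a)
    ma, mb = sum(a) / n, sum(b) / n
    cov = sum((p - ma) * (q - mb) for p, q in zip(a, b))
    sa = sum((p - ma) ** 2 for p in a) ** 0.5
    sb = sum((q - mb) ** 2 for q in b) ** 0.5
    return cov / (sa * sb)

print(corr(x, y))  # near zero: the straight line misses the real relation
```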
Name the two deviations from normality
Skewness: the presence of a longer than normal tail
Kurtosis: the presence of a taller (narrower) or flatter (wider) than normal spread
What does deviation from normality threaten
there are two main categories of deviation from normal (skewness and kurtosis), and they both threaten the validity of all these models
Describe skewness
Either the left or the right tail could be pulled out
Skewness means your distribution is asymmetric
A negative skewness means the left tail is long
A positive skewness means the right tail is long
Describe kurtosis
The peak can be pulled up or pushed down
There are two kinds of kurtosis as well
a negative kurtosis means the curve is flatter
a positive kurtosis means the curve is taller
Kurtosis assumes a symmetrical distribution, so something is wrong on both sides of the distribution
How do we assess whether kurtosis and skewness are too much
We assume a normal distribution, but don’t typically get it in practice
there will be some degree of skew
there will be some degree of kurtosis
If either score is beyond +/- 3, then it might be too much of a problem
OR
Multiply the standard error (SE) by 3; if the skewness or kurtosis is bigger than that number, then it's too much
For that to make sense, you need a fairly small sample size, since as n increases, SE decreases; but with a larger sample, the overall assumptions hold up better anyway
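Both rules can be sketched by hand (made-up data; the SE formulas are the common large-sample approximations, which may differ slightly from what jamovi/SPSS report):

```python
import math
import statistics

data = [1, 2, 2, 3, 3, 3, 4, 4, 5, 9]  # small, right-skewed sample
n = len(data)
m, sd = statistics.mean(data), statistics.pstdev(data)

skew = sum((x - m) ** 3 for x in data) / (n * sd ** 3)
kurt = sum((x - m) ** 4 for x in data) / (n * sd ** 4) - 3  # excess kurtosis

# rule 1: is the statistic itself beyond +/- 3?
print(abs(skew) > 3, abs(kurt) > 3)

# rule 2: does the statistic exceed 3 * its standard error?
se_skew = math.sqrt(6 / n)   # approximate SE of skewness
se_kurt = math.sqrt(24 / n)  # approximate SE of kurtosis
print(abs(skew) > 3 * se_skew, abs(kurt) > 3 * se_kurt)
```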
How can large error be problematic for reliability
extremely large amounts of error will prevent you from observing any interesting associations (because it’s random)
How do we determine whether we have too much error?
We estimate it by correlating a measure with itself
This is internal consistency: whatever is not true score must be error
Name the 5 forms of internal consistency measures
- Cronbach’s alpha
- Split-half
- Test-retest
- Alternate Forms
- Omega
How common is Cronbach’s alpha
Most commonly reported
Easy to use with jamovi (and SPSS)
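A hand-rolled sketch of the computation jamovi/SPSS do for you (made-up responses): alpha = k/(k-1) * (1 - sum of item variances / variance of total scores).

```python
import statistics

# rows = respondents, columns = 4 items on the same scale (made-up data)
responses = [
    [4, 5, 4, 5],
    [2, 3, 3, 2],
    [5, 4, 5, 5],
    [3, 3, 2, 3],
    [1, 2, 2, 1],
]

def cronbach_alpha(rows):
    k = len(rows[0])
    items = list(zip(*rows))  # one tuple of responses per item
    item_vars = sum(statistics.variance(item) for item in items)
    total_var = statistics.variance([sum(r) for r in rows])
    return k / (k - 1) * (1 - item_vars / total_var)

print(round(cronbach_alpha(responses), 3))
```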
Briefly describe Split-half
Less common than alpha
Easy to use too (but only manually)
Usually, split by odd and even item numbers, then correlate scores from the even questions with scores from the odd questions
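A sketch of the manual odd/even split (made-up responses; the Spearman-Brown step is the usual correction for having halved the test length, added here as an assumption beyond the card):

```python
# rows = respondents, columns = items 1..6 (made-up data)
responses = [
    [4, 5, 4, 5, 3, 4],
    [2, 3, 3, 2, 2, 3],
    [5, 4, 5, 5, 4, 5],
    [3, 3, 2, 3, 3, 2],
    [1, 2, 2, 1, 2, 1],
]

def corr(a, b):
    n = len(a)
    ma, mb = sum(a) / n, sum(b) / n
    cov = sum((p - ma) * (q - mb) for p, q in zip(a, b))
    sa = sum((p - ma) ** 2 for p in a) ** 0.5
    sb = sum((q - mb) ** 2 for q in b) ** 0.5
    return cov / (sa * sb)

odd_half = [sum(r[0::2]) for r in responses]   # items 1, 3, 5
even_half = [sum(r[1::2]) for r in responses]  # items 2, 4, 6

r_half = corr(odd_half, even_half)
full_reliability = 2 * r_half / (1 + r_half)  # Spearman-Brown correction
```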