2 Psychometrics and Confirmatory Factor Analysis Flashcards
What is psychometric assessment?
Psychometric assessments involve no manipulation or random assignment of participants. Crucially, in psychometric assessments there is no correct answer: participants are free, and encouraged, to respond exactly as they believe is most appropriate.
What do psychometrics investigate?
(1) investigating and attempting to measure the ways in which humans are similar to or different from each other,
(2) investigating the ways in which different measures relate to each other, and, perhaps most important to the aim of psychometrics,
(3) investigating how we can capture the complexity of the world in observable data.
What questions can be answered using psychometrics?
the study of individual differences
moving away from generalised rules of behaviour
Why should individual differences be investigated?
- because they are of interest in their own right → intriguing issue in itself
- because psychological tests are useful in applied psychology → invariably leads to development of tests, measuring spectrums of variables
- because tests are a useful dependent variable in other branches of psychology → experimental research!
- other branches can predict behaviour better when they consider individual differences → sensitivity of statistical tests
What is diversity?
distribution of differences among individuals, attribution of difference from one individual to another
How can differences be identified and measured?
- clinical approaches - observation of individuals during therapy/treatment
→ differences between people; no generalisation to larger samples possible; dependent on a subjective view ("armchair speculation"); unbiased observations of behaviour, but no understanding of the wider context; reliability of conclusions questionable
- scientific assessment (psychometric testing, experimental testing, questionnaires)
empirical methods
behavioural consistency?
replicable!!
What do the International Test Commission (ITC) Guidelines posit as psychological testing?
- includes a wide range of procedures for use in psychological, occupational, and educational assessment
- includes procedures for the measurement of both normal and abnormal or dysfunctional behaviours
- normally designed to be administered under carefully controlled or standardized conditions
- provide measures of performance and involve the drawing of inferences from samples of behaviour
- include procedures which may result in the qualitative classification or ordering of people
What is a possible taxonomy of psychological assessment?
measurement
- using correct responses - tests
- not using correct responses - questionnaires, inventories
non-measurement
- interviews, observations, …
- other checklists, Q, …
→ assessment
What are benefits of psychological measurement?
objectivity, minimises subjective judgement
quantification
subtle effects can be observed and statistical analysis used to make precise statements about patterns of attributes and relationships
better communication, integrity of scientific community
When is a variable latent, a construct, a factor, or directly observable?
factor, latent variable and construct can be considered synonyms in the context of CFA
they cannot be directly measured
What did Darwin have to do with testing?
17th century - post-Renaissance philosophers started looking at testing in a more scientific way
empiricism arose
all factual or true knowledge originates in experience
Darwin (1809-1882) → influenced early psychology
members of a species exhibit a variability of characteristics, resulting in some being better suited than others to a set of environmental conditions
"characteristic" = anything that can be attributed to an individual
→ significance of individual differences apparent
What was Francis Galton's (1822-1911) contribution to testing?
obsessed with making all kinds of measurements
he created the first assessment of mental ability
he applied the normal distribution to human characteristics - meaningful summary by mean and SD
What was Alfred Binet's (1857-1911) contribution to testing?
abnormal vs. normal
devised the intelligence test as we know it
Stanford-Binet-Intelligence Test
What are other important historical steps in psychological testing?
Karl Pearson (1857-1936) - developed the chi-square test of significance and regression analysis, as well as the correlation coefficient
Louis Thurstone (1887-1955) - developed test theory and designed techniques for measurement scales
Georg Rasch (1901-1980) - developed a group of statistical models (Rasch models)
Raymond Cattell (1905-1998) - theoretical development of personality
differences in the value placed on personality attributes in industrialised vs. developing countries
Anne Anastasi (1908-2001)
Paul Kline (1937-1999) - argued for transformation from social science to pure science
Michel Foucault - Madness and Civilisation (book, 2001 edition)
mental illness was a cultural construct rather than a natural fact
history of madness - questions about freedom and control, knowledge and power
Jean-Étienne Esquirol (1772-1840)
transformed diagnosis of mental disorders - defined problems on the basis of their symptoms
Jean-Martin Charcot (1825-1893) extended that
Emil Kraepelin (1856-1926) - concepts of mental disease and classification
What is the technical nature of assessment?
- standardised administration
- scales, items
-> normative - information on accuracy or consistency of scores
- give evidence of validity, which provides basis for making valid inferences about the differences between scores
- scientific rationale
- an explanation of construction;
- standardized administration procedures in many cases;
- use of a large sample to establish a process for comparison with others;
- accuracy and error measures;
- evidence for validity;
- guidance on interpretation
If those technicalities are not adhered to, what dangers can occur?
- inappropriate for purpose
- poor quality
- how to use it properly?
- false administration
- misuse, false interpretation
What are traits?
traits are relatively constant, long-lasting tendencies; they are predictable and indicate underlying potential (Allport, 1961; Allport & Odbert, 1936)
-> can be measured
can be grouped into three classes:
- attainments - how well a person performs following a course of instruction
- ability traits - level of cognitive performance
- personality traits - an individual's style of behaviour
cognitive traits vs personality traits → meaningful way to rank people
experimental research - states?
correlational research - traits?
What are states?
transient or temporary aspects of the person, tend to be shown physiologically
assessment of states is more commonly conducted (depression, anxiety, helplessness, suicidal ideation, social contact, …)
-> more commonly measured
What are measurements of maximum performance?
tests of ability, aptitude and attainment
how well people do things
abstract reasoning, spatial orientation or relations, numerical/inductive reasoning, ideational fluency, musical sensitivity, clerical speed and aptitude, programming aptitude, spelling and grammar, manual dexterity, handtool dexterity
What are measures of typical performance?
assessments of personality, belief, values, interests
more user friendly
no right or wrong
no time limit
- the 16 Personality Factor Questionnaire (16PF);
- the Personality Assessment Inventory (PAI);
- the Occupational Personality Profile (OPP);
- the 15 Factor Questionnaire (15FQ);
- the California Personality Inventory (CPI);
- the Myers-Briggs Type Indicator (MBTI);
- the Minnesota Multiphasic Personality Inventory (MMPI);
- the Jung Type Indicator (JTI);
- the Millon Adolescent Personality Inventory;
- the Occupational Personality Questionnaire (OPQ);
- the Criterion Attribution Library (CAL)
What are other types of measurements?
standardised and non-standardised techniques
group versus individual administration
health, forensic, educational settings
paper-and-pencil approaches and apparatus tests
cognitive versus affective methods
precise scoring or elicitation questions
What is an open mode approach to testing?
free access, online, in their own time
main issues: authenticity, other persons, time variables
no knowledge of who has taken it
What is a controlled mode approach to testing?
identification of the test-taker
fixed date and time
but once again, authenticity and collusion, timing and security…
What is supervised mode approach to testing?
more traditional
attend a test session at an office
greater assurance of who is taking the test
need for administration training and costs
What is a managed mode approach to testing?
secure testing centre where supervision is available
controlled environment and uniformity in conditions
What ensures quality of measurement?
- scope
- reliability
- validity
- acceptability
- practicality
- fairness
- utility
What is structural equation modelling?
a comprehensive statistical approach that allows researchers to test complex relationships among observed and latent variables. Latent variables are not directly observed but are inferred from observed variables
evaluate model fit to data
handling complex relationships
What is confirmatory factor analysis?
examining the nature of latent constructs (attitudes, traits, intelligence, clinical disorders)
in contrast to exploratory factor analysis, CFA explicitly tests a priori hypotheses about relation between observed variables and latent variables
often the analytic tool of choice for developing and refining measurement instruments, assessing construct validity, identifying method effects, and evaluating factor invariance across time and groups
part of family: structural equation modeling (SEM)
- evaluate measurement model
- assess structural model
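The measurement model at the heart of CFA can be sketched numerically: the model-implied covariance matrix is Σ = ΛΦΛᵀ + Θ, where Λ holds the factor loadings, Φ the latent variances/covariances, and Θ the residual (unique) variances. The minimal stdlib-Python sketch below uses a one-factor model with invented loadings and variances; it is an illustration of the algebra, not a fitting procedure.

```python
# Minimal sketch of the CFA measurement model: Sigma = Lambda * Phi * Lambda^T + Theta.
# All numeric values below are hypothetical, chosen for illustration only.

def implied_covariance(loadings, factor_var, residual_vars):
    """One-factor CFA: return the model-implied covariance matrix."""
    n = len(loadings)
    # Common variance: loading_i * factor variance * loading_j
    sigma = [[loadings[i] * factor_var * loadings[j] for j in range(n)]
             for i in range(n)]
    # Unique/error variance is added on the diagonal only
    for i in range(n):
        sigma[i][i] += residual_vars[i]
    return sigma

# Three indicators of one latent factor (standardised, so diagonals come out near 1)
sigma = implied_covariance([0.8, 0.7, 0.6], 1.0, [0.36, 0.51, 0.64])
print(round(sigma[0][0], 10))  # 1.0  (0.8^2 * 1.0 + 0.36)
print(round(sigma[0][1], 10))  # 0.56 (0.8 * 1.0 * 0.7)
```

Fitting a CFA means choosing Λ, Φ, and Θ so that this implied matrix is as close as possible to the observed covariance matrix; the fit indices discussed later quantify that closeness.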
What belongs to theoretical formulation and data collection in a CFA?
- allows specification of highly complex hypotheses
- justification of models to do so
number of models posited a priori - equivalent models should be justified clearly; can all models be estimated?
identify competing theoretical models - credibility of observed measures
ensuring adequate sampling procedures, using power analysis to obtain estimates of adequate sample size
justify population
was SEM used for subsequent validation?
What belongs to data preparation in a CFA?
- data integrity
- distributional assumptions of the estimation method - maximum likelihood (ML) assumes multivariate normality (MVN)
- failure to meet the assumption of MVN can lead to overestimation of the chi-square statistic, inflated Type 1 error rates, and downwardly biased standard errors
- analysis and treatment of missing data (listwise deletion, pairwise deletion, mean substitution, multiple imputation, expectation maximisation)
most commonly: listwise deletion or available case analysis
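The two simplest of these missing-data treatments can be shown in a few lines of stdlib Python. The sketch below (with an invented case-by-variable table, missing values as `None`) shows listwise deletion, which drops any case containing a missing value, and mean substitution, which fills each gap with the variable's observed mean.

```python
# Illustrative missing-data handling; the data table is made up.

def listwise_deletion(rows):
    """Drop every case (row) that has at least one missing value."""
    return [r for r in rows if None not in r]

def mean_substitution(rows):
    """Replace each missing value with that variable's observed mean."""
    n_vars = len(rows[0])
    means = []
    for j in range(n_vars):
        observed = [r[j] for r in rows if r[j] is not None]
        means.append(sum(observed) / len(observed))
    return [[means[j] if r[j] is None else r[j] for j in range(n_vars)]
            for r in rows]

data = [[4, 2], [None, 4], [6, None], [2, 3]]
print(listwise_deletion(data))   # [[4, 2], [2, 3]]
print(mean_substitution(data))   # [[4, 2], [4.0, 4], [6, 3.0], [2, 3]]
```

Note the trade-off the flashcard hints at: listwise deletion shrinks the sample (here from 4 cases to 2), while mean substitution keeps all cases but artificially reduces variance.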
What is an assumption of multivariate normality?
Multivariate normality assumes that the set of variables being analyzed jointly follows a multivariate normal distribution.
In simpler terms, not only should each variable be normally distributed, but their combined distribution should also form a specific shape in multidimensional space that corresponds to the multivariate normal distribution.
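Univariate normality of each variable is necessary (though not sufficient) for MVN, so screening typically starts with per-variable skewness and excess kurtosis, which are near 0 for normal data. A small stdlib-Python sketch using the standard moment-based estimators, on invented data:

```python
# Moment-based skewness and excess kurtosis as a first normality screen.
# Values near 0 are consistent with (univariate) normality.

def skewness(x):
    n = len(x)
    m = sum(x) / n
    s2 = sum((v - m) ** 2 for v in x) / n   # second central moment
    m3 = sum((v - m) ** 3 for v in x) / n   # third central moment
    return m3 / s2 ** 1.5

def excess_kurtosis(x):
    n = len(x)
    m = sum(x) / n
    s2 = sum((v - m) ** 2 for v in x) / n
    m4 = sum((v - m) ** 4 for v in x) / n   # fourth central moment
    return m4 / s2 ** 2 - 3.0               # minus 3: normal has kurtosis 3

symmetric = [1, 2, 3, 4, 5]
print(round(skewness(symmetric), 6))        # 0.0  (perfectly symmetric)
print(round(excess_kurtosis(symmetric), 6)) # -1.3 (flatter than normal)
```

Even if every variable passes such a check, the joint distribution can still be non-normal, which is why dedicated multivariate tests (e.g. Mardia's) are used for MVN proper.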
What is the chi-square statistic?
This test determines whether a sample data set matches a population with a specific distribution,
and whether two categorical variables are independent of each other
- goodness-of-fit test
- test for independence
- The chi-square test effectively compares the model you have defined against the ‘ideal’ model for the data. As a result, you want the chi-square test to be non-significant (p>.05) as this will allow us to infer that our predefined model is not significantly different from the ideal model of our data.
- Note: as the amount of data increases, chi-square is increasingly likely to be significant; therefore, any interpretation of the model must be based on more than the chi-square result, especially when the dataset is large.
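The goodness-of-fit form of the statistic is simply the sum of (observed − expected)² / expected across categories. A stdlib-Python sketch with invented counts, testing whether a die is fair:

```python
# Chi-square goodness-of-fit statistic: sum of (O - E)^2 / E over categories.
# Counts are invented for illustration.

def chi_square(observed, expected):
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))

# 60 die rolls; under the null (fair die) we expect 10 per face.
observed = [8, 12, 9, 11, 10, 10]
expected = [10] * 6
stat = chi_square(observed, expected)
print(round(stat, 6))  # 1.0 -> far below the .05 critical value of 11.07 (df = 5)
```

Here the statistic (1.0) is well under the critical value, so the test is non-significant and the fair-die hypothesis is retained, mirroring how a non-significant chi-square in CFA means the model is not rejected.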
What are type 1 and type 2 error?
Type 1 Error (False Positive):
This error occurs when the null hypothesis is true, but we incorrectly reject it.
In other words, it’s concluding that there is an effect or a difference when there isn’t one.
The probability of committing a Type 1 error is denoted by alpha (α), also known as the significance level of the test. If you set your α at 0.05, there’s a 5% risk of committing a Type 1 error.
An example of a Type 1 error would be a medical test incorrectly indicating that a patient has a condition when they do not.
Type 2 Error (False Negative):
This error occurs when the null hypothesis is false, but we fail to reject it.
It means we’re concluding there is no effect or difference when, in fact, there is one.
The probability of committing a Type 2 error is denoted by beta (β). The power of a test, which is 1 - β, represents the test’s ability to detect an effect or difference when it truly exists.
An example of a Type 2 error is a medical test failing to detect a condition that is present in a patient.
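The claim that α is the Type 1 error rate can be checked by simulation: when the null hypothesis is true, a test run at α = .05 should falsely reject about 5% of the time. A stdlib-Python Monte Carlo sketch (arbitrary sample size and repetition count, z-test with known σ = 1):

```python
# Monte Carlo illustration of the Type 1 error rate.
# The null is true in every replication, so any rejection is a false positive.
import random

random.seed(1)
CRITICAL_Z = 1.96   # two-sided critical value for alpha = .05
n, reps = 30, 2000
false_positives = 0
for _ in range(reps):
    sample = [random.gauss(0, 1) for _ in range(n)]  # null: true mean is 0
    z = (sum(sample) / n) / (1 / n ** 0.5)           # z-test, known sigma = 1
    if abs(z) > CRITICAL_Z:
        false_positives += 1

rate = false_positives / reps
print(rate)  # typically close to 0.05, i.e. close to alpha
```

Simulating power (1 − β) works the same way: generate samples where the null is false (e.g. true mean 0.5) and count how often the test correctly rejects.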
What belongs to analysis decisions in a CFA?
choice of input matrix and estimation method
default choice: variance-covariance matrix and ML estimation
other aspects of the modeling process: how were latent variable scales fixed, and what type of software was used?
What belongs to model evaluation and modification in an CFA?
- model fit?
chi-square goodness-of-fit test; goodness-of-fit index; adjusted goodness-of-fit index, comparative fit index, root-mean-square error of approximation
recommendations for cut-off values have changed: .90; .95; .97 - see below for cut-off value recommendations
- is model modification practiced?
What belongs to reporting findings in an CFA?
- parameter estimates, including variances of exogenous variables, should be reported (with standard errors)
indication of variance accounted for in endogenous variables
report structure and pattern coefficients - specify the model and justify the choice
What did Jackson et al. (2009) reveal about the current reporting practices of CFA?
- 194 studies published in 24 journals
- 1409 models
- theoretical formulation and data collection
- The majority of studies reviewed (75.5%) focused on validating or testing the factor structure of an instrument. Of these, most (77.7%) used an existing instrument, with the remainder (22.3%) reporting on validation of a new measure. Other studies examined constructs or theories (15.8%) or assessed a measurement model prior to conducting a SEM (8.8%).
- 63.8% posited more than one a priori model
- approximately one-fifth (21.6%) of the studies clearly indicated that measured variables had been examined for univariate normality. In only 13.4% of the articles was it clear that MVN had been examined, and in even fewer instances (3.6%) was it mentioned that data had been screened for multivariate outliers
- 65% did not report missing data information
- maximum likelihood was the most common estimation method (42%); the method was not reported in 33%
- 90% reported chi-square values; CFI (78.4%), RMSEA (64.9%), and TLI (46.4%)
- half of the studies (57.2%) stated explicit cutoff criteria and approximately one-third (36.1%) provided a rationale for their choice of fit measures
⇒ a variety of reporting problems
⇒ no specification of matrix used, 50% did not report factor loadings or latent variable correlations, seldom mentioned whether they examined their data for normality
What should the RMSEA and SRMR be?
< .05 indicates a model close to ideal
< .08 suggests an acceptable model
> .08 is mediocre and becomes poor as the value moves further from .08.
What should the CFI be?
- CFI compares the defined model against the baseline (i.e. the least structure, worst defined model). As a result, you ideally want the CFI to be as large as possible. As a general guideline:
- > .95 is good
- > .90 is acceptable
- < .90 is mediocre and becomes poor as the value moves further from .90.
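The cutoff guidelines from the two cards above can be collected into a small helper, useful as a memory aid when reading output. This stdlib-Python sketch simply encodes those rules of thumb; it is not a substitute for judgment (and, as noted earlier, published cutoff recommendations have shifted over time).

```python
# Encodes the flashcards' cutoff guidelines:
# RMSEA/SRMR - lower is better; CFI - higher is better.

def interpret_rmsea_srmr(value):
    if value < 0.05:
        return "close to ideal"
    if value < 0.08:
        return "acceptable"
    return "mediocre to poor"

def interpret_cfi(value):
    if value > 0.95:
        return "good"
    if value > 0.90:
        return "acceptable"
    return "mediocre to poor"

print(interpret_rmsea_srmr(0.06))  # acceptable
print(interpret_cfi(0.97))         # good
```

Indices should be read jointly (e.g. CFI above .95 together with RMSEA/SRMR below .08), not one at a time.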
How would a CFA be conducted in JASP?
- name your factors
- move the datasets to the factors
- additional output
- additional fit measures (RMSEA, SRMR, CFI)
- click on plots - model plots
- chi-square test → non-significant
- CFI → above .95
- RMSEA and SRMR → below .08
- factor loadings -> the larger the better