week11 Flashcards

1
Q

What has influenced test development up to now?

A

Content developments
Theoretical developments
-Intelligence
-Personality

Technical and methodological
Statistics
E.g., Factor analysis

Computers & the internet

Contextual needs

  • Political (e.g., impact of World Wars)
  • Funding/policy (e.g., educational testing)
2
Q

Future of Testing

A

Likely same influences will impact on testing into the future:
Content developments
Technical and methodological developments
Contextual changes

3
Q

Content Development

A

Construct development
A construct is a hypothetical entity with theoretical links to other hypothesised variables, proposed to relate to a consistent set of observable behaviours, thoughts or feelings that is the target of a psychological test.

Theoretical advances, such as new constructs emerging in the literature, might give an idea of the future tests and procedures likely to be developed.

4
Q

Emerging Constructs

A

3:11-7:13 https://www.youtube.com/watch?v=9xTz3QjcloI

Expansion of constructs of intelligence
Gardner’s theory of multiple intelligences
Drive development of broader measures

5
Q

Content Development

A

Big Five shaped development of a number of assessment measures
New concepts/increased attention driving new measure development, e.g.,
–Emotional intelligence
–Refers to a person’s capacity to monitor/manage emotions, understand the emotions of others, and use these insights to function better interpersonally
—Controversial: where to locate this in existing theory? Amalgamation of existing personality traits?

Integrity: dependability, theft proneness, counterproductive work behaviour.
-Specific type of personality test, or a direct measure, used to assess a job applicant’s honesty, trustworthiness or integrity

6
Q

Content Development

A

Neuroscience & brain function

  • Potential psychological interpretations for imaging?
  • Line between physiological and psychological assessment?
7
Q

Technical and Methodological Developments

A

Increasing access to computers and internet over time
-Computer-assisted psychological assessment (CAPA)

Smart testing

  • Computerised and multidimensional adaptive testing
  • Item-generation technology
  • Time-parameterised testing
  • Latent factor-centred design
  • Internet testing

Serious gaming

Potential for virtual reality, artificial intelligence in assessment

8
Q

Computer Applications

A

1950s: computers first available for testing and assessment

CAT conceived

New developments in test theory including item response theory

Costs/skills prohibitive for mainstream use

9
Q

Computer Applications

A

1980s: widespread proliferation of affordable home computers
Test developer access to affordable computing power
Development of computerised testing began

1990s: widespread growth of the internet
Possibility of internet testing
Testing as big business
Rapid proliferation of tests/testing

10
Q

Are computer and pen and paper forms equivalent though?

A

Does computer presentation fundamentally change the construct being measured?
Generally the answer is no

Cross-mode correlations of 0.97 (e.g., Mead & Drasgow, 1993 meta-analysis)
Not much difference between ticking a box on a questionnaire with a pencil or mouse

Psychological decision-making processes remain the same
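
A cross-mode correlation is simply the correlation between scores the same people obtain on the paper and computer versions of a test. A minimal sketch, assuming a plain Pearson correlation and entirely hypothetical scores (the `paper` and `computer` lists and the `pearson_r` helper are illustrative, not data or methods from the cited meta-analysis):

```python
import math

def pearson_r(x, y):
    """Pearson correlation between two equal-length lists of scores."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    ss_x = sum((a - mean_x) ** 2 for a in x)
    ss_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(ss_x * ss_y)

# Hypothetical scores for five test-takers on each administration mode
paper = [12, 15, 9, 20, 17]
computer = [11, 16, 9, 19, 18]

r = pearson_r(paper, computer)
print(round(r, 2))
```

A value close to 1 means the two modes rank test-takers almost identically, which is the sense in which the forms are treated as equivalent.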

11
Q

But….
speeded tests
psychomotor effects

A

Speeded tests are an exception (e.g., Greaud & Green, 1986)
Characterised by very simple tasks performed repetitively, as quickly as possible, within a short time limit (e.g., coding on WISC/WAIS)

Psychomotor effects: on speeded tests, variations in response modality (i.e., paper and pencil vs. computer) do affect results

  • Cross-mode correlation of 0.72 (e.g., Mead & Drasgow, 1993 meta-analysis)
  • Using a pencil is easier than using a mouse, thus mode of response greatly affects measurement
12
Q

Computer-assisted testing: WISC-V as an example

A

https://www.youtube.com/watch?v=tp5B86ajbmw

13
Q

Multidimensional Adaptive Testing (MAT)

A

MAT is an extension of computerised adaptive testing (CAT), covered in the educational testing lecture
-Multivariate generalisation

Revision: CAT is where a computer continuously monitors test-taker’s performance and selects next item to administer to get the most information

  • Item correct: harder item
  • Item incorrect: easier item
  • Adapts to your location on the underlying trait, to around where you would get half right and half wrong
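
The revision points above can be sketched as a short loop. This is only an illustration, assuming a one-parameter (Rasch) item response model; the item bank, the fixed step-size ability update, and all function names (`next_item`, `update_ability`, `run_cat`) are hypothetical simplifications rather than how any operational CAT works.

```python
import math

def p_correct(theta, b):
    """Rasch model: probability of a correct answer for ability theta, difficulty b."""
    return 1.0 / (1.0 + math.exp(-(theta - b)))

def item_information(theta, b):
    """Information of a Rasch item; largest (0.25) when p_correct is 0.5."""
    p = p_correct(theta, b)
    return p * (1.0 - p)

def next_item(theta, bank, used):
    """Select the unused item that is most informative at the current estimate."""
    candidates = [i for i in range(len(bank)) if i not in used]
    return max(candidates, key=lambda i: item_information(theta, bank[i]))

def update_ability(theta, correct, step=0.5):
    """Crude update (assumption): up after a correct answer, down after an error."""
    return theta + step if correct else theta - step

def run_cat(bank, answers_correctly, n_items=5, theta=0.0):
    """Administer n_items adaptively; return final estimate and the item history."""
    used, administered = set(), []
    for _ in range(n_items):
        i = next_item(theta, bank, used)
        used.add(i)
        correct = answers_correctly(bank[i])
        administered.append((bank[i], correct))
        theta = update_ability(theta, correct)
    return theta, administered

# Hypothetical item difficulties and a test-taker who passes items up to 1.0
bank = [-2.0, -1.5, -1.0, -0.5, 0.0, 0.5, 1.0, 1.5, 2.0]
theta, administered = run_cat(bank, lambda b: b <= 1.0)
print(theta, administered)
```

Because information peaks where the chance of a correct answer is 50%, always choosing the most informative item is what pushes the test towards the "half right, half wrong" region.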
14
Q

Multidimensional Adaptive Testing (MAT)

A

MAT takes adaptive testing to the next level by applying this same idea to a battery of tests rather than a single test
–Capitalises on idea that many constructs measured by a test are correlated

Performance on each item then informs items used for every subtest in a battery

Adapts simultaneously across subtests

Key advantage: reduces test time without sacrificing accuracy of measurement across a whole battery

15
Q

Limitations of MAT

A

Like CAT, requires considerable effort to develop a sufficiently large item bank to draw from

Requires 100s of items with item parameters estimated

Requires data from large samples of examinees with extensive testing during development, even more so than in CAT

Potential for “chopping and changing” between item types as system selects any subtest in the battery

  • May be confusing for test-takers
  • Need to remember instructions across subtests (memory requirements may be unrealistic)
16
Q

Item-Generative Testing

A

Possible solution for need for large item banks (MAT and CAT)

New items generated automatically by a computer based on an underlying rule or algorithm

–Once the main source of difficulty for a subtest is captured by a rule or template, the computer can generate an effectively unlimited number of actual items of the desired difficulty, e.g., by randomly initialising key variables and applying the rule

-Potential future assessments based on cognitive models of test performance to drive item-generative testing
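
As a rough illustration of rule-based item generation, the sketch below builds number-series items from a template. The rule linking difficulty to the series structure (a constant step for level 1, a growing step for level 2) is an assumption made up for this example, as are the function and field names.

```python
import random

def generate_series_item(difficulty, rng):
    """Generate one number-series item from a template.

    Assumed rule (illustrative only): difficulty 1 uses a constant step;
    difficulty 2 uses a step that itself grows by a constant amount.
    """
    start = rng.randint(1, 20)   # randomly initialised key variables
    step = rng.randint(2, 9)
    growth = 0 if difficulty == 1 else rng.randint(1, 4)
    series = [start]
    for _ in range(5):
        series.append(series[-1] + step)
        step += growth
    # The test-taker sees the first five numbers and supplies the sixth
    return {"stem": series[:-1], "answer": series[-1], "difficulty": difficulty}

rng = random.Random(42)  # seeded so the example is repeatable
easy_items = [generate_series_item(1, rng) for _ in range(3)]
hard_items = [generate_series_item(2, rng) for _ in range(3)]
for item in easy_items + hard_items:
    print(item)
```

Every call produces a new item of the requested difficulty level, so no fixed item bank needs to be written by hand; whether generated items really have the intended difficulty would still need to be checked empirically.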

17
Q

Time Parameterisation

A

Speed vs. accuracy?

  • May sacrifice one for the other, creates challenges in scoring/interpretation
  • Can’t tell from final score which strategy taken

BUT, Computer administered tests allow capture of response time

Challenge is how to use this?

  • Analyse separately?
  • Combine to investigate accuracy/time trade-off (efficiency)?
  • Treat time as a difficulty dimension?
  • Set a time limit or deadline for each item?
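
Some of these options can be made concrete with a small sketch. The response log, the "correct responses per minute" efficiency index, and the per-item deadline value are all hypothetical choices for illustration; the cards do not prescribe any particular scoring rule.

```python
from statistics import mean

# Hypothetical response log: (answered correctly?, response time in seconds)
responses = [(True, 4.0), (True, 6.0), (False, 2.0), (True, 8.0)]

def accuracy(responses):
    """Proportion correct, ignoring time (analysed separately)."""
    return sum(1 for c, _ in responses if c) / len(responses)

def mean_time(responses):
    """Mean response time in seconds, ignoring accuracy (analysed separately)."""
    return mean(t for _, t in responses)

def correct_per_minute(responses):
    """One possible efficiency index combining accuracy and time."""
    total_seconds = sum(t for _, t in responses)
    return 60.0 * sum(1 for c, _ in responses if c) / total_seconds

def score_with_deadline(responses, deadline):
    """Credit only correct answers given within the per-item deadline."""
    return sum(1 for c, t in responses if c and t <= deadline)

print(accuracy(responses))                  # 0.75
print(mean_time(responses))                 # 5.0
print(correct_per_minute(responses))        # 9.0
print(score_with_deadline(responses, 6.0))  # 2
```

Two test-takers with the same accuracy can differ sharply on the efficiency index, which is the information a final score alone cannot reveal.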
18
Q

Internet Testing

A

Revolutionised testing

Larger impact on distribution than development of tests

Questions can be quickly circulated to psychologists and other users

Internet versions of tests easily kept up to date and disseminated upon development

Scoring and the way questions are presented can be modified easily

Information easily returned to test developer

  • Potential for dynamic norming
  • Potential in future for multidimensional, adaptive, item-generative, time-parameterised, latent factor-centred, dynamically normed test!
19
Q

Internet Testing: Risks & Limitations

A

“Digital divide” (Bartram, 2000)
Some people have better access to the internet than others; those with the best access tend to be the most privileged

Strong tradition in testing of trying to avoid discrimination

Narrowing gap in recent years as computers and internet are becoming cheaper and more widespread

Potential to bridge service gaps in rural/remote areas
–More limited access to professionals/getting to a test centre

20
Q

Risks and Limitations

A

Security of Information

  • Security of potentially highly sensitive information for the test-taker
  • Security of the test itself
    - Tests restricted in access to maintain integrity of test
    - Potential for rapid dissemination via the internet
    - Can disable printing and screen capture, but can’t stop someone photographing the screen with a digital camera!
  • Bandwidth limitations
    - Can impact on timing due to lag
    - Serious challenges for CAT or MAT: to work, answers need to go back to the server for scoring/adaptation
    - Bandwidth problems can seriously slow down the test and impact on the test experience
    - Alternatively, download the whole test locally, but this means large downloads and may exacerbate test security challenges
21
Q

Risks and Limitations

A

Proliferation of non-evidence-based assessments on the internet
Pop-psych and para-psychological
Major problem for field

Testing vs. Assessment
Internet suited to testing, but not assessment
Risk of being used (inappropriately) as a replacement for psychological assessment
Very open to misuse and misinterpretation

22
Q

Industrial and Organisational Testing Online

A

Rise of online recruiters and job markets

Potential for automatic head hunting with “web bots” trawling web for CVs (e.g., LinkedIn)

Temptation for delivery of psychological tests/assessment direct to public without a psychologist

  • “Unsupervised mode”
  • Raises questions about assumptions in psychology of requirements for assessment
23
Q

Supervised Testing in the Digital Age

A

Functions of supervision of assessment

  • Authenticating the test-taker
  • Establishing rapport
  • Ensuring test is administered according to manual
  • Preventing cheating
  • Ensuring security of the test itself
24
Q

Levels of Supervision

A

Open (“unsupervised mode”)
E.g., tests published online, in magazines, or books
Many personal development measures
Tests that incurred significant development costs are unlikely to be open
Only suitable for low-stakes testing

Controlled (e.g., password to access)
Suitable for first step in recruitment process
Recommended to follow up with verified testing

Supervised mode (e.g., presence of proctor in non-secure environment)
E.g., NAPLAN in 2018

Managed mode (formal examination conditions with test kept secure)
May include locally supervised or remote (e.g., using webcam technology, keystroke monitoring, and timing)
Raises additional complexities!

25
Q

Considerations for Supervised Testing in the Digital Age

A

When does supervision matter?

Tests of typical performance (e.g., personality, interest inventories) tend to not be adversely affected by absence of formal supervision

Tests of maximal performance (e.g., aptitude, achievement tests): answers are affected by the presence/absence of a supervisor
Potential to look up answers, phone a friend, etc
Tends to inflate test scores

26
Q

Technology

A

What technology could be used in future assessment?

Serious games
Eye-tracking
Mobile phones/smart phones
Wearable devices
Some authors have suggested potential for virtual reality, artificial intelligence, & holograms!
27
Q

“Serious Games”

A

A game developed for a primary purpose other than entertainment (Charsky, 2010)

May provide an economical and accessible alternative where game play is a form of assessment

Benefits include the ability to design personalised games, promote health-related behavioural change, and educate participants

E.g., “Whack-a-mole” for cognitive assessment of older adults

28
Q

Mobile Phones

A
Mobile phones include:
Microphones that can record
Videos/photographs
Bluetooth
GPS
Accelerometers
E.g., GCC
Applications (Apps)
29
Q

Mobile Phones

A

Smartphones re-purposed as tools of assessment contain safeguards to protect the privacy of the subject of the assessment.
May be used for local, remote, & ecological momentary assessment (real time)
Well accepted by clients (e.g., psychiatric patients) and user-friendly

30
Q

Wearable Devices

A

Eye tracking glasses
Recording devices (steps, vocalisations, heart-rate etc)
E.g., Language Environment Analysis (LENA) records and tracks child/adult vocalisations
May be used to assess language impairments, monitor treatment progress, and evaluate the effectiveness of interventions
Technology has the potential to significantly change what we see as a “test” in future

31
Q

Contextual Changes

A

Broader social environment shapes what assessments are developed
Push for simpler/shorter measures that can be developed quickly, in contrast to technological wizardry!
Continually rising demands of general public
Increasing demands for accountability and transparency
Meet demands through ever more vigilance in terms of ethics and professionalism, and increasing scientific research into validity of tests.

32
Q

Contextual Changes

A

Managed care in clinical domain
Reluctance to use psychological assessment!
Subject to funding: not funded/limited funding (e.g., Medicare, NDIS)
E.g., NDIS funding based on functional impairment but no single measure appropriate across all ages and disabilities
Cost cutting/funding concerns: need to advocate for the value of, and need for, assessment
Important role of ethics in face of pressures in future assessment!