Psychological Assessment (Terms) Flashcards
The formula helps to estimate how the reliability of a test changes when the number of items (questions) in the test is increased or decreased. It is usually used in establishing Split-half reliability.
Spearman-Brown formula
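A minimal worked sketch of the formula, assuming its usual form r_new = n·r / (1 + (n − 1)·r), where n is the factor by which test length changes; the numbers are illustrative only.

```python
# Spearman-Brown prophecy formula: predicts reliability after the test
# is lengthened or shortened by a factor of n.
def spearman_brown(r_old: float, n: float) -> float:
    return (n * r_old) / (1 + (n - 1) * r_old)

# Illustrative case: a test with reliability .70 that is doubled in length
# is predicted to reach a reliability of about .82.
print(round(spearman_brown(0.70, 2.0), 2))  # 0.82
```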
means that the two variables move in opposite directions: when one goes up, the other goes down.
negative relationship
A type of test that does not have a fixed set of questions or format. It allows for open-ended responses and is more flexible in nature. An example would be an open-ended interview.
Unstructured test
A psychological test where a person responds to ambiguous stimuli, such as pictures or words, with the goal of uncovering hidden emotions and internal conflicts. An example is the Rorschach Inkblot Test.
Projective test
A test with a set of fixed questions and a specific format for responses. It is standardized and typically used to measure specific traits or abilities. An example is the Children's Personality Questionnaire-R (CPQR).
Structured test
ensures that participants are assigned to different groups in a way that minimizes bias
Random assignment
A test designed to measure a person’s cognitive abilities and intellectual potential. It often includes tasks related to reasoning, problem-solving, and understanding complex ideas.
Intelligence test
A group that does not receive the experimental treatment, serving as a baseline against which the effects of the treatment are compared.
control group
is the average of a set of numbers
Mean
is the value that appears most frequently in a dataset. It represents the most common or typical value in a set of data.
Mode
is the middle value in a dataset when the values are arranged in ascending or descending order.
Median
can skew the mean by pulling its value towards extreme scores, which can misrepresent the central tendency of the data.
Outliers
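A brief illustration of the three measures of central tendency and the effect of an outlier, using Python's standard statistics module and made-up scores.

```python
import statistics

scores = [10, 12, 12, 13, 14]
print(statistics.mean(scores))    # 12.2 (sum divided by the count)
print(statistics.median(scores))  # 12   (middle value when sorted)
print(statistics.mode(scores))    # 12   (most frequent value)

# A single outlier pulls the mean toward the extreme score,
# while the median barely moves.
with_outlier = scores + [60]
print(statistics.mean(with_outlier))    # ≈ 20.17
print(statistics.median(with_outlier))  # 12.5
```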
Checks newborns’ health right after birth.
Apgar test
Is a battery of tests measuring intelligence and achievement of normal and exceptional children ages 2½ through 12½ years. It yields four scales: the Sequential Processing Scale, the Simultaneous Processing Scale, the Mental Processing Composite (Sequential and Simultaneous) Scale, and the Achievement Scale.
Kaufman Assessment Battery for Children (K-ABC)
Assesses academic skills and cognitive abilities. It measures seven broadly defined abilities identified in CHC theory: Long-Term Retrieval (Glr), Short-Term Memory (Gsm), Processing Speed (Gs), Auditory Processing (Ga), Visual-Spatial Thinking (Gv), Comprehension-Knowledge (Gc), and Fluid Reasoning (Gf).
Woodcock-Johnson III (WJ-III)
Evaluates development in infants and toddlers up to 3.5 years old.
Bayley Scale
It checks how well a new test correlates with an established measure of the same thing, given at the same time.
Concurrent validity
Estimates how consistent test scores would be if the test were longer or shorter.
Spearman-Brown formula
Measures how well items in a test measure the same thing, especially for yes/no or true/false questions.
Kuder-Richardson 20
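A minimal sketch of the KR-20 computation for dichotomously scored items, using the standard formula KR-20 = (k / (k − 1)) · (1 − Σpq / σ²_total); the response matrix below is illustrative.

```python
# rows = examinees, columns = items scored 1 (correct) or 0 (incorrect)
data = [
    [1, 1, 1, 1],
    [1, 1, 1, 0],
    [1, 1, 0, 0],
    [0, 0, 1, 0],
    [0, 0, 0, 0],
]

k = len(data[0])                               # number of items
totals = [sum(row) for row in data]            # total score per examinee
mean_total = sum(totals) / len(totals)
var_total = sum((t - mean_total) ** 2 for t in totals) / len(totals)

# p = proportion answering each item correctly, q = 1 - p
pq_sum = 0.0
for j in range(k):
    p = sum(row[j] for row in data) / len(data)
    pq_sum += p * (1 - p)

kr20 = (k / (k - 1)) * (1 - pq_sum / var_total)
print(round(kr20, 3))  # ≈ 0.747 for these illustrative data
```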
Measures how closely a single score (continuous variable) relates to a dichotomous variable (like pass/fail).
Point-biserial correlation
Measures how strongly two sets of scores are related to each other in a straight line.
Pearson r
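A small sketch (with illustrative data) computing Pearson r from its definition; the point-biserial coefficient defined above is the same computation with one variable coded 0/1.

```python
from math import sqrt

def pearson_r(x, y):
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

# Two continuous variables that rise together -> strong positive r.
hours_studied = [1, 2, 3, 4, 5]
test_scores   = [55, 60, 64, 71, 78]
print(round(pearson_r(hours_studied, test_scores), 2))  # ≈ 0.99

# Point-biserial: one variable is dichotomous (0 = fail, 1 = pass).
item_scores = [0, 0, 1, 1, 1]
print(round(pearson_r(item_scores, test_scores), 2))    # ≈ 0.82
```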
involves dividing the test into two halves and checking if scores on one half of the test are correlated with scores on the other half. A high correlation between the two halves suggests that the items in the test are measuring the same underlying construct or dimension consistently, thereby indicating internal consistency reliability.
Split-half reliability testing
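A short sketch of the procedure with hypothetical half-test scores, assuming Python 3.10+ for statistics.correlation; the correlation between the halves is then stepped up with the Spearman-Brown formula (n = 2) to estimate full-test reliability.

```python
import statistics  # statistics.correlation requires Python 3.10+

# Hypothetical totals on the odd-numbered and even-numbered halves
# of the same test for six examinees.
odd_half  = [10, 12, 9, 15, 11, 14]
even_half = [11, 13, 8, 14, 12, 15]

# Correlation between halves = reliability of a half-length test.
r_halves = statistics.correlation(odd_half, even_half)

# Spearman-Brown correction (n = 2) estimates full-length reliability.
full_length = (2 * r_halves) / (1 + r_halves)
print(round(r_halves, 3), round(full_length, 3))
```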
is a personality test that includes scales to detect unusual or atypical responses.
16 Personality Factors (16-PF)
This scale assesses whether the respondent is trying to portray themselves in a socially desirable manner rather than answering honestly. (16-PF)
Impression Management
This scale detects whether the respondent is endorsing unusual or improbable items at a higher frequency than expected. (16-PF)
Infrequency
This scale checks whether the respondent tends to agree with statements regardless of their content (yea-saying). (16-PF)
Acquiescence
is agreement to participate in something, such as a medical study, given by someone who cannot legally provide informed consent (e.g., a minor).
Assent
Consistently skews measurements in one direction due to a specific cause.
Systematic error
Unpredictable variations in measurements that occur randomly, causing measurements to vary around the true value.
Random error
A statement that is true by definition or logical deduction.
Analytical statement
A statement that can be tested and proven false through observation or experimentation.
Falsifiable statement
A statement that is self-contradictory and inherently false.
Contradictory statement
A statement that is based on an assumption or hypothesis, awaiting evidence or testing for validation.
Hypothetical statement
Among the most widely used measures of intelligence, it has been translated, adapted, and standardized in dozens of countries around the world. It measures cognitive abilities in adults with subtests for verbal comprehension, reasoning, memory, and processing speed.
Wechsler-Bellevue Intelligence Scale
A test that assesses abstract reasoning through visual pattern completion tasks.
Raven’s Progressive Matrices
Originally designed for children, this test now assesses various cognitive abilities across different age groups.
Stanford-Binet Intelligence Scale
A test designed to reduce cultural bias by measuring cognitive abilities without relying on specific cultural knowledge or language skills.
Culture Fair Intelligence Test
In psychometrics, the process of choosing test items that are appropriate to the content domain of the test. It focuses on sampling relevant content domains in tests.
Domain Sampling Model
Assumes test scores include true scores and random error, emphasizing standardized testing.
Classical Test Score Theory
Assesses how well test items distinguish between different levels of ability or trait.
Item Discriminability Analysis
Models how individuals’ abilities relate to their responses on test items, providing insights into item difficulty and discrimination.
Item Response Theory
Changes in behavior that occur when individuals know they are being studied or observed.
Reactivity
The phenomenon where higher expectations lead to an increase in performance.
Pygmalion effect
Also known as the “expectancy effect,” it occurs when researchers’ expectations about study participants influence the participants’ behavior or performance.
Rosenthal effect
When participants in a study change their behavior because they perceive themselves in competition with another group or condition.
John Henry effect
refers to the consistency of test results between two different versions of the same test, administered to the same group of people.
Alternate Forms reliability
Different participants are assigned to different conditions or groups, with each participant experiencing only one condition.
Between-subjects
The same participants experience all conditions or treatments, allowing comparisons within the same individuals.
Within-subjects
Combines elements of both between-subjects and within-subjects designs. Some factors are tested within subjects, and others are tested between subjects.
Mixed
Examines the effects of two or more independent variables (factors) by combining them in all possible ways, with each participant being assigned to one of the combinations. This can be between-subjects, within-subjects, or mixed.
Factorial
A measure of the linear correlation between two variables, indicating how well one variable predicts the other.
Pearson r
A measure of reliability for tests with dichotomous (yes/no, true/false) items, assessing the internal consistency of the test.
Kuder-Richardson 20
A measure of internal consistency or reliability for tests with multiple items, indicating how well the items measure the same underlying concept.
Cronbach’s alpha
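A minimal sketch of coefficient alpha using the standard formula α = (k / (k − 1)) · (1 − Σσ²_item / σ²_total); the Likert-type responses below are made up.

```python
# rows = respondents, columns = items on a 1-5 Likert scale
data = [
    [4, 5, 4, 4],
    [3, 3, 4, 3],
    [5, 5, 5, 4],
    [2, 3, 2, 2],
    [4, 4, 3, 4],
]

def variance(values):
    m = sum(values) / len(values)
    return sum((v - m) ** 2 for v in values) / len(values)

k = len(data[0])
item_variances = [variance([row[j] for row in data]) for j in range(k)]
total_variance = variance([sum(row) for row in data])

alpha = (k / (k - 1)) * (1 - sum(item_variances) / total_variance)
print(round(alpha, 3))  # ≈ 0.937 for these illustrative data
```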
A measure of inter-rater reliability, indicating the extent to which different raters or observers agree on their assessments.
Kappa statistics
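A small sketch of Cohen's kappa for two raters and two categories, using κ = (p_o − p_e) / (1 − p_e); the ratings are hypothetical.

```python
from collections import Counter

rater_a = ["pass", "pass", "fail", "pass", "fail",
           "pass", "pass", "fail", "pass", "fail"]
rater_b = ["pass", "pass", "fail", "fail", "fail",
           "pass", "pass", "fail", "pass", "pass"]

n = len(rater_a)
p_observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n

# Chance agreement: probability both raters choose the same category
# independently, based on each rater's own category proportions.
counts_a, counts_b = Counter(rater_a), Counter(rater_b)
categories = set(rater_a) | set(rater_b)
p_expected = sum((counts_a[c] / n) * (counts_b[c] / n) for c in categories)

kappa = (p_observed - p_expected) / (1 - p_expected)
print(round(kappa, 3))  # ≈ 0.583 for these illustrative ratings
```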
It asks “Does the test measure what it’s supposed to measure?”
Validity
“Does the test produce consistent results?”
Reliability
A type of rating scale where respondents indicate their level of agreement or disagreement with a series of statements.
Likert scale
A method where respondents compare two items or alternatives at a time and choose which one they prefer.
Paired comparisons
A format where respondents choose one or more options from a list of predefined choices.
Multiple choice
A scale where items are arranged in a hierarchical order of intensity or agreement, such that agreement with a stronger statement implies agreement with all milder statements below it.
Guttman scale
is one where the responses or results can be interpreted in multiple ways, leading to uncertainty or unclear conclusions about the respondent’s traits, beliefs, or attitudes.
ambiguous test
A test where respondents complete sentence stems to reveal subconscious thoughts and feelings.
Sacks Sentence Completion Test
A test where respondents complete sentences to measure aspects like locus of control.
Rotter Incomplete Sentence Blank
A test assessing an individual’s sense of meaning and direction in life through specific questions or scales.
Purpose in Life Test
usually states that there is no relationship or no difference between variables.
null hypothesis
states that there is a relationship or a difference between variables.
alternative hypothesis
(correct answer rate) is the proportion of test-takers who answer a particular question correctly.
Item difficulty
assesses whether an item is able to differentiate between high and low performers.
Discrimination index
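A brief sketch of both indices for a single item, using the upper/lower-group method for discrimination; all scores are illustrative, and the split into top and bottom groups of three is just for demonstration.

```python
# 1 = answered the item correctly, 0 = incorrectly; totals are overall
# test scores for the same ten examinees.
item_responses = [1, 1, 1, 0, 1, 0, 0, 1, 0, 0]
total_scores   = [48, 45, 44, 40, 38, 30, 28, 26, 22, 20]

# Item difficulty: proportion of all examinees answering correctly.
p = sum(item_responses) / len(item_responses)

# Discrimination index: proportion correct in the top-scoring group
# minus the proportion correct in the bottom-scoring group.
ranked = sorted(zip(total_scores, item_responses), reverse=True)
top    = [item for _, item in ranked[:3]]
bottom = [item for _, item in ranked[-3:]]
d = sum(top) / len(top) - sum(bottom) / len(bottom)

print(p, round(d, 2))  # 0.5 and 0.67 for these illustrative data
```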
A theory for developing and analyzing tests based on the relationship between a test taker’s ability and their likelihood of answering each question correctly.
Item Response Theory (IRT)
assesses how consistently individuals perform on the same test when it’s given to them multiple times. It’s used to gauge the reliability of measures that are expected to remain stable over time, like personality traits or intelligence
Test-retest reliability
Observable actions or behaviors that can be directly seen or measured
Overt behaviors
Internal or hidden characteristics, such as personality traits or attitudes, that are not directly observable.
Covert traits
Characteristics or attributes that are relatively consistent and enduring over time, such as personality traits or intelligence.
Stable traits
Traits that are expressed or observed more frequently in individuals; they may not necessarily be stable or enduring.
Dominant traits
are any variables other than the independent variable (the variable being studied) that could influence the dependent variable (the outcome being measured) in an experiment.
Extraneous variables
Ensures a test accurately measures the intended trait or concept.
Construct validity
Checks if a test covers all relevant aspects of the subject it aims to measure.
Content validity
Refers to whether a test appears to measure what it claims to measure at face value.
Face validity
Measures the consistency among different raters or judges when scoring the same responses or behaviors
Interrater Reliability
Managing and analyzing a collection of test items to optimize test construction and ensure validity and reliability in assessment
Item Bank Analysis
happen when the results of a first test affect the results of a second test, influencing reliability measures like test-retest correlations.
Carryover effects
refers to the consistency of measurements across different tests, raters, or situations.
External consistency
emphasizes psychologists’ duty to prioritize clients’ well-being. They recognize when personal circumstances, like emotional distress, might hinder effective service delivery. By referring clients to more capable professionals, psychologists ensure clients receive optimal care.
Beneficence
underscores psychologists’ obligation to avoid harming clients. Emotional distress can impair service effectiveness, potentially failing to meet clients’ needs or causing inadvertent harm. Referring clients when unfit mitigates these risks, upholding ethical standards and client welfare.
Non-maleficence
requires mental health professionals to warn and protect potential victims when a client poses a serious threat of harm to others (duty to warn and protect).
Tarasoff rule
Evaluates the effectiveness of individual test questions.
Item analysis
Measures how easy or hard a test question is.
Item difficulty
Is established by comparing the total test scores of those who answered a particular item correctly with the scores of those who answered it incorrectly, showing whether the item separates high from low scorers.
Discriminability Analysis
Compares average scores between two or more groups to see if they’re different.
ANOVA (Analysis of Variance)
Checks how two ranked lists (ordinal variables) are related.
Spearman rho
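A minimal sketch using the no-ties formula ρ = 1 − 6Σd² / (n(n² − 1)), where d is the difference between the two ranks each person receives; the judges' ratings are made up.

```python
def ranks(values):
    # Assign ranks 1..n from lowest to highest (assumes no tied values).
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0] * len(values)
    for rank, idx in enumerate(order, start=1):
        result[idx] = rank
    return result

judge_1 = [9.0, 7.5, 8.0, 6.0, 9.5]
judge_2 = [8.0, 7.0, 8.5, 6.5, 9.0]

r1, r2 = ranks(judge_1), ranks(judge_2)
n = len(r1)
d_squared = sum((a - b) ** 2 for a, b in zip(r1, r2))
rho = 1 - 6 * d_squared / (n * (n ** 2 - 1))
print(rho)  # 0.9: the two judges rank the contestants almost identically
```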
Tests if one group is specifically better or worse than another.
One-tailed test
Tests if there is any difference between two groups, without specifying the direction.
Two-tailed test
a component of construct validity, which shows that two theoretically unrelated constructs are actually unrelated in practice.
Discriminant Validity
refers to improvements in performance on a task due to repeated exposure and practice over time.
Practice effect
Performance declines due to tiredness or boredom from repeated testing.
Fatigue effect
Changes in performance over time due to general factors, which could include both practice and fatigue effects.
Progressive effect
The influence on performance when the same test is administered multiple times, which can include practice effects and other factors.
Test-retest effect
typically involve a detailed examination of a single instance or event, which often means there’s little to no manipulation of antecedent conditions and a focus on understanding the complexity and context of the subject.
Case studies
encompasses both reliability (the consistency of a measure) and validity (the accuracy and appropriateness of the measure in assessing what it intends to measure).
Psychometric soundness
evaluates the overall practical value and usefulness of a measurement or assessment tool beyond just its reliability and validity.
Utility
refers to accuracy in measurement
Valid
refers to consistency in measurement
Reliable
The 16-PF is a personality assessment tool that measures sixteen primary personality traits, which are grouped into five global factors (Extraversion, Anxiety, Tough-Mindedness, Independence, and Self-Control). The primary traits include:
Warmth
Reasoning
Emotional Stability
Dominance
Liveliness
Tendency to consistently rate items higher or give higher scores than warranted.
Leniency error
Tendency to rate items near the middle of the scale, regardless of their true merit.
Central tendency error
Tendency to consistently rate items lower or give lower scores than objectively warranted.
Strictness error
Refers to irregular or non-standard distribution of scores that deviates from expected patterns, often due to systematic biases in scoring or response patterns.
Distribution error
The APGAR score evaluates a newborn’s health by assessing:
Appearance, Pulse, Grimace, Activity, and Respiration
indicates good health with no immediate concerns (APGAR)
7-10
suggests the need for some assistance or stimulation (APGAR)
4-6
signals severe distress requiring urgent medical intervention (APGAR)
0-3
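A small sketch mapping a total Apgar score onto the interpretive bands above; each of the five signs is scored 0-2, so totals run from 0 to 10 (the function name and messages are illustrative).

```python
def interpret_apgar(total_score: int) -> str:
    # Each of the five signs (Appearance, Pulse, Grimace, Activity,
    # Respiration) is scored 0-2, so valid totals range from 0 to 10.
    if not 0 <= total_score <= 10:
        raise ValueError("Apgar totals range from 0 to 10")
    if total_score >= 7:
        return "good health, no immediate concerns"
    if total_score >= 4:
        return "may need some assistance or stimulation"
    return "severe distress, urgent medical intervention needed"

print(interpret_apgar(9))
print(interpret_apgar(5))
print(interpret_apgar(2))
```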
The tendency for extreme scores or values on a variable to move closer to the average or mean upon re-testing or re-assessment
Statistical regression
refer to cues or subtle indications in an experiment that suggest to participants what the researcher expects to find or how they should behave. These can unintentionally influence participants’ responses, potentially biasing the results.
Demand characteristics
Participants don’t know their group (experimental or control); experimenter does.
Single-blind experiments
Neither participants nor experimenters know group assignments.
Double-blind experiments
Participants unaware they’re being studied. Raises ethical concerns like consent and deception.
Covert operations
Measures how well the results of a test correlate with those of other tests administered at the same time, assessing the same construct.
Concurrent validity
Examines the extent to which test results predict future behaviors or outcomes.
Predictive validity
Assesses whether a test adds new and unique information beyond what existing measures provide, improving predictive accuracy or understanding.
Incremental validity
Refers to the extent to which a score on a scale or test predicts scores on some criterion measure.
Predictive validity
The extent to which a test’s scores correlate with scores on another well-established test measuring the same construct, both measured at the same time.
Concurrent validity
The extent to which a test correlates with other tests that are intended to measure the same construct.
Convergent validity
The consistency of a test’s scores over time when the same test is administered to the same group of people on two different occasions.
Test-retest reliability
is a measure used in statistics that indicates the percentage of scores that fall below a particular score.
Percentile
is a way to express a proportion or a fraction of 100. It represents a part of a whole.
Percentage
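A quick contrast between the two ideas with made-up numbers: a percentage is a part of a whole, while a percentile rank counts how many scores in a group fall below a given score.

```python
# Percentage: 45 items correct out of 60 is 75% correct.
percentage_correct = 45 / 60 * 100
print(percentage_correct)  # 75.0

# Percentile rank: the percentage of scores in the group below a score.
scores = [55, 60, 62, 65, 70, 72, 75, 80, 85, 90]
my_score = 75
percentile_rank = sum(s < my_score for s in scores) / len(scores) * 100
print(percentile_rank)  # 60.0 -> a score of 75 sits at the 60th percentile
```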
is demonstrated when scores on a test that is intended to measure one construct are not related to scores on a test that measures a different construct.
Divergent validity (also known as discriminant validity)
Participants are divided into two separate groups (control group and experimental group).
Two independent groups design
An experiment where two or more independent variables are manipulated simultaneously to observe their effects on the dependent variable.
Factorial design
Participants are divided into more than two groups, each experiencing different conditions or levels of the independent variable.
Multiple groups design
Participants are matched on relevant characteristics (such as age, gender, etc.) and then randomly assigned to different groups or conditions in the experiment to control for those characteristics.
Matched-groups design
involves establishing whether scores on a test correlate with a criterion measure that assesses the same construct or related constructs.
Criterion-related validity
Two types of criterion-related validity
Concurrent validity and Predictive validity
Compares average scores between three or more groups to see if they differ significantly. Involves one categorical factor.
One-way ANOVA
examines how two factors independently and together influence an outcome. Involves two categorical factors.
Two-way ANOVA
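A minimal sketch of the one-way case, assuming SciPy is available (scipy.stats.f_oneway handles a single categorical factor; the group scores are illustrative).

```python
from scipy import stats  # assumes SciPy is installed

# Scores from three independent groups of one categorical factor.
group_a = [12, 14, 11, 13, 15]
group_b = [16, 18, 17, 19, 16]
group_c = [10, 9, 11, 10, 12]

# Tests whether the three group means differ significantly.
f_stat, p_value = stats.f_oneway(group_a, group_b, group_c)
print(round(f_stat, 2), round(p_value, 4))
```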
is a well-known personality inventory that assesses personality traits based on the Five Factor Model (FFM) or Big Five personality traits. These traits are:
- Openness to Experience
- Conscientiousness
- Extraversion
- Agreeableness
- Neuroticism
NEO-PI-R
Personality model with three dimensions:
P: Psychoticism (tough-mindedness vs. tender-mindedness)
E: Extraversion (sociable vs. solitary)
N: Neuroticism (emotionally stable vs. unstable)
Eysenck’s PEN Model of Personality
Theory of psychological needs - Includes needs for achievement, affiliation, dominance, and nurturance
Murray’s Psychogenic Needs
Occurs when all or most participants score at the lower end of a scale due to the test being too difficult for them. This results in a clustering of scores at the bottom of the scoring range.
Floor effect
This would occur if a large number of participants scored at the upper end of a scale, indicating that the test was too easy for them.
Ceiling effect
Recalling and describing childhood memories to uncover underlying personality traits or conflicts.
Early Memories Procedure
Interpreting ambiguous scenes or pictures to reveal unconscious thoughts or emotions.
Scenotest
Responding spontaneously to stimulus words to uncover unconscious thoughts or feelings.
Free Association Technique
A projective psychological test where respondents are shown ambiguous pictures and asked to tell a story about each, revealing their perceptions, attitudes, and desires.
TAT (Thematic Apperception Test)
A questionnaire that assesses the Big Five personality traits (Neuroticism, Extraversion, Openness to Experience, Agreeableness, and Conscientiousness) in individuals.
NEO-PI-R (Revised NEO Personality Inventory)
A comprehensive personality inventory measuring 16 primary personality traits that provide a detailed picture of an individual’s personality.
16PF (16 Personality Factors)
A self-report questionnaire based on Carl Jung’s theory of psychological types, categorizing individuals into one of 16 personality types based on their preferences in four dichotomies
MBTI (Myers-Briggs Type Indicator)
A non-verbal test measuring abstract reasoning ability through visual pattern completion.
Raven’s Standard Progressive Matrices
A psychological assessment tool using true/false questions to evaluate personality traits and psychopathology
MMPI (Minnesota Multiphasic Personality Inventory)
are used to assess an applicant’s honesty, trustworthiness, and reliability. They are not specifically measures of loyalty; rather, they focus on ethical behavior and attitudes towards work-related behaviors.
Integrity tests
Assesses personality traits and psychopathology using true/false questions.
MMPI (Minnesota Multiphasic Personality Inventory)
Evaluates visual-motor skills and perceptual abilities through geometric figure copying
Bender-Gestalt Test
Measures individual preferences based on psychogenic needs theory
Edwards Personal Preference Schedule (EPPS)
Participants are assigned to different groups to compare the effects of the independent variable across groups.
Between-subjects design
Each participant experiences all levels of the independent variable to compare effects within individuals.
Within-subjects design
Combines both between-subjects and within-subjects designs, using multiple independent variables.
Mixed design
Examines the joint effects of multiple independent variables on the dependent variable.
Factorial design
is usually placed near the beginning of the test and is purposely made relatively easy to alleviate test-related anxiety.
Giveaway item
A form used to collect detailed personal information about individuals applying for positions, particularly in military contexts. It includes personal history, education, employment, and other relevant details to assess candidates’ suitability for roles.
Personal Data Sheet (PDS)
Basic tests that measure straightforward abilities or knowledge, often used by a wide range of professionals.
Class A tests
More complex tests that require specialized training or knowledge to administer and interpret, focusing on personality or specific cognitive abilities.
Class B tests
Highly specialized tests that assess complex constructs or individualized aspects of cognition, typically requiring extensive training and expertise to administer and interpret accurately.
Class C tests
Correlates a dichotomous item score with the total test score
Point-biserial method
Compares performance between very high and very low scorers on the test.
Extreme group method
Uses correlation coefficients to measure relationships between variables.
Correlation Coefficient method
is a method of scaling test scores on a nine-point standard scale with a mean of five (5) and a standard deviation of two (2).
Stanine (Standard NINE)
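One common approximation for assigning stanines: convert the raw score to a z-score, rescale to a mean of 5 and standard deviation of 2, round, and clip to the 1-9 range (the test mean and SD used below are illustrative).

```python
def to_stanine(raw: float, mean: float, sd: float) -> int:
    z = (raw - mean) / sd
    stanine = round(2 * z + 5)          # rescale to mean 5, SD 2
    return max(1, min(9, stanine))      # clip to the 1-9 scale

# Illustrative raw scores from a test with mean 100 and SD 15.
for raw in (70, 85, 100, 115, 130):
    print(raw, to_stanine(raw, mean=100, sd=15))  # stanines 1, 3, 5, 7, 9
```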
Written test for literate recruits, assessing verbal and numerical abilities, among others.
Army Alpha
Non-verbal test for illiterate or non-English speaking recruits, using pictorial materials to assess spatial and mechanical abilities.
Army Beta