Final Exam Flashcards
Define Survey
- Surveys focus on group outcomes
- Surveys allow us to collect information so that we can describe and compare how people feel about things (attitudes), what they know (knowledge), and what they do (behavior)
what factors determine the type of survey software to use
it depends on the specific needs, desires, and constraints of the company or person
the characteristics of a good survey
o Have specific and measurable objectives
o Contain straightforward questions that can be understood similarly by most people
o Have been pretested to ensure that there are no unclear questions or incorrect skip patterns
o Have been administered to an adequate sample of respondents so that the results are reflective of the population of interest
o Include the appropriate reporting of results (both verbal and written)
o Have evidence of reliability and validity
define experimental research techniques
help us determine cause and effect
define descriptive research techniques
help us describe a situation or phenomenon
what type of research techniques do surveys mostly use
descriptive research techniques
why do we develop a new test
- Meet the needs of a special group of test takers
o There are subgroups that need to be assessed (ex. a new job that did not exist before)
- Sample behaviors from a newly defined test domain
- Improve the accuracy of test scores for their intended purpose (ex. existing tests are low quality)
- Tests need to be revised/modified (ex. old items, old norms)
o Possibly uses wording that is no longer acceptable (ex. multiple personality disorder is now DID)
o Old normative groups: you cannot compare someone from the modern day to someone from 10 years ago
- Tests may assess clinically useful constructs but may be impractical for real-world clinical applications
o Ex. can we look at IQ clinically the same way we do in business
what are the 4 distinct stages of developing a test
- Test conceptualization
- Test structure and format
- Standardization
- Plan implementation (revisions)
what are the 2 questions that must be answered in order for you to know if there is a point creating a new test
will the test improve practice/research?
& will it improve our knowledge of human behavior?
what are the tests in Phase 1: test conceptualization
- conduct a review of literature and develop a statement of need for the test
- describe the proposed uses and interpretations of results from the test
- describe who will use the test and why (including statement of user qualifications)
- develop conceptual and operational definitions of the construct you intend to measure
- determine whether measures of dissimulation are needed and if so what kind
steps to defining the test universe
o Prepare a working definition of the construct (more conceptual in nature)
o Locate studies that explain the construct
o Locate current measures of the construct
what is included in the purpose of the test
what the test will measure and how the test users will use the test scores
the information that the test will provide to the test user
what do you do if there are no studies or measures on the construct
you go to the theoretical model
what do you do if there is no theoretical model
you go to the studies, measure or theoretical models of constructs which are similar and create a new theoretical model
define operational definitions
specific behaviors that represent the construct
what does a test plan or table of specification include
a definition of the construct, the content to be measured (test domain), the format for the questions, and how the test will be administered and scored
what are the steps of phase 2: specification of test structure and format
- age range appropriate for this measure
- testing format (ex. individualized or group, print or computerized) and who will complete the test (ex. the examiner, the examinee, or some other informant)
- the structure of the test (ex. subscales, composite scores, etc.) and how subscales (if any) will be organized
- written table of specifications
- item formats (given by subtests or subscales, if any, with sample items illustrating ideal items) and a summary of instructions for administration and scoring
- written explanation of methods for item development (how items will be determined; will you need content experts to help write or review items?), tryout, and final item selection
what does a test format refer to
refers to the type of questions the test will contain (usually one format per test for ease of test takers and scoring)
what are the two elements of test formats
o Stimulus (ex. a question or phrase) to which the test taker responds
o Mechanism for response (ex. multiple choice, true-false, essay, board licensing exam)
Ex. in a multiple-choice item, the stimulus is the question and the mechanism is the four or five possible answers
o May be an objective (agreement) or subjective (possible disagreement) test format
define structured record reviews
- Forms that guide data collection from existing records (ex. using a form to collect information from personnel files)
what are structured observations
forms that guide an observer in collecting behavioral information (ex. using a form to document the play behaviors of children on a playground)
define objective/ structured test types
items that have one correct answer or that provide evidence of a specific construct
types of objective/ structured test types
o Selected response
o Multiple choice
o True false, forced choice
o Likert scales (also typical)
types of subjective/ free response test types
o Essay, short answer
o Interview questions
o Fill in the blank
o Projective techniques
define subjective/ free response test types
constructed response items do not have one correct answer; whether the response is correct (or provides evidence of a specific construct) is left to the judgement of the person who scores the test
which is most preferred objective test types or subjective test types
objective test types
how do objective and subjective test formats differ in sampling
o Objective tests are faster and therefore the test developer can cover a wider array of topics, thereby increasing the available evidence of validity based on test content
o When the testing universe covers a wide array of topics objective tests are better
how do objective and subjective test formats differ in test construction
o Objective items, especially multiple-choice items, require extensive thought and development time to come up with all the balanced possible responses
o Subjective tests require fewer items and are easier to construct
o Subjective tests are better suited for testing higher order skills such as creativity
how do objective and subjective test formats differ in scoring
o Objective scoring is simple and can be done by a computer or an aide with a high degree of reliability and accuracy.
o Scoring subjective items require time consuming judgements by an expert
how do objective and subjective test formats differ in response sets
o On objective tests, test takers can guess the correct answer and they can choose answers based on social desirability
o For subjective tests, test takers may bluff or pad answers with superfluous or excessive information. Scorers might be influenced by irrelevant factors such as poor verbal or writing skills
what are distracters/alternatives
The wrong answers in a multiple choice test
pros of a multiple choice test
- More answer options (4-5) reduce the chance of guessing the correct answer
- Many items can aid in student comparison and reduce ambiguity, increase reliability
cons of a multiple choice test
- Measures narrow facets of performance
- Reading time increased with more answers
- Transparent clues (ex. verb tenses, or letter uses “a” and “an”) may encourage guessing
- Difficult to write four or five plausible choices
- Takes more time to write questions; limit use of "none of the above" or "all of the above" and of absolute wording like "never/always"
advantages of structured response/ selected response test types
- Great breadth (# of items, covering content)
- Quick scoring
- Decreases influence of possible factors that may influence error (ex. writing ability)
disadvantages of structured response/ selected response test types
- Limited depth
- Hard to write
- Difficult to assess higher levels of skills and at times you cannot measure it (writing ability and running ability)
- Guessing/ memorization vs knowledge
disadvantages of forced choice test types
has very little face validity, which may produce poor responses from test takers. Making a number of decisions between or among apparently unrelated words or phrases can become distressing, and test takers who want to answer honestly and accurately often become frustrated with forced-choice questions
advantages of forced choice test types
the items are more difficult for respondents to guess or fake
where are forced choice items test types mostly used
used primarily in personality and attitude tests
define structured interviews
have scoring and have criteria for scoring (like a rubric)
define unstructured interviewing
have no scoring or criteria for scoring
what are projective techniques
- Projective techniques are often employed in clinical setting
o Uses a highly ambiguous stimulus to elicit an unstructured response (ie the test takers “projects” his or her perception and perspective onto a neutral stimulus)
Ex. the Rorschach inkblot test
advantages of subjective items/ free response/ constructed response items
- Easier to write
- Can test higher cognitive skills
- Encourages organized/developed thoughts
- Eliminates guessing
disadvantages of subjective items/ free response/ constructed response items
- Difficult to grade- influence of feigning and impact of writing ability
- Judgement error (ex. interrater reliability)
- Require an objective scoring key prepared in advance
- Fewer items
what type of response item format is a likert scale
a typical response item format
define performance assessments
require test takers to directly demonstrate their skills and abilities to perform a group of complex behaviors and tasks ex. an audition of a musician trying out for a band
o The setting in which these tasks are demonstrated is made as similar as possible to the conditions that will be found when the tasks are actually performed
define simulations
require test takers to demonstrate their skills and abilities to perform a complex task
o the tasks are not performed in the actual environment in which the real tasks will be performed often due to safety or cost- related concerns
define portfolios
collection of work products that a person gathers over time to demonstrate his or her skills and abilities in a particular area
what is dissimulation
o When a person misrepresents himself or herself in a positive or negative manner
o Decreases the reliability and validity of the measurement
define response sets
Are patterns of responding that result in misleading information and limit the accuracy and usefulness of the test scores
reasons why people lie, fake, or answer randomly on a test
o 1. Information requested is too personal
o 2. Answer items carelessly
o 3. May feel coerced into completing the test or are not motivated to give maximum effort
o 4. Believe that is how they are supposed to answer
define social desirability
- Some test takers choose socially acceptable answers or present themselves in a favorable light
define faking
some test takers may respond in a particular way to cause a desired outcome
define random responding
responding to items in a random fashion by marking answer without reading or considering them
reasons why people fake bad
o Cry for help
o Want to plea insanity in court
o Want to avoid draft in military
o Want to show psychological damage
define acquiescence
A tendency to agree with the ideas or behaviors presented
They believe this is how they are supposed to answer
suggestions for writing good test items
o Consider the time necessary to complete
o Prepare the answer key in advance
o Use multiple independent scorers/raters
o Score essays anonymously
o Identify item topics by consulting the plan
o Be sure that each item is based on an important learning objective or topic
o Write items that assess information or skills drawn only from the testing universe
o Write each item in a clear and direct manner
o Use vocabulary and language appropriate for the target audience
o Avoid using slang or colloquial language
o Make all items independent
o Ask someone else (preferably a subject matter expert) to review items in order to reduce unintended ambiguity and inaccuracies
what should administration instructions include
o Whether the test should be administered in a Group or individual administration
o Requirements for location (ex. quiet, privacy)
o Required equipment (computer, pencil)
o Time limits or approximate completion time
o Script for administrator and answers to questions test takers may ask
o Credentials or training required for the test administrator
define population
all members of the target audience
define sample
administering a survey to a representative subset of the population
define probability sampling
the type of sampling that uses statistics to ensure that a sample is representative of a population
define simple random sampling
every member of a population has an equal chance of being chosen as a member of the sample
o Ex: if your population is every student at GH, for this sampling method you could put the name of every single student in a hat and randomly select participants
simple random sampling, stratified random sampling, and cluster sampling are all examples of what sampling method
probability sampling methods
define systematic random sampling
every nth person is chosen (ex. every 3rd person)
define stratified random sampling
population is divided into subgroups (ex. age, gender, SES, race)
o the population is divided into subgroups or strata
o a random sample is selected from a stratum
o A certain number of people have to be in the sample from each subgroup (ex. there must be at least 50% males and 50% females)
define cluster sampling
used when it is not feasible to list all the individuals who belong to a particular population; a method often used with surveys that have large target populations (ex. east, west, central, north, south)
Dividing the population into clusters and picking a certain number of clusters to be in your sample. Ex. you divide the population into 5 clusters and pick 3 of those clusters to be part of your sample
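The probability sampling methods above can be sketched in Python. This is a minimal illustration with a hypothetical 100-person population (the names, group split, and cluster count are invented for the example), using only the standard-library `random` module:

```python
import random

population = [f"student_{i}" for i in range(100)]  # hypothetical population

# Simple random sampling: every member has an equal chance of selection
simple = random.sample(population, k=10)

# Systematic random sampling: every nth person (ex. every 3rd)
systematic = population[::3]

# Stratified random sampling: divide into subgroups (strata),
# then draw a random sample from each stratum
strata = {"group_a": population[:50], "group_b": population[50:]}
stratified = [p for group in strata.values() for p in random.sample(group, k=5)]

# Cluster sampling: divide the population into 5 clusters,
# then pick 3 whole clusters to form the sample
clusters = [population[i::5] for i in range(5)]
chosen = random.sample(clusters, k=3)
cluster_sample = [p for c in chosen for p in c]
```

Note the key contrast: stratified sampling draws a few people from every subgroup, while cluster sampling keeps every person from a few subgroups.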
define non probability sampling
a type of sampling in which not everyone has an equal chance of being selected from the population
define convenience sampling
the survey researcher uses any available group of participants to represent the population
define sample size
refers to the number of people needed to represent the target population accurately
o The more similar the members of the population the smaller the sample needs to be. The more dissimilar the members of the population the larger the sample needs to be
o The fewer the people chosen to participate in the test, the more error the survey results are likely to include
define homogeneity of the population
how similar the people in your population are to one another (more similar the smaller the size)
define sampling error
a statistic that reflects how much error can be attributed to the lack of representation of the target population by the sample of respondents chosen
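One common way to quantify sampling error for a survey proportion (an assumption; the card above does not give a formula) is the standard error of a proportion, sqrt(p(1-p)/n). A minimal sketch:

```python
import math

def sampling_error(p, n):
    """Standard error of a sample proportion p with sample size n.
    Assumed formula: sqrt(p * (1 - p) / n)."""
    return math.sqrt(p * (1 - p) / n)

# Larger samples shrink the error, matching the sample-size note above
se_small = sampling_error(0.5, 100)   # 0.05
se_large = sampling_error(0.5, 1600)  # 0.0125
```

This also illustrates the earlier point that fewer respondents means more error: quadrupling n halves the standard error.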
define distributing the survey
how will the instrument/ test be given to the respondent (mail, phone, weblink, in person)
what is a cumulative/summative model of scoring method
o Assumes that the more a test taker responds in a particular fashion the more he/she has of the attribute being measured (ex. more “correct” answers, or endorses higher numbers on a Likert scale)
o The test taker receives one point for each correct answer and the total number of correct answers becomes the raw score on the test
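The cumulative/summative model above is just a sum of correct answers; a minimal sketch (the answer key and responses are hypothetical):

```python
def cumulative_score(responses, answer_key):
    """One point per correct answer; the total is the raw score."""
    return sum(1 for r, k in zip(responses, answer_key) if r == k)

raw = cumulative_score(["a", "c", "b", "d"], ["a", "c", "c", "d"])  # 3 correct
```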
define semantic differential
adjective pairs at each end of the continuum
what is a ipsative model of scoring methods
test takers are given 2 or more options to choose from
o The ipsative model only tells you where test takers stand relative to themselves on the constructs that the test is designed to measure
what scoring method model is most commonly used
Cumulative/summative model
what is the categorical model scoring method
is used to put the test taker in a particular group or class
o Test takers' scores are not compared to those of other test takers; rather, the scores on various scales within the test taker are compared (which scores are high and low), or the pattern of responses
o Looks at the patterns within to see if you are more or less of something and then puts you in a category
what is a pilot test
- A scientific evaluation of the test’s performance
- Administering the test to a sample of the test's target audience and analyzing the data obtained from the pilot test
in a pilot test what is the depth and breadth of the pilot test dependent on
depends on the size and complexity of the target audience and the construct being measured
define item analysis
how developers evaluate the performance of each test item
define item difficulty
percentage of test takers who respond correctly (vs. the total # of people); assessed with the p value (proportion of correct responses)
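Item difficulty is simple to compute; a minimal sketch where responses to one item are coded 1 (correct) / 0 (incorrect):

```python
def item_difficulty(item_responses):
    """p value: proportion of test takers who answered the item correctly.
    item_responses is a list of 1 (correct) / 0 (incorrect) codes."""
    return sum(item_responses) / len(item_responses)

p = item_difficulty([1, 1, 0, 1, 0, 1, 1, 1, 0, 1])  # 0.7
```

Per the p-value guidelines later in these cards, a p of .7 leans toward "too easy," while values near .5 are optimal.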
define discrimination index
compares the performance of those who obtained very high test scores (the upper group [U]) with the performance of those who obtained very low test scores (the lower group [L]) on each item.
equation for calculating the upper group (U)
U = (# of people who responded correctly )/(total # of people in the upper group) × 100
equation for calculating the lower group (L)
L = (# of people in the lower group who responded correctly )/(total # of people in the lower group ) × 100
equation for calculating discrimination index
Discrimination Index= U -L
what number of discrimination index is best
30 or higher (.30 when U and L are expressed as proportions rather than percentages)
what does it mean if you have a low or negative discrimination index
- If the D value is low or negative, the item is not discriminating between high scorers and low scorers; in this case the test developers have to discard and rewrite items that have low or negative D values
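The equations above translate directly into code; a minimal sketch with invented counts for a 10-person upper group and a 10-person lower group:

```python
def discrimination_index(upper_correct, upper_total, lower_correct, lower_total):
    """D = U - L, with U and L computed as percentages per the equations above."""
    U = upper_correct / upper_total * 100
    L = lower_correct / lower_total * 100
    return U - L

# 9 of 10 high scorers got the item right, 4 of 10 low scorers did
D = discrimination_index(9, 10, 4, 10)  # U = 90, L = 40, D = 50
```

A D this far above 30 means the item separates high and low scorers well; a D near 0 or negative would flag the item for rewriting.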
define item-response theory (IRT)
an estimate of the ability of test takers that is independent of the difficulty of the items presented, as well as estimates of item difficulty
define item characteristic curve (ICC)
the line that results when we graph the probability of answering correctly against the level of ability on the construct being measured
o We can determine the difficulty of an item on the ICC by locating the point at which the curve indicates a probability of .50 of answering correctly. The higher the ability level associated with this point, the more difficult the question
what is an optimal p value
0.5
what does a p value of .7, .8, or .9 mean
question or test is too easy
what does a p value of .2, .3, or .4 mean
question or test is too hard
define item bias
when an item is easier for one group than for another group
o the preferred method of researchers involves the computation of item characteristics by group (ex. men and women) and using the ICCs to make decisions about item bias
define interitem correlation matrix
displays the correlation of each item with every other item
o Provides important information for increasing the test’s internal consistency
o Ideally each item should be correlated with every other item measuring the same construct and should not be correlated with items that do not measure the same construct
define phi coefficients
the result of correlating two dichotomous (having only two values) variables
define item total correlation
a measure of the strength and direction of the relation between the way test takers responded to one item and the way they responded to all of the items as a whole
what does a negative item total correlation mean
the people who answered a question correctly actually did worse on the test than people who got the question wrong
o These questions should be revisited and edited or taken out
define validation
the process of obtaining evidence that the test effectively measures what it is supposed to measure (ie. Reliability and validity)
define cross-validation
a final round of test administration to another sample from the target population
what is written in the Manual
- Describes all the work that was undertaken in the first 3 phases
- Include an adequate description of the test development process so others can replicate what was accomplished and for users to evaluate the usefulness of the test for their purposes
o The reason for this is so that the user is able to use the instrument and, if they ever want to replicate what was done, they can
- The specific contents will vary according to the type of test and its applications, and some measures may have special legal requirements
pros and cons of publishing the test with a publisher
o Pros: the publisher has expertise that you will get through the test publisher
The publisher has a larger network and range for marketing the test
o Cons: the test publisher owns your test; you are just the author
pros and cons of self publishing your test
o Pros: you own the test
o Cons: you most likely do not have a great reach or range and therefore cannot network or market that much. Publishers typically have a bigger network