Final Flashcards
An evaluation that uses the data collected before a program begins. Examples include pilot projects, baseline surveys, and feasibility studies.
Design evaluation
Coefficient of stability: Evaluator administers the same instrument twice, separated by a short period of time. Results are compared, using a statistic such as a correlation coefficient
Reliability coefficient/ test-retest reliability
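Worked example (a minimal sketch, not from the textbook): the scores below are hypothetical, and Pearson's r is assumed as the correlation coefficient used to compare the two administrations.

```python
import math

def pearson_r(x, y):
    """Pearson correlation between two equal-length lists of scores."""
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    cross = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    ss_x = sum((a - mean_x) ** 2 for a in x)
    ss_y = sum((b - mean_y) ** 2 for b in y)
    return cross / math.sqrt(ss_x * ss_y)

# Hypothetical scores for the same five people, tested twice a short time apart.
time1 = [10, 12, 15, 18, 20]
time2 = [11, 13, 14, 19, 21]

stability = pearson_r(time1, time2)
print(round(stability, 3))
```

A value near 1 would suggest that people kept roughly the same relative standing across the two administrations.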
Evaluator administers two equivalent versions of the same instrument (parallel forms) to the same group of people. Results are compared, using a coefficient of stability.
Alternate-form coefficient
A nonexperimental research method relying on questionnaires or interview protocols
Survey
The extent to which researchers can state with confidence that the treatment, rather than some other factor, caused the observed change
Internal validity
Study conducted at a single time period, and data are collected from multiple groups
Cross-sectional study
Data are collected at two or more points in time
Longitudinal study
Design that combines cross-sectional and longitudinal elements by following two or more age groups over time
Cohort-sequential design
Two observers’ data are compared to see whether they are consistently recording the same behaviors when they view the same events. Statistical procedures such as correlation or percentage of agreement can be used to establish this type of reliability.
Interrater reliability
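Worked example (a minimal sketch with hypothetical observation codes): percentage of agreement is the share of events that both observers recorded the same way.

```python
# Hypothetical codes from two observers watching the same eight events.
rater_a = ["on", "off", "on", "on", "off", "on", "off", "on"]
rater_b = ["on", "off", "off", "on", "off", "on", "on", "on"]

# Count the events on which the two observers recorded the same code.
agreements = sum(a == b for a, b in zip(rater_a, rater_b))
percent_agreement = 100 * agreements / len(rater_a)
print(percent_agreement)
```

Percentage of agreement does not correct for agreement that would occur by chance, so it is only one of the possible indices.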
Interpretive research approach relying on multiple types of subjective data and investigation of people in particular situations in their natural environment
Qualitative research
Use of multiple data sources, research methods, investigators, and/or theories/ perspectives to cross-check and corroborate research data and conclusions
Triangulation
used to determine whether a single observer is consistently recording data over a period of time.
Intrarater reliability
To what degree does all accumulated evidence support the intended interpretation of scores for the proposed purpose?
construct validity
commonly used data collection instruments or procedures designed to measure personality, aptitude, achievement, and performance.
Tests
Items on the test represent content covered in the program (e.g., did the teacher teach the children that blue on the map means water?). Evaluators can work with content specialists to list the content that is part of the program, and can compare the test items to see whether they correspond.
Content-related evidence
a self-report data collection instrument that is filled out by research participants.
Questionnaire
The researcher takes independent samples from a general population over time and the same questions are asked.
Trend study
a situation where the interviewer asks the interviewee a series of questions
Interview
indicates that the measure actually reflects current or future behaviors or dispositions.
Criterion-related evidence
a situation where a focus group moderator keeps a small and homogeneous group (of 6–12 people) focused on the discussion of a research topic or issue.
Focus group
Evaluators need to be aware of the consequences of using data, especially with regard to the potential to worsen inequities.
Consequential evidence
Researcher watches and records events or behavioral patterns of people
Observation
Observation conducted in real-world situations
Naturalistic observation
Observation conducted in lab setting set up by the researcher
Laboratory observation
Evaluators need to stay on site for sufficient time to “get the story right.” If a study is conducted over too short a time or interviews are conducted with too few people, it is possible that an evaluator will reach “premature closure”
Prolonged and substantial engagement
the ability to generalize the results of a study to the population from which the research population was drawn.
External validity
A word that produces an emotionally charged reaction
Loaded term
A question that suggests how the participants should answer
leading question
Asking about two or more issues in a single question
Double-barreled question
Nonoverlapping response categories
Mutually exclusive categories
Response categories that cover the full range of possible responses
Exhaustive categories
An ordered set of response choices, such as a 5-point rating scale, measuring the direction and strength of an attitude
Rating scale
Descriptors placed on points on a rating scale
Anchor
Participant must select from the two response choices provided with an item
Binary forced-choice approach
A scaling technique that is used to measure the meaning that participants give to attitudinal objects or concepts and to produce semantic profiles
Semantic differential
An item directing the participant to different follow-up questions depending on the initial response
Contingency question
Tendency for a participant to respond in a particular way to a set of items
Response set
Evaluators need to be aware of their assumptions, hypotheses, and understandings, and of how these change over the period of the study.
Progressive subjectivity
those incurred to provide a treatment or a service
direct costs
resources lost due to the disorder.
indirect costs
Events that occur during the study other than the experimental treatment effect that can influence the results
History threat
Design that is intended to simplify life for everyone by making products, communications, and the built environment more usable by as many people as possible at little or no extra cost
Universal design
naturally occurring physical or psychological changes
Maturation threat
A monitoring system that would involve community members in determining what indicators of change are important to them
Most Significant Change Method
Christian Commission for Development in Bangladesh and Rick Davies
Administration of a pretest that affects participants’ performance on a posttest (i.e., participants become “testwise”).
testing threat
Having pretests and posttests that differ in terms of difficulty; this can lead to seeming changes that are really due to the difference in the tests.
Instrumentation
Having extreme groups in a study (i.e., people very high or very low on a particular characteristic); it is possible that changes will be seen on the dependent variable because of this threat.
statistical regression
Differences between the experimental and control groups on important characteristics, other than receipt of the intervention.
differential selection
Differential dropouts of participants from either the experimental or control groups.
experimental mortality
Concern that the treatment was not implemented as intended
treatment fidelity
monetary (i.e., dollar) values are estimated for the resources used (i.e., costs) and for the program effects (i.e., benefits), and these two components (costs and benefits) are then compared to determine the worth of a program.
cost-benefit analysis
based on cost and effect data (rather than cost and benefit data as in benefit-cost analysis).
cost-effectiveness analysis
What a concept means in abstract or theoretical terms
Conceptual definition of a sample
What you will observe and/or measure; links the concept you want to sample to the real world
Operational definition of a sample
Group to which you wish to generalize your results
Target population
List of people who match the conceptual definition
Experimentally accessible population
List of people in the experimentally accessible population
Sampling frame
Match between accessible population and target population
population validity
voluntary consent without threat or undue inducement; it includes knowing what a reasonable person would want to know before giving consent (informed), and explicitly agreeing to participate (consent).
informed consent
collecting, analyzing, storing, and reporting data in such a way that the data cannot be traced back to the individual who provides them.
confidentiality
involves the selection of a sample from a population in a way that allows for an estimation of the amount of possible bias and sampling error.
Probability-based sampling
Those in which every member of a population has a known nonzero probability of being included in the sample.
Random samples
Every member of the population has an equal and independent chance of being selected.
Simple random sampling
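Worked example (a minimal sketch; the population list and the seed are hypothetical): drawing without replacement so that every member has the same chance of selection.

```python
import random

# Hypothetical population of 100 people.
population = [f"person_{i}" for i in range(1, 101)]

# A seeded generator is used only so the example is repeatable.
rng = random.Random(42)

# Draw 10 distinct members; each member is equally likely to be chosen.
sample = rng.sample(population, k=10)
print(sample)
```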
A robust description of the context in which qualitative research is conducted
Thick description
Selection of easily obtainable participants for sample group and usually the cheapest and fastest way of obtaining a sample group; interviewers happen to be available at a program site.
Convenience sampling
When will you have a poor strength of treatment?
A. When the participants are sensitized to the posttest.
B. When the evaluator uses different types of measurement for dependent variables.
C. When the treatment “dose” was not strong enough to produce the expected change.
D. When the evaluator systematically conducts an intervention at different times of the day.
C
In evaluation, a limitation of traditional definitions of needs assessment (according to your textbook) is that they focus only on needs and lack any attention to assets.
A. True
B. False
True
The assumption that all people within a particular subgroup are similar to each other in terms of their other background characteristics, or similar enough where differences are not identified.
Myth of homogeneity
Which of the following is(are) needed to infer causality?
A. X precedes Y
B. X is related to Y
C. Confounding variables can be ruled out as causes
D. All of the above are required.
D
When an extraneous variable systematically varies with the independent variable and influences the dependent variable, it is called: A. Another dependent variable B. A confounding variable C. A moderating variable D. An unreliable variable
B
Data analysis tends to be an ongoing and iterative (nonlinear) process in qualitative research.
The cyclical process of collecting and analyzing data during a single research study.
Interim analysis
The difference between a sample statistic and the corresponding population parameter.
Sampling error
A major characteristic of experimental and quasi-experimental designs is an independent variable that can be manipulated.
A. True
B. False
True
Used in telephone surveys; computer generates a random list of phone numbers.
Random-digit dialing
If an evaluation is focused on addressing human rights, with what paradigm is it most closely aligned? A. Postpositivist B. Transformative C. Constructivist D. Pragmatic
B
Take every nth name off a list. Suppose you have 1,000 names on the list and you need a 10% sample. You pick a random number between 1 and 10, start there, and then take every 10th name on the list.
Systematic sampling
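Worked example (a minimal sketch of the 1,000-name case described on this card; the function name and the name list are hypothetical).

```python
import random

def systematic_sample(names, interval, rng):
    """Pick a random start between 1 and interval, then take every interval-th name."""
    start = rng.randint(1, interval)      # 1-based random starting position
    return names[start - 1::interval]     # start there, then step by the interval

# Hypothetical list of 1,000 names; a 10% sample means an interval of 10.
names = [f"name_{i}" for i in range(1, 1001)]
sample = systematic_sample(names, interval=10, rng=random.Random(0))
print(len(sample))
```

With 1,000 names and an interval of 10, any starting position from 1 to 10 yields exactly 100 names.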
Which of the following is NOT a purpose for an evaluation?
A. To identify needed inputs, barriers, and facilitators to program development or implementation.
B. To determine inequities on the basis of gender, race, ethnicity, disability, and other relevant dimensions of diversity.
C. To determine how a program can be made to look positive despite its lack of impact/effect.
D. To demonstrate that accountability requirements are fulfilled.
C
If there are different groups (strata) that you want to be sure to include, then you can divide the population into subgroups first and then randomly sample from the subgroups.
Stratified sampling
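Worked example (a minimal sketch; the strata, group sizes, and function name are hypothetical): divide the population into subgroups first, then sample randomly within each one.

```python
import random

def stratified_sample(population, stratum_of, per_stratum, rng):
    """Group units by stratum, then draw a random sample from each stratum."""
    strata = {}
    for unit in population:
        strata.setdefault(stratum_of(unit), []).append(unit)
    return {name: rng.sample(members, per_stratum)
            for name, members in strata.items()}

# Hypothetical population: 30 teachers and 50 parents.
population = [("teacher", i) for i in range(30)] + [("parent", i) for i in range(50)]

sample = stratified_sample(population, stratum_of=lambda unit: unit[0],
                           per_stratum=5, rng=random.Random(1))
print({name: len(members) for name, members in sample.items()})
```

Taking the same number from each stratum is one design choice; sampling in proportion to stratum size is another.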
What are some designs used by the methods branch to determine the effectiveness of an intervention? A. Experimental, quasi-experimental, single-group, and survey B. Narrative, phenomenological, ethnographic, and case study C. Dialectical, transformative, and case study D. Gender, race, and class analysis
A
Use with naturally occurring groups (e.g., classrooms, school districts, city blocks). Units are randomly selected from full list of possible sites. Then you can collect data from the members in the randomly selected unit.
Cluster sampling
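Worked example (a minimal sketch; the classrooms and students are hypothetical): whole units are selected at random, and data are then collected from the members of the selected units.

```python
import random

# Hypothetical full list of sites: 8 classrooms with 5 students each.
classrooms = {
    f"classroom_{c}": [f"student_{c}_{s}" for s in range(1, 6)]
    for c in range(1, 9)
}

rng = random.Random(7)

# Randomly select 2 classrooms (the clusters), not individual students.
chosen = rng.sample(sorted(classrooms), k=2)

# Collect data from every member of each selected classroom.
participants = [student for room in chosen for student in classrooms[room]]
print(chosen, len(participants))
```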
Which of the following is legitimately considered (according to your textbook authors) a type of evaluation?
A. Organizational assessment
B. Participatory evaluation
C. Cost analysis/assessment
D. All of the above are considered legitimate
D
Use a combination of sampling strategies over the course of the study (e.g., start with cluster sampling, then use simple random sampling within clusters).
Multistage sampling
Cost Analysis is defined as an evaluation’s determination of whether a program’s effect was worth its cost.
A. True
B. False
A
Pilot testing data collection instruments is not very important.
A. True
B. False
B
Choose unusual or special individuals (e.g., highly successful or unsuccessful school principals).
Extreme or deviant sampling
Identify instances where the phenomenon of interest is strongly represented. Look for rich cases that are not necessarily extreme.
Intensity sampling
What are some examples of qualitative data collection options as listed by your textbook authors?
A. Observations, interviews, review of artifacts, and focus groups
B. Performance assessments and structured observations
C. Interviews, surveys with random samples, and norm-referenced tests
D. Structured observations, criterion-referenced tests, and portfolios
A
What are some forms of evidence used to support validity/credibility in qualitative data collection? A. Peer debriefing B. Member checks C. Persistent observations D. All of the above
D
Choose individuals that represent maximum variation of the phenomenon (e.g., teachers in isolated rural, suburban, and inner-city areas).
Maximum variation sampling
Identify strongly homogeneous cases; find individuals who share relevant characteristics and experiences.
Homogeneous sampling
What is universal design as discussed by the textbook authors?
A. A way to simplify life for everyone by making products, communications, and the built environment more useable by as many people as possible.
B. A technique for making tests more accessible to disabled people.
C. Use of multiple languages in a data collection instrument
D. All of the above.
A
In most cases, use homogeneous groups (e.g., if service providers and participants are included in the same focus group, this might yield biased results).
Focus group sampling
This is the opposite of the extreme or deviant sampling strategy; you want to identify the typical or average case.
Typical case sampling
What is INTRArater reliability?
A. It is used to determine whether a single rater or observer is consistent over time.
B. It compares the data of two raters or observers to see whether they are rating the same behavior consistently.
C. It is used to compare two kinds of data collection to see whether they are describing the same event.
D. It is used to compare when different raters are administering similar instruments
A
Strategy combines the identification of strata of relevant subgroups with purposeful selection from those subgroups.
Stratified purposeful sampling
Use cases that can make a point dramatically or are important for other reasons. Patton (2002b) says that the key to identifying a critical case is “If it’s true of this one case, it’s likely to be true of all other cases” (p. 243).
Critical case sampling
An evaluator created a test. In order to test reliability, he had the participants take the test and he analyzed the results to examine the consistency of their responses. What is this an example of? A. Repeated measures reliability B. Intraparticipant reliability C. Internal-consistency reliability D. Multi-dimensional reliability
C
Start with key informants who are then asked to recommend others you should talk with—some who agree with them and some who disagree with them.
Snowball or chain sampling
Set up criteria to specify what characteristics people in the study need to have.
Criterion sampling
Reliability and validity are commonly used terms to describe the quality of quantitative data collection. What does validity mean in this situation?
A. Does the instrument measure cultural competence?
B. Does the instrument (as used with the participants) really measure what it is supposed to measure?
C. Does the instrument measure what it is supposed to measure consistently?
D. Does the instrument reliably measure what it is supposed to measure over time?
B
If the evaluation is focused on a theoretical construct such as creativity, you need to describe the meaning of that construct, and then identify individuals who theoretically exemplify that construct.
Theory-based sampling
In multiple regression, when we say that we control for the effects of some variable(s) we are:
A. statistically adjusting or subtracting the effects of a variable to see what a relationship would have been without it
B. actually removing a variable from a model so that it does not interact with the effects of other variables
C. changing the mediating capabilities of an endogenous variable
D. changing the mediating capabilities of an exogenous variable
A
Look for cases that both confirm and disconfirm emerging hypotheses.
Confirming and disconfirming case sampling
What kind of data collection is useful when you want people to discuss a particular topic? A. Case studies B. Interviews C. Focus groups D. Open-ended questionnaires
C
Selection of individuals emerges as the study progresses; you do not know a priori who will need to be included.
Opportunistic sampling
What are some critical issues related to data collection?
A. Language of participants
B. Literacy level of participants
C. Use of a dominant or colonizing language
D. All of the above
D
Randomly choose individuals from a purposefully defined group.
Purposeful random sampling
The ______ sampling strategy chooses unusual or special individuals (e.g., highly successful or unsuccessful school principals). A. Maximum variation sampling B. Stratified purposeful sampling C. Extreme or deviant sampling D. Snowball or chain sampling
C
Determine whether there is a political reason for including particular areas and individuals for the credibility and perceived usefulness of the study.
Politically important case sampling
What is the definition of a sampling frame in your textbook (Mertens and Wilson)?
A. The target population of your study
B. List of all the people in the experimentally accessible population.
C. The people you plan to observe
D. None of the above
B
Which of the following is NOT one of the 13 categories of disabilities in the Individuals with Disabilities Education Improvement Act of 2004? A. Emotional disturbance B. Speech or language impairment C. Visual impairment D. Majority-minority group membership
D
The set of cases selected from the population
Sample
Which of the following is a type of purposeful/theoretical sampling? A. Simple random sampling B. Critical case sampling C. Systematic sampling D. Cluster sampling
B
The full group to which one wants to generalize
Population
Which major sampling option is more commonly used in the Values Branch? A. Probability-based sampling B. Multistage sampling C. Theoretical/Purposeful sampling D. Simple Random Sampling
C
A numerical index based on sample data
Statistic
What is the opposite of deviant case or extreme sampling? A. Typical case sampling B. Homogeneous sampling C. Critical case sampling D. Snowball sampling
A
A numerical characteristic of a population
Parameter
What kind of sampling strategy begins with a random start, includes that element, and then includes every nth name off a list? A. Interval sampling B. Systematic sampling C. Random digit sampling D. Multistage sampling
B
The type of statistical analysis focused on describing, summarizing, or explaining a set of data
Descriptive statistics
According to your text, the “myth of homogeneity” means assuming that all people within a particular subgroup are similar to each other in terms of their other background characteristics, or at least sufficiently similar that you do not have to focus on those differences.
A. True
B. False
True
The type of statistical analysis focused on making inferences about populations based on sample data
Inferential statistics
What is nested sampling?
A. You gather samples using a variety of methods.
B. You use different people from different populations
C. You have identical samples for both the quantitative and qualitative parts of the study.
D. Data are collected from a group using one method; then a subset of that group is selected to provide data using another method.
D
The theoretical probability distribution of the values of a statistic that would result if you selected all possible samples of a particular size from a population
Sampling distribution
Which sampling strategy uses cases that can make a point dramatically or are important for other reasons? A. Stratified random sampling B. Critical case sampling C. Homogeneous sampling D. Politically important case sampling
B
The theoretical probability distribution of the means of all possible samples of a particular size selected from a population
Sampling distribution of the mean
The standard deviation of a sampling distribution
Standard error
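Worked example (a minimal sketch with hypothetical scores): for the sampling distribution of the mean, the standard error is commonly estimated from a single sample as the sample standard deviation divided by the square root of the sample size.

```python
import math
from statistics import stdev

# Hypothetical sample of eight scores.
scores = [4, 8, 6, 5, 3, 7, 9, 6]

# Estimated standard error of the mean: s / sqrt(n).
standard_error = stdev(scores) / math.sqrt(len(scores))
print(round(standard_error, 4))
```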
A statistic that follows a known sampling distribution and is used in significance testing
Test statistic
An evaluation that is allowed to evolve throughout the course of the project. Examples include participatory, qualitative, critical, hermeneutical, bottom-up, collaborative, and transdisciplinary approaches.
Emergent evaluation
The branch of inferential statistics focused on obtaining estimates of the values of population parameters
estimation
Use of the value of a sample statistic as one’s estimate of the value of a population parameter
Point estimation
Placement of a range of numbers around a point estimate
Interval estimation
Use a word or short phrase to summarize the topic found in a passage of the data.
Descriptive codes
Use the exact language of the participants as a code.
In vivo codes
Captures actions in the data and usually ends in “-ing.”
Process coding
Labels emotions that are expressed by the participants.
Emotion coding
Can reflect the values, attitudes, or beliefs expressed by participants.
Values coding
A hypothesis states that there is no difference between the scores of the experimental group and the control group
Null hypothesis
The hypothesis states that there will be a difference between means in the population.
Alternative hypothesis
An interval estimate inferred from sample data that has a certain probability of including the true population parameter.
Confidence interval
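Worked example (a minimal sketch with hypothetical scores): a 95% interval around a sample mean. A normal approximation is assumed here for simplicity; with a sample this small, a t-based critical value would usually be preferred.

```python
import math
from statistics import NormalDist, mean, stdev

# Hypothetical sample of eight scores.
scores = [4, 8, 6, 5, 3, 7, 9, 6]

point_estimate = mean(scores)                        # the point estimate
standard_error = stdev(scores) / math.sqrt(len(scores))
z = NormalDist().inv_cdf(0.975)                      # about 1.96 for 95% confidence

# The interval estimate: a range of numbers placed around the point estimate.
lower = point_estimate - z * standard_error
upper = point_estimate + z * standard_error
print(round(lower, 2), round(upper, 2))
```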
A set of data, where the rows are “cases” and the columns are “variables”
Data set
The analysis guides the design of subsequent stages of a study and leads to further analysis that integrates the data from these stages.
Sequential integration
Data arrangement in which the frequency of each unique data value is shown
Frequency distribution
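Worked example (a minimal sketch with hypothetical ratings): tallying how often each unique value occurs.

```python
from collections import Counter

# Hypothetical responses on a 5-point rating scale.
ratings = [3, 5, 4, 3, 2, 5, 3, 4]

# Count how many times each unique value appears.
frequency = Counter(ratings)
for value in sorted(frequency):
    print(value, frequency[value])
```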
The branch of inferential statistics focused on determining when the null hypothesis can or cannot be rejected in favor of the alternative hypothesis
Hypothesis testing
Depicting frequencies and distribution of a quantitative variable
Histogram Graph
The point at which one would reject the null hypothesis and accept the alternative hypothesis
Alpha level
The average deviation of data values from their mean in squared units
Variance
The area on a null hypothesis sampling distribution where the observed value of the statistic, if it fell in this area, would be considered a rare event
Critical region
The square root of the variance
Standard deviation
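Worked example (a minimal sketch with hypothetical data): the "average squared deviation" wording corresponds to dividing by n (the population formula); the sample formula divides by n - 1 instead, so both are shown.

```python
from statistics import pstdev, pvariance, variance

# Hypothetical data values; their mean is 5.
data = [2, 4, 4, 4, 5, 5, 7, 9]

population_variance = pvariance(data)   # sum of squared deviations / n
population_sd = pstdev(data)            # square root of the variance
sample_variance = variance(data)        # sum of squared deviations / (n - 1)

print(population_variance, population_sd, round(sample_variance, 3))
```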
The likelihood of the observed value (or a more extreme value) of a statistic, if the null hypothesis were true
Probability value (p value)
Conclusion that an observed finding would be very unlikely if the null hypothesis were true
statistically significant
Used to determine if the difference between the means of two groups is statistically significant
Independent samples t test
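Worked example (a minimal sketch with hypothetical scores): the pooled-variance form of the test statistic is assumed, which treats the two groups as having equal variances. The function name is illustrative.

```python
import math
from statistics import mean, variance

def independent_t(group_a, group_b):
    """Return the pooled-variance t statistic and its degrees of freedom."""
    n_a, n_b = len(group_a), len(group_b)
    df = n_a + n_b - 2
    pooled = ((n_a - 1) * variance(group_a) + (n_b - 1) * variance(group_b)) / df
    standard_error = math.sqrt(pooled * (1 / n_a + 1 / n_b))
    return (mean(group_a) - mean(group_b)) / standard_error, df

# Hypothetical posttest scores for two independent groups.
treatment = [85, 90, 88, 92, 95]
control = [78, 82, 80, 85, 75]

t, df = independent_t(treatment, control)
print(round(t, 3), df)
```

The resulting t would then be compared with the critical value for the chosen alpha level and these degrees of freedom.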
Claim made when a statistically significant finding seems large enough to be important
Practical significance
Methods used to gather data in an emergency response situation in order to share information in real time.
Rapid evaluation and assessment methods (REAM)
An index of magnitude or strength of relationship
Effect size indicators
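Worked example (a minimal sketch with hypothetical scores): Cohen's d, one commonly used effect size indicator, expresses the difference between two group means in pooled standard deviation units.

```python
import math
from statistics import mean, variance

def cohens_d(group_a, group_b):
    """Standardized mean difference using the pooled standard deviation."""
    n_a, n_b = len(group_a), len(group_b)
    pooled = ((n_a - 1) * variance(group_a) + (n_b - 1) * variance(group_b)) / (n_a + n_b - 2)
    return (mean(group_a) - mean(group_b)) / math.sqrt(pooled)

# Hypothetical posttest scores for two independent groups.
treatment = [85, 90, 88, 92, 95]
control = [78, 82, 80, 85, 75]

d = cohens_d(treatment, control)
print(round(d, 3))
```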
Rejection of a true null hypothesis
Type I error
Failure to reject a false null hypothesis
Type II error
Which of the following is used for group differences evaluation questions?
A. t test for independent samples
B. Pearson product-moment coefficient of correlation
C. Mean and variance
D. Range
A
What are some strategies for analyzing qualitative data?
A. Engage in continuous and ongoing data analysis
B. Reflectively reading interview transcripts and field notes to get a holistic picture of the research question.
C. Determining codes for the data that suggest emergent concepts
D. All of the above
D
Who are the researchers who initiated grounded theory as a systematic method? A. Glaser and Strauss B. Campbell and Shadish C. Pope and Wallace D. Patton and Stake
A
One reason that statistical analysis is useful is:
A. It helps you in coding interview transcripts
B. It helps you reduce a large amount of data into more meaningful terms such as an average
C. It is a systematic system for organizing data into relevant categories or themes.
D. It is useful for obtaining a thick description of the data
B
According to your textbook, generalizability is only a concern in the interpretation of quantitative data
A. True
B. False
B
According to the authors of your text, it is sometimes appropriate to involve stakeholders in the analysis phase of the evaluation.
A. True
B. False
A
What are some theoretical frameworks commonly used in qualitative data analysis?
A. Postpositivism
B. Postpragmatism
C. Postmodernism
D. Feminist theory and indigenous theory
D
According to your text, it is rarely if ever important for evaluators to use a particular theoretical framework lens in analyzing data.
A. True
B. False
B
What factor(s) might influence whether you decide to analyze your data using software or manually?
A. The amount of time you have available to analyze your data.
B. The amount of data you have collected
C. The training and support available at your institution.
D. All of the above factors are important to consider.
D