Final Flashcards by Allison Morrow

An evaluation that uses the data collected before a program begins. Examples include pilot projects, baseline surveys, and feasibility studies.

Design evaluation

Coefficient of stability: Evaluator administers the same instrument twice, separated by a short period of time. Results are compared, using a statistic such as a correlation coefficient

Reliability coefficient/test-retest reliability
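
As a rough illustration of the comparison step, the sketch below computes a Pearson correlation between two administrations of the same instrument. The scores and the names `time1`, `time2`, and `pearson_r` are hypothetical and not from the deck; Pearson's r is only one of the statistics that could serve as the coefficient.

```python
from math import sqrt

def pearson_r(x, y):
    """Pearson correlation between two equally long lists of scores."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sum of cross-products and sums of squared deviations from the means
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)

# Hypothetical scores for the same five people at time 1 and time 2
time1 = [12, 15, 9, 20, 17]
time2 = [13, 14, 10, 19, 18]
stability = pearson_r(time1, time2)
print(round(stability, 2))
```

A value close to 1 suggests that people kept roughly the same relative standing across the two administrations.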

Evaluator administers two equivalent versions of the same instrument (parallel forms) to the same group of people. Results are compared, using a coefficient of stability.

Alternate-form coefficient

A nonexperimental research method relying on questionnaires or interview protocols

Survey

The extent to which researchers can state with confidence that the independent variable (the intervention) caused the observed change in the dependent variable

Internal validity

Study conducted at a single time period, and data are collected from multiple groups

Cross-sectional study

Data are collected at two or more points in time

Longitudinal study

Design that combines cross-sectional and longitudinal elements by following two or more age groups over time

Cohort-sequential design

Two observers’ data are compared to see whether they are consistently recording the same behaviors when they view the same events. Statistical procedures such as correlation or percentage of agreement can be used to establish this type of reliability.

Interrater reliability
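
The percentage-of-agreement option mentioned on this card can be sketched as follows. The behavior codes and the function name `percent_agreement` are made up for illustration.

```python
def percent_agreement(rater_a, rater_b):
    """Percentage of observations on which two raters recorded the same code."""
    if len(rater_a) != len(rater_b):
        raise ValueError("Both raters must code the same number of events")
    matches = sum(1 for a, b in zip(rater_a, rater_b) if a == b)
    return 100.0 * matches / len(rater_a)

# Hypothetical codes from two observers watching the same five events
observer_1 = ["on-task", "off-task", "on-task", "on-task", "off-task"]
observer_2 = ["on-task", "off-task", "off-task", "on-task", "off-task"]
agreement = percent_agreement(observer_1, observer_2)
print(agreement)  # 80.0 (the observers agree on 4 of 5 events)
```

The same function could be applied to one observer's codes from two viewings of a recording to check intrarater consistency.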

Interpretive research approach relying on multiple types of subjective data and investigation of people in particular situations in their natural environment

Qualitative research

Use of multiple data sources, research methods, investigators, and/or theories/perspectives to cross-check and corroborate research data and conclusions

Triangulation

used to determine whether a single observer is consistently recording data over a period of time.

Intrarater reliability

To what degree does all accumulated evidence support the intended interpretation of scores for the proposed purpose?

construct validity

commonly used data collection instruments or procedures designed to measure personality, aptitude, achievement, and performance.

Tests

Items on the test represent content covered in the program (e.g., did the teacher teach the children that blue on the map means water?). Evaluators can work with content specialists to list the content that is part of the program, and can compare the test items to see whether they correspond.

Content-related evidence

a self-report data collection instrument that is filled out by research participants.

Questionnaire

The researcher takes independent samples from a general population over time and the same questions are asked.

Trend study

a situation where the interviewer asks the interviewee a series of questions

Interview

indicates that the measure actually reflects current or future behaviors or dispositions.

Criterion-related evidence

a situation where a focus group moderator keeps a small and homogeneous group (of 6–12 people) focused on the discussion of a research topic or issue.

Focus group

Evaluators need to be aware of the consequences of using data, especially with regard to the potential to worsen inequities.

Consequential evidence

Researcher watches and records events or behavioral patterns of people

Observation

Observation conducted in real-world situations

Naturalistic observation

Observation conducted in lab setting set up by the researcher

Laboratory observation

Evaluators need to stay on site for sufficient time to “get the story right.” If a study is conducted over too short a time or interviews are conducted with too few people, it is possible that an evaluator will reach “premature closure”

Prolonged and substantial engagement

the ability to generalize the results of a study to the population from which the research population was drawn.

External validity

A word that produces an emotionally charged reaction

Loaded term

A question that suggests how the participants should answer

Leading question

Asking about two or more issues in a single question

Double-barreled question

Nonoverlapping response categories

Mutually exclusive categories

Response categories that cover the full range of possible responses

Exhaustive categories

An ordered set of response choices, such as a 5-point rating scale, measuring the direction and strength of an attitude

Rating scale

Descriptors placed on points on a rating scale

Anchor

Participant must select from the two response choices provided with an item

Binary forced-choice approach

A scaling technique that is used to measure the meaning that participants give to attitudinal objects or concepts and to produce semantic profiles

Semantic differential

An item directing the participant to different follow-up questions depending on the initial response

Contingency question

Tendency for a participant to respond in a particular way to a set of items

Response set

Evaluators need to be aware of their assumptions, hypotheses, and understandings, and of how these change over the period of the study.

Progressive subjectivity

those incurred to provide a treatment or a service

direct costs

resources lost due to the disorder.

indirect costs

Events that occur during the study other than the experimental treatment effect that can influence the results

History threat

describing design that is intended to simplify life for everyone by making products, communications, and the built environment more usable by as many people as possible at little or no extra cost

Universal design

naturally occurring physical or psychological changes

Maturation threat

A monitoring system that would involve community members in determining what indicators of change are important to them

Most Significant Change method (Christian Commission for Development in Bangladesh and Rick Davies)

Administration of a pretest that affects participants’ performance on a posttest (i.e., participants become “testwise”).

testing threat

Having pretests and posttests that differ in terms of difficulty; this can lead to seeming changes that are really due to the difference in the tests.

Instrumentation

Having extreme groups in a study (i.e., people very high or very low on a particular characteristic); it is possible that changes will be seen on the dependent variable because of this threat.

statistical regression

Differences between the experimental and control groups on important characteristics, other than receipt of the intervention.

differential selection

Differential dropouts of participants from either the experimental or control groups.

experimental mortality

treatment was not implemented as intended

treatment fidelity

monetary (i.e., dollar) values are estimated for the resources used (i.e., costs) and for the program effects (i.e., benefits), and these two components (costs and benefits) are then compared to determine the worth of a program.

cost-benefit analysis
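
One way to make the cost versus benefit comparison concrete is to compute a net benefit and a benefit-cost ratio. The sketch below uses invented dollar figures and category names; these two summary measures are common choices, not necessarily the ones the textbook prescribes.

```python
# Hypothetical monetary estimates for one program year (all values in dollars)
costs = {"staff": 120_000.0, "materials": 15_000.0, "facilities": 25_000.0}
benefits = {"reduced_remediation": 90_000.0, "increased_earnings": 150_000.0}

total_cost = sum(costs.values())
total_benefit = sum(benefits.values())

# Net benefit: how much the estimated benefits exceed the estimated costs
net_benefit = total_benefit - total_cost

# Benefit-cost ratio: dollars of benefit per dollar of cost
benefit_cost_ratio = total_benefit / total_cost

print(net_benefit)         # 80000.0
print(benefit_cost_ratio)  # 1.5
```

A ratio above 1 (or a positive net benefit) indicates that the estimated benefits exceed the estimated costs under the stated assumptions.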

based on cost and effect data (rather than cost and benefit data as in benefit-cost analysis).

cost-effectiveness analysis

What a concept means in abstract or theoretical terms

Conceptual definition of a sample

What you will observe and/or measure; links the concept you want to sample to the real world

Operational definition of a sample

Group to which you wish to generalize your results

Target population

List of people who match the conceptual definition

Experimentally accessible population

List of people in the experimentally accessible population

Sampling frame

Match between accessible population and target population

population validity

voluntary consent without threat or undue inducement; it includes knowing what a reasonable person would want to know before giving consent (informed), and explicitly agreeing to participate (consent).

informed consent

collecting, analyzing, storing, and reporting data in such a way that the data cannot be traced back to the individual who provides them.

confidentiality

involves the selection of a sample from a population in a way that allows for an estimation of the amount of possible bias and sampling error.

Probability-based sampling

Those in which every member of a population has a known nonzero probability of being included in the sample.

Random samples

Every member of the population has an equal and independent chance of being selected.

Simple random sampling
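
A minimal sketch of simple random sampling, assuming the sampling frame is available as a Python list. The frame of 500 hypothetical people and the seed are illustrative only.

```python
import random

rng = random.Random(42)  # fixed seed so the illustration is repeatable

# Hypothetical sampling frame of 500 people
population = [f"person_{i}" for i in range(1, 501)]

# random.sample draws without replacement, giving every member
# the same chance of ending up in the sample
srs = rng.sample(population, k=50)
print(len(srs))
```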

A robust description of the context in which qualitative research is conducted

Thick description

Selection of easily obtainable participants for sample group and usually the cheapest and fastest way of obtaining a sample group; interviewers happen to be available at a program site.

Convenience sampling

When will you have a poor strength of treatment? A. When the participants are sensitized to the posttest. B. When the evaluator uses different types of measurement for dependent variables. C. When the treatment “dose” was not strong enough to produce the expected change. D. When the evaluator systematically conducts an intervention at different times of the day.

In evaluation, a limitation of traditional definitions of needs assessment (according to your textbook) is that they focus only on needs and lack any attention to assets. A. True B. False

True

The assumption that all people within a particular subgroup are similar to each other in terms of their other background characteristics, or similar enough where differences are not identified.

Myth of homogeneity

Which of the following is(are) needed to infer causality? A. X precedes Y B. X is related to Y C. Confounding variables can be ruled out as causes D. All of the above are required.

When an extraneous variable systematically varies with the independent variable and influences the dependent variable, it is called: A. Another dependent variable B. A confounding variable C. A moderating variable D. An unreliable variable

Data analysis tends to be an ongoing and iterative (nonlinear) process in qualitative research: the cyclical process of collecting and analyzing data during a single research study.

Interim analysis

the difference between the sample and the population.

Sampling error

A major characteristic of experimental and quasi-experimental designs is an independent variable that can be manipulated. A. True B. False

True

Used in telephone surveys; computer generates a random list of phone numbers.

Random-digit dialing

If an evaluation is focused on addressing human rights, with what paradigm is it most closely aligned? A. Postpositivist B. Transformative C. Constructivist D. Pragmatic

Take every nth name off a list. Suppose you have 1,000 names on the list and you need a 10% sample. You pick a random number between 1 and 10, start there, and then take every 10th name on the list.

Systematic sampling
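
The 1,000-name example on this card can be sketched as follows. The list of names, the seed, and the function name `systematic_sample` are hypothetical.

```python
import random

def systematic_sample(names, interval, rng):
    """Pick a random start between 1 and `interval`, then take every nth name."""
    start = rng.randint(1, interval)  # 1-based position of the first pick
    return names[start - 1::interval]

# Hypothetical list of 1,000 names; a 10% sample means an interval of 10
names = [f"name_{i}" for i in range(1, 1001)]
sample = systematic_sample(names, interval=10, rng=random.Random(7))
print(len(sample))  # 100
```

Because 1,000 divides evenly by 10, every possible start yields exactly 100 names.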

Which of the following is NOT a purpose for an evaluation? A. To identify needed inputs, barriers, and facilitators to program development or implementation. B. To determine inequities on the basis of gender, race, ethnicity, disability, and other relevant dimensions of diversity. C. To determine how a program can be made to look positive despite its lack of impact/effect. D. To demonstrate that accountability requirements are fulfilled.

If there are different groups (strata) that you want to be sure to include, then you can divide the population into subgroups first and then randomly sample from the subgroups.

Stratified sampling
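
A rough sketch of the divide-then-sample idea, assuming each record carries the attribute that defines its stratum. The teacher records, the strata labels, and the equal allocation of five per stratum are illustrative choices.

```python
import random
from collections import defaultdict

def stratified_sample(records, stratum_of, n_per_stratum, rng):
    """Group records into strata, then draw a random sample within each one."""
    strata = defaultdict(list)
    for record in records:
        strata[stratum_of(record)].append(record)
    return {name: rng.sample(members, n_per_stratum)
            for name, members in strata.items()}

# Hypothetical population: 90 teachers tagged with the setting of their school
settings = ["rural", "suburban", "urban"]
teachers = [(f"teacher_{i}", settings[i % 3]) for i in range(90)]

picked = stratified_sample(teachers, stratum_of=lambda t: t[1],
                           n_per_stratum=5, rng=random.Random(1))
for setting, members in picked.items():
    print(setting, len(members))
```

Sampling the same number from each stratum guarantees representation of small groups; sampling proportionally to stratum size is another option.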

What are some designs used by the methods branch to determine the effectiveness of an intervention? A. Experimental, quasi-experimental, single-group, and survey B. Narrative, phenomenological, ethnographic, and case study C. Dialectical, transformative, and case study D. Gender, race, and class analysis

Use with naturally occurring groups (e.g., classrooms, school districts, city blocks). Units are randomly selected from full list of possible sites. Then you can collect data from the members in the randomly selected unit.

Cluster sampling
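
In contrast to sampling individuals directly, cluster sampling randomly selects whole units first. The sketch below assumes twelve hypothetical classrooms of twenty students each and selects three classrooms.

```python
import random

rng = random.Random(3)

# Hypothetical naturally occurring groups: 12 classrooms with 20 students each
classrooms = {
    f"classroom_{c}": [f"student_{c}_{s}" for s in range(1, 21)]
    for c in range(1, 13)
}

# Randomly select units (classrooms) from the full list of possible sites
selected_units = rng.sample(sorted(classrooms), k=3)

# Collect data from the members of the selected units
participants = [s for unit in selected_units for s in classrooms[unit]]
print(selected_units, len(participants))
```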

Which of the following is legitimately considered (according to your textbook authors) a type of evaluation? A. Organizational assessment B. Participatory evaluation C. Cost analysis/assessment D. All of the above are considered legitimate

Use a combination of sampling strategies over the course of the study (e.g., start with cluster sampling, then use simple random sampling within clusters).

Multistage sampling

Cost Analysis is defined as an evaluation’s determination of whether a program’s effect was worth its cost. A. True B. False

Pilot testing data collection instruments is not very important. A. True B. False

Choose unusual or special individuals (e.g., highly successful or unsuccessful school principals).

Extreme or deviant sampling

Identify instances where the phenomenon of interest is strongly represented. Look for rich cases that are not necessarily extreme.

Intensity sampling

What are some examples of qualitative data collection options as listed by your textbook authors? A. Observations, interviews, review of artifacts, and focus groups B. Performance assessments and structured observations C. Interviews, surveys with random samples, and norm-referenced tests D. Structured observations, criterion-referenced tests, and portfolios

What are some forms of evidence used to support validity/credibility in qualitative data collection? A. Peer debriefing B. Member checks C. Persistent observations D. All of the above

Choose individuals that represent maximum variation of the phenomenon (e.g., teachers in isolated rural, suburban, and inner-city areas).

Maximum variation sampling

Identify strongly homogeneous cases; find individuals who share relevant characteristics and experiences.

Homogeneous sampling

What is universal design as discussed by the textbook authors? A. A way to simplify life for everyone by making products, communications, and the built environment more useable by as many people as possible. B. A technique for making tests more accessible to disabled people. C. Use of multiple languages in a data collection instrument D. All of the above.

In most cases, use homogeneous groups (e.g., if service providers and participants are included in the same focus group, this might yield biased results).

Focus group sampling

This is the opposite of the extreme or deviant sampling strategy; you want to identify the typical or average

Typical case sampling

What is INTRArater reliability? A. It is used to determine whether a single rater or observer is consistent over time. B. It compares the data of two raters or observers to see whether they are rating the same behavior consistently. C. It is used to compare two kinds of data collection to see whether they are describing the same event. D. It is used to compare when different raters are administering similar instruments

Strategy combines the identification of strata of relevant subgroups with purposeful selection from those subgroups.

Stratified purposeful sampling

Use cases that can make a point dramatically or are important for other reasons. Patton (2002b) says that the key to identifying a critical case is “If it’s true of this one case, it’s likely to be true of all other cases” (p. 243).

Critical case sampling

An evaluator created a test. In order to test reliability, he had the participants take the test and he analyzed the results to examine the consistency of their responses. What is this an example of? A. Repeated measures reliability B. Intraparticipant reliability C. Internal-consistency reliability D. Multi-dimensional reliability

Start with key informants who are then asked to recommend others you should talk with—some who agree with them and some who disagree with them.

Snowball or chain sampling

Set up criteria to specify what characteristics people in the study need to have.

Criterion sampling

Reliability and validity are commonly used terms to describe the quality of quantitative data collection. What does validity mean in this situation? A. Does the instrument measure cultural competence? B. Does the instrument (as used with the participants) really measure what it is supposed to measure? C. Does the instrument measure what it is supposed to measure consistently? D. Does the instrument reliably measure what it is supposed to measure over time?

If the evaluation is focused on a theoretical construct such as creativity, you need to describe the meaning of that construct, and then identify individuals who theoretically exemplify that construct.

Theory-based sampling

In multiple regression, when we say that we control for the effects of some variable(s) we are: A. statistically adjusting or subtracting the effects of a variable to see what a relationship would have been without it B. actually removing a variable from a model so that it does not interact with the effects of other variables C. changing the mediating capabilities of an endogenous variable D. changing the mediating capabilities of an exogenous variable

Look for cases that both confirm and disconfirm emerging hypotheses.

Confirming and disconfirming case sampling

What kind of data collection is useful when you want people to discuss a particular topic? A. Case studies B. Interviews C. Focus groups D. Open-ended questionnaires

Selection of individuals emerges as the study progresses; you do not know a priori who will need to be included.

Opportunistic sampling

What are some critical issues related to data collection? A. Language of participants B. Literacy level of participants C. Use of a dominant or colonizing language D. All of the above

Randomly choose individuals from a purposefully defined group.

Purposeful random sampling

The ______ sampling strategy chooses unusual or special individuals (e.g., highly successful or unsuccessful school principals). A. Maximum variation sampling B. Stratified purposeful sampling C. Extreme or deviant sampling D. Snowball or chain sampling

Determine whether there is a political reason for including particular areas and individuals for the credibility and perceived usefulness of the study.

Politically important case sampling

What is the definition of a sampling frame in your textbook (Mertens and Wilson)? A. The target population of your study B. List of all the people in the experimentally accessible population. C. The people you plan to observe D. None of the above

Which of the following is NOT one of the 13 categories of disabilities in the Individuals with Disabilities Education Improvement Act of 2004? A. Emotional disturbance B. Speech or language impairment C. Visual impairment D. Majority-minority group membership

The set of cases selected from the population

Sample

Which of the following is a type of purposeful/theoretical sampling? A. Simple random sampling B. Critical case sampling C. Systematic sampling D. Cluster sampling

The full group to which one wants to generalize

Population

Which major sampling option is more commonly used in the Values Branch? A. Probability-based sampling B. Multistage sampling C. Theoretical/Purposeful sampling D. Simple Random Sampling

A numerical index based on sample data

Statistic

What is the opposite of deviant case or extreme sampling? A. Typical case sampling B. Homogeneous sampling C. Critical case sampling D. Snowball sampling

A numerical characteristic of a population

Parameter

What kind of sampling strategy begins with a random start, includes that element, and then includes every nth name off a list? A. Interval sampling B. Systematic sampling C. Random digit sampling D. Multistage sampling

The type of statistical analysis focused on describing, summarizing, or explaining a set of data

Descriptive statistics

According to your text, the “myth of homogeneity” means assuming that all people within a particular subgroup are similar to each other in terms of their other background characteristics, or at least sufficiently similar that you do not have to focus on those differences. A. True B. False

True

The type of statistical analysis focused on making inferences about populations based on sample data

Inferential statistics

What is nested sampling? A. You gather samples using a variety of methods. B. You use different people from different populations C. You have identical samples for both the quantitative and qualitative parts of the study. D. Data are collected from a group using one method; then a subset of that group is selected to provide data using another method.

The theoretical probability distribution of the values of a statistic that would result if you selected all possible samples of a particular size from a population

Sampling distribution

Which sampling strategy uses cases that can make a point dramatically or are important for other reasons? A. Stratified random sampling B. Critical case sampling C. Homogeneous sampling D. Politically important case sampling

The theoretical probability distribution of the means of all possible samples of a particular size selected from a population

Sampling distribution of the mean

The standard deviation of a sampling distribution

Standard error
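
For the sampling distribution of the mean, the standard error is usually estimated from a single sample as the sample standard deviation divided by the square root of the sample size. The sketch below uses hypothetical scores.

```python
import statistics
from math import sqrt

# Hypothetical sample of eight scores
scores = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]

n = len(scores)
sample_sd = statistics.stdev(scores)  # sample SD (divides by n - 1)
standard_error = sample_sd / sqrt(n)  # estimated SE of the mean
print(round(standard_error, 3))       # 0.756
```

Larger samples shrink the standard error, which is why estimates from larger samples are more precise.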

A statistic that follows a known sampling distribution and is used in significance testing

Test statistic

An evaluation that is allowed to evolve throughout the course of the project. Examples include participatory, qualitative, critical, hermeneutical, bottom-up, collaborative, and transdisciplinary approaches.

Emergent evaluation

The branch of inferential statistics focused on obtaining estimates of the values of population parameters

estimation

Use of the value of a sample statistic as one’s estimate of the value of a population parameter

Point estimation

Placement of a range of numbers around a point estimate

Interval estimation

Use a word or short phrase to summarize the topic found in a passage of the data.

Descriptive codes

Use the exact language of the participants as a code.

In vivo codes

Captures actions in the data and usually ends in “-ing.”

Process coding

Labels emotions that are expressed by the participants.

Emotion coding

Can reflect the values, attitudes, or beliefs expressed by participants.

Values coding

A hypothesis states that there is no difference between the scores of the experimental group and the control group

Null hypothesis

The hypothesis states that there will be a difference between means in the population.

Alternative hypothesis

An interval estimate inferred from sample data that has a certain probability of including the true population parameter.

Confidence interval
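
A minimal sketch of a 95% confidence interval around a sample mean, built as the point estimate plus or minus a critical value times the standard error. It assumes a normal approximation (z of about 1.96); with a sample this small, a t critical value would usually be preferred. The scores are hypothetical.

```python
import statistics
from math import sqrt

# Hypothetical sample of ten test scores
scores = [72.0, 85.0, 90.0, 66.0, 78.0, 88.0, 95.0, 70.0, 81.0, 75.0]

n = len(scores)
point_estimate = statistics.mean(scores)             # sample mean
standard_error = statistics.stdev(scores) / sqrt(n)  # estimated SE of the mean

# Critical value for a two-sided 95% interval under the normal approximation
z = statistics.NormalDist().inv_cdf(0.975)

lower = point_estimate - z * standard_error
upper = point_estimate + z * standard_error
print(point_estimate, round(lower, 1), round(upper, 1))
```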

A set of data, where the rows are “cases” and the columns are “variables”

Data set

The analysis guides the design of subsequent stages of a study and leads to further analysis that integrates the data from these stages.

Sequential integration

Data arrangement in which the frequency of each unique data value is shown

Frequency distribution
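
A frequency distribution can be tabulated with a simple counter. The responses below are hypothetical answers on a 5-point rating scale.

```python
from collections import Counter

# Hypothetical responses to one 5-point rating-scale item
responses = [3, 4, 4, 5, 2, 4, 3, 5, 5, 4]

frequencies = Counter(responses)

# List each unique value with how often it occurs
for value in sorted(frequencies):
    print(value, frequencies[value])
```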

The branch of inferential statistics focused on determining when the null hypothesis can or cannot be rejected in favor of the alternative hypothesis

Hypothesis testing

Depicting frequencies and distribution of a quantitative variable

Histogram Graph

The point at which one would reject the null hypothesis and accept the alternative hypothesis

Alpha level

The average deviation of data values from their mean in squared units

Variance

The area on a null hypothesis sampling distribution where the observed value of the statistic, if it fell in this area, would be considered a rare event

Critical region

The square root of the variance

Standard deviation
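
The variance and standard deviation cards can be checked on a small hypothetical data set. Note the assumption being illustrated: averaging squared deviations over all n values gives the population form, while the sample form divides by n - 1.

```python
import statistics

# Hypothetical data set with a mean of 5
scores = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]

mean = statistics.mean(scores)

# Population variance: average squared deviation from the mean
population_variance = sum((x - mean) ** 2 for x in scores) / len(scores)
population_sd = population_variance ** 0.5  # square root of the variance

# Sample variance divides by n - 1 instead of n
sample_variance = statistics.variance(scores)

print(population_variance, population_sd)  # 4.0 2.0
print(round(sample_variance, 3))
```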

The likelihood of the observed value (or a more extreme value) of a statistic, if the null hypothesis were true

Probability value (p value)

Conclusion that an observed finding would be very unlikely if the null hypothesis were true

statistically significant

Used to determine if the difference between the means of two groups is statistically significant

Independent samples t test
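
The t statistic behind this test can be sketched with the standard library alone. This version assumes equal variances in the two groups (the pooled-variance form); the scores and function name are hypothetical. The standard library has no t distribution, so the sketch compares t with a tabled critical value rather than computing a p value.

```python
import statistics
from math import sqrt

def pooled_t(group1, group2):
    """Pooled-variance t statistic and degrees of freedom for two independent groups."""
    n1, n2 = len(group1), len(group2)
    m1, m2 = statistics.mean(group1), statistics.mean(group2)
    v1, v2 = statistics.variance(group1), statistics.variance(group2)
    df = n1 + n2 - 2
    pooled_variance = ((n1 - 1) * v1 + (n2 - 1) * v2) / df
    standard_error = sqrt(pooled_variance * (1 / n1 + 1 / n2))
    return (m1 - m2) / standard_error, df

# Hypothetical posttest scores
treatment = [85.0, 90.0, 88.0, 92.0, 95.0]
control = [78.0, 82.0, 80.0, 85.0, 75.0]

t, df = pooled_t(treatment, control)
print(round(t, 2), df)

# 2.306 is the two-tailed critical value for df = 8 at alpha = .05 (standard t table)
significant = abs(t) > 2.306
```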

Claim made when a statistically significant finding seems large enough to be important

Practical significance

Methods used to gather data in an emergency response situation in order to share information in real time.

Rapid evaluation and assessment methods (REAM)

An index of magnitude or strength of relationship

Effect size indicators
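
One widely used effect size indicator for a two-group comparison is Cohen's d, the difference between means divided by the pooled standard deviation. The deck does not name a specific indicator, so this choice and the scores below are assumptions for illustration.

```python
import statistics
from math import sqrt

def cohens_d(group1, group2):
    """Standardized mean difference using the pooled standard deviation."""
    n1, n2 = len(group1), len(group2)
    v1, v2 = statistics.variance(group1), statistics.variance(group2)
    pooled_sd = sqrt(((n1 - 1) * v1 + (n2 - 1) * v2) / (n1 + n2 - 2))
    return (statistics.mean(group1) - statistics.mean(group2)) / pooled_sd

# Hypothetical posttest scores
treatment = [85.0, 90.0, 88.0, 92.0, 95.0]
control = [78.0, 82.0, 80.0, 85.0, 75.0]

d = cohens_d(treatment, control)
print(round(d, 2))
```

Unlike a p value, d does not depend directly on sample size, which is why it is useful for judging practical significance.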

Rejection of a true null hypothesis

Type I error

Failure to reject a false null hypothesis

Type II error

Which of the following is used for group differences evaluation questions? A. t test for independent samples B. Pearson product-moment coefficient of correlation C. Mean and variance D. Range

What are some strategies for analyzing qualitative data? A. Engage in continuous and ongoing data analysis B. Reflectively reading interview transcripts and field notes to get a holistic picture of the research question. C. Determining codes for the data that suggest emergent concepts D. All of the above

Who are the researchers who initiated grounded theory as a systematic method? A. Glaser and Strauss B. Campbell and Shadish C. Pope and Wallace D. Patton and Stake

One reason that statistical analysis is useful is: A. It helps you in coding interview transcripts B. It helps you reduce a large amount of data into more meaningful terms such as an average C. It is a systematic system for organizing data into relevant categories or themes. D. It is useful for obtaining a thick description of the data

According to your textbook, generalizability is only a concern in the interpretation of quantitative data. A. True B. False

According to the authors of your text, it is sometimes appropriate to involve stakeholders in the analysis phase of the evaluation. A. True B. False

What are some theoretical frameworks commonly used in qualitative data analysis? A. Postpositivism B. Postpragmatism C. Postmodernism D. Feminist theory and indigenous theory

According to your text, it is rarely if ever important for evaluators to use a particular theoretical framework lens in analyzing data. A. True B. False

What factor(s) might influence whether you decide to analyze your data using software or manually? A. The amount of time you have available to analyze your data. B. The amount of data you have collected C. The training and support available at your institution. D. All of the above factors are important to consider.

(176 cards)