Midterm Flashcards
“Do teenaged girls attend school less frequently during menstruation?”
Which of the following evaluation types could be used to answer the above research question?
Needs assessment
Theory of change
Process evaluation
Impact evaluation
Cost-effectiveness analysis
Needs assessment
The question of whether girls attend school less frequently is a descriptive question, at least as written. While there may be implied causation between menstruation and attendance, menstruation is not itself an intervention to be evaluated. If we found this to be true, perhaps that would justify the need for a program to help girls cope with menstruation better.
Which of the following research questions could be answered by an impact evaluation?
Are principals spreading misinformation when explaining to parents what their rights are?
Do parents have a right to be involved in decision-making at schools?
Does providing parents information about their rights lead to better teacher attendance?
Providing information is the implied intervention, and the impact question is whether that information leads to some outcome (here, better teacher attendance).
Which group is used to estimate the counterfactual when difference-in-differences is used for the identification strategy?
Non-participants before and after the program has been implemented
In a difference-in-differences study, there are individuals in the study sample who did not participate in the program. We estimate the counterfactual by measuring the change in the outcome of interest in this group from baseline to endline, and then compare this to the change for program participants, netting out any pre-existing differences between the two groups at baseline.
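The calculation described above can be sketched in a few lines. This is a minimal illustration, not part of the original flashcard: the function name and the attendance figures are invented, and it works from four group means rather than individual-level data.

```python
def diff_in_diff(treat_before, treat_after, comp_before, comp_after):
    """Return the difference-in-differences estimate from four group means."""
    change_treated = treat_after - treat_before
    # The comparison group's change stands in for the counterfactual change.
    change_comparison = comp_after - comp_before
    return change_treated - change_comparison


# Hypothetical outcome means (e.g. attendance in percent), invented for illustration.
estimate = diff_in_diff(treat_before=60.0, treat_after=75.0,
                        comp_before=55.0, comp_after=62.0)
print(estimate)  # 8.0
```

Participants improved by 15 points and non-participants by 7, so the estimated impact is 8 points. The 5-point baseline gap between the groups drops out of the estimate.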
An evaluation of the Millennium Villages Program (MVP) describes its approach as follows: we “compare trends in development indicators [such as change in proportion of households that own a mobile phone, changes in skilled birth attendance over time] for each of three Millennium Villages to trends in the same indicators for the same country overall, rural regions of the same country, and rural areas of the province or region where the Millennium Village is located. We use this approach because changes in the comparison areas—in particular in the rural area of an MVP site’s province or region—constitute … what would have happened at the MVP site in the absence of the MVP.”
What identification strategy is being used?
In the MVP evaluation, researchers look at individual and household outcomes of both MVP and comparison groups, both before and after the program, and use difference-in-differences to estimate the impact.
An evaluation of the Millennium Villages Program (MVP) describes its approach as follows: we “compare trends in development indicators [such as change in proportion of households that own a mobile phone, changes in skilled birth attendance over time] for each of three Millennium Villages to trends in the same indicators for the same country overall, rural regions of the same country, and rural areas of the province or region where the Millennium Village is located. We use this approach because changes in the comparison areas—in particular in the rural area of an MVP site’s province or region—constitute … what would have happened at the MVP site in the absence of the MVP.”
Which of the following assumptions must hold for this to produce a valid measure of impact?
Absent the program, development outcomes of villagers in Millennium Villages would change the same amount as those in comparison villages
The assumption made when a difference-in-differences identification strategy is used is often called the “parallel trends” assumption: that the change in outcomes of the comparison group equals the change the treatment group would have experienced had it not been treated (in the counterfactual world).
Which of the following designs would NOT give us a statistically equivalent comparison group?
Take random samples from two separate populations, give one sample the intervention, and compare the two samples
Identify a target group, all of whom will eventually receive the intervention. Then randomly assign when each person will receive the intervention
Take a random sample from the target population that receives the intervention, and compare it to the rest of the target population, which does not receive the intervention
Conduct a lottery to determine who receives the intervention and who doesn’t
Take random samples from two separate populations, give one sample the intervention, and compare the two samples
If the samples are from different populations, they may be representative of each population respectively, but there is no expectation that the two populations are statistically identical.
A sample of 1384 households who used 184 naturally-occurring springs as their primary drinking water source were randomized into two cross-cutting treatments: a source water quality intervention (“spring protection”) and a point-of-use water quality intervention (“WaterGuard promotion,” a chlorine product). Spring communities were randomized into either the “high-” or “low-intensity” for the WaterGuard intervention. In high-intensity communities, 6 out of the 8 sample households were randomized into the WaterGuard treatment arm; in low-intensity communities only 2 of the 8 sample households were randomized into the treatment arm. Across the entire sample, half of the households were randomly selected to receive seven 150 mL bottles of WaterGuard and a voucher for an improved storage pot with a tap and a lid.
In the above research design description, what appears to be the unit(s) of randomization for the WaterGuard promotion evaluation?
The randomization was conducted at two stages. The community-level randomization determined the level of intensity within the community (which answers one research question, possibly about spillovers), and the second level of randomization was at the household level (likely to answer the question of the impact of a direct intervention).
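A two-stage assignment of this kind might look like the sketch below. It is illustrative only: the function name, the ten toy communities, and the even split of communities between high and low intensity are assumptions, not details reported in the study description. The 6-of-8 and 2-of-8 household shares come from the text.

```python
import random


def two_stage_assignment(communities, seed=None):
    """communities: dict mapping community id -> list of 8 household ids.

    Stage 1 randomizes communities to high or low intensity (assumed 50/50).
    Stage 2 randomizes households within each community (6 of 8 or 2 of 8).
    """
    rng = random.Random(seed)
    ids = list(communities)
    rng.shuffle(ids)
    high = set(ids[: len(ids) // 2])  # assumption: half the communities are high intensity
    result = {}
    for cid, households in communities.items():
        intensity = "high" if cid in high else "low"
        n_treat = 6 if intensity == "high" else 2
        treated = set(rng.sample(households, n_treat))
        for hh in households:
            result[hh] = (intensity, hh in treated)
    return result


# Toy data: 10 hypothetical communities with 8 sample households each.
communities = {c: [f"c{c}_hh{h}" for h in range(8)] for c in range(10)}
assignment = two_stage_assignment(communities, seed=1)
share_treated = sum(treated for _, treated in assignment.values()) / len(assignment)
print(share_treated)  # 0.5
```

With equal numbers of high- and low-intensity communities, exactly half of all households end up treated, which matches the description that half of the sample received WaterGuard.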
Consider an intervention to inform physicians of the dangers of drug-resistant bacteria and overprescription of antibiotics. Prescriptions are ordered through an electronic system. The intervention works by creating alerts about the dangers each time a prescription is filled. The outcome is number of antibiotic prescriptions.
Considering the above problem, which of the following arguments is the most convincing for the appropriate unit of randomization?
The drug level, because some antibiotics are at a larger risk of drug resistance than others
The patient level, because each time a patient comes in, we can randomize whether the physician receives the alert or not
The patient level, because some patients are less likely to complete the course of antibiotics than others, and not completing the course is the primary cause of drug-resistant bacteria
The physician level, because physicians may learn about the dangers of drug resistance after the alert has notified them a few times, and this may affect their decisions about future patients
The physician level, because physicians may learn about the dangers of drug resistance after the alert has notified them a few times, and this may affect their decisions about future patients
Two of the other (incorrect) options reflect challenges to be considered in program design (drug- and patient-specific variance in risk). The other patient-level suggestion is logical, but it ignores a challenge to the research design: spillovers. The concern with randomizing at the patient level is that doctors who sometimes get alerts and sometimes do not may change their behavior relative to the counterfactual, e.g. becoming more likely to always issue a warning (since they are thinking about the risks more frequently) or less likely (if they believe that no alert means no risk).
A bank in the Philippines has just opened a new “commitment savings account” for which there are penalties if clients withdraw money before a prespecified date, or before they reach a certain prespecified target amount. This helps clients who want to save resist the temptation of withdrawing for unnecessary expenses. The bank wants to measure the account’s impact on overall savings, but also the potential side effect of being more vulnerable to shocks. Due to internal policies, the bank is not allowed to deny anyone access to this account.
Looking at the above evaluation question, what is the appropriate method of randomization?
Encouragement design
If eligibility is universal from day one, the only option is to try to encourage some to take up the account. However, the encouragement should not promote the virtues of saving per se, because that itself is an intervention. So a possible appropriate encouragement might be an expedited, simplified, or facilitated sign-up process.
Compared to other methods of randomization, what are the main limitations of the encouragement design?
It measures the impact only on those who change behavior due to the incentive to take up treatment
The incentive to take up treatment may have a direct impact
The design only measures impacts on “compliers,” because “always-takers” will take up whether in the treatment or control group and “never-takers” will not take up, even if assigned to the treatment group. If the encouragement itself has an impact on outcomes, we will be measuring the impact of the combination of the intervention and encouragement.
Consider a sample of 250 villages that you would like to randomly assign to two groups. Your implementing partner has the funding and mandate to conduct the intervention in exactly 150 villages, leaving 100 for the control. An allocation method that will certainly achieve this goal is:
Complete randomization (sorting a list randomly and assigning the top 60% from that list to the treatment)
With complete randomization, we can sort villages by a random number and just pick the top 150 villages (60% of 250), and therefore have a treatment group of exactly 150.
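The sort-and-cut procedure can be sketched as follows. This is a minimal illustration; the village labels, the seed, and the function name are made up for the example.

```python
import random


def complete_randomization(units, n_treat, seed=None):
    """Shuffle the list of units and assign the first n_treat to treatment."""
    rng = random.Random(seed)
    shuffled = list(units)
    rng.shuffle(shuffled)  # equivalent to sorting on a random number
    treated = set(shuffled[:n_treat])
    return {u: ("treatment" if u in treated else "control") for u in units}


# 250 hypothetical village identifiers.
villages = [f"village_{i:03d}" for i in range(1, 251)]
assignment = complete_randomization(villages, n_treat=150, seed=2024)
n_treated = sum(1 for arm in assignment.values() if arm == "treatment")
print(n_treated, len(villages) - n_treated)  # 150 100
```

By contrast, flipping an independent 60% coin for each village (simple randomization) would give 150 treated villages only on average, not with certainty.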
The government recently passed legislation that all 500 districts in the country must have a hospital that can provide basic emergency care. Currently, only 20% of districts have such hospitals. Because the construction of hospitals is expensive and can take up to two years to build, the government plans to phase this program in over 10 years. It is willing to randomize and wants to know the short-run (1 year) impact of this program on health outcomes. However, individuals from neighboring districts will likely use the hospital if one does not exist in their own district, and therefore they will likely see improved health outcomes as well, even if they are not in a “treatment” district.
What strategy would best manage the spillover?
Create a buffer
There may not be an obvious higher level of treatment than the district. Even if there is, those clusters of districts may still have neighboring districts in other clusters, and we don’t have enough information to assume that changing the level would help. However, by selecting a first-phase sample that is spread out by design (e.g. with no adjacent districts in the study), spillovers can be contained. Placebo treatments and controlling for density are useful for identifying those in the comparison group more likely to take up, but not for managing spillovers.
The government recently passed legislation that all 500 districts in the country must have a hospital that can provide basic emergency care. Currently, only 20% of districts have such hospitals. Because the construction of hospitals is expensive and can take up to two years to build, the government plans to phase this program in over 10 years. It is willing to randomize and wants to know the short-run (1 year) impact of this program on health outcomes. However, individuals from neighboring districts will likely use the hospital if one does not exist in their own district, and therefore they will likely see improved health outcomes as well, even if they are not in a “treatment” district.
If not properly contained, and we were unaware of this potential for spillover, how might our results be affected?
It would likely lead us to underestimate the program’s impact
It is likely that control districts will have better health outcomes than the counterfactual because now they have hospitals in neighboring districts. Therefore the difference between the treatment and control would be less than the difference between the treatment and the counterfactual. This would lead us to underestimate the impact.
The government recently passed legislation that all 500 districts in the country must have a hospital that can provide basic emergency care. Currently, only 20% of districts have such hospitals. Because the construction of hospitals is expensive and can take up to two years to build, the government plans to phase this program in over 10 years. It is willing to randomize and wants to know the short-run (1 year) impact of this program on health outcomes. However, individuals from neighboring districts will likely use the hospital if one does not exist in their own district, and therefore they will likely see improved health outcomes as well, even if they are not in a “treatment” district.
Suppose that in our impact analysis of the program described above we are only comparing endline outcomes, without any controls or covariates. If, in the control group (but not the treatment group), some of the particularly poor and disadvantaged districts refused to participate in the study, and we did nothing to correct for the attrition, what might that do to our results?
It would likely lead us to underestimate the program’s impact
In this case, the control group at endline would be made up of richer districts than the treatment group, and therefore would have better health outcomes. If the poorest districts were not included in the control group’s endline, we would find the control group to have better health outcomes than it has in reality. This would lead us to underestimate the impact of our program, since the difference in outcomes between the treatment and control groups appears smaller than it really is.
As part of our training on financial literacy for microenterprises, we teach entrepreneurs how to keep financial diaries. This also allows us to obtain accurate data on their profits. For members of the control group (who are not given training, and do not keep financial diaries), we conduct a monthly survey on revenues and costs to measure profits.
Not only is take-up affected by treatment status, but so is measurement. For example, if there is any systematic error in measuring profits in the survey method, but not the financial diary method, those measures would differ by treatment status, even if the true treatment effect is zero. So now the assigned treatment status is affecting outcomes not only through the treatment, but through measurement. This is a violation of excludability because now the treatment is not the “exclusive” difference between the two groups.
Given non-compliance in the above example, what is the correct inference of the Complier Average Causal Effect (CACE, also known as Local Average Treatment Effect–LATE) estimate?
This is the impact on those who took up because they were assigned to treatment
The definition of compliers is those who take up only when assigned to the treatment (and would not have taken up had they been assigned to the control group).
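The flashcard does not spell out how the CACE is computed. A common approach, sketched here as an assumption rather than as the course's stated method, is the Wald ratio: the intention-to-treat difference in outcomes divided by the difference in take-up between the assigned groups. All numbers and names below are hypothetical.

```python
def cace(mean_y_assigned_t, mean_y_assigned_c, takeup_t, takeup_c):
    """Wald estimator: intention-to-treat effect scaled by the complier share."""
    itt = mean_y_assigned_t - mean_y_assigned_c
    complier_share = takeup_t - takeup_c  # share who take up only because assigned
    if complier_share <= 0:
        raise ValueError("assignment must raise take-up for the ratio to be defined")
    return itt / complier_share


# Hypothetical means: outcomes of 110 vs 100, take-up of 75% vs 25%.
print(cace(110.0, 100.0, takeup_t=0.75, takeup_c=0.25))  # 20.0
```

Here assignment moves the average outcome by 10, but only half the sample changed behavior because of assignment, so the implied effect on those compliers is 20.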
What do we expect to happen to the sampling distribution as our sample size increases?
The sampling distribution will approach a bell curve
Increasing sample size will result in a narrower sampling distribution that is shaped like a normal (or bell) curve. Nothing will happen to the underlying population distribution.
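A small simulation can make the narrowing visible. This is only a sketch: the skewed (exponential) population, the sample sizes, the number of repetitions, and the seed are arbitrary choices made for illustration.

```python
import random
import statistics


def sample_mean_spread(sample_size, n_draws=2000, seed=0):
    """Standard deviation of sample means across repeated samples of one size."""
    rng = random.Random(seed)
    means = []
    for _ in range(n_draws):
        # Draw one sample from a skewed population with mean 1.
        sample = [rng.expovariate(1.0) for _ in range(sample_size)]
        means.append(statistics.fmean(sample))
    return statistics.stdev(means)


spread_small = sample_mean_spread(sample_size=5)
spread_large = sample_mean_spread(sample_size=100)
print(spread_small > spread_large)  # True
```

The population being sampled is the same in both runs; only the spread of the sample means shrinks as the sample size grows.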
In a cluster randomized trial where the intra cluster correlation (rho) is 0, we estimate that our Minimum Detectable Effect (MDE) is 0.15. If we increase the total sample size, from 1400 to 2800, what is the best description of what will happen to our MDE? Assume that everything else remains constant
MDE would decrease to about 0.106
Because the ICC is 0, no adjustment for clustering is needed. Therefore doubling the sample size allows us to detect a smaller impact. The MDE will decrease by a factor of the square root of 2: the new MDE is equal to (old MDE)/sqrt(2), that is, 0.15/sqrt(2), or about 0.106 standard deviations.
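The arithmetic can be checked with a short sketch. It assumes, as the card does, that the ICC is 0 and everything else is held constant, so the MDE scales with one over the square root of the sample size. The function name is illustrative.

```python
import math


def mde_after_resizing(old_mde, old_n, new_n):
    """Rescale an MDE when only the total sample size changes (no clustering)."""
    return old_mde * math.sqrt(old_n / new_n)


print(round(mde_after_resizing(0.15, 1400, 2800), 3))  # 0.106
```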
If the minimum detectable effect for a 50:50 allocation was 0.20 SD, and we were to change the allocation ratio (sample size in treatment : sample size in control) to 90:10, about what would be the new MDE?
0.33 SD
For the 50:50 allocation, sqrt(1/(0.5 × 0.5)) = 2. For the 90:10 allocation, sqrt(1/(0.1 × 0.9)) = 3.333. The ratio of these two is 3.333/2 = 1.6667, so the MDE under the 90:10 allocation is 1.6667 times higher than 0.20, or about 0.33 SD.
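The same calculation as a sketch, assuming the total sample size and all other power inputs stay fixed so that only the sqrt(1/(P × (1 − P))) term changes. The function names are illustrative.

```python
import math


def allocation_factor(p_treat):
    """The sqrt(1 / (P * (1 - P))) term of the MDE formula."""
    return math.sqrt(1.0 / (p_treat * (1.0 - p_treat)))


def mde_for_allocation(mde_balanced, p_treat):
    """Rescale an MDE computed at 50:50 to a different treatment share."""
    return mde_balanced * allocation_factor(p_treat) / allocation_factor(0.5)


print(round(mde_for_allocation(0.20, 0.9), 2))  # 0.33
```

The factor is smallest at a 50:50 split, which is why unequal allocations cost power for a fixed total sample.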
Which of the following is a construct?
Intelligence
Interest in learning
Mother's years of education
Student test scores
Intrinsic motivation of teachers
Intelligence, interest, and intrinsic motivation are all concepts that require indicators. There is no direct quantifiable representation of each of these constructs. Student test scores and years of education translate directly into quantities.
A recent study titled “The White Man Effect” found that when measuring altruism in Sierra Leone through a “dictator game”, the amount “dictators” gave “the recipient” in the game varied significantly depending on who was present. In particular, they found that when white, foreign, observers were present, respondents tended to give more of their share to the recipient, signaling higher levels of altruism. However, they found that if the players in the game were from communities that received significant foreign aid, this effect went down, perhaps in an attempt to signal the inability to simply “give money away”.
The above is an example of which type of bias?
Social desirability bias
If survey respondents are subject to a “White Man Effect” (as described in the previous question), we would expect to find that some deliberately claim higher levels of wealth when the surveyor appears rich. This effect may be in the reverse direction if the respondent believes the surveyor is linked to aid decisions, in which case the respondent may claim lower levels of wealth.
This type of bias likely occurs at which stage in the response process?
Reporting the answer
It is likely that respondents make their estimates, and then revise those estimates in the reporting stage.
To what extent do you agree that education is more the father’s responsibility than the mother’s?
o Strongly Agree
o Agree
o Neither Agree nor Disagree
o Disagree
o Strongly Disagree
The question above is what question and response type?
Likert Scale
Single response
This is closed-ended, because response options are given. It is single response because one would not both agree and disagree with this statement at the same time. And this type of scale is known as a Likert scale.
A survey questionnaire includes the following questions:
Q5: How many times have you purchased meals outside (at a restaurant, cafe, food stand, etc) in the past week?
Q6: How many times have you purchased tea or coffee outside in the past week?
Q7: How many times have you purchased candy on impulse at the checkout counter in the past week?
Q8: How many times have you purchased cigarettes in the past week?
The above series of questions is meant to measure cigarette usage, something many people in this particular context are embarrassed about. What is the technique used to collect that information?
By embedding the “sensitive” question in a series of similar-sounding, less-sensitive questions, the researchers hope that the respondent will be conditioned to respond as if they are equally (or similarly) sensitive. This is a form of framing. This could not be a list randomization because the questions are not yes/no, or are unlikely to have a known distribution.
If respondents were asked about drug use using a polling booth method with a dozen respondents per polling booth, which of the following methods would allow us to disaggregate responses by gender?
Ensure each group was of one gender only
Have different boxes for different genders
If each group had an even split of men and women, it would be impossible to disaggregate responses by gender. And if we used natural variation (e.g. the fact that some groups were mostly men, or mostly women), if we saw differential responses, we wouldn’t know if that was because of gender, or some other characteristic (like types of occupation) of the community that led the group to be skewed toward one gender or the other.
To check whether someone has contracted a highly contagious viral infection, there are three common test options. Test A has a sensitivity of 99% and specificity of 95%. Test B has a sensitivity of 95% and specificity of 99%. Test C has a sensitivity of 98% and specificity of 98%. Which test would we use if our largest concern was detecting the disease in those who have it?
Test A, Test B or Test C? (or not enough information)
The sensitivity of a test reflects the probability of getting a positive result when indeed the person is positive. If we care most about this, we would choose the test with the highest level of sensitivity: Test A.
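As a rough illustration of the choice, the sketch below picks the test with the highest sensitivity and shows how many true cases each test would be expected to miss. The group of 1,000 infected people is a hypothetical figure chosen for the example; the sensitivity and specificity values are those given in the question.

```python
# (sensitivity, specificity) for each test, as stated in the question.
tests = {"A": (0.99, 0.95), "B": (0.95, 0.99), "C": (0.98, 0.98)}

# Sensitivity = P(test positive | infected), so maximize the first element.
best_for_detection = max(tests, key=lambda name: tests[name][0])
print(best_for_detection)  # A

# Expected missed cases (false negatives) among 1,000 hypothetical infected people.
missed = {name: round((1 - sens) * 1000) for name, (sens, _) in tests.items()}
print(missed)  # {'A': 10, 'B': 50, 'C': 20}
```

If the largest concern were instead avoiding false alarms among healthy people, the same comparison on specificity would favor Test B.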
Which of the following ingredients would you use to calculate incidence of a disease? (Select all that apply)
New cases in a specified time period
Number of people who are infected
Total population
Total population of uninfected individuals
Time period
New cases in a specified time period
Total population of uninfected individuals
Time period
Incidence refers to the number of new cases of a disease that occur during a specified period of time in a population at risk for developing the disease. The numerator includes number of new cases, and the denominator includes people who have the potential to develop the disease in the future (but do not have it now).
In one community, the prevalence of a parasitic infection is 10%, and the average duration of the disease is 2 years. What is the incidence rate of the disease (rate per 1,000)? Note that the incidence rate is calculated per year.
In a steady state where rates are not changing and in-migration equals out-migration, prevalence and incidence can be related by the following equation: prevalence = incidence x duration of disease. To calculate incidence rate with the given information, 0.10 is divided by 2 and then multiplied by 1,000 to give a rate of 50 per 1,000.
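The steady-state relationship can be rearranged and checked in a few lines. This sketch assumes the steady-state conditions named above hold; the function name is illustrative.

```python
def incidence_per_1000(prevalence, duration_years):
    """Steady state: prevalence = incidence x duration, so incidence = prevalence / duration."""
    return 1000 * prevalence / duration_years


print(round(incidence_per_1000(0.10, 2), 1))  # 50.0
```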
An evaluation of a financial literacy intervention shows that the intervention has an impact on the revenues of a family’s small business. What information do we need to see if the intervention improved overall household income?
Income from other sources
A measure of total business costs
How much of (remaining) profits are used by the household as income
Whether the household pays itself a salary from the business, and if so, how much
Which of the following is true regarding the MET indices of teacher effort:
Measures of teacher effort predict student achievement
Measures vary highly in their predictive validity depending on which tool is used
Measures vary highly in their predictive validity depending on the subject in school (e.g. math, language, etc)
Interventions that improve teacher learning lead to improved student learning
Measures of teacher effort predict student achievement
Measures vary highly in their predictive validity depending on which tool is used
Measures vary highly in their predictive validity depending on the subject in school (e.g. math, language, etc)
The positive correlation between measures of teacher effort and teacher value added (improvement in child test scores) suggests that effort predicts student achievement. However, the strength of that correlation depends greatly on the subject in school; it was not so dependent on the tool used. Note that this is a correlation, not causation.
Which of these outcomes is unambiguously a sign of empowerment?
Reduction in family size (i.e. number of children)
Reduced gender conflict
Increase in time spent working for a wage
Participation in making financial decisions
Participation in making financial decisions
Only participation in financial decisions reflects a process in which women are able to make a choice. The other outcomes could go either way, depending on women’s preferences.