“The value of M&E comes not from conducting M&E or from having such information available; rather, the value comes from using it to help improve [policy and programme] design, implementation and performance” – World Bank
At DNA Economics we believe in the above principle: that monitoring and evaluation (M&E) should play a pivotal role in the design and implementation of development policies and programmes. In our work, we have conducted numerous studies and evaluations using various M&E methodologies. One aspect holds true across all this work:
The value of M&E is directly determined by the quality of the data collected.
In turn, the quality of data, and therefore the quality of the resulting findings, is only as good as the research tools and instruments used.
Many programme evaluations rely (almost exclusively) on perception-based data collected through interviews with, or surveys of, implementers (i.e. programme providers) and/or beneficiaries (i.e. programme users). This is understandable given the ease of collecting such data. However, such perception data suffers from a large number of weaknesses that are often not sufficiently considered or understood by evaluators or the users of the evaluation. This is not to say that interview data is not valuable, but rather that it is dangerous to rely on it as a primary or sole source of evidence.
To illustrate this point, we highlight some of the challenges we have encountered when conducting interviews (and surveys) within the South African schooling system, an area where such challenges are particularly common.
Interview data in South African schools: Common issues
Assessing the effectiveness of education programmes is notoriously difficult. Essentially, we want to know whether programmes improve learner performance, which requires learner assessments. However, such assessments are costly to administer, and even when they can be done, a causal link between the programme interventions and learner performance is difficult to confirm.
As a result, most programmes rely heavily on self-reported perception data (including interviews) to assess their effectiveness. For example, many programmes will conduct interviews with teachers or principals and use such data to draw conclusions. These perceptions often deliver very positive assessments of programmes, regardless of whether the programme was able to deliver sustained improvements in learner performance (when it is measured).
Why is this? In practice, perception data faces a number of issues in this context, including:
1. Impartiality: Respondents (e.g. teachers or principals) are not truly impartial observers. As they have directly benefitted from the programme, they often do not want these benefits to be taken away (as a result of negative feedback) or might simply not want to seem ungrateful. Many educators also suffer from a degree of status quo bias, which in dysfunctional schools can result in low expectations for change or results.
2. Principal-agent problem: While learners are usually the final programme beneficiaries, they are often too remote from the interventions and/or too young to be reliable respondents. Officials (e.g. teachers) are thus called on to speak on behalf of learners. What results is a type of principal-agent problem, where the principal (the learner) has no say in ensuring that the agent (the teacher) responds in their best interests.
3. Resistance: Educators are often understandably resistant to any interference or observation in their work – a view reinforced by the protective stance frequently adopted by teacher unions – and might simply give the expected response to avoid deeper scrutiny of their work.
4. Confidentiality: Despite researchers’ best attempts to assure respondents that answers are recorded anonymously, respondents are often still wary of expressing negative or controversial opinions, fearing that such opinions might have repercussions for them or their school.
5. Social desirability bias: Respondents often give interviewers the answer that they think researchers want to hear, or what they believe is socially expected of them. This bias stems from the human tendency to please and conform, and hence affects virtually all surveys regardless of context. However, the factors mentioned in the preceding points make educational research particularly susceptible to this bias.
A large number of other general response biases are observed across contexts. For descriptions of commonly observed response biases and potential resolutions, see https://measuringu.com/survey-biases/ or https://psychologenie.com/types-of-response-bias-explained-with-examples.
So what can we do about this?
There are two main ways that these challenges can be avoided or at least mitigated.
1. Improve survey-based data collection: Effective research and questionnaire design can significantly improve the quality of data collected. For example:
- Question wording and sequencing should assure respondents of anonymity.
- Questions should whenever possible be verifiable rather than purely subjective, and interviewers should ask for documentary evidence in such cases.
- Questions can be subtly repeated with amended wording to verify responses.
- Fieldworkers should be trained to be aware of potential biases.
- Interviews should be conducted at a time that reduces the risk of bias. For example, interviews should typically be conducted as soon as possible after the programme to avoid selective memory or recall bias.
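The repeated-question check above can be automated once responses are captured. The sketch below is a minimal illustration, assuming Likert-style (1–5) ratings and hypothetical question identifiers; it simply flags respondents whose answers to a question and its reworded repeat diverge beyond a tolerance, so that those responses can be reviewed or down-weighted:

```python
# Flag respondents whose answers to paired (original, reworded) questions
# diverge by more than `tolerance` points, suggesting inconsistent or
# socially desirable responding. Question names and data are hypothetical.
def flag_inconsistent(responses, pairs, tolerance=1):
    """responses: list of dicts mapping question id -> rating (1-5).
    pairs: list of (original_id, reworded_id) tuples.
    Returns indices of respondents with at least one divergent pair."""
    flagged = []
    for i, answers in enumerate(responses):
        for original, reworded in pairs:
            if abs(answers[original] - answers[reworded]) > tolerance:
                flagged.append(i)
                break  # one divergent pair is enough to flag this respondent
    return flagged

# Illustrative data: respondent 1 rates the programme 5/5 at first,
# but only 2/5 when the same question returns in different wording.
responses = [
    {"q03_programme_useful": 5, "q17_programme_useful_reworded": 5},
    {"q03_programme_useful": 5, "q17_programme_useful_reworded": 2},
]
pairs = [("q03_programme_useful", "q17_programme_useful_reworded")]
print(flag_inconsistent(responses, pairs))  # [1]
```

In practice the divergent responses would not simply be discarded; flagging them allows fieldwork supervisors to follow up or analysts to test the sensitivity of findings to their exclusion.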
2. Lessen reliance on perception data: In many cases, weaknesses of perception-based data can never be fully overcome, and as such other research designs, tools and data sources need to be used to supplement the interview data. In this regard it is important to:
- Place greater value on results from more objective measures (such as learner or teacher assessments, classroom observation or learner workbook analysis).
- Implement rigorous research designs (for example randomised controlled trials or quasi-experimental designs).
- Develop evaluation frameworks at the outset of programmes that allow for the most appropriate research design and data sources to be used.
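To illustrate how a quasi-experimental design reduces reliance on perceptions, consider a simple difference-in-differences comparison: the change in an objective outcome (such as average test scores) in participating schools, minus the change in comparison schools over the same period. The sketch below uses invented numbers purely for illustration:

```python
# Difference-in-differences: the programme effect is estimated as the
# change in treated schools' mean scores minus the change in comparison
# schools' mean scores. All figures here are illustrative, not real data.
def mean(xs):
    return sum(xs) / len(xs)

def did_estimate(treated_pre, treated_post, control_pre, control_post):
    treated_change = mean(treated_post) - mean(treated_pre)
    control_change = mean(control_post) - mean(control_pre)
    return treated_change - control_change

# Hypothetical test scores (out of 100) for two groups of schools.
treated_pre = [40, 45, 50]    # participating schools, before
treated_post = [50, 55, 60]   # participating schools, after
control_pre = [42, 44, 49]    # comparison schools, before
control_post = [45, 47, 52]   # comparison schools, after

print(did_estimate(treated_pre, treated_post, control_pre, control_post))  # 7.0
```

Here the treated schools improved by 10 points on average while comparison schools improved by 3, so the estimated programme effect is 7 points. The design rests on the assumption that, absent the programme, both groups would have followed parallel trends; a full evaluation would test that assumption and account for sampling variation.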
Depending on the situation, the above solutions can be pursued independently or jointly to improve the quality of research.
Perception data can be useful and valuable in its own right. However, this type of data collection suffers from a number of serious weaknesses that can lead to misguided conclusions being drawn about the impact of important development programmes. Given the amount of resources expended on such programmes and the critically important roles they can play in people’s lives, the use of appropriate designs and techniques is essential if we are to form a real understanding of what works and what doesn’t.