Evaluation and research share some methods, but they serve different purposes. According to Schwandt, “evaluation is the act of judging the value, merit, worth or significance of things” (Evaluation Foundations Revisited: Cultivating a Life of the Mind for Practice, 2016). However, evaluation is a contested field; even the claim that evaluation is all about making value judgements is not universally accepted. Donaldson identifies four main purposes for evaluation: program and organisational improvement, oversight and compliance, assessment of merit and worth (value), and knowledge development (Roles for theory in contemporary evaluation practice: developing practical knowledge, Sage Handbook of Evaluation, 2006).
I’ve just started at ARTD, and one of the biggest changes from my role as a researcher is working more closely with diverse stakeholders to understand a program and determine the criteria against which to judge its overall value. At ARTD, we believe that making judgements of value is what distinguishes evaluation as a profession and makes it useful for decision-making. But how do evaluators decide what is of value? How do we bring an objective approach to values, an inherently subjective concept?
We’ve been revisiting this process through Schwandt’s Evaluation Foundations Revisited. He highlights the need to identify and determine criteria, deal with multiple criteria and decide who should do the ‘valuing’.
When thinking about these issues I’m reminded of James Scott’s ‘Seeing Like a State’, which highlights the tension between stakeholders’ differing worldviews: local, contextual knowledge and values on one side, and on the other the incentives of the State, which are often to formalise and to make things measurable and efficient.
In his blog, Scott Alexander summarises Scott’s framing anecdote: In 18th century Prussia, peasants cut down whatever trees were growing in the forest. Enlightenment rationalists determined that a better way would be to clear all the forests and plant the trees with the highest yield (Norway spruce) in an evenly spaced grid formation. This increased efficiency, as many trees could be cut down in a short time. The intervention was celebrated, people were promoted and the system spread throughout Europe and the world. However, the ecosystem could no longer support the wildlife and plants that the local villages relied on, and the villages suffered economic collapse. Disease and forest fires spread through the rows of identical trees, and the Norway spruce were unable to grow.
This brings to light the need to reconcile local knowledge with high-level objectives of economic efficiency. I can’t help but wonder whether, had there been a process for agreeing on evaluative criteria, those criteria might have included not only the economic impact of the intervention but also its environmental and social impact. By weighting the criteria, it might have been established that the significant economic benefit was untenable given the long-term impact on sustainability.
Schwandt identifies a range of criteria for valuing that are relevant to evaluation, but he also notes the ‘contextual pragmatics of valuing’. Stakeholders’ perspectives and vested interests, social and political norms, cultural understandings, and evaluators’ own values and perspectives all influence which criteria are considered important.
So how do we balance multiple criteria, and who should do the valuing, and how? Schwandt proposes two options: value can be determined either by stakeholders alone or by stakeholders in combination with an evaluator.
A participatory approach acknowledges that value is context-bound and that mutual agreement on value criteria can be built through interaction between the evaluator and diverse stakeholders. Rigorous methods can then be used to assess a program’s value and worth.
The challenge for us as evaluators is to ask: had we been commissioned by the Prussian Government, would we have come up with more comprehensive criteria to assess the intervention, and would that have changed history?