The Stakes of Simplicity: Evaluations in the DFV Prevention Space
May is Domestic and Family Violence (DFV) Prevention Month, a time dedicated to raising awareness of the impact of domestic and family violence, support for victims, and prevention strategies. Throughout the month, Abel and I read Kate Fitz-Gibbon’s Our National Shame, which prompted reflections on our evaluations in this sector, how difficult measuring change in complex social systems can be, and why it’s so important to get this right. Fitz-Gibbon’s central argument is confronting but clear: violence against women is not just an issue of individual pathology – it is enabled by structural inequality, cultural attitudes and institutional failures.
Over the years, we’ve worked on prevention programs delivered to boys and men at risk of using violence. Generally, outcomes vary significantly across sites and participants. For some, these programs contribute to meaningful shifts in how they think about masculinity and their own behaviour. For others, such programs can have little to no tangible impact. Although the variation is not surprising and is consistent with what broader primary prevention evidence tells us, it creates a real challenge for evaluators.
In program evaluation, almost every evaluator has experienced pressure to provide a simple answer to the question “did it work?”. The temptation is to aggregate findings across sites and participants to arrive at something more definitive, turning a complex and uneven picture into a single clean answer. But this is often the wrong approach, because averaging can bury the most important signal in the data: a meaningful shift for some participants can be diluted by little or no change among others, until the overall result looks negligible. So how should a program like this be judged? Understanding its value requires a more nuanced approach.
Where a program like this works, it appears to work in a meaningful way. The challenge is to understand why, so that future delivery can be better targeted to reach the people for whom it can make the biggest difference. Even where change does occur, attribution is rarely straightforward. Consider a young man who completes a prevention program and goes on to have respectful relationships. How much of that change can reasonably be credited to the program he participated in? He may have had a trusted teacher, a stable home environment or a peer group that modelled healthy behaviour. The program was just one thread in a much larger fabric.
This is not unique to DFV prevention. Evaluators working across complex social programs must navigate this tension constantly. But it feels particularly acute here, where the stakes of overclaiming are high – both for program integrity and for the people who may experience harm if the program doesn’t work as intended. Overclaiming impact where the evidence is thin risks misallocating resources toward programs that feel promising but may not be achieving results. At the same time, overly narrow approaches to evidence can lead to reduced funding for programs that are genuinely contributing to change simply because their contribution is difficult to isolate and understand. This tension helps to explain why aggregation is so appealing. When dealing with uneven and context-dependent outcomes, we’re naturally pulled towards simple answers that offer clarity. But the evaluator’s job is not to manufacture certainty where none exists. It is to be transparent about what the evidence can and cannot tell you, and to build evaluation designs that are sensitive enough to detect meaningful change, even when that change is partial, uneven, and slow.
Domestic and Family Violence Prevention Month is a reminder that this work matters, and that doing it well is not straightforward. Evaluators working in this space carry a real responsibility: to resist the pull toward tidy findings, to be honest about what the evidence can and cannot support, and to design evaluations that are capable of capturing change in all its complexity.
Cover image credit – NAPCAN, 2018
