Rubrics are a (maybe the) hot topic in evaluation right now. The rubrics sessions at the Australian Evaluation Society (AES) International Evaluation Conferences in 2018 and 2019 were standing room only, and the recognition of Jane Davidson’s work advancing the use of rubrics through the 2019 AEA Paul F. Lazarsfeld Evaluation Theory Award underscores the contribution that rubrics are making to the field.
Riding the rubric wave, at the final AES NSW Chapter meeting of 2019, Dr George Argyrous of the UTS Institute of Public Policy and Governance put forward the idea of a “universal scale” for rubrics, drawing on his extensive experience in designing and delivering rubrics for clients in government and non-profit organisations.
Dr Argyrous provided an introduction to rubrics and the role of ordinal scales (e.g. a survey response scale that ranges from ‘Strongly Disagree’ to ‘Strongly Agree’). Much like a grading sheet, a rubric is an evaluative tool for defining different levels of performance and what each level is expected to look like. All rubrics have some form of scale that denotes these levels of performance.
What’s in a scale?
From his work on organisations and organisational maturity, Dr Argyrous has identified core items in a rubric scale that could also be applied to a broader context. The difference between this scale and other rubric scales lies in the use of three types of scale points:
- End points: these represent the traditional ends of most scales; in this case, the bottom point was ‘absent’ and the top point was ‘fully realised’, i.e. the outcome was either there or not there.
- Middle points: these represent the parts in between the two end points, to capture levels of progress towards an outcome. In the seminar, two points were provided (‘beginning but limited’ and ‘making progress’), but more points can be added to capture greater granularity of progress depending on the scenario.
- Points beyond the end points: these points are the most interesting part of the scale, capturing scenarios that go beyond traditional scales. At the positive end, ‘leading/innovating’ captures cases that go above and beyond the intended outcome, while at the other end ‘opposed’ captures cases where there are active barriers to the outcome being realised (unlike ‘absent’, which assumes a benign scenario).
There isn’t a definitive answer about how many points you need in a scale. Dr Argyrous suggested more points would be useful for monitoring systems in which you need to see granular progress.
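For readers who work with rubric data programmatically, the scale described above can be sketched as an ordered structure. This is a minimal illustration only: the level names come from the seminar example, but the code structure (and the use of a Python enum) is an assumption of my own, not something presented by Dr Argyrous.

```python
from enum import IntEnum

class MaturityLevel(IntEnum):
    """The five scale points from the seminar example, ordered so that
    comparisons reflect progress. Note that 'opposed' sits below the
    traditional bottom end point, and 'leading/innovating' sits above
    the traditional top end point."""
    OPPOSED = -1          # active barriers to the outcome (beyond the bottom end point)
    ABSENT = 0            # traditional bottom end point: outcome not there
    BEGINNING = 1         # 'beginning but limited' middle point
    MAKING_PROGRESS = 2   # 'making progress' middle point
    FULLY_REALISED = 3    # traditional top end point: outcome fully there
    LEADING = 4           # 'leading/innovating' (beyond the top end point)

# Because the scale is ordinal, comparisons behave as expected:
assert MaturityLevel.OPPOSED < MaturityLevel.ABSENT
assert MaturityLevel.LEADING > MaturityLevel.FULLY_REALISED
```

If more granularity were needed for monitoring, additional middle points could be inserted between ‘absent’ and ‘fully realised’ without changing the logic of the two points beyond the ends.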
The collaborative development process
When developing a rubric, it is important to collaborate with stakeholders in the design of the dimensions and the descriptions of each level of the scale. Collaborations can range from a full co-design process through to input from advisory committees or discussions with program managers, but having stakeholder voice in the process ensures the rubric reflects the design of the program and helps build ownership of the rubric itself (something that is especially vital if the rubric is being used as a self-assessment tool). Working in small groups to write definitions for each of the scale points in an organisational maturity scenario, we found that participants had slightly different interpretations that, together, strengthened our resulting definition – highlighting the value of collaboration.
What else do you need to know?
- Getting the level of detail right: Developing a scale point description is not always easy, and performance at a certain level can combine many different areas of activity at once. For example, a rubric evaluating a speaker’s performance might consider eye contact, volume, and body language as markers of good performance, but is it best to assess these together or separately, as three different dimensions? There’s no easy answer. A well-designed rubric balances simplicity with the ability to capture different aspects of performance, breaking up dimensions when necessary (and only when necessary). If you find a performance could fit in more than one of the scale points, you’ve combined too many aspects in one dimension.
- Defining the users and data collection: It is important to consider how the rubric will be used—who makes the assessment (the end user, the program manager or an external assessor) and the unit of assessment (the individual, the program or the organisation) change the way a rubric is designed and defined.
- Placing rubrics in context: A key part of the discussion was the value that can be added to rubrics by placing them in context. Rubrics are great tools for communicating high-level performance but supplementing them with qualitative information can enhance their value and use, for example on:
- the rationale and evidence for the assessment decision
- effective activities and program components
- enablers of change
- barriers to moving up the scale
- future needs
- other relevant outcomes and impacts not captured in the dimension.
- Knowing when to use a rubric: It is important to consider whether a rubric is the best solution for a given problem. If there are already established and reliable quantitative measures of program outcomes and impacts, then rubrics may not add value.
Where to next with rubrics?
The interest in rubrics is justified. Rubrics are an excellent tool for conducting and communicating the results of evaluation, and one of the few tools for making explicit evaluative judgments. A well-designed rubric makes clear the reasoning behind an assessment, and rubrics can be designed to fit a range of operational and cultural contexts. They can even be used to compare programs and assess entire portfolios of programs, addressing a major challenge in government policy (check out Gerard Atkinson’s tips for Comparing Apples and Oranges Using Rubrics on the ARTD blog if you missed his presentation at the AMSRS ACT Conference Evidence, insights and beyond: Enhancing Government policies, programs and services).
While rubrics are a valuable part of the evaluator’s toolkit, they are not a silver bullet. They are an effective data collection and communication tool that can contribute substantially to effective evaluations, and they can work in concert with tools such as program logics and outcomes matrices to harmonise language and enable cross-program comparisons.
However, rubrics are by no means a finished product. There is ongoing discussion and potential for innovation in this space, whether in applying rubrics to different types of programs or in furthering Gerard’s work applying them across portfolios of programs. This interest and innovation show both the potential and the capacity of rubrics to continue to enhance evaluation practice.