By Gary Deel, Ph.D., JD
Faculty Director, School of Business, American Public University
Imagine a patient visits her doctor for a routine physical exam. At the conclusion of the exam, she and her doctor engage in the following dialogue:
Doctor: “OK, I’ve finished the exam and in terms of your overall health you earned 197 points.”
Patient: “I’m sorry…points?”
Doctor: “Yes…we have a new scoring tool that helps us to assess the health of our patients.”
Patient: “OK…so I earned 197 points. Out of how many?”
Doctor: “Out of 328.”
Patient: “Is that good?”
Doctor: “Well, there are 36 categories in the scoring tool. I can go over all of them with you, but it will take a while.”
Patient: “36 categories? Are they all equally important?”
Doctor: “No, definitely not! And many of the categories are somewhat redundant. For example, there is a category for heart rate measured at the wrist and another category for heart rate measured at the neck. But these two categories obviously measure the same thing.”
Patient: “Then why are there two categories?”
Doctor: “I’m not sure. I didn’t design the tool. But if you want, we can just go over the most important pieces.”
Doctor: “To start off, you received 81 out of 82 points for your stress test.”
Patient: “Why not 82 out of 82?”
Doctor: “Perfect scores in that category are virtually unattainable. You’d have to be an Olympic track athlete to score an 82.”
Doctor: “Now on the bloodwork. You scored 33 out of 41.”
Patient: “What was wrong there?”
Doctor: “Well, the 41 points are broken up into sub-categories, which are poor/poor, poor/good, good/good, good/great, and great/great. Your bloodwork appeared to be good/great, but not quite a great/great. And good/great is a 33.”
Patient: “What is the tangible difference between a good/great and great/great?”
Doctor: “It’s just my subjective opinion as your doctor. I felt like it was a good/great, but not a great/great.”
If you were to experience something like this with your doctor, I am willing to bet that you might walk out feeling worried and confused, both about your own health and about the metrics used to assess it.
Start a degree program at American Public University.
Fortunately, doctors are held to higher standards; they are expected to provide clear, objective and unambiguous evidence for their assessments and opinions.
Unfortunately, however, many instructors in higher education do not conform to these same expectations with the grading tools (i.e., rubrics) they use to assess college coursework. As a result, students are left frustrated and confused about how they are being measured, what they are doing wrong, and how they can improve.
I have taught at eight higher learning institutions over the course of more than a decade. In that time, I have worked with rubrics for undergraduate and graduate classes, at brick-and-mortar and online institutions, using quantitative and qualitative assessments across a wide variety of disciplines.
What I have observed throughout this experience is that many rubrics are far more complicated and confusing than they need to be. Consequently, they serve more to distract from the learning process than to support it.
Here are three simple recommendations for educators designing and utilizing rubrics to simplify and align them with the most important goal of educators everywhere: to facilitate learning and to help students improve.
Number 1: The Point Total for All Rubrics Should Be 100
The point total used in the exaggerated doctor’s visit scenario was not fictitious. I have actually seen rubrics based on 328 total points. I have also seen rubrics based on 12 points, 29 points, 77 points, 114 points and many other numbers in between.
Rubrics can be complicated enough for legitimate reasons. There is simply no need to add a level of complexity to the total score such that students cannot easily understand how they did on assessments, even after they know their scores.
In our fictitious physical exam scenario, the doctor said his patient earned 197 points out of a possible 328. But what does that actually mean? It means her score was about 60%. Working that out is straightforward: divide 197 by 328 and then multiply by 100.
But this raises the question: Why in the world didn’t the designer of the rubric just structure the point values on a 100-point scale so that the final score is easily understood without any mathematical conversion?
Everyone is familiar with the concept of 100% and its fractions. For example, if I tell you that you scored 70% on a math test, and the average in the class was 90%, you have an immediate, intuitive sense — without any conversions required — of what that means in terms of how well you did on the test and where you stand in the class.
Note too that all components of a rubric can still maintain their proportionality on a 100-point system. For example, the doctor in our scenario said the stress test was worth a total of 82 possible points out of 328. If the doctor simply reduced the scoring to a total of 100 points, the stress test would still maintain its relative weight, but it would be worth 25 points instead of 82, because 82 is 25% of 328.
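The arithmetic behind this rescaling can be sketched in a few lines of Python. The component names and point values below are the ones from the doctor scenario; the helper function name is my own invention, not part of any rubric tool.

```python
def rescale_to_100(components):
    """Rescale raw component point values so they sum to 100,
    preserving each component's relative weight."""
    total = sum(components.values())
    return {name: points / total * 100 for name, points in components.items()}

# The stress test was worth 82 of 328 raw points; the remaining
# 205 points stand in for all the other categories combined.
raw_rubric = {"stress test": 82, "bloodwork": 41, "other categories": 205}
scaled = rescale_to_100(raw_rubric)
print(scaled["stress test"])  # 25.0 -- because 82 is 25% of 328

# A raw score converts the same way: 197 out of 328 is about 60%.
print(197 / 328 * 100)
```

The same division-by-total step handles both cases: a component keeps its relative weight, and a raw score becomes an immediately readable percentage.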
Therefore, any rubric that deviates from the intuitive 100-point scoring system needlessly complicates the learning process.
Number 2: Scaling for Rubric Components Should Be Simple and Minimal
It’s also important to address the scaling for the rubric components. For example, if a rubric contains a category such as “Writing Quality,” how many possible scores should there be and how should they be defined? This is an area with more potential room for variety than the subject of total points.
However, it’s important to note that scales should be simple and minimal wherever possible. In other words, fewer is usually better.
In the physical exam scenario, you probably got a little dizzy from the doctor’s explanation of “poor/poor,” “good/good,” and “good/great.” Regrettably, this kind of convoluted grouping occurs all the time in higher education rubrics.
Often, there are either too many scale options or the options are not clearly distinguished from one another. Where such ambiguity exists, instructors must use their best judgment in assigning scores. But if a student then challenges a score, the instructor has no objective basis on which to defend the decision.
In my opinion, there is no hard and fast rule for scaling, but rubrics should not be so opaque that they invite subjectivity. The most basic conceivable scale is two levels deep: “met” and “not met.”
There may well be good reasons to expand beyond this minimum. But if a rubric grows to where assessors cannot clearly articulate the difference among the levels and their criteria, the rubric has become bloated and needs to be downsized.
One last point on this subject: no scale level of any component should ever be viewed as unattainable. Just as our fictitious doctor told his patient that only Olympic track athletes could earn top scores on the stress test, instructors often view certain rubric levels, especially those so poorly defined as to allow for subjective interpretation, as beyond the reach of all but the most prodigious students. If the upper fringes of a rubric are seen as accessible only to a tiny elite minority, then the rubric ought to be revised to better fit the majority of students it serves.
Number 3: Rubric Components Should Be Clear and Distinct
Finally, it’s important to think about the rubric components themselves. How many are there? How many are needed? Are any redundant?
For example, in our physical exam scenario the doctor mentioned two categories for heart rate that are nominally different but measure the same thing. Yet there was no clear reason for the two measurements.
Similarly, if a rubric for a writing assignment has a component for grammar, a component for APA style and a component for writing quality, are all three really necessary? Perhaps they are, but if so, are the details that each component covers — and the ways in which they are distinct from one another — clear to instructors and students alike?
This is another area where there may be a good-faith reason for a variety of strategies with different rubrics. But, just as I cautioned about rubric scaling, I urge instructors to carefully consider the number of components they use, whether those components are truly necessary and relevant, and whether there is any potential for confusion about which components measure which elements of a student’s work. The rule of thumb should be that if the rubric is so ambiguous as to allow subjectivity to emerge, the rubric needs to be further simplified.
Better Rubrics Make for Better Learning
This article by no means covers all the potential challenges an instructor might face in devising rubrics. Each application requires careful attention to the purpose and function of the assignment in order to design the best possible rubric.
However, if instructors at least avoid the pitfalls we have discussed, they can make great strides in improving the learning experience for their students. Keeping with the theme of our doctor scenario, perhaps there should be a Hippocratic oath for rubric designers: “First, do no harm to students with needlessly complicated rubrics.”
About the Author
Dr. Gary Deel is a Faculty Director with the School of Business at American Military University. He holds a J.D. and a Ph.D. in Hospitality/Business Management. He teaches human resources and employment law classes for American Military University, the University of Central Florida, Colorado State University and others.