Adjusting ratings based on raters' tendencies
In our system, we have students and teachers. Some students retain well and are satisfied with our service but perpetually give their lessons low ratings (1 or 2 on a 5-point scale).
We'd like to reward teachers who receive high ratings, but we also want to control for students who consistently rate their lessons 1 or 2 out of 5 and so drag down the teacher's average (and median) lesson rating.
Does anyone know how this is usually handled?
Max Fridman Have you thought about rewarding for "points above median rating" instead of absolute points? Over time you can get better at predicting what a student should rate a class and reward "points above expected rating".
These approaches are nice in the abstract but you'll have the challenge of explaining it to teachers. If that's a big challenge for you, you can do something more naive but easier to explain. For example, take a global average of class ratings and reward teachers whose ratings beat the average. Or make a leaderboard and reward the top teacher each month.
Finally, all of these approaches incentivize gaming student ratings, but I'm guessing you know more about that than I do. 😉
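To make the "points above median rating" idea concrete, here is a minimal Python sketch. It assumes the baseline is each student's own median score (one possible reading of the suggestion), and the `ratings` tuples and the function name are made up for illustration.

```python
from collections import defaultdict
from statistics import mean, median

# Hypothetical sample data: (student, teacher, score).
# s1 is a habitually low rater, s2 a habitually high one.
ratings = [
    ("s1", "t1", 2), ("s1", "t2", 1), ("s1", "t1", 2),
    ("s2", "t1", 5), ("s2", "t2", 4), ("s2", "t2", 4),
]

def points_above_student_median(ratings):
    """Per teacher, average of (score - the rating student's median score)."""
    by_student = defaultdict(list)
    for student, _teacher, score in ratings:
        by_student[student].append(score)
    # Assumption: a student's own median stands in for their "expected" rating.
    baseline = {s: median(scores) for s, scores in by_student.items()}

    by_teacher = defaultdict(list)
    for student, teacher, score in ratings:
        by_teacher[teacher].append(score - baseline[student])
    return {t: mean(diffs) for t, diffs in by_teacher.items()}

adjusted = points_above_student_median(ratings)
```

In this toy data both teachers have the same raw average (3.0), but t1 comes out above each student's baseline and t2 below it.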
One thing you could try is to create a ratio score similar to net promoter score. Treat a 5 as a positive, a 4 as a neutral and a 1 - 3 as a negative. Ratios tend to smooth out idiosyncratic results a bit. But they are also hard to explain.
select
    teacher,
    -- multiply by 1.0 so the ratio is not truncated by integer division
    sum(case when [score] = 5 then 1 when [score] = 4 then 0 else -1 end) * 1.0 / count(*) as teacher_promoter_ratio
from [table_of_scores]
group by teacher
Another method would be to filter out repeat low scorer scores. Perhaps with something like the following pseudocode that would filter out any students with more than 3 reviews and an average review score of 2 or less. This feels more arbitrary, but it is very easy to explain - "We filter out serial low scorers", and reasonable.
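select
    teacher,
    avg(score * 1.0) as avg_score
from [table_of_scores]
where student in (
    -- keep students with 3 or fewer reviews, or an average score above 2
    select student
    from [table_of_scores]
    group by student
    having count(*) <= 3 or avg(score * 1.0) > 2
)
group by teacher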
My first instinct is to normalize each student against themselves, and then calculate ratings for teachers after that. I'm imagining a student who only rates 1s and 2s, with their "2s" being effectively equivalent to another student's 5s. That doesn't seem exactly right though.
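A rough sketch of what normalizing each student against themselves could look like, using per-student z-scores. The data and function name are hypothetical, and skipping students whose scores never vary is my own assumption about how to handle a zero standard deviation.

```python
from collections import defaultdict
from statistics import mean, pstdev

# Hypothetical sample data: (student, teacher, score).
ratings = [
    ("s1", "t1", 2), ("s1", "t2", 1),   # harsh rater
    ("s2", "t1", 5), ("s2", "t2", 4),   # generous rater
    ("s3", "t1", 3), ("s3", "t2", 3),   # never varies
]

def teacher_scores_from_student_z(ratings):
    """Convert each score to a z-score within its student, then average per teacher."""
    by_student = defaultdict(list)
    for student, _teacher, score in ratings:
        by_student[student].append(score)
    stats = {s: (mean(v), pstdev(v)) for s, v in by_student.items()}

    by_teacher = defaultdict(list)
    for student, teacher, score in ratings:
        mu, sigma = stats[student]
        if sigma == 0:
            # Assumption: a student who always gives the same score
            # carries no relative signal, so their ratings are skipped.
            continue
        by_teacher[teacher].append((score - mu) / sigma)
    return {t: mean(z) for t, z in by_teacher.items()}

normalized = teacher_scores_from_student_z(ratings)
```

Here the harsh rater's 2 and the generous rater's 5 both count as one standard deviation above that student's own mean. This only compares teachers fairly if students see a roughly similar mix of teachers, which may be part of why it doesn't feel exactly right.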
Check out this blog post, I'm not sure it fits your needs either, but I could imagine a solid implementation of this getting you closer to the truth of how good teachers _actually_ are.
To the point about "dragging down" ratings, it sort of depends on who the rating is for. If it's for prospective students, and that population of prospective students contains an average amount of "1s and 2s" raters, then a simple average + histogram will give them a sense of their true propensity to also rate it a 1 or a 2. If you're looking for a fair way to rank / compensate teachers, then it seems like across a high enough sample size, the 1s and 2s raters would balance out. If the problem is low sample size and confidence in early ratings, then I think the blog post above might work for you.
It seems to me that a rating given by a student does not reflect their likelihood to retain or be satisfied with the service.
To answer your question though, if we assume the student rating distribution to be normal, you can evaluate the average rating per student and look at the number of standard deviations from the mean.
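One way to read this, sketched in Python: compute each student's average rating, then express it as a number of standard deviations from the mean of all student averages. The data, the function name, and the choice of the population standard deviation are assumptions on my part.

```python
from collections import defaultdict
from statistics import mean, pstdev

# Hypothetical sample data: (student, score).
student_scores = [
    ("s1", 1), ("s1", 2),
    ("s2", 4), ("s2", 5),
    ("s3", 4), ("s3", 4),
    ("s4", 5), ("s4", 4),
]

def rater_z_scores(student_scores):
    """Z-score of each student's average rating relative to all student averages."""
    by_student = defaultdict(list)
    for student, score in student_scores:
        by_student[student].append(score)
    averages = {s: mean(v) for s, v in by_student.items()}

    mu = mean(list(averages.values()))
    sigma = pstdev(list(averages.values()))
    if sigma == 0:
        # Every student has the same average, so nobody stands out.
        return {s: 0.0 for s in averages}
    return {s: (avg - mu) / sigma for s, avg in averages.items()}

rater_z = rater_z_scores(student_scores)
lowest_rater = min(rater_z, key=rater_z.get)
```

A strongly negative value marks a student whose typical rating sits well below that of other students. Where to draw the cutoff is a judgment call, and the normality assumption is a rough one for a 5-point scale.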
For dealing with students who habitually rate low, I wonder if you could create rater scores and down-weight those with consistently low scores, or with low average scores and low variability in scores. If your intent is to make scores count more for those who appear to rate honestly and thoughtfully, then weighting by something like variability is probably a good idea and will catch both the students who always rate low and those who always rate high.
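As a minimal sketch of the variability weighting, each rating below is weighted by the standard deviation of that student's scores. Using the raw standard deviation as the weight is just one hypothetical choice, and the data and names are made up.

```python
from collections import defaultdict
from statistics import pstdev

# Hypothetical sample data: (student, teacher, score).
ratings = [
    ("s1", "t1", 1), ("s1", "t2", 1), ("s1", "t1", 1),  # always rates 1
    ("s2", "t1", 5), ("s2", "t2", 3),
    ("s3", "t1", 4), ("s3", "t2", 2),
]

def variability_weighted_ratings(ratings):
    """Teacher averages where each rating is weighted by its rater's score spread."""
    by_student = defaultdict(list)
    for student, _teacher, score in ratings:
        by_student[student].append(score)
    # Assumption: weight = population standard deviation of the student's scores,
    # so a student who always gives the same score gets weight 0.
    weight = {s: pstdev(v) for s, v in by_student.items()}

    weighted_sum = defaultdict(float)
    weight_total = defaultdict(float)
    for student, teacher, score in ratings:
        weighted_sum[teacher] += weight[student] * score
        weight_total[teacher] += weight[student]
    # Teachers rated only by zero-weight students are left out.
    return {t: weighted_sum[t] / w for t, w in weight_total.items() if w > 0}

weighted = variability_weighted_ratings(ratings)
```

Note that a student with a single rating also gets weight 0 under this scheme, so in practice you would probably want a floor on the weight or a minimum number of ratings before it applies.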