Hacker News

I don't really like star systems because I spend way too much time deciding if something is 3 or 4 stars, or 4 or 5 stars.

I much prefer textual descriptions; for example "Terrible", "Don't like", "Okay", "Like", "Favourite". That has the same number of options as a 5-star rating system, but choosing between adjacent labels is much easier than choosing between 3/4 or 4/5 stars IMHO.

I don't really know of any system that uses this, except this one music player I wrote myself (which has "Crap", "Meh", "Okay", "Super").



Not only that, but I don't trust myself to remember how I calibrated myself to the rating system in the past. If I'm more stingy with the 5-star ratings now than I was 2 years ago, then I'm going to skew things in a way that the system probably can't understand.

In other words, it not only expects me to define my own scale, it also expects me to stay consistent with it over time. I don't think that's realistic.

I honestly wouldn't mind if, from time to time, a service asked me to stack-rank movies, because that's a question I feel confident I can answer. Give me two movies and ask whether I prefer A, prefer B, or have no clear preference. Or give me 5 movies and have me put them in order from best to worst, allowing me to say two of them tied or to exclude some. Maybe the UI on this would be too weird for average users, though.
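A win-fraction ordering is one minimal way to turn such pairwise answers into a ranking. This is only a sketch: `rank_from_pairs` and the movie titles are made up for illustration, and a real system would use something more robust like Bradley-Terry or Elo.

```python
from collections import defaultdict

def rank_from_pairs(comparisons):
    """Order titles by the fraction of pairwise comparisons they won.

    comparisons: list of (winner, loser) tuples; a tie can be recorded
    as two entries, one in each direction.
    """
    wins = defaultdict(int)
    total = defaultdict(int)
    for winner, loser in comparisons:
        wins[winner] += 1
        total[winner] += 1
        total[loser] += 1
    return sorted(total, key=lambda t: wins[t] / total[t], reverse=True)

prefs = [("Alien", "Avatar"), ("Alien", "Gravity"), ("Gravity", "Avatar")]
print(rank_from_pairs(prefs))  # ['Alien', 'Gravity', 'Avatar']
```

Win fraction breaks down when titles have very different numbers of comparisons, which is exactly where the fancier pairwise models earn their keep.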


Relative individual consistency is not a necessary component in these systems, except perhaps in the most simplistic. It’s not exactly difficult to introduce temporal normalization / regularization into these models.
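As a toy illustration of what temporal normalization can look like, z-scoring a user's ratings within each time period makes scores comparable even if the user drifted from generous to stingy. The period granularity (here, years) and the guard for constant ratings are arbitrary choices for the sketch.

```python
import statistics

def normalize_by_period(ratings):
    """Z-score a user's ratings within each time period, so drift in
    how generous they were doesn't skew the aggregate signal.

    ratings: dict mapping period (e.g. year) -> list of raw scores.
    Returns dict mapping period -> list of normalized scores.
    """
    out = {}
    for period, scores in ratings.items():
        mean = statistics.mean(scores)
        stdev = statistics.pstdev(scores) or 1.0  # guard: constant ratings
        out[period] = [(s - mean) / stdev for s in scores]
    return out

# A user who drifted from generous (2019) to stingy (2021) produces
# identical normalized signal in both years:
print(normalize_by_period({2019: [5, 4, 5, 4], 2021: [3, 2, 3, 2]}))
```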

I don’t have any specific knowledge of Netflix per se, but I suspect the granularity of a 5 point rating system just proved to be superfluous. Even a +1 / -1 rating is probably sufficiently proxied by a simple measure of completion percentage (appropriately normalized).
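One naive way such a proxy could work, purely illustrative: thresholding at the title's own median completion is an assumption of this sketch, not anything Netflix is known to do.

```python
def thumb_from_completion(watched_frac, title_median):
    """Proxy a +1/-1 rating from completion percentage, normalized
    against how far viewers typically get through this title, so a
    mid-season drop-off in a slow-burn series isn't over-penalized
    relative to quitting a 90-minute movie at the same percentage.
    """
    return 1 if watched_frac >= title_median else -1

# Viewer quit at 30% of a movie most people finish:
print(thumb_from_completion(0.30, title_median=0.95))  # -1
```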


> Even a +1 / -1 rating is probably sufficiently proxied by a simple measure of completion percentage (appropriately normalized).

I'm just hoping they're not making the same mistake here that they're making in the UI: showing movies I watched to the end as "continue watching", because I closed them on the end credits roll.


I've always thought the best design for a star system would be a pair of positive/negative buttons, where the rating you give is the truncated logarithm of the number of times you press the button. So you can like or dislike things a little very easily, but you have to expend 10x as much effort to really like/dislike things, and 100x as much effort to express absolute adoration/hatred.

Effectively, this would be a "proof of emotional work": it's not a measure of your own subjective experience of your reaction to the product, but rather measuring your level of objectively-observable motivation to [dis]recommend the product. Which, I would think, would neatly get around all the arguments about "what it means" for something to be 3/4/5 stars—provoking a large number of people to click a like (or dislike!) button 10 times seems like a pretty good predictor of some objective property that the product has. (Whether that property is "quality" is up for debate.)


"Costly signaling theory from ecology posits that signals will be more honest and thus information will be accurately communicated when signaling carries a nontrivial cost. Our study combines this concept from behavioral ecology with methods of computational social science to show how costly signaling can improve crowd wisdom in human, online rating systems. Specifically, we endowed a rating widget with virtual friction to increase the time cost for reporting extreme scores. Even without any conflicts of interests or incentives to cheat, costly signaling helped obtain reliable crowd estimates of quality. Our results have implications for the ubiquitous solicitation of evaluations in e-commerce, and the approach can be generalized and tested in a variety of large-scale online communication systems."

https://www.pnas.org/content/116/15/7256
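A toy version of the "virtual friction" idea, with time cost growing with distance from the scale's midpoint. The numbers here are illustrative, not the paper's actual widget parameters.

```python
def report_delay(score, center=3, per_step_ms=400):
    """Virtual friction: extreme scores cost more time to submit.
    Delay grows linearly with distance from the scale midpoint.
    """
    return abs(score - center) * per_step_ms

# Submitting a 1 or a 5 costs more than submitting a 3:
print([report_delay(s) for s in range(1, 6)])  # [800, 400, 0, 400, 800]
```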


Yep, as much as people hate it, when it comes to recommendations, up/down is all the data you need. Trying to build a system on top of 5 stars just adds unnecessary complexity, and the reality was that people used the star system differently, making it even harder. An up/down thumb is explicit and cleaner to work with.


The problem with the up/down system for me is not my own ability to like/dislike specific titles, but more the fact that Netflix no longer displays the average of all user votes. Sure different people used the 5 star system in different ways, and there were some who may have misused it by giving poor ratings to things they never intended on watching, but it was a great signal to me for the extremes.

Scenario: I'm considering some odd-looking sci-fi movie I'd never heard of before. Ratings between 2-4 stars might not tell me much, but titles with only one star were very reliably terrible movies. Now Netflix happily recommends any and all sci-fi titles, saying they are a "98% match" for me! Sure, they match by category, but when the movie is a low-budget dumpster fire I no longer have the instant signal that the previous rating system gave me.


IIRC, Netflix never showed the average, but rather the rating they predicted you would give it, taking into account your previous viewing and rating.


I agree that thumbs up/thumbs down is probably better for encoding appreciation than a 0-5 scale. But for building a recommendation engine, a single bit of information is not enough. If I and everyone else flag shows we are not interested in viewing as 'thumbs down', then all shows end up with terrible ratings. Similarly, 'thumbs down' for a great movie I've already seen, thanks, so stop screaming it at me. Thumbs up and thumbs down are certainly not all the data you need, as Netflix as it stands today demonstrates.


Netflix used to define the ratings as "Hated It", "Didn't Like It", "Liked It", "Really Liked It", and "Loved It". I didn't find it difficult to memorize.

What's to stop you from applying whatever textual descriptions you want to the numbers?


Because I might apply different ones, making them useless in aggregate: I might assume anything I like has to be a 4 or 5, while you rate things you like as low as a 2 and reserve 1 for "don't watch it".


This is why the Uber/Lyft rating system is effectively useless. Five stars is basically "was not unsafe", and four and below indicate significant safety, cleanliness, rudeness, or other problems.

If each star rating had a textual description of what it meant, and drivers didn't have to maintain something like a 4.3 (or whatever it is) in order to stay on the platform, the ratings would actually mean something.


Then you have my girlfriend who rates everything a 3/5 because it's average or to be expected. 4/5 for particularly good service. And nobody gets a 5/5.

I think star systems are a waste of time.


This is why I never fill out my car dealership's service department surveys. I'm not going to give a normal oil change visit 10/10 in every category because that's simply bullshit. Giving anything less gets my service advisor punished for not delivering excellent service.

I'm honest with him and tell him exactly why I won't be filling it out. If I do ever fill one out, he'll know exactly why before I send it in.


I usually do the same, but for Uber I rated decent drivers 5/5 because I'd been told everyone does this. Hopefully she does the same.


That’s still the way the star ratings appear when I watch Netflix using my 2013 Samsung Blu-ray player. For the first few years of watching Netflix, I would religiously rate films and TV shows (even those I’d already seen in the cinema) to help its recommendation engine, but it didn’t really help much.


My 5-star system:

*****: I want to give this more than 5 stars

****: This is pretty good, but not MORE than 5 stars

***: I've enjoyed this.

**: I need to keep this for completist reasons

*: Delete this.

Oh wait, that's my iTunes rating system


goodreads has textual descriptions for its star ratings of "did not like it", "it was ok", "liked it", "really liked it", "it was amazing".


Netflix should have a "So bad it's good" rating. ;)


We need to invent a quantum ratings system.


taste.io does Awful/Meh/Good/Amazing



