Would you sort

  1.10
  1.2
or

  1.2
  1.10
?

I would not know how an OS would treat those unless we assume either mind reading or proper lexicographic order. Why replace precision with vagueness when simply taking care of proper naming would suffice?



Ah yes sorry, 1.10 comes after 1.2 because 10 is bigger than 2 (so in fact different from your example). But assuming your original list is a list of versions (which seems reasonable given the presence of multiple decimal points for some cases), then that’s the order you’d want.

If you have non-integer numbers in your filenames then it won’t give the order you want, but there isn’t going to be a rule that works for all cases.


I was with you until this point, but 1.2 is bigger than 1.10, because 1.2 is a shortened way of writing 1.20 _unless_ you explicitly want these to be version numbers or something like that. The normal expectation would be to treat numbers as, well, mathematical numbers, and not SemVer, especially if we only have one decimal point, don't you think?
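To make the disagreement concrete, here is a minimal Python sketch (not from the thread; the variable names are illustrative) that sorts the same two strings as plain text, as decimal numbers, and as dot-separated version components. Under these assumptions, text and decimal order happen to agree for this pair, and only the version reading puts 1.10 last.

```python
from decimal import Decimal

items = ["1.2", "1.10"]

# Plain lexicographic order: compares character by character.
as_text = sorted(items)

# Decimal order: "1.10" is the number 1.1, which is less than 1.2.
as_numbers = sorted(items, key=Decimal)

# Version order: each dot-separated component is its own integer,
# so (1, 10) comes after (1, 2).
as_versions = sorted(items, key=lambda s: tuple(int(p) for p in s.split(".")))

print(as_text)      # ['1.10', '1.2']
print(as_numbers)   # ['1.10', '1.2']
print(as_versions)  # ['1.2', '1.10']
```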


As I said, the sorting rule won’t always give pleasing results, but it seems to me like a simple and reasonable modification of lexicographic ordering.


It is neither simple, nor reasonable.

1.10, the number, is equivalent to 1.1. It is less than 1.2. You say you want numbers to sort as numbers, but you want 1.10 to be greater than 1.2.

Do you consider '1/4' to be a number? Should it come before or after '1/3'?

I'm guessing that you don't want to sort one character at a time if you encounter one of [0-9]. Instead, you want to group all consecutive [0-9] as a single sortable number. But aren't characters '.', ',', '/', '-' also part of numbers?

What about numbers like ↋, 五, π, B, ⅔, or -1?
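As a rough illustration of that question, the Python sketch below checks what a rule limited to runs of [0-9] would see in a few of these strings, next to the numeric value Unicode assigns to single characters. This is only an assumption about how such a rule might be written, not a description of any particular file manager.

```python
import re
import unicodedata

samples = ["五", "⅔", "π", "-1"]

for s in samples:
    # What a "group consecutive [0-9]" rule would pick up.
    digit_runs = re.findall(r"[0-9]+", s)
    # unicodedata.numeric works on a single character; None means
    # Unicode assigns no numeric value (or the string is longer).
    value = unicodedata.numeric(s, None) if len(s) == 1 else None
    print(repr(s), digit_runs, value)
```

The [0-9] rule finds nothing in 五, ⅔, or π, and for "-1" it sees only the digit run "1", so the sign plays no part in the comparison.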


It doesn’t work for decimals. It also doesn’t work for pi, or most dates. That’s okay. Supporting those cases would require “reading your mind” / trying to guess what the user wants by applying opaque rules. I certainly don’t want that.

Treating consecutive digits as numbers is a simple modification (I still think it’s quite simple) that is easy to understand and supports 99% of real-world use cases.


> But assuming your original list is a list of versions (which seems reasonable given the presence of multiple decimal points for some cases), then that’s the order you’d want.

What level of assumption is expected from the sorting system here? Would it have to process ALL entries of the list to find multiple decimal points and then assume that they are ALL versions and not numbers?

How should this be treated in different locales, where the decimal point is a comma and the thousands separator is a dot? Should the system also take the locale into account? And what about listing a folder on a remote system with a different locale?

What about dates, should that system attempt to sort entries with multiple date-formats (yyyy-mm-dd, dd-mm-yyyy, dd-MMM-yyyy,...)?

The topic is far more complex than this narrow example. If we expect such a system to alter its sorting based on some data format interpretation, there is a risk of misinterpretation which might make the whole list unusable...


It has nothing to do with decimal points. It just looks at any contiguous sequence of digits and treats it as a single character for the purposes of sorting. The decimal point could be any other character and the behavior would be the same.
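A minimal sketch of that rule in Python, assuming "digits" means ASCII 0-9 (the function name `natural_key` is made up for illustration):

```python
import re

def natural_key(name):
    # Splitting on a captured group keeps the digit runs in the result,
    # so parts alternate: text, digits, text, digits, ..., text.
    parts = re.split(r"([0-9]+)", name)
    # Odd positions are always digit runs; compare those as integers.
    return [int(p) if i % 2 else p for i, p in enumerate(parts)]

names = ["file10.txt", "file2.txt", "file1.txt"]
print(sorted(names))                   # ['file1.txt', 'file10.txt', 'file2.txt']
print(sorted(names, key=natural_key))  # ['file1.txt', 'file2.txt', 'file10.txt']

# The separator plays no special role: "." and "_" behave the same way.
print(sorted(["1.10", "1.2"], key=natural_key))  # ['1.2', '1.10']
print(sorted(["1_10", "1_2"], key=natural_key))  # ['1_2', '1_10']
```

Because text and digit runs always alternate in the same positions, an integer is never compared against a string.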


So only whole numbers are sorted as numbers then.

Decimal numbers are treated as strings and will have a completely different order, with digits after the decimal point sorted differently to whole numbers without fractions?

Or do you mean that every run of contiguous digits within the same string is treated as an individual whole number?

Depending on the decision, either lists of decimal numbers or lists of version numbers will be sorted wrong.

--> This could be covered by adjusting the logic based on the number of decimal points.

And the logic complexity keeps increasing, up to an arbitrary point of "no, this will not be considered", resulting in an unpredictable user-experience of sorting...


>Depending on the decision, either lists of decimal numbers or lists of version numbers will be sorted wrong.

Yes. I don’t see why this is a big deal.

I didn’t suggest adjusting the logic based on the number of decimal points.


Ah ok.

I understand that you found your perfect trade-off for sorting based on longer considerations. But it will be difficult to communicate such a concept to a user.

Applying partial rules to improve sorting in one direction is not a lossless activity. It actually makes the UX worse in other scenarios: the user is first guided to assume a certain behavior, but then learns that this expectation is broken in adjacent scenarios (which is more or less the bottom line of that article to begin with).

In the end it'll be just "another standard" for sorting [0]

[0] https://xkcd.com/927/


> But it will be difficult to communicate such a concept to a user.

This isn't a prerequisite, since the existing naive character sort approach is not communicated either. In fact, it's almost universally unexpected by any user who hasn't written a naive string sort. Apple doesn't do this, and I very much did not need it communicated to me why 10 was coming after 2, because that's what everyone, who's not a programmer, expects.

As a litmus test, go ask some people who are not programmers, without loading the question beyond "here are some files, how would you expect them to be displayed in a list?". Show the lists side by side. The answer should not surprise you.


I consider 八 to be a whole number.


There is a rule that works for all cases. It's lexicographical sorting.

Simple. Consistent. Easy to manipulate to get what you want.
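One way to read "easy to manipulate" is zero-padding the numbers in the names so that plain lexicographic order matches numeric order. A small Python sketch of that idea, where the helper `zero_pad` and the width of 3 are arbitrary choices for illustration:

```python
import re

def zero_pad(name, width=3):
    # Left-pad every run of ASCII digits with zeros to a fixed width.
    return re.sub(r"[0-9]+", lambda m: m.group().zfill(width), name)

names = ["chapter10.txt", "chapter2.txt"]
print(sorted(names))                       # ['chapter10.txt', 'chapter2.txt']
print(sorted(zero_pad(n) for n in names))  # ['chapter002.txt', 'chapter010.txt']
```

This only works as long as no number needs more digits than the chosen width.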


We just discussed a situation where lexicographical sorting doesn’t work. Adding in a rule to treat consecutive digits as one number doesn’t significantly complicate the logic and makes sorting work for a major additional use case. It doesn’t magically fix every case but it fixes a common one with minimal downsides.



