Why? All we know is that both are not-numbers. It's just a label for something that cannot usefully be further identified.
I'm pretty sure mathematicians can come up with many different not-numbers that have to share the same label, with a rather slim chance of being equal ...
They don't have to share the same label. 64-bit floats have roughly 2^53 distinct NaN bit patterns. Even Float16 has over 2000, which should be more than enough for all the indeterminate forms.
Collisions will come up. If you have some allocator for NaN, it will have to start reusing values. The allocator could be inefficient. How do you handle threads? Does each thread have its own NaN-allocating counter? What if threads communicate numeric results to each other and have NaN-counters close in value?
The underlying problem is that a system of calculation which propagates error symbols up the expression tree to indicate error is mixed with a system of Boolean calculation which has no such symbol. The bad calculation bubbles a NaN up to the level of the comparison. There, the not-a-number gets eaten and becomes a Boolean true or false --- rather than becoming a not-a-truth and continuing to bubble up.
There is no satisfactory way to plug NaNs into an expression that produces a clean two-valued Boolean truth with no error indication. You must separately test for the NaN.
If X and Y are the same label, they should compare equal to satisfy the Law of Identity.
If you perform two calculations whose values come up equal, but you didn't check whether both produced the same NaN, that is your problem.
Suppose you are looking for the result of the two calculations being unequal, and they produce NaN (or at least one of them does). That's also a false positive. There is no way to get around checking for NaN.
Nothing which cheerfully concludes that two bitwise-identical operands are different can be called equality.
The equality operation should only apply its own specific logic to a pair of operands which fail the bitwise test.
(That doesn't mean all bits have to be looked at, like if an object has padding bits that don't contribute to the value.)
In floating-point, as in IEEE 754, positive and negative zero still compare equal; they fail the bitwise test, but the numeric logic then concludes they are the same number.
>Nothing which cheerfully concludes that two bitwise-identical operands are different can be called equality.
You are asserting that as if it were some kind of law of nature, but I'm afraid that's just your personal misconception. The IEEE754 equality function is defined such that it can sometimes be false when applied to identical operands. There are very good reasons why that is the case, just like there are very good reasons why NULL = NULL is false in SQL.
See upthread: law of identity. That's even higher than nature.
> The IEEE754 equality function is defined such that it can sometimes be false when applied to identical operands.
Cool! So, OK, (1) don't promote that function into the fundamental equality operator of programming languages; put it in a library. (2) provide a real equality in parallel.
I should be able to do things like this:
double x = find_number_in_list(list, number, not_found_nan);
if (x == not_found_nan) { /* wasn't found */ }
I see you are referring to "identical operands"; that presupposes an identity relation under which entities are identical to themselves. Otherwise there is no foundation for your sentence.
We can have all sorts of weird functions whose properties are useful insofar that they remove complexity or verbosity from specific scenarios in which they are used.
For instance, it's useful to have a numeric equality operator which tests that two values are within a small epsilon of each other and reports true. That's not even an equivalence relation, though; near(A, B) and near(B, C) do not imply near(A, C).
It would be pretty irresponsible to put this behavior into the principal equality, like the one and only built-in == operator of a language or what have you.
> there are very good reasons why NULL = NULL is false in SQL.
I don't know much about SQL, but I understand why, in a join of two tables based on some field equivalence, you wouldn't want to include the entries where the fields are NULL.
However, there is no conflict here with the Law of Identity.
In a database, a NULL field is an external representation stored in a table. We can define a model of computation whereby when the database record is internalized for processing, each NULL field value maps to a unique object. This is very similar to the concept of an uninterned symbol in Common Lisp:
(eq '#:null '#:null) -> NIL
The #: syntax, whenever scanned, creates a new symbol object. There are two different objects in the above eq call which only look the same when printed, because they have the same name. This idea could be used to implement null database field values.
However, we can retain the idea that if we pull a null value from a specific field from a specific record, that null value is equal to itself. We do not have to throw the Law of Identity under the bus, in other words:
(let ((sym '#:null)) (eq sym sym)) -> T
If we do the following in SQL, I expect the table to be replicated in the selection:
select * from X where X.Y = X.Y
Records where Y is NULL should not be missing. If A and B are different records in this X table, such that A.Y and B.Y are NULL, then A.Y = B.Y can be false; that is perfectly okay.
No one was talking about bitwise equality, though. Bitwise equality means positive and negative zero do not compare as equal, and NaNs compare as equal only when their bit patterns match, in ways that vary according to exactly where those NaNs came from. There may be times when that is useful, but it would be uncommon.
Usable and reliable is not the same thing as a good design or a good idea.
The fact that Windows reserves the PRN file name in every directory is probably usable and reliable to someone. That doesn't mean it's a good design for providing access to a printer device.
To test a set membership property of an object, you want a predicate function which takes the object as its only argument. For instance isnan(x).
This predicate can be efficiently implemented without relying on a bastardized equality operation.
Consider that the compiler cannot optimize A != A into false, or A == A into true, because NaN values can occur at run time.
While you might not explicitly write A == A into your code, it could occur implicitly due to some macro expansion, inline expansion or other code transformation.
I think GCC with -ffast-math gets rid of this NaN rule and does such optimizations anyway. (Your code just has to avoid generating NaNs so that the optimizations are valid.)
If it has no meaning, then the comparison should have an unspecified result, not true. Otherwise it has a meaning: the meaning of producing true! However, it is a poorly considered meaning which requires a thing to be different from itself.
Since NaN values are valid representations which play a role in the system, and can be used in operations (such as comparing a NaN to a number, which is false), each of them must compare equal to itself.
If the bits on the left are the same as the bits on the right, the comparison is true. Distinct NaN bit patterns are unequal. Simple as that.
Whatever I would do, I would make sure that a comparison observes the Law of Identity.
(I'd rather not have Inf and NaN at all; operations should just generate an exception if they can't come up with a number.)
You have a fairly complex logic being stuffed into a binary operation — NaN == NaN and NaN != NaN are both irresponsible. The same with comparing to INF. The correct answer is that boolean operations don’t successfully represent the possibilities, and shouldn’t be offered in the first place.
That is a valid view. NaN is supposed to propagate an error value, and that concept should continue through Boolean expressions. So that is to say, there has to be a NaT (not a truth) value which results instead of true or false, if a NaN is involved in a relational expression.
Problem is, that is impractical. Programming languages tend to have two-valued Boolean logic baked into their DNA; it's implicit in if/then/else conditionals, which will have to treat NaT as false --- back to square one.
Programming languages with two-valued Booleans are not going to accommodate such a thing (it is not as easy to sneak in as NaN into floating-point). Even if they were to, programmers are going to be reluctant to turn every if/then situation into a three-way switch.
Besides the case of floating-point numbers with NaNs there are many other cases of partial order relations.
The problem is that most people know how to handle only total order relations, for which only 6 operations with Boolean results can be defined (equal and not-equal, less and greater-or-equal, greater and less-or-equal).
While it is possible to handle partial orders using ternary logic, it is easier to handle them with operations with Boolean results, so this is what all programming languages either provide or they should provide.
The difference is that for partial orders you no longer have only 6 operations with Boolean results (3 plus their negations), but you have 14 operations (7 + their negations).
One operation pair is ordered / unordered (ordered means either equal or less or greater).
The other 6 pairs correspond to the 6 well known relational operators from the total order, which now no longer are each other's negation, together with their 6 negations, which now include the possibility that the 2 operands are unordered.
For example, corresponding to the negation pair less and greater-or-equal from the total order, for a partial order there are 2 negation pairs, less and not-less (i.e. greater, equal or unordered) and greater-or-equal and neither-greater-nor-equal (i.e. less-or-unordered).
Programmer education should place more stress on the relational operators possible for partial order relations, because they appear in many situations, both in FP computations and in databases. Handling partial orders is only slightly more complex than handling total orders, but many people are not accustomed to it.
Read that Chesterton quote somebody posted above. It sounds like you need to study the standard a bit more, because you don’t seem to understand why NaNs are the way they are. You’re arguing from principles that don’t apply.
I don't know enough to say; all I remember from the few interactions I've had with SQL over my long programming career is that it's intellectually unsavory as a whole.
You must like dividing by zero and never knowing about it. There's a reason why NaN blows things up. It's by design so that math errors don't propagate everywhere.
A != A if A is a NaN: that's pretty sleazy.
https://en.wikipedia.org/wiki/Law_of_identity