> Most strikingly DEC64 doesn't do normalization, so comparison will be a nightmare (as you have to normalize in order to compare!). He tried to special-case integer-only arguments, which hides the fact that non-integer cases are much, much slower thanks to added branches and complexity. If DEC64 were going to be "the only number type" in future languages, it had to be much better than this.
The set of real numbers is continuous and uncountably infinite. Any attempt to fit it into a discrete finite set necessarily requires severe tradeoffs. Different tradeoffs are desirable for different applications.
Almost all real numbers are non-computable, so what we're most commonly reaching for is really just the rationals. But DEC64 can't represent lots of those either, so it seems like a very niche type rather than, as the author asserts, the only numeric type you need.
I wasn't a big fan of floating point until I worked with a former college professor who had taught astrophysics. When possible, he preferred to use respected libraries that would give accurate results fast. But when he had to implement things himself, he didn't necessarily want the fastest or the most accurate implementation; he'd intentionally make and document the tradeoffs for his implementation. He could analyze an algorithm to estimate the accumulated units-in-the-last-place error ( https://en.wikipedia.org/wiki/Unit_in_the_last_place ), but he also recognized when that wasn't necessary.
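For anyone unfamiliar with ULPs, a quick illustration (Python 3.9+ exposes them via math.ulp; this is just a sketch of the concept, not his workflow):

    import math

    # The ULP is the gap between a float and the next representable float.
    print(math.ulp(1.0))     # 2.220446049250313e-16, the machine epsilon at 1.0
    print(math.ulp(1.0e15))  # 0.125 -- the absolute spacing grows with magnitude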
IEEE 754 is a floating point standard. It has a few warts that would be nice to fix if we had tabula rasa, but on the whole is one of the most successful standards anywhere. It defines a set of binary and decimal types and operations that make defensible engineering tradeoffs and are used across all sorts of software and hardware with great effect. In the places where better choices might be made knowing what we know today, there are historical reasons why different choices were made in the past.
DEC64 is just some bullshit one dude made up, and has nothing to do with “floating-point standards.”
It is important to remember that IEEE 754 is, in practice, aspirational. It is very complex and nobody gets it 100% correct. There are so many edge cases around the sticky bit, quiet vs. signaling NaNs, etc., that a processor that gets it 100% correct for every special case simply does not exist.
One of the most important things that IEEE 754 mandates is gradual underflow (denormals) in the smallest binade. Otherwise you have a giant non-monotonic jump between the smallest normal float and zero, which plays havoc with the stability of numerical algorithms.
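You can watch gradual underflow happen from Python, whose floats are IEEE 754 doubles (a small illustration of the behavior described above):

    import sys

    smallest_normal = sys.float_info.min   # about 2.2e-308
    print(smallest_normal / 2)             # 1.11e-308: a subnormal, not zero
    print(5e-324)                          # the smallest positive subnormal (2**-1074)
    print(5e-324 / 2)                      # 0.0 -- only here do we finally underflow to zero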
Sorry, no. IEEE 754 is correctly implemented in pretty much all modern hardware [1], save for the fact that optional operations (e.g., the suggested transcendental operations) are not implemented.
The problem you run into is that the compiler generally does not implement the IEEE 754 model fully strictly, especially under default flags--you have to opt into strict IEEE 754 conformance, and even there, I'd be wary of the potential for bugs. (Hence one of the things I'm working on, quite slowly, is a special custom compiler that is designed to have 100% predictable assembly output for floating-point operations so that I can test some floating-point implementation things without having to worry about pesky optimizations interfering with me).
[1] The biggest stumbling block is denormal support: a lot of processors opted to support denormals only by trapping on them and having an OS-level routine to fix up the output. That said, both AMD and Apple have figured out how to support denormals in hardware with no performance penalty (Intel has some way to go), and from what I can tell, even most GPUs have given up and added full denormal support as well.
There are a bunch of different encodings for decimal floating-point. I fail to see how this is the standard that all languages are converging to.
IEEE 754 standardizes two encodings, BID and DPD, for the decimal32, decimal64, and decimal128 formats. This is neither of those.
Many libraries use an approach with a simple significand + exponent, similar to the article, but the representation is not standardized; some use full integral types for this rather than specific bit fields (C# uses 96+32, Python uses a tuple of arbitrary integers). It's essentially closer to fixed point but with a variable exponent.
The representation from the article is definitely a fairly good compromise though, specifically if you're dealing with mostly-fixed-point data.
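Python's decimal module shows the significand-plus-exponent idea directly (just illustrating the point above about a tuple of integers):

    from decimal import Decimal

    # (sign, digit tuple, exponent): significand 150, exponent -2, i.e. 150 * 10^-2
    print(Decimal("1.50").as_tuple())  # DecimalTuple(sign=0, digits=(1, 5, 0), exponent=-2)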
> I fail to see how this is the standard that all languages are converging to.
Yes, you are failing to see it because it's not there.
Crockford had a hot book amongst junior and amateur JavaScript developers 17 years ago. But he's never really been involved in any language standardization work. Even his self-described "invention" of JSON I wouldn't really call an invention so much as a discovery that one could send JS object literals instead of XML over the wire. This discovery also opened a new class of XSS exploits until browsers implemented JSON.parse.
So, when he says "DEC64 is intended to be the only number type in the next generation of application programming languages," it's just the same, old bloviation he has always employed in his writing.
...primary source: """I do not claim to have invented JSON. I claim only that I discovered it. It existed in nature. I identified it, gave it a name, and showed how it was useful. I never claimed to be the first to have discovered JSON. I made my discovery in the spring of 2001. There were other developers who were using it in 2000."""
I see value in this semi-simplistic representation of DEC64, and to respond to a peer comment, consider "DEC64-norm", where non-normalized representations are "illegal" or must be tainted/tagged/parsed like UTF-8 before processing.
His other "useful" contribution, which I lament not seeing anywhere else, is "The Crockford Keyboard", another "near-discovery" of existing useful properties in an endemic standard (the English alphabet): https://www.crockford.com/keyboard.html
I really wish that the Xbox onscreen keyboard used this layout! You could imagine that if you could "hot-key" back to the left row (vowels), and have "tab-complete" (i.e., shift-tab/tab), it could be a pretty comfortable typing experience compared to the existing "let's copy QWERTY(?!)".
...I feel he is a kindred spirit of pragmatism and complexity reduction (which necessarily introduces more complexity, since we live in a world where complicated things exist). Compared to IEEE floating point numbers, this DEC64 or "Number()" type seems like a breath of fresh air (apart from normalization, as mentioned!).
I do think DEC64 has merit: Most of the time, I'd prefer to be able to reason about how non-integers are handled; and I just can't reason about traditional floating point.
Yes, Doug does bloviate quite a bit. (And when I asked him if there was a way to make callbacks in Node.js / Javascript simpler, he mocked me. A few years later, "await" was added to the language, which generally cleans up callback chains.)
Either you want fixed point for your minimum unit of accounting or you want floating point because you’re doing math with big / small numbers and you can tolerate a certain amount of truncation. I have no idea what the application for floating point with a weird base is. Unacceptable for accounting, and physicists are smart enough to work in base 2.
I'm pretty confident that decimal floating point (DFP) is used for financial computation, both because it has been pushed heavily by IBM (who certainly are very involved in the financial industry) and because many papers describing DFP use financial applications as the motivating example. For example, this paper: https://speleotrove.com/decimal/IEEE-cowlishaw-arith16.pdf
> This extensive use of decimal data suggested that it would be worthwhile to study how the data are used and how decimal arithmetic should be defined. These investigations showed that the nature of commercial computation has changed so that decimal floating-point arithmetic is now an advantage for many applications.
> It also became apparent that the increasing use of decimal floating-point, both in programming languages and in application libraries, brought into question any assumption that decimal arithmetic is an insignificant part of commercial workloads.
> Simple changes to existing benchmarks (which used incorrect binary approximations for financial computations) indicated that many applications, such as a typical Internet-based ‘warehouse’ application, may be spending 50% or more of their processing time in decimal arithmetic. Further, a new benchmark, designed to model an extreme case (a telephone company’s daily billing application), shows that the decimal processing overhead could reach over 90%.
Wow. OK, I believe you. Still don’t see the advantages over using the same number of bits for fixed point math, but this definitely sounds like something IBM would do.
Edit: Back of the envelope, you could measure 10^26 dollars with picodollar resolution using 128 bits
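That back-of-the-envelope figure checks out: 10^26 dollars at picodollar (10^-12 dollar) resolution is 10^38 base units, which fits comfortably in 128 bits:

    >>> 10**26 * 10**12        # 10^26 dollars counted in picodollars
    100000000000000000000000000000000000000
    >>> 2**128                 # distinct values available in 128 bits
    340282366920938463463374607431768211456
    >>> 10**38 < 2**128
    True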
Decimal128 has exact decimal rounding rules and preserves trailing zeros.
I don’t think Decimal64 has the same features, but it has been a while.
But unless you hit the limit of 34 decimal digits of significand, Decimal128 will work for anything you would use fixed point for, and much faster if you have hardware support like the IBM CPUs or some of the SPARC CPUs from Japan.
OLAP aggregate functions, as an example, are an application.
> I don’t think Decimal64 has the same features, but it has been a while.
Decimal32, Decimal64, and Decimal128 all follow the same rules, they just have different values for the exponent range and number of significant figures.
Actually, this is true for all of the IEEE 754 formats: the specification is parameterized on (base (though only 2 or 10 is possible), max exponent, number of significant figures), although there are a number of issues that only exist for IEEE 754 decimal floating-point numbers, like the exponent quantum or the BID/DPD encoding stuff.
You are correct; the problem is that Decimal64 has 16 digits of significand, while items like apportioned per-call taxes need to be calculated with six digits past the decimal before rounding, which requires about 20 digits.
Other calculations like interest rates take even more, and COBOL requires 32 digits.
As the decimal128 format supports 34 decimal digits of significand and has emulated exact rounding, it can meet that standard.
While it is more complex, requiring ~15-20% more silicon in the ALU plus a larger data size, it is more efficient for business applications than arbitrary-precision libraries like BigNum.
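As a software illustration of that headroom, Python's decimal module can be configured to 34 significant digits to mimic decimal128's significand (the tax rate and charge figures below are made up, and this is not the hardware format):

    from decimal import Decimal, Context

    d128 = Context(prec=34)                   # decimal128 keeps 34 significant digits
    rate = Decimal("0.000123")                # hypothetical per-call tax rate
    charges = Decimal("123456789012.345678")  # hypothetical aggregate, 6 digits past the decimal
    print(d128.multiply(charges, rate))       # the ~20-digit product is carried exactly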
What's the point of saying that it is "very well suited to all applications that are concerned with money" and then writing 3.6028797018963967E+143, which is obviously missing a few gigamultiplujillion?
No point whatsoever. If you have to deal with money you never use floating point. Either use arbitrary precision, or integers with a sufficiently small base unit like blockchains do (which can also be thought of as fixed point). Also, you would never be multiplying two money values (there are no "square dollars").
> It can precisely represent decimal fractions with 16 decimal places, which makes it very well suited to all applications that are concerned with money.
How many decimal places do people use for financial calculations in practice? Google search’s AI answer said 4, 6, or sometimes 8. Is that true for large financial institutions, like banks and hedge funds and governments and bitcoin exchanges?
I’ve heard lots of people saying floating point isn’t good for financial calculations, and I believe it. Would DEC64 actually be good for money if it has 16 base 10 digits, and if so, why? If you need 8 decimal places, you have 8 digits left, and start losing decimal places at around 100M of whatever currency you’re working with. I’m just guessing that working with large sums is exactly when you actually need more decimal places, no?
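You can see that ceiling by configuring Python's decimal module to 16 significant digits, roughly DEC64's coefficient size (an illustration of the arithmetic above, not of DEC64 itself):

    from decimal import Decimal, Context

    c16 = Context(prec=16)
    print(c16.plus(Decimal("99999999.00000001")))    # 16 digits: kept exactly
    print(c16.plus(Decimal("999999999.00000001")))   # 17 digits: the last decimal place is lost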
I hate to break the news, but banks use floating point numbers all the time for financial calculations. They shouldn't, but they do, and they use all sorts of tricks like alternate rounding schemes and some pretty deep knowledge of floats to try and keep the books roughly right. I have never seen a Money type nor anything like DEC64 in a real commercial system; it's all floats or ints/longs for pennies.
> I have never seen a Money type nor anything like DEC64 in a real commercial system; it's all floats or ints/longs for pennies.
I've been in ERP land for a long time, working with many different commercial systems of different sizes and market share, and I've only seen NUMERIC/DECIMAL types used in the DB for things like Price, Cost, Amount, etc.
The only time I've ever seen floating point is for storage of non-money type values, like latitude and longitude of store locations.
That’s not too surprising to me; I imagine that many number types are used, and it depends entirely on the task at hand? If you have deeper knowledge of what gets done in practice, I’m still curious what criteria and types and how many decimal points might get used in the most restrictive cases. What do people use in practice for, say, interest compounding on a trillion dollars? I can calculate how many decimal places I need at a minimum for any given transaction to guarantee the correct pennies value, but I don’t have first-hand experience with banks and I don’t know what the rules of thumb, or formal standards, might be for safe & careful calculation on large sums of money. I would imagine they avoid doing floating point analysis in every case, since that’s expensive engineering?
My electricity and gas bills both charge me a $0.xxxxxx per-unit price, or 10,000ths of a ¢, although the last digit is invariably a zero. I've also seen 4-5 digits at currency exchanges.
I'd have to break out a calculator to be certain, but my guess is that most of these transactions amount to sum(round_to_nearest_cent(a * b)), where a is a value in thousandths of a cent--which is to say, there isn't a single "this is the fixed-point unit to use for all your calculations."
For financial modeling, the answer is definitely "just use floats," because any rounding-induced error is going to be much smaller than the inherent uncertainty of your model values anyways. It's not like companies report their income to the nearest cent, after all.
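A small sketch of the per-line-rounding pattern guessed at above (the quantities and unit prices are made up):

    from decimal import Decimal, ROUND_HALF_UP

    # Hypothetical line items: (quantity, unit price with 6 decimal places)
    lines = [(Decimal("1234"), Decimal("0.104370")),
             (Decimal("876"),  Decimal("0.031250"))]

    def to_cents(amount):
        return amount.quantize(Decimal("0.01"), rounding=ROUND_HALF_UP)

    # Round each line item to the nearest cent, then sum the rounded amounts.
    total = sum(to_cents(qty * price) for qty, price in lines)
    print(total)   # exact cents, no binary-float drift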
Right, the number of digits you need for any given calculation depends on the magnitudes of the numbers involved and the acceptable precision of the result. BTW, round-to-nearest might be unsafe; I’m certain that there are many situations where people will specifically avoid rounding to nearest, and I would not assume that companies use that scheme.
It seems like a decent assumption that financial companies are not spending engineering time (money) on a floating point analysis of every single computation. They must generally have a desired accuracy, some default numeric types, and workflows that use more bits than necessary for most calculations in exchange for not having to spend time thinking hard about every math op, right? That’s how it works everywhere else.
The accuracy used for reporting doesn’t seem relevant to the accuracy used for internal calculations. It’s fine for large companies to report incomes to rounded millions, while it’s absolutely unacceptable to round compound interest calculations in a bank account to millions, regardless of the balance.
Oh now that is somewhat surprising! Is this 64-bit or 128-bit floats? 32-bit floats aren’t accurate enough to represent $1B to the penny. Do error values get stored as deltas and passed around with the float values? I would love to hear more about this; do you know of any good summaries of the engineering workflow online?
Wonder if it depends on which side of banking: investments, accounting, client side (web pages and UI) vs. the backend. Or if we're talking about central banks vs. a small credit union.
I have done retail and investment banking as well as hedge funds, for front and back office. Everything from giant banks everyone knows through to small hedge funds no one has heard of.
Thanks. That’s fascinating. I have zero experience with any banking/finance stuff but through tropes and various anecdotal accounts was somehow sure floats were not used by banks. I guess it’s one of those persistent rumors that just refuses to die.
The noob-expert meme is really apropos here, as floating-point is really the option to go for if you know nothing about it or you know a lot about it.
The "problem" with floating-point is that you have to deal with rounding. But in financial contracts, you have to deal with rounding modes. Fixed-point (aka integers) gives you one rounding mode, and that rounding mode is wrong.
With ints/longs they usually use something that alternates the rounding so that it spreads, but that alternation can be done on a system-wide basis, on a machine basis, or in an individual account.
For floats what they tend to do is carry around the estimated error when they are doing a lot of calculations and then adjust for the error at the end to bring the value back.
Both of them are trying to deal with the inherent issues of the representation being biased and ill suited for money in practice. But oddly this is never pulled together into a Money type, because it really depends what you are doing at the time; sometimes you just round and move on, and sometimes the error is expected to impact the result because you are dealing with millions/billions in calculations on 10s/100s, so it's going to matter.
But the reality is the books are basically off by pennies to pounds every day because of these representations, and it's part of the reason no one worries about being off a little bit: the various systems all do this differently.
> The BASIC language eliminated much of the complexity of FORTRAN by having a single number type. This simplified the programming model and avoided a class of errors caused by selection of the wrong type. The efficiencies that could have gained from having numerous number types proved to be insignificant.
DEC64 was specifically designed to be the only number type a language uses (not saying I agree, just explaining the rationale).
> Languages for scientific computing like FORTRAN provided multiple floating point types such as REAL and DOUBLE PRECISION as well as INTEGER, often also in multiple sizes. This was to allow programmers to reduce program size and running time. This convention was adopted by later languages like C and Java. In modern systems, this sort of memory saving is pointless.
More than that, the idea that anyone would be confused about whether to use integer or floating-point types absolutely baffles me. Is this something anyone routinely has trouble with?
Ambiguity around type sizes I can understand. Make int expand as needed to contain its value with no truncation, as long as you keep i32 when size and wrapping do matter.
Ambiguity in precision I can understand. I'm not sure this admits of a clean solution beyond making decimal a built-in type that's as convenient (operator support is a must) and fast as possible.
But removing the int/float distinction seems crazy. Feel free to argue about the meaning of `[1,2,3][0.5]` in your language spec - defining that and defending the choice is a much bigger drag on everyone than either throwing an exception or disallowing it via the type system.
There's something to say for languages like Python and Clojure where plain ordinary math might involve ordinary integers, arbitrary-precision integers, floats, or even rationals.
In grad school it was drilled into me to use floats instead of doubles wherever I could, which cuts your memory consumption of big arrays in half. (It was odd that Intel chips in the 1990s were about the same speed for floats and doubles, but all the RISC competitors had floats about twice the speed of doubles, something that Intel caught up with in the 2000s.)
Old books on numerical analysis, particularly Foreman Acton's, teach the art of how to formulate calculations to minimize the effect of rounding errors, which resolves some of the need for deep precision. For that matter, modern neural networks use specialized formats like FP4 because these save memory and are effectively faster in SIMD.
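The array-size difference is easy to see; this sketch uses NumPy purely as an illustration (it isn't mentioned above):

    import numpy as np

    a64 = np.zeros(1_000_000, dtype=np.float64)
    a32 = np.zeros(1_000_000, dtype=np.float32)
    print(a64.nbytes, a32.nbytes)   # 8000000 vs 4000000: single precision halves the memory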
---
Personally, when it comes to general-purpose programming languages, I've watched a lot of people have experiences that lead them to think that "programming is not for them". I think
>>> 0.1+0.2
0.30000000000000004
is one of them. Accountants, for instance, expect certain invariants to be true and if they see some nonsense like
>>> 0.1+0.2==0.3
False
it is not unusual for them to refuse to work, or leave the room, or have a sit-down strike until you can present them numbers that respect the invariants. You have a lot of people who could be productive lay programmers and put their skills on wheels, and if you are using the trash floats that we usually use instead of DEC64, you are hitting them in the face with pepper spray as soon as they start.
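For comparison, a decimal type (Python's here, not DEC64 itself) preserves the invariant those users expect:

    >>> from decimal import Decimal
    >>> Decimal("0.1") + Decimal("0.2") == Decimal("0.3")
    True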
JavaScript engines do optimize integers. They usually represent integers up to +-2^30 as integers and apply integer operations to them. But of course that's not observable.
You are half correct about 2^53-1 being used (around 9 quadrillion). It is the largest integer that is safely representable with a 64-bit float (it and every smaller integer are exact). JS even includes a `Number.MAX_SAFE_INTEGER`.
That said, these only get used in the cases where your number exceeds around 1 billion, which is fairly rare.
JS engines use floats only when they cannot prove/speculate that a number can be an i32. They only use 31 of the 32 bits for the number itself, with the last bit used for tagging. i32 takes fewer cycles to do calculations with (even with the need to deal with the tag bit) compared to f64. You fit twice as many i32s in a cache line (which affects prefetching). i32 uses half the RAM (and using half the cache increases the hit rate). Finally, it takes way more energy to load two numbers into the ALU/FPU than it does to perform the calculation, so cutting the size in half also reduces power consumption. The max allowable length of a JS array is also 2^32 - 1.
JS also has BigInt available for arbitrary-precision integers, and that is probably what someone should be using if they expect to go over that 2^31-1 limit, because hitting a number that big generally means you have something unbounded that might go over the 2^53-1 limit too.
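The 2^53 boundary is easy to demonstrate; Python is shown here since its floats are the same IEEE 754 doubles JS uses:

    >>> float(2**53) == float(2**53) + 1       # above the safe range: +1 is absorbed
    True
    >>> float(2**53 - 1) == float(2**53 - 1) + 1
    False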
Atari 8-bit BASIC used something pretty similar to this [1], except it did have normalization. It only had 10 BCD digits (5 bytes) and 2 digits (1 byte) for the exponent, so more of a DEC48, but still… That was a loooong time ago…
It was slightly more logical to use BCD on the 6502 because it had a BCD maths mode [2], so primitive machine opcodes (ADC, SBC) could understand BCD and preserve the carry, zero, etc. flags.
The memory savings from 32-bit or even 16-bit floats are definitely not pointless! Not to mention doubling SIMD throughput. Speaking of which, without SIMD support this certainly can't be used in a lot of applications. Definitely makes sense for financial calculations though.
This is all a mistake!! IEEE put a lot of thought into making NaN != NaN and having Inf and NaN be separate things. As it stands, 1e130 / 1e-10 == 5 / 0 == 0 / 0. Should that be the case? No. Why might this come up accidentally without you noticing it? Imagine each of the following funcs is in a separate module written by separate people. frob() will be called erroneously with DEC64 and will not with float or double.
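A hypothetical sketch of that scenario (the module and function names are invented, and DEC64's behavior as the parent describes it appears only in comments, since Python floats follow IEEE 754):

    def frob():
        print("frob called")

    # "Module A" (hypothetical): a nan from 0/0-style arithmetic used as a "missing" sentinel.
    MISSING = float("nan")

    # "Module B" (hypothetical): a computation that can overflow for large inputs.
    def scale(x):
        return x * 1e300 * 1e300   # IEEE 754: overflows to inf; DEC64 (per the parent): nan

    # "Module C" (hypothetical): checks for the sentinel by equality.
    def handle(value):
        if value == MISSING:       # IEEE 754: never true, because nan != nan
            frob()                 # under DEC64 as described above, the overflowed nan
                                   # would compare equal to MISSING and frob() would fire

    handle(scale(42.0))            # prints nothing under IEEE 754 semantics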
https://news.ycombinator.com/item?id=7365812 (2014, 187 comments)
https://news.ycombinator.com/item?id=10243011 (2015, 56 comments)
https://news.ycombinator.com/item?id=16513717 (2018, 78 comments)
https://news.ycombinator.com/item?id=20251750 (2019, 37 comments)
Also my past commentary about DEC64:
> Most strikingly DEC64 doesn't do normalization, so comparison will be a nightmare (as you have to normalize in order to compare!). He tried to special-case integer-only arguments, which hides the fact that non-integer cases are much, much slower thanks to added branches and complexity. If DEC64 were going to be "the only number type" in future languages, it had to be much better than this.