I don't think anachronistic is the right description, but I agree that in some c...

woodruffw · on Feb 21, 2021

> Postel's law is a good rule of thumb if you're writing a JSON library, but not if you're writing anything that has to do with security.

I'm a big fan of Postel's law in the context of UNIX tools, but I think that JSON is exactly the wrong example: JSON parsing is a security/format boundary, and I don't want my JSON parser (or any other parser) trying to suss structure out of something that's under- or unspecified. That way lies input confusion vulnerabilities.

DaiPlusPlus · on Feb 21, 2021

Well, one example is having a JSON object property that is nominally typed as `number` but a JSON parser/library would also accept a number-in-a-string, so both `{ "foo": "123" }` and `{ "foo": 123 }` would be accepted.

Storing numbers inside strings can be necessary if you're encountering JavaScript's 2^53 limit on integers, for example.
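
To make the limit concrete, here is a minimal Python sketch. It assumes that passing `parse_int=float` to the standard `json` module is a fair stand-in for a parser that stores every number as a double; it is not JavaScript itself.

```python
import json

LIMIT = 2**53      # 9007199254740992
big = LIMIT + 1    # 9007199254740993, not representable as a double

# Assumption: parse_int=float emulates a double-only parser.
as_double = json.loads(str(big), parse_int=float)

# The same value carried as a number-in-a-string survives the round trip.
document = json.dumps({"foo": str(big)})
as_string = json.loads(document)["foo"]

print(as_double == float(LIMIT))  # True: the double rounded down to 2^53
print(int(as_string) == big)      # True: the string kept every digit
```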

woodruffw · on Feb 21, 2021

> Well, one example is having a JSON object property that is nominally typed as `number` but a JSON parser/library would also accept a number-in-a-string, so both `{ "foo": "123" }` and `{ "foo": 123 }` would be accepted.

This is a great example of "paving over bad behavior with worse behavior."

If JSON is specified to only represent numbers that fit within a JavaScript double, then a correct parser must fail on numeric literals that don't fit into that type. It's up to me as a consumer to interpret strings that represent larger numbers.

I've had this exact kind of "helpful" behavior cause potentially exploitable bugs in programs before: a complex system was using more than one JSON parser, and parser Foo would accept numeric inputs that parser Bar would silently fail on (mangling large numbers into garbage). The result was a potential arbitrary read primitive.
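
A rough sketch of the strict alternative, assuming the consumer (not the parser) decides when a string may carry an integer. `read_number` and `read_integer_string` are hypothetical helper names, not part of any library.

```python
import json
import re

# Only plain decimal integers: no whitespace, underscores, or leading zeros.
INTEGER_STRING = re.compile(r"-?(0|[1-9][0-9]*)")

def read_number(document, key):
    """Return document[key] only if it is a JSON number."""
    value = json.loads(document)[key]
    # bool is a subclass of int in Python, so exclude it explicitly.
    if isinstance(value, bool) or not isinstance(value, (int, float)):
        raise TypeError(f"{key!r} must be a JSON number, got {type(value).__name__}")
    return value

def read_integer_string(document, key):
    """Explicit opt-in: the caller states that this field is an integer in a string."""
    value = json.loads(document)[key]
    if not isinstance(value, str) or not INTEGER_STRING.fullmatch(value):
        raise TypeError(f"{key!r} must be a decimal integer string")
    return int(value)
```

With these, `{ "foo": 123 }` passes `read_number` and `{ "foo": "123" }` is rejected; the string form is only accepted where the caller asks for it through `read_integer_string`.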

deathanatos · on Feb 21, 2021

> If JSON is specified to only represent numbers that fit within a JavaScript double,

It isn't / doesn't make that restriction. For example,

  10000000000000000000000000000000000000000000000000000000000000000
  00000000000000000000000000000000000000000000000000000000000000000
  00000000000000000000000000000000000000000000000000000000000000000
  00000000000000000000000000000000000000000000000000000000000000000
  00000000000000000000000000000000000000000000000000000000000000000
  000000000000000000000000000000000000000

is a valid JSON value. Python will decode it to an integer with the equivalent value; JavaScript will decode it to "Infinity", as it exceeds the limits of Number. (Despite nowadays having a bigint type that could represent it.)

It is possible to write a JSON parser in JavaScript that would handle decoding large numbers to BigInts. Most people just don't bother, as it's pretty rare to need to go above the 2^53 limit of JS's number.
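
The disagreement can be reproduced within Python alone. This sketch uses a 1 followed by 400 zeros as a stand-in for the number above, and again assumes `parse_int=float` as an approximation of a double-based parser rather than real JavaScript.

```python
import json
import math

huge = "1" + "0" * 400  # stand-in for the large literal above

# Default behaviour: an arbitrary-precision int with the exact value.
exact = json.loads(huge)

# Double-based emulation: the literal overflows to infinity.
approx = json.loads(huge, parse_int=float)

print(type(exact).__name__, exact == 10**400)  # int True
print(math.isinf(approx))                      # True
```

Both results come from a parser accepting the same valid JSON text, which is the point: acceptance is identical, the decoded values are not.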

woodruffw · on Feb 21, 2021

You're absolutely right about that, and that's even worse than I had remembered! Not only can two parsers not be trusted to have equivalent acceptance contracts (in the absence of bugs), but they can correctly disagree on the acceptance of a single input!

The actual RFC language[1]:

> This specification allows implementations to set limits on the range and precision of numbers accepted. Since software that implements IEEE 754-2008 binary64 (double precision) numbers [IEEE754] is generally available and widely used, good interoperability can be achieved by implementations that expect no more precision or range than these provide, in the sense that implementations will approximate JSON numbers within the expected precision.

[1]: https://tools.ietf.org/html/rfc7159#section-6
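
Since the RFC lets implementations set their own limits, one option is to make the limit explicit and fail loudly instead of approximating. A minimal sketch, assuming the range [-(2^53)+1, (2^53)-1] that the same RFC section describes as interoperable for integers; `checked_int` and `strict_loads` are hypothetical names.

```python
import json

MAX_SAFE = 2**53 - 1  # assumed limit, matching the RFC's interoperable integer range

def checked_int(literal):
    """parse_int hook: receives the integer literal as a string."""
    value = int(literal)
    if abs(value) > MAX_SAFE:
        raise ValueError(f"integer literal out of range: {literal[:20]}")
    return value

def strict_loads(document):
    # Only integer literals are checked here; literals with a fraction or
    # exponent still go through the default float conversion.
    return json.loads(document, parse_int=checked_int)
```

`strict_loads('{"id": 9007199254740991}')` returns the value unchanged, while `strict_loads('{"id": 9007199254740993}')` raises `ValueError` rather than silently rounding.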

dang · on Feb 21, 2021

I added some newlines to your number because it was borking the page layout. Sorry; it's our bug.

DonHopkins · on Feb 21, 2021

Not so liberal about what you accept, eh? ;)

dang · on Feb 21, 2021

We try to make up for it in the extreme narrowness of what we produce.

DaiPlusPlus · on Feb 23, 2021

Are you hiring? <_<

yjftsjthsd-h · on Feb 21, 2021

JSON libraries are security-sensitive, though; they tend to get used on content coming from 3rd parties over the internet.

josefx · on Feb 21, 2021

I prefer not having to maintain code that is overly flexible with its input. Last time I ran into half a page of workarounds for bad input, I spent a day tracking down all affected files and making sure I could either ignore or fix them. The result was a neat little string comparison that didn't have half a dozen unpredictable edge cases.

jeff-davis · on Feb 21, 2021

Everything has to do with security. Especially message formats like JSON.

Accepting a non-conforming message could easily lead to injection.