Yeah, I don’t get the point of these RFC “gotcha” posts. For instance, an email address can include the local (username) part in quotes, the domain can be a bracketed IP, the domain can include comments in parentheses, etc.
In practice, NO one uses weird forms like this, because it would be impossible to use most online services. Supporting pathological edge cases has literally no upside, but plenty of downside.
> I don’t get the point of these RFC “gotcha” posts
The author is pretty clearly trying to rely on something the spec guarantees (zero-length path components in a URI) and is frustrated by poorly behaved implementations that assume a run of consecutive slashes can safely be "normalized" into a single U+002F. That assumption is not safe.
It's not a contrived, academic, "gotcha post". This person is frustrated, and it's not hard to make that out.
> In practice, NO one uses weird forms like this, because it would be impossible to use most online services.
That's not true. Buggy middleware isn't just one obstacle in a sea of others that would loom up as soon as it was overcome; it really is the one thing causing them issues in pursuit of their use case. And even if it weren't, they'd still be right to call it out. Web browsers cope with these URIs just fine (doing as they're supposed to).
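To make the difference concrete, here's a rough sketch in Python. The example URL and the `collapse_slashes` helper are made up for illustration; the helper just stands in for the kind of "normalization" some middleware applies, not any particular product's behavior.

```python
import re
from urllib.parse import urlsplit

# Hypothetical URL with a zero-length path segment between "a" and "b".
url = "http://example.com/a//b/c"

# The standard library parser leaves the path untouched.
path = urlsplit(url).path
segments = path.split("/")
print(path)      # /a//b/c
print(segments)  # ['', 'a', '', 'b', 'c']


def collapse_slashes(p: str) -> str:
    """Illustrative stand-in for middleware that merges slash runs."""
    return re.sub(r"/{2,}", "/", p)


# After collapsing, the empty segment is gone and the path is a different one.
collapsed = collapse_slashes(path)
print(collapsed)  # /a/b/c
```

The empty string between `'a'` and `'b'` is the zero-length segment. Once the slashes are merged there's no way to get it back, which is why this isn't a harmless cleanup.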
> It's not a contrived, academic, "gotcha post". This person is frustrated, and it's not hard to make that out.
Where in the article do you get that impression? The closest I can see is "Sometimes it’s useful to have a separator between different parts of a path."
I get the author's point that zero-length path segments should be supported, and I don't begrudge them getting the word out. But in the end, if a technically correct syntax is not widely supported, you have to choose whether that syntax is actually something you can depend on.
For example, RFC 3986 (URI) does not define a max length for the fragment (hash). It can be 2MB, it can be 16TB. I ran into the actual limits when I tried to store image data in the fragment and promptly crashed Safari (CVE-2013-0983). What did I do next? I abandoned that half-baked idea, because the amount of storage available for the fragment was completely undefined.
It's all over the article, culminating ultimately in the passage at the bottom, but that's beside the point: what evidence did you have that it was purely a "gotcha" post? Surely if you're allowed to take that (il)logical leap, then I'm allowed my reading? Or we can just not make any assumptions?
> if a technically correct syntax is not widely supported
The existence of notable outliers does not mean it's not widely supported. The most important implementations (browsers—i.e. those you don't have any control over) have no issues doing the right thing. Broken and broken-by-default middleware can at least be abandoned or reconfigured. Whether that should be necessary is another matter.
> RFC 3986 (URI) does not define a max length for the fragment (hash). It can be 2MB, it can be 16TB. I ran into the actual limits when I tried to store image data in the fragment and promptly crashed Safari (CVE-2013-0983). What did I do next? I abandoned that half-baked idea
You're right. That is a half-baked idea. And it's nothing like the author's frustration at not being able to rely on what's guaranteed by the spec.