The Maturing of QUIC (fastly.com)
190 points by kungfudoi on Nov 11, 2019 | hide | past | favorite | 46 comments


The spin bit reminds me of the evil bit[1]. Endpoints/clients/servers don't have much reason to set it, except when explicitly running a trace. It may work under initial implementations, but I expect it will eventually become a dead field. Why maintain the code complexity to expose this information to a network operator? Even worse, "malicious" clients could spin the bit randomly.

[1] https://en.m.wikipedia.org/wiki/Evil_bit


What developer out there reads "The spin bit is an OPTIONAL feature of QUIC", then sees "Implementations MUST allow administrators of clients and servers to disable the spin bit either globally or on a per-connection basis" and thinks "Yes, this is definitely something I will implement and test out all the edge cases." Especially stuff like "The random selection process SHOULD be designed such that on average the spin bit is disabled for at least one eighth of network paths."

This whole thing is going to be a congestion control arms race. With TCP out the window this is going to be an epic battle of network operators trying to heuristically classify important vs. bulk traffic any way they can, content servers trying every trick in the book to get their traffic prioritized ahead of competitors, and savvy users running tweaked router firmware to jump the Fair Share queue.


I dunno, there is far less information than before.


> but I expect it will eventually become a dead field.

I hope that's all that happens. My worry is that middleboxes will start to depend on it, and any implementation which doesn't spin it "correctly" (where "correctly" is defined by the middlebox vendor, not the standard) will be blocked.


Depend on it to do /what/ though?

Middleboxes can equally well insist upon tampering with other data they can't interpret in QUIC, and that too just causes the packets to be discarded. To what end? Middleboxes aren't actively malevolent; they're just built by idiots.

Our experience has been that middlebox vendors _ossify_ things. They see a field that's defined "Zero: Reserved for future use" and they say "Aha, that's supposed to be zero" and so now that "future use" can never happen.

For example, the Certificate in TLS 1.2 and earlier isn't intended to be useful to anyone except the TLS client. But middleboxes interpret it anyway: they insist it should be an X.509 certificate, possibly from a known CA. So it's impossible to add a "Gzip'd Certificate" feature to TLS 1.2, even though if you look at the protocol description it's easy to see how; if you actually tried, you'd find middleboxes throw a fit because the payload isn't X.509, and now your software can't be deployed.

But in TLS 1.3 compressed certificates are a thing, because TLS 1.3 encrypts almost everything and that includes Certificate. A client can say "I grok Compressed Certificates" and the server goes "Here's a Compressed Certificate" and nobody else is any the wiser. A middlebox doesn't know about it and so it can't meddle.


> Depend on it to do /what/ though?

My specific worry is that some "firewalls" will try to use the little that's still visible (the spin bit, packet sizes, packet timing, etc) to try to guess whether the connection comes from a "legitimate" source like a common web browser, and discard the packets otherwise. "You flipped the bit when you should have flopped it, so you're a malware and I'm going to blackhole your whole IP address for the rest of the day", things like that.


In a lot of corp environments, a corporate CA is injected, which allows middleboxes to MITM sessions. TLS 1.3 still allows this nonsense. Most current-gen firewalls rely on this to be useful.


In TLS 1.3 you can only do this by proxying all connections. Whereupon all these hypothetical non-compliant QUIC streams are yours and nobody on the rest of the Internet cares.

This is a significant (and for the "IT security" industry expensive) change compared to previous versions, but it was done with this in mind. The methods often employed to avoid proxying don't actually deliver security (though they did ossify the protocol), but they are cheaper. Too bad.

Let's take my compression example again. Your "firewall" won't tell you it understands compressed certs since it has no idea what they are, and it won't request them from the server, no ossification. New features don't work for you inside your "firewall" making the Internet worse for you, but for everybody else improvements continue.

The same for the spin bit. Corp A enforces a completely broken spin bit for "security", it suffers mysterious and hard to diagnose network faults but the rest of the Internet isn't affected and life continues as normal.


There isn't much they can do about it?

All they can do is corrupt or block the packet, which looks to the end user like high packet loss on random connections... earning that router type a reputation for being very slow and pushing it out of the market.


The average user doesn't have the level of visibility into the system to know that the middlebox is the one causing the problem. Instead whatever changed last gets the blame, regardless of what's actually at fault. So if a user's existing applications set the spin bit "correctly" and a new application does not, the new application will be judged as buggy, not the middlebox which has been working fine until now.


As I read the RFC[1], every endpoint is supposed to disable the spinning on about 1 in 8 connections. I presume to force middleboxes to accept connections with and without it set.

> Even when the spin bit is not disabled by the administrator, implementations MUST disable the spin bit for a given connection with a certain likelihood. The random selection process SHOULD be designed such that on average the spin bit is disabled for at least one eighth of network paths.

[1] https://tools.ietf.org/html/draft-ietf-quic-transport-23#sec...


(and as long as there is at least one actor interested in RTT, and that actor happens to control both endpoints for a significant amount of traffic crossing the internet (Chrome/Google), there will be a good mix of connections with and without the spin bit)


> Even worse, "malicious" clients could spin the bit randomly.

That's how it's supposed to work. https://tools.ietf.org/html/draft-ietf-quic-transport-23#sec...

> When the spin bit is disabled, endpoints MAY set the spin bit to any value, and MUST ignore any incoming value. It is RECOMMENDED that endpoints set the spin bit to a random value either chosen independently for each packet or chosen independently for each connection ID.
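Putting the two quoted requirements together, an endpoint's spin-bit handling might look roughly like this. This is a toy Python sketch of the draft's rules; the class and method names are illustrative, not taken from any real QUIC stack:

```python
import random

class SpinBitState:
    """Toy sketch of QUIC latency spin bit maintenance.

    Per the draft: the client inverts the spin value it last saw from the
    server, the server echoes the value it last saw from the client, and
    every endpoint disables spinning entirely on ~1/8 of paths.
    """
    DISABLE_PROBABILITY = 1 / 8  # MUST disable for at least 1/8 of paths

    def __init__(self, is_client: bool):
        self.is_client = is_client
        self.enabled = random.random() >= self.DISABLE_PROBABILITY
        self.spin = 0

    def on_packet_received(self, peer_spin: int) -> None:
        # Client inverts the observed value; server echoes it back.
        if not self.enabled:
            return  # MUST ignore incoming values when disabled
        self.spin = (1 - peer_spin) if self.is_client else peer_spin

    def outgoing_spin(self) -> int:
        if not self.enabled:
            # MAY be any value when disabled; random is RECOMMENDED
            return random.randint(0, 1)
        return self.spin
```

A passive observer times the square wave this produces to estimate RTT; an endpoint that opts out just emits noise, which is exactly the "spin it randomly" behaviour the parent worried about.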



QUIC has the concept of multiple streams within a single connection. I wish some of those streams could be set as unreliable/unordered. The QUIC stack (with its congestion mitigation, encryption etc) could replace barebones UDP in online games, videoconferencing, WebRTC etc.


Agree. There have been a handful of proposals to add unreliable or datagram streams to QUIC, including:

https://tools.ietf.org/html/draft-pauly-quic-datagram-05

https://github.com/tfpauly/draft-pauly-quic-datagram/blob/ma...


IMO if QUIC truly wants to become TCP 2.0 it needs to accept something else than TLS.

TLS is an Ok protocol for the web, but for other scenarios it makes zero sense. Noise makes much more sense.


Noise isn't a protocol, it's a recipe for building protocols. So this is like suggesting that the QUIC WG, having bought all the rest of the furniture for the house, should skip the ready-made dining table and chairs that fit the rest of the house perfectly, and instead buy a book on carpentry and chop down a tree to start making the table by hand.

Most situations which would /not/ be happy with TLS here will likewise find things not to like in any particular hypothetical Noise-based TCP replacement, because the parameters of the recipe matter to them. They'd like a different hash function, a different type of symmetric encryption, or a different DH. In Noise the only way to do that is to start over, as if during final assembly of the table you decided you wanted a pine table, not an oak one. So these people would most likely insist on using Noise for their own custom protocol anyway, regardless of what QUIC does.

And if you _do_ use the Noise recipe, it's not as though you just throw in the elements you've chosen and out pops a replacement for TLS. Noise isn't interested in the handshake problem, which is 95% of TLS; it's just waved away as something you should solve for yourself. For a VPN like WireGuard that's not crazy; in practice many VPN setups did this, or worse, already. But for most TCP connections (even ignoring the Web) it's a non-starter.

So Noise replaces 5% of what TLS does for QUIC, and only once you've picked your ingredients. It's a bad trade.


> Noise isn't a protocol, it's a recipe for building protocols. So this is like suggesting that the QUIC WG, having bought all the rest of the furniture for the house, should skip the ready-made dining table and chairs that fit the rest of the house perfectly, and instead buy a book on carpentry and chop down a tree to start making the table by hand.

Take a look at nQUIC: https://eprint.iacr.org/2019/028

> Noise isn't interested in the handshake problem, which is 95% of TLS

That's part of my point. If you do not need TLS, you suddenly rely on this huge dependency and you have to figure out X.509 certificates.


Does the QUIC proposal do anything to address encrypting SNI?


> HTTP/3 relies on QUIC as the underlying transport. The QUIC version being used MUST use TLS version 1.3 or greater as its handshake protocol. HTTP/3 clients MUST indicate the target domain name during the TLS handshake. This may be done using the Server Name Indication (SNI) [RFC6066] extension to TLS or using some other mechanism.

So if TLS 2.0 starts using ESNI (either draft-ietf-tls-esni or draft-ietf-tls-sni-encryption) QUIC will benefit from that too. Otherwise, QUIC is neutral.


No.


I'm actually excited. So rarely does this industry clean out the old ossified stuff rather than slap a new layer on top. Here we finally have solid layering for networking, as QUIC is now a suite of protocols, as it should be. Truly great news.


> encrypt the packet first using the packet number as a nonce, and then encrypt the packet number using some of the encrypted packet as a nonce (and a different key).

I may be missing any number of details, but I would expect a simpler scheme to work well: encrypt the packet number in ECB mode using a separate derived key.
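For comparison, here is what the quoted two-stage scheme looks like structurally: the packet number is masked with output derived from a separate header-protection key and a sample of the already-encrypted payload. This is a toy Python sketch only; SHA-256 stands in for the real cipher (QUIC uses AES-ECB or ChaCha20 here), and all names are illustrative:

```python
import hashlib

def toy_header_mask(hp_key: bytes, ciphertext_sample: bytes) -> bytes:
    """Toy stand-in for QUIC header protection mask generation.

    Real QUIC runs AES-ECB (or ChaCha20) on a 16-byte sample of the
    encrypted payload under a separate header-protection key; SHA-256
    here only illustrates the data flow, it is NOT the real cipher.
    """
    return hashlib.sha256(hp_key + ciphertext_sample).digest()[:5]

def protect_packet_number(pn_bytes: bytes, hp_key: bytes,
                          ciphertext: bytes) -> bytes:
    sample = ciphertext[:16]  # sample taken from the encrypted payload
    mask = toy_header_mask(hp_key, sample)
    # XOR the (up to 4-byte) packet number with mask bytes; real QUIC
    # also masks some header flag bits with mask[0].
    return bytes(b ^ m for b, m in zip(pn_bytes, mask[1:1 + len(pn_bytes)]))
```

Because the masking is a XOR, running the same function again unprotects the packet number. The point of sampling the ciphertext rather than using plain ECB on the packet number alone is that equal packet numbers then produce different wire encodings, which is exactly the ossification-resistance property the designers wanted.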


> ...chatter within the community made it clear that many organizations were chomping at the bit to start implementing as soon as the IETF...

It’s “champing at the bit”, not “chomping at the bit”

/internet pedantry

Fantastic article though and I’m very interested in the future of QUIC/HTTP3.


Useful


Why do I feel like one of the most meaningful things I can do with my professional life would be to find some like-minded engineers and design HTTP 1.2?


It is probably not the most meaningful thing you can do unless you are positioned to get vendors to adopt your proposal. A proposal without adoption is just paper. I suspect that the largest relevant vendor will be very reluctant to take up any proposal in this direction, which will constrain the amount of impact your idea can have. This isn't a comment on the quality of your idea of what HTTP/1.2 should be, just a comment on the likelihood of your ideas being congruent with what's important to Chromium's primary sponsor.


Many things would benefit from being redone "right" but the hurdles to get over are almost insurmountable.

I'd like to rebuild unix from the ground up.


> I'd like to rebuild unix from the ground up.

Pls implement proper async syscalls this time. Thx.


what do you mean by "proper"?


Many (most?) of the system calls in *nixes, even Linux, that deal w/ I/O are blocking calls. Sure, you can chuck them into a background thread, but even that isn't quite what you want, since you can't interrupt the background thread while the blocking call is blocking.

Sure, we've got async networking, timers, events nowadays. But disk I/O is almost completely async-unaware (and even if you can work around the main problem of I/O to a file, there's a ton of auxiliary calls like link, unlink, rename, mkdir, chmod/chown, etc. that are blocking and, AFAIK, have no async equivalents). I'm not sure if you can wait on a process (I think kqueue might support this, but IDK about Linux).

And more fundamentally, I don't think you want synchronous I/O calls that only allow you to do one thing per thread at a time. You want to schedule as much I/O as you have available with the OS as it becomes available, s.t. the OS has a more complete picture of what I/O is actually outstanding, to better schedule it.
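To make the workaround concrete: Python's asyncio, like most runtimes, can only fake async filesystem operations by handing the blocking syscall to a worker thread. A small sketch, assuming Python 3.9+ for asyncio.to_thread (function names here are mine, not from any library):

```python
import asyncio
import os
import tempfile

async def rename_async(src: str, dst: str) -> bool:
    # There is no async rename syscall to await, so asyncio.to_thread
    # just punts the blocking os.rename to a thread-pool worker.
    await asyncio.to_thread(os.rename, src, dst)
    return os.path.exists(dst)

def demo() -> bool:
    with tempfile.TemporaryDirectory() as d:
        src, dst = os.path.join(d, "a"), os.path.join(d, "b")
        open(src, "w").close()          # create the file to be renamed
        return asyncio.run(rename_async(src, dst))
```

The thread is a workaround, not a solution: the kernel still sees one blocking call per thread, and the scheduler gets no picture of the outstanding I/O, which is the parent's complaint.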


I got halfway through a back-of-the-envelope spec for how this might work. Change "everything is a file" to "everything is shared pages", but impose the additional sanity restriction that each accessor of a shared page should choose to mostly write or mostly read using lockfree structures.

Then instead of the operating system allocating a "standard in" for you, every process would be handed two event ring buffers, one for incoming and one for outgoing events. On waking, all the process has to do is scan the ring buffer.

Potentially easy to make suitable for microkernels too, by passing the buffers directly from process to disk subsystem.
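A minimal sketch of the kind of event ring described above, single producer and single consumer over a shared buffer. Pure Python can't express the atomics and memory barriers a real shared-page design needs, so those are only hinted at in comments; the class and field names are illustrative:

```python
class EventRing:
    """Toy single-producer/single-consumer ring buffer.

    In the shared-pages design sketched above, the producer process
    mostly writes (tail) and the consumer mostly reads (head), so each
    index has exactly one writer -- the property that makes a lockfree
    implementation possible.
    """
    def __init__(self, size: int):
        self.buf = [None] * size
        self.head = 0  # advanced only by the consumer
        self.tail = 0  # advanced only by the producer

    def push(self, event) -> bool:
        nxt = (self.tail + 1) % len(self.buf)
        if nxt == self.head:
            return False          # ring full; producer must back off
        self.buf[self.tail] = event
        self.tail = nxt           # publish only after the slot is written
        return True

    def pop(self):
        if self.head == self.tail:
            return None           # ring empty; nothing to scan
        ev = self.buf[self.head]
        self.head = (self.head + 1) % len(self.buf)
        return ev
```

On waking, a process would drain its incoming ring with repeated pop() calls, which is the "all the process has to do is scan the ring buffer" step.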


Take that one step further and do what NFSv4 does: do more per command (system call); instead of sending a single command (with a single response coming back), send a sequence of command and the responses from all of them (terminating early if any of them fails).


OS is easier than network because less coordination is needed. Stay tuned, Unix will be replaced.


That's Plan9 isn't it?


The one thing that HTTP/2 gives me that I find useful is a clear indication when the server closed an idle connection, so that clients can tell if their request was likely seen or not when the connection is closed.


Probably because HTTP is important! What would you change in 1.2?


I would replace cookies with a browser controlled session id, 128 random bits encoded as hex.
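For concreteness, minting such an identifier is a one-liner; this hypothetical sketch just uses the obvious stdlib call for 128 random bits rendered as hex:

```python
import secrets

def new_session_id() -> str:
    # 128 random bits from the OS CSPRNG, encoded as 32 hex characters.
    # Hypothetical: sketches the browser-minted session id the parent
    # proposes, not any real browser API.
    return secrets.token_hex(16)
```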



Yes, something like that!

Except also remove support for old-style cookies, which from a quick scan that spec doesn't do.


Ideally, everything related to sessions and domains. The security model we operate under on the web is completely outdated, which adds a ton of complexity to everything else.

Secondly, I would look into cutting out all the parts of the HTTP protocol normal websites aren't commonly using to reduce the complexity of the implementations.

A lot of things HTTP does are probably better off being move to other protocols altogether.


>QUIC is a brand-new internet transport protocol

In 1979 QUIC was software (for IBM mainframes IIRC) that produced a printout of lines in lexicographic order with the twist that the words on the line were shifted to eliminate initial noise words. E.g., THE TWO TOWERS appeared in the printout as TWO TOWERS THE (and maybe also appeared as TOWERS THE TWO).


I think you may be referring to the concept of KWIC [1], "Keyword in Context", which has been used in text-based indexes for publications for quite some time.

[1] https://en.wikipedia.org/wiki/Key_Word_in_Context
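The rotation the parent describes is easy to sketch: emit the title once per non-noise word, rotated so that word leads, then sort. Illustrative Python only; the stop-word list is an assumption:

```python
STOP_WORDS = {"THE", "A", "AN", "OF", "AND"}  # illustrative noise words

def kwic_rotations(title: str) -> list[str]:
    """Return the sorted KWIC rotations of a title, one per keyword."""
    words = title.upper().split()
    rotations = [
        " ".join(words[i:] + words[:i])       # rotate keyword to the front
        for i in range(len(words))
        if words[i] not in STOP_WORDS         # skip initial noise words
    ]
    return sorted(rotations)
```

On "The Two Towers" this yields both rotations from the parent's example, "TOWERS THE TWO" and "TWO TOWERS THE", while suppressing the entry that would have led with "THE".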


Thankfully we've now realized that "initial noise words" are fine and shouldn't be shifted when sorting.


Mmm?

No, there are a bunch of places where it doesn't matter as much because digitisation means searching is unnecessary, but when it matters, it still really matters.

In my music player under artists The Chemical Brothers sorts with Chvrches not with The KLF.

Human indexes are still really valuable in reference works, maybe we're getting close to AI making a useful stab at it, but pick up a copy of K&R and compare most auto-generated indices (if you can find a book that even bothers with an index) today, it's night and day.
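The music-player behaviour above amounts to a sort key that strips a leading article before comparing. Illustrative Python; the article list is an assumption:

```python
ARTICLES = ("THE ", "AN ", "A ")  # assumed list of leading noise words

def artist_sort_key(name: str) -> str:
    # Strip one leading article so "The Chemical Brothers" files under C.
    upper = name.upper()
    for article in ARTICLES:
        if upper.startswith(article):
            return upper[len(article):]
    return upper

artists = ["The KLF", "Chvrches", "The Chemical Brothers"]
ordered = sorted(artists, key=artist_sort_key)
# "The Chemical Brothers" now sorts near "Chvrches", not with "The KLF"
```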



