I don't think I could implement it in under 4,000 SLOC, and I'd guess it would take over two months' worth of work just to get a minimal TCP stack that could interoperate with the Internet at large.
TCP is only complex because of all the optional bits for performance tuning. If size is all you care about, look at the embedded side. The actual core functionality is much smaller, and I would even go so far as to say that a lot of the widely used open-source TCP/IP stacks obscure this with their added complexity.
Uh. No. TCP is complex because it works in the scenario where you, me, and 20 coworkers share a fast LAN hooked up to a slow pipe to the Internet and we all have to share that pipe constantly, but nobody actually knows exactly who's trying to do what with it. TCP congestion control is a minor miracle even before you realize that this plays out writ large across the whole Internet, large-pipe-huge-pipe-small-pipe-big-pipe, without the Internet collapsing, which is what it used to do. One of the big challenges in designing fast, scalable transports that gain speed by allowing drops or out-of-order delivery is in making them compatible with the congestion control regime that TCP implements.
TCP is complex because it solves a mindbogglingly complex problem. That it is as small as it actually is makes it one of the more elegant things ever to come out of computer networking.
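To make the shared-pipe point a bit more concrete, here is a toy sketch of additive-increase/multiplicative-decrease (AIMD), the basic idea behind TCP congestion avoidance. It is not taken from any real stack: `simulate_aimd`, the capacity figure, and the loss model (every flow sees a loss whenever total demand exceeds capacity) are all simplifying assumptions.

```python
def simulate_aimd(initial_windows, capacity, rounds, increase=1.0, decrease=0.5):
    """Toy AIMD model (hypothetical helper, not a real TCP implementation).

    Each round, if the combined windows exceed `capacity`, every flow is
    assumed to see a loss and multiplies its window by `decrease`.
    Otherwise every flow adds `increase` to its window.
    """
    windows = [float(w) for w in initial_windows]
    for _ in range(rounds):
        if sum(windows) > capacity:
            # Multiplicative decrease: back off sharply on congestion.
            windows = [max(1.0, w * decrease) for w in windows]
        else:
            # Additive increase: probe gently for spare bandwidth.
            windows = [w + increase for w in windows]
    return windows


if __name__ == "__main__":
    start = [1, 40, 80]  # three flows with very unequal starting windows
    end = simulate_aimd(start, capacity=100, rounds=200)
    print("start:", start)
    print("end:  ", end)
    print("spread:", max(end) - min(end))
```

Additive increase leaves the gap between flows unchanged, while each multiplicative decrease halves it, so in this model the flows drift toward an equal share without any of them knowing what the others are doing.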
> One of the big challenges in designing fast, scalable transports that gain speed by allowing drops or out-of-order delivery is in making them compatible with the congestion control regime that TCP implements.
I would argue that this very thing makes TCP inelegant; congestion control should really have been its own layer between IP and TCP, instead of being something that every protocol that's not UDP has to carefully reimplement.
You're not necessarily wrong, but uIP may not be the best example of a good minimal TCP implementation. I've seen it routinely put the wrong value into the TCP window-size field. This creates an illusion of packet loss, causing senders to retransmit and performance to drop exponentially to zero.
As a workaround, I wrote a custom tool that avoids sending more than one TCP segment at a time:
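What follows is a hypothetical sketch of that idea, not the original tool: a stop-and-wait sender that keeps at most one segment in flight and retransmits it until it is acknowledged. The names `segment`, `send_stop_and_wait`, and `LossyReceiver` are made up for illustration, and the `transmit` callback stands in for the real network path.

```python
def segment(data: bytes, mss: int):
    """Split `data` into chunks of at most `mss` bytes."""
    return [data[i:i + mss] for i in range(0, len(data), mss)]


def send_stop_and_wait(data: bytes, mss: int, transmit, max_retries: int = 5) -> int:
    """Send `data` one segment at a time (illustrative only).

    `transmit(seq, payload)` is an assumed callback that returns True if the
    segment was acknowledged and False if it timed out. The next segment is
    not sent until the current one has been acknowledged.
    Returns the total number of transmission attempts.
    """
    seq = 0
    attempts = 0
    for chunk in segment(data, mss):
        for _ in range(max_retries + 1):
            attempts += 1
            if transmit(seq, chunk):
                break  # acknowledged, move on to the next segment
        else:
            raise TimeoutError(f"no ACK for segment at offset {seq}")
        seq += len(chunk)
    return attempts


class LossyReceiver:
    """Stand-in peer that drops every `drop_every`-th transmission."""

    def __init__(self, drop_every: int):
        self.drop_every = drop_every
        self.calls = 0
        self.received = bytearray()

    def __call__(self, seq: int, payload: bytes) -> bool:
        self.calls += 1
        if self.drop_every and self.calls % self.drop_every == 0:
            return False  # simulate a lost segment or lost ACK
        if seq == len(self.received):
            self.received += payload  # accept only the next in-order bytes
        return True


if __name__ == "__main__":
    rx = LossyReceiver(drop_every=3)
    payload = bytes(range(256)) * 4
    tries = send_stop_and_wait(payload, mss=100, transmit=rx)
    print("delivered intact:", bytes(rx.received) == payload, "attempts:", tries)
```

The trade-off is throughput: with one segment per round trip, the sender never relies on the advertised window being right, but it also never fills the pipe.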