This isn't true. You can build awesome high bandwidth clusters for extremely che...

lorenzhs · on July 31, 2015

You say "awesome high bandwidth" but at 1 Gbit/s per node you're still a long way from an InfiniBand 4X FDR interconnect (54 Gbit/s and sub-microsecond latency, significantly lower than your network). As you write, these are built with multistage routers, which add even more latency. So in effect they have reduced (but still high) communication capabilities to keep costs manageable, just as I said.

dekhn · on July 31, 2015

1 Gbit/sec, if you look at the Jupiter paper, was the host speed in 2004. The Jupiter system works with 10G and 40G interfaces on the host.

What's important to recognize is that you simply cannot buy InfiniBand switches that let you connect a lot of hosts (10K+) together. The vendors won't sell you this, they won't do the R&D to make it, and the cost would be effectively infinite anyway.

This is a deliberate choice: for most Internet work, it's better to have really fat bisection bandwidth and non-blocking fabrics. Latency is deprioritized because a high-radix crossbar that also delivers low latency is too expensive to build.

Unless you have an algorithm that absolutely requires low latency and simply cannot be fixed, you are almost always better off building a cheaper, fatter fabric and hiring engineers who know how to write latency-tolerant applications.
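
To make the "latency tolerant" point concrete, here is a rough sketch using the simple latency-plus-bandwidth cost model (per-message latency plus bytes divided by bandwidth). The 54 Gbit/s and roughly 1 microsecond figures are the InfiniBand numbers quoted upthread, and 40G is the host interface speed mentioned above. The 50 microsecond latency for the cheaper fabric, the 100 MB payload, and the message sizes are purely assumptions for illustration, and the model assumes messages are sent one after another with no pipelining.

```python
def total_time(total_bytes, message_bytes, latency_s, bandwidth_bits_per_s):
    """Seconds to move total_bytes as back-to-back messages of message_bytes each.

    Cost model: every message pays a fixed latency, and every byte pays
    serialization time at the link bandwidth. No overlap or pipelining.
    """
    n_messages = -(-total_bytes // message_bytes)  # ceiling division
    return n_messages * latency_s + (total_bytes * 8) / bandwidth_bits_per_s


# Figures quoted in the thread: 54 Gbit/s, about 1 microsecond latency.
LOW_LATENCY_FABRIC = {"latency_s": 1e-6, "bandwidth_bits_per_s": 54e9}
# 40G is from the thread; the 50 microsecond latency is an ASSUMPTION.
CHEAP_FAT_FABRIC = {"latency_s": 50e-6, "bandwidth_bits_per_s": 40e9}

PAYLOAD = 100_000_000  # 100 MB, hypothetical workload

# Chatty application: many 1 KB messages.
chatty_low = total_time(PAYLOAD, 1_000, **LOW_LATENCY_FABRIC)
chatty_cheap = total_time(PAYLOAD, 1_000, **CHEAP_FAT_FABRIC)

# Latency-tolerant application: the same data batched into 10 MB messages.
batched_low = total_time(PAYLOAD, 10_000_000, **LOW_LATENCY_FABRIC)
batched_cheap = total_time(PAYLOAD, 10_000_000, **CHEAP_FAT_FABRIC)

print(f"chatty:  cheap fabric is {chatty_cheap / chatty_low:.1f}x slower")
print(f"batched: cheap fabric is {batched_cheap / batched_low:.1f}x slower")
```

Under these assumed numbers, the chatty version pays the latency penalty on every message, while the batched version is dominated by bandwidth, so the gap between the two fabrics mostly disappears.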

dekhn · on July 31, 2015

here we go, the Jupiter paper is now published: http://conferences.sigcomm.org/sigcomm/2015/pdf/papers/p183....