
Interesting that this paper contains hard numbers hinting at Google's absolute scale. They say Monarch has 144,000 leaves. Even if each leaf is assigned only one CPU core -- which is probably an underestimate, because who would do that? -- that makes Google's monitoring stack alone a Top 100 supercomputer.

The only other places I've seen Google give out hard numbers were a presentation by Jeff Dean mentioning MapReduce core-years consumed per day, and a footnote in a paper mentioning how much CPU time Google Exacycle donates to scientific computing every day. All of these calibration points are eye-opening.



They also estimate close to a petabyte of RAM; if all of that RAM sits in the leaves, that is about 6-7 GB per leaf. I don't think we can say it would be unreasonable to have one core per leaf replica; presumably some leaves have low utilisation, so it might make sense to share the core with another workload. From a capacity-planning standpoint, I think they leave this open, though they do indicate that they are sometimes CPU-bound, so they don't try to compress beyond delta compression. That might suggest multiple cores per leaf.
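The per-leaf figure is just the two numbers from the paper divided out; a quick back-of-envelope check (assuming the "close to a petabyte" is a decimal petabyte and all of it is attributed to leaves):

```python
# Back-of-envelope: RAM per leaf, given ~1 PB total RAM and 144,000 leaves.
# Both the 1 PB figure and the all-RAM-in-leaves assumption are from the
# discussion above, not exact numbers.
TOTAL_RAM_BYTES = 10**15   # ~1 petabyte (decimal)
NUM_LEAVES = 144_000

ram_per_leaf_gb = TOTAL_RAM_BYTES / NUM_LEAVES / 10**9
print(f"{ram_per_leaf_gb:.1f} GB per leaf")  # ~6.9 GB
```

which lands right in the 6-7 GB range quoted above.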

A single core design allows a very simple concurrency model, without having to worry about cache pingponging, false sharing, or myriad other issues. The parallelism is applied at higher layers, as there are multiple replicas for each leaf and obviously they can use many cores effectively overall.

I don't see that the paper gives enough information to help us prune the design space here.


We don't run Exacycle any more (I built and ran Exacycle for several years). It's not a cost-effective way to do science, but yes, the scale was absolutely insane. These days, I'm more interested in seeing if there are ways to use TPUs, rather than CPUs, for similar kinds of opportunistic computing.


Yeah, the "supercomputer" ranking is a bit of a joke. Every mid-sized Google DC would count as a top-10 supercomputer.


I work at one such "mid-sized" Google DC. Supercomputers are typically much more interconnected, whilst we have a much more traditional topology.


Supercomputers are more about the network topology than raw processing power.


Supercomputers, by tradition, are tightly coupled in a way that Google datacenter servers aren't. The closest things Google has to supercomputers are GPUs linked by high-performance networks, and TPU pods (which have their own custom toroidal mesh).


Isn't the whole point of a supercomputer that it isn't just a datacentre with an LED display on the front?


When I gave the presentation at Facebook's @Scale NYC conference last summer, I talked the powers that be into using more specific numbers, and they allowed it for the paper too.



It's not a supercomputer.

The reason supercomputers are so tiny compared to cloud datacenters is that cloud workloads are highly and mostly-trivially parallel, whereas supercomputer workloads are tightly coupled: mesh interconnects and serial dependencies between nodes.


That appears to be your own private definition of the term. There are lots of things in the Top500 list that are just a pile of Xeon boxes with Ethernet.


*that's using Monarch


Is the entire internet the world's largest supercomputer?



