
Modern supercomputers are obscenely parallel machines built to chew through embarrassingly parallelizable tasks.


Forgive me if I'm wrong, but I thought supercomputer time was usually not allocated to embarrassingly parallel tasks. While they can certainly do those tasks well, they're a waste of a distributed system with expensive, high-bandwidth fiber connections between nodes.

When I spent (a relatively small amount of) time working with one, this was the main thing the director drilled into my head. Use it to solve large, parallel problems that require lots of internode communication of intermediate results. Embarrassingly parallel problems can be solved on cheaper hardware like GPUs.
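To make the distinction concrete, here's a minimal sketch of the communication-heavy pattern in Julia (assuming the MPI.jl package, which nobody in the thread names; the workload itself is made up for illustration). The `Allreduce` inside the loop is the part that actually exercises a supercomputer's interconnect: every rank must exchange its intermediate result before anyone can proceed.

```julia
# Sketch only: assumes MPI.jl and an MPI launcher,
# e.g. `mpiexec -n 4 julia script.jl`.
using MPI

MPI.Init()
comm = MPI.COMM_WORLD
rank = MPI.Comm_rank(comm)

local_x = fill(1.0, 1_000)        # this rank's chunk of the problem

for step in 1:10
    local_sum = sum(local_x)      # cheap, embarrassingly parallel part
    # Interconnect-bound part: every rank needs the combined
    # intermediate result before the next iteration can start.
    global_sum = MPI.Allreduce(local_sum, +, comm)
    local_x .*= 1.0 / global_sum  # update depends on the global value
end

MPI.Finalize()
```

If you delete the `Allreduce`, the loop becomes embarrassingly parallel and there's no reason to pay for the fast fabric.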


When the data no longer fits inside a compute resource (a node, or even a rack), you are by necessity going to be distributed. Communication is a fact of life when the problem size grows. This is true of GPU-based computing as well.


So is a GPU.


Might depend on the level of embarrassment.

Interesting that they did this with Julia, with 83% of instructions being AVX-512 (if I'm reading it correctly).

Does anyone know if Julia's GPU capabilities could have been leveraged on, say, a cluster of NVIDIA A100/V100?


This work is four years old (with development happening before that), so Julia's GPU capabilities probably weren't good enough at the time. If you wanted to do it today, that'd probably be the way to do it, but it would need some benchmarking.
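For what it's worth, the modern route would be CUDA.jl's `CuArray` type, which lets broadcast-heavy Julia code run on an A100/V100 mostly unchanged. This is a generic sketch, not the paper's code, and it assumes CUDA.jl plus an NVIDIA GPU with decent fp64:

```julia
# Sketch only: assumes CUDA.jl and an fp64-capable NVIDIA GPU (A100/V100).
using CUDA

n = 10^7
a = CUDA.rand(Float64, n)   # arrays allocated on the GPU
b = CUDA.rand(Float64, n)

# Broadcasting compiles to a single fused GPU kernel;
# no hand-written CUDA C is needed.
c = @. a * b + sin(a)
total = sum(c)              # the reduction also runs on the device
```

Scaling that across a cluster would mean pairing it with MPI (ideally CUDA-aware, so device buffers move over the interconnect directly), which is where the benchmarking question comes in.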


A lot of modern supercomputers have GPUs. But for a long time most GPUs had very poor fp64 compute capability, so they weren't really used for anything requiring double precision.
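The precision gap is easy to see in plain Julia: Float32 stops resolving unit increments at 2^24, while Float64 holds out to 2^53. That's why strong fp32 throughput on consumer GPUs doesn't help workloads that accumulate long sums or small residuals:

```julia
# Float32 has a 24-bit significand, Float64 a 53-bit one.
big32 = Float32(2)^24        # 16_777_216
big64 = Float64(2)^24

# Adding 1 is silently lost in Float32 (the increment rounds away)...
@assert big32 + 1f0 == big32

# ...but Float64 still resolves it exactly at this magnitude.
@assert big64 + 1.0 != big64
@assert big64 + 1.0 == 16_777_217.0
```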




