It's not isolation that hampers throughput. That's a red herring. In fact, isolation increases throughput, because it reduces synchronization: a group of isolated tasks is embarrassingly parallel by definition.
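To make that concrete, here is a minimal sketch in Erlang (the module and function names are mine, purely illustrative): each list element is handled by its own process, and because the processes share nothing, the work parallelizes without any locks or shared state.

    %% Spawn one isolated process per element; collect results by reference.
    -module(pmap_sketch).
    -export([pmap/2]).

    pmap(Fun, List) ->
        Parent = self(),
        Refs = [begin
                    Ref = make_ref(),
                    spawn(fun() -> Parent ! {Ref, Fun(X)} end),
                    Ref
                end || X <- List],
        %% Selective receive on each ref keeps the results in order.
        [receive {Ref, Result} -> Result end || Ref <- Refs].

Calling pmap_sketch:pmap(fun(X) -> X * X end, lists:seq(1, 1000)) runs a thousand isolated computations concurrently, with no coordination beyond the final collection of results.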
The throughput loss stems from a design that requires excessive communication. But such a design will always be slow, no matter your execution model. Modern CPUs simply don't cope well when cores constantly have to shuffle data between them. Neither does a GPU.
The grand design of BEAM is that you copy data rather than pass it by reference. A copy operation severs a data dependency by design: once the copy has been handed off, the receiving side can operate in isolation. And modern computers are far better at copying data around than people think. The exception is big-blocks-of-data(tm), but large binaries are read-only and reference-counted in BEAM, so they aren't copied.
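A small sketch of what that copying means in practice (names are illustrative): sending a message copies the term onto the receiver's heap, so the two processes share nothing afterwards. A large binary (over 64 bytes) is the exception; it lives off-heap, is reference-counted, and only a handle travels with the message.

    -module(copy_sketch).
    -export([demo/0]).

    demo() ->
        Big = binary:copy(<<"x">>, 1 bsl 20),    %% 1 MB refc binary: shared, not copied
        Small = lists:seq(1, 100),                %% ordinary term: copied on send
        Pid = spawn(fun() ->
                        receive
                            {Bin, List} ->
                                io:format("got ~p bytes and ~p items~n",
                                          [byte_size(Bin), length(List)])
                        end
                    end),
        %% Sending the tuple copies Small onto Pid's heap; Big is passed as a refc handle.
        Pid ! {Big, Small},
        ok.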
Sure, if you set up a problem that requires a ton of communication, then this model suffers. But so does your GPU if you do the same thing.
As Joe Armstrong said: our webserver is a thousand small webservers, each serving one request.
Virtually none of them have to communicate with each other.
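Here is a minimal sketch of that idea (the module name and canned response are mine, not Armstrong's code): an acceptor loop that spawns one isolated process per connection, so a thousand concurrent requests really are a thousand tiny servers that never talk to each other.

    -module(tiny_server).
    -export([start/1]).

    start(Port) ->
        {ok, Listen} = gen_tcp:listen(Port, [binary, {active, false}, {reuseaddr, true}]),
        accept_loop(Listen).

    accept_loop(Listen) ->
        {ok, Socket} = gen_tcp:accept(Listen),
        %% One isolated process per connection; the acceptor never hears from it again.
        Pid = spawn(fun() -> receive go -> handle(Socket) end end),
        ok = gen_tcp:controlling_process(Socket, Pid),
        Pid ! go,
        accept_loop(Listen).

    handle(Socket) ->
        case gen_tcp:recv(Socket, 0) of
            {ok, _Request} ->
                gen_tcp:send(Socket, <<"HTTP/1.1 200 OK\r\nContent-Length: 2\r\n\r\nok">>),
                gen_tcp:close(Socket);
            {error, _} ->
                gen_tcp:close(Socket)
        end.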