
> GPT4 does inference at 560 teraflops. Human brain goes 10,000 teraflops

AFAICT, both are guesses. The estimates I've seen for the human brain range from ~162 GFLOPS[0] at the low end up to 10^28 FLOPS[1]; even just the model size of GPT-4 isn't confirmed, only inferred from public information combined with a rumour widely described as a "leak", and likewise the compute requirements.

[0] https://geohot.github.io//blog/jekyll/update/2022/02/17/brai...

[1] https://aiimpacts.org/brain-performance-in-flops/



They're not guesses. We know they use A100s and we know how fast an A100 goes. You can cut a brain open and see how many neurons it has and how often they fire. Kurzweil's 10 petaflops for the brain (100e9 neurons * 1000 connections * 200 calculations) is a bit high for me honestly. I don't think connections count as flops. If a neuron only fires 5-50 times a second then that'd put the human brain at .5 to 5 teraflops it seems to me. That would explain why GPT is so much smarter and faster than people. The other estimates like 1e28 are measuring different things.
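For concreteness, here's that arithmetic as a few lines of Python (just the round numbers quoted in this thread, nothing measured):

    # Rough sketch using the round numbers quoted above.
    NEURONS = 100e9                         # ~1e11 neurons

    # Neuron-only view: each firing counted as one operation.
    low_ops  = NEURONS * 5                  # 5 Hz  -> ~0.5 teraflops
    high_ops = NEURONS * 50                 # 50 Hz -> ~5 teraflops

    # Connection-based view (Kurzweil-style): ~1000 connections, ~200 calculations/s each.
    connection_ops = NEURONS * 1000 * 200   # ~2e16 ops/s, quoted above as ~10 petaflops

    print(f"neuron-only: {low_ops:.1e} to {high_ops:.1e} ops/s")
    print(f"connection-based: {connection_ops:.1e} ops/s")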


> They're not guesses. We know they use A100s and we know how fast an A100 goes.

And we don't know how many GPT-4 instances run on a single A100, or whether it's the other way around and how many A100s are needed to run a single GPT-4 instance. We also don't know how many tokens/second any given instance produces, so multiple users may be (my guess is they are) queued on any given instance. We have a rough idea how many machines they have, but not how intensively they're being used.

> You can cut a brain open and see how many neurons it has and how often they fire. Kurzweil's 10 petaflops for the brain (100e9 neurons * 1000 connections * 200 calculations) is a bit high for me honestly. I don't think connections count as flops. If a neuron only fires 5-50 times a second then that'd put the human brain at .5 to 5 teraflops it seems to me.

You're double-counting. "If a neuron only fires 5-50 times a second" = maximum synapse firing rate * fraction of cells active at any given moment, and the 200 is what you get from assuming it could go at 1000/second (they can) but only 20% are active at any given moment (a bit on the high side, but not by much).

Total = neurons * synapses/neuron * maximum synapse firing rate * fraction of cells active at any given moment * operations per synapse firing

1e11 * 1e3 * 1e3 Hz * 10% (of your brain in use at any given moment, where the similarly phrased misconception comes from) * 1 floating point operation = 1e16/second = 10 PFLOP

It currently looks like we need more than 1 floating point operation to simulate a synapse firing.
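Same formula as a quick sanity check in Python (the constants are the round figures above; ops_per_firing = 1 is the part that's probably too low):

    neurons         = 1e11
    synapses_per    = 1e3    # synapses per neuron
    max_firing_rate = 1e3    # Hz
    active_fraction = 0.10   # fraction active at any given moment
    ops_per_firing  = 1      # probably an underestimate, per the caveat above

    total_ops = neurons * synapses_per * max_firing_rate * active_fraction * ops_per_firing
    print(f"{total_ops:.0e} ops/s")   # 1e16 ops/s ~ 10 PFLOPS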

> The other estimates like 1e28 are measuring different things.

Things which may turn out to be important for e.g. Hebbian learning. We don't know what we don't know. Our brains are much more sample-efficient than our ANNs.


Synapse count might be akin to transistor count, which is only roughly correlated with FLOPS on modern architectures.

I've also heard in a recent talk that the optic nerve carries about 20 Mbps of visual information. If we imagine a saturated task such as the famous gorilla walking through the people passing around a basketball, then we can arrive at some limits on the conscious brain. This does not count the autonomic, sympathetic, and parasympathetic processes, of course, but those could in theory be fairly low bandwidth.

There is also the matter of the "slow" computation in the brain that happens through neurotransmitter release. It is analog and complex, but with a slow clock speed.

My hunch is that the brain is fairly low FLOPs but highly specialized, closer to an FPGA than a million GPUs running an LLM.


> I don't think connections count as flops. If a neuron only fires 5-50 times a second then that'd put the human brain at .5 to 5 teraflops it seems to me.

That assumes you can represent all of the useful parts of the decision about whether or not to fire in the equivalent of one floating point operation, which seems to be an optimistic assumption. It also assumes there's no useful information encoded in e.g. the phase of firing.


Imagine that there's a little computer inside each neuron that decides when it needs to do work. Those computers are an implementation detail of the flops being provided by neurons, and would not increase the overall flop count, since that'd be counting them twice. For example, how would you measure the speed of a Game Boy emulator? Would you take into consideration all the instructions the emulator itself needs to run in order to simulate the Game Boy instructions?


Already considered in my comment.

> Imagine that there's a little computer inside each neuron that decides when it needs to do work

Yah, there's -bajillions- of floating point operation equivalents happening in a neuron deciding what to do. They're probably not all functional.

BUT, that's why I said the "useful parts" of the decision:

It may take more than the equivalent of one floating point operation to decide whether to fire. For instance, if you are weighting multiple inputs to the neuron differently to decide whether to fire now, that would require multiple multiplications of those inputs. If you consider whether you have fired recently, that's more work too.

Neurons do all of these things, and more, and these things are known to be functional-- not mere implementation details. A computer cannot make an equivalent choice in one floating point operation.

Of course, this doesn't mean that the brain is optimal-- perhaps you can do far less work. But if we're going to use it as a model to estimate scale, we have to consider what actual equivalent work is.
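As a toy illustration (a crude point-neuron sketch, not a model of real neurons): with N weighted inputs plus a "have I fired recently?" check, one firing decision already costs on the order of 2N floating point operations, not one.

    def fires(inputs, weights, threshold, time_since_last_spike, refractory_s=0.002):
        # "Have I fired recently?" -- the refractory check mentioned above.
        if time_since_last_spike < refractory_s:
            return False
        # One multiply and one add per input: N inputs -> roughly 2N floating point ops.
        drive = sum(w * x for w, x in zip(weights, inputs))
        return drive > threshold

    print(fires([0.4, 0.9, 0.1], [1.2, -0.3, 0.7], threshold=0.5, time_since_last_spike=0.010))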


I see. Do you think this is what Kurzweil was accounting for when he multiplied by 1000 connections?


Yes, but it probably doesn't tell the whole story.

There's basically a few axes you can view this on:

- Number of connections and complexity of connection structure: how much information is encoded about how to do the calculations.

- Mutability of those connections: these things are growing and changing -while doing the math on whether to fire-.

- How much calculation is really needed to do the computation encoded in the connection structure.

Basically, brains are doing a whole lot of math and working on a dense structure of information, but not very precisely because they're made out of meat. There's almost certainly different tradeoffs in how you'd build the system based on the precision, speed, energy, and storage that you have to work with.


That is based on an outdated assumption of neuron function.

Firstly, Kurzweil underestimates the number of connections by an order of magnitude.

Secondly, dendritic computation changes things. Individual dendrites and the dendritic tree as a whole can perform multiple independent computations: logical operations, low-pass filtering, coincidence detection, ... One neuronal activation is potentially thousands of operations per neuron.

A single human neuron can be the equivalent of thousands of artificial neurons.
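One way to picture that (a toy two-layer sketch in the spirit of "dendritic subunits as hidden units"; the branch structure and the tanh nonlinearity are placeholders, not measured biology): if each dendritic branch applies its own nonlinearity before the soma integrates, the per-neuron operation count multiplies accordingly.

    import math

    def neuron_fires(branch_inputs, branch_weights, soma_weights, threshold=1.0):
        # Each dendritic branch does its own weighted sum plus nonlinearity...
        branch_acts = [
            math.tanh(sum(w * x for w, x in zip(ws, xs)))
            for ws, xs in zip(branch_weights, branch_inputs)
        ]
        # ...and only then does the soma integrate the branch outputs.
        return sum(w * a for w, a in zip(soma_weights, branch_acts)) > threshold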



