
Data-efficiency matters, but compute-efficiency matters too.

LLMs have a reasonable learning rate at inference time (in-context learning is powerful), but a very poor learning rate in pretraining. One mitigating factor is that we have an awful lot of cheap data to pretrain those LLMs with.

We don't know how much compute the human brain uses to do what it does. And what if we could pretrain with the same data-efficiency as humans, but at the cost of 10,000x the compute?

It would be impossible to justify doing that for all but the most expensive, hard-to-come-by gold-plated datasets - ones that are actually worth squeezing every drop of performance gains out from.



We do know how much energy a human brain uses to do whatever it does though: roughly 20 W.

That it takes vast power to train LLMs (and run them) without getting intelligence looks pretty bad when you compare the energy inputs to the outcomes.


Energy is even weirder. Global electricity supply is about 3 TW across 8 billion people, so 375 W/person, vs the 100-124 W/person of our metabolism. Given how much cheaper electricity is than food, AI can be much worse in joules for the same outcome, while still being good enough to get all the electricity.
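A quick back-of-envelope check of those per-person figures (the 3 TW and 8 billion numbers are the rough estimates from above, not precise data):

```python
# Rough check: global electricity supply per person vs human metabolic power.
global_electricity_w = 3e12   # ~3 TW of global electricity supply (assumed figure)
population = 8e9              # ~8 billion people

electricity_per_person_w = global_electricity_w / population
print(electricity_per_person_w)  # 375.0 W/person

metabolic_w = 100             # ~100 W resting metabolism (~2000 kcal/day)
ratio = electricity_per_person_w / metabolic_w
print(ratio)                  # electricity budget is ~3.75x our metabolic budget
```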


Rice is 45 cents per kg in bulk, and contains the equivalent of 4kWh. Electricity is not actually much cheaper than food, if at all.
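The implied price per kWh works out like this (a sketch assuming 45 cents/kg bulk rice and a typical ~3650 kcal/kg food-energy content for white rice):

```python
# Sanity check: rice as an energy source, priced per kWh.
rice_price_usd_per_kg = 0.45   # bulk price assumed above
rice_kcal_per_kg = 3650        # typical food energy of white rice (assumption)
kwh_per_kcal = 1.163e-3        # 1 kcal = 1.163 Wh

rice_kwh_per_kg = rice_kcal_per_kg * kwh_per_kcal   # ~4.2 kWh/kg
cost_per_kwh = rice_price_usd_per_kg / rice_kwh_per_kg
print(round(cost_per_kwh, 3))  # ~$0.106/kWh, in the same range as typical grid rates
```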



