
AlphaZero used 5,000 TPUs to generate self-play games (inference only), and 16 to train the networks.

The split definitely depends on where you are in the develop/deploy cycle.

(Source: https://kstatic.googleusercontent.com/files/2f51b2a749a284c2...)
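As a rough illustration of how lopsided that split is, here's a minimal sketch using only the TPU counts quoted above. It treats per-chip throughput as equal across the two pools, which is a simplification (the paper used first-generation TPUs for self-play and second-generation TPUs for training):

    # Back-of-envelope split of AlphaZero's hardware between
    # self-play generation (inference) and network training.
    # Assumes equal per-chip throughput, purely for illustration.
    selfplay_tpus = 5000
    training_tpus = 16

    total = selfplay_tpus + training_tpus
    print(f"Inference share: {selfplay_tpus / total:.1%}")  # ~99.7%
    print(f"Training share:  {training_tpus / total:.1%}")  # ~0.3%
    print(f"Ratio: {selfplay_tpus / training_tpus:.0f}:1")  # ~312:1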



Completely agreed. For some of these large language models, it would take a long time before inference spend dominates training spend.
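To make that concrete, here's a minimal break-even sketch. Every constant below is a hypothetical placeholder, not a figure from any published model:

    # Hypothetical break-even estimate: how many days of serving
    # until cumulative inference compute exceeds the one-time
    # training compute? All constants are assumptions.
    training_flops = 3e23      # assumed one-time training cost
    flops_per_query = 1e12     # assumed forward-pass cost per query
    queries_per_day = 1e7      # assumed serving volume

    inference_flops_per_day = flops_per_query * queries_per_day
    breakeven_days = training_flops / inference_flops_per_day
    print(f"Inference overtakes training after ~{breakeven_days:,.0f} days "
          f"(~{breakeven_days / 365:.0f} years)")

With these placeholder numbers the crossover is decades away; much heavier traffic or a much cheaper training run would pull it in dramatically.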



