
Quantized, a top-end Mac can run models up to about 200B (with 128GiB of unified RAM). They'll run a little slow but they're usable.
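The back-of-envelope math behind that claim: at 4-bit quantization a 200B-parameter model's weights take roughly 100 GB, which squeezes under 128GiB. A rough sketch (the function name and the ~10% overhead factor are my assumptions, not benchmarks):

```python
# Rough memory estimate for holding a quantized model's weights in RAM.
# The overhead factor (~10% for KV cache/activations) is an assumption.
def model_mem_gib(params_billions, bits_per_weight, overhead=1.1):
    bytes_total = params_billions * 1e9 * bits_per_weight / 8 * overhead
    return bytes_total / 2**30  # GiB

print(round(model_mem_gib(200, 4), 1))   # 4-bit quant of a 200B model: ~102 GiB
print(round(model_mem_gib(200, 16), 1))  # fp16 for comparison: ~410 GiB
```

So 4-bit fits in 128GiB with room for the KV cache, while fp16 would need a multiple of that.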

It's a pricey machine, though. But 5-10 years from now I can imagine a mid-range machine running 200-400B models at a usable speed.



They're pretty cheap compared to the _actual_ cost of a GPU farm, or of buying an A100, though. Of course, not everybody will buy these machines, but not everybody needs a high-powered LLM either. A 13B Mistral can probably be fine-tuned to do your homework and pretend to be your girlfriend.



