
Quantized, a top-end Mac can run models up to about 200B (with 128GiB of unified RAM). They'll run a little slow but they're usable.
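The back-of-envelope math behind that claim: at 4-bit quantization a 200B-parameter model's weights take roughly 100 GB, which squeezes under 128GiB. A rough sketch (the function name and the ~10% overhead factor are my assumptions, not benchmarks):

```python
# Rough memory estimate for holding a quantized model's weights in RAM.
# The overhead factor (~10% for KV cache/activations) is an assumption.
def model_mem_gib(params_billions, bits_per_weight, overhead=1.1):
    bytes_total = params_billions * 1e9 * bits_per_weight / 8 * overhead
    return bytes_total / 2**30  # GiB

print(round(model_mem_gib(200, 4), 1))   # 4-bit quant of a 200B model: ~102 GiB
print(round(model_mem_gib(200, 16), 1))  # fp16 for comparison: ~410 GiB
```

So 4-bit fits in 128GiB with room for the KV cache, while fp16 would need a multiple of that.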

It's a pricey machine, though. But 5-10 years from now I can imagine a mid-range machine running 200-400B models at a usable speed.



They're pretty cheap compared to the _actual_ cost of a GPU farm, or of buying an A100, though. Of course, not everybody will buy these machines, but not everybody needs a high-powered LLM either. A 13B Mistral can probably be fine-tuned to do your homework and pretend to be your girlfriend.



