As LLMs are productionised and commodified, they're incorporating changes that are enthusiast-unfriendly. Small dense models are great for enthusiasts running inference locally, but for parallel batched inference, MoE models are much more efficient.
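
Rough sketch of the arithmetic behind that claim (the model sizes, expert count, and the ~2-FLOPs-per-active-parameter rule of thumb are illustrative assumptions, not any specific model): a dense model runs every token through all of its weights, while an MoE routes each token through only its top-k experts, so compute per token scales with active rather than total parameters.

    # Illustrative comparison of per-token compute, dense vs. MoE.
    # All sizes here are hypothetical, chosen only to show the ratio.

    def flops_per_token(active_params):
        # Rough rule of thumb: ~2 FLOPs per active parameter per forward pass.
        return 2 * active_params

    dense_params = 70e9                        # dense: all 70B params active per token
    moe_total, n_experts, top_k = 140e9, 16, 2
    moe_active = moe_total * top_k / n_experts # only the routed experts run: 17.5B

    print(flops_per_token(dense_params))       # ~1.4e11 FLOPs/token
    print(flops_per_token(moe_active))         # ~3.5e10 FLOPs/token, ~4x cheaper

The flip side for local use is that all of the MoE's total parameters still have to sit in memory even though only a fraction is active per token, which is why the trade-off favours large batched servers over a single enthusiast's machine.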



