Mistral's partnership with Cerebras for inference hardware has received less commentary than I expected. They're basically blowing the competition out of the water, with Le Chat getting 1,100+ tokens per second of per-user throughput.
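To put that number in perspective, here's a quick back-of-the-envelope sketch. The 1,100 tok/s figure is from above; the ~75 tok/s "typical GPU serving" baseline is my own assumption for comparison, not something from the announcement:

    # Rough latency comparison for streaming a completion.
    # 1,100 tok/s is the reported Le Chat/Cerebras figure;
    # 75 tok/s is an assumed baseline for a typical GPU stack.

    def generation_time(tokens: int, tokens_per_second: float) -> float:
        """Seconds to stream a completion at a given per-user throughput."""
        return tokens / tokens_per_second

    response_tokens = 500  # a medium-length answer

    for label, tps in [("Cerebras (Le Chat)", 1100.0),
                       ("typical GPU stack (assumed)", 75.0)]:
        secs = generation_time(response_tokens, tps)
        print(f"{label}: {secs:.2f}s for {response_tokens} tokens")

    # Cerebras (Le Chat): 0.45s for 500 tokens
    # typical GPU stack (assumed): 6.67s for 500 tokens

So a medium-length answer lands in under half a second instead of several seconds, which is why the difference is so visible in the UI.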


Yes, I'm really impressed by the speed as well.

A bit more about the collaboration can be found here:

https://cerebras.ai/blog/mistral-le-chat


For those that haven’t tried it, it’s best to see for yourself - it is visibly, significantly faster:

https://chat.mistral.ai/chat


That's just crazy.

I'm curious when someone will run the right experiment: an LLM on Cerebras reasoning so well, at such scale and speed, that it produces something genuinely novel.



