Mistral's partnership with Cerebras for inference hardware has received less commentary than I expected. They're basically blowing the competition out of the water, with Le Chat getting 1,100+ tokens per second of per-user throughput.
I'm curious when someone will run the right experiment: an LLM on Cerebras reasoning so well, at such scale and speed, that it produces something genuinely novel.
It should be noted that as a customer of the French ISP Free you get a one-year free subscription to Le Chat Pro (Free's CEO Xavier Niel is an investor in Mistral).
Gemini is so bad it's literally unusable. I can't think of any situation in which I could accept the quality of the output of Gemini, no matter how low the cost.
Because we've been testing, and for our tagging use case it's been quite good. 4o still outperforms by 4 points (90% vs 86%), but that's acceptable for us given the 35x decrease in cost. We've been able to get 88% on gemini-pro, so we're still debating which one we'll finalize on given speed, cost, and accuracy.
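The comparison is just accuracy over a labelled set. A minimal sketch of that kind of harness, where `tag_with_model` is a hypothetical wrapper around whichever provider's API and the model names are illustrative, not exact IDs:

```python
# Rough accuracy comparison for a tagging task across several models.
# `tag_with_model(model, text)` is a hypothetical API wrapper you supply;
# `labelled` is a list of (text, gold_tag) pairs you already have.

def accuracy(model, labelled, tag_with_model):
    """Fraction of items where the model's predicted tag matches the gold tag."""
    hits = sum(tag_with_model(model, text) == gold for text, gold in labelled)
    return hits / len(labelled)

# Example usage (model identifiers are placeholders):
# for model in ("gpt-4o", "gemini-flash", "gemini-pro"):
#     print(model, f"{accuracy(model, labelled, tag_with_model):.1%}")
```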
The Le Chat web UI slowed down to unusable levels for me after it had generated some code and text (the UI itself; it probably has some JS that walks the whole DOM on every update). That's why I downloaded the app.
Generally, I feel like all the AI models are about the same at this point. Grok on Twitter can access real-time event information, but the rest seem interchangeable.
I pay for ChatGPT for higher usage limits, then use the rest for different tasks mainly to keep their histories separate (not because one is better than another in the smartness department).
I have found that testing coding prompts in both Mistral and Claude lets me pick between them; they differ in some details of how they implement my goals (python3, numpy, matplotlib, json, requests-sourced data, CSV handling, linear regression). A sketch of the kind of task I give them is below.
They are similar in speed. I am probably travelling a well-worn road, so I'm hitting some equivalent of an LRU cache.
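Roughly the kind of script both models end up producing for me (the URL and column names here are placeholders, not a real dataset):

```python
# Fetch a CSV over HTTP, fit a straight line, and plot it.
import csv
import io

import numpy as np
import matplotlib.pyplot as plt
import requests

# Placeholder URL; swap in the real data source.
resp = requests.get("https://example.com/data.csv", timeout=30)
resp.raise_for_status()

rows = list(csv.DictReader(io.StringIO(resp.text)))
x = np.array([float(r["x"]) for r in rows])  # placeholder column names
y = np.array([float(r["y"]) for r in rows])

# Ordinary least squares fit: y ~ slope * x + intercept
slope, intercept = np.polyfit(x, y, 1)

plt.scatter(x, y, label="data")
plt.plot(x, slope * x + intercept, label=f"fit: y = {slope:.2f}x + {intercept:.2f}")
plt.legend()
plt.show()
```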
I stopped doing business with Mistral when I got an API subscription and then watched one of their devs break their OAuth and try to fix it live over several hours, for something they clearly hadn't bothered to try in a non-prod environment.
Mistral is great. I love their image generation and the speed at which it replies. They don't get as much hype as the other contenders, but it feels like they're quietly overtaking them.