GPT-3.5 is probably the most popular model used in applications (due to price po...

anonyfox · on Dec 11, 2023

Not only price, but also speed and API limits.

I always ask myself the following pseudo-question: "for this generation/classification task, do I need to be more intelligent than an average high school student?" In business tasks the answer is almost always no, so I go with GPT3.5. It's much quicker and usually good enough to accomplish the task.

And then I need to run this task thousands of times, so API limits become the main bottleneck. They are much higher for the GPT3.5 variants, whereas with GPT4 I have to be more careful about limiting/queueing requests.

I'm patiently waiting for a model that only needs to be at GPT3.5 level but is efficient enough to self-host alongside my applications with reasonably low server requirements. No need for GPT-5 for now: for business automations the lower end of "intelligence" is more than enough, and efficiency/scaling is the real deal.
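
For anyone wondering what "limiting/queueing requests" can look like in practice, here is a minimal client-side sketch in Python. It is only an illustration under assumptions: the names `SlidingWindowLimiter` and `run_throttled` are made up for this example, `call` stands in for whatever function actually sends the request, and the numbers you pass in have to come from your own account's limits. Providers may also cap tokens per minute, which this sketch ignores.

```python
import time
from collections import deque
from typing import Callable, Deque, Iterable, List, TypeVar

T = TypeVar("T")
R = TypeVar("R")


class SlidingWindowLimiter:
    """Allow at most `max_calls` calls in any window of `period` seconds.

    Hypothetical helper for illustration; not part of any vendor SDK.
    """

    def __init__(
        self,
        max_calls: int,
        period: float,
        clock: Callable[[], float] = time.monotonic,
        sleep: Callable[[float], None] = time.sleep,
    ) -> None:
        if max_calls < 1 or period <= 0:
            raise ValueError("max_calls must be >= 1 and period must be > 0")
        self.max_calls = max_calls
        self.period = period
        self._clock = clock  # injectable so the limiter can be tested without waiting
        self._sleep = sleep
        self._stamps: Deque[float] = deque()  # start times of recent calls

    def acquire(self) -> float:
        """Block until one more call is allowed; return the seconds waited."""
        waited = 0.0
        while True:
            now = self._clock()
            # Forget calls that have already left the window.
            while self._stamps and now - self._stamps[0] >= self.period:
                self._stamps.popleft()
            if len(self._stamps) < self.max_calls:
                self._stamps.append(now)
                return waited
            # Window is full: wait until the oldest call expires, then re-check.
            delay = self.period - (now - self._stamps[0])
            self._sleep(delay)
            waited += delay


def run_throttled(
    tasks: Iterable[T], call: Callable[[T], R], limiter: SlidingWindowLimiter
) -> List[R]:
    """Run `call` on each task in order, never exceeding the limiter's rate."""
    results: List[R] = []
    for task in tasks:
        limiter.acquire()
        results.append(call(task))
    return results
```

Usage would be something like `run_throttled(prompts, send_request, SlidingWindowLimiter(max_calls=60, period=60.0))`, where `send_request` and the 60-per-minute figure are placeholders. With a higher limit the loop simply sleeps less, which is the practical difference between the two model tiers described above.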

jstummbillig · on Dec 11, 2023

Do you mind sharing some tasks that you are solving with GPT 3.5? Be very concrete, if you don't mind. I am struggling to make it work for my business use cases (i.e. the ones where I am looking for "reliably helpful") and am very much looking for inspiration to define the limits. The hypothetical is interesting but seems to not do too much for me on its own.

tyfon · on Dec 11, 2023

I second this. For some types of applications, the 4 model can quickly ramp up costs, especially with large contexts, and 3.5 often does the job just fine.

So for many applications it's the real competitor.

anonylizard · on Dec 11, 2023

Too bad GPT3.5 Turbo is dirt cheap. Open source models are substantially more expensive when you factor in operating costs. There is no mature ecosystem where you can just plug in a model and spin up a robust infrastructure to run a local LLM at scale, aka you need infrastructure/ML engineers to do it, aka extremely expensive unless you are using LLMs at extremely large scales.

dcastm · on Dec 11, 2023

I think we'll start seeing a lot more services like https://www.together.ai soon.

Having open-weight models better than gpt-3.5 will drive a lot of competition on the LLM infra.

Der_Einzige · on Dec 11, 2023

The additional control/features (support for grammars, constraints, fine-tuning, etc.) far outweigh the cost savings.

happycube · on Dec 11, 2023

Mistral's endpoint for mistral-small is slightly cheaper.