
Yep, batching is a feature I really wish the OpenAI API had, along with the ability to intelligently cache frequently used prompts. Both are much easier to achieve with a self-hosted open-source model, so for the time being it's a speed vs. customizability/cost tradeoff.
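
Roughly what I mean by batching, as a minimal sketch with Hugging Face transformers (the gpt2 checkpoint is just an example; any causal LM works the same way):

    from transformers import AutoModelForCausalLM, AutoTokenizer

    # Left padding so generation continues from the real end of each prompt.
    tokenizer = AutoTokenizer.from_pretrained("gpt2", padding_side="left")
    tokenizer.pad_token = tokenizer.eos_token  # gpt2 has no pad token
    model = AutoModelForCausalLM.from_pretrained("gpt2")

    prompts = ["Translate to French: cat", "Translate to French: dog"]

    # One forward pass serves the whole batch instead of N separate calls.
    inputs = tokenizer(prompts, return_tensors="pt", padding=True)
    outputs = model.generate(
        **inputs, max_new_tokens=20, pad_token_id=tokenizer.eos_token_id
    )
    print(tokenizer.batch_decode(outputs, skip_special_tokens=True))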


IMO they don't have batching because they pack sequences before passing them through the model, so a single sequence in a batch on OpenAI's side might contain requests from multiple customers.
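
A toy illustration of what packing could look like; all names here are made up for the example, not OpenAI internals. Several short requests get concatenated into one fixed-length row:

    # Pack token-id lists into rows of at most max_len tokens,
    # with an eos token marking request boundaries.
    def pack(requests, max_len, eos_id):
        rows, current = [], []
        for tokens in requests:
            if len(current) + len(tokens) + 1 > max_len:
                rows.append(current)
                current = []
            current += tokens + [eos_id]
        if current:
            rows.append(current)
        return rows

    # Three requests packed into rows of at most 8 tokens:
    print(pack([[1, 2, 3], [4, 5], [6, 7, 8, 9]], max_len=8, eos_id=0))
    # -> [[1, 2, 3, 0, 4, 5, 0], [6, 7, 8, 9, 0]]

A real implementation would also build an attention mask so tokens can't attend across request boundaries.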


Ah, that would make sense. Similar to vLLM, which does dynamic packing (continuous batching) of in-flight requests.
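
With vLLM the batching happens inside the engine, so callers just submit prompts and the scheduler interleaves them; something like this (model name is just an example):

    from vllm import LLM, SamplingParams

    # The engine batches concurrent requests internally, so no
    # client-side padding or batch assembly is needed.
    llm = LLM(model="facebook/opt-125m")
    params = SamplingParams(temperature=0.8, max_tokens=32)

    prompts = ["The capital of France is", "The capital of Japan is"]
    outputs = llm.generate(prompts, params)
    for out in outputs:
        print(out.outputs[0].text)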



