I think it's fairer to compare it to the original GPT-4, which might be the equivale...

minimaxir · on Feb 27, 2025

The marginal costs for running a GPT-4-class LLM are much lower nowadays due to significant software and hardware innovations since then, so costs/pricing are harder to compare.

sebastiennight · on Feb 27, 2025

Agreed, however it might make sense that a much-larger-than-GPT-4 LLM would also, at launch, be more expensive to run than the OG GPT-4 was at launch.

(And I think this is probably also scarecrow pricing to discourage casual users from clogging the API since they seem to be too compute-constrained to deliver this at scale)

spoaceman7777 · on Feb 27, 2025

There are some numbers on one of their Blackwell or Hopper info pages that note the ability of their hardware to host an unnamed GPT model with 1.8T params. My assumption was that it referred to GPT-4.

Sounds to me like GPT 4.5 likely requires a full Blackwell HGX cabinet or something, hence OpenAI's reference to needing to scale out their compute more. (Supermicro only opened up their Blackwell racks for General Availability last month. They're the prime vendor for water-cooled Blackwell cabinets right now, and they can throw up a GPU mega-cluster in a few weeks, like they did for xAI/Grok.)

jstummbillig · on Feb 27, 2025

Why would that be fairer? We can assume they incorporated all the learnings and optimizations they made after the GPT-4 launch, no?

jychang · on Feb 28, 2025

Definitely not. They don't distill their original models. 4o is a much more distilled and cheaper version of 4. I assume 4.5o would be a distilled and cheaper version of 4.5.

It'd be weird to release a distilled version without ever releasing the base undistilled version.

sebastiennight · on Feb 28, 2025

Not necessarily.

If this huge model has taken months to pre-train and was expected to be released before, say, o3-mini, you could definitely have some last-minute optimizations in o3-mini that were not considered at the time of building the architecture of gpt-4.5.

OldGreenYodaGPT · on Feb 28, 2025

2x that price for the 32k context via API at launch. So nearly the same price, but you get 4x the context

Culonavirus · on Feb 28, 2025

Honestly, if long context (that doesn't start to degrade quickly) is what you're after, I would use Grok 3 (not sure when the API version releases though). Over the last week or so I've had a massive thread of conversation with it that started with plenty of my project's relevant code (as in a couple hundred lines), and several days later, after like 20 question-answer blocks, you ask it something and it answers "since you're doing that this way, and you said you want x, y and z, here are your options blabla"... It's like thinking Gemini but better. Also, unlike Gemini (and others), it seems to have a much more recent data cutoff. Try asking about some language feature / library / framework that has been released recently (say 3 months ago) and most of the models shit the bed, use older versions of the thing, or just start to imitate what the code might look like. For example, try asking Gemini if it can generate Tailwind 4 code: it will tell you that its training cutoff is like October or something, that Tailwind 4 "isn't released yet", and that it can try to imitate what the code might look like. Uhhhhhh, thanks I guess??