I think they announced this as their last non-reasoning model, so it was maybe with the goal of stretching pre-training as far as they could, just to see what new capabilities would show up. We'll find out as the community gives it a whirl.
I'm a Tier 5 org and I have it available already in the API.
The marginal costs for running a GPT-4-class LLM are much lower nowadays due to significant software and hardware innovations since then, so costs/pricing are harder to compare.
Agreed, however it might make sense that a much-larger-than-GPT-4 LLM would also, at launch, be more expensive to run than the OG GPT-4 was at launch.
(And I think this is probably also scarecrow pricing to discourage casual users from clogging the API since they seem to be too compute-constrained to deliver this at scale)
There are some numbers on one of their Blackwell or Hopper info pages that note the hardware's ability to host an unnamed GPT model with 1.8T params. My assumption was that it referred to GPT-4.
Sounds to me like GPT 4.5 likely requires a full Blackwell HGX cabinet or something, thus OpenAI's reference to needing to scale out their compute more (Supermicro only opened up their Blackwell racks for General Availability last month, and they're the prime vendor for water-cooled Blackwell cabinets right now, and have the ability to throw up a GPU mega-cluster in a few weeks, like they did for xAI/Grok)
Definitely not. They don't distill their original models. 4o is a much more distilled and cheaper version of 4. I assume 4.5o would be a distilled and cheaper version of 4.5.
It'd be weird to release a distilled version without ever releasing the base undistilled version.
If this huge model has taken months to pre-train and was expected to be released before, say, o3-mini, you could definitely have some last-minute optimizations in o3-mini that were not considered at the time of building the architecture of gpt-4.5.
Honestly, if long context (that doesn't start to degrade quickly) is what you're after, I would use Grok 3 (not sure when the API version releases though). Over the last week or so I've had a massive conversation thread with it that started with plenty of my project's relevant code (as in a couple hundred lines), and several days later, after like 20 question-answer blocks, you ask it something and it answers "since you're doing that this way, and you said you want x, y and z, here are your options blabla"... It's like thinking Gemini but better. Also, unlike Gemini (and others), it seems to have a much more recent data cutoff. Try asking about some language feature / library / framework that was released recently (say 3 months ago) and most of the models shit the bed, use older versions of the thing, or just start to imitate what the code might look like. For example, try asking Gemini if it can generate Tailwind 4 code: it will tell you that its training cutoff is like October or something, that Tailwind 4 "isn't released yet", and that it can try to imitate what the code might look like. Uhhhhhh, thanks I guess??
GPT-4: Input $30.00 / 1M tokens ; Output $60.00 / 1M tokens
So 4.5 is 2.5x more expensive.
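To make that arithmetic concrete, here is a rough sketch in Python. It takes the GPT-4 prices quoted above and assumes the 2.5x multiplier applies equally to input and output tokens (the comment doesn't say so explicitly). The function names and the 10,000-in / 1,000-out request are made up for illustration.

```python
# GPT-4 launch prices quoted above, in USD per 1M tokens.
GPT4_PRICE = {"input": 30.00, "output": 60.00}

# Assumption: the 2.5x figure applies to both the input and the output rate.
MULTIPLIER = 2.5


def implied_price(base, multiplier):
    """Scale every per-1M-token rate by the same multiplier."""
    return {kind: rate * multiplier for kind, rate in base.items()}


def request_cost(price, input_tokens, output_tokens):
    """Cost in USD of one request, given per-1M-token rates."""
    total = input_tokens * price["input"] + output_tokens * price["output"]
    return total / 1_000_000


gpt45_price = implied_price(GPT4_PRICE, MULTIPLIER)

# Hypothetical request: 10,000 input tokens and 1,000 output tokens.
gpt4_cost = request_cost(GPT4_PRICE, 10_000, 1_000)
gpt45_cost = request_cost(gpt45_price, 10_000, 1_000)

print(gpt45_price)            # {'input': 75.0, 'output': 150.0}
print(gpt4_cost, gpt45_cost)
```

Under that assumption the implied 4.5 rates come out to $75 / $150 per 1M tokens, and any fixed mix of input and output tokens costs 2.5x as much as it would at the GPT-4 rates.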