They are called "world models" because their internal representation of what they display is a "world" rather than a video frame or image. To output video, the model needs to "understand" geometry and physics.
Just because there are errors in this doesn't mean it isn't significant. If a machine learning model understands how physical objects interact with each other, that is very useful.
Correct. The fact that AI is a black box means we can easily imagine anything we want happening within that box. Or, to put it more accurately: AI companies can convince investors of amazing magic happening within that box. With LLMs, we anthropomorphize and imagine it’s thinking. With video models, they’re now trying to convince us that the model understands the world. None of these things are true. It’s all an illusion.