Wouldn't this also mean that there's an inherent limit to that sort of model?

rhaen · 2025-12-09T03:12:52 1765249972

Not strictly speaking? A universal subspace can be identified without necessarily being finite.

As a really stupid example: the sets of integers less than 2, 8, 5, and 30 can all be embedded in the set of integers less than 50, but that doesn’t require that the set of integers is finite. You can always get a bigger one that embeds the smaller ones.

markisus · 2025-12-09T14:28:27 1765290507

On the contrary, I think it demonstrates an inherent limit to the kind of tasks / datasets that human beings care about.

It's known that large neural networks can even memorize random data. The number of random datasets is unfathomably large, and the weights of neural networks trained on random data would probably not live in a low-dimensional subspace.

It's only the interesting-to-human datasets, as far as I know, that drive the neural network weights to a low-dimensional subspace.
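
For intuition, here's a toy sketch of that contrast. It is not the method from the paper, and every size and name in it is made up: "models" whose weight vectors are mixes of a few shared directions span a small subspace, while independently random weight vectors need one dimension each.

```python
import random

def span_dimension(vectors, tol=1e-9):
    """Count linearly independent directions among `vectors` via Gram-Schmidt."""
    basis = []
    for v in vectors:
        r = list(v)
        # Subtract the part of v already explained by the directions found so far.
        for b in basis:
            coeff = sum(x * y for x, y in zip(r, b))
            r = [x - coeff * y for x, y in zip(r, b)]
        norm = sum(x * x for x in r) ** 0.5
        if norm > tol:  # something is left over, so v adds a new direction
            basis.append([x / norm for x in r])
    return len(basis)

rng = random.Random(0)
dim, n_models, n_shared = 20, 10, 3  # hypothetical toy sizes

# "Structured" models: each weight vector is a random mix of 3 shared directions.
shared = [[rng.gauss(0, 1) for _ in range(dim)] for _ in range(n_shared)]
structured = []
for _ in range(n_models):
    mix = [rng.gauss(0, 1) for _ in range(n_shared)]
    structured.append([sum(m * s[j] for m, s in zip(mix, shared)) for j in range(dim)])

# "Random-data" models: weight vectors with no shared structure at all.
unstructured = [[rng.gauss(0, 1) for _ in range(dim)] for _ in range(n_models)]

structured_dim = span_dimension(structured)
unstructured_dim = span_dimension(unstructured)
```

Here `structured_dim` comes out as 3 no matter how many models you add, while `unstructured_dim` grows with the number of models (10 here). Real weights are only approximately low-rank, so in practice you'd look at something like singular values rather than an exact count.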

scotty79 · 2025-12-09T04:04:04 1765253044

> Wouldn't this also mean that there's an inherent limit to that sort of model?

If they all need just 16 dimensions, then if we ever make one that needs 17, we know we are making progress instead of running in circles.

moelf · 2025-12-09T04:59:41 1765256381

You can always make a new vector that's orthogonal to all the ones currently used and see if including it improves performance on your tasks.
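
A minimal sketch of the "make a new orthogonal vector" step, assuming the directions in use are plain lists of floats (the 16 and 64 are made-up sizes, and `new_orthogonal_direction` is a hypothetical helper, not anything from the paper). Checking whether the new direction actually helps on your tasks is the hard part and isn't shown.

```python
import random

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def new_orthogonal_direction(directions, rng, tol=1e-9):
    """Return a unit vector orthogonal to every vector in `directions`."""
    # Orthonormalize the existing directions first (Gram-Schmidt).
    basis = []
    for v in directions:
        r = list(v)
        for b in basis:
            c = dot(r, b)
            r = [x - c * y for x, y in zip(r, b)]
        n = dot(r, r) ** 0.5
        if n > tol:
            basis.append([x / n for x in r])
    # Draw a random candidate and strip out everything inside the current span.
    cand = [rng.gauss(0, 1) for _ in range(len(directions[0]))]
    for b in basis:
        c = dot(cand, b)
        cand = [x - c * y for x, y in zip(cand, b)]
    n = dot(cand, cand) ** 0.5
    if n <= tol:
        raise ValueError("existing directions already span the whole space")
    return [x / n for x in cand]

rng = random.Random(0)
used = [[rng.gauss(0, 1) for _ in range(64)] for _ in range(16)]  # 16 directions in use
extra = new_orthogonal_direction(used, rng)                       # candidate 17th
```

The result has unit length and a (numerically) zero dot product with each of the 16 existing directions, so any gain from adding it can't be explained by the subspace you already had.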

scotty79 · 2025-12-09T08:00:14 1765267214

> see if the inclusion improves performance on your tasks

Apparently it doesn't, at least not in our models with our training applied to our tasks.

So if we expand one of those three things and notice that a 17th vector makes a difference, then we are making progress.